Abstract Interpretation with Applications to Timing Validation (slide deck, 118 pages)

Timing Analysis: timing guarantees for hard real-time systems
Reinhard Wilhelm
Saarland University, Saarbrücken

Structure of the Lecture
1. Introduction
2. Static timing analysis
   1. the problem
   2. our approach
   3. the success
   4. tool architecture
3. Cache analysis
4. Pipeline analysis
5. Value analysis
6. Worst-case path determination
7. Timing Predictability
   - caches
   - non-cache-like devices
   - future architectures
8. Conclusion

Industrial Needs
Hard real-time systems, often in safety-critical applications, abound:
- Aeronautics, automotive, train industries,
  manufacturing control
- Wing vibration of an airplane: sensing every 5 ms
- Side airbag in a car: reaction within 10 ms
- Crankshaft-synchronous tasks have very tight deadlines, 45 µs

Timing Analysis
Embedded controllers are expected to finish their tasks reliably within time bounds.
The problem: Given
1. software to produce some reaction,
2. a hardware platform on which to execute the software,
3. a required reaction time.
Derive: a guarantee for timeliness.

Timing Analysis
provides parameters for schedulability analysis:
- execution time, C_i, of tasks, and, if that is impossible,
- upper bounds and maybe also lower bounds on execution times of tasks, often called Worst-Case Execution Times (WCET) and Best-Case Execution Times (BCET).

Timing Analysis: the Search Space (constant execution times)
[Figure labels: Software, Input, Architecture]
- All control-flow paths (through the binary executable), depending on the possible inputs.
- Feasible as a search for a longest path if
  - iteration and recursion are bounded,
  - execution times of instructions are (positive) constants.
- Elegant method: Timing Schemata (Shaw 89), inductive calculation of upper bounds.

  ub(if b then S1 else S2) := ub(b) + max(ub(S1), ub(S2))

High-Performance Microprocessors
increase (average-case) performance
by using: caches, pipelines, branch prediction, speculation.
These features make timing analysis difficult:
execution times of instructions vary widely.
- Best case, everything goes smoothly: no cache miss, operands ready, resources free, branch correctly predicted
- Worst case, everything goes wrong: all loads miss the cache, resources are occupied, operands not ready
- Span may be several hundred cycles

Variability of Execution Times
x = a + b; on the PPC 755:
  LOAD r2, _a
  LOAD r1, _b
  ADD  r3, r2, r1
In most cases, execution will be fast.
So, assuming the worst case is safe, but very pessimistic!

AbsInt's WCET Analyzer aiT
IST Project DAEDALUS final review report:
"The AbsInt tool is probably the best of its kind in the world and it is justified to consider this result as a breakthrough."
Several time-critical subsystems of the Airbus A380 have been certified using aiT; aiT is the only validated tool for these applications.

Tremendous Progress during the 10 years from 1998 to 2008
[Chart: over-estimation and cache-miss penalty for Lim et al., Thesing et al., and Souyris et al. (years 1995, 2002, 2005); over-estimation labels 20-30%, 15%, 30-50%, 10%, 25%; cache-miss penalty labels 4, 25, 60, 200]
The explosion of penalties has been compensated by the improvement of the analyses!

State-dependent Execution Times
- Execution time depends on the execution state.
- Execution state results from the execution history.
- Semantics state: values of variables
- Execution state: occupancy of resources

Timing Analysis: the Search Space with State-dependent Execution Times
[Figure labels: Software, Input, initial state, Architecture]
- all control-flow paths, depending on the possible inputs
- all paths through the architecture for potential initial states
[Figure: execution states for paths reaching the program point of mul rD,rA,rB, distinguished by instruction in I-cache / not in I-cache, bus occupied / not occupied, small / large operands; numeric labels as extracted: 1, 14 40]

Timing Analysis: the Search Space with out-of-order execution
[Figure labels: Software, Input, initial state, Architecture]
- all control-flow paths, depending on the possible inputs
- all paths through the architecture for potential initial states
- including different schedules for instruction sequences

Timing Analysis: the Search Space with multi-threading
[Figure labels: Software, Input, initial state, Architecture]
- all control-flow paths, depending on the possible inputs
- all paths through the architecture for potential initial states
- including different schedules for instruction sequences
- including different interleavings of accesses to shared resources

Why Exhaustive Exploration?
- Naive attempt: follow local worst-case transitions only
- Unsound in the presence of Timing Anomalies: a path starting with a local worst case may have a lower overall execution time.
  Ex.: a cache miss preventing a branch misprediction
- Caused by the interference between processor components.
  Ex.: cache hit/miss influences branch prediction; branch prediction causes prefetching; prefetching pollutes the I-cache.

State Space Explosion in Timing Analysis
[Figure: state space plotted against years + methods]
- Years: 1995, 2000, 2010
- Methods: timing schemata, static analysis, ?
- Stages: constant execution times; state-dependent execution times; out-of-order execution; preemptive scheduling; concurrency + shared resources
- Caches, pipelines, speculation: combined cache and pipeline analysis
- Superscalar processors: interleavings of all schedules
- Multi-core with shared resources: interleavings of several threads

Notions in Timing Analysis
- Hard or impossible to determine
- Determine upper bounds instead

High-Level Requirements for Timing Analysis
- Upper bounds must be safe, i.e. not underestimated
- Upper bounds should be tight, i.e. not far away from real execution times
- Analogous for lower bounds
- Analysis effort must be tolerable
Note: all analyzed programs are terminating and loop bounds need to be known, so there is no decidability problem, but a complexity problem!

Timing Accidents
and Penalties
- Timing Accident: cause for an increase of the execution time of an instruction
- Timing Penalty: the associated increase
- Types of timing accidents:
  - Cache misses
  - Pipeline stalls
  - Branch mispredictions
  - Bus collisions
  - Memory refresh of DRAM
  - TLB miss

Execution Time is History-Sensitive
The contribution of the execution of an instruction to a program's execution time
- depends on the execution state, e.g. the time for a memory access depends on the cache state;
- the execution state depends on the execution history.
Needed: an invariant about the set of execution states produced by all executions reaching a program point.
We use abstract interpretation to compute these invariants.

Deriving Run-Time Guarantees
- Our method and tool, aiT, derives Safety Properties from these invariants: certain timing accidents will never happen.
  Example: At program point p, instruction fetch will never cause a cache miss.
- The more accidents
  excluded, the lower the upper bound.
[Figure labels: Murphy's invariant; variance of execution times, from fastest to slowest]

Abstract Interpretation in Timing Analysis
- Abstract interpretation statically analyzes a program for a given property without executing it.
- Derived properties therefore hold for all executions.
- It is based on the semantics of the analyzed language.
- A semantics of a programming language that talks about time needs to incorporate the execution platform!
- Static timing analysis is thus based on such a semantics.

The Architectural Abstraction inside the Timing Analyzer
[Figure labels: Timing analyzer; Architectural abstractions: Cache Abstraction, Pipeline Abstraction; Value Analysis, Control-Flow Analysis, Loop-Bound Analysis: abstractions of the processor's arithmetic]

Abstract Interpretation in Timing Analysis
Determines
- invariants about the values of variables (in registers, on the stack)
  - to compute loop bounds
  - to eliminate infeasible paths
  - to determine effective memory addresses
- invariants on architectural execution state
  - cache contents: predict hits and misses
  - pipeline states: predict or exclude pipeline stalls

Tool Architecture
[Figure labels: Abstract Interpretations; Abstract Interpretation; Integer Linear Programming]
- Value analysis: determines enclosing intervals for the set of values in registers and local variables, used for determining addresses.
- Loop bound analysis: determines loop bounds
- Control flow analysis: determines infeasible paths

The Story in Detail

Tool Architecture

Value Analysis
- Motivation:
  - Provide access information to data-cache/pipeline analysis
  - Detect infeasible paths
  - Derive loop bounds
- Method: calculate intervals at all program points, i.e. lower and upper bounds for the set of possible values occurring in the machine program (addresses, register contents, local and global variables) (Cousot/Halbwachs 78)

Value Analysis II
- Intervals are computed along the CFG edges
- At joins, intervals are "unioned":
  D1: [-2,+2] joined with D1: [-4,0] gives D1: [-4,+2]

                       D1: [-4,4], A0: [0x1000,0x1000]
  move.l #4,D0
                       D0: [4,4], D1: [-4,4], A0: [0x1000,0x1000]
  add.l D1,D0
                       D0: [0,8], D1: [-4,4], A0: [0x1000,0x1000]
  move.l (A0,D0),D1    Which address is accessed here? access [0x1000,0x1008]

Value Analysis (Airbus Benchmark)
[Results table not recoverable from the extraction; surviving labels: 1 GHz Athlon, memory usage]

Cache sets
are independent: everything is explained in terms of one set.
- LRU replacement strategy:
  - Replace the block that has been Least Recently Used
  - Modeled by ages
- Example: 4-way set associative cache

  age                 0   1   2   3
  initial             m0  m1  m2  m3
  access m4 (miss)    m4  m0  m1  m2
  access m1 (hit)     m1  m4  m0  m2
  access m5 (miss)    m5  m1  m4  m0

Cache Analysis
How to statically precompute cache contents:
- Must Analysis: for each program point (and context), find out which blocks are in the cache. This gives a prediction of cache hits.
- May Analysis: for each program point (and context), find out which blocks may be in the cache. The complement says what is not in the cache, which gives a prediction of cache misses.
In the following, we consider must analysis unless otherwise stated.

(Must) Cache Analysis
- Consider one instruction in the program: load a
- There may be many paths leading to this instruction.
- How can we compute whether a will always be in cache, independently of which path execution takes?
Question: Is the access to a always a cache hit?

Determine Cache Information (abstract cache states) at each Program Point
[Figure: abstract cache with lines from youngest (age 0) to oldest (age 3), holding {x} and {a, b}]
Interpretation of this cache information: it describes the set of all concrete cache states in which x, a, and b occur,
- x with an age not older than 1,
- a and b with an age not older than 2.
Cache information contains
1. only memory blocks guaranteed to be in cache;
2. they are associated with their maximal age.

Cache Analysis: how does it work?
- How to compute for each program point an abstract cache state representing a set of memory blocks guaranteed to be in cache each time execution reaches this program point?
- Can we expect to compute the largest set?
- Trade-off between precision and efficiency, quite typical for abstract interpretation

(Must) Cache analysis of a memory access
[Figure: an abstract cache holding {x} and {a, b} is transformed by "access to a" into one holding {a} and {b, x}]
After the access to a, a is the youngest memory block in cache, and we must assume that
x has aged. What about b?
[Figure: the concrete transfer function (cache) and the abstract transfer function (analysis) for "access to a", shown on caches containing a, b, x, and y]

Combining Cache Information
- Consider two control-flow paths to a program point:
  - for one, the prediction says the set of memory blocks S1 is in cache,
  - for the other, the set of memory blocks S2.
  - Cache analysis should not predict more than S1 ∩ S2 after the merge of paths.
  - The elements in the intersection should have their maximal age from S1 and S2.
- This suggests the following method: compute cache information along all paths to a program point and calculate their intersection. But there are too many paths!
- More efficient method:
  - combine cache information on the way,
  - iterate until the least fixpoint is reached.
- There is a risk of losing precision, but not in case of distributive transfer functions.

What happens when control-paths merge?
[Figure: two abstract must-caches, listed from youngest to oldest as (a; c, f; d) and (c; e; a; d), are joined to (a, c; d) by "intersection + maximal age"]
- We can guarantee this content on this path.
- We can guarantee this content on this path.
- Which content can we guarantee on this path?
Combine cache information at each control-flow merge point.

Must-Cache and May-Cache Information
- The presented cache analysis is a Must Analysis. It determines safe information about cache hits. Each predicted cache hit reduces the upper bound.
- We can also perform a May Analysis. It determines safe information about cache misses. Each predicted cache miss increases the lower bound.

(May) Cache analysis of a memory access
[Figure: abstract may-cache over the blocks a, b, x, y, z before and after "access to a"]
Why? After the access to a, a is the youngest memory block in cache, and we must assume that x, y and
b have aged.

Cache Analysis: Join (may)
[Figure: two abstract may-caches, listed from youngest to oldest as (a; c, f; d) and (c; e; a; d), are joined to (a, c; e; f; d)]
Join (may): "union + minimal age"

Abstract Domain: Must Cache
Abstraction: representing sets of concrete caches by their description
[Figure: the concrete caches (z s x a), (x s z t), (z s x t), (s z x t), (z t x s) and the abstract cache describing them, holding {z, x} and {s}]

Abstract Domain: Must Cache
Concretization: sets of concrete caches described by an abstract cache
[Figure: the abstract cache holding {z, x} and {s} and the concrete caches it describes; the remaining line is filled up with any other block]
Over-approximation!

Abstract Domain: May Cache
Abstraction
[Figure: the concrete caches (z s x a), (x s z t), (z s x t), (s z x t), (z t x s) and the abstract cache describing them, holding {z, s, x}, {t}, and {a}]

Abstract Domain: May Cache
Concretization
[Figure: the abstract cache holding {z, s, x}, {t}, and {a} and the concrete caches it describes, with lines drawn from {z, s, x}, {z, s, x, t}, {z, s, x, t}, {z, s, x, t, a}]
Abstract may-caches say what definitely is not in cache and what the minimal age of those is that may be in cache.

Lessons Learned
Cache analysis, an important ingredient of static timing analysis, provides for abstract domains, which
- proved to be sufficiently precise,
- have compact representation,
- have efficient transfer functions,
- which are quite natural.

Problem Solved?
- We have shown a solution for LRU caches.
- LRU-cache analysis works smoothly
  - Favorable "structure" of domain
  - Essential information can be summarized compactly
- LRU is the best strategy under several aspects
  - performance, predictability, sensitivity
- and yet: LRU is not the only strategy
  - Pseudo-LRU (PowerPC 755, Airbus)
  - FIFO
  - worse under almost all aspects, but average-case performance!

Contribution to WCET
  while ... do [max n]
    ...
    ref to s
    ...
  od

  time per reference: t_miss or t_hit
  loop time:
    n * t_miss
    n * t_hit
    t_miss + (n - 1) * t_hit
    t_hit + (n - 1) * t_miss

Contexts
Cache contents depends on the context, i.e. calls and loops.
[Figure: loop "while cond do" with a join (must) at the loop head]
The first iteration loads the cache, so the intersection loses most of the information!
- Distinguish basic blocks by contexts
- Transform loops into tail recursive procedures
- Treat loops and procedures in the same way
- Use interprocedural analysis techniques, VIVU:
  - virtual inlining of procedures
  - virtual unrolling of loops
- Distinguish as many contexts as useful:
  - 1 unrolling for caches
  - 1 unrolling for branch prediction (pipeline)

Structure of the Lectures
1. Introduction
2. Static timing analysis
   1. the problem
   2. our approach
   3. the success
   4. tool architecture
3. Cache analysis
4. Pipeline analysis
5. Value analysis
6. Worst-case path analysis
7. Timing Predictability
   - caches
   - non-cache-like devices
   - future architectures
8. Conclusion

Tool Architecture
[Figure labels: Abstract Interpretations; Abstract Interpretation; Integer Linear Programming]

Pipelines

Hardware Features: Pipelines
Ideal
Case: 1 Instruction per Cycle

  Inst 1:  Fetch    Decode   Execute  WB
  Inst 2:           Fetch    Decode   Execute  WB
  Inst 3:                    Fetch    Decode   Execute  WB
  Inst 4:                             Fetch    Decode   Execute  WB

Pipelines
- Instruction execution is split into several stages
- Several instructions can be executed in parallel
- Some pipelines can begin more than one instruction per cycle: VLIW, Superscalar
- Some CPUs can execute instructions out-of-order
- Practical problems: hazards and cache misses

Pipeline Hazards
Pipeline hazards:
- Data hazards: operands not yet available (data dependences)
- Resource hazards: consecutive instructions use the same resource
- Control hazards: conditional branch
- Instruction-cache hazards: instruction fetch causes a cache miss

Static exclusion of hazards
- Cache analysis: prediction of cache hits on instruction or operand fetch or store
  Example: lwz r4,20(r1) is classified as a hit
- Dependence analysis: elimination of data hazards
- Resource reservation tables: elimination of resource hazards
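
The cache-hit prediction listed above rests on the must-cache analysis presented earlier in the lecture. The following is a minimal sketch of its abstract update and join for a single LRU cache set, assuming an associativity of 4. The names MustCache, must_update, must_join, and always_hit are hypothetical and are not taken from aiT; the abstract cache is modeled as a map from memory block to an upper bound on its age.

```python
from typing import Dict

# Abstract must-cache for one LRU set: block name -> upper bound on its age.
# A block is listed only if it is guaranteed to be in the cache.
MustCache = Dict[str, int]


def must_update(cache: MustCache, block: str, assoc: int = 4) -> MustCache:
    """Abstract effect of an access to `block` (illustrative sketch)."""
    # Only blocks that may be younger than the accessed block can age.
    # If the block is not guaranteed to be cached, every block may age.
    limit = cache.get(block, assoc)
    new_cache: MustCache = {}
    for other, age in cache.items():
        if other == block:
            continue
        new_age = age + 1 if age < limit else age
        if new_age < assoc:  # otherwise the block may have been evicted
            new_cache[other] = new_age
    new_cache[block] = 0  # the accessed block becomes the youngest
    return new_cache


def must_join(c1: MustCache, c2: MustCache) -> MustCache:
    """Join at a control-flow merge: intersection + maximal age."""
    return {b: max(c1[b], c2[b]) for b in c1.keys() & c2.keys()}


def always_hit(cache: MustCache, block: str) -> bool:
    """An access is a guaranteed hit if the block is in the must-cache."""
    return block in cache


if __name__ == "__main__":
    # x has age at most 1, a and b have age at most 2; then a is accessed.
    print(must_update({"x": 1, "a": 2, "b": 2}, "a"))
    # Two paths merge; the concrete ages chosen here are illustrative.
    path1 = {"a": 0, "c": 2, "f": 2, "d": 3}
    path2 = {"c": 0, "e": 1, "a": 2, "d": 3}
    print(must_join(path1, path2))
```

In this sketch, b keeps its age bound when a is accessed because b cannot be younger than a was, whereas x may have been younger and is therefore assumed to have aged. A may-cache analysis could be sketched analogously with union and minimal age in the join.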