《(1.1.2)--Samuel's Ch1编译原理与实践英文版.ppt》由会员分享,可在线阅读,更多相关《(1.1.2)--Samuel's Ch1编译原理与实践英文版.ppt(30页珍藏版)》请在taowenge.com淘文阁网|工程机械CAD图纸|机械工程制图|CAD装配图下载|SolidWorks_CaTia_CAD_UG_PROE_设计图分享下载上搜索。
1、Chapter 1 Chapter 1 Introduction Introduction Bootstrapping and PortingBootstrapping and PortingThere is a strong interaction between the algorithms used by the phases of a compiler and the data structures that support these phases.Algorithms need to be implemented in efficient manner.The choice of
2、data structures is importantCompilerS2Major Data Structures in CompilerMajor Data Structures in CompilerCompilerS3Temporary FilesTemporary FilesIntermediate CodeIntermediate CodeThe Literal TableThe Literal TableThe Symbol TableThe Symbol TableThe Syntax TreeThe Syntax TreeTokensTokensData Structure
3、sData StructuresWhen a scanner collects characters into a token,it represents the token symbolically as a value of an enumerated data type representing a set of tokens of the source languageSometimes,it is necessary to preserve the character string itself or other information derived from itThe name
4、 associated with an identifier tokenThe value of a number tokenIn most languages the scanner needs to generate one token at a time(single symbol single symbol lookaheadlookahead)A single global variable can be used to hold the token information.CompilerS4TokensTokensThe syntax tree is constructed as
5、 a standard pointer-based structure that is dynamically allocated as parsing proceeds.The tree can be kept as a single variable pointing to the root node.Each node is a record.Its fields represent the information collected by the parser and the semantic analyzer.Sometimes these fields are dynamicall
6、y allocatedCompilerS5The Syntax TreeThe Syntax TreeThis data structure keeps information associated with identifiers,functions,variables,constants,and data types.The symbol table interacts with almost every phase of the compiler.The insertion,deletion access operations need to be efficient.A standar
7、d data type for this purpose is the hash table.CompilerS6The Symbol TableThe Symbol TableStores constants and strings used in the program.Quick insertion and lookup are essential.Need not allow deletions.CompilerS7The Literal TableThe Literal TableDepending on the kind of intermediate code,it may be
8、 kept as An array of text stringsA temporary text fileLinked list of structuresCompilerS8Intermediate CodeIntermediate CodeComputers did not possess enough memory for an entire program to be kept in memory during compilation.This was solved by using temporary files to hold the products of intermedia
9、te steps.Memory constrains are now much smaller problem.Occasionally,compilers generate intermediate files during some of the steps.CompilerS9Temporary FilesTemporary FilesPasses Language DefinitionError HandlingCompilerS10Other Issues in Compiler StructureOther Issues in Compiler StructureA compile
10、r often processed the entire source program several times before generating code.These repetitions are referred as passespasses.Passes may or may not correspond to phases.Depending on the language,a compiler may be one pass.Efficient compilation,but not efficient target code.Examples:Pascal and C.Mo
11、st compilers with optimizations use more than one pass:1.Scanning and parsing2.Semantic analysis and source-level optimization3.Code generation and target code optimizationCompilerS11Passes Passes The description of the lexical,syntactic,and semantics of a programming language is collected in a lang
12、uage reference manuallanguage reference manual,or language definitionlanguage definition.With a new language,a language definition and compiler are often developed together.More common situation is when a compiler is written for well-known language which has an existing language definition.CompilerS
13、12Language DefinitionLanguage DefinitionOne of the most important functions of a compiler.Errors can be detected during almost every phase of compilation.Error reported by a compiler are static(or static(or compile-time)errorscompile-time)errors.It is important to generate meaningful error messages.
14、Error handlerError handler contains different operations for a specific compiler phase and situationCompilerS13Error HandlingError HandlingThe implementation(or hosthost)language has to be machine language.This was how the first compilers were written.Another approach is to write the compiler in ano
15、ther language for which a compiler already exists.We need only to compile the new compiler using the existing compiler to get a running program.What if the existing compiler runs on a machine different from the target machine?Compilation produces a cross compilercross compiler a compiler that genera
16、tes target code for a different machine.CompilerS14Compiler LanguageCompiler LanguageCompilerS15Compiler LanguageCompiler LanguageBootstrappingBootstrappingPortingPortingCompilerS16T-DiagramT-DiagramA compiler written in language H(host language)that translates language S(source language)into langua
17、ge T(target language)is drawn as the following T-diagramT-diagram:This is equivalent to saying that the compiler runs on“machine”H.Typically,we expect H=T.the compiler produces code for the same machine as the one on which it runs.STHCompilerS17Case 1Case 1There are two compilers that run on the sam
18、e machine H.One translates from language A to language B.The other translates from language B to language C.We can combine them by letting the output of the first to be the input to the second.The result is a compiler from A to C on machine H.ABHBCHACH=CompilerS18Case 2Case 2We can use a compiler M
19、from“machine”H to“machine”K to translate the implementation language of another compiler from H to K.ABHHKMABK=CompilerS19Existing LanguageExisting LanguageGiven:Machine HCompiler for a language B written in H that translates B to H.Wanted:To get a compiler on machine H that translates a language A
20、to H We can write a compiler for A using language B.AHBBHHAHH=BHHAHHCompilerS20Cross CompilerCross CompilerGiven:Machine K.(K is a host language here.)A compiler for a language B written in K that translates B to K.Wanted:To get a compiler for a language A.The target machine is H.AHBBKKAHK=BKKAHKCom
21、pilerS21H=S?H=S?Can a compiler be written in the same language that it is to compile?No compiler for the source language exists,so the compiler itself cannot be compiled.There is a circulatory problem here.STSSTHCompilerS22Bootstrapping Bootstrapping 1.Write a“quick and dirty”compiler in assembly la
22、nguage.2.Use this compiler to compile the“good”compiler.3.Recompile the“good”compiler to produce a final version.AHAAHHAHH=AHAAHHAHH=Quick and dirtyInefficient compilerInefficient compilerFinal version Compiler in its own languageCompilerS23Porting to New HostPorting to New HostAKAAHHAKH=AKAAKHAKK=O
23、riginal compilerCross compilerRetargeted compilerCompiler in its own languageCompiler in its own languageCross compilerCompilerS24Sample Language and MachineSample Language and MachineA“real”compiler for a real machine has far too much detail.We will use a small language TINY as a running example.Th
24、e target code is the assembly language for a simple hypothetical processor TM machine.CompilerS25TINY LanguageTINY LanguageSequence of statements separated by semicolon.No procedures and no declarations.All variables are integers,declared by assignment.Two control statementsif-statement with optiona
25、l else part.Must be terminated with the keyword end.repeat-statementread and write statements for I/O.Comments are within curly brackets.;CompilerS26TINY Language(cont)TINY Language(cont)Arithmetic expression involves integer constants,variables,parentheses,and for integer operations+,-,*,/.Boolean
26、expression is two arithmetic expression combined with the two comparison operators and=.May appear only as test in control statements.:=is used for an assignment operator.CompilerS27Factorial ProgramFactorial Program Sample program in TINY language-computes factorial read x;input an integer if 0 x t
27、hen dont compute if x=0 fact:=1;repeat fact:=fact*x;x:=x-1 until x=0;write fact output factorial of x endCompilerS28TM MachineTM MachineTM has some of the properties of RISCs.All arithmetic must take place in registersAddressing modes are limitedThe following example illustrated the code for aindex=
28、6;index is assumed to be at location 10 in memorya is at location 20 in memory.Addressing modes for load operation1.LDC is load constant2.LD is load from memory3.LDA is load addressCompilerS29ExampleExampleAddresses are in the form of“register+offset”10(1)is address computed by the offset 10 to the
29、contents of register 1 LDC 1,0(0)load 0 into reg 1LD 0,10(1)load val at 10+R1 into R0LDC 1,2(0)load 2 into reg 1MUL 0,1,0 put R1*R0 into R0LDC 1,0(0)load 0 into reg 1LDA 1,20(1)load 20+R1 into R1ADD 0,1,0 put R1+R0 into R0LDC 1,6(0)load 6 into reg 1ST 1,0(0)store R1 at 0+R0LDC is load constantLD is
30、load from memoryLDA is load addressaindex=6;index is assumed to be at location 10 in memorya is at location 20 in memory.CompilerS30HomeworkHomework1.71.7 Suppose you have a Pascal-to-C translator written in C and a working C compiler.Use T-diagrams to describe the steps you would take to create a working Pascal compiler.