Instruction Set & Assembly Language Programming
Jianjian SONG
Software Institute, Nanjing University

Content
- Computer Architecture Taxonomy
- ARM Architecture Introduction
- ARM Instruction Set
- ARM Assembly Language Programming

1. Computer Architecture Taxonomy

What is architecture?

Architecture & Organization 1
- Architecture is those attributes visible to the programmer: instruction set, number of bits used for data representation, I/O mechanisms, addressing techniques.
  e.g. Is there a multiply instruction?
- Organization is how features are implemented: control signals, interfaces, memory technology.
  e.g. Is there a hardware multiply unit, or is multiplication done by repeated addition?

Architecture & Organization 2
- All members of the Intel x86 family share the same basic architecture.
- All members of the IBM System/370 family share the same basic architecture.
- This gives code compatibility, at least backwards.
- Organization differs between different versions.

von Neumann architecture
- Memory holds both data and instructions.
- The central processing unit (CPU) fetches instructions from memory.
- Having the CPU separate from memory distinguishes the programmable computer.
- CPU registers help out: program counter (PC), instruction register (IR), general-purpose registers, etc.

CPU + memory
[Diagram: CPU (with PC and IR) connected to memory by address and data lines. Labels shown: 200, ADD r5,r1,r3.]

Harvard architecture
[Diagram: CPU (with PC) connected to a data memory and to a separate program memory, each through its own address and data lines.]

von Neumann vs. Harvard
- Harvard can't use self-modifying code.
- Harvard allows two simultaneous memory fetches.
- Most DSPs use Harvard architecture for streaming data:
  - greater memory bandwidth;
  - more predictable bandwidth.

RISC vs. CISC
- Complex instruction set computer (CISC):
  - many addressing modes;
  - many operations.
- Reduced instruction set computer (RISC):
  - load/store;
  - pipelinable instructions.

Load-store Architecture
- Data-processing instructions (such as ADD and SUB) can operate only on values held in registers (or specified directly in the instruction), and they always place the result back in a register.
- The only operations on memory are loading a memory value into a register (load instructions) and storing a register value to memory (store instructions).
- By comparison, a typical CISC processor allows a value in memory to be added (ADD) to a register, and sometimes allows a register value to be added (ADD) to memory.

Instruction set characteristics
- Fixed vs. variable length.
- Addressing modes.
- Number of operands.
- Types of operands.

Programming model
- Programming model: registers visible to the programmer.
- Some registers are not visible (e.g. IR).

Multiple implementations
- Successful architectures have several implementations:
  - varying clock speeds;
  - different bus widths;
  - different cache sizes;
  - etc.

2. ARM Architecture Introduction

ARM (Advanced RISC Machines)
- ARM is a design company and an IP supplier: it licenses its designs to partners, who manufacture chips with their own distinguishing features.
- What is IP? Intellectual Property.

Features of ARM
- ARM has the general characteristics of a RISC architecture:
  - a large number of registers;
  - most operations are performed on registers, with load/store instructions moving data between memory and registers;
  - simple addressing modes;
  - a fixed-length instruction format.
- In addition:
  - small size, low power, low cost, high performance;
  - dual 16-bit/32-bit instruction sets;
  - many partners worldwide.

Versions and extensions of the ARM architecture
- Six versions: ARMv1 through ARMv6.
- Extensions of the ARM architecture:
  - Thumb (T variant): a 16-bit instruction set, used to improve instruction density;
  - DSP (E variant): arithmetic instructions for DSP applications;
  - Jazelle (J variant): allows direct execution of Java bytecode.
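
What is instruction density?
- For the same sequence of operations, the number of machine instructions that fit in a unit of memory space.

Naming format for ARM architecture versions
- The name string consists of:
  - ARMvx (x: instruction set version number, 1 to 6);
  - characters denoting variants (such as T, E, J);
  - the character x, used to denote that certain features are excluded.

ARM processor families
- ARM7 family
- ARM9 family
- ARM9E family
- ARM10 family
- SecureCore family
- Intel StrongARM
- Intel XScale

3. ARM Instruction Set
- ARM assembly language
- ARM programming model
- ARM memory organization
- ARM data operations
- ARM flow of control

Assembly language
- Why assembly language? One-to-one with instructions (more or less).
- Basic features:
  - One instruction per line.
  - Labels provide names for addresses (usually in first column).
  - Instructions often start in later columns.
  - Comments run to end of line.

ARM assembly language example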
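    label1  ADR r4,c
            LDR r0,[r4]     ; a comment
            ADR r4,d
            LDR r1,[r4]
            SUB r0,r0,r1    ; comment

General encoding format of ARM instructions
- opcode: encoding of the instruction's operation
- cond: encoding of the condition under which the instruction executes
- S: whether the operation affects the value of CPSR
- Rn: encoding of the register holding the first operand
- Rd: encoding of the destination register
- shifter_operand: the second operand

Basic addressing modes of ARM instructions
- Register addressing
      ADD R0,R1,R2        ; (R1)+(R2) -> R0
- Immediate addressing
      ADD R3,R3,#2        ; (R3)+2 -> R3
- Register indirect addressing
      LDR R0,[R3]         ; ((R3)) -> R0
- Register indexed addressing
      LDR R0,[R1,#4]      ; ((R1)+4) -> R0
- Relative addressing
      B rel               ; (PC)+rel -> PC

Pseudo-ops
- Some assembler directives don't correspond directly to instructions:
  - Define current address.
  - Reserve storage.
  - Constants.

ARM programming model
[Diagram: registers r0 through r15 (r15 is the PC) and the CPSR, each 32 bits wide (bit 31 down to bit 0).]

Endianness
- The relationship between bit and byte/word ordering defines endianness.
[Diagram: a 32-bit word drawn from bit 31 to bit 0. Little-endian: byte 3, byte 2, byte 1, byte 0. Big-endian: byte 0, byte 1, byte 2, byte 3.]

ARM data types
- Word is 32 bits long.
- Word can be divided into four 8-bit bytes.
- ARM addresses can be 32 bits long.
- An address refers to a byte.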
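
The byte ordering in the endianness diagram can be checked with a few lines of Python. This is only a sketch: the helper name `word_bytes` and the example value `0x11223344` are illustrative assumptions, not part of the slides. It lists the four bytes of one 32-bit word in order of increasing address under each convention.

```python
def word_bytes(value, byteorder):
    """Return the 4 bytes of a 32-bit word in order of increasing address."""
    return list(value.to_bytes(4, byteorder))

word = 0x11223344  # arbitrary example value (assumption for illustration)

little = word_bytes(word, "little")  # lowest address holds the least significant byte
big = word_bytes(word, "big")        # lowest address holds the most significant byte

print([hex(b) for b in little])  # ['0x44', '0x33', '0x22', '0x11']
print([hex(b) for b in big])     # ['0x11', '0x22', '0x33', '0x44']
```

In the diagram's terms, "byte 0" is the byte at the lowest address: it carries bits 7..0 in little-endian mode and bits 31..24 in big-endian mode.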

Address 4 starts at byte 4. The processor can be configured at power-up in either little- or big-endian mode.

ARM status bits
- Arithmetic, logical, and shifting operations set the CPSR bits when the instruction's S bit is set:
  - N (negative), Z (zero), C (carry), V (overflow).
- Examples:
  - -1 + 1 = 0: NZCV = 0110.
  - 2^31 - 1 + 1 = -2^31: NZCV = 1001.
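
The status bits of a 32-bit addition can be worked through in Python. This is a minimal sketch, assuming the usual two's-complement definitions of the four flags; the function name `add_with_flags` is made up for illustration.

```python
MASK32 = 0xFFFFFFFF

def add_with_flags(a, b):
    """Add two 32-bit values; return (result, NZCV as a 4-character string)."""
    a &= MASK32
    b &= MASK32
    full = a + b                 # unbounded Python integer sum
    result = full & MASK32       # what a 32-bit register would hold
    n = result >> 31             # N: bit 31 of the result
    z = int(result == 0)         # Z: result is zero
    c = int(full > MASK32)       # C: carry out of bit 31
    # V: both operands have the same sign and the result's sign differs
    v = ((~(a ^ b) & (a ^ result)) >> 31) & 1
    return result, f"{n}{z}{c}{v}"

print(add_with_flags(-1, 1))          # (0, '0110')
print(add_with_flags(2**31 - 1, 1))   # (2147483648, '1001')
```

In the second case the register holds 0x80000000, which reads as -2^31 when interpreted as a signed value, so N and V are both set.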

Instructions Overview
- Data instructions
- Move instructions
- Load/store instructions
- Comparison instructions
- Branch instructions

ARM data instructions
- Basic format:
      ADD r0,r1,r2
  Computes r1+r2, stores in r0.
- Immediate operand:
      ADD r0,r1,#2
  Computes r1+2, stores in r0.

ARM data instructions
- ADD, ADC: add (w. carry)
- SUB, SBC: subtract (w. carry)
- RSB, RSC: reverse subtract (w. carry)
- MUL, MLA: multiply (and accumulate)
- AND, ORR, EOR
- BIC: bit clear
- LSL, LSR: logical shift left/right
- ASL, ASR: arithmetic shift left/right
- ROR: rotate right
- RRX: rotate right extended with C

Data operation varieties
- Logical shift: fills with zeroes.
- Arithmetic shift right: fills with copies of the sign bit (ones only for a negative value).
- RRX performs a 33-bit rotate, including the C bit from CPSR above the sign bit.

ARM move instructions
- MOV, MVN: move (negated)
      MOV r0,r1       ; sets r0 to r1

ARM load/store instructions
- LDR, LDRH, LDRB: load (half-word, byte)
- STR, STRH, STRB: store (half-word, byte)
- Addressing modes:
  - register indirect: LDR r0,[r1]
  - with second register: LDR r0,[r1,-r2]
  - with constant: LDR r0,[r1,#4]

ARM comparison instructions
- CMP: compare
- CMN: negated compare
- TST: bit-wise test (AND)
- TEQ: bit-wise test for equality (exclusive OR)
- These instructions set only the NZCV bits of CPSR.

ARM branch instructions
- B: Branch
- BL: Branch and Link
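
The shift and rotate varieties listed in the data-instruction slides can be modelled on 32-bit values as follows. This is a sketch under stated assumptions: shift amounts are taken to be in the range 0 to 31, and the function names simply mirror the mnemonics; it is not a description of how any particular ARM core implements its shifter.

```python
MASK32 = 0xFFFFFFFF

def lsl(x, n):
    """Logical shift left: zeroes enter at bit 0."""
    return (x << n) & MASK32

def lsr(x, n):
    """Logical shift right: zeroes enter at bit 31."""
    return (x & MASK32) >> n

def asr(x, n):
    """Arithmetic shift right: copies of the sign bit enter at bit 31."""
    x &= MASK32
    if x & 0x80000000:
        x -= 1 << 32            # reinterpret the bit pattern as a negative number
    return (x >> n) & MASK32    # Python's >> on negatives is an arithmetic shift

def ror(x, n):
    """Rotate right: bits leaving bit 0 re-enter at bit 31."""
    x &= MASK32
    n %= 32
    return ((x >> n) | (x << (32 - n))) & MASK32

def rrx(x, carry):
    """33-bit rotate right by one through C; returns (result, new carry)."""
    x &= MASK32
    return ((carry & 1) << 31) | (x >> 1), x & 1

print(hex(lsr(0x80000000, 4)))  # 0x8000000
print(hex(asr(0x80000000, 4)))  # 0xf8000000
print(hex(asr(0x40000000, 4)))  # 0x4000000
```

The last two lines show why "fills with ones" holds only for negative values: a positive operand is filled with zeroes by ASR as well.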

ARM ADR pseudo-op
- Cannot refer to an address directly in an instruction.
- Generate value by performing arithmetic on PC.
- ADR pseudo-op generates the instruction required to calculate the address:
      ADR r1,FOO

Example: C assignments
C:
            x = (a + b) - c;
Assembler:
            ADR r4,a        ; get address for a
            LDR r0,[r4]     ; get value of a
            ADR r4,b        ; get address for b, reusing r4
            LDR r1,[r4]     ; get value of b
            ADD r3,r0,r1    ; compute a+b
            ADR r4,c        ; get address for c
            LDR r2,[r4]     ; get value of c

C assignment, cont'd.
            SUB r3,r3,r2    ; complete computation of x
            ADR r4,x        ; get address for x
            STR r3,[r4]     ; store value of x

Example: C assignment
C:
            y = a*(b+c);
Assembler:
            ADR r4,b        ; get address for b
            LDR r0,[r4]     ; get value of b
            ADR r4,c        ; get address for c
            LDR r1,[r4]     ; get value of c
            ADD r2,r0,r1    ; compute partial result
            ADR r4,a        ; get address for a
            LDR r0,[r4]     ; get value of a

C assignment, cont'd.
            MUL r2,r2,r0    ; compute final value for y
            ADR r4,y        ; get address for y
            STR r2,[r4]     ; store y

Example: C assignment
C:
            z = (a [...] 2) | (b [...]
Assembler:
            [...]           ; perform OR

C assignment, cont'd.
            ADR r4,z        ; get address for z
            STR r1,[r4]     ; store value for z

Additional addressing modes
- Base-plus-offset addressing:
      LDR r0,[r1,#16]
  Loads from location r1+16.
- Auto-indexing increments base register:
      LDR r0,[r1,#16]!
- Post-indexing fetches, then does offset:
      LDR r0,[r1],#16
  Loads r0 from r1, then adds 16 to r1.

ARM flow of control
- All operations can be performed conditionally, testing CPSR:
  EQ, NE, CS, CC, MI, PL, VS, VC, HI, LS, GE, LT, GT, LE
- Branch operation:
      B #100
  Can be performed conditionally.

Example: if statement
C:
            if (a < b) [...]
Assembler:
            [...]           ; if a >= b, branch to false block

If statement, cont'd.
            ; true block
            MOV r0,#5       ; generate value for x
            ADR r4,x        ; get address for x
            STR r0,[r4]     ; store x
            ADR r4,c        ; get address for c
            LDR r0,[r4]     ; get value of c
            ADR r4,d        ; get address for d
            LDR r1,[r4]     ; get value of d
            ADD r0,r0,r1    ; compute y
            ADR r4,y        ; get address for y
            STR r0,[r4]     ; store y
            B after         ; branch around false block

If statement, cont'd.
            ; false block
    fblock  ADR r4,c        ; get address for c
            LDR r0,[r4]     ; get value of c
            ADR r4,d        ; get address for d
            LDR r1,[r4]     ; get value of d
            SUB r0,r0,r1    ; compute c-d
            ADR r4,x        ; get address for x
            STR r0,[r4]     ; store value of x
    after   ...
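
The condition codes listed under "ARM flow of control" are tests on the N, Z, C and V bits. The sketch below assumes the standard ARM meaning of each code and models a CMP as the 32-bit subtraction a - b; the function names `flags_after_cmp` and `condition_passed` are illustrative, not taken from the slides.

```python
MASK32 = 0xFFFFFFFF

def flags_after_cmp(a, b):
    """Return (N, Z, C, V) as a CMP of a against b would set them (a - b)."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    n = result >> 31
    z = int(result == 0)
    c = int(a >= b)                              # C = 1 means no borrow (unsigned a >= b)
    v = (((a ^ b) & (a ^ result)) >> 31) & 1     # signed overflow on subtraction
    return n, z, c, v

def condition_passed(cond, n, z, c, v):
    """Evaluate one of the 14 condition codes against the given flags."""
    table = {
        "EQ": z == 1,            "NE": z == 0,
        "CS": c == 1,            "CC": c == 0,
        "MI": n == 1,            "PL": n == 0,
        "VS": v == 1,            "VC": v == 0,
        "HI": c == 1 and z == 0, "LS": c == 0 or z == 1,
        "GE": n == v,            "LT": n != v,
        "GT": z == 0 and n == v, "LE": z == 1 or n != v,
    }
    return table[cond]

print(condition_passed("LT", *flags_after_cmp(3, 5)))   # True
print(condition_passed("GE", *flags_after_cmp(3, 5)))   # False
print(condition_passed("HI", *flags_after_cmp(-1, 1)))  # True
```

The last line illustrates the difference between signed and unsigned tests: the bit pattern of -1 is the largest unsigned value, so HI passes while LT also passes for the same operands.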

Example: Conditional instruction implementation
            ; true block
            MOVLT r0,#5     ; generate value for x
            ADRLT r4,x      ; get address for x
            STRLT r0,[r4]   ; store x
            ADRLT r4,c      ; get address for c
            LDRLT r0,[r4]   ; get value of c
            ADRLT r4,d      ; get address for d
            LDRLT r1,[r4]   ; get value of d
            ADDLT r0,r0,r1  ; compute y
            ADRLT r4,y      ; get address for y
            STRLT r0,[r4]   ; store y

Example: switch statement
C:
            switch (test) { case 0: ... break; case 1: ... }
Assembler:
            ADR r2,test             ; get address for test
            LDR r0,[r2]             ; load value for test
            ADR r1,switchtab        ; load address for switch table
            LDR r15,[r1,r0,LSL #2]  ; index switch table
    switchtab DCD case0
            DCD case1
            ...
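
The table lookup in the switch example amounts to loading the new PC from word number `test` of the table. A minimal sketch follows, with memory modelled as a Python dictionary; the addresses 0x1000, 0x2000 and 0x2040 and the function name `switch_target` are hypothetical values chosen for illustration.

```python
def switch_target(memory, switchtab, test):
    """Model LDR r15,[r1,r0,LSL #2]: fetch the new PC from table entry `test`."""
    return memory[switchtab + (test << 2)]   # each DCD entry is 4 bytes wide

SWITCHTAB = 0x1000          # hypothetical address of the switch table
memory = {
    SWITCHTAB + 0: 0x2000,  # DCD case0 (hypothetical address of the case 0 code)
    SWITCHTAB + 4: 0x2040,  # DCD case1 (hypothetical address of the case 1 code)
}

print(hex(switch_target(memory, SWITCHTAB, 0)))  # 0x2000
print(hex(switch_target(memory, SWITCHTAB, 1)))  # 0x2040
```

Like the assembly sequence, this sketch performs no range check on `test`, so a value outside the table would read whatever follows it in memory (here, raise a KeyError).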

Example: FIR filter
C:
            for (i=0, f=0; i<N; i++)
                f = f + c[i]*x[i];
Assembler:
            ; loop initiation code
            MOV r0,#0       ; use r0 for i
            MOV r8,#0       ; use separate index for arrays
            ADR r2,N        ; get address for N
            LDR r1,[r2]     ; get value of N
            MOV r2,#0       ; use r2 for f

FIR filter, cont'd.
            ADR r3,c        ; load r3 with base of c
            ADR r5,x        ; load r5 with base of x
            ; loop body
    loop    LDR r4,[r3,r8]  ; get c[i]
            LDR r6,[r5,r8]  ; get x[i]
            MUL r4,r4,r6    ; compute c[i]*x[i]
            ADD r2,r2,r4    ; add into running sum
            ADD r8,r8,#4    ; add one word offset to array index
            ADD r0,r0,#1    ; add 1 to i
            CMP r0,r1       ; exit?
            BLT loop        ; if i < N, continue

ARM subroutine linkage
- Branch and link instruction:
      BL foo
  Copies the return address (taken from the current PC) to r14.
- To return from subroutine:
      MOV r15,r14

Nested subroutine calls
- Nesting/recursion requires a coding convention (the sequence below assumes a full-descending stack addressed through r13):
    f1      LDR r0,[r13]        ; load arg into r0 from stack
            ; call f2()
            STR r14,[r13,#-4]!  ; store f1's return address
            STR r0,[r13,#-4]!   ; store arg to f2 on stack
            BL f2               ; branch and link to f2
            ; return from f1()
            ADD r13,r13,#4      ; pop f2's arg off stack
            LDR r15,[r13],#4    ; restore return address into PC and return

Summary
- Load/store architecture.
- Most instructions are RISCy and operate in a single cycle.
- Some multi-register operations take longer.
- All instructions can be executed conditionally.

4. ARM Assembly Language Programming
- Why and when to use?
- AT&T format and Intel format
- Grammar of ARM assembly language
- Examples

Why and when to use?
- Low-level code in the operating system kernel deals directly with the hardware and needs special-purpose instructions.
- Special instructions of the CPU.
- Time efficiency of frequently used code.
- Space efficiency of the program (e.g. the operating system's boot loader).
- Refer to Section 1.5 of "Linux内核源代码情景分析" (Zhejiang University Press).

AT&T format and Intel format

Grammar of ARM assembly language
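- Statements
- Program format

Statements
- A statement is one of:
  - an instruction;
  - a pseudo-op (directive);
  - a macro.
- Statement format:
      {symbol} {instruction | directive | pseudo-instruction} {;comment}

Pseudo-ops
- Symbol definition pseudo-ops
- Data definition pseudo-ops
- Assembly control pseudo-ops
- Frame description pseudo-ops
- Information reporting pseudo-ops
- Other pseudo-ops

Pseudo-ops for variables
- Declare a global variable and initialize it: GBLA, GBLL, GBLS
- Declare a local variable and initialize it: LCLA, LCLL, LCLS
- Assign a value to a variable: SETA, SETL, SETS

Example
                GBLA objectsize       ; declare a global arithmetic variable
    objectsize  SETA 0xff             ; assign a value to the variable
                SPACE objectsize      ; use the variable
                GBLL statusB
    statusB     SETL {TRUE}

Pseudo-ops for data constants
- EQU
      name EQU expr{, type}
- Usually kept in .inc files.

Allocating memory
- SPACE
      {label} SPACE byte_num
  Allocates a block of memory and initializes it to 0.
- DCB
      {label} DCB expr{, expr}...
  Allocates a run of bytes and initializes it with expr.
- DCD
      {label} DCD expr{, expr}...
  Allocates a run of words (the allocated memory is word-aligned) and initializes it with expr.

MACRO and MEND
- Subroutines vs. macros: use macro assembly when the subroutine is short but many parameters need to be passed.
- Macro definition body:
  - MACRO: start of the macro definition
  - MEND: end of the macro definition
- Usually kept in .mac files.
- Format:
            MACRO
    {$label} macroname {$para1}{, $para2}...
            ...                   ; code
            MEND

Example
            MACRO
    $label  xmac $p1
            ...                   ; code
    $label.loop1                  ; label internal to the macro body
            ...                   ; code
            BGE $label.loop1
    $label.loop2                  ; label internal to the macro body
            ...                   ; code
            BL $p1                ; parameter p1 is the name of a subroutine
            BGT $label.loop2
            ...                   ; code
            MEND

Example (cont'd)
- Result of expanding the macro call "abc xmac subr1":
            ...                   ; code
    abcloop1                      ; internal label: $label is replaced by abc
            ...                   ; code
            BGE abcloop1          ; internal label: $label is replaced by abc
    abcloop2                      ; internal label: $label is replaced by abc
            ...                   ; code
            BL subr1              ; parameter p1 is replaced by the actual value subr1
            BGT abcloop2
            ...                   ; code

Other pseudo-ops
- AREA: defines a code section or a data section
      AREA sectionname{, attr1}{, attr2}...
- ENTRY: program entry point
- END: end of the source program

Other pseudo-ops (cont'd)
- GET/INCLUDE
      INCLUDE filename
- EXPORT
      EXPORT symbol {[WEAK]}
- IMPORT
      IMPORT symbol {[WEAK]}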
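
The parameter substitution in the xmac example earlier in this section can be imitated with plain text replacement. This is a toy sketch of the substitution rule only (a `$name` is replaced by its actual value, and a `.` directly after a parameter ends the name and disappears); the function `expand_macro` is hypothetical and is not how an assembler is actually implemented.

```python
def expand_macro(body, bindings):
    """Replace $name parameters in each line of a macro body.

    Limitation of this sketch: no parameter name may be a prefix of another.
    """
    expanded = []
    for line in body:
        for name, value in bindings.items():
            line = line.replace("$" + name + ".", value)  # $label.loop1 -> abcloop1
            line = line.replace("$" + name, value)        # $p1 -> subr1
        expanded.append(line)
    return expanded

body = [
    "$label.loop1",
    "        BGE $label.loop1",
    "$label.loop2",
    "        BL $p1",
    "        BGT $label.loop2",
]

for line in expand_macro(body, {"label": "abc", "p1": "subr1"}):
    print(line)
```

Run on the five lines above, the output matches the expansion of "abc xmac subr1" shown in the example: the labels become abcloop1 and abcloop2, and the BL target becomes subr1.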
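
Pseudo-instructions
- ADR
      ADR{cond} register, expr
  Loads a PC-relative or register-relative address into a register. The assembler replaces it with one instruction.
- ADRL
      ADRL{cond} register, expr
  Like ADR, but can reach a wider address range. The assembler replaces it with two instructions.
- LDR
      LDR{cond} register, =expr | label_expr
  Loads a 32-bit constant or an address into a register.
- NOP
  No operation, for example MOV R0,R0.

Program format
- Source files are organized in sections.
- Code sections and data sections
- The AREA pseudo-op
- Example

Review
- Computer architecture and ARM architecture
- Instruction set
- Assembly language programming
  - Program structure
  - Statements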
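
Returning to the ADR pseudo-instruction, the single instruction that replaces it can be sketched as PC arithmetic. The sketch assumes ARM (32-bit) state, where the PC reads as the address of the current instruction plus 8, and assumes the offset must fit an ARM immediate (an 8-bit value rotated right by an even amount). The function names, the default register r1 and the example addresses are illustrative assumptions.

```python
MASK32 = 0xFFFFFFFF

def is_arm_immediate(value):
    """True if value is an 8-bit constant rotated right by an even amount."""
    value &= MASK32
    for rot in range(0, 32, 2):
        # rotating left by rot undoes a rotate-right by rot
        rotated = ((value << rot) | (value >> (32 - rot))) & MASK32
        if rotated < 256:
            return True
    return False

def adr_as_pc_arithmetic(instr_addr, target_addr, rd="r1"):
    """Return the ADD/SUB on PC that an ADR at instr_addr could become."""
    pc = instr_addr + 8                 # PC value seen by the instruction (ARM state)
    offset = target_addr - pc
    if not is_arm_immediate(abs(offset)):
        raise ValueError("offset not encodable in one instruction")
    op = "ADD" if offset >= 0 else "SUB"
    return f"{op} {rd},pc,#{abs(offset)}"

print(adr_as_pc_arithmetic(0x8000, 0x8010))  # ADD r1,pc,#8
print(adr_as_pc_arithmetic(0x8000, 0x8000))  # SUB r1,pc,#8
```

An offset that fails the immediate test is the situation in which the wider-range ADRL, which expands to two instructions, becomes relevant.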