Assembly language
• Machine dependent
• Low-level programming language
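Semantic gap (link with Chapter 1): the gap between the application domain and the execution domain.
With a programming language in between: application domain, then programming language domain, then execution domain. The specification gap separates the first two; the execution gap separates the last two.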
• Mnemonic operation codes: no need to memorize numeric operation codes
• Symbolic operands: memory bindings to these names are made by the assembler
• Data declarations: avoids manual conversion
Sample program (left) and its machine code (right):

        START   101
        READ    N               101) + 09 0 113
        MOVER   BREG, ONE       102) + 04 2 115
        MOVEM   BREG, TERM      103) + 05 2 116
AGAIN   MULT    BREG, TERM      104) + 03 2 116
        MOVER   CREG, TERM      105) + 04 3 116
        ADD     CREG, ONE       106) + 01 3 115
        MOVEM   CREG, TERM      107) + 05 3 116
        COMP    CREG, N         108) + 06 3 113
        BC      LE, AGAIN       109) + 07 2 104
        MOVEM   BREG, RESULT    110) + 05 2 114
        PRINT   RESULT          111) + 10 0 114
        STOP                    112) + 00 0 000
N       DS      1               113)
RESULT  DS      1               114)
ONE     DC      '1'             115) + 00 0 001
TERM    DS      1               116)
        END
Statement format: [Label] <Opcode> <operand specification>[, <operand specification>]
The label and the second operand specification are optional.
A simple assembly language statement: <Opcode> <Operand1>, <Operand2>
• Operand1 is always a register.
• Operand2 is always a symbolic name.
MOVER and MOVEM
• MOVER <dest>, <source>: moves a memory word into a register
• MOVEM <source>, <dest>: moves a register into a memory word
Operand specification: <symbolic name>[+<displacement>][(<index register>)]
Example: AREA+5(4)
Mnemonic Operation Codes

Instruction opcode   Assembly mnemonic   Remarks
00                   STOP                Stops execution
01                   ADD                 First operand is modified
02                   SUB                 First operand is modified
03                   MULT                First operand is modified
04                   MOVER               Register <- memory move
05                   MOVEM               Memory <- register move
06                   COMP                Sets condition code
07                   BC                  Branch on condition
08                   DIV                 Analogous to SUB
09                   READ                Performs reading
10                   PRINT               Performs printing
Syntax for BC: BC <condition code>, <memory address>
Condition codes: LT, LE, EQ, GT, GE, ANY
Machine instruction format and example
Fields: Sign (1 digit), Opcode (2 digits), Register operand (1 digit), Memory operand (3 digits)
Example: in the sample program, MOVER BREG, ONE at address 102 is assembled as + 04 2 115 (opcode 04 for MOVER, register code 2 for BREG, and 115 as the address of ONE).
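The instruction format (sign, 2-digit opcode, 1-digit register operand, 3-digit memory operand) can be sketched in a few lines of Python. This is only an illustration: the function name `encode` is hypothetical, the opcodes come from the mnemonic table in these slides, and the register codes are inferred from the sample listing (BREG as 2, CREG as 3), with AREG as 1 assumed by analogy.

```python
# Opcodes as listed in the mnemonic operation code table of these slides.
OPCODES = {"STOP": 0, "ADD": 1, "SUB": 2, "MULT": 3, "MOVER": 4, "MOVEM": 5,
           "COMP": 6, "BC": 7, "DIV": 8, "READ": 9, "PRINT": 10}

# Assumption: BREG=2 and CREG=3 are inferred from the sample machine code;
# AREG=1 is assumed by analogy and does not appear in the listing.
REGISTERS = {"AREG": 1, "BREG": 2, "CREG": 3}


def encode(mnemonic, reg_code, address):
    """Format one instruction as: sign, 2-digit opcode, 1-digit register, 3-digit address."""
    return f"+ {OPCODES[mnemonic]:02d} {reg_code} {address:03d}"


# MOVER BREG, ONE with ONE at address 115
print(encode("MOVER", REGISTERS["BREG"], 115))
```

Instructions without a register operand, such as READ and STOP, use 0 in the register field, as the listing shows.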
Three kinds of statements:
• Imperative: an action to be performed during execution
• Declaration: for declaring constants or assigning values
• Assembler directive: instructs the assembler to perform certain actions
Declaration Statements
• DS (Declare Storage): reserves an area of memory and associates a name with it. Example: A DS 1
• DC (Declare Constant): constructs memory words containing constants. Example: ONE DC '1'
What is the use of constants?
• DC doesn't really implement constants, because these values are not protected by the assembler.
• They may be changed by moving a new value into that memory word:
  ONE DC '1'
  MOVEM BREG, ONE
Similarities with the implementation of constants in HLL 1. Immediate Operands 2. literals
1. A constant as an immediate operand
ADD AREG, 5
Our assembly language doesn't support this. The equivalent instructions are:
ADD AREG, FIVE
----
FIVE DC '5'

2. Literal
An operand with the syntax ='<value>', for example ADD AREG, ='5'
• Difference between a literal and a constant: the location of a literal cannot be specified, so its value cannot be changed the way a DC constant's can.
• Difference between a literal and an immediate operand: no architectural provision is needed for a literal, whereas an immediate operand needs one.
  (Literal) ADD AREG, ='5'
  (Immediate operand) ADD AREG, 5
Assembler Directives
1. START <constant>: the first word of the target program should be placed in the memory word with the specified address.
2. END [<operand spec>]: indicates the end of the source program.
Advantages of assembly language
Adding one instruction (DIV BREG, TWO) and one constant (TWO DC '2') to the sample program, whose machine code occupied 101 to 116, shifts the data addresses, so almost every machine instruction changes:

        START   101
        READ    N               101) + 09 0 114
        MOVER   BREG, ONE       102) + 04 2 116
        MOVEM   BREG, TERM      103) + 05 2 117
AGAIN   MULT    BREG, TERM      104) + 03 2 117
        MOVER   CREG, TERM      105) + 04 3 117
        ADD     CREG, ONE       106) + 01 3 116
        MOVEM   CREG, TERM      107) + 05 3 117
        COMP    CREG, N         108) + 06 3 114
        BC      LE, AGAIN       109) + 07 2 104
        DIV     BREG, TWO       110) + 08 2 118
        MOVEM   BREG, RESULT    111) + 05 2 115
        PRINT   RESULT          112) + 10 0 115
        STOP                    113) + 00 0 000
N       DS      1               114)
RESULT  DS      1               115)
ONE     DC      '1'             116)
TERM    DS      1               117)
TWO     DC      '2'             118)
        END
Advantages of assembly language
• A machine language program needs to be changed drastically if we modify the original program; with assembly language the assembler redoes this work.
• It is more suitable when it is desirable to use specific architectural features of a computer.
• Example: special instructions supported by the CPU
Design Specification of an assembler
• Identify the information necessary to perform the task
• Design suitable data structures to record the information
• Determine the processing necessary to obtain and maintain the information
• Determine the processing necessary to perform the task
Example
Which information do we need to convert this instruction into the equivalent machine language instruction?
MOVER BREG, ONE
1. Address of the memory word for ONE: depends on the source program
2. Machine operation code for MOVER: doesn't depend on the source program
Data structures required
• Symbol table (name, address): built by the analysis phase and used during the synthesis phase
• Mnemonics table (mnemonic, opcode)
Two phases of an assembler
1. Analysis
2. Synthesis
Analysis phase
• Main task: building the symbol table.
• For this, it must determine the addresses with which the symbolic names are associated.
• To determine the address of a particular symbolic name, the addresses of all elements preceding it must be fixed. This is memory allocation.

Data structure to implement memory allocation: the location counter (LC)
• Contains the address of the next memory word; it is initialized to the constant in the START statement.
• Whenever a statement has a label, the assembler enters the label and the LC contents in a new symbol table entry, for example (AGAIN, 104).
• It then finds the number of memory words required by the statement and updates the LC. For this it needs to know the length of each instruction, which depends on the assembly language.
• The processing involved in maintaining the LC is called LC processing.
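LC processing as described above can be sketched as follows. This is a minimal illustration, not the textbook algorithm: the tuple layout and the name `build_symbol_table` are hypothetical, and it assumes every instruction and every DC occupies one memory word while DS n occupies n words.

```python
# The sample program as (label, opcode, operands) tuples.
PROGRAM = [
    ("", "START", ["101"]),
    ("", "READ", ["N"]),
    ("", "MOVER", ["BREG", "ONE"]),
    ("", "MOVEM", ["BREG", "TERM"]),
    ("AGAIN", "MULT", ["BREG", "TERM"]),
    ("", "MOVER", ["CREG", "TERM"]),
    ("", "ADD", ["CREG", "ONE"]),
    ("", "MOVEM", ["CREG", "TERM"]),
    ("", "COMP", ["CREG", "N"]),
    ("", "BC", ["LE", "AGAIN"]),
    ("", "MOVEM", ["BREG", "RESULT"]),
    ("", "PRINT", ["RESULT"]),
    ("", "STOP", []),
    ("N", "DS", ["1"]),
    ("RESULT", "DS", ["1"]),
    ("ONE", "DC", ["'1'"]),
    ("TERM", "DS", ["1"]),
    ("", "END", []),
]


def build_symbol_table(statements):
    """Walk the program once, maintaining the location counter (LC)."""
    symtab = {}
    lc = 0
    for label, opcode, operands in statements:
        if opcode == "START":
            lc = int(operands[0])      # LC is initialized from START
            continue
        if opcode == "END":
            break
        if label:
            symtab[label] = lc         # bind the label to the current LC
        if opcode == "DS":
            lc += int(operands[0])     # DS n reserves n words
        else:
            lc += 1                    # assumption: all other statements take one word
    return symtab


print(build_symbol_table(PROGRAM))
```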
Data structure of the assembler
Source program -> Analysis phase -> Synthesis phase -> Target program

Mnemonics table (mnemonic, opcode, length), for example:
ADD   01   1
SUB   02   1

Symbol table (symbol, address), for example:
AGAIN   104
N       113
Tasks of the analysis phase
1. Separate the label, opcode and operand fields
2. Build the symbol table
3. Perform LC processing
4. Construct the intermediate code (IC)

Tasks of the synthesis phase
1. Obtain the machine opcode corresponding to the mnemonic
2. Obtain the address of a memory operand from the symbol table
3. Synthesize the machine instruction
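The three synthesis tasks can be sketched as one small function. Treat the details as assumptions: the name `synthesize` is hypothetical, the register codes are inferred from the sample listing, and only LE=2 is confirmed by the listing; the other condition codes are an assumed numbering.

```python
OPCODES = {"STOP": 0, "ADD": 1, "SUB": 2, "MULT": 3, "MOVER": 4, "MOVEM": 5,
           "COMP": 6, "BC": 7, "DIV": 8, "READ": 9, "PRINT": 10}
REGISTERS = {"AREG": 1, "BREG": 2, "CREG": 3}           # AREG=1 is assumed
CONDITIONS = {"LT": 1, "LE": 2, "EQ": 3, "GT": 4, "GE": 5, "ANY": 6}  # only LE=2 is from the listing

# Symbol table as produced by the analysis phase for the sample program.
SYMTAB = {"AGAIN": 104, "N": 113, "RESULT": 114, "ONE": 115, "TERM": 116}


def synthesize(mnemonic, operands, symtab):
    """Build one machine instruction from the mnemonic table and the symbol table."""
    opcode = OPCODES[mnemonic]                 # task 1: machine opcode
    if not operands:                           # STOP has no operands
        return f"+ {opcode:02d} 0 000"
    if len(operands) == 1:                     # READ / PRINT: memory operand only
        first_field, symbol = 0, operands[0]
    else:
        first, symbol = operands
        first_field = CONDITIONS[first] if mnemonic == "BC" else REGISTERS[first]
    address = symtab[symbol]                   # task 2: address from the symbol table
    return f"+ {opcode:02d} {first_field} {address:03d}"   # task 3: synthesize


print(synthesize("BC", ["LE", "AGAIN"], SYMTAB))
```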
Pass structure of an assembler
1. Single-pass assembler
2. Two-pass assembler

Two-pass assembler
• 1st pass: performs analysis
• 2nd pass: performs synthesis

The 1st pass constructs:
1. Data structures (symbol table)
2. An intermediate form of the source program (intermediate code)

Source program -> Pass 1 -> (data structures, intermediate code) -> Pass 2 -> Target program

Design of a two-pass assembler
1st pass:
1. Separate the label, opcode and operand fields
2. Build the symbol table
3. Perform LC processing
4. Construct the IC
2nd pass:
1. Synthesize the machine instructions
Single-pass translation
• Problem: forward references, where a symbol is used before it is defined.
• Handled by a process called back patching.
In the sample program, for example, READ N at 101 refers to N, which is defined only at 113, and MOVER BREG, ONE at 102 refers to ONE, which is defined at 115.
Back patching
• The operand field of an instruction containing a forward reference is left blank initially.
• The address of the symbol is put into the field when its definition is encountered.

Table of Incomplete Instructions (TII)
Instruction address   Symbol
102                   ONE

After the END statement is processed:
• Symbol table: addresses of all the symbols defined in the program
• TII: information about all forward references

The assembler can now process each entry in the TII to complete the concerned instruction.
Example: (102, ONE)
1. Obtain the address of ONE from the symbol table
2. Insert it into the instruction at location 102
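Back patching with a TII can be sketched as below. This is a simplified illustration under stated assumptions: each statement carries at most one memory operand, register operands are left out, every instruction and DC takes one word, and the function name and the short program are hypothetical rather than taken from the slides.

```python
def assemble_single_pass(statements):
    """One pass over (label, opcode, operand) tuples, then back patching from the TII."""
    symtab, code, tii = {}, {}, []
    lc = 0
    for label, opcode, operand in statements:
        if opcode == "START":
            lc = int(operand)
            continue
        if opcode == "END":
            break
        if label:
            symtab[label] = lc
        if opcode == "DS":
            lc += int(operand)
            continue
        if opcode == "DC":
            lc += 1
            continue
        if operand is None or operand in symtab:
            # No operand, or a backward reference: the address is already known.
            code[lc] = (opcode, symtab.get(operand))
        else:
            # Forward reference: leave the operand field blank and record it in the TII.
            code[lc] = (opcode, None)
            tii.append((lc, operand))
        lc += 1
    # After END: complete every incomplete instruction from the symbol table.
    for address, symbol in tii:
        code[address] = (code[address][0], symtab[symbol])
    return code, tii


program = [
    ("", "START", "101"),
    ("", "READ", "N"),
    ("", "MOVER", "ONE"),
    ("AGAIN", "MULT", "TERM"),
    ("", "BC", "AGAIN"),
    ("", "STOP", None),
    ("N", "DS", "1"),
    ("ONE", "DC", "1"),
    ("TERM", "DS", "1"),
    ("", "END", None),
]
code, tii = assemble_single_pass(program)
print(tii)
print(code)
```

Here BC AGAIN is a backward reference and is resolved immediately, while the references to N, ONE and TERM go through the TII.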
Advanced Assembler Directives EQU LTORG ORIGIN
ORIGIN
• Syntax: ORIGIN <address specification>, where <address specification> is either
  1. an operand specification, for example ORIGIN LAST+1
  2. a constant, for example ORIGIN 217
• It is useful when the target program does not consist of consecutive memory words.
• An operand specification gives the ability to perform relative rather than absolute LC processing; this is the difference between the two options.

EQU
• Syntax: <symbol> EQU <address specification>
• Defines the symbol to represent <address specification>, which is either
  1. an operand specification, for example BACK EQU LOOP
  2. a constant, for example BACK EQU 200
Difference with DC/DS??
Literal, why LTORG?
ADD AREG, ='5'
compared with
ADD AREG, FIVE
----
FIVE DC '5'
What is done internally by the assembler?

LTORG
• Allows the programmer to specify where literals should be stored.
What if we don’t write LTORG?
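One way to picture what the assembler does with literals is a literal table plus a pool table. The sketch below is an assumption-laden illustration, not the slides' algorithm: it assumes the usual convention that literals are allocated at each LTORG and that any literals still pending are allocated at END, that every statement and literal occupies one word, and the names `allocate_literals`, `littab` and `pooltab` as used here are hypothetical.

```python
def allocate_literals(statements, start):
    """Assign addresses to literals at each LTORG and, for the rest, at END."""
    littab = []        # entries are [literal, address]
    pooltab = [0]      # index in littab where each literal pool starts
    lc = start
    for opcode, operand in statements:
        if opcode in ("LTORG", "END"):
            # Allocate every literal of the current pool at the current LC.
            for entry in littab[pooltab[-1]:]:
                entry[1] = lc
                lc += 1
            if opcode == "END":
                break
            pooltab.append(len(littab))   # a new pool starts after LTORG
            continue
        if operand is not None and operand.startswith("="):
            current_pool = [lit for lit, _ in littab[pooltab[-1]:]]
            if operand not in current_pool:
                littab.append([operand, None])
        lc += 1
    return littab, pooltab


statements = [
    ("ADD", "='5'"),
    ("SUB", "='1'"),
    ("LTORG", None),
    ("ADD", "='5'"),
    ("STOP", None),
    ("END", None),
]
littab, pooltab = allocate_literals(statements, 200)
print(littab)
print(pooltab)
```

Without the LTORG statement in this example, all literals would simply be placed after the last statement when END is reached.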
Pass 1 of the Assembler
• OPTAB: table of mnemonic opcodes and their class
• SYMTAB: contains symbol name and address
• LITTAB: table of literals used in the program

OPTAB contains three fields:
1. Mnemonic opcode
2. Class: IS (imperative), DS (declarative) or AD (assembler directive)
3. Mnemonic information: for class IS, the pair (machine opcode, instruction length); for classes DS and AD, the ID of a routine
OPTAB
SYMTAB
LITTAB
• Contains two fields:
  1. Literal
  2. Address
POOLTAB
Intermediate Code Format: Address, Opcode, Operands
• Opcode format: (statement class, code)
• Operand specification: (operand class, code), where the operand class is C (constant), L (literal) or S (symbol)
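The (class, code) pairs can be illustrated with a short sketch. Everything concrete here is an assumption made for the example: the numeric codes for the AD class, the use of table entry numbers as the code for symbols and literals, the omission of register operands, and the function names.

```python
# Assumed OPTAB excerpt: mnemonic -> (statement class, code).
# The IS codes are the machine opcodes from the slides; the AD code is an assumption.
OPTAB = {"START": ("AD", 1), "ADD": ("IS", 1), "MOVER": ("IS", 4)}


def operand_ic(operand, symtab, littab):
    """Return (operand class, code) for a constant, literal or symbol."""
    if operand.startswith("="):
        if operand not in littab:
            littab.append(operand)
        return ("L", littab.index(operand) + 1)   # code = LITTAB entry number
    if operand.isdigit():
        return ("C", int(operand))                # code = the constant itself
    if operand not in symtab:
        symtab.append(operand)
    return ("S", symtab.index(operand) + 1)       # code = SYMTAB entry number


def statement_ic(mnemonic, operand, symtab, littab):
    """Return the intermediate code for one statement (memory operand only)."""
    return (OPTAB[mnemonic], operand_ic(operand, symtab, littab))


symtab, littab = [], []
print(statement_ic("START", "101", symtab, littab))
print(statement_ic("MOVER", "ONE", symtab, littab))
print(statement_ic("ADD", "='5'", symtab, littab))
```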
PASS-1 Algorithm
Variants of IC
Variant - 2 of IC
Variant-2 of IC
Variants of IC Extra work in Pass I Simplified Pass II Pass I code occupies more memory than code of Pass II Does not simplify the task of Pass II or save much memory in some situation. IC is less compact Memory required for two passes would be better balanced So better memory utilization
Processing of Declarations & Assembler Directives
• Is it necessary to represent the address of each source statement in the intermediate code?
• Is it necessary to have a representation of DS statements and assembler directives in the intermediate code?
PASS – 2 of assembler
Error Reporting
Single pass Assembler for Intel x86
Some organizational issues
• Tables
• Source program and intermediate code
General Purpose Registers
Data group (16 bits each; bits 15-8 form the high byte, bits 7-0 the low byte):
• AX (AH, AL): Accumulator
• BX (BH, BL): Base
• CX (CH, CL): Counter
• DX (DH, DL): Data
Pointer and index group:
• SP: Stack Pointer
• BP: Base Pointer
• SI: Source Index
• DI: Destination Index
Segment Registers
Segment Registers
• A program may consist of many segments.
• Each segment may contain a part of the program's code, data or stack.
• An instruction uses a segment register and a 16-bit offset to address a memory operand.
• Each segment can be up to 64 Kbytes in size.
Addressing Modes

Addressing mode      Example                 Remarks
Immediate            MOV SUM, 1234H          Data = 1234H
Register             MOV SUM, AX             AX contains the data
Direct               MOV SUM, [1234H]        Data disp. = 1234H
Register indirect    MOV SUM, [BX]           Data disp. = (BX)
Register indirect    MOV SUM, CS:[BX]        Segment override: segment base = CS, data disp. = (BX)
Based                MOV SUM, 12H[BX]        Data disp. = 12H + (BX). BX or BP can be used (BX: data segment, BP: stack segment)
Indexed              MOV SUM, 34H[SI]        Data disp. = 34H + (SI). SI or DI can be used
Based and indexed    MOV SUM, 56H[SI][BX]    Data disp. = 56H + (SI) + (BX)
Instruction Formats of Intel 8088
• The mod and r/m fields specify the first operand
• The reg field specifies the second operand
• The opcode indicates which instruction format is applicable
• d indicates which operand is the destination operand
• w indicates whether 8-bit or 16-bit arithmetic is to be used
First operand (mod and r/m fields): its value can be in a register or in memory.

r/m   mod=00       mod=01          mod=10    mod=11, w=0   mod=11, w=1
000   (BX)+(SI)    (BX)+(SI)+d8    Note 2    AL            AX
001   (BX)+(DI)    (BX)+(DI)+d8    Note 2    CL            CX
010   (BP)+(SI)    (BP)+(SI)+d8    Note 2    DL            DX
011   (BP)+(DI)    (BP)+(DI)+d8    Note 2    BL            BX
100   (SI)         (SI)+d8         Note 2    AH            SP
101   (DI)         (DI)+d8         Note 2    CH            BP
110   Note 3       (BP)+d8         Note 2    DH            SI
111   (BX)         (BX)+d8         Note 2    BH            DI

Note 1: d8 denotes an 8-bit displacement.
Note 2: same as in the previous column, except d16 instead of d8.
Note 3: (BP)+DISP for indirect addressing, d16 for direct addressing.
Second operand (reg field)

reg   8-bit register (w=0)   16-bit register (w=1)
000   AL                     AX
001   CL                     CX
010   DL                     DX
011   BL                     BX
100   AH                     SP
101   CH                     BP
110   DH                     SI
111   BH                     DI
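The mod, reg and r/m fields are packed into a single byte. The slides give the field values but not the bit layout; the sketch below relies on the 8086/8088 layout of mod in the top 2 bits, reg in the middle 3 bits and r/m in the low 3 bits. The function name `modrm` is hypothetical.

```python
# 16-bit register codes from the reg table (w=1).
REG16 = {"AX": 0b000, "CX": 0b001, "DX": 0b010, "BX": 0b011,
         "SP": 0b100, "BP": 0b101, "SI": 0b110, "DI": 0b111}


def modrm(mod, reg, rm):
    """Pack mod (2 bits), reg (3 bits) and r/m (3 bits) into one byte."""
    if not (0 <= mod <= 3 and 0 <= reg <= 7 and 0 <= rm <= 7):
        raise ValueError("field out of range")
    return (mod << 6) | (reg << 3) | rm


# Register-to-register form: mod=11 selects the register columns of the r/m table.
print(hex(modrm(0b11, REG16["AX"], REG16["BX"])))
# Memory form: mod=00 with r/m=111 addresses the operand at (BX).
print(hex(modrm(0b00, REG16["AX"], 0b111)))
```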
Segment overrides

seg   Segment register
00    ES
01    CS
10    SS
11    DS

• By default, arithmetic and MOV instructions use the data segment.
• To use another segment, the instruction has to be preceded by a 1-byte segment override prefix, whose format is 001 seg 110.
Example: ADD AL, CS:12H[SI]
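The prefix format 001 seg 110 can be turned into byte values with a couple of lines. The function name is hypothetical; the seg codes are the ones in the table above.

```python
SEG = {"ES": 0b00, "CS": 0b01, "SS": 0b10, "DS": 0b11}


def segment_override_prefix(segment_register):
    """Build the prefix byte 001 seg 110 for the given segment register."""
    return 0b00100110 | (SEG[segment_register] << 3)


for name in ("ES", "CS", "SS", "DS"):
    print(name, hex(segment_override_prefix(name)))
```

For the example ADD AL, CS:12H[SI], the prefix byte for CS would precede the ADD instruction.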
Control Transfer Instructions
Statement Format: [Label:] Opcode Operand(s) ; Comment
Assembler Directives
• ORG, EQU, END: same as ORIGIN, EQU and END respectively
• LTORG is not used, as the 8088 supports immediate addressing
• PURGE
• SEGMENT
• ENDS
• ASSUME
PURGE
• A name defined using EQU can be 'undefined' through this directive.
• Example:
  XYZ DB ?
  ABC EQU XYZ    ; ABC represents the name XYZ
  PURGE ABC      ; ABC no longer represents XYZ
  ABC EQU 25     ; ABC now represents the value 25
SEGMENT and ENDS
• SEGMENT: indicates the beginning of a segment
• ENDS: indicates the end of a segment

ASSUME
• Informs the assembler that the address of the indicated segment will be present in <segment register>.
• Syntax: ASSUME <segment register> : <segment name>
• To nullify the effect of a previous ASSUME: ASSUME <segment register> : NOTHING
EXAMPLE
Other Directives
• PROC and ENDP: delimit the body of a procedure.
• NEAR and FAR: indicate whether a call to a procedure is a near call or a far call.

PUBLIC and EXTERN
• Normally, a symbol declared in one program can be used only from within that program.
• Use the PUBLIC directive if the symbol is to be accessed from other programs.
• Use the EXTERN directive in any other program that wishes to use the symbol: EXTERN <symbolic name> : <type>
Declarations
• DB: reserve a byte and initialize it. Example: A DB 25
• DW: reserve a word, but do not initialize it. Example: B DW ?
• DW: reserve a word and initialize it to A's offset. Example: ADD_A DW A
• DD: reserve double words. Example: C DD 6 DUP (0) reserves 6 double words, all initialized to 0
• DQ and DT reserve and initialize a quad-word (an area of 8 bytes) and an area of 10 bytes, respectively.
Analytic Operators
Provide a memory address, or information regarding the type and memory requirements of operands:
1. SEG: segment of an operand
2. OFFSET: offset of an operand
3. TYPE: numeric code that indicates the manner in which the operand is defined
4. SIZE: number of units in an operand
5. LENGTH: number of bytes allocated to the operand

TYPE codes:
Code   Operand type
1      Byte
2      Word
4      Double word
8      Quad word
10     Ten bytes
-1     Near instruction
-2     Far instruction

Examples:
MOV AX, OFFSET ABC
BUFFER DW 100 DUP (0)
MOV CX, LENGTH XYZ

Synthetic operators
• PTR: defines a new memory operand that has the same segment and offset as an existing operand but a different data type
• THIS: defines a new memory operand that has the same address as the next byte in the program
Problems of Single Pass Assembly
• Forward references:
  ○ Symbol is used as a data operand. Solution: TII (table of incomplete instructions)
  ○ Symbol is used as the destination address of a control transfer. Solution: use the keyword SHORT to indicate a short displacement
• Use of segment registers:
  ○ For each statement ASSUME <segment register> : <segment name>, store the pair (segment register, segment name) in the segment register table (SRTAB)

Provisions:
1. A new SRTAB is created when an ASSUME statement is encountered. It differs from the old SRTAB only in the entries for the segment registers named in the ASSUME statement. Many SRTABs may exist at any time; SRTAB_ARRAY is used to store them.
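The SRTAB provision can be sketched as a copy-and-update step. This is an illustration under assumptions: each SRTAB is modelled as a dictionary from segment register to segment name, ASSUME <register> : NOTHING is modelled by dropping the entry, and the function name `process_assume` and the segment names are hypothetical.

```python
def process_assume(srtab_array, current_no, pairs):
    """Create a new SRTAB from the current one and apply the ASSUME pairs.

    Returns the number (index) of the new SRTAB in srtab_array.
    """
    new_srtab = dict(srtab_array[current_no])     # old SRTAB stays untouched
    for register, segment in pairs.items():
        if segment == "NOTHING":
            new_srtab.pop(register, None)         # nullify an earlier ASSUME
        else:
            new_srtab[register] = segment
    srtab_array.append(new_srtab)
    return len(srtab_array) - 1


srtab_array = [{}]                                # SRTAB #0: nothing assumed yet
srtab_no = 0
srtab_no = process_assume(srtab_array, srtab_no, {"CS": "CODE", "DS": "DATA"})
srtab_no = process_assume(srtab_array, srtab_no, {"ES": "EXTRA", "DS": "NOTHING"})
print(srtab_no, srtab_array)
```

Keeping the old tables unchanged matters because a forward reference recorded earlier must later be assembled with the SRTAB that was current at that point.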
Problems of Single Pass Assembly
2. Instead of the TII, a forward reference table (FRT) is used. Each entry of the FRT contains:
a) Address of the instruction
b) Symbol to which the forward reference is made
c) Kind of reference:
   • T: analytic operator TYPE
   • D: data address
   • S: self-relative address
   • L: length
   • F: offset
d) Identity of the SRTAB that should be used for assembling the reference
Mnemonic Table and Symbol Table
Segment Register Table Array
FRT and CRT
Example
Example
Algorithm
Data structures used by the algorithm: SYMTAB, SRTAB_ARRAY, CRT, FRT and ERRTAB
• LC: location counter
• code_area: area for assembling the target program
• code_area_address: contains the address of code_area
• srtab_no: number of the current SRTAB
• stmt_no: number of the current statement
• SYMTAB_segment_entry: SYMTAB entry number of the current segment
• machine_code_buffer: area for constructing the code for one statement
Algorithm
Algorithm
Algorithm
Error reporting
Ch 3 Assembler in System programming
