Assembly Language x86 - Chapter 4 Notes
Data Transfer Instructions
- Operand Types
- Immediate: A constant integer (8, 16, or 32 bits). Value is encoded within the instruction.
- Register: The name of a register. Register name is converted to a number and encoded within the instruction.
- Memory: Reference to a location in memory. Memory address is encoded within the instruction, or a register holds the address of a memory location.
- Instruction Operand Notation
- reg8: 8-bit general-purpose register (AH, AL, BH, BL, CH, CL, DH, DL).
- reg16: 16-bit general-purpose register (AX, BX, CX, DX, SI, DI, SP, BP).
- reg32: 32-bit general-purpose register (EAX, EBX, ECX, EDX, ESI, EDI, ESP, EBP).
- reg: Any general-purpose register.
- sreg: 16-bit segment register (CS, DS, SS, ES, FS, GS).
- imm: 8-, 16-, or 32-bit immediate value.
- imm8: 8-bit immediate byte value.
- imm16: 16-bit immediate word value.
- imm32: 32-bit immediate doubleword value.
- reg/mem8: 8-bit operand, which can be an 8-bit general register or memory byte.
- reg/mem16: 16-bit operand, which can be a 16-bit general register or memory word.
- reg/mem32: 32-bit operand, which can be a 32-bit general register or memory doubleword.
- mem: An 8-, 16-, or 32-bit memory operand.
- Direct Memory Operands
- A direct memory operand is a named reference to storage in memory.
- The named reference (label) is automatically dereferenced by the assembler.
- Example:
```assembly
.data
var1 BYTE 10h
.code
mov al,var1      ; AL = 10h
mov al,[var1]    ; AL = 10h (alternate format)
```
- MOV Instruction
- Move from source to destination.
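- Syntax: MOV destination,source
- Rules:
- Both operands must be the same size.
- No more than one memory operand is permitted.
- CS, EIP, and IP cannot be the destination.
- An immediate value cannot be moved directly to a segment register.
- Example:
```assembly
.data
count BYTE 100
wVal WORD 2
.code
mov bl,count
mov ax,wVal
mov count,al
; mov al,wVal    ; error: size mismatch (8-bit destination, 16-bit source)
; mov ax,count   ; error: size mismatch (16-bit destination, 8-bit source)
; mov eax,count  ; error: size mismatch (32-bit destination, 8-bit source)
```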
- Zero Extension
- The MOVZX instruction fills (extends) the upper bits of the destination with zeros when copying a smaller value into a larger destination.
- The destination must be a register.
- Example:
```assembly
mov bl,10001111b
movzx ax,bl      ; zero extension: AX = 0000000010001111b (008Fh)
```
- Sign Extension
- The MOVSX instruction fills the upper bits of the destination with a copy of the source operand's sign bit.
- The destination must be a register.
- Example:
```assembly
mov bl,10001111b
movsx ax,bl      ; sign extension: AX = 1111111110001111b (FF8Fh)
```
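The difference between the two extensions can be checked with a small Python model. This is only a sketch for experimenting with bit patterns, not how the processor is implemented; the function names `movzx` and `movsx` are borrowed from the instructions for readability and are otherwise hypothetical.

```python
def movzx(value: int, src_bits: int) -> int:
    """Zero-extend: keep the low src_bits; every upper bit becomes 0."""
    return value & ((1 << src_bits) - 1)


def movsx(value: int, src_bits: int, dst_bits: int) -> int:
    """Sign-extend: copy the source sign bit into every upper bit."""
    value &= (1 << src_bits) - 1
    if value & (1 << (src_bits - 1)):      # sign bit set?
        value -= 1 << src_bits             # reinterpret as a negative number
    return value & ((1 << dst_bits) - 1)   # wrap into the destination width


bl = 0b10001111
print(f"{movzx(bl, 8):04X}")       # 008F
print(f"{movsx(bl, 8, 16):04X}")   # FF8F
```

Both calls start from the same byte; only the treatment of the upper bits differs, which is why the programmer has to pick the instruction that matches the intended signedness.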
- XCHG Instruction
- XCHG exchanges the values of two operands.
- At least one operand must be a register.
- No immediate operands are permitted.
- Example:
```assembly
.data
var1 WORD 1000h
var2 WORD 2000h
.code
xchg ax,bx       ; exchange 16-bit regs
xchg ah,al       ; exchange 8-bit regs
xchg var1,bx     ; exchange mem, reg
xchg eax,ebx     ; exchange 32-bit regs
; xchg var1,var2 ; error: two memory operands
```
- Direct-Offset Operands
- A constant offset is added to a data label to produce an effective address (EA).
- The address is dereferenced to get the value inside its memory location.
- Example:
```assembly
.data
arrayB BYTE 10h,20h,30h,40h
.code
mov al,arrayB+1      ; AL = 20h
mov al,[arrayB+1]    ; alternative notation
```
Addition and Subtraction
- INC and DEC Instructions
- Add 1 or subtract 1 from the destination operand.
- Operand may be a register or memory.
- INC destination: Destination = Destination + 1
- DEC destination: Destination = Destination - 1
- Examples:
```assembly
.data
myWord WORD 1000h
myDword DWORD 10000000h
.code
inc myWord       ; 1001h
dec myWord       ; 1000h
inc myDword      ; 10000001h
mov ax,00FFh
inc ax           ; AX = 0100h
mov ax,00FFh
inc al           ; AL = 00h, so AX = 0000h (no carry into AH)
```
- ADD and SUB Instructions
- ADD destination,source: Destination = Destination + Source
- SUB destination,source: Destination = Destination - Source
- Same operand rules as for the MOV instruction.
- Examples:
```assembly
.data
var1 DWORD 10000h
var2 DWORD 20000h
.code
mov eax,var1     ; 00010000h
add eax,var2     ; 00030000h
add ax,0FFFFh    ; 0003FFFFh
add eax,1        ; 00040000h
sub ax,1         ; 0004FFFFh
```
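The last three results depend on AX being only the low half of EAX: a 16-bit operation never carries into or borrows from the upper 16 bits. The following Python sketch replays those steps under that assumption; `write_low16` is a hypothetical helper introduced only for this illustration.

```python
def write_low16(eax: int, ax: int) -> int:
    """Replace the low 16 bits of a 32-bit value; the upper 16 bits are untouched."""
    return (eax & 0xFFFF0000) | (ax & 0xFFFF)


eax = 0x00030000
# add ax,0FFFFh : only AX takes part; any carry out of bit 15 is dropped
eax = write_low16(eax, (eax & 0xFFFF) + 0xFFFF)
print(f"{eax:08X}")   # 0003FFFF
# add eax,1 : full 32-bit addition, so the carry ripples into the upper half
eax = (eax + 1) & 0xFFFFFFFF
print(f"{eax:08X}")   # 00040000
# sub ax,1 : the borrow stays inside AX
eax = write_low16(eax, (eax & 0xFFFF) - 1)
print(f"{eax:08X}")   # 0004FFFF
```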
- NEG (Negate) Instruction
- Reverses the sign of an operand.
- Operand can be a register or memory operand.
- Example:
```assembly
.data
valB BYTE -1
valW WORD +32767
.code
mov al,valB      ; AL = -1
neg al           ; AL = +1
neg valW         ; valW = -32767
```
- The processor implements NEG using the internal operation: SUB 0,operand.
- Any nonzero operand causes the Carry flag to be set.
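As a sketch of the SUB 0,operand view, the Python function below computes the result and the Carry flag for a fixed-width operand. The function name `neg` and its return shape are assumptions made for illustration. The Overflow flag column goes slightly beyond the notes: the most negative value (for example 80h in 8 bits) has no positive counterpart, so negating it leaves it unchanged and sets OF.

```python
def neg(value: int, bits: int = 8):
    """Return (result, carry_flag, overflow_flag) for NEG on a bits-wide operand."""
    mask = (1 << bits) - 1
    value &= mask
    result = (0 - value) & mask              # SUB 0,operand, wrapped to the width
    carry = value != 0                       # any nonzero operand sets CF
    overflow = value == (1 << (bits - 1))    # only the most negative value overflows
    return result, carry, overflow


print(neg(0xFF))        # (1, True, False)      -1 becomes +1
print(neg(0x7FFF, 16))  # (32769, True, False)  +32767 becomes 8001h (-32767)
print(neg(0x80))        # (128, True, True)     -128 cannot be negated in 8 bits
```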
- Implementing Arithmetic Expressions
- HLL compilers translate mathematical expressions into assembly language.
- Example:
Rval = -Xval + (Yval - Zval)
```assembly
.data
Rval DWORD ?
Xval DWORD 26
Yval DWORD 30
Zval DWORD 40
.code
mov eax,Xval
neg eax          ; EAX = -26
mov ebx,Yval
sub ebx,Zval     ; EBX = -10
add eax,ebx
mov Rval,eax     ; -36
```
- Flags Affected by Arithmetic
- The ALU has a number of status flags that reflect the outcome of arithmetic (and bitwise) operations, based on the contents of the destination operand.
- Zero Flag (ZF): Set when the destination equals zero.
- Sign Flag (SF): Set when the destination is negative.
- Carry Flag (CF): Set when an unsigned value is out of range.
- Overflow Flag (OF): Set when a signed value is out of range.
- The MOV instruction never affects the flags.
- Zero Flag (ZF)
- The Zero flag is set when the result of an operation produces zero in the destination operand.
- A flag is set when it equals 1; a flag is clear when it equals 0.
- Examples:
```assembly
mov cx,1
sub cx,1         ; CX = 0, ZF = 1
mov ax,0FFFFh
inc ax           ; AX = 0, ZF = 1
inc ax           ; AX = 1, ZF = 0
```
- Sign Flag (SF)
- The Sign flag is set when the destination operand is negative.
- The flag is clear when the destination is positive or zero.
- The sign flag is a copy of the destination's highest bit.
- Examples:
```assembly
mov cx,0
sub cx,1         ; CX = -1, SF = 1
add cx,2         ; CX = 1, SF = 0
mov al,0
sub al,1         ; AL = 11111111b, SF = 1
add al,2         ; AL = 00000001b, SF = 0
```
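Both flags depend only on the bits left in the destination, which a few lines of Python can model. This is a sketch under the assumption that the result is first wrapped to the operand width; `zf_sf` is a hypothetical helper name.

```python
def zf_sf(result: int, bits: int):
    """Return (zero_flag, sign_flag) for a result stored in a bits-wide destination."""
    result &= (1 << bits) - 1               # keep only the bits that fit
    zero_flag = result == 0
    sign_flag = bool(result >> (bits - 1))  # copy of the highest bit
    return zero_flag, sign_flag


print(zf_sf(1 - 1, 16))        # (True, False)   sub cx,1 with CX = 1
print(zf_sf(0xFFFF + 1, 16))   # (True, False)   inc ax with AX = FFFFh wraps to 0
print(zf_sf(0 - 1, 8))         # (False, True)   sub al,1 with AL = 0 gives FFh
```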
- Signed and Unsigned Integers
- The arithmetic instructions covered here (ADD, SUB, INC, DEC, NEG) operate exactly the same on signed and unsigned integers.
- The CPU cannot distinguish between signed and unsigned integers.
- YOU, the programmer, are solely responsible for using the correct data type with each instruction.
- Overflow and Carry Flags
- How the ADD instruction affects OF and CF:
- CF = carry out of the MSB
- OF = (carry out of the MSB) XOR (carry into the MSB)
- How the SUB instruction affects OF and CF:
- The source is negated and added to the destination (its bits are inverted and the addition uses a carry-in of 1).
- CF = INVERT(carry out of the MSB)
- OF = (carry out of the MSB) XOR (carry into the MSB)
- MSB = Most Significant Bit (high-order bit)
- XOR = eXclusive-OR operation
- NEG = Negate (same as SUB 0,operand)
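The flag rules can be tried out with a small Python model. It is a sketch, not a description of the hardware: it assumes the common formulation in which OF is the carry into the MSB XORed with the carry out of the MSB, and that subtraction is performed as destination + NOT(source) + 1. The names `add_flags`, `add`, and `sub` are hypothetical.

```python
def add_flags(a: int, b: int, bits: int = 8, carry_in: int = 0):
    """Return (result, carry_out, overflow) for a + b + carry_in on bits-wide operands."""
    mask = (1 << bits) - 1
    low_mask = mask >> 1                      # every bit below the MSB
    a &= mask
    b &= mask
    total = a + b + carry_in
    carry_out = total >> bits                 # carry out of the MSB (0 or 1)
    carry_into_msb = ((a & low_mask) + (b & low_mask) + carry_in) >> (bits - 1)
    return total & mask, carry_out, carry_out ^ carry_into_msb


def add(a: int, b: int, bits: int = 8):
    """ADD: returns (result, CF, OF); CF is the carry out of the MSB."""
    return add_flags(a, b, bits)


def sub(a: int, b: int, bits: int = 8):
    """SUB: returns (result, CF, OF); CF is the inverted carry out."""
    mask = (1 << bits) - 1
    result, carry_out, overflow = add_flags(a, ~b & mask, bits, carry_in=1)
    return result, carry_out ^ 1, overflow


print(add(0xFF, 1))   # (0, 1, 0)    unsigned wraparound only
print(add(0x7F, 1))   # (128, 0, 1)  signed overflow only
print(sub(0x00, 1))   # (255, 1, 0)  borrow: unsigned result out of range
print(sub(0x80, 1))   # (127, 0, 1)  -128 - 1 does not fit in a signed byte
```

The first two calls show why CF and OF have to be read separately: the same ADD can be out of range for unsigned values, for signed values, for both, or for neither.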
- Carry Flag (CF)
- The Carry flag is set when the result of an operation generates an unsigned value that is out of range (too big or too small for the destination operand).
- Examples:
```assembly
mov al,0FFh
add al,1         ; CF = 1, AL = 00
; Try to go below zero:
mov al,0
sub al,1         ; CF = 1, AL = FF
```
- Overflow Flag (OF)
- The Overflow flag is set when the signed result of an operation is invalid or out of range.
- Examples:
```assembly
; Example 1
mov al,+127
add al,1         ; OF = 1, AL = ??
; Example 2
mov al,7Fh
add al,1         ; OF = 1, AL = 80h
```
- The two examples are identical at the binary level because 7Fh equals +127.
- To determine the value of the destination operand, it is often easier to calculate in hexadecimal.
- Rule of Thumb for Overflow Flag
- When adding two integers:
- The Overflow flag is only set when two positive operands are added and their sum is negative, or two negative operands are added and their sum is positive.
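The rule of thumb can be compared against the definition of signed overflow (the true sum does not fit in the destination) for every pair of 8-bit operands. The Python sketch below does that exhaustively; the helper names are hypothetical and the 8-bit width is an assumption chosen to keep the check small.

```python
def to_signed(value: int, bits: int = 8) -> int:
    """Interpret a bits-wide pattern as a two's-complement signed integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def overflow_by_signs(a: int, b: int, bits: int = 8) -> bool:
    """Rule of thumb: both operands share a sign and the sum has the other sign."""
    total = (a + b) & ((1 << bits) - 1)
    a_neg, b_neg, sum_neg = (to_signed(x, bits) < 0 for x in (a, b, total))
    return a_neg == b_neg and sum_neg != a_neg


def overflow_by_range(a: int, b: int, bits: int = 8) -> bool:
    """Definition: the true signed sum falls outside the destination's range."""
    true_sum = to_signed(a, bits) + to_signed(b, bits)
    return not (-(1 << (bits - 1)) <= true_sum < (1 << (bits - 1)))


agree = all(overflow_by_signs(a, b) == overflow_by_range(a, b)
            for a in range(256) for b in range(256))
print(agree)   # True
```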
- OFFSET Operator
- OFFSET returns the distance in bytes of a label from the beginning of its enclosing segment.
- Protected mode: 32 bits
- Real mode: 16 bits
- Protected-mode programs use only a single segment (flat memory model).
- The value returned by OFFSET is a pointer.
- Relating to C/C++:
```c++
// C++ version:
char array[1000];
char *p = array;
```
```assembly
; Assembly language:
.data
array BYTE 1000 DUP(?)
.code
mov esi,OFFSET array
```
- PTR Operator
- Overrides the default type of a label (variable).
- Provides the flexibility to access part of a variable.
- Little endian order is used when storing data in memory.
- Example:
```assembly
.data
myDouble DWORD 12345678h
.code
; mov ax,myDouble               ; error - why?
mov ax,WORD PTR myDouble        ; loads 5678h
mov WORD PTR myDouble,4321h     ; saves 4321h
```
- Little Endian Order
- Little endian order refers to the way Intel stores integers in memory.
- Multi-byte integers are stored in reverse order, with the least significant byte stored at the lowest address.
- For example, the doubleword 12345678h would be stored as 78 56 34 12.
- When integers are loaded from memory into registers, the bytes are automatically re-reversed into their correct positions.
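The byte layout, and the reason WORD PTR myDouble yields 5678h, can be reproduced in Python with `int.to_bytes` and `int.from_bytes`. This is only an illustration of the storage order; the variable names are made up for the sketch.

```python
value = 0x12345678
stored = value.to_bytes(4, "little")              # bytes in increasing address order
print(" ".join(f"{b:02X}" for b in stored))       # 78 56 34 12

# WORD PTR myDouble reads the two bytes at the lowest addresses.
low_word = int.from_bytes(stored[0:2], "little")
# WORD PTR [myDouble+2] would read the next two bytes.
high_word = int.from_bytes(stored[2:4], "little")
print(f"{low_word:04X} {high_word:04X}")          # 5678 1234
```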
- TYPE Operator
- The TYPE operator returns the size, in bytes, of a single element of a data declaration.
```assembly
.data
var1 BYTE ?
var2 WORD ?
var3 DWORD ?
var4 QWORD ?
.code
mov eax,TYPE var1    ; 1
mov eax,TYPE var2    ; 2
mov eax,TYPE var3    ; 4
mov eax,TYPE var4    ; 8
```
- LENGTHOF Operator
- The LENGTHOF operator counts the number of elements in a single data declaration.
```assembly
.data
byte1 BYTE 10,20,30              ; 3
array1 WORD 30 DUP(?),0,0        ; 32
array2 WORD 5 DUP(3 DUP(?))      ; 15
array3 DWORD 1,2,3,4             ; 4
digitStr BYTE "12345678",0       ; 9
.code
mov ecx,LENGTHOF array1          ; 32
```
- SIZEOF Operator
- The SIZEOF operator returns a value that is equivalent to multiplying LENGTHOF by TYPE.
```assembly
.data
byte1 BYTE 10,20,30              ; 3
array1 WORD 30 DUP(?),0,0        ; 64
array2 WORD 5 DUP(3 DUP(?))      ; 30
array3 DWORD 1,2,3,4             ; 16
digitStr BYTE "12345678",0       ; 9
.code
mov ecx,SIZEOF array1            ; 64
```
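The relationship SIZEOF = LENGTHOF × TYPE can be recomputed for the declarations above. The Python sketch below hard-codes each element count by hand (it does not parse MASM syntax), so the table `declarations` and the helper `sizeof` are illustrative assumptions.

```python
TYPE = {"BYTE": 1, "WORD": 2, "DWORD": 4, "QWORD": 8}   # bytes per element


def sizeof(type_name: str, lengthof: int) -> int:
    """SIZEOF = LENGTHOF * TYPE."""
    return lengthof * TYPE[type_name]


declarations = {
    "byte1":    ("BYTE", 3),
    "array1":   ("WORD", 30 + 2),                   # 30 DUP(?) plus two initializers
    "array2":   ("WORD", 5 * 3),                    # nested DUP multiplies the counts
    "array3":   ("DWORD", 4),
    "digitStr": ("BYTE", len("12345678") + 1),      # characters plus the null byte
}
sizes = {name: sizeof(t, n) for name, (t, n) in declarations.items()}
print(sizes)
# {'byte1': 3, 'array1': 64, 'array2': 30, 'array3': 16, 'digitStr': 9}
```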
- LABEL Directive
- Assigns an alternate label name and type to an existing storage location.
- LABEL does not allocate any storage of its own.
- Removes the need for the PTR operator.
```assembly
.data
dwList LABEL DWORD
wordList LABEL WORD
intList BYTE 00h,10h,00h,20h
.code
mov eax,dwList       ; 20001000h
mov cx,wordList      ; 1000h
mov dl,intList       ; 00h
```
Indirect Addressing
- Indirect Operands
- An indirect operand holds the address of a variable, usually an array or string.
- It can be dereferenced (just like a pointer).
```assembly
.data
val1 BYTE 10h,20h,30h
.code
mov esi,OFFSET val1
mov al,[esi]         ; dereference ESI (AL = 10h)
inc esi
mov al,[esi]         ; AL = 20h
inc esi
mov al,[esi]         ; AL = 30h
```
- Use PTR to clarify the size attribute of a memory operand.
```assembly
.data
myCount WORD 0
.code
mov esi,OFFSET myCount
; inc [esi]          ; error: ambiguous
inc WORD PTR [esi]   ; ok
```
- Array Sum Example
```assembly
.data
arrayW WORD 1000h,2000h,3000h
.code
mov esi,OFFSET arrayW
mov ax,[esi]
add esi,2            ; or: add esi,TYPE arrayW
add ax,[esi]
add esi,2
add ax,[esi]         ; AX = sum of the array
```
- Indirect operands are ideal for traversing an array.
- The register in brackets must be incremented by a value that matches the array type.
- Indexed Operands
- An indexed operand adds a constant to a register to generate an effective address.
- Two notational forms: [label + reg] and label[reg]
```assembly
.data
arrayW WORD 1000h,2000h,3000h
.code
mov esi,0
mov ax,[arrayW + esi]    ; AX = 1000h
mov ax,arrayW[esi]       ; alternate format
add esi,2
add ax,[arrayW + esi]
; etc.
```
- Index Scaling
- Scale an indirect or indexed operand to the offset of an array element by multiplying the index by the array's TYPE.
```assembly
.data
arrayB BYTE 0,1,2,3,4,5
arrayW WORD 0,1,2,3,4,5
arrayD DWORD 0,1,2,3,4,5
.code
mov esi,4
mov al,arrayB[esi*TYPE arrayB]     ; 04
mov bx,arrayW[esi*TYPE arrayW]     ; 0004
mov edx,arrayD[esi*TYPE arrayD]    ; 00000004
```
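To see why the scale factor matters, the Python sketch below lays arrayW out as raw little-endian bytes and reads a word at a scaled and at an unscaled offset. The helper `load_word` and the buffer name are hypothetical; the standard `struct` module is used only to pack and unpack 16-bit values.

```python
import struct

TYPE_WORD = 2
array_w = struct.pack("<6H", 0, 1, 2, 3, 4, 5)   # arrayW as it sits in memory


def load_word(memory: bytes, byte_offset: int) -> int:
    """Read a 16-bit little-endian value starting at byte_offset."""
    return struct.unpack_from("<H", memory, byte_offset)[0]


esi = 4                                          # an element index, not a byte offset
scaled = load_word(array_w, esi * TYPE_WORD)     # arrayW[esi*TYPE arrayW]
unscaled = load_word(array_w, esi)               # arrayW[esi]
print(scaled, unscaled)                          # 4 2
```

Without scaling, an index of 4 is treated as a byte offset and lands on element 2 of the word array, so the multiplication by TYPE is what turns an element number into an address.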
- Pointers
- Declare a pointer variable that contains the offset of another variable.
- Alternate format: ptrW DWORD OFFSET arrayW
```assembly
.data
arrayW WORD 1000h,2000h,3000h
ptrW DWORD arrayW
.code
mov esi,ptrW
mov ax,[esi]         ; AX = 1000h
```
JMP and LOOP Instructions
- JMP Instruction
- JMP is an unconditional jump to a label that is usually within the same procedure.
- Syntax: JMP target
- Logic: EIP = target
- Example:
```assembly
top:
.
.
jmp top
```
- A jump outside the current procedure must be to a special type of label called a global label.
- LOOP Instruction
- The LOOP instruction creates a counting loop.
- Syntax: LOOP target
- Logic:
- ECX = ECX - 1
- if ECX != 0, jump to target
- Implementation:
- The assembler calculates the distance, in bytes, between the offset of the following instruction and the offset of the target label (relative offset).
- The relative offset is added to EIP.
- LOOP Example
- The following loop calculates the sum of the integers 5 + 4 + 3 + 2 + 1:
```assembly
00000000  66 B8 0000       mov ax,0
00000004  B9 00000005      mov ecx,5
00000009  66 03 C1     L1: add ax,cx
0000000C  E2 FB            loop L1
0000000E
```
- When LOOP is assembled, the current location = 0000000E (offset of the next instruction).
- -5 (FBh) is added to the current location, causing a jump to location 00000009: 0000000E + FBh (sign-extended, that is, -5) = 00000009.
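The displacement byte in the listing can be recomputed from the two offsets. The Python sketch below assumes the one-byte signed displacement shown in the listing (E2 followed by a single byte); the function names are hypothetical.

```python
def loop_rel8(next_instruction: int, target: int) -> int:
    """Encode the signed distance from the next instruction to the target as one byte."""
    distance = target - next_instruction
    if not -128 <= distance <= 127:
        raise ValueError("target is out of range for a one-byte displacement")
    return distance & 0xFF


def loop_destination(next_instruction: int, rel8: int) -> int:
    """Apply a one-byte displacement to the offset of the next instruction."""
    signed = rel8 - 0x100 if rel8 & 0x80 else rel8   # sign-extend the byte
    return (next_instruction + signed) & 0xFFFFFFFF


print(f"{loop_rel8(0x0E, 0x09):02X}")          # FB
print(f"{loop_destination(0x0E, 0xFB):08X}")   # 00000009
```

Because the displacement is a single signed byte, the target of LOOP must lie within -128 to +127 bytes of the instruction that follows it.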
- Nested Loop
- If you need to code a loop within a loop, you must save the outer loop counter's ECX value.
```assembly
.data
count DWORD ?
.code
mov ecx,100          ; set outer loop count
L1:
mov count,ecx        ; save outer loop count
mov ecx,20           ; set inner loop count
L2:
. .
loop L2              ; repeat the inner loop
mov ecx,count        ; restore outer loop count
loop L1              ; repeat the outer loop
```
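The save and restore steps can be traced with a Python model in which a single variable plays the role of ECX. This is a sketch of the control flow only; `inner_iterations` is a counter added for the illustration and has no counterpart in the assembly code.

```python
ecx = 100                     # mov ecx,100   (outer loop count)
inner_iterations = 0
while True:                   # L1:
    count = ecx               # mov count,ecx (save outer loop count)
    ecx = 20                  # mov ecx,20    (inner loop count)
    while True:               # L2:
        inner_iterations += 1
        ecx -= 1              # loop L2
        if ecx == 0:
            break
    ecx = count               # mov ecx,count (restore outer loop count)
    ecx -= 1                  # loop L1
    if ecx == 0:
        break

print(inner_iterations)       # 2000
```

If the restore step were left out, ECX would always be 0 when `loop L1` executes; the decrement would wrap it to FFFFFFFFh, which is nonzero, so the outer loop would never terminate.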
- Summing an Integer Array
```assembly
.data
intarray WORD 100h,200h,300h,400h
.code
mov edi,OFFSET intarray      ; address of intarray
mov ecx,LENGTHOF intarray    ; loop counter
mov ax,0                     ; zero the accumulator
L1:
add ax,[edi]                 ; add an integer
add edi,TYPE intarray        ; point to next integer
loop L1                      ; repeat until ECX = 0
```
- The preceding code calculates the sum of an array of 16-bit integers.
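A Python trace of the same loop makes the roles of EDI, ECX, and AX explicit. It is a sketch that treats the array as a byte buffer starting at offset 0 (an assumption; the real OFFSET is whatever the assembler assigns) and wraps the accumulator to 16 bits the way AX would.

```python
import struct

TYPE_WORD = 2
int_array = struct.pack("<4H", 0x100, 0x200, 0x300, 0x400)   # intarray in memory

edi = 0                                # mov edi,OFFSET intarray (buffer-relative)
ecx = len(int_array) // TYPE_WORD      # mov ecx,LENGTHOF intarray
ax = 0                                 # mov ax,0
while True:                            # L1:
    element = struct.unpack_from("<H", int_array, edi)[0]
    ax = (ax + element) & 0xFFFF       # add ax,[edi]
    edi += TYPE_WORD                   # add edi,TYPE intarray
    ecx -= 1                           # loop L1
    if ecx == 0:
        break

print(f"{ax:04X}")                     # 0A00
```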
- Copying a String
```assembly
.data
source BYTE "This is the source string",0
target BYTE SIZEOF source DUP(0)
.code
mov esi,0                ; index register
mov ecx,SIZEOF source    ; loop counter
L1:
mov al,source[esi]       ; get char from source
mov target[esi],al       ; store it in the target
inc esi                  ; move to next character
loop L1                  ; repeat for entire string
```
64-Bit Programming
- The MOV instruction in 64-bit mode accepts operands of 8, 16, 32, or 64 bits.
- When you move an 8-, 16-, or 32-bit constant to a 64-bit register, the upper bits of the destination are cleared.
- When you move a memory operand into a 64-bit register, the results vary:
- A 32-bit move clears the high bits in the destination.
- An 8-bit or 16-bit move does not affect the high bits in the destination.
- MOVSXD sign-extends a 32-bit value into a 64-bit destination register.
- The OFFSET operator generates a 64-bit address.
- LOOP uses the 64-bit RCX register as a counter.
- RSI and RDI are the most common 64-bit index registers for accessing arrays.
- ADD and SUB affect the flags in the same way as in 32-bit mode.
- You can use scale factors with indexed operands.
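The different treatment of the high bits by 8-, 16-, and 32-bit moves, and the effect of MOVSXD, can be modeled in Python. This is a sketch with hypothetical helper names; the 8-bit case models only the low byte (AL-style registers), not AH-style high-byte registers.

```python
MASK64 = (1 << 64) - 1


def write_subregister(reg64: int, value: int, bits: int) -> int:
    """Model writing the low 8, 16, 32, or 64 bits of a 64-bit register."""
    if bits in (8, 16):
        low_mask = (1 << bits) - 1
        return (reg64 & ~low_mask & MASK64) | (value & low_mask)   # high bits kept
    if bits == 32:
        return value & 0xFFFFFFFF                                   # high 32 bits cleared
    if bits == 64:
        return value & MASK64
    raise ValueError("bits must be 8, 16, 32, or 64")


def movsxd(value32: int) -> int:
    """Sign-extend a 32-bit value to 64 bits."""
    value32 &= 0xFFFFFFFF
    if value32 & 0x80000000:
        value32 -= 1 << 32
    return value32 & MASK64


rax = MASK64                                            # start with all bits set
print(f"{write_subregister(rax, 0x80, 8):016X}")        # FFFFFFFFFFFFFF80
print(f"{write_subregister(rax, 0x8000, 16):016X}")     # FFFFFFFFFFFF8000
print(f"{write_subregister(rax, 0x80000000, 32):016X}") # 0000000080000000
print(f"{movsxd(0x80000000):016X}")                     # FFFFFFFF80000000
```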