Assembly Language Tutorial
About Assembly Language
Definition: Assembly language is a low-level programming language specific to a computer architecture, which is translated into machine code by an assembler (e.g., NASM, MASM).
Audience: Aimed at software programmers interested in learning assembly programming starting from scratch.
Prerequisites: Basic understanding of computer programming terminologies and familiarity with any programming language.
Advantages of Assembly Language
Enables direct interaction with the operating system, processor, and BIOS.
Exposes how data is represented in memory and how programs interact with external devices.
Allows fine-grained optimization: hand-written code can require less memory and execution time.
Ideal for time-critical applications and lower-level hardware manipulation (e.g., interrupt service routines).
Structure and Features of PC Hardware
Components: Processor, memory, registers.
Registers: Hold the data and addresses that the processor manipulates.
Memory: Measured in bits and bytes, which are grouped into words (2 bytes), doublewords (4 bytes), and quadwords (8 bytes).
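As a small illustration of these units, the NASM data-definition directives below each reserve one item of the corresponding size. This is only a sketch; the labels are hypothetical names chosen for the example.

```nasm
section .data
one_byte   db 0x8C                  ; byte: 8 bits
one_word   dw 0x1234                ; word: 2 bytes
one_dword  dd 0x12345678            ; doubleword: 4 bytes
one_qword  dq 0x1122334455667788    ; quadword: 8 bytes
```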
Number Systems
Binary Number System: Base-2 system using 0s and 1s; each bit represents a power of 2.
Example: 255 is represented as 11111111.
Hexadecimal Number System: Base-16 system using digits 0-9 and letters A-F; every two hex digits represent a byte.
Example: Binary 1000 1100 is hex 8C.
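To connect the two notations, NASM accepts the same constant written in several bases. The sketch below stores the two example values in each notation; the labels are hypothetical, and every line within a group assembles to the same byte.

```nasm
section .data
dec_255  db 255          ; decimal
bin_255  db 11111111b    ; binary, 'b' suffix
hex_255  db 0xFF         ; hexadecimal, '0x' prefix

bin_8c   db 10001100b    ; binary 1000 1100
hex_8c   db 8Ch          ; hexadecimal, 'h' suffix: 8*16 + 12 = 140
dec_8c   db 140          ; the same byte in decimal
```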
Assembly Language Environment Setup
Requires an assembler (NASM recommended) and an operating system (Linux).
Basic Syntax
Assembly programs consist of three sections:
Data Section: Initialized data or constants.
Syntax:
section .data
BSS Section: Uninitialized variables.
Syntax:
section .bss
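Text Section: Contains the actual code and starts with the declaration global main, which exports the entry-point label to the linker (main applies when linking through gcc; a program linked directly with ld uses _start by default).
Syntax: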
section .text
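Put together, the three sections might be laid out as in the following skeleton. It is a minimal sketch for 32-bit Linux; the file name and the labels greeting and buffer are hypothetical, and the program does nothing except exit.

```nasm
; skeleton.asm - minimal three-section layout (32-bit Linux, NASM syntax)
section .data
    greeting db 'hi', 0xa      ; initialized data (hypothetical label)

section .bss
    buffer   resb 64           ; 64 uninitialized bytes (hypothetical label)

section .text
    global main                ; export the entry label used in this tutorial
main:
    mov eax, 1                 ; system call number for sys_exit
    mov ebx, 0                 ; exit status 0
    int 0x80                   ; call the kernel
```

Assuming a toolchain with 32-bit support, something like nasm -f elf32 skeleton.asm followed by gcc -m32 -no-pie skeleton.o -o skeleton should build it; the exact flags may differ between systems.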
Assembly Language Statements
Types of statements:
Executable instructions (tell the processor what to do; each one generates a machine-language instruction).
Assembler directives (instructions for the assembler; they do not generate machine code).
Macros (text substitution mechanism).
Example of a basic instruction format:
[label] mnemonic [operands] [;comment]
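The fields of this format can be seen in a short sketch such as the one below, written for 32-bit Linux. The labels count and finish are hypothetical; the program increments a variable and uses its value as the exit status.

```nasm
section .data
count   dd 0                    ; label "count", directive dd, operand 0

section .text
    global main                 ; assembler directive, no label
main:                           ; a label on a line of its own
    inc dword [count]           ; mnemonic with one operand
    mov ebx, [count]            ; mnemonic with two operands; EBX = 1
finish: mov eax, 1              ; label, mnemonic, operands, and comment together
    int 0x80                    ; sys_exit with the status held in EBX
```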
Hello World Program Example
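section .text
    global main             ; export the entry point to the linker
main:
    mov edx, len            ; message length
    mov ecx, msg            ; address of the message
    mov ebx, 1              ; file descriptor 1 (stdout)
    mov eax, 4              ; system call number for sys_write
    int 0x80                ; call the kernel
    mov ebx, 0              ; exit status 0 (EBX would otherwise still hold 1)
    mov eax, 1              ; system call number for sys_exit
    int 0x80                ; call the kernel
section .data
msg db 'Hello, world!', 0xa ; the string, followed by a newline
len equ $ - msg             ; length of the string
Memory Segments in Assembly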
Instruction Code: Stored in the .text segment.
Data: Stored in the .data and .bss segments.
Stack: Used for function calls and local variables.
Registers in Assembly Language
Types include General, Control, and Segment Registers.
Examples:
EAX, EBX: General-purpose registers for data operations.
EIP: Instruction Pointer, points to the next instruction.
ESP, EBP: Stack Pointer and Base Pointer, used to manage the stack frame during function calls.
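How ESP, EBP, and EIP cooperate during a call can be sketched as follows. The example assumes 32-bit Linux and the common convention of pushing arguments right to left with the caller cleaning up; add_two is a hypothetical helper, not part of the original tutorial.

```nasm
; frame.asm - sketch of a stack frame (32-bit Linux, NASM syntax)
section .text
    global main

; add_two: hypothetical helper, returns first argument + second argument in EAX
add_two:
    push ebp                ; save the caller's base pointer
    mov  ebp, esp           ; EBP now marks this function's frame
    mov  eax, [ebp + 8]     ; first argument (above saved EBP and return address)
    add  eax, [ebp + 12]    ; second argument
    pop  ebp                ; restore the caller's base pointer
    ret                     ; pop the return address into EIP

main:
    push dword 4            ; second argument
    push dword 3            ; first argument
    call add_two            ; push the return address, then jump
    add  esp, 8             ; caller removes the two arguments
    mov  ebx, eax           ; use the sum (7) as the exit status
    mov  eax, 1             ; system call number for sys_exit
    int  0x80               ; call the kernel
```

After running the program, the shell's exit status (echo $? in most shells) should show the sum.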
Assembly System Calls
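System calls provide the interface between user applications and kernel services (e.g., sys_read, sys_write). On 32-bit Linux, the call number goes in EAX, the arguments go in EBX, ECX, and EDX, and int 0x80 transfers control to the kernel.
Example using sys_write:
mov eax, 4      ; system call number for sys_write
mov ebx, 1      ; file descriptor 1 (stdout)
mov ecx, msg    ; address of the data to write
mov edx, len    ; number of bytes to write
int 0x80        ; call the kernel
Addressing Modes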
Types of addressing modes include Register, Immediate, Direct Memory, and Indirect Memory.
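The four modes can be compared side by side in a short sketch like the one below, written for 32-bit Linux. The variable value is hypothetical; the program loads it in two different ways and exits with it as the status.

```nasm
section .data
value   dd 42                   ; hypothetical variable used below

section .text
    global main
main:
    mov eax, 7                  ; immediate: the operand is the constant 7
    mov ebx, eax                ; register: both operands are registers
    mov ecx, [value]            ; direct memory: load from the address named value
    mov esi, value              ; immediate again: ESI = address of value
    mov edx, [esi]              ; indirect memory: load from the address held in ESI
    mov ebx, edx                ; EBX = 42
    mov eax, 1                  ; system call number for sys_exit, status in EBX
    int 0x80                    ; call the kernel
```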
Example of Register Addressing:
MOV EAX, EBX ; Move content of EBX to EAX
Assembly Loops and Conditions
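Conditions and loops are built from comparison and jump instructions (e.g., CMP, JE, JMP).
Example of a loop:
MOV ECX, 10     ; loop counter
L1: DEC ECX     ; decrement the counter; sets the zero flag when it reaches 0
JNZ L1          ; jump back to L1 until ECX is 0
Macros in Assembly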
Macros provide a way to define reusable code blocks.
Example:
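; write_string takes 2 parameters: the string address and its length
%macro write_string 2
    mov eax, 4      ; system call number for sys_write
    mov ebx, 1      ; file descriptor 1 (stdout)
    mov ecx, %1     ; first parameter: address of the string
    mov edx, %2     ; second parameter: length of the string
    int 0x80        ; call the kernel
%endmacro
; Invocation, assuming msg and len are defined as in the Hello World program:
write_string msg, len
File Management in Assembly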
System calls for file handling include sys_open, sys_read, and sys_write.
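A possible way to combine these calls is sketched below: the program opens a file, reads up to 64 bytes, and writes them to standard output. It assumes the 32-bit Linux call numbers (sys_read 3, sys_write 4, sys_open 5); the file name notes.txt and the labels are hypothetical, and sys_close (6) is added here only to release the descriptor.

```nasm
; readfile.asm - sketch: print the start of a file (32-bit Linux, NASM syntax)
section .data
    path db 'notes.txt', 0      ; hypothetical file name, NUL-terminated

section .bss
    buf  resb 64                ; read buffer

section .text
    global main
main:
    mov eax, 5                  ; system call number for sys_open
    mov ebx, path               ; file name
    mov ecx, 0                  ; flags: O_RDONLY
    int 0x80                    ; EAX = file descriptor, or negative on error
    cmp eax, 0
    jl  done                    ; skip everything if the open failed
    mov esi, eax                ; keep the descriptor

    mov eax, 3                  ; system call number for sys_read
    mov ebx, esi                ; descriptor
    mov ecx, buf                ; destination buffer
    mov edx, 64                 ; maximum number of bytes
    int 0x80                    ; EAX = bytes read
    cmp eax, 0
    jle close_file              ; nothing to print (end of file or error)

    mov edx, eax                ; byte count for sys_write
    mov eax, 4                  ; system call number for sys_write
    mov ebx, 1                  ; file descriptor 1 (stdout)
    mov ecx, buf                ; data to write
    int 0x80

close_file:
    mov eax, 6                  ; system call number for sys_close
    mov ebx, esi                ; descriptor
    int 0x80
done:
    mov eax, 1                  ; system call number for sys_exit
    mov ebx, 0                  ; exit status 0
    int 0x80
```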
Memory Management in Assembly
Use the sys_brk() system call for dynamic memory allocation; it moves the program break, which marks the end of the data segment.
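One way this might look in practice is sketched below: the program queries the current break, asks the kernel to move it 4096 bytes higher, and then writes into the new block. It assumes 32-bit Linux, where sys_brk is call number 45 and returns the resulting break address; the block size and labels are arbitrary choices for the example.

```nasm
; brk.asm - sketch: grow the data segment by 4096 bytes (32-bit Linux, NASM syntax)
section .text
    global main
main:
    mov eax, 45                 ; system call number for sys_brk
    xor ebx, ebx                ; argument 0: only query the current break
    int 0x80                    ; EAX = current program break
    mov esi, eax                ; start of the new block

    lea ebx, [esi + 4096]       ; requested new break
    mov eax, 45                 ; sys_brk again
    int 0x80                    ; EAX = break after the call
    cmp eax, ebx
    jb  fail                    ; break is below the requested address

    mov dword [esi], 1234       ; the block starting at ESI is now usable
    mov ebx, 0                  ; exit status 0
    jmp quit
fail:
    mov ebx, 1                  ; exit status 1
quit:
    mov eax, 1                  ; system call number for sys_exit
    int 0x80
```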
These key concepts provide a solid foundation for understanding and writing assembly language programs.