Assembly Language Tutorial

About Assembly Language

  • Definition: Assembly language is a low-level programming language specific to a computer architecture, which is translated into machine code by an assembler (e.g., NASM, MASM).

  • Audience: Software programmers who want to learn assembly programming from scratch.

  • Prerequisites: Basic understanding of computer programming terminology and familiarity with at least one programming language.

Advantages of Assembly Language

  • Enables direct interaction with the operating system, processor, and BIOS.

  • Exposes how data is represented in memory and how programs interact with external devices.

  • Allows hand optimization, so carefully written programs can use less memory and execution time.

  • Ideal for time-critical applications and lower-level hardware manipulation (e.g., interrupt service routines).

Structure and Features of PC Hardware

  • Components: Processor, memory, registers.

  • Registers: Hold the data and addresses that the processor manipulates.

  • Memory: Measured in bits and bytes (8 bits), and grouped into larger units: words (2 bytes), doublewords (4 bytes), and quadwords (8 bytes).

Number Systems

  • Binary Number System: Base-2 system using 0s and 1s; each bit represents a power of 2.

    • Example: 255 is represented as 11111111.

  • Hexadecimal Number System: Base-16 system using digits 0-9 and letters A-F; each hex digit represents four bits, so every two hex digits represent a byte.

    • Example: Binary 1000 1100 is Hex 8C.
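
As a small illustration of the two number systems, the fragment below writes the example values as NASM numeric constants. It is only a sketch: the labels (dec_val, bin_val, and so on) are made up for this example.

```nasm
; The same byte value written in different bases (NASM syntax).
section .data
dec_val   db 140          ; decimal
bin_val   db 10001100b    ; binary, 'b' suffix (1000 1100)
hex_val   db 0x8C         ; hexadecimal, '0x' prefix
hex_val2  db 8Ch          ; hexadecimal, 'h' suffix (must begin with a digit)
max_byte  db 11111111b    ; decimal 255, the largest unsigned byte
```

The first four definitions all store the same byte, since binary 1000 1100 = hex 8C = decimal 140.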

Assembly Language Environment Setup

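  • Requires an assembler (NASM is recommended) and a Linux system on which to assemble, link, and run the programs. The examples here use the 32-bit Linux system call interface (int 0x80).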

Basic Syntax

  • Assembly programs consist of three sections:

    1. Data Section: Initialized data or constants.

      • Syntax: section .data

    2. BSS Section: Uninitialized variables.

      • Syntax: section .bss

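    3. Text Section: Contains the actual code. It begins with a global directive that exports the entry-point symbol (_start is the default symbol that ld looks for; main is used when linking through gcc).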

      • Syntax: section .text
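
To show how the three sections fit together, here is a minimal skeleton that does nothing except exit. It is a sketch that assumes 32-bit Linux and an _start entry point linked with ld; the names greeting, greeting_len, and buffer are placeholders invented for this example.

```nasm
section .data
    greeting     db 'Hi', 0xa        ; initialized bytes
    greeting_len equ $ - greeting    ; constant computed by the assembler

section .bss
    buffer       resb 64             ; reserves 64 uninitialized bytes

section .text
    global _start                    ; export the entry point for the linker
_start:
    mov eax, 1                       ; system call number (sys_exit)
    xor ebx, ebx                     ; exit status 0
    int 0x80                         ; call the kernel
```

Assuming the file is saved as skeleton.asm, it can typically be built with nasm -f elf32 skeleton.asm followed by ld -m elf_i386 -o skeleton skeleton.o.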

Assembly Language Statements

  • Types of statements:

    1. Executable instructions (tell the processor what to do; each one assembles to a machine instruction).

    2. Assembler directives (instructions for the assembler, do not generate machine code).

    3. Macros (text substitution mechanism).

  • General format of a statement (fields in square brackets are optional):

    • [label] mnemonic [operands] [;comment]
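
The three statement types and the statement format can be seen side by side in a short annotated fragment. This is an illustrative sketch for 32-bit Linux; EXIT_CODE is a hypothetical name, and %define is used here as the simplest form of NASM macro.

```nasm
%define EXIT_CODE 0          ; macro: EXIT_CODE is replaced by 0 as text

section .text                ; assembler directive, generates no machine code
    global _start            ; assembler directive, exports a symbol
_start:                      ; label
    mov ebx, EXIT_CODE       ; executable instruction: mnemonic, operands, comment
    mov eax, 1               ; executable instruction: sys_exit number
    int 0x80                 ; executable instruction: call the kernel
```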

Hello World Program Example

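section .text
global _start           ; entry point for ld (use main when linking via gcc)
_start:
    mov edx, len        ; message length in bytes
    mov ecx, msg        ; address of the message
    mov ebx, 1          ; file descriptor 1 (stdout)
    mov eax, 4          ; system call number (sys_write)
    int 0x80            ; call the kernel
    xor ebx, ebx        ; exit status 0
    mov eax, 1          ; system call number (sys_exit)
    int 0x80            ; call the kernel

section .data
msg db 'Hello, world!', 0xa   ; string followed by a newline
len equ $ - msg               ; length computed by the assembler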

Memory Segments in Assembly

  • Instruction Code: Stored in .text segment.

  • Data: Stored in .data and .bss segments.

  • Stack: Used for function calls and local variables.

Registers in Assembly Language

  • Types include General, Control, and Segment Registers.

  • Examples:

    1. EAX, EBX: General-purpose registers for data operations.

    2. EIP: Instruction Pointer, holds the address of the next instruction to execute.

    3. ESP, EBP: Stack Pointer and Base Pointer. ESP points to the top of the stack, and EBP anchors the current stack frame during function calls.
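
The roles of ESP, EBP, and EIP are easiest to see in a function call. The sketch below assumes 32-bit Linux and a cdecl-style convention (arguments pushed right to left, caller cleans up); add_two is a hypothetical routine written for this example.

```nasm
section .text
    global _start

; add_two: returns the sum of its two stack arguments in EAX
add_two:
    push ebp                 ; save the caller's frame pointer
    mov  ebp, esp            ; EBP now anchors this stack frame
    mov  eax, [ebp + 8]      ; first argument
    add  eax, [ebp + 12]     ; plus second argument
    pop  ebp                 ; restore the caller's frame pointer
    ret                      ; pop the return address into EIP

_start:
    push dword 5             ; second argument
    push dword 7             ; first argument
    call add_two             ; pushes the return address, then jumps
    add  esp, 8              ; caller removes the two arguments
    mov  ebx, eax            ; exit status = 7 + 5 = 12
    mov  eax, 1              ; system call number (sys_exit)
    int  0x80                ; call the kernel
```

After running the program, echo $? in the shell should report 12.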

Assembly System Calls

  • System calls provide the interface between user applications and kernel services (e.g., sys_read, sys_write).

  • Example for using sys_write:

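mov eax, 4      ; system call number (sys_write)
mov ebx, 1      ; file descriptor 1 (stdout)
mov ecx, msg    ; address of the buffer to write
mov edx, len    ; number of bytes to write
int 0x80        ; call the kernel; the result is returned in EAX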
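
The same register convention applies to sys_read. As a sketch, the program below reads one chunk from standard input and echoes it back, assuming the 32-bit Linux call numbers (3 for sys_read, 4 for sys_write, 1 for sys_exit); the buffer name buf and its 64-byte size are arbitrary choices for this example.

```nasm
section .bss
    buf resb 64             ; input buffer

section .text
    global _start
_start:
    mov eax, 3              ; system call number (sys_read)
    mov ebx, 0              ; file descriptor 0 (stdin)
    mov ecx, buf            ; where to store the input
    mov edx, 64             ; maximum number of bytes to read
    int 0x80                ; EAX = bytes read, or a negative error code

    cmp eax, 0
    jle .done               ; nothing read (end of input or error)

    mov edx, eax            ; write exactly the bytes that were read
    mov eax, 4              ; system call number (sys_write)
    mov ebx, 1              ; file descriptor 1 (stdout)
    mov ecx, buf
    int 0x80
.done:
    mov eax, 1              ; system call number (sys_exit)
    xor ebx, ebx            ; exit status 0
    int 0x80
```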

Addressing Modes

  • Types of addressing modes include Register, Immediate, Direct Memory, Indirect.

  • Example of Register Addressing:

MOV EAX, EBX    ; Move content of EBX to EAX
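
The other addressing modes listed above can be sketched in the same way. The fragment assumes 32-bit Linux and NASM syntax, where square brackets mean "the contents of memory at this address"; count and table are hypothetical variables.

```nasm
section .data
    count dd 10
    table dd 1, 2, 3, 4

section .text
    global _start
_start:
    mov eax, 5               ; immediate: the constant 5 is part of the instruction
    mov ebx, eax             ; register: copy EAX into EBX
    mov ecx, [count]         ; direct memory: load the value stored at count (10)
    add dword [count], 1     ; direct memory as destination; size must be stated
    mov esi, table           ; ESI = address of table
    mov edx, [esi]           ; indirect: load the first element (1)
    mov edx, [esi + 8]       ; indirect with displacement: third element (3)

    mov ebx, edx             ; exit status = 3
    mov eax, 1               ; system call number (sys_exit)
    int 0x80
```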

Assembly Loops and Conditions

  • Conditions and loops are built from a comparison (CMP) followed by a conditional jump (e.g., JE, JNZ), or from an unconditional jump (JMP).

  • Example of a loop:

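MOV ECX, 10     ; loop counter = 10
L1: DEC ECX     ; decrement the counter; sets the zero flag when it reaches 0
JNZ L1          ; repeat while ECX is not zero (10 iterations)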
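
A loop that uses CMP, JE, and JMP together might look like the following sketch, which adds the integers 1 through 10. It assumes 32-bit Linux and returns the sum as the exit status so the result is visible; the labels .next and .finished are NASM local labels chosen for this example.

```nasm
section .text
    global _start
_start:
    xor eax, eax            ; sum = 0
    mov ecx, 1              ; i = 1
.next:
    cmp ecx, 11             ; compare i with 11
    je  .finished           ; leave the loop when i == 11
    add eax, ecx            ; sum += i
    inc ecx                 ; i++
    jmp .next               ; unconditional jump back to the test
.finished:
    mov ebx, eax            ; exit status = 55
    mov eax, 1              ; system call number (sys_exit)
    int 0x80
```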

Macros in Assembly

  • Macros provide a way to define reusable code blocks.

  • Example:

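%macro write_string 2   ; macro name, followed by the number of parameters
    mov eax, 4          ; system call number (sys_write)
    mov ebx, 1          ; file descriptor 1 (stdout)
    mov ecx, %1         ; first parameter: address of the string
    mov edx, %2         ; second parameter: length of the string
    int 0x80            ; call the kernel
%endmacro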
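
Once defined, the macro is invoked like an instruction, with its arguments separated by commas. The sketch below repeats the definition so that it assembles on its own and assumes 32-bit Linux; msg1, len1, msg2, and len2 are names invented for this example.

```nasm
%macro write_string 2
    mov eax, 4              ; system call number (sys_write)
    mov ebx, 1              ; file descriptor 1 (stdout)
    mov ecx, %1             ; address of the string
    mov edx, %2             ; length of the string
    int 0x80
%endmacro

section .data
    msg1 db 'First line', 0xa
    len1 equ $ - msg1
    msg2 db 'Second line', 0xa
    len2 equ $ - msg2

section .text
    global _start
_start:
    write_string msg1, len1 ; expands to the five instructions above
    write_string msg2, len2 ; expanded again with different arguments
    mov eax, 1              ; system call number (sys_exit)
    xor ebx, ebx            ; exit status 0
    int 0x80
```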

File Management in Assembly

  • System calls for file handling include sys_open, sys_read, and sys_write.
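
As a rough sketch of how these calls combine, the program below opens a file read-only, reads up to 128 bytes, and writes them to standard output. It assumes the 32-bit Linux call numbers (5 for sys_open, 3 for sys_read, 4 for sys_write, 6 for sys_close); the file name input.txt and the buffer size are placeholders.

```nasm
section .data
    path db 'input.txt', 0  ; hypothetical file name, NUL-terminated

section .bss
    buf resb 128

section .text
    global _start
_start:
    mov eax, 5              ; system call number (sys_open)
    mov ebx, path           ; file name
    mov ecx, 0              ; flags: O_RDONLY
    int 0x80
    test eax, eax
    js  .fail               ; a negative result is an error code
    mov esi, eax            ; keep the file descriptor

    mov eax, 3              ; system call number (sys_read)
    mov ebx, esi
    mov ecx, buf
    mov edx, 128
    int 0x80
    cmp eax, 0
    jle .close              ; empty file or read error

    mov edx, eax            ; number of bytes actually read
    mov eax, 4              ; system call number (sys_write)
    mov ebx, 1              ; file descriptor 1 (stdout)
    mov ecx, buf
    int 0x80
.close:
    mov eax, 6              ; system call number (sys_close)
    mov ebx, esi
    int 0x80
    xor ebx, ebx            ; exit status 0
    jmp .exit
.fail:
    mov ebx, 1              ; exit status 1: the file could not be opened
.exit:
    mov eax, 1              ; system call number (sys_exit)
    int 0x80
```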

Memory Management in Assembly

  • Use the sys_brk system call for dynamic memory allocation; it moves the program break, the end of the data segment.
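
A minimal sketch of growing the heap with sys_brk follows. It assumes 32-bit Linux, where sys_brk is call number 45, a request of 0 returns the current break, and a failed request returns the unchanged break; the 4096-byte size is an arbitrary choice.

```nasm
section .text
    global _start
_start:
    mov eax, 45             ; system call number (sys_brk)
    xor ebx, ebx            ; request 0: just report the current break
    int 0x80
    mov esi, eax            ; ESI = start of the new block

    lea ebx, [eax + 4096]   ; ask for the break to move up by 4096 bytes
    mov eax, 45
    int 0x80
    cmp eax, esi
    jbe .fail               ; break did not move, so allocation failed

    mov dword [esi], 0x12345678 ; the new memory can now be written
    xor ebx, ebx            ; exit status 0
    jmp .exit
.fail:
    mov ebx, 1              ; exit status 1
.exit:
    mov eax, 1              ; system call number (sys_exit)
    int 0x80
```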


These key concepts provide a solid foundation for understanding and writing assembly language programs.