Assembler Directives

One of the main benefits of assemblers is that they allow you to write programs in a more human-readable form but still have them translated to something understandable by the hardware. While assemblers do this translation from "words" to bytes for regular instructions, there are also so-called pseudo instructions in the form of assembler directives. They are called pseudo instructions because they will not end up as actual instructions in the assembled program but rather instruct the assembler to do certain things.

This page gives an overview of the most useful or needed assembler directives. For a full reference, see the documentation of the GNU assembler.

Assembler directives may generally be used at any position in the code. It is, however, advisable to develop and maintain a clear structure (e.g., .global declarations always at the top/always directly before the associated label, .equ directives always at the top, ...).


Section Directives

As explained on the page about Memory, the memory space of a running program is divided into different sections, each with different purposes. To tell the assembler which section some content should be placed in, the following directives are used:

.text
.data
.bss

On encountering such a directive, the assembler will place the subsequent code (until reaching a different section directive) in the defined section. If no directive is specified, it will default to the text section. Each of the directives may be used multiple times in a single file (however, it is advisable to have clear separation and structure of code for different sections).
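
As a minimal sketch of how these directives might structure a single file (the labels counter, scratch, and main are hypothetical names chosen for illustration; the data directives used here are explained further below):

.data
counter:
    .quad 42                        # initialized, writable variable

.bss
scratch:
    .skip 64                        # 64 bytes, zero-initialized

.text
.global main
main:
    movq    counter(%rip), %rax     # load the value 42 from the data section
    movq    %rax, scratch(%rip)     # store it into the bss buffer
    movq    $0, %rax                # return value 0
    ret

Each directive switches the section for everything that follows it, so the two variables end up outside the text section while the instructions of main are placed inside it.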


Defining Constants

.equ NAME, EXPRESSION

The .equ directive can be used to define symbolic names for expressions, such as numeric constants. Any occurrences of the symbolic name are then replaced by the given expression during assembly. An example usage could be as follows:

.equ ARR_SIZE, 1024

    # ...
    subq    $ARR_SIZE, %rsp
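
Since the second operand is an expression, a constant may also be computed from other constants. The following sketch assumes hypothetical names ELEM_SIZE, ARR_LEN, and ARR_BYTES:

.equ ELEM_SIZE, 8
.equ ARR_LEN, 128
.equ ARR_BYTES, ELEM_SIZE * ARR_LEN     # evaluated by the assembler: 1024

    # ...
    subq    $ARR_BYTES, %rsp            # reserve 1024 bytes on the stack
    # ...
    addq    $ARR_BYTES, %rsp            # release them again

Changing ARR_LEN in one place then updates every use of ARR_BYTES on the next assembly.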

Declaring (Global) Variables

.byte VALUE
.word VALUE
.long VALUE
.quad VALUE

The .byte, .word, .long, and .quad directives can be used to reserve and initialize memory for variables and/or constants of the given size (1, 2, 4, and 8 bytes, respectively). Just as the assembler translates instructions into bytes of program memory, these directives also produce bytes in program memory. The .word directive, for instance, reserves two bytes of memory holding the value specified as part of the directive. Whether these bytes will be writable depends on the section in which the directive appears. Each directive accepts more than one value as a comma-separated list.

FOO:
    .byte 0xAA, 0xBB, 0xCC      # three bytes, starting at address FOO
BAR:
    .word 2718, 2818            # two words, starting at address BAR
BAZ:
    .long 0xC001CAFE            # a single long, at address BAZ
QUX:
    .quad 0xDEADBEEFBAADF00D    # a single quad, at address QUX

    # ...
    leaq    FOO(%rip), %rax     # load address of FOO into rax
    movb    1(%rax), %r15b      # move content of second byte of FOO into r15b

A Note on Endianness

Note that x86 is a little-endian architecture: the least significant byte of a value is stored at the lowest address, so a value like 0x12345678 will actually end up in memory as 78 56 34 12. This usually goes unnoticed, as the assembler takes care of endianness for you when placing the requested values in program memory. Taking endianness into account, it should be clear that the following four statements will produce the exact same bytes in memory:

.byte 0x0D, 0xF0, 0xAD, 0xBA, 0xEF, 0xBE, 0xAD, 0xDE
.word 0xF00D, 0xBAAD, 0xBEEF, 0xDEAD
.long 0xBAADF00D, 0xDEADBEEF
.quad 0xDEADBEEFBAADF00D
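
The byte order becomes visible as soon as a multi-byte value is read one byte at a time. A small sketch, assuming a hypothetical label VALUE placed in the data section:

.data
VALUE:
    .long 0x12345678                # stored in memory as 78 56 34 12

.text
    # ...
    leaq    VALUE(%rip), %rax       # load address of VALUE into rax
    movb    (%rax), %cl             # cl = 0x78, the least significant byte
    movb    3(%rax), %dl            # dl = 0x12, the most significant byte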

Reserving Memory

.skip AMOUNT [, FILL]

Sometimes it may be necessary to reserve memory in bigger chunks than quadwords. This can be achieved with the .skip directive, which simply instructs the assembler to skip as many bytes as indicated before placing the next bytes. The directive can optionally be extended with a comma and a fill value that each skipped byte is set to; without it, the bytes are filled with zero.

BUFFER:
    .skip 1024      # reserve 1024 bytes
INITIALIZED_BUFFER:
    .skip 512, 0x1  # reserve 512 bytes and initialize each to 0x1

    # ...
    leaq    BUFFER(%rip), %rsi    # load address of BUFFER into rsi
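
Which section a buffer belongs in depends on its contents. The sketch below, using the hypothetical names BUF_SIZE, INPUT, and PADDING, places a zero-filled buffer in the bss section and a buffer with a non-zero fill value in the data section:

.equ BUF_SIZE, 256

.bss
INPUT:
    .skip BUF_SIZE              # 256 zeroed bytes

.data
PADDING:
    .skip 16, 0x20              # 16 bytes, each initialized to 0x20 (ASCII space)

.text
    # ...
    leaq    INPUT(%rip), %rdi   # rdi = address of the first byte of INPUT
    movb    $0x41, (%rdi)       # write 'A' into the first byte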

Strings

.ascii "STRING"
.asciz "STRING"

Many programs require some form of I/O and for that usually need to use strings. In many higher-level programming languages, including C, strings are simply arrays of characters, terminated by a zero byte. The .ascii and .asciz directives will take the given string and place its ASCII representation (as an array of characters) into the program memory. The two directives only differ in that .asciz will append a zero byte to the given string. Thus, the following three snippets are equivalent:

.ascii "Hello!"
.byte 0x00

.asciz "Hello!"

.byte 'H', 'e', 'l', 'l', 'o', '!', 0x0
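
The terminating zero byte is what lets code find the end of a string. As an illustrative sketch (the labels GREETING, count_loop, and count_done are hypothetical), the following loop counts the characters of a zero-terminated string:

.data
GREETING:
    .asciz "Hello!"                 # 'H','e','l','l','o','!',0

.text
    # ...
    leaq    GREETING(%rip), %rsi    # rsi = address of the first character
    movq    $0, %rax                # rax = number of characters counted so far
count_loop:
    cmpb    $0, (%rsi, %rax)        # reached the terminating zero byte?
    je      count_done
    incq    %rax                    # no: count this character
    jmp     count_loop
count_done:
    # rax now holds 6, the number of characters before the zero byte

Had the string been defined with .ascii instead, the loop would run past the end of the string until it happened to find a zero byte.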

Global Symbols

.global label

By default, labels defined in your program are only visible within the same file. However, some symbols need to be made visible to the outside, for example a subroutine that should be callable from other files or, most prominently, the main symbol, which makes the entry point of your program known.

The .global directive tells the assembler to publish the given label in the symbol table, which can be seen as a sort of table of contents contained in the binary file.
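
A minimal sketch of the difference, assuming a hypothetical subroutine named helper that is only needed inside this file:

.text

.global main                # make main visible outside this file
main:
    call    helper          # helper can be called from within this file
    movq    $0, %rax        # return value 0
    ret

helper:                     # no .global: not visible to other files
    ret

Another file could refer to main, but an attempt to call helper from there would fail at link time with an undefined reference.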
