Assembler Directives
One of the main benefits of assemblers is that they allow you to write programs in a more human-readable form but still have them translated to something understandable by the hardware. While assemblers do this translation from "words" to bytes for regular instructions, there are also so-called pseudo instructions in the form of assembler directives. They are called pseudo instructions because they will not end up as actual instructions in the assembled program but rather instruct the assembler to do certain things.
This page gives an overview of the most useful or needed assembler directives. For a full reference, see the documentation of the GNU assembler.
Assembler directives may generally be used at any position in the code. It is, however, advisable to develop and maintain a clear structure (e.g., .global declarations always at the top or always directly before the associated label, .equ directives always at the top, ...).
Section Directives
As explained on the page about Memory, the memory space of a running program is divided into different sections, each with a different purpose. Section directives tell the assembler which section some content should be placed in; with the GNU assembler, the most commonly used ones are .text (code), .data (initialized data), and .bss (zero-initialized data).
On encountering such a directive, the assembler will place the subsequent code (until reaching a different section directive) in the defined section. If no directive is specified, it will default to the text section. Each of the directives may be used multiple times in a single file (however, it is advisable to have clear separation and structure of code for different sections).
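As a rough illustration of how the assembler switches between sections, here is a minimal sketch. It assumes the GNU assembler's .text, .data, and .bss directives and x86-64 AT&T syntax; all labels and values are hypothetical.

```asm
# Hypothetical layout; labels and values are illustrative only.
.data                       # subsequent content goes into the data section
greeting_len: .quad 13

.bss                        # switch to the bss section
scratch:      .skip 64

.text                       # switch to the text section
.global main
main:
    movq $0, %rax           # return value 0
    ret

.data                       # a section may be reopened later in the file
answer:       .quad 42
```

Reopening .data at the end works, but grouping all content of one section together is usually easier to read.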
Defining Constants
The .equ
directive can be used to define symbolic names for expressions, such as numeric constants. Any occurrences of the symbolic name are then replaced by the given expression during assembly. An example usage could be as follows:
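A minimal sketch of such a definition, assuming x86-64 AT&T syntax; the names BUFFER_SIZE and DOUBLE_SIZE are hypothetical:

```asm
.equ BUFFER_SIZE, 64                 # hypothetical numeric constant
.equ DOUBLE_SIZE, BUFFER_SIZE * 2    # expressions may use earlier symbols

.text
    movq $BUFFER_SIZE, %rax          # assembled as if written: movq $64, %rax
    movq $DOUBLE_SIZE, %rbx          # assembled as if written: movq $128, %rbx
```

Because the names are resolved during assembly, they occupy no memory in the assembled program.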
Declaring (Global) Variables
The .byte
, .word
, .long
, and .quad
directives can be used to reserve and initialize memory for variables and or constants of the given size. Just as instructions are translated into bytes of program memory by the assembler, also these directives will lead to some bytes in the program memory. The .word
directive, for instance, reserves two bytes of memory with the value specified as part of the directive. Whether these bytes will be writeable/changeable depends on the section that this directive appears in. Each directive allows for more than one value to be specified in a comma-separated list.
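The following sketch shows the four directives side by side. It assumes an x86 target, where .byte, .word, .long, and .quad emit 1, 2, 4, and 8 bytes per value; the labels and values are hypothetical.

```asm
.data                               # writable, because this is the data section
flags:   .byte 0x01, 0x02, 0x04     # three values, 1 byte each
year:    .word 2024                 # one value, 2 bytes
limit:   .long 100000               # one value, 4 bytes
counter: .quad 0                    # one value, 8 bytes
table:   .quad 1, 2, 3              # three values, 8 bytes each (24 bytes total)
```

Each label marks the address of the first byte emitted by the directive that follows it.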
Reserving Memory
Sometimes it may be necessary to reserve memory in bigger chunks than quadwords. This can be achieved with the .skip directive, which instructs the assembler to skip as many bytes as indicated before placing the next bytes. An optional second argument, separated by a comma, gives the byte value used to fill the skipped space; if it is omitted, the space is filled with zeros.
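A minimal sketch of both forms, with hypothetical labels and sizes:

```asm
.bss
buffer:  .skip 1024          # reserve 1024 bytes (zero-filled)

.data
padding: .skip 16, 0xFF      # reserve 16 bytes, each set to 0xFF
```

The non-zero fill value is used in the data section here, since the bss section only holds zero-initialized memory.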
Strings
Many programs perform some form of I/O and therefore usually need strings. In many higher-level programming languages, including C, strings are simply arrays of characters terminated by a zero byte. The .ascii and .asciz directives take the given string and place its ASCII representation (as an array of characters) into the program memory. The two directives differ only in that .asciz appends a zero byte to the given string. Thus, a string written with .asciz is equivalent to the same string written with .ascii and an explicit zero byte, or to its character codes listed with .byte followed by a zero.
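To make the equivalence concrete, here is a minimal sketch using the hypothetical string "Hi". Each of the three lines emits the same three bytes (0x48, 0x69, 0x00):

```asm
.asciz "Hi"                  # 'H', 'i', plus an implicit zero byte
.ascii "Hi\0"                # 'H', 'i', plus an explicit zero byte
.byte  0x48, 0x69, 0x00      # the same bytes written as ASCII codes
```

In practice, .asciz is the most convenient form for strings that are passed to C library functions, which expect the terminating zero byte.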
Global Symbols
By default, any labels defined in your program are visible only within the file that defines them. However, some symbols should also be made public to the outside. Examples are a subroutine that should be usable from other files or, most prominently, your main symbol, which must be visible so that the entry point into your program is known.
The .global
directive tells the assembler to publish the following label to the symbol table, which can be seen as a sort of table of contents contained in the binary file.
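The difference between a file-local and a published label can be sketched as follows, assuming x86-64 AT&T syntax; the helper label is hypothetical:

```asm
.text

helper:                      # not declared global: visible only in this file
    movq $0, %rax            # return value 0
    ret

.global main                 # publish main so the entry point can be found
main:
    call helper              # local labels can still be used within the file
    ret
```

Only main is visible to other files here; a reference to helper from a different file would not be resolved.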