A0: A Running Example

This reference documentation introduced many concepts, definitions, and conventions that are important to know when programming in Assembly. You will likely need to revisit some of the pages once you start working on the assignments. However, most of it should eventually become part of your "inner manual" (also known as your intuition).

To help with this process of understanding and connecting the various concepts you have traversed, this page will discuss an algorithm, transform it into a formal specification, and finally build a running program for the specification.


Step 1: Algorithm

Eratosthenes of Cyrene lived in northern Africa from 276 B.C. to 194 B.C. and he devised what is probably the oldest known algorithm for finding prime numbers: the Sieve of Eratosthenes. The algorithm is not terribly efficient but it is very easy to understand, which is exactly why it is used here.

The algorithm starts by constructing a list of all numbers between 2 and some upper limit, in the case of this example 999:

2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 [...] 997 998 999

As the first step, the first number of the list (2) is taken and all multiples of it are removed from the list, except 2 itself:

2 3 . 5 . 7 . 9 .. 11 .. 13 .. 15 .. 17 .. 19 .. 21 [...] 997 ... 999

The process is repeated with the next number in the list (3), removing all multiples, except 3 itself:

2 3 . 5 . 7 . . .. 11 .. 13 .. .. .. 17 .. 19 .. .. [...] 997 ... ...

The algorithm finishes after repeating the previous step, selecting the next number still in the list, until reaching the end of the list. After that, the list consists of all prime numbers until the upper limit.


Step 2: Specification

The next step on the way to a running program is to transform this algorithm into a formal specification. Why is this necessary? While the previous step describes the working steps of the algorithm, there are many technical details left out that need to be addressed before starting the implementation phase.

One of those details is the choice of representation: How shall the list of numbers be represented? One option would be to use a linked list structure, another option is to use a list of fixed size, such as an array. What implications for the complexity of the program does one choice have compared to the other? Resolving questions like this is a (critical) part of the creative challenge of programming, so usually you will have to decide on these matters yourself.

For this example, we will, instead of maintaining a list of remaining numbers, use an array of Boolean values, denoting for each index whether or not the number is still present. All entries of the array are initialized to true; a number is removed by setting its corresponding table entry to false.

The algorithm can now be translated to the following pseudocode description, written to closely resemble the C syntax:

define: SIZE 1000
bool numbers[SIZE]

main() {
    // Initialize the number table
    for (int i = 0; i < SIZE; i++)
        numbers[i] = true

    // Main Loop - Loop through all positions
    for (int current = 2; current < SIZE; current++) {

        // Check if the current number is still in the list
        if (numbers[current]) {
            // It is? It's a prime, print it
            print(current)

            int multiple = 2 * current
            // Nested Loop - Cross out all multiples of the number
            while (multiple < SIZE) {
                numbers[multiple] = false
                multiple *= 2
            }
        } 
    }
}

This specification uses only simple operations on simple data types (integers and Booleans) and basic programming constructs such as loops and conditions. With the information given in this reference documentation, these constructs can easily be translated to the corresponding Assembly implementation, as will be shown in the next step.


Step 3: Implementation

The final step for getting a running program for the given algorithm is... well... to turn it into a running program. As the formal specification is already defined, this step can focus purely on the how of implementation, and no longer needs to address the what. As assembly languages are verbose by nature and require carefully thought-out instruction sequences, working with a formal specification greatly reduces the (reworking) effort for the implementation part.

Below is the x86-64 Assembly implementation of Eratosthenes' Sieve algorithm as defined in the formal specification of step 2. Try to read along and understand what is done and why, using the comments in the code and the subsequent explanations as a guide. Do not be intimidated by this example - the first programs you will write will be much simpler. However, after finishing all mandatory assignments, you should be able to follow and understand all parts of this program.

.global main

.equ SIZE, 1000

.bss
numbers:
    .skip SIZE                      # space for `numbers` table

.text
int_fmt: 
    .asciz "%d\n"                   # format string for printing an integer and a newline

main:
    pushq   %rbp                    # save previous base pointer
    movq    %rsp, %rbp              # set base pointer for new stack frame

    movq    $0, %rcx                # init `i` to 0
    leaq    numbers(%rip), %rax     # load address of `numbers`
init_loop:
    cmpq    $SIZE, %rcx              
    jge     after_init_loop         # exit the `for` loop once `i >= SIZE`
    movb    $1, (%rax, %rcx)        # `numbers[i] = 1`
    incq    %rcx                    # `i++`
    jmp     init_loop               # jump to condition check

after_init_loop:
    pushq   $2                      # initialize `current` to 2 (on the stack at rbp - 8)
    subq    $8, %rsp                # keep stack 16-byte aligned
    movq    -8(%rbp), %rcx          # load `current` into rcx

main_loop:
    cmpq    $SIZE, %rcx             # compare `number` to SIZE
    jge     after_main_loop
    leaq    numbers(%rip), %rax     # load address of `numbers`
    cmpb    $1, (%rax, %rcx)        # compare `numbers[current]` to true (1)
    jne     main_loop_end           # jump to `main_loop_end` if `numbers[current] != true`
    leaq    int_fmt(%rip), %rdi     # load the format string as first argument
    movq    %rcx, %rsi              # load `current` as second argument
    movb    $0, %al                 # 0 vector registers used for printf
    call    printf                  # print current number

    movq    -8(%rbp), %rcx          # `multiple = current`
    shlq    $1, %rcx                # `multiple *= 2`
nested_loop:
    cmpq    $SIZE, %rcx             # compare `multiple` with SIZE
    jge     main_loop_end           # jump to `main_loop_end` if `multiple >= SIZE`
    leaq    numbers(%rip), %rax     # load address of `numbers`
    movb    $0, (%rax, %rcx)        # `numbers[multiple] = false`
    addq    -8(%rbp), %rcx          # `multiple += number`
    jmp     nested_loop             # jump to condition check

main_loop_end:
    movq    -8(%rbp), %rcx          # load `number` into rcx
    incq    %rcx                    # increment number
    movq    %rcx, -8(%rbp)          # save incremented `number`
    jmp     main_loop                 

after_main_loop:
    movq    $0, %rax                # return code 0
    movq    %rbp, %rsp              # reset stack pointer (-> reset stack)
    popq    %rbp                    # restore previous base pointer
    ret

Note that the program is also part of the Lab Framework as part of the a0 folder, such that you can run/modify/test it on your machine.

Assembler Directives

The code contains various (pseudo-)instructions starting with a . - take a look at the Assembler Directives page and try to understand what each of them is used for.

Registers, Variables, and the Stack

The algorithm specification uses a number of variables to store data, such as current or the loop counter i. The Assembly implementation handles some variables differently than others. Try to understand where each variable is stored and what might be the reason for that choice. It is important that you know the basics of Registers, including the concept of callee- and caller-saved registers, as well as the concepts of storing Local Variables on the stack.

Start & End of the Program

main:
    pushq   %rbp                    # save previous base pointer
    movq    %rsp, %rbp              # set base pointer for new stack frame

The beginning of the program (starting after the main label) shows 2 instructions that are nowhere to be found in the pseudocode specification. These instructions are the Prologue of the main routine. Their functioning is explained on the page about Subroutines.

The counterpart to the Prologue is the Epilogue, which constitutes the last 3 lines of the program:

    movq    %rbp, %rsp              # reset stack pointer (-> reset stack)
    popq    %rbp                    # restore previous base pointer
    ret

I/O

If you examine the program code closely and try to follow along with the pseudocode, you will notice that the print(current) instruction was translated to the following:

    leaq    int_fmt(%rip), %rdi     # load the format string as first argument
    movq    %rcx, %rsi              # load `current` as second argument
    movb    $0, %al                 # 0 vector registers used for printf
    call    printf                  # print current number

This is the Assembly implementation of the C instruction:

printf("%d\n", current);

printing the value of current as a signed integer followed by a newline character. The specifics of this call are explained on the Printing to the Terminal page of this Manual.

Programming Constructs

The pseudocode specification uses two for and one while-loop. These loops are more or less directly translated from pseudocode to their typical Assembly representations. In the latter, however, they consist of multiple instructions and labels. The basics of these concepts are explained on the page about Programming Constructs.


Step 4: Your Turn!

Once you have arrived here, you have hopefully followed all preceding pages of this Lab Manual. Great! Actively reading the required material is the first, and arguably hardest, step on the journey through this Lab course. Even though all the concepts might still be a bit of a blurry mess in your mind, you can now start working on the practical assignments (maybe take a break first though). You may revisit many pages of this reference documentation throughout your work, but over time many concepts that seem complicated right now will start becoming natural to you.

Last updated