Stephen Smith's Blog

Musings on Machine Learning…

Posts Tagged ‘arm assembler

Assembly Language on the Raspberry Pi Pico

with 12 comments

Introduction

The Raspberry Pi Pico is the Raspberry Foundation’s first entry into the domain of Arduino style microcontrollers. The board contains Raspberry’s own designed SoC (System on a Chip) containing a dual core ARM Cortex-M0+ CPU along with memory and a collection of I/O circuitry. There are no keyboard, mouse or monitor ports on the board, only a micro-USB to connect to a host computer, a number of GPIO pins and three debug pins. This SoC is called the RP2040 and is licensed to other companies to use in their own boards. Raspberry supports programming this board in either C/C++ or MicroPython. The C/C++ SDK also supports Assembly Language programming to some degree and this article is a look at my first attempt to write an Assembly Language program for this board. I ran into a few problems and still have a few things to figure out and we’ll explain those in the article. We’ll write an Assembly Language version of the program we wrote in C last time to flash three connected LEDs.

ARM Cortex-M0+ Assembly Language

I blogged about 32-bit ARM Assembly Language here, and then presented the flashing LED Assembly Language program for the Raspberry Pi here. Further I wrote a whole book on 32-bit ARM Assembly Language Programming: “Raspberry Pi Assembly Language Programming”. These are all oriented to ARM’s full A-series processors which include floating point units (FPU), vector processors, virtual memory support and much more. The ARM M-series processors are a subset of these, designed to be low cost, use little memory and be very power efficient. The ARM M-series processors only contain what are called the ARM “thumb” instructions. Normally, on an A-series processor, each instruction takes 32-bits, but for some applications this uses too much memory, so ARM came up with “thumb” instructions where if the processor is operating in “thumb” mode then each instruction is only 16-bits in length, thus only using half the memory. The original set of “thumb” instructions was too limited, so ARM added a way to run some 32-bit instructions in with the 16-bit instructions and that makes the modern “thumb” instructions set used by the M-series processors. One consequence of using the “thumb” instructions is that registers R8 to R12 are not accessible and hence not implemented on the chip, thus saving circuitry. The registers you do have are all 32-bit and the Raspberry RP2040 has special multiplication and division circuitry to perform these operations quickly.

Code

This program uses the C/C++ SDK to access the GPIO pins, this means this Assembly Language program is quite similar to last week’s C program. To call a routine in Assembly, you put the first parameter in R0, the second in R1 and then call Branch with Link (BL). BL places the address of the next instruction into the LR register, so the called return returns by branching to the address contained in the LR register. When calling functions there is a convention on who has to save which register on the stack, but we don’t use any register over the function calls, so we don’t need to do this. This program is set up as an infinite loop, since there is nothing for the main routine to return to and if it does return the processor halts.

Assembly Language code:

@
@ Assembler program to flash three LEDs connected to the
@ Raspberry Pi Pico GPIO port using the Pico SDK.
@
@

.EQU LED_PIN1, 18
.EQU LED_PIN2, 19
.EQU LED_PIN3, 20
.EQU GPIO_OUT, 1
.EQU sleep_time, 200

.global main_asm             @ Provide program starting address to linker
main_asm:

MOV R0, #LED_PIN1
BL gpio_init
MOV R0, #LED_PIN1
MOV R1, #GPIO_OUT
BL link_gpio_set_dir
MOV R0, #LED_PIN2
BL gpio_init
MOV R0, #LED_PIN2
MOV R1, #GPIO_OUT
BL link_gpio_set_dir
MOV R0, #LED_PIN3
BL gpio_init
MOV R0, #LED_PIN3
MOV R1, #GPIO_OUT
BL link_gpio_set_dir
loop:   MOV R0, #LED_PIN1
MOV R1, #1
BL link_gpio_put
LDR R0, =sleep_time
BL sleep_ms
MOV R0, #LED_PIN1
MOV R1, #0
BL link_gpio_put
MOV R0, #LED_PIN2
MOV R1, #1
BL link_gpio_put
LDR R0, =sleep_time
BL sleep_ms
MOV R0, #LED_PIN2
MOV R1, #0
BL link_gpio_put
MOV R0, #LED_PIN3
MOV R1, #1
BL link_gpio_put
LDR R0, =sleep_time
BL sleep_ms
MOV R0, #LED_PIN3
MOV R1, #0
BL link_gpio_put
B       loop

.data

      .align  4 @ necessary alignment

I didn’t intend to include any C code, but I ran into a couple of problems that require it. One is that a large number of SDK functions are inline C functions which means they can’t be called from outside of C. In our case two functions gpio_set_dir and gpio_put are inline and required wrapping. The other problem is that if the main program is Assembly Language then the code to initialize the board doesn’t seem to be called. I think this is a matter of setting the correct CMake options, but I haven’t had a chance to figure it out yet. For now we have main in the C code and then call the Assembly Language main routine.

C code:

#include “hardware/gpio.h”

void link_gpio_set_dir(int pin, int dir)
{
gpio_set_dir(pin, dir);
}

void link_gpio_put(int pin, int value)
{
gpio_put(pin, value);
}

void main()
{
main_asm();
}

The Raspberry Pi Pico SDK uses the CMake system to manage builds. The SDK provides a large set of build rules. You run CMake and then it creates a makefile that compiles your program.

CMake file:

cmake_minimum_required(VERSION 3.13)

include(pico_sdk_import.cmake)

project(test_project C CXX ASM)

set(CMAKE_C_STANDARD 11)
set(CMAKE_CXX_STANDARD 17)

pico_sdk_init()

include_directories(${CMAKE_SOURCE_DIR})

add_executable(flashledsasm
  mainmem.S
  sdklink.c
)

pico_enable_stdio_uart(flashledsasm 1)
pico_add_extra_outputs(flashledsasm)
target_link_libraries(flashledsasm pico_stdlib)

Still To-Do

The program works, but there are a few things I’m not happy about. The Raspberry Pi Pico SDK is pretty new, so there aren’t a lot of answers on StackOverflow yet. The good thing is that it is all open source, so it is just a matter of time to figure out the code. Here is what I’ll be working on:

  1. How to have main be in Assembly Language and have the board properly initialized. Match the C startup sequence.
  2. Figure out the details of the GPIO registers and have Assembly Language versions of the inline C code that accesses these. They are similar to those on the full Raspberry Pi, but different.
  3. How to get constants from the C include file, on first try this didn’t work and gave syntax errors, but the SDK says they should be usable from Assembly Language. They might need a couple of fixes.

Summary

I planned to write a 100% Assembly Language program, but didn’t quite make it. At least the program works, showing you can include Assembly Language in your RP2040 projects. The support to build using the GCC macro assembler is all there and besides some interactions with the SDK all seems to work well. Of course the Raspberry Pi Pico SDK is pretty new so there will be a lot of updates and there are still a number of undocumented holes to investigate.

Written by smist08

April 16, 2021 at 9:43 am

Apple M1 Assembly Language Hello World

with 15 comments

Introduction

Last week, we talked about using a new Apple M1 based Macintosh as a development workstation and how installing Apple’s development system XCode also installed a large number of open source development tools including LLVM and make. This week, we’ll cover how to compile and run a simple command line ARM Assembly Language Hello World program.

Thanks to Alex vonBelow

My book “Programming with 64-Bit ARM Assembly Language” contains lots of sample self contained Assembly Language programs and a number of iOS and Android samples. The command line utilities are compiled for Linux using the GNU tool set. Alex vonBelow took all of these and modified them to work with the LLVM tool chain and to work within Apple’s development environment. He dealt with all the differences between Linux and MacOS/iOS as well. His version of the source code for my book, but modified for Apple M1 is available here:

https://github.com/below/HelloSilicon.

Differences Between MacOS and Linux

Both MacOS and Linux are based on Unix and are more similar than different. However there are a few differences of note:

  • MacOS uses LLVM by default whereas Linux uses GNU GCC. This really just affects the command line arguments in the makefile for the purposes of this article. You can use LLVM on Linux and GCC should be available for Apple M1 shortly.
  • The MacOS linker/loader doesn’t like doing relocations, so you need to use the ADR rather than LDR instruction to load addresses. You could use ADR in Linux and if you do this it will work in both.
  • The Unix API calls are nearly the same, the difference is that Linux redid the function numbers when they went to 64-bit, but MacOS kept the function numbers the same. In the 32-bit world they were the same, but now they are all different.
  • When calling a Linux service the function number goes in X16 rather than X8.
  • Linux installs the various libraries and includes files under /usr/lib and /usr/include, so they are easy to find and use. When you install XCode, it installs SDKs for MacOS, iOS, iPadOS, iWatchOS, etc. with the option of installing lots for versions. The paths to the libs and includes are rather complicated and you need a tool to find them.
  • In MacOS the program must start on a 64-bit boundary, hence the listing has an “.align 2” directive near top.
  • In MacOS you need to link in the System library even if you don’t make a system call from it or you get a linker error. This sample Hello World program uses software interrupts to make the system calls rather than the API in the System library and so shouldn’t need to link to it.
  • In MacOS the default entry point is _main whereas in Linux it is _start. This is changed via a command line argument to the linker.

Hello World Assembly File

Below is the simple Assembly Language program to print out “Hello World” in a terminal window. For all the gory details on these instructions and the architecture of the ARM processor, check out my book.

//
// Assembler program to print "Hello World!"
// to stdout.
//
// X0-X2 - parameters to linux function services
// X16 - linux function number
//
.global _start             // Provide program starting address to linker
.align 2

// Setup the parameters to print hello world
// and then call Linux to do it.

_start: mov X0, #1     // 1 = StdOut
adr X1, helloworld // string to print
mov X2, #13     // length of our string
mov X16, #4     // MacOS write system call
svc 0     // Call linux to output the string

// Setup the parameters to exit the program
// and then call Linux to do it.

mov     X0, #0      // Use 0 return code
       mov     X16, #1     // Service command code 1 terminates this program
       svc     0           // Call MacOS to terminate the program

helloworld:      .ascii  "Hello World!\n"

Makefile

Here is the makefile, the command to assemble the source code is simple, then the command to link is a bit more complicated.

HelloWorld: HelloWorld.o
ld -macosx_version_min 11.0.0 -o HelloWorld HelloWorld.o -lSystem -syslibroot
`xcrun -sdk macosx --show-sdk-path` -e _start -arch arm64

HelloWorld.o: HelloWorld.s
as -o HelloWorld.o HelloWorld.s

The xcrun command is Apple’s command to run or find the various SDKs. Here is a sample of running it:

stephensmith@Stephens-MacBook-Air-2 ~ % xcrun -sdk macosx –show-sdk-path
objc[42012]: Class AMSupportURLConnectionDelegate is implemented in both ?? (0x1edb5b8f0) and ?? (0x122dd02b8). One of the two will be used. Which one is undefined.
objc[42012]: Class AMSupportURLSession is implemented in both ?? (0x1edb5b940) and ?? (0x122dd0308). One of the two will be used. Which one is undefined.
objc[42013]: Class AMSupportURLConnectionDelegate is implemented in both ?? (0x1edb5b8f0) and ?? (0x1141942b8). One of the two will be used. Which one is undefined.
objc[42013]: Class AMSupportURLSession is implemented in both ?? (0x1edb5b940) and ?? (0x114194308). One of the two will be used. Which one is undefined.
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX11.1.sdk
stephensmith@Stephens-MacBook-Air-2 ~ %

After the ugly warnings from Objective-C, the path to the MacOS SDK is displayed.

Now we can compile and run our program.

stephensmith@Stephens-MacBook-Air-2 Chapter 1 % make -B
as -o HelloWorld.o HelloWorld.s
objc[42104]: Class AMSupportURLConnectionDelegate is implemented in both ?? (0x1edb5b8f0) and ?? (0x1145342b8). One of the two will be used. Which one is undefined.
objc[42104]: Class AMSupportURLSession is implemented in both ?? (0x1edb5b940) and ?? (0x114534308). One of the two will be used. Which one is undefined.
ld -macosx_version_min 11.0.0 -o HelloWorld HelloWorld.o -lSystem -syslibroot `xcrun -sdk macosx –show-sdk-path` -e _start -arch arm64 
stephensmith@Stephens-MacBook-Air-2 Chapter 1 % ./HelloWorld 
Hello World!
stephensmith@Stephens-MacBook-Air-2 Chapter 1 %

Summary

The new Apple M1 Macintoshes are running ARM processors as part of all that Apple Silicon and you can run standard ARM 64-bit Assembly Language. LLVM is a standard open source development tool which contains an Assembler that is similar to the GNU Assembler. Programming MacOS is similar to Linux since both are based on Unix and if you are familiar with Linux, most of your knowledge is directly applicable.

Written by smist08

January 8, 2021 at 10:31 am

Flashing LEDs in Assembler

with 9 comments

Introduction

Previously I wrote an article on an introduction to Assembler programming on the Raspberry Pi. This was quite a long article without much of a coding example, so I wanted to produce an Assembler  language version of the little program I did in Python, Scratch, Fortran and C to flash three LEDs attached to the Raspberry Pi’s GPIO port on a breadboard. So in this article I’ll introduce that program.

This program is fairly minimal. It doesn’t do any error checking, but it does work. I don’t use any external libraries, and only make calls to Linux (Raspbian) via software interrupts (SVC 0). I implemented a minimal GPIO library using Assembler Macros along with the necessary file I/O and sleep Linux system calls. There probably aren’t enough comments in the code, but at this point it is fairly small and the macros help to modularize and explain things.

Main Program

Here is the main program, that probably doesn’t look structurally that different than the C code, since the macro names roughly match up to those in the GPIO library the C function called. The main bit of Assembler code here is to do the loop through flashing the lights 10 times. This is pretty straight forward, just load 10 into register r6 and then decrement it until it hits zero.

 

@
@ Assembler program to flash three LEDs connected to the
@ Raspberry Pi GPIO port.
@
@ r6 - loop variable to flash lights 10 times
@

.include "gpiomacros.s"

.global _start             @ Provide program starting address to linker

_start: GPIOExport  pin17
        GPIOExport  pin27
        GPIOExport  pin22

        nanoSleep

        GPIODirectionOut pin17
        GPIODirectionOut pin27
        GPIODirectionOut pin22

        @ setup a loop counter for 10 iterations
        mov         r6, #10

loop:   GPIOWrite   pin17, high
        nanoSleep
        GPIOWrite   pin17, low
        GPIOWrite   pin27, high
        nanoSleep
        GPIOWrite   pin27, low
        GPIOWrite   pin22, high
        nanoSleep
        GPIOWrite   pin22, low

        @decrement loop counter and see if we loop
        subs    r6, #1      @ Subtract 1 from loop register setting status register
        bne     loop        @ If we haven't counted down to 0 then loop

_end:   mov     R0, #0      @ Use 0 return code
        lsl     R0, #2      @ Shift R0 left by 2 bits (ie multiply by 4)
        mov     R7, #1      @ Service command code 1 terminates this program
        svc     0           @ Linus command to terminate program

pin17:      .asciz  "17"
pin27:      .asciz  "27"
pin22:      .asciz  "22"
low:        .asciz  "0"
high:       .asciz  "1"

 

GPIO and Linux Macros

Now the real guts of the program are in the Assembler macros. Again it isn’t too bad. We use the Linux service calls to open, write, flush and close the GPIO device files in /sys/class/gpio. Similarly nanosleep is also a Linux service call for a high resolution timer. Note that ARM doesn’t have memory to memory or operations on memory type instructions, so to do anything we need to load it into a register, process it and write it back out. Hence to copy the pin number to the file name we load the two pin characters and store them to the file name memory area. Hard coding the offset for this as 20 isn’t great, we could have used a .equ directive, or better yet implemented a string scan, but for quick and dirty this is fine. Similarly we only implemented the parameters we really needed and ignored anything else. We’ll leave it as an exercise to the reader to flush these out more. Note that when we copy the first byte of the pin number, we include a #1 on the end of the ldrb and strb instructions, this will do a post increment by one on the index register that holds the memory location. This means the ARM is really very efficient in accessing arrays (even without using Neon) we combine the array read/write with the index increment all in one instruction.

If you are wondering how you find the Linux service calls, you look in /usr/include/arm-linux-gnueabihf/asm/unistd.h. This C include file has all the function numbers for the Linux system calls. Then you Google the call for its parameters and they go in order in registers r0, r1, …, r6, with the return code coming back in r0.

 

@ Various macros to access the GPIO pins
@ on the Raspberry Pi.

@ R5 is used for the file descriptor

.macro  openFile    fileName
        ldr         r0, =\fileName
        mov         r1, #01     @ O_WRONLY
        mov r7,     #5          @ 5 is system call number for open
        svc         0
.endm

.macro  writeFile   buffer, length
        mov         r0, r5      @ file descriptor
        ldr         r1, =\buffer
        mov         r2, #\length
        mov         r7, #4 @ 4 is write
        svc         0
.endm

.macro  flushClose
@fsync syscall
        mov         r0, r5
        mov         r7, #118    @ 118 is flush
        svc         0

@close syscall
        mov         r0, r5
        mov         r7, #6      @ 6 is close
        svc         0
.endm

@ Macro nanoSleep to sleep .1 second
@ Calls Linux nanosleep entry point which is function 162.
@ Pass a reference to a timespec in both r0 and r1
@ First is input time to sleep in seconds and nanoseconds.
@ Second is time left to sleep if interrupted (which we ignore)

.macro  nanoSleep
        ldr         r0, =timespecsec
        ldr         r1, =timespecsec
        mov         r7, #162    @ 162 is nanosleep
        svc         0
.endm

.macro  GPIOExport  pin
        openFile    gpioexp
        mov         r5, r0      @ save the file descriptor
        writeFile   \pin, 2
        flushClose
.endm

.macro  GPIODirectionOut   pin
        @ copy pin into filename pattern
        ldr         r1, =\pin
        ldr         r2, =gpiopinfile
        add         r2, #20
        ldrb        r3, [r1], #1 @ load pin and post increment
        strb        r3, [r2], #1 @ store to filename and post increment
        ldrb        r3, [r1]
        strb        r3, [r2]
        openFile    gpiopinfile
        writeFile   outstr, 3
        flushClose
.endm

.macro  GPIOWrite   pin, value
        @ copy pin into filename pattern
        ldr         r1, =\pin
        ldr         r2, =gpiovaluefile
        add         r2, #20
        ldrb        r3, [r1], #1    @ load pin and post increment
        strb        r3, [r2], #1    @ store to filename and post increment
        ldrb        r3, [r1]
        strb        r3, [r2]
        openFile    gpiovaluefile
        writeFile   \value, 1
        flushClose
.endm

.data
timespecsec:   .word   0
timespecnano:  .word   100000000
gpioexp:    .asciz  "/sys/class/gpio/export"
gpiopinfile: .asciz "/sys/class/gpio/gpioxx/direction"
gpiovaluefile: .asciz "/sys/class/gpio/gpioxx/value"
outstr:     .asciz  "out"
            .align  2          @ save users of this file having to do this.
.text

Makefile

Here is a simple makefile for the project if you name the files as indicated. Again note that WordPress and Google Docs may mess up white space and quote characters so these might need to be fixed if you copy/paste.

model: model.o
    ld -o model model.o

model.o: model.s gpiomacros.s
    as -ggdb3 -o model.o model.s

clean:
    rm model model.o

 

IDE or Not to IDE

People often do Assembler language development in an IDE like Code::Blocks. Code::Blocks doesn’t support Assembler language projects, but you can add Assembler language files to C projects. This is a pretty common way to do development since you want to do more programming in a higher level language like C. This way you also get full use of the C runtime. I didn’t do this, I just used a text editor, make and gdb (command line). This way the above program has no extra overhead the executable is quite small since there is no C runtime or any other library linked to it. The debug version of the executable is only 2904 bytes long and non debug is 2376 bytes. Of course if I really wanted to reduce executable size, I could have used function calls rather than Assembler macros as the macros duplicate the code everywhere they are used.

Summary

Assembler language programming is kind of fun. But I don’t think I would want to do too large a project this way. Hats off to the early personal computer programmers who wrote spreadsheet programs, word processors and games entirely in Assembler. Certainly writing a few Assembler programs gives you a really good understanding of how the underlying computer hardware works and what sort of things your computer can do really efficiently. You could even consider adding compiler optimizations for your processor to GCC, after all compiler code generation has a huge effect on your computer’s performance.

Written by smist08

January 7, 2018 at 7:08 pm

Raspberry Pi Assembly Programming

with 6 comments

Introduction

Most University Computer Science programs now do most of their instruction in quite high level languages like Java which don’t require you to need to know much about the underlying computer architecture. I think this tends to create quite large gaps in understanding which can lead to quite bad program design mistakes. Learning to program C puts you closer to the metal since you need to know more about memory and low level storage, but nothing really teaches you how the underlying computer works than doing some Assembler Language Programming, The Raspbian operating system comes with the GNU Assembler pre-installed, so you have everything you need to try Assembler programming right out of the gate. Learning a bit of Assembler teaches you how the Raspberry Pi’s ARM processor works, how a modern RISC architecture processes instructions and how work is divided by the ARM CPU and the various co-processors which are included on the same chip.

A Bit of History

The ARM processor was originally developed to be a low cost processor for the British educational computer, the Acorn. The developers of the ARM felt they had something and negotiated a split into a separate company selling CPU designs to hardware manufacturers. Their first big sale was to Apple to provide a CPU for Apple’s first PDA the Newton. Their first big success was the inclusion of their chip design in Apple’s iPods. Along the way many chip makers like TI which had given up competing on CPUs built single chip computers around the ARM. These ended up being included in about every cell phone including those from Nokia and Apple. Nowadays pretty up every Android phone is also built around ARM designs.

ARM Assembler Instructions

There have been a lot of ARM designs from early simple 16 bit processors up through 32 bit processors to the current 64bit designs. In this article I’m just going to consider the ARM processor in the Raspberry Pi, this processor is a 64 bit processor, but Raspbian is still a 32 bit operating system, so I’ll just be talking about the 32 bit processing used here (and I’ll ignore the 16 bit “thumb” instructions for now).

ARM is a RISC processor which means the idea is that it executes very simple instructions very quickly. The idea is to keep the main processor simple to reduce complexity and power usage. Each instruction is 32 bits long, so the processor doesn’t need to think about how much to increment the program counter for each instruction. Interestingly nearly all the instructions are in the same format, where you can control whether it sets the status bits, can test the status bits on whether to do anything and can have four register parameters (or 2 registers and an immediate constant). One of the registers can also have a shift operation applied. So how is all this packed into on 32 bit instruction? There are 16 registers in the ARM CPU (these include the program counter, link return register and the stack pointer. There is also the status register that can’t be used as a general purpose register. This means it takes 4 bits to specify a register, so specifying 4 registers takes 16 bits out of the instruction.

Below are the formats for the main types of ARM instructions.

Now we break out the data processing instructions in more detail since these comprise quite a large set of instructions and are the ones we use the most.

Although RISC means a small set of simple instructions, we see that by cleverly using every bit in those 32 bits for an instruction that we can pack quite a bit of information.

Since the ARM is a 32-Bit processor meaning among other things that it can address a 32-Bit address space this does lead to some interesting questions:

  1. The processor is 32-Bits but immediate constants are 16-Bits. How do we load arbitrary 32-Bit quantities? Actually the immediate instruction is 12-Bits of data and 4 Bits of a shift amount. So we can load 12 Bits and then shift it into position. If this works great. If not we have to be tricky by loading and adding two quantities.
  2. Since the total instruction is 32-Bits how do we load a 32-Bit memory address? The answer is that we can’t. However we can use the same tricks indicated in number 1. Plus you can use the program counter. You can add an immediate constant to the program counter. The assembler often does this when assembling load instructions. Note that since the ARM is pipelined the program counter tends to be a couple of instructions ahead of the current one, so this has to properly be taken into account.
  3. Why is there the ability to conditionally execute all the data processing instructions? Since the ARM processor is pipelines, doing a branch instruction is quite expensive since the pipeline needs to be discarded and then reloaded at the new execution point. Having individual instructions conditionally execute can save quite a few branch instructions leading to faster execution. (As long as your compiler is good at generating RISC type code or you are hand coding in Assembler).
  4. The ARM processor has a multiply instruction, but no divide instruction? This seems strange and unbalanced. Multiply (and multiply and add) are recent additions to the ARM processor. Divide is still considered too complicated and slow, plus is it really used that much? You can do divisions on either the floating point coprocessor or the Neon SIMD coprocessor.

A Very Small Example

Assembler listings tend to be quite long, so as a minimal set let’s start with a program that just exits when run returning 17 to the command line (you can see this if you type echo $@ after running it). Basically we have an assembler directive to define the entry point as a global. We move the immediate value 17 to register R0. Shift it left by 2 bits (Linux expects this). Move 1 to register R7 which is the Linux command code to exit the program and then call service zero which is the Linux operating system. All calls to Linux use this pattern. Set some parameters in registers and then call “svc 0” to do the call. You can open files, read user input, write to the console, all sorts of things this way.

 

.global _start              @ Provide program starting address to linker
_start: mov     R0, #17     @ Use 17 for test example
        lsl     R0, #2      @ Shift R0 left by 2 bits (ie multiply by 4)
        mov     R7, #1      @ Service command code 1 terminates this program
        svc     0           @ Linux command to terminate program

 

Here is a simple makefile to compile and link the preceding program. If you save the above as model.s and then the makefile as makefile then you can compile the program by typing “make” and run it by typing “./model”. as is the GNU-Assembler and ld is the linker/loader.

 

model: model.o
     ld -o model model.o

model.o: model.s
     as -g -o model.o model.s

clean:
     rm model model.o

 

Note that make is very particular about whitespace. It must be a true “tab” character before the commands to execute. If Google Docs or WordPress changes this, then you will need to change it back. Unfortunately word processors have a bad habit of introducing syntax errors to program listings by changing whitespace, quotes and underlines to typographic characters that compilers (and assemblers) don’t recognize.

Summary

Although these days you can do most things with C or a higher level language, Assembler still has its place in certain applications that require performance or detailed hardware interaction. For instance perhaps tuning the numpy Python library to use the Neon coprocessor for faster operation of vector type operations. I do still think that every programmer should spend some time playing with Assembler language so they understand better how the underlying processor and architecture they use day to day really works. The Raspberry Pi offers and excellent environment to do this with the good GNU Macro Assembler, the modern ARM RISC architecture and the various on-chip co-processors.

Just the Assembler introductory material has gotten fairly long, so we won’t get to an assembler version of our flashing LED program. But perhaps in a future article.

 

Written by smist08

January 4, 2018 at 10:45 pm