Stephen Smith's Blog

Musings on Machine Learning…

Archive for the ‘raspberry pi’ Category

I/O Co-processing on the Raspberry Pi Pico

with 4 comments

Introduction

Last time we looked at how to access the RP2040’s GPIO registers directly from the CPU in Assembly Language. This is a common technique to access and control hardware wired up to a microcontroller’s GPIO pins; however, the RP2040 contains a number of programmable I/O (PIO) coprocessors that can be used to offload this work from the main ARM CPUs. In this article we’ll give a quick overview of the PIO coprocessors and present an example that moves the LED blinking logic from the CPU over to the coprocessors, freeing the CPU to perform other work. There is a PIO blink program in the SDK samples, which blinks three LEDs at different frequencies, we’ll take that program and modify it to blink the LEDs in turn so that it works the same as the examples we’ve been working with.

PIO Overview

There are eight PIO coprocessors divided into two banks for four. Each bank has a single 32 word instruction memory that contains the program(s) that run on the coprocessors. 32 instructions aren’t very many, but you can do quite a bit with these. The SDK contains samples that implement quite a few communication protocols as well as showing how to do video output. 

Each PIO has an input and output FIFO buffer for exchanging data with the main CPUs.

The PIO coprocessors execute their own Assembly Language which the Raspberry folks call a state machine, though they also say they think it is Turing-complete. Below is a diagram showing one of the banks of four. This block is then duplicated twice in the RP2040 package.

Each processor has an X and Y 32-bit general purpose register, input and output shift registers for transferring data to and from the FIFOs, a clock divider register to help control timing, a program counter and then the register to hold the executing instruction as shown in the following diagram.

Each instruction can contain a few bits that specify a delay value, so for many protocols you can control the timing just by adding a timing delay to each instruction. Combine this with the clock divider register to slow down processing and you have a lot of control of timing without using extra instructions.

Sample LED Blinking Program

You write the Assembly Language PIO part of the program into a .pio file which is then compiled by the PIO Assembler into a .h file to include into your program. You can also include C helper functions here and the Pico SDK recommends including an initialization function. The various RP2040 SDK functions to support this are pretty standard and you tend to copy/paste these from the SDK samples.

We are blinking the LEDS using a 200ms delay time which by computer speeds is very slow, but for humans is quite quick. This means we can’t use the clock divider functionality and instruction delays as they don’t go this slow. Instead we have to rely on an old fashioned delay loop. We calculated the delay value in the main function using the frequency of the processor and then doing a loop. We do this delay loop twice because we need to wait for two other LEDs to flash before it’s our turn again. The pull instruction pulls the delay from the read FIFO, then out transfers it to the y register. We move y to x, turn on the pin and then do the delay loop decementing x until its zero. Then we turn the pin off and do the delay loop twice.

.program blink
    pull block
    out y, 32
.wrap_target
    mov x, y
    set pins, 1   ; Turn LED on
lp1:
    jmp x– lp1   ; Delay for (x + 1) cycles, x is a 32 bit number
    mov x, y
    set pins, 0   ; Turn LED off
lp2:
    jmp x– lp2   ; Delay for the same number of cycles again
    mov x, y
lp3:   ; Do it twice since need to wait for 2 other leds to blink
    jmp x– lp3   ; Delay for the same number of cycles again
.wrap             ; Blink forever!

% c-sdk {
// this is a raw helper function for use by the user which sets up the GPIO output, and configures the SM to output on a particular pin

void blink_program_init(PIO pio, uint sm, uint offset, uint pin) {
   pio_gpio_init(pio, pin);
   pio_sm_set_consecutive_pindirs(pio, sm, pin, 1, true);
   pio_sm_config c = blink_program_get_default_config(offset);
   sm_config_set_set_pins(&c, pin, 1);
   pio_sm_init(pio, sm, offset, &c);
}
%}

Now the main C program. In this one we configure the pins to use. Note that we will use a coprocessor for each pin, so three coprocessors but each one executing the same program. We start a pin flashing, sleep 200ms and then start the next  one. This way we achieve the same effect as we did in our previous programs.

After we get the LED flashing running on the coprocessors, we have an infinite loop that just prints a counter out to the serial port. This is to demonstrate that the CPU can go on and do anything it wants and the LEDs will keep flashing independently without any of the CPU’s attention.

#include <stdio.h>

#include “pico/stdlib.h”
#include “hardware/pio.h”
#include “hardware/clocks.h”
#include “blink.pio.h”

const uint LED_PIN1 = 18;
const uint LED_PIN2 = 19;
const uint LED_PIN3 = 20;
#define SLEEP_TIME 200

void blink_pin_forever(PIO pio, uint sm, uint offset, uint pin, uint freq);

int main() {
    int i = 0;

    setup_default_uart();

    PIO pio = pio0;
    uint offset = pio_add_program(pio, &blink_program);
    printf(“Loaded program at %d\n”, offset);
    blink_pin_forever(pio, 0, offset, LED_PIN1, 5);
    sleep_ms(SLEEP_TIME);
    blink_pin_forever(pio, 1, offset, LED_PIN2, 5);
    sleep_ms(SLEEP_TIME);
    blink_pin_forever(pio, 2, offset, LED_PIN3, 5);

    while(1)
    {
        i++;
        printf(“Busy counting away i = %d\n”, i);
    }
}

void blink_pin_forever(PIO pio, uint sm, uint offset, uint pin, uint freq) {
    blink_program_init(pio, sm, offset, pin);
    pio_sm_set_enabled(pio, sm, true);
    printf(“Blinking pin %d at %d Hz\n”, pin, freq);
    pio->txf[sm] = clock_get_hz(clk_sys) / freq;
}

Summary

This was a quick introduction to the RP2040’s PIO coprocessors. The goal of any microcontroller is to control other interfaced hardware, whether measurement sensors or communications devices (like Wifi). The PIO coprocessors give the RP21040 programmer a powerful weapon to develop sophisticated integration projects without requiring a lot of specialized hardware to make things easier. It might be nice to have a larger instruction memory, but then in a $4 USD device, you can’t really complain.

For people playing with the Raspberry Pi Pico or another RP2040 based board, you can program in 32-bit ARM Assembly Language and might want to consider my book “Raspberry Pi Assembly Language Programming”.

Written by smist08

April 30, 2021 at 10:02 am

Bit-Banging the Raspberry Pi Pico’s GPIO Registers

with 4 comments

Introduction

Last week, I introduced my first Assembly Language program for the Raspberry Pi Pico. This was a version of my flashing LED program that I implemented in a number of programming languages for the regular Raspberry Pi. In the original article, I required three routines written in C to make things work. Yesterday, I showed how to remove one of these C routines, namely to have the main routine written in Assembly Language. Today, I’ll show how to remove the two remaining C routines, which were wrappers for two SDK routines which are implemented as inline C functions and as a consequence only usable from C code.

In this article, we’ll look at the structure for the GPIO registers on the RP2040 and how to access these. The procedure we are using is called bit-banging because we are using one of the two M0+ ARM CPU cores to loop banging the bits in the GPIO registers to turn them on and off. This isn’t the recommended way to do this on the RP2040. The RP2040 implements eight programmable I/O (PIO) co-processors that you can program to offload this sort of thing from the CPU. We’ll look at how to do that in a future article, but as a first step we are going to explore bit-banging mostly to understand the RP2040 hardware better.

The RP2040 GPIO Hardware Registers

There are 28 programmable GPIO pins on the Pico. There are 40 pins, but the others are ground, power and a couple of specialized pins (see the diagram below).

This means that we can assign each one to a bit in a 32-bit hardware register which is mapped to 32-bits of memory in the RP2040’s address space. The GPIO functions are controlled by writing a 1 bit to the correct position in the GPIO register. There is one register to turn on a GPIO pin and a different register to turn it off, this means you don’t need to read the register, change one bit and then write it back. It’s quite easy to program these since you just place one in a CPU register, shift it over by the pin number and then write it to the correct memory location. These registers start at memory location 0xd0000000 and are defined in sio.h. Note there are two sio.h files, one in hardware_regs which contains the offsets and is better for Assembly Language usage and then one in hardware_structs which contains a C structure to map over the registers. Following are the GPIO registers, note that there are a few other non-GPIO related registers at this location and a few unused gaps in case you are wondering why the addresses aren’t contiguous.

RegisterAddress
gpio_in0xd0000004
gpio_hi_in0xd0000008
gpio_out0xd0000010
gpio_set0xd0000014
gpio_clr0xd0000018
gpio_togl0xd000001c
gpio_oe0xd0000020
gpio_oe_set0xd0000024
gpio_oe_clr0xd0000028
gpio_togl0xd000002c
gpio_hi_out0xd0000030
gpio_hi_set0xd0000034
gpio_hi_clr0xd0000038
gpio_hi_togl0xd000003c
gpio_hi_oe0xd0000040
gpio_hi_oe_set0xd0000044
gpio_hi_oe_clr0xd0000048
gpio_hi_oe_togl0xd000004c

Notice that there are a number of _hi_ registers, perhaps indicating that Raspberry plans to come out with a future version with more than 32 GPIO pins.

In the SDK and my code below we just write one bit at a time, I don’t know if the RP2040’s circuitry can handle writing more bits at once, for instance can we set all three pins to output in one write instruction? Remember hardware registers tend to have minimal functionality to simplify the electronics circuitry behind them so often you can’t get too complicated in what you expect of them.

Bit-Banging the Registers in Assembly

Below is the new updated program that doesn’t require the C file. In our routines to control the GPIO pins, we pass the pin number as parameter 1, which means it is in R0. We place 1 in R3 and then shift it left by the value in R0 (the pin number). This gives the value we need to write. We then load the address of the register we need, which we specified in the .data section and write the value. Note that we need two LDR instructions, once to load the address of the memory address and then the second to load the actual value.

@
@ Assembler program to flash three LEDs connected to the
@ Raspberry Pi GPIO port using the Pico SDK.
@
@

.EQU LED_PIN1, 18
.EQU LED_PIN2, 19
.EQU LED_PIN3, 20
.EQU sleep_time, 200

.thumb_func
.global main             @ Provide program starting address to linker

.align  4 @ necessary alignment

main:

@ Init each of the three pins and set them to output

MOV R0, #LED_PIN1
BL gpio_init
MOV R0, #LED_PIN1
BL gpiosetout
MOV R0, #LED_PIN2
BL gpio_init
MOV R0, #LED_PIN2
BL gpiosetout
MOV R0, #LED_PIN3
BL gpio_init
MOV R0, #LED_PIN3
BL gpiosetout

loop:

@ Turn each pin on, sleep and then turn the pin off

MOV R0, #LED_PIN1
BL gpio_on
LDR R0, =sleep_time
BL sleep_ms
MOV R0, #LED_PIN1
BL gpio_off
MOV R0, #LED_PIN2
BL gpio_on
LDR R0, =sleep_time
BL sleep_ms
MOV R0, #LED_PIN2
BL gpio_off
MOV R0, #LED_PIN3
BL gpio_on
LDR R0, =sleep_time
BL sleep_ms
MOV R0, #LED_PIN3
BL gpio_off

B       loop @ loop forever

gpiosetout:
@ write a 1 bit to the pin position in the output set register
movs r3, #1
lsl r3, r0 @ shift over to pin position
ldr r2, =gpiosetdiroutreg @ address we want
ldr r2, [r2]
str r3, [r2]
bx lr

gpio_on:
movs r3, #1
lsl r3, r0 @ shift over to pin position
ldr r2, =gpiosetonreg @ address we want
ldr r2, [r2]
str r3, [r2]
bx lr

gpio_off:
movs r3, #1
lsl r3, r0 @ shift over to pin position
ldr r2, =gpiosetoffreg @ address we want
ldr r2, [r2]
str r3, [r2]
bx lr

.data
      .align  4 @ necessary alignment
gpiosetdiroutreg: .word   0xd0000024 @ mem address of gpio registers
gpiosetonreg: .word   0xd0000014 @ mem address of gpio registers
gpiosetoffreg: .word   0xd0000018 @ mem address of gpio registers

Having separate functions for gpio_in and gpio_out simplifies our code since we don’t need any conditional logic to load the correct register address.

We loaded the actual address from a shared location. We could have loaded the base address of 0xd000000 and then stored things via an offset, but I did this to be a little clearer. If you look at the disassembly of the SDK routine, it does something rather clever to get the base address. It does:

movs r2, #208 @ 0xd0
lsl r2, r2, #24 @ becomes 0xd0000000

And then uses something like:

str r3, [r2, #40] @ 0x28

To store the value using an index which is the offset to the correct register. I thought this was rather clever on the C compiler’s part and represents the optimizations that the ARM engineers have been adding to the GCC generation of ARM code. This technique takes the same time to execute, but doesn’t require saving any values in memory, saving a few bytes which may be crucial in a larger program.

Summary

Writing to the hardware registers directly on the Raspberry Pi Pico is a bit simpler than the Broadcom implementation in the full Raspberry Pi. With these routines we wrote our entire program in Assembly Language. There is still C code in the SDK which will be linked into our program and we are still calling both gpio_init and sleep_ms in the SDK. We could look at the source code in the SDK and reimplement these in Assembly Language, but I don’t think there is any need. Between the RP2040 documentation and the SDK’s source code it is possible to figure out a lot about how the Raspberry Pi Pico works.

For people playing with the Raspberry Pi Pico or another RP2040 based board, you can program in 32-bit ARM Assembly Language and might want to consider my book “Raspberry Pi Assembly Language Programming”.

Written by smist08

April 24, 2021 at 11:50 am

Calling Main in Assembly Language on the RP2040

with 2 comments

Introduction

In last week’s article, I presented my first Assembly Language program on the Raspberry Pi Pico. The program worked, but it included some C code that I wasn’t happy with. In this article, I’ll explain why I needed to have the main entry point in C, what I missed and how to correct this problem.

The entry point is a function main() with no parameters or return code called by the RP2040 initialization code after it initializes the RP2040 hardware. In C this worked no problem, but in Assembly Language it resulted in a hardware fault on executing the first instruction in my main() routine. This was a bit of a head scratcher and it took a couple of days before I realized what the problem was. My first thought was that it was alignment, but no it wasn’t that. Perhaps I needed to duplicate the first few instructions in the Assembly Language generated by the C compiler, but no that still caused a hardware fault. Rather mystifying and annoying.

Use the Source

The program you run on the Pico contains pretty much everything in a single executable, that initializes the CPU, peripheral hardware and then runs in an endless loop forever. There is no operating system, just your program. The Raspberry Pi Pico contains a bit of firmware which is activated when you power on with the bootsel button pressed, this allows the Pico to connect as a shareable flash drive to a USB host, and will allow you to copy files into the writable part of the Pico’s flash memory. After that it reboots to let the program run.

One of the good things about the Pico is that the SDK contains the source code for this whole thing, and when you build your program, it actually compiles all this source code alongside your code (there are no libraries in this environment). This means you can build a debug build where everything is debuggable including both your code and the SDK code. This means you can set a breakpoint before your code and single step through the SDK into your code. You can’t start debugging at the very first instruction, you need to let the first bit of the SDK initialize the processor before starting, but you can set a breakpoint fairly early. I found a good place was the platform_entry routine, which is an Assembly Language function in crt0.S. This is the function that initializes the SDK environment and then calls your main() starting point. The code for this routine is fairly innocuous:

platform_entry: // symbol for stack traces
    // Use 32-bit jumps, in case these symbols are moved out of branch range
    // (e.g. if main is in SRAM and crt0 in flash)
    ldr r1, =runtime_init
    blx r1
    ldr r1, =main
    blx r1
    ldr r1, =exit
    blx r1

Nothing special, it just loads the address of our main routine and calls it. Stepping through the C code, it works, stepping through the Assembly Language code, hardware fault.

At some point I thought to look at the documentation for the BLX instruction, why were they calling this rather than BL? This turned out to be the root of the problem.

On a full ARM A-series CPU, like those in a full Raspberry Pi or in your cell phone, it can execute a rich set of instructions, which are the regular ARM 32-bit instruction set, but on the microcontroller M-series CPU like in the Pico it only executes the so called “thumb” instructions. On the A-series CPU you switch back and forth between regular and thumb modes using the BLX instruction. Thumb instructions are 16-bit in length, regular instructions are 32-bit, both have to be aligned, on even bytes the other on 4-byte boundaries. Both of these are even addresses so the true address of any instruction is even, which means the low order bit isn’t really used (it has to be zero). The BLX instruction uses this low order bit to specify whether to switch to thumb mode or not. If it is one, then thumb mode, if even then regular instruction mode. Let’s look at the disassembly for this routine:

1000021a <platform_entry>:
1000021a: 4919      ldr r1, [pc, #100] ; (10000280 <__get_current_exception+0x1a>)
1000021c: 4788      blx r1
1000021e: 4919      ldr r1, [pc, #100] ; (10000284 <__get_current_exception+0x1e>)
10000220: 4788      blx r1
10000222: 4919      ldr r1, [pc, #100] ; (10000288 <__get_current_exception+0x22>)
10000224: 4788      blx r1

10000280: 100012bd .word 0x100012bd   ; runtime_init
10000284: 10000361 .word 0x10000360   ; main
10000288: 100013a9 .word 0x100013a9   ; exit

Notice the address for my main routine is even whereas the other two routines are odd. If I compile with the C routine then main has an odd address as well. I didn’t think of this because the RP2040’s M-series CPU only executes thumb instructions, so why have any functionality to switch between modes? I don’t know but if you do tell it to switch to regular instructions then you get a hardware fault.

The other question is why the author of crt0.S in the SDK calls routines with BLX rather than BL? Afterall the Pico doesn’t support regular instructions, so you are always in thumb mode. If platform_entry used BL instead, then I wouldn’t have had any problem. I wonder if this indicates they developed the SDK on an A-series CPU, perhaps before they obtained real RP2040’s and this indicates how they did early development on the SDK? Or perhaps there is a way to emulate the RP2040 on a full A-series CPU and this is how the developers at the Raspberry Pi foundation operate.

To correct the problem, we just need to indicate our main() routine is a thumb routine. We do this by placing a .thumb_func directive in front of the .global directive.

.thumb_func
.global main             @ Provide program starting address to linker

.align  4 @ necessary alignment

main:

The key point is that this is in front of the .global, since it is really just the linker that needs to process this to set up the correct address when it links in crt0.

Summary

This eliminates the need for the C main() function we had last week. Next time we’ll eliminate the two other C routines we had and explore how the Raspberry Pi Pico’s GPIO control registers work. As with most problems, working through the solution, teaches us a bit more about how the RP2040 works and reminds us that there are consequences of using a subset of the full ARM instruction set.

For people using this SDK, you can program in 32-bit ARM Assembly Language and might want to consider my book “Raspberry Pi Assembly Language Programming”.

Written by smist08

April 23, 2021 at 9:11 am

Assembly Language on the Raspberry Pi Pico

with 12 comments

Introduction

The Raspberry Pi Pico is the Raspberry Foundation’s first entry into the domain of Arduino style microcontrollers. The board contains Raspberry’s own designed SoC (System on a Chip) containing a dual core ARM Cortex-M0+ CPU along with memory and a collection of I/O circuitry. There are no keyboard, mouse or monitor ports on the board, only a micro-USB to connect to a host computer, a number of GPIO pins and three debug pins. This SoC is called the RP2040 and is licensed to other companies to use in their own boards. Raspberry supports programming this board in either C/C++ or MicroPython. The C/C++ SDK also supports Assembly Language programming to some degree and this article is a look at my first attempt to write an Assembly Language program for this board. I ran into a few problems and still have a few things to figure out and we’ll explain those in the article. We’ll write an Assembly Language version of the program we wrote in C last time to flash three connected LEDs.

ARM Cortex-M0+ Assembly Language

I blogged about 32-bit ARM Assembly Language here, and then presented the flashing LED Assembly Language program for the Raspberry Pi here. Further I wrote a whole book on 32-bit ARM Assembly Language Programming: “Raspberry Pi Assembly Language Programming”. These are all oriented to ARM’s full A-series processors which include floating point units (FPU), vector processors, virtual memory support and much more. The ARM M-series processors are a subset of these, designed to be low cost, use little memory and be very power efficient. The ARM M-series processors only contain what are called the ARM “thumb” instructions. Normally, on an A-series processor, each instruction takes 32-bits, but for some applications this uses too much memory, so ARM came up with “thumb” instructions where if the processor is operating in “thumb” mode then each instruction is only 16-bits in length, thus only using half the memory. The original set of “thumb” instructions was too limited, so ARM added a way to run some 32-bit instructions in with the 16-bit instructions and that makes the modern “thumb” instructions set used by the M-series processors. One consequence of using the “thumb” instructions is that registers R8 to R12 are not accessible and hence not implemented on the chip, thus saving circuitry. The registers you do have are all 32-bit and the Raspberry RP2040 has special multiplication and division circuitry to perform these operations quickly.

Code

This program uses the C/C++ SDK to access the GPIO pins, this means this Assembly Language program is quite similar to last week’s C program. To call a routine in Assembly, you put the first parameter in R0, the second in R1 and then call Branch with Link (BL). BL places the address of the next instruction into the LR register, so the called return returns by branching to the address contained in the LR register. When calling functions there is a convention on who has to save which register on the stack, but we don’t use any register over the function calls, so we don’t need to do this. This program is set up as an infinite loop, since there is nothing for the main routine to return to and if it does return the processor halts.

Assembly Language code:

@
@ Assembler program to flash three LEDs connected to the
@ Raspberry Pi Pico GPIO port using the Pico SDK.
@
@

.EQU LED_PIN1, 18
.EQU LED_PIN2, 19
.EQU LED_PIN3, 20
.EQU GPIO_OUT, 1
.EQU sleep_time, 200

.global main_asm             @ Provide program starting address to linker
main_asm:

MOV R0, #LED_PIN1
BL gpio_init
MOV R0, #LED_PIN1
MOV R1, #GPIO_OUT
BL link_gpio_set_dir
MOV R0, #LED_PIN2
BL gpio_init
MOV R0, #LED_PIN2
MOV R1, #GPIO_OUT
BL link_gpio_set_dir
MOV R0, #LED_PIN3
BL gpio_init
MOV R0, #LED_PIN3
MOV R1, #GPIO_OUT
BL link_gpio_set_dir
loop:   MOV R0, #LED_PIN1
MOV R1, #1
BL link_gpio_put
LDR R0, =sleep_time
BL sleep_ms
MOV R0, #LED_PIN1
MOV R1, #0
BL link_gpio_put
MOV R0, #LED_PIN2
MOV R1, #1
BL link_gpio_put
LDR R0, =sleep_time
BL sleep_ms
MOV R0, #LED_PIN2
MOV R1, #0
BL link_gpio_put
MOV R0, #LED_PIN3
MOV R1, #1
BL link_gpio_put
LDR R0, =sleep_time
BL sleep_ms
MOV R0, #LED_PIN3
MOV R1, #0
BL link_gpio_put
B       loop

.data

      .align  4 @ necessary alignment

I didn’t intend to include any C code, but I ran into a couple of problems that require it. One is that a large number of SDK functions are inline C functions which means they can’t be called from outside of C. In our case two functions gpio_set_dir and gpio_put are inline and required wrapping. The other problem is that if the main program is Assembly Language then the code to initialize the board doesn’t seem to be called. I think this is a matter of setting the correct CMake options, but I haven’t had a chance to figure it out yet. For now we have main in the C code and then call the Assembly Language main routine.

C code:

#include “hardware/gpio.h”

void link_gpio_set_dir(int pin, int dir)
{
gpio_set_dir(pin, dir);
}

void link_gpio_put(int pin, int value)
{
gpio_put(pin, value);
}

void main()
{
main_asm();
}

The Raspberry Pi Pico SDK uses the CMake system to manage builds. The SDK provides a large set of build rules. You run CMake and then it creates a makefile that compiles your program.

CMake file:

cmake_minimum_required(VERSION 3.13)

include(pico_sdk_import.cmake)

project(test_project C CXX ASM)

set(CMAKE_C_STANDARD 11)
set(CMAKE_CXX_STANDARD 17)

pico_sdk_init()

include_directories(${CMAKE_SOURCE_DIR})

add_executable(flashledsasm
  mainmem.S
  sdklink.c
)

pico_enable_stdio_uart(flashledsasm 1)
pico_add_extra_outputs(flashledsasm)
target_link_libraries(flashledsasm pico_stdlib)

Still To-Do

The program works, but there are a few things I’m not happy about. The Raspberry Pi Pico SDK is pretty new, so there aren’t a lot of answers on StackOverflow yet. The good thing is that it is all open source, so it is just a matter of time to figure out the code. Here is what I’ll be working on:

  1. How to have main be in Assembly Language and have the board properly initialized. Match the C startup sequence.
  2. Figure out the details of the GPIO registers and have Assembly Language versions of the inline C code that accesses these. They are similar to those on the full Raspberry Pi, but different.
  3. How to get constants from the C include file, on first try this didn’t work and gave syntax errors, but the SDK says they should be usable from Assembly Language. They might need a couple of fixes.

Summary

I planned to write a 100% Assembly Language program, but didn’t quite make it. At least the program works, showing you can include Assembly Language in your RP2040 projects. The support to build using the GCC macro assembler is all there and besides some interactions with the SDK all seems to work well. Of course the Raspberry Pi Pico SDK is pretty new so there will be a lot of updates and there are still a number of undocumented holes to investigate.

Written by smist08

April 16, 2021 at 9:43 am

Raspberry Pi Pico First Project

with 5 comments

Introduction

Last week, I blogged about Adafruit’s Feather RP2040 microcontroller. I’ve now received a Raspberry Pi Pico which is also based on the ARM RP2040 processor. The Adafruit version is oriented around programming with CircuitPython and then interfacing to the various companion boards in Adafruit’s Feather lineup. The Raspberry Pi Pico is oriented around programming in C/C++ using the Raspberry Pico SDK and then designing interfaces to other devices yourself. Although you can do either with both boards, I’m going by the direction of the two companies tutorials and volume of documentation. The Raspberry Pi Pico also favours doing your development from a full Raspberry Pi, although you can use a Windows, Linux or Mac, it is quite a bit harder to setup serial communications and debugging.

In this article, I’m going to look at doing a small project with the Raspberry Pi Pico using the C SDK. I’ll look at the elements you use to program, debug and deploy code. I’ll use the same flashing LEDs project that I’ve written about previously.

More Soldering

Like the Adafruit board, if you want to use a breadboard to set up your electronics project, you need to solder pins to the Pico to attach it. Having practiced on the Adafruit Feather, this was no problem. In addition if you want to do proper C debugging using gdb then you need to solder three wires to the debug ports on the end of the board. Note you need wires that can connect to the Raspberry Pi 4’s GPIO pins. All the soldering went fine and when I wired everything up I got serial communications going as well as played a bit with debugging using the SDK’s sample blink program. I was lucky that I could find all the wires, connectors and pins required from previous various Arduino and Raspberry starter kits.

Setting up the LEDs

If you want to use print statements while you are debugging, then you need to use the separate serial communications pins rather than a serial communications channel through the USB port. This is because when you stop the processor in the debugger, it stops it dead and disconnects any USB connections.

The picture below shows the setup. The LEDs are connected each to a GPIO port and then go through a resistor to ground. Remember LEDs have low resistance and you don’t want to short out your device. Also remember that LEDs are directional and you want to connect the plus side to the GPIO pin. The plus side is usually indicated by the longer lead wire.

In the picture above you can also see the 3 wires at the end going off to the Raspberry Pi as well as the three wires by the USB port that lead to serial communications GPIO pins on the Pi. The USB connects to the Pi to provide power, as well as is used to load any program into the flash memory.

A single picture can be hard to see, here is a video showing the LEDs flashing as well as panning around a bit to see all the connections from different angles: https://youtu.be/ogbcMp5Asu4

Code

The C/C++ SDK for the RP2040 provides a set of libraries to help with your device programming along with support for using the GCC toolchain. This means you can write code in C/C++ and Assembly Language and then debug using gdb. If you look at the C code, it is similar to the code we wrote for the regular Raspberry Pi here, though we had to write some of these library type routines against the Linux device driver ourselves.

Here is the C code:

#include <stdio.h>
#include “pico/stdlib.h”
#include “hardware/gpio.h”
#include “pico/binary_info.h”

const uint LED_PIN1 = 18;
const uint LED_PIN2 = 19;
const uint LED_PIN3 = 20;

int main()
{

    bi_decl(bi_program_description(“Copyright (c) 2021 Stephen Smith”));

    stdio_init_all();
    gpio_init(LED_PIN1);
    gpio_init(LED_PIN2);
    gpio_init(LED_PIN3);
    gpio_set_dir(LED_PIN1, GPIO_OUT);
    gpio_set_dir(LED_PIN2, GPIO_OUT);
    gpio_set_dir(LED_PIN3, GPIO_OUT);

    while (1)
    {
        puts(“Flash Loop\n”);
        gpio_put(LED_PIN1, 1);
        sleep_ms(200);
        gpio_put(LED_PIN1, 0);
        sleep_ms(200);
        gpio_put(LED_PIN2, 1);
        sleep_ms(200);
        gpio_put(LED_PIN2, 0);
        sleep_ms(200);
        gpio_put(LED_PIN3, 1);
        sleep_ms(200);
        gpio_put(LED_PIN3, 0);
        sleep_ms(200);
    }

The SDK defines projects using CMake. When you run CMake it will generate a makefile that you use to build your executable using make. Here is the CMake file:

cmake_minimum_required(VERSION 3.13)

include(pico_sdk_import.cmake)

project(test_project C CXX ASM)
set(CMAKE_C_STANDARD 11)
set(CMAKE_CXX_STANDARD 17)

pico_sdk_init()

add_executable(flashleds
  flashleds.c
)

pico_enable_stdio_uart(flashleds 1)
pico_add_extra_outputs(flashleds)
target_link_libraries(flashleds pico_stdlib)

Print Statements

In the C code there is a puts() function call to send “Flash Loop” to stdout. You can define what this is in the CMake file, in our case it’s set to the serial port (as opposed to the USB port). We can then read this data by monitoring the serial port on the Raspberry Pi 4 that we’ve connected up. One way to display the output is with the minicom program by calling:

minicom -b 115200 -o -D /dev/serial0

Which then displays:

Debugging

The Pico and the C/++ SDK have support for gdb. You run gdb on the Raspberry Pi 4 and then gdb connects remotely through minicom to the Raspberry Pi Pico. The C tutorial has full instructions on how to do this and when running it works quite well, though it takes several long command lines to get debugging going.

Visual Studio Code

The C/C++ SDK has full support for Visual Studio Code. This is the reason Raspberry added Microsoft repositories to the Raspberry Pi OS to seamlessly install and update this. After all the outcry from users, the calls to the Microsoft repositories have been removed and you need to install this by hand, as documented in the C getting started manual. I’m not really keen on Visual Studio Code, but if you like it and want to use it, go for it.

The best thing about Visual Studio Code is that it automates setting up remote debugging, so you can just say debug and Visual Studio Code does all the setup and connecting behind the scenes.

Summary

Both the Raspberry Pi Pico and Adafruit Feather RP2040 are powerful computers and great value as $5 microcontrollers. The programming environments are rich and powerful. You have lots of choices on how to program these with a lot of good supporting libraries and SDKs. The RP2040 exposes all sorts of GPIO ports and interfacing technology for you to make use of in your projects. It will be interesting to see all the RP2040 based DIY projects being showcased.

For people using this SDK, you can program in 32-bit ARM Assembly Language and might want to consider my book “Raspberry Pi Assembly Language Programming”.

Written by smist08

April 8, 2021 at 3:12 pm

Introduction to the RP2040

with 10 comments

Introduction

The Raspberry Pi foundation recently started shipping the Raspberry Pi Pico, a flexible $4 USD microcontroller board. This is Raspberry’s first foray into the world of microcontroller boards, typically Raspberry Pi’s are low cost complete Linux computers and the microcontroller world has been typically dominated by the Arduino line of products, which are typically based on low cost 8-bit processors. One notable part of this announcement is that the processor included on the Pico is a dual core ARM processor with the System on a Chip (SoC) being designed by the Raspberry Pi foundation. In the past all Raspberry Pis have been based on Broadcom SoCs.

This new processor is the RP2040. Not only is this processor being used in the Pi Pico, but the Raspberry Pi foundation is selling this chip to other board manufacturers to use in their own product. Already many companies and foundations that develop Arduino boards are starting to ship products based on the RP2040, but designed to fit into their own eco-system whether it is Arduino or one of its competitors. For instance Arduino, Pimoroni, SparkFun, and Adafruit have all announced either shipping or near shipping products. I haven’t been able to order a Raspberry Pi Pico yet, since they instantly sold out, but I have been able to buy a similar Adafruit board.

All these RP2040 based boards are under $10USD and all of them have lots of GPIO ports to integrate them into your hardware projects. In this article we are going to look into some of the details of the RP2040 chip itself.

RP2040 Overview

The heart of the RP2040 are dual core ARM Cortex-M0+ 32-bit CPUs that operate at 133MHz, along with 264Kb of SRAM. This may not seem like much compared to regular Raspberry Pis or other common inexpensive computers, but in the microcontroller world this is really good. Most microcontrollers are based on slower 8-bit microcontrollers with only a few k of memory, for instance the Arduino Uno operates at 16MHz and has 6kb of memory. There are a sprinkling of 32-bit based microcontrollers based on various ARM or RISC-V models, but the RP2040 is the first to take these mainstream. Keep in mind that the original Apple II operated at 1MHz on an 8-bit CPU and only supported up to 48kb of RAM, yet it was fully capable of word processing, spreadsheets and arcade like video games. The original IBM PC only operated at 4.77MHz, had a 16-bit processor and upto 640Kb of RAM. This $4 microcontroller has far more computing power than any of these early PCs, so it will be pretty amazing to see what people produce.

The ARM Cortex-M0+ core is a low energy 32-bit processor. My book “Raspberry Pi Assembly Language Programming” covers 32-bit Assembly Language programming for a full A series ARM CPU. The M series ARM CPUs for microcontrollers contain a subset of the processing units in the full A series CPUs. For instance, there is no floating point unit (FPU) and no NEON vector processing unit. Further, it doesn’t use the full 32-bit sized instructions, instead it only uses the so called “thumb” instructions which are 16 bit in size. This means there are only eight 32-bit registers for integer arithmetic and other processing (there is only room for three bits to specify a register). With these limitations, the transistor count is greatly reduced for the CPU, reducing power usage and reducing cost. Even with these restrictions, the RP2040 is powerful enough to run Tensorflow Lite.

The RP2040 is manufactured by TSMC using an old 40nm process. This greatly reduces the cost of the chip, since there isn’t much competition for this manufacturing technology. Apple’s latest chips use a 5nm manufacturing process and there is a long waiting list of companies waiting for capacity.

Programming

Typically you program the Raspberry Pi Pico in MicroPython or C/C++ without an operating system. You write your program, compile it and download it to the Pico where it runs as a single program with no operating system. The Raspberry Pi foundation offers full support for MicroPython as well as provides a C/C++ SDK that allows you to compile programs in C, C++ or Assembly Language.

If you are familiar with Arduino and how you program these, then this is exactly how you program the Pico. Which means you need another computer to run the IDE (this could be a regular Raspberry Pi). Then the Pico is connected to this computer via USB.

Of course the Linux folks aren’t sitting still, they take the challenge that Linux should run anywhere quite seriously and have already ported Fuzix, a lightweight Linux to the Pico.

This is more likely to be used for embedded type Linux applications, since connecting monitor, keyboard and mouse are a major challenge that the Pico isn’t designed for.

Will Raspberry Produce Other Chips?

It will be interesting to see if the Raspberry Pi Foundation sticks with Broadcom chips for the regular Raspberry Pis. Will the Raspberry Pi 5 be based on a Pi Foundation designed SoC? If it is, would it allow them to use newer CPU cores? Will it mean they can incorporate an ARM Mali GPU rather than the Broadcom GPU they have been using? Will there be other models of embedded chips beyond the RP2040? Evidently the numbers are significant, the 2 is for the number of cores and the 4 indicates the memory size and then the 0’s are for possible future features. We’ll have to watch Raspberry’s product announcements closely over the next year or two.

Summary

The RP2040 is exciting because it could take the microcontroller market from 8-bit to 32-bit, increasing the programming capabilities of all these DIY electronics projects greatly. Not everyone will move to this, since it will also push the price of 8-bit microcontrollers even lower. But perhaps your next microwave or toaster will contain AI software running on an RP2040.

Written by smist08

March 26, 2021 at 10:38 am

Labists Raspberry Pi Kit Review

with one comment

Introduction

In December, I entered one of those web contests where you like something on social media and for each different like you get an entry to a contest. In this case, the contest was by betanews and the prize was one of three Labists 8Gig Raspberry Pi Starter Kits. The kit includes the Raspberry Pi, a microSD card with Raspberry Pi OS installed, a case, a fan and all the cables you need. You provide a keyboard, mouse and monitor and away you go with a full desktop computer.

Unboxing & Assembly

It came in a sturdy compact cardboard box with cardboard pieces inside to hold everything in place. Each piece was individually wrapped in plastic, which I thought was unnecessary and wasteful. There was a small assembly manual, but it was insufficient and mostly sent you to a youtube link. The problem was that it sent me to the link for a different case, so I needed to google the video for the correct case. The key things you need from the video is which three chips to put the sticky stuff on to attach to the fan and then which two GPIO pins to attach the fan leads to. Everything else was pretty obvious and easy to assemble. The video showed installing the screws with a magnetic screwdriver, but the screwdriver included in the kit wasn’t magnetic and it was too fiddly to install as shown in the video. However, it was easy to start the screws in the fan and then drop the fan in and then tighten the screws.

The Labists box and setup manual.

With all that, I had everything assembled in less than half an hour and turned it on booting into Raspberry Pi OS. Then you answer a few setup questions, namely your timezone and set a new password. After that Raspbian does an operating system update, which took perhaps 45 minutes, then it asked for a reboot and after that I was all up and running.

Impressions

I bought a 4Gig Raspberry Pi 4 when it first came out. I also bought a kit because the cables were all different from the Raspberry Pi 3. The big problem was that when the Pi 4 first came out, the case vendors didn’t realize that ventilation and cooling were a much bigger issue than for the Pi 3. The case my original Pi 4 came with didn’t have sufficient cooling and ended up not using it as there was better ventilation just not having a case. This case/fan combo is well thought out and provides plenty of cooling so I can use it and so far everything is running well. As an added bonus the dual fan has LED lights that change color as you go along. A huge improvement over what I had before.

A common criticism of the Raspberry Pi is that there is no power switch. The Raspberry foundation says this saves a bit of money and is key to keeping the low end model down at $35. Typically, I connect the Pi and its monitor to a power bar with a switch and then use the switch on the power bar to turn it on and off. The Labist kit’s power supply has a switch on the coard near the Pi unit, this switch is similar to what you find on many lamp’s power cords and I find it really convenient. 

The Raspberry Pi can drive two monitors and this kit came with two video cables, so you can do this without having to order the second cable separately. There are also a number of extra screws and spacers to help attach some of the common addons like breadboards or cameras.

My original 4Gig Pi runs pretty good, previously I had a Raspberry Pi 3 which had 1Gig RAM. The Raspberry Pi 3 would often bog down as it ran out of memory when you accidentally got too many programs running. The 4Gig in the Pi 4 fixed this for me and allowed the Raspberry Pi to run full 64-bit versions of Linux such as my favorite Kali Linux. I haven’t had a chance to stress this new 8Gig Raspberry Pi, but I expect I can run even more programs at once as well as run larger data files.

What Do People Use Raspberry Pi’s For?

The Raspberry Pi is an amazingly small and inexpensive, but powerful desktop computer. It runs Raspbian a version of Linux optimized for this hardware. What do people do with these? Here are a few common uses:

  1. Learning to program. The Raspberry Pi runs pretty much every programming language and is a great place to learn programming.
  2. Browse the Internet. The Pi runs many browsers including Chromium and Firefox for excellent browsing. The Pi has built in Wifi as well as a plug for hard wired Internet.
  3. Perform common office activities like writing, spreadsheets or presentations. It comes with LibreOffice pre-installed and this provides most of the functionality included in Microsoft Office. You can also run the Google online office apps in the browser.
  4. Use it for custom hardware control and programming. The Raspberry Pi has its flexible GPIO ports that can be used to interface the Pi to home automation hardware, control greenhouse watering, lighting and heating systems, control inexpensive robots and much more.
  5. Use it as a Pi-Hole or DNS server.
  6. Use it as a media center.
  7. Use it for gaming. There are amazing retro-gaming systems for the Raspberry Pi, if you buy an inexpensive USB game controller you can run pretty much any retro gaming console or PC game on the Pi.

The Microsoft Controversy

There is a recent controversy with the Raspberry Pi calling home to Microsoft which has pissed off quite a few open source purists. What this is, is that the Raspberry Pi OS now lets you easily install Microsoft Visual Studio Code from its package manager and rather than hosting the code in a Raspberry Pi repository, it directly accesses the Microsoft repository. You can see this being accessed in the following screenshot.

This is a convenience as it is another development environment you can learn on the Pi and there is a lot of good functionality there. However it means every time you check for updates, a Microsoft server is pinged. This means Microsoft can gather the IP addresses of every Raspberry Pi, mostly allowing them to know how many Raspberry Pi’s there are out there. Anyway, if you don’t like this, you can edit the list of repo’s checked or use an alternative version of Linux such as Kali or Ubuntu Linux. I’m not too upset about this, but I do think it looks strange seeing Microsoft referenced every time you do an update.

Summary

The Raspberry Pi 4 is an amazing inexpensive but powerful desktop computer. There is tons of software that runs on the Pi and the internet is full of hardware projects you can use the Pi with. The Raspberry Pi 4 was originally released in June, 2019, originally there were a few software problems and overheating problems. Since then all those problems have been fixed and the various Pi kits have been refined with better cases and power supplies. Certainly if you are looking for a complete Raspberry Pi kit, Labists makes a good choice.

Written by smist08

February 12, 2021 at 11:11 am

Blocking Speculative Execution Vulnerabilities on ARM Processors

with 4 comments

Introduction

We blogged previously about the Spectre and Meltdown exploits that use the side effects of a CPU’s speculative execution to learn secrets that should be protected. The other day, ARM put out a warning and white paper for another similar exploit called “Straight-line Speculation”. Spectre and Meltdown took advantage of branch prediction, this is the opposite case where the processor’s speculation mechanism continues on past an unconditional branch instruction.

These side-channel attacks are hard to set up and exploit, but the place where they are most dangerous is in virtual environments, especially in the cloud. This is since it is a case where data leaks from one virtual machine to another and allows you to steal data from another cloud user. Running on a phone or PC isn’t so dangerous, since if you have the ability to run a program, you have way more power to do much more dangerous things, so why bother with this? This means that as ARM is used more often in cloud data centers, they are going to have to be very sensitive to these issues.

Why Does a Processor Do This?

Modern CPU’s process instructions ahead of the current executing instruction for performance reasons. There is capacity for a CPU core to process several instructions simultaneously and if one is delayed, say due to waiting for main memory, other instructions that don’t have a dependency on this can be executed. But why would the processor bother to process instructions after an unconditional branch? Afterall, ARM added the return (RET) instruction to their 64-bit instruction set so that the CPU would know where execution continued next and the pipeline wouldn’t need to be flushed.

First, ARM provides reference designs, and many manufacturers like Apple, take these and improve on them. This means that not all ARM processors work the same in this regard. Some might use a simpler brute-force approach that doesn’t interpret the instructions at all, others may add intelligence and their speculative pipeline interprets instructions and knows how to follow unconditional branches. The smarter the approach and the more silicon is required to implement it and possibly more heat is generated as it does its job.

The bottom line is that not all ARM processors will have this problem. Some will have limited pipelines and are safe, others interpret instructions and are safe. Then a few will implement the brute-force approach and have the vulnerability. The question then: what needs to be done about this?

ARM Instructions to Mitigate the Problem

Some ARM processors let you turn off speculative execution entirely. This is a very heavy handed approach and terrible for performance. This may make sense in some security sensitive situations, but usually is too heavy a hit on performance. The other approach is to use a number of ARM Assembly Language instructions that will halt speculative execution when encountered. Let’s look at them:

  • Data Synchronization Barrier (DSB): completes when all instructions before this instruction complete.
  • Data Memory Barrier (DMB): ensures that all explicit memory accesses before the DMB instruction complete before any explicit memory accesses after the DMB instruction start.
  • Instruction Synchronization Barrier (ISB): flushes the pipeline in the processor, so that all instructions following the ISB are fetched from cache or memory, after the ISB has been completed.
  • Speculation Barrier (SB): bars speculation of any instructions after this one, until after it is executed. Note: this instruction is optional and not available on all processors.
  • Clear the cache (MSR and SYS): using these instructions you can clear or invalidate all or part of the cache. However, these are privileged instructions and invalid from user space. Generally, these instructions are only used in the operating system kernel.

The various ARM security whitepapers have recommendations on how to use these instructions and the various developer tools like GCC and LLVM have compiler options to add these instructions to your code.

Adding a DSB/ISB instruction pair after an unconditional branch won’t affect the execution performance of your program, but it will increase the program size by 64-bits (two 32-bit instructions) after each unconditional branch. The recommendation is to only turn on the compiler options to generate this extra code in routines that do something sensitive like handling passwords or user’s sensitive data. Linux Kernel developers have to be cognizant of these issues to ensure kernel data doesn’t leak.

If you are playing with a Raspberry Pi, you can add the DSB, DMB and ISB instructions and run them in user mode. If you add an SB instruction then you need to add something like “-march=armv8.2-a+sb” to your as command line. The Broadcom ARM processor used in the Pi doesn’t support this instruction, so you will get an “Illegal Instruction” error when you execute it.

Ideally, this should all be transparent to the programmer. Future generations of the ARM processor should fix these problems. These instructions were added so systems programmers can address security problems as they appear. Without these tools, there would be no remedy. Addressing the current crop of speculative execution exploits, doesn’t guarantee that new ones won’t be discovered in the future. Its a cat and mouse, or whack-a-mole type game between hackers and security professionals. The next generation of chips will be more secure, but then the ball is in the hackers court.

Summary

It’s fascinating to see how hackers can exploit the smallest side-effect in programs or hardware to either take control of a system or to steal data from it. These recent exploits show how security has to be taken seriously in all design aspects, whether hardware, microcode, system software or user applications. As new attacks are developed, everyone has to scurry to develop workarounds, solutions and mitigations. In this cat and mouse world, there is no such thing as absolute security and everyone has to be aware that there are always risks if you are attached to a network.

If you are interested in these sorts of topics, be sure to check out my book: Programming with 64-Bit ARM Assembly Language.

Written by smist08

June 12, 2020 at 5:13 pm

Raspberry Pi Gets 8Gig and 64-Bits

with one comment

Introduction

The Raspberry Pi Foundation recently announced the availability of the Raspberry Pi 4 with 8-Gig of RAM along with the start of a beta for a 64-bit version of the Raspberry Pi OS (renamed from Raspbian). This blog post will look into these announcements, where the Raspberry Pi is today and where it might go tomorrow.

I’ve written two books on ARM Assembly Language programming, one for 32-bits and one for 64-bits. All Raspberry Pis have the ARM CPU as their brains. If you are interested in learning Assembly language, the Raspberry Pi is the ideal place to do so. My books are:

32- Versus 64-Bits

Previously the Raspberry Pi Foundation had been singing the virtues of their 32-bit operating system. It uses less memory than a 64-bit operating system and would run on every Raspberry Pi ever made. Further if you really wanted 64-bits then you could run alternative versions of Linux from Ubuntu, Gentoo or Kali. The limitation of 32-bits is that you can only address 4 Gig of memory, so this seems like a problem for an 8-gig device, but 32-bit Raspbian handles this and in fact each process can have up to 4-gig of RAM and hence all the 8gig will get used if needed, just across multiple processes.

The downside to this is that the ARM 64-bit instruction set is faster, the memory addressing is simpler without this extra Linux memory management and modern ARM processors are optimised around 64-bits and only maintain 32-bits for compatibility. There are no new improvements to the 32-bit instruction set and typically it can’t take advantage of newer features and optimizations in the processor.

The Raspberry Pi foundation has released a beta version of the Raspberry Pi OS where the kernel is compiled for 64-bits. Many of the applications are still 32-bits but can run fine in compatibility mode, this is just a band-aid until everything is compiled 64-bit. I’ve been running the 64-bit version of Kali Linux on my Raspberry Pi 4 with 4-gig for a year now and it is excellent. I think the transition to 64-bits is a good one and there will be many benefits down the road.

New Hardware

The new Raspberry Pi 4 model with 8-gig of RAM is similar to the older model. The change was facilitated by the recent availability of a 8-gig RAM chip in a compatible form factor. They made some small adjustments to the power supply circuitry to handle the slightly higher power requirements of this model. Otherwise, everything else is the same. If a 16-gig part becomes available they would be able to offer such a model as well. The current Raspberry Pi memory controller can only handle up to 16-gig, so to go higher, this would need to be upgraded as well.

The new model costs $75 USD with 8-gig of RAM. The 2-gig model is still only $35 USD. This is incredibly inexpensive for a computer, especially given the power of the Pi. Remember this is the price for the core unit, you still need to provide a monitor, cables, power supply, keyboard and mouse.

Raspberry Pi Limitations

For most daily computer usage the Raspberry Pi is fine. But what is the difference between the Raspberry Pi and computers costing thousands of dollars. Here are the main ones:

  1. No fast SSD interface. You can connect an SSD or mechanical harddrive to a Raspberry Pi USB port, but this isn’t as fast as if there was an M.2 or SATA interface. M.2 would be ideal for a Raspberry Pi given its compact size. Adding an M.2 slot shouldn’t greatly increase the price of a Pi.
  2. Poor GPU. On most computers GPUs can be expensive. For $75 or less you get an older less powerful GPU. A better GPU, like ARM’s Mali GPU or some nVidia CUDA cores would be terrific, but will probably double or triple the price of the Pi. Even with the poor GPU, the RetroPi game system is terrific.
  3. Faster memory interface. The Raspberry Pi 4 has DDR4 memory, but it doesn’t compare will to other computers with DDR4. This probably indicates a bottleneck in either the PCI bus or Pi memory controller. I suspect this keeps the price low, but limits CPU performance due to bottlenecks limiting the data flow to and from memory.

If the Raspberry Pi addressed these issues, it would be competitive with most computers costing hundreds of dollars more.

Summary

The 8-gig version of the Raspberry Pi is a powerful computer for only $75. Having 8-gig of RAM allows you to run more programs at once, have more browser windows open and generally have more work in progress at one time. Each year the Raspberry Pi hardware gets more powerful. Combine this with the forthcoming 64-bit version of the Raspberry Pi OS and you have a powerful system that is ideal for the DIY hobbyist, for people learning about programming, and even people using it as a general purpose desktop computer.

Written by smist08

June 5, 2020 at 4:36 pm

ARM Processor Modes

leave a comment »

Introduction

Last time we discussed how ARM Processor interrupts work, and we mentioned that interrupts switch the processor from user mode to an operating system mode, but we never discussed what exactly the ARM Processor modes are. In this article we will discuss the ARM Processor modes, why they exist and when they are used.

The available processor modes vary by ARM model, so we will look at those commonly available. For the exact details on any specific ARM processor you need to check in that processor’s reference manual.

ARM Processor Modes

The purpose of processor modes is to regulate access to memory and hardware resources so that a process initiated by a specific user can’t access the memory of other processes or access hardware they don’t have permission for. The operating system can add quite refined permissions, so users only have access to certain files, read-only access to certain files, or other granular rights. This might all sound like overkill for a Raspberry Pi, but all versions of Linux, including Raspbian support multiple users and multiple processes all logged in and running at once. Further you might set up specific users and groups to grant the exact rights to processes like web servers to help protect you system from malicious hackers or program bugs causing havoc.

Most ARM processors have two security levels for processes. PL0 is for user mode programs and then PL1 is for operating system code. Newer ARM processors used in servers have a third level PL2 for virtualization hypervisors, so they can keep their various hosted operating systems completely separate. There is also an optional ARM build for secure computing, if this is present then there is an even higher PL3 level that is used for a system security monitor.

The following table from the ARM Processor Reference manual. There are quite a few processor modes and we’ll talk about them all, but the two main ones are user mode for regular programs and then system mode for the operating system.

Let’s list all the processor modes and describe what it is used for:

  • User – regular programs that can access the resources they have permission for.
  • FIQ – the processor goes into this mode when handling as fast interrupt. The operating system provides this code and it has access to all operating system resources.
  • IRQ – the processor goes into this mode when handling a regular interrupt. The operating system provides this code and it has access to all operating system resources.
  • Supervisor – when a user mode program makes an SVC Assembly instruction which calls an operating system services, the program switches to this mode, which allows the program to operate at a privileged level for the duration of the code.
  • Monitor – if you have an ARM processor with security extensions then this mode is used to monitor the system.
  • Abort – if a user mode program tries to access memory it isn’t allowed, then this mode is entered to let the operating system intervene and either terminate the program, or send the program a signal.
  • Hyp – this is hypervisor mode that is an optional ARM extension. This allows the virtual hypervisor run at a more secure level than the operating systems it is virtualizing.
  • Undefined – is a user mode program tries to execute an undefined or illegal Assembly instruction then this mode is entered and the operating system can terminate the program or send it a signal.
  • System – this is the mode that the operating system runs at. Processes that the operating system considers part of itself run at this level.

The mode bits in the table, are the bits that are set in the Control Program Status Register (CPSR) are the bits that get set in the lower order bits. This way the operating system can see what mode it’s in and act accordingly when appropriate.

ARM Boot Process

When powered on, the ARM processor starts up by initiating a reset interrupt. This causes the reset interrupt handler code to execute, which will typically be a branch to the code to start the operating system. At this point we are running in IRQ mode. We will change the processor mode to supervisor and initiate the operating system boot process. To change the processor mode we directly manipulate the bits in the CPSR with code like:

MRS   R0, CPSR        @ Move the CPSR to R0
BIC   R0, R0, #0x1F   @ clear the mode bits
ORR   R0, R0, #0x13   @ Set the mode bits to 10011 (SVC mode)
MSR   CPSR, R0        @ Update the CPSR

Note that reading and writing the CPSR like this are privileged instructions and only available to programs running in PL1 or better. Besides updating the processor mode, the operating system uses these to save a program’s state when doing multitasking. Saving the registers is easy, but the CPSR must also be preserved so as not to disrupt the running process.

Summary

This was a quick introduction to the ARM Processor modes. You don’t need to know this for application programming, but if you are interested in writing an operating system or if you are interested in how operating system support works for the ARM processor then this is a starting point.

If you are interested in learning more about ARM Assembly Language Programming, please check out my book, the details are available here.

Written by smist08

December 2, 2019 at 12:37 pm