Stephen Smith's Blog

Musings on Machine Learning…

Archive for the ‘raspberry pi’ Category

Out-of-Order Instructions

Introduction

We think of computer processors as executing a set of instructions one at a time in sequential order. As programmers this is exactly what we expect the computer to do, and the idea that the computer might decide to execute our carefully written code in a different order terrifies us. We would expect our program to fail, producing wrong results or crashing. However, we see manufacturers claiming their processors execute instructions out-of-order and that this is a feature that improves performance. In this article, we’ll look at what is really going on here and how it can benefit us, without causing too much fear.

Disclaimer

ARM defines the Instruction Set Architecture (ISA), which defines the Assembly Language instruction set. ARM provides some reference implementations, but individual manufacturers can take these, customize them or develop their own independent implementation of the ARM instruction set. As a result the internal workings of ARM processors differ from manufacturer to manufacturer. A main point of difference is in performance optimizations. Apple is very aggressive in this regard, which is why the ARM processors in iPads and iPhones beat the competition. This means the level of out-of-order execution differs from manufacturer to manufacturer. Further, it is much more prevalent in newer ARM chips. As a result, the examples in this article will apply to a selection of ARM chips but not all.

A Couple of Simple Cases

Consider the following small bit of code to multiply two numbers then load another number from memory and add it to the result of the multiplication:

MUL R3, R4, R5 @ R3 = R4 * R5
LDR R6, [R7]   @ Load R6 with the memory pointed to by R7
ADD R3, R6     @ R3 = R3 + R6

The ARM Processor is a RISC processor and its goal is to execute each instruction in 1 clock cycle. However, multiplication is an exception and takes several clock cycles longer because of the extra work the multiplier has to perform internally. The load instruction doesn’t rely on the result of the multiplication and doesn’t involve the arithmetic unit. Thus it’s fairly simple for the ARM Processor to see this and execute the load while the multiply is still churning away. If the memory location is in cache, chances are the LDR will complete before the MUL and hence we say the instructions executed out-of-order. The ADD instruction then needs the results from both the MUL and LDR instructions, so it needs to wait for both of these to complete before executing its addition.

Consider another example of three LDR instructions:

LDR R1, [R4] @ memory in swap file
LDR R2, [R5] @ memory not in cache
LDR R3, [R6] @ memory in cache

Here the memory being loaded by the first instruction has been swapped out of memory to secondary storage, so loading it is going to be slow. The second memory location is in regular memory. LPDDR4 memory, like that used in the new Raspberry Pi 4, is pretty fast, but not as fast as the CPU, and it is also busy loading instructions to process, hence this second LDR might take a couple of cycles to execute. It makes a request to the memory controller and its request is queued with everything else going on. The third instruction assumes the memory is in the CPU cache and hence is processed immediately, so this instruction really does take only 1 clock cycle.

The upshot is that these three LDR instructions could well complete in reverse order.
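Both examples can be sketched with a toy timing model in Python. This is only an illustration under stated assumptions: the latencies are invented numbers, one instruction is fetched per cycle, and the core is assumed to start any instruction as soon as its source registers are ready. The function name completion_cycles is made up for this sketch and does not describe any particular ARM core.

```python
def completion_cycles(program):
    """Return {label: cycle at which the result becomes available}.

    program is a list of (label, dest, srcs, latency) tuples.
    Toy model: instruction i is fetched in cycle i and starts as soon as
    all of its source registers are ready. Latencies are assumed values.
    """
    ready = {}      # register name -> cycle its value becomes available
    finished = {}
    for fetch_cycle, (label, dest, srcs, latency) in enumerate(program):
        start = max([fetch_cycle] + [ready.get(r, 0) for r in srcs])
        finished[label] = start + latency
        ready[dest] = finished[label]
    return finished


# First example: MUL (assumed 4 cycles), LDR from cache (assumed 2), ADD (1).
mul_example = completion_cycles([
    ("MUL", "R3", ("R4", "R5"), 4),
    ("LDR", "R6", ("R7",), 2),
    ("ADD", "R3", ("R3", "R6"), 1),
])

# Second example: swap file, main memory, cache (all latencies invented).
ldr_example = completion_cycles([
    ("LDR R1", "R1", ("R4",), 1_000_000),
    ("LDR R2", "R2", ("R5",), 100),
    ("LDR R3", "R3", ("R6",), 1),
])
ldr_order = sorted(ldr_example, key=ldr_example.get)

print(mul_example)   # LDR finishes before MUL, ADD waits for both
print(ldr_order)     # the three loads complete in reverse program order
```

With these made-up latencies the LDR in the first example finishes a cycle ahead of the MUL, and the ADD cannot start until the MUL result is ready.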

Newer ARM processors can look ahead through the instructions looking for independent instructions to execute. The size of this pool determines how out-of-order things can get. The important point is that instructions that have dependencies can’t start until the results they depend on are ready, so to the programmer it looks like their code is executing in order, and all this magic is transparent to the correct execution of the program.

Since the CPU is executing all these instructions at once, you might wonder what the value of the program counter register (PC) is? This register has a very precisely defined value, since it is used for PC relative addressing. So the PC can’t be affected by out-of-order execution. 

Coprocessors

All newer ARM processors include floating-point coprocessors and NEON vector coprocessors. The instructions that execute on these usually take a few instruction cycles to execute. If the instructions that follow a coprocessor instruction are regular ARM instructions and don’t rely on the results of coprocessor operations, then they can continue to execute in parallel to the coprocessor. This is a handy way to get more code parallelism going, keeping all aspects of the CPU busy. Intermixing coprocessor and regular instructions is another great way to leverage out-of-order instructions to get better performance.

Compilers and Code Generation

This indicates that if a compiler code generator or an Assembly Language programmer rearranges some of their instructions, they can get more things happening at once in parallel, giving the program better performance. ARM Holdings contributes to the GNU Compiler Collection (GCC) to fully utilize the optimizations present in their reference implementations. In the ARM specific options for GCC, you can select the ARM processor version that matches your target and get more advanced optimizations. Since Apple creates their own development tools under Xcode, they can add optimizations specific to their custom ARM implementations.

As Assembly Language programmers, if we want to get the absolute best performance we might consider re-arranging some of our instructions so that instructions that are independent of each other are in a row and hopefully can be executed in parallel. This can require quite a bit of testing to reverse engineer the exact out-of-order instruction capability of your particular target ARM processor model. As always with performance optimizations, you must test the performance to prove you are improving things, and not just making your code more cryptic.
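As a rough illustration of why the ordering can matter, here is a Python sketch of a simple in-order pipeline. It is a toy model under assumed numbers: a load is taken to need 3 cycles and an add 1 cycle, and an instruction that is waiting for its inputs blocks everything behind it. The function name in_order_cycles and the register choices are made up for the example. Real cores differ, which is exactly why testing on your target matters.

```python
def in_order_cycles(program):
    """Total cycles for a toy in-order pipeline.

    program is a list of (dest, srcs, latency) tuples. An instruction
    stalls until its sources are ready, and nothing behind it may start
    before it does. Latencies are assumed, not measured.
    """
    ready = {}          # register name -> cycle its value is available
    next_issue = 0      # earliest cycle the next instruction may start
    last_finish = 0
    for dest, srcs, latency in program:
        start = max([next_issue] + [ready.get(r, 0) for r in srcs])
        finish = start + latency
        ready[dest] = finish
        next_issue = start + 1
        last_finish = max(last_finish, finish)
    return last_finish


LOAD, ADD = 3, 1                  # assumed latencies in cycles
ldr_r1 = ("R1", ("R4",), LOAD)    # LDR R1, [R4]
add_r1 = ("R1", ("R1",), ADD)     # ADD R1, #1
ldr_r2 = ("R2", ("R5",), LOAD)    # LDR R2, [R5]
add_r2 = ("R2", ("R2",), ADD)     # ADD R2, #1

# Each ADD directly after the LDR it depends on.
dependent_pairs = in_order_cycles([ldr_r1, add_r1, ldr_r2, add_r2])
# Both independent loads first, then the two ADDs.
interleaved = in_order_cycles([ldr_r1, ldr_r2, add_r1, add_r2])

print(dependent_pairs, interleaved)
```

In this model, putting the two independent loads next to each other lets the second load overlap with the first, so the same four instructions finish in fewer cycles.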

Interrupts

This all sounds great, but what happens when an interrupt happens? This could be a timer interrupt to say your time-slice is up and another process gets to use the ARM Core, or it could be that more data needs to be read from the Wifi or a USB device.

Here the ARM CPU designer has a choice: they can forget about the work-in-progress and handle the interrupt quickly, or they can wait a couple of cycles to let work-in-progress complete and then handle the interrupt. Either way they have to allow the interrupt handler to save the current context and then restore the context to continue execution. Typically interrupt handlers do this by saving all the CPU and coprocessor registers to the system stack, doing their work and then restoring state.

When you see an ARM processor advertised as designed for real-time or industrial use, this typically means that it handles interrupts quickly with minimal delay. In this case, the work-in-progress is discarded and will be redone after the interrupt is finished. For ARM processors designed for general purpose computing, this usually means that user performance is more important than being super responsive to interrupts and hence they can let some of the work-in-progress complete before servicing the interrupt. For general purpose computing this is ok, since the attached devices like USB, ethernet and such have buffers that can hold enough contents to wait for the CPU to get around to them.

A Step Too Far and Spectre

Hardware designers went even further with branch prediction, where if a conditional branch instruction needs to wait for a condition code to be set, they don’t wait but keep going, assuming one branch direction (perhaps based on the result from the last time this code executed). The problem here is that the CPU has to be able to throw this work away and go back when it guesses wrong. The discarded work still leaves traces behind in the CPU caches, and these have no security protection. The Spectre attack figured out a way to use these traces to get at data the program should never have seen. This caused data leakage across processes or even across virtual machines. The whole Spectre debacle showed that great care has to be taken with these sorts of optimizations.

Heat, the Ultimate Gotcha

Suppose your ARM processor has four CPU cores and you write a brilliant Assembly Language program that uses all four cores and fully exploits out-of-order execution. Your program is now using every bit of the ARM CPU, and each core is intermixing regular ARM, floating point and NEON instructions. You have intermixed your ARM instructions to get the arithmetic unit operating in parallel to the memory unit. This will be the fastest implementation yet. Then you run your program, it gets off to a great start, but then suddenly slows down to a crawl. What happened?

The enemy of parallel processing on a single chip is heat. Everything the CPU does generates a little heat. The more things you get going at once the more heat will be generated by the CPU. Most ARM based computers like the Raspberry Pi assume you won’t be running the CPU so hard, and only provide heat dissipation for a more standard load. This is why Raspberry Pis usually do so badly playing high-res videos. They can do it, as long as they don’t overheat, which typically doesn’t take long.

This leaves you a real engineering problem. You need to either add more cooling to your target device, or you have to deliberately reduce the CPU usage of your program, where perhaps paradoxically you get more work done using two cores rather than four, because you won’t be throttled due to overheating.
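One way to approach the second option is to watch the CPU temperature and scale back before throttling kicks in. The Python sketch below assumes a Linux system that exposes the temperature in millidegrees Celsius at /sys/class/thermal/thermal_zone0/temp. The 75 degree limit and the halving policy in pick_worker_count are arbitrary assumptions for illustration, not values taken from any Raspberry Pi specification.

```python
def parse_millidegrees(text):
    """Convert sysfs text such as '48312\n' to degrees Celsius."""
    return int(text.strip()) / 1000.0


def read_cpu_temp(path="/sys/class/thermal/thermal_zone0/temp"):
    """Read the CPU temperature in degrees Celsius from sysfs (Linux only)."""
    with open(path) as f:
        return parse_millidegrees(f.read())


def pick_worker_count(temp_c, max_workers=4, limit_c=75.0):
    """Hypothetical policy: use every core while cool, half of them when hot."""
    if temp_c < limit_c:
        return max_workers
    return max(1, max_workers // 2)


# Sample reading used here so the sketch runs anywhere; on a Pi you would
# call read_cpu_temp() instead.
sample_temp = parse_millidegrees("81234\n")
workers = pick_worker_count(sample_temp)
print(sample_temp, workers)
```

A real program would poll read_cpu_temp() periodically and adjust its thread pool, and the right limit needs to be found by testing on the actual device and case.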

Summary

This was a quick overview of out-of-order instructions. Hopefully you don’t find these scary and keep in mind the potential benefits as you write your code. As newer ARM processors come to market, we’ll be seeing larger and larger pools of instructions executed in parallel, where the ability for instructions to execute out-of-order will have even greater benefits.

If you are interested in machine code or Assembly Language programming, be sure to check out my book: “Raspberry Pi Assembly Language Programming” from Apress. It is available from all major booksellers or directly from Apress here.

Written by smist08

November 15, 2019 at 11:11 am

Raspberry Pi Assembly Language Programming

Introduction

My new book “Raspberry Pi Assembly Language Programming” has just been published by Apress. This is my first book to be published by a real publisher and I’m thrilled to see it appearing on websites of booksellers all over the Internet. In this blog post I’ll talk about how this book came to exist, the process of writing and publishing it and a bit about the book itself.

For anyone interested in this book, here are a few places where it is available:

Most of these sites let you see a preview and the table of contents.

This blog’s dedicated page to my book.

How this Book Came About

I purchased my Raspberry Pi 3+ in late 2017 and had a great deal of fun playing with it. I wrote quite a few blog posts on the Pi, a directory of these is available here. The Raspberry Pi package I purchased included a breadboard and a selection of electronic components. I put together a set of LEDs connected to the Pi’s GPIO ports. I then wrote a series of articles on making these LEDs flash using various programming languages including C, Python, Scratch, Fortran, and Erlang. In early 2018 I was interested in learning more about how the Pi’s ARM processor works and delved into Assembly language programming. This resulted in two blog posts, an introduction and then my flashing LED program ported to ARM Assembly Language.

Earlier this year I was contacted by an Apress Talent Acquisition agent who had seen my blog articles on ARM Assembly Language and wanted to know if I wanted to develop them into a book. I thought about it over the weekend and was intrigued. The material I found when writing the blog articles wasn’t great, and I felt I could do better. I replied to the agent and we had a call to discuss the book. He had me write up a proposal and possible table of contents. I did this, Apress accepted it and sent me a contract to sign.

The Process

Apress provided a Word style sheet and a written style guide. My writing process has been to write in Google Docs and then have my spouse, a professional editor, edit it. The collaboration of Google Docs is just too good to do away with. So I wrote the chapters in Google Docs, got them edited and then transferred them to MS Word and applied the Apress style sheet.

I worked with a coordinating editor at Apress who was very energetic in getting all the pieces done. She found a technical editor who would provide a technical review of each chapter as I wrote it. He was located in the UK, so often I would submit a chapter and see it edited overnight.

Once I had submitted all the chapters then a senior development editor gave the whole book a review. At that point I thought I was done, but then the book was given to Springer’s (Apress’s parent company) production department who did another editing pass. I was surprised that the production department still found quite a few things that needed fixing or improving.

After all that the book appeared fairly quickly. I like the cover, which uses my photo of my breadboard with the flashing LEDs. As of today, the book is available at most booksellers, some with stock and some on preorder. I signed the contract in June and did the bulk of the writing in July and August. Overall, I’m pretty happy with the process and how things turned out.

The Book

My philosophy was to introduce complete working programs from Chapter 1 with the traditional “Hello World” program. I only covered topics where you could write the code with the tools included with the Raspberry Pi and run them. I lay the foundations for how to write larger Assembly programs, with how to code the various structured programming constructs, but also include a chapter on how to interoperate with C and Python code.

Raspbian is a 32-bit operating system as older Raspberry Pis and the Raspberry Pi Zero can only run 32-bit code. I didn’t want to leave out 64-bit code, as there are 64-bit versions of Linux from other distributions like Ubuntu that are available for the Pi. So I included a chapter on ARM 64-bit Assembly along with guidelines on how to port your 32-bit code to 64-bit. I then included 64-bit versions of several of the programs we had developed along the way.

There is a lot of interest in ARM Assembly Language, especially from hackers, as all phones, tablets and even a few laptops are running ARM processors now. I included a number of hacking related topics like how to reverse engineer code, as security professionals are very interested in this as they work to protect the mobile devices utilized by their organizations.

The ARM Processor is a good example of a RISC processor, so if you are interested in RISC, this book will give a good introduction to the concepts, like how to do everything with instructions that are only 32 bits in length. Once you understand ARM Assembly, picking up the Assembly Language of another RISC processor like RISC-V becomes much easier.

The book also covers how to program the floating point processor included with most ARMs along with the NEON vector processor that is available on newer Raspberry Pis.

Summary

If you are interested in learning Assembly Language, please check out my book. The Raspberry Pi provides a great platform to do this. Even if you only program in higher level languages, knowing Assembly Language will help you understand what is going on at a deeper level. How modern processors design their Assembly Language to maximize program performance and minimize memory usage is quite fascinating and I hope you find the topic as interesting as I do.

Written by smist08

November 1, 2019 at 11:22 am

The Race for 64-Bit Raspberry Pi 4 Linux

Introduction

When the Raspberry Pi 4 was announced and shipped this past June, it caught everyone by surprise. No one was expecting a new Pi until next year sometime, if we were lucky. The Raspberry Pi 4 has updated faster components, including an updated ARM processor and USB 3.0. Raspbian, the official version of Linux for the Pi, was updated to be based on Debian Buster and shipped before the official Debian Buster actually shipped. However, Raspbian is still 32-bit. The Raspberry Pi Foundation says this is so they only have to support one version of Linux for all Raspberry Pi devices.

Others in the Linux community have figured out how to run 64-bit Linux distributions on the Raspberry Pi. For instance there are 64-bit versions of Ubuntu MATE, Ubuntu Server and Kali Linux. These work on the Raspberry Pi 3, but due to changes in the Raspberry architecture, didn’t work on the Raspberry Pi 4 when it shipped. We still don’t have official 64-bit releases, but we are reaching the point where the test builds are starting to work quite well.

Why 64-Bit?

To be honest, 64-bit Linux never ran very well on the Raspberry Pi 3. 64-bit Linux and 64-bit programs require quite a bit more memory than their 32-bit equivalents. Each memory address is now 64 bits instead of 32 bits and there is a tendency to use 64-bit integers rather than 32-bit integers. The ARM processor instructions are 32 bits in both 32-bit and 64-bit mode, so programs tend to be about the same size, though 64-bit doesn’t have use of the 16-bit ARM Thumb instructions. The Raspberry Pi 3 is limited to 1Gig of memory, which can just barely run a 64-bit Linux and tends to run out of memory quickly as you run programs like web browsers. The Raspberry Pi 4 now supports up to 4Gig of memory and that is sufficient to run 64-bit Linux along with a respectable number of programs. Plus the Raspberry Pi 4 has faster access to the SDCard and USB 3, so you can attach an even faster external drive, so if you do get swapping, it isn’t as painful.
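To make the memory cost concrete, here is a small Python sketch that sizes a hypothetical linked-list node holding two pointers and one integer. The node layout is invented for illustration, and it assumes the integer field grows to 64 bits along with the pointers, which is the tendency described above rather than a rule.

```python
import struct

# A hypothetical linked-list node: two pointers plus one integer field.
# "<" selects standard sizes with no padding, so the results do not
# depend on the machine running this script.
node_32 = struct.calcsize("<IIi")   # two 32-bit pointers + 32-bit int
node_64 = struct.calcsize("<QQq")   # two 64-bit pointers + 64-bit int

print(node_32, node_64)
```

Pointer-heavy data structures like this are where the doubling shows up. Bulk data such as text or images stays the same size.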

In spite of these limitations, there are reasons to run 64-bit. The main one is that you can get better performance, especially if you actually need to work with 64-bit integers. Further, the 64-bit instruction set has been optimised to work better with the execution pipeline, so you don’t get as many stalls when you perform jumps. For instance in 32-bit ARM, there is no function return instruction, so people use regular branches, pop the return address from the stack directly into the program counter or do a number of other tricks. As a result, function returns cause the execution pipeline to be flushed. In 64-bit, the pipeline knows about the return instruction and knows where to get the next few instructions.

If 64-Bit Worked on the Pi 3, What’s the Problem?

If we had 64-bit working for the Pi 3, why doesn’t it just work on the Pi 4? There are a few reasons for this. The first obstacle was that Raspberry changed the whole Pi boot process. The Raspberry Pi 3 booted using the GPU. On startup, the Pi 3’s GPU runs a program that knows how to read the boot folder on an SDCard, loads this into memory and then starts the ARM CPU to run what it loaded. The Raspberry Pi 4 now has a slightly larger EEPROM, which contains ARM code that executes on startup and then loads a further step from the SDCard. The volunteers with the other Linux distributions had to figure out this new process and adapt their code to fit into it. Sadly, the original EEPROM program didn’t provide a good way to do this, so the Linux volunteers have been working with Raspberry to get the support they need in newer versions of the EEPROM software. The most recent version finally seems to be working reliably.

The Raspberry Pi 4 then has all new hardware, so new drivers are required for bluetooth, wifi and everything else. To keep the price down, Raspberry uses older standard components, so there are drivers already written for all these devices. It’s just a matter of including the correct drivers and providing default configurations that work and settings dialogs if anything might need user input. This is all being worked on in parallel, and the consensus is that we are already in a better place than we were for the Pi 3.

It’s All Open Source so Why not Copy from Raspbian?

The Raspbian kernel is open source so anyone can look at that source code, but the EEPROM firmware is not open source. This can be reverse engineered, but that takes time. The Raspberry Pi Foundation has been quite helpful in supporting people, but that is no substitute for reading the source code. This again shows the importance of open source BIOS.

Development got off to a slow start, because the Raspberry Pi Foundation didn’t give anyone a heads up that this was coming. The developers of Ubuntu MATE had to order their Raspberry Pi 4s just like everyone else when the announcement happened. This meant no one really got started until into July.

In spite of claiming up and down that they will never produce a 64-bit version of Raspbian, the Raspberry Pi Foundation has produced a test Raspbian 64-bit Linux kernel. This tests that the Raspberry Pi firmware will support 64-bits and that all the device drivers are available. I couldn’t get this kernel to work, but it is proving very helpful for other developers. It also makes people excited that maybe Raspbian will go 64-bit sooner rather than later.

How Are We Doing?

The first distribution to get all this going is Gentoo Linux. They have a very smart developer, Sakaki, who provided the first image that actually worked. This then led to Arch and Manjaro Linux releases based on Gentoo. This was a good first step, though these distributions are more for the DIY crowd due to their preference for always installing software from source code.

Next James Chambers put together a guide and images to allow you to install Ubuntu Server 64-bit on the Pi 4. Ubuntu Server is character based, but installing a desktop is no problem. The main limitation of this release is that you need a hardwired Internet connection to start. You can’t start with Wifi as the Wifi software isn’t installed with the base image. If you do have a wired Internet connection, getting it installed and installing the desktop is quite straightforward and works well. Once you have the desktop installed, then you can configure Wifi and ditch the ethernet cable.

The changes required for the Raspberry Pi 4 are being submitted to the standard Linux kernel for version 5.4. When this ships it will have available drivers for the Pi 4 hardware and official support for the Broadcom chips used in the Pi. Version 5.3 of the Linux kernel just shipped and added support for the NVidia Jetson Nano. Hopefully the wait for Linux 5.4 won’t be too long.

Summary

I’ve been running the 64-bit version of Ubuntu Linux Server, with the Xubuntu desktop for a few days now and it works really well on my Raspberry Pi 4 with 4Gig of RAM. Performance is great and everything is working. I’ve installed various software, including CubicSDR which works great. This is the first time I’ve been happy with Software Defined Radio running on a Pi.

I look forward to the official releases, and given the state of the current builds, think we are getting quite close.

Written by smist08

September 20, 2019 at 6:38 pm

Raspberry Pi 4 as a Desktop Computer

Introduction

The Raspberry Pi Foundation is promoting the Raspberry Pi 4 as a full desktop computer for only $35. I’ve had my Raspberry Pi 4 for about a month now and in this article we’ll discuss if it really is a full desktop computer replacement. This partly depends on what you use your desktop computer for. My answer is that the $35 price is misleading, because you need to add quite a few other things to make it work well.

Making the Raspberry Pi 4 into a Decent Desktop

The Raspberry Pi has always been a barebones computer. You’ve always needed to add a case, a keyboard, a mouse, a monitor, a power supply, a video cable and a microSD card. Many people already have these kicking around, so they don’t need to buy them when they get their Pi. For instance, I already had a keyboard and monitor. The Raspberry Pi 4 even supports two monitors.

Beyond the bare bones, you need two more things for a decent desktop, namely:

  1. The 4GB version of the Raspberry Pi 4
  2. A good USB SSD drive

With these, it starts to feel like you are playing with a regular desktop computer. You now have enough RAM to run multiple programs, and any good SSD will greatly enhance the performance of the system, with the microSD card used only to boot the Pi.

The Raspberry Pi 3 is a great little computer. Its main limitation is that if you run too many programs or open too many browser tabs, it bogs down and you have a painful process of closing windows (that aren’t responding well), until things pick up again. Now the Raspberry Pi 4 with 4GB of RAM really opens up the number of things you can do at once. Running multiple browser tabs, LibreOffice and a programming IDE are no problem.

The next thing you run into with the Raspberry Pi 4 is the performance of the SD card. Since I needed a video cable and a new case, I ordered a package deal that also included a microSD card containing Raspbian. Sadly, these bundled microSD cards are the cheapest, and hence slowest, available. Having Raspbian bundled on a slow card is just a waste. Switching to a SanDisk Extreme 64GB made a huge difference. The speed was much better. When buying a microSD card watch the speed ratings. Often the bigger cards (64GB or better) are twice as fast as the smaller cards (32GB or less). With a good microSD card the Raspberry Pi 4 can read and write microSD twice as fast as a Raspberry Pi 3.

I’ve never felt I could truly trust running off a microSD card. I’ve never had one fail, but people report problems all the time. Further, the performance of microSD cards is only a fraction of what you can get from good SSDs. The Raspberry Pi 4 comes with two USB 3 ports which have a theoretical performance ten times that of the microSD port. If you shop around you will find M.2 and SATA SSDs for prices less than those of microSD cards. I purchased a Kingston A1000 M.2 drive which was on sale cheap because the A2000 cards just started shipping. I had to get an M.2 USB caddy to contain it, but combined this was less than $100 and USB caddies are always useful.

Unfortunately, you can’t boot the Raspberry Pi 4 directly off a USB port yet. The Raspberry Pi Foundation says this is coming, but it is not quite here yet. What you can do is have the entire root file system on the USB drive, but the boot partition must be on a microSD card. Setting up the SSD was easier than I thought it would be. I had to partition it, format it, copy everything over to the SSD and then edit /boot/cmdline.txt to say where the root of the main file system is.
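On Raspbian the kernel’s root= setting normally lives on the single line of /boot/cmdline.txt on the boot partition. The Python sketch below shows the kind of edit involved, working on a string rather than the real file. The helper name set_root, the sample command line and the device names are assumptions for illustration. Check what your SSD partition is actually called (for example with lsblk) before changing anything, and keep a backup of the original file.

```python
def set_root(cmdline, new_root):
    """Return the kernel command line with its root= entry replaced.

    If no root= entry exists, one is appended.
    """
    tokens = cmdline.split()
    replaced = False
    for i, token in enumerate(tokens):
        if token.startswith("root="):
            tokens[i] = "root=" + new_root
            replaced = True
    if not replaced:
        tokens.append("root=" + new_root)
    return " ".join(tokens)


# Example line only; the device names are placeholders.
old_cmdline = "console=tty1 root=/dev/mmcblk0p2 rootfstype=ext4 rootwait"
new_cmdline = set_root(old_cmdline, "/dev/sda1")
print(new_cmdline)
```

The root entry in /etc/fstab on the copied file system typically needs to point at the SSD partition as well.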

With this done, I feel like I’m using a real desktop computer. I’m confident my data is being stored reliably and the performance is great.

Overheating

The Raspberry Pi 4 uses more power than previous Pis. This means there is more heat to dissipate. The case I received with my Pi 4 didn’t have any ventilation holes and would get quite hot. I solved the problem by removing the top of the case. This let enough heat out that I could run fine for most things. People report that when using a USB SSD the USB controller chip will overheat and the data throughput will be throttled. I haven’t run into this, but it is something to be aware of.

I installed TensorFlow, Google’s open source AI toolkit. Training a data model with TensorFlow does make my Pi 4 overheat. I suspect TensorFlow is keeping all four CPU cores busy and producing a maximum amount of heat. This might drive me to add a cooling fan. I like the way the Pi runs so quietly. With no fan, it makes no noise. I might try using a small fan blowing down on the Pi to see if that helps.

Summary

Is the Raspberry Pi 4 a complete desktop computer for $35? No. But if you get the 4GB model for $55 and then add a USB 3 SSD, then you do have a good workable desktop computer. The CPU power of the Pi has been compared to a typical 2012 desktop computer. But for the cost that is pretty good. I suspect the Wifi/Lan and SSD are quite a bit better than that 2012 computer.

Keep in mind the Raspberry Pi runs Linux, which isn’t for everyone. A typical low cost Windows desktop goes for around $500 these days. You can get a refurbished one for $200-$300. A refurbished desktop can be a good inexpensive option.

I like the Raspberry Pi, partly because you are cleanly out of the WinTel world. No Windows, no Intel. The processor is ARM and the operating system is Raspbian based on Debian Linux. A lot of things you do are DIY, but I enjoy that. With over 25 million Raspberry Pis sold worldwide, there is a lot of community support and you join quite an enthusiastic thriving group.

Written by smist08

August 26, 2019 at 8:17 pm

Raspberry Pi 4 First Impressions

Introduction

I received my Raspberry Pi 4B with 4GB of RAM a few weeks ago. I’ve been using it to write my forthcoming book on Raspberry Pi Assembly Language Programming, so I thought I’d give a few of my first impressions. The biggest change for the Raspberry Pi 4 is the support for three memory sizes: 1GB, 2GB and 4GB. This overcomes the biggest complaint against the Raspberry Pi 3, that it bogs down too quickly as you run browser tabs and multiple windows.

Some of the other hardware improvements are:

  • Dual 4K monitor support with dual micro-HDMI ports.
  • Two of the four USB ports are USB-3.
  • The ethernet is now gigabit and the WiFi is faster.
  • A 1.5GHz quad-core 64-bit ARM Cortex-A72 CPU.
  • The SDRAM is now LPDDR4.
  • The GPU is upgraded to Broadcom’s VideoCore VI.
  • Hardware HEVC video support for 4Kp60 video.

On paper, this makes the Raspberry Pi 4 appear far superior to its predecessors. In this article, I’ll discuss what is much better and a few of the drawbacks. This release will squash a lot of the compatible Pi competitors, but I’ll compare it to my NVidia Jetson Nano and mention a few places where these products still surpass the Pi.

Raspbian Buster

At the same time the Raspberry Pi Foundation released the Raspberry Pi 4, they also released the new “Buster” version of Raspbian, the Debian Linux derived operating system tailored specifically to the Raspberry Pi. On the day this was announced, I ordered my Raspberry Pi 4, then went and downloaded the new Buster release, then installed it on my Raspberry Pi 3B.

If you have a Raspberry Pi 4, then you must run the Buster release as older versions of Raspbian don’t have support for the newer hardware. If you are running an older Pi then you can keep running the older version or upgrade as you like.

Is it 64-Bits?

The first rumour that was squashed was that Raspbian would move to 64-bits. This didn’t happen. Raspbian is a 32-bit operating system. The Raspberry Pi Foundation says it will stay this way for the foreseeable future. The first reason is that the Raspberry Pi 1 and Raspberry Pi Zero use a much older ARM processor that doesn’t support 64-bits. The Raspberry Pi Foundation still supports and sells these models and they are quite popular due to their low price. They don’t want to support two operating systems, so they stick to one 32-bit version that will run on every Raspberry Pi ever made. Perhaps other hardware vendors should look at this level of support for older models.

Even though 32-bit implies a 32-bit virtual address space for processes, which limits an individual process to 4GB of memory, the ARM SoC used in the Pi has memory access hardware for 48-bit addresses. This allows the operating system to give each process a different 4GB address space, so if Raspberry Pi models with more than 4GB of memory are released, Raspbian can utilize this memory.

Another problem with going to 64-bits is that all the previous Raspberry Pi models, and one version of the Raspberry Pi 4, only have 1GB of RAM. This isn’t sufficient to run a 64-bit operating system. You can do it, but the operating system takes all the RAM, and once you run a program or two, everything bogs down. This is due to all addresses and most integers becoming 64-bits, and hence twice as large. A definite nice feature of Raspbian is that it can run effectively in only 1GB of memory.

Based on Debian Buster

Raspbian is notorious for lagging behind the mainstream releases of Linux. The benefit of this is that Raspbian has always been very stable and reliable. It works well and avoids the problems that happen at the bleeding edge. The downside is that it can contain security vulnerabilities or bugs that are fixed in the newer versions.

With Buster, Raspbian released its version ahead of Debian releasing the main version. Linus Torvalds himself was involved in moving the Pi up to a newer version of the Linux kernel. His concern is that as other hardware platforms adopt proprietary software like UEFI firmware, with government mandated backdoors, the benefits of open source are being lost. The Raspberry Pi, including its firmware, is all open source and there is a feeling in the open source community that this is the future to fight for.

Some Software Not Ported Yet

As a result of the move to Buster, some software that Raspberry Pi users are accustomed to is missing. The most notable case is Mathematica. A port of this is underway and it is promised to be included in a future upgrade.

I had problems with CubicSDR, a Software Defined Radio (SDR) program. It could detect my SDR USB device, but didn’t run properly, just displaying a blank screen when receiving.

Heat Dissipation

The Raspberry Pi 4 uses more power than previous models. It requires a USB-C power adapter which means you can’t use a power adapter from previous models. I bought my Pi 4 from buyapi.ca and got the bundle with a case, power adapter, heat sinks and micro-HDMI cable. I needed the cables. The case is their Raspberry Pi 3 case, with the holes for the cables moved for the slightly different Pi 4 configuration. The case lacked any ventilation holes and the Pi would throttle due to overheating fairly easily. My solution was to run it with the top of the case removed. This seems to provide enough air circulation that I haven’t seen any overheating since.

Some people claim the Raspberry Pi 4 requires a fan for cooling, but that hasn’t been my experience. I think the cases need properly thought out ventilation and that is all that is needed. I think a bigger heatsink like the one included with the NVidia Jetson Nano would be warranted as well. I don’t like fans and consider the quietness of the Pi as one of its biggest features.

Cons

All this sounds great, but what are the downsides of the Raspberry Pi 4?

All New Cables

I purchased an NVidia Jetson Nano and to run it, I just unplugged the cables from my Raspberry Pi 3 and plugged them into the Jetson and away it went. No new cables required.

The Raspberry Pi 4 requires a new USB-C power supply and a lot has been made of how you can’t use Apple laptop power supplies, but I think the real issue is you can’t use an older Pi power supply, even if it can provide sufficient power.

To support dual monitors, the Pi went to micro-HDMI ports to fit both connectors. This means you need either new cables or at least micro- to regular-HDMI adapters. The NVidia Jetson supports dual monitors but annoyingly with two different cables, HDMI and a DisplayPort cable. At least on the Pi 4 the cables are the same for the two video ports.

Otherwise all my USB devices that I was using with the Raspberry Pi 3 seem to work with the Pi 4.

SDCard Bottleneck

They have improved the data transfer speed to and from the microSD card with the Pi 4, but this is still a bottleneck. I would have loved it if they had added an M.2 SSD interface to the board. You can improve on the microSD card speed by using a USB 3 external SSD. The problem is that you can’t boot from this USB 3 drive. You can copy the root filesystem over to the drive and run mostly from the USB drive and, although I haven’t tried it yet, people report this is an improvement.

Raspberry Pi promote the 4 as a desktop computer replacement and it definitely has the processing power. However, I don’t think this really holds up without something better than running off a microSD card. The Raspberry Pi Foundation say they will add boot from USB support in a future firmware update, but it isn’t there yet. Although the speed of USB 3 is better than the microSD interface, it still isn’t nearly as good as you can obtain with M.2 and a good new SSD.

No 64-Bits Yet

The Raspberry Pi Foundation caught everyone by surprise with their release. This included the people that maintain alternate operating systems for the Raspberry Pi. There is a good Ubuntu Mate 64-bit version that runs on the Raspberry Pi 3. It is slow and you can’t run many programs, but it does work and you can experiment with things like ARM 64-bit Assembly programming.

The person that maintains this had to order his Raspberry Pi 4 like everyone else, and hasn’t produced a Pi 4 version yet. It would have been nice if the Raspberry Pi Foundation had seeded some early models to the people that develop alternate operating systems for the Pi.

As of this writing, Raspbian is the only operating system that runs on the Raspberry Pi 4, but hopefully the others won’t take too long to modify what they need to.

The Raspberry Pi 4 with 4GB is the first Raspberry Pi that has the power to run a true 64-bit operating system, so it would be nice to play with.

Cost

The Raspberry Pi 4 is still dirt cheap, $35 for the 1GB model and $55 for the 4GB model. This upgrade is a bit more expensive since you need a new power adapter, new video cables and a new case as well. I think the extra $20 for the extra memory is well worth it.

Compared to the NVidia Jetson Nano

The Raspberry Pi 4 blows most of the current crop of Pi clones out of the water. One exception is the NVidia Jetson Nano. This single board computer has 4GB of memory and runs full 64-bit Ubuntu Linux and as a consequence feels more powerful than the Pi 4.

The Pi 4 has a more powerful ARM CPU, but the Jetson has 4 USB 3 ports and a 128-core CUDA GPU. The CUDA GPU is used by software like CubicSDR for DSP-like processing, along with most AI toolkits like TensorFlow.

The NVidia Jetson costs $99, so is nearly twice as expensive as a Pi 4. However if you want to experiment with AI, the 128-core CUDA GPU is an excellent entry level system. 

Summary

I got used to the Raspberry Pi 4 fairly quickly and after a couple of weeks thought it was pretty similar to the Raspberry Pi 3. I then needed to do something on my Raspberry Pi 3 and booted it up. After using the Pi 4, going back to the Pi 3 felt like I was working in molasses; everything was so slow. This is a real testament to how good the new Pi is, especially with 4GB of memory.

Yes, there are some teething problems with the new model, as there often are at the bleeding edge. But overall the Raspberry Pi 4 is a solid upgrade, and once you adopt it, you really can’t go back.

 

Written by smist08

August 2, 2019 at 7:09 pm

Playing with Julia 1.0 on the Raspberry Pi

Introduction

A couple of weeks ago I saw the press release about the release of version 1.0 of the Julia programming language and thought I’d check it out. I saw it was available for the Raspberry Pi, so I booted up my Pi and installed it. Julia has been in development since 2012. It was created by four MIT professors as an open source project for mathematical computing.

Why Julia?

Most people doing data science and numerical computing use the Python or R languages. Both of these are open source languages with huge followings. All new machine learning projects need to integrate to these to get anywhere. Both are very productive environments, so why do we need a new one? The main complaint about Python and R is that these are interpreted languages and as a result are very slow when compared to compiled languages like C. They both get around this by supporting large libraries of optimized code written in C, C++, Assembler and Fortran to give highly optimized off the shelf algorithms. These work great, but if one of these doesn’t apply and you need to write Python loops to process a large data set then it can get really frustrating. Another frustration with Python is that it doesn’t have a built in array data type and relies on the numpy and pandas libraries. Between these you can do a lot, but there are holes and strange differences between the two systems.

Julia has a powerful builtin array type and most of the array manipulation features of numpy and pandas are built in to the core language. Further Julia was created from scratch around powerful new just in time (JIT) compiler technology to provide both the speed of development of an interpreted language combined with the speed of a compiled language. You don’t get the full speed of C, but it’s close and a lot better than Python.

The Julia language borrows a lot of features from Python and I find programming in it quite similar. There are tuples, sets, dictionaries and comprehensions. Functions can return multiple values. For loops work very similarly to Python with ranges (using the : built into the language rather than the range() function).

Julia can call C functions directly (meaning you can get pointers to objects), and this has allowed many wrapper packages to be created for other systems such as TensorFlow. This is why Julia is very precise about the physical representation of data types and lets you get a pointer to any data.

Julia uses the end keyword to terminate blocks of code, rather than Python’s forced indentation or C’s braces. You can use semicolons to have multiple statements on one line, but don’t need them at the end of a line unless you want to suppress the displayed result in the interactive shell.

Julia has native built in support of most numeric data types including complex numbers and rational numbers. It has types for all the common hardware supported ints and floats. Then it also has arbitrary precision types built around GNU’s bignum library.

There are currently 1906 registered Julia packages and you can see the emphasis on scientific computing, along with machine learning and data science.

The creators of Julia always keep performance at the top of mind. As a result the parallelization support is exceptional, along with the ability to run Julia code on CUDA NVidia graphics cards and easily set up clusters.

Is Julia Ready for Prime Time?

As of the time of this writing, the core Julia 1.0 language has been released and looks quite good. Many companies have produced impressive working systems with the 0.x versions of Julia. However right now there are a few problems.

  • Although Julia 1.0 has been released, most of the add on packages haven’t been upgraded to this version yet. In the first release you have to explicitly load the Pkg package before you can add other packages, which seems intended to discourage people from using them yet. For instance the library with GPIO support for the Pi is still at version 0.6 and if you add it to 1.0 you get a syntax error in the include file.
  • They have released the binaries for all the versions of Julia, but these haven’t made it into the various package management systems yet. So for instance if you do “sudo apt install julia” on a Raspberry Pi, you still get version 0.6.

Hopefully these problems will be sorted out fairly quickly and are just a result of being too close to the bleeding edge.

I was able to get Julia 1.0 going on my Raspberry Pi by downloading the ARM32 files from Julia’s website and then manually copying them over the 0.6 release. Certainly 1.0 works much better than 0.6 (which segmentation faults pretty much every time you have a syntax error). Hopefully they update Raspbian’s apt repository shortly.

Julia for Machine Learning

There is a TensorFlow.jl wrapper to use Google’s TensorFlow. However the Julia group put out a white paper dissing the TensorFlow approach. Essentially TensorFlow is a separate programming language that you use from another programming language like Python. This results in a lot of duplication and forces the programmer to operate in two different paradigms at once. To solve this problem, Julia has the Flux machine learning system built natively in Julia. This is a fairly powerful machine learning system that is really easy to use, reducing the learning curve to getting working models. Hopefully I’ll write a bit more about Flux in a future article.

Summary

Julia 1.0 looks really promising. I think in a month or so all the add-on packages should be updated to the 1.0 level and all the binaries should make it out to the various package distribution repositories. In the meantime, it’s a good time to learn Julia and you can accomplish a lot with the core language.

I was planning to publish a version of my LED flashing light program in Julia, but with the PiGPIO package not updated to 1.0 yet, this will have to wait for a future article.

 

Written by smist08

August 31, 2018 at 7:34 pm

Erlang on the Raspberry Pi

Introduction

Erlang is an interesting programming language with a number of rather unique properties. If you are used to procedural languages like C or its many variants, then Erlang will appear very different. Erlang was originally developed by Ericsson to program telephone switches. The name is either based on Ericsson Language or is named after Agner Krarup Erlang depending on who you ask. In Erlang there are no loop constructs; all looping is accomplished via recursion. Similarly once a variable is assigned, it can never be changed (it is immutable). A lot of the function execution is controlled via pattern matching. Beyond the basic language the Erlang system is used to create highly available, scalable fault tolerant systems. You can even swap code on the fly without stopping the system. Erlang is still used in many telephony type applications, but it is also used to create highly reliable and scalable web sites, the best known being WhatsApp. Facebook, Yahoo and many others also have services implemented in Erlang serving massive numbers of users.

In this article we’ll begin to look at the basic language and consider an implementation of our flashing LED program for the Raspberry Pi implemented in pure Erlang. Erlang runs on most operating systems and is open source now. It is easy to add to the Raspberry Pi and is a very good system if you want to make use of a cluster of Raspberry Pis.

How to Run the Program

Erlang doesn’t come pre-installed with Raspbian, but it’s easy to add with the command:

sudo apt-get install erlang

After this completes you are good to go.

Erlang includes a terminal based command system where you can compile modules and call functions. To run the programs included here, you need to save the first file as lights.erl and the second as gpio.erl. Then you can compile and execute the program as indicated in the screenshot of my terminal window below:

Some things to note:

  1. Erlang is case sensitive and case has meaning. For instance variables start with a capital letter and functions with a lowercase letter.
  2. Punctuation is very important. The periods end a statement and in the erl shell will cause it to execute. If you miss the period, nothing will happen until you enter one (it assumes you have more text to enter). Similarly inside functions , and ; have a specific meaning that affects how things run: a comma separates expressions that run one after another, and a semicolon separates alternative clauses.
  3. You do everything by calling functions both in the shell and in the Erlang language. So the c() function compiles a module (produces a .beam file). The q() function exits the shell after terminating all programs. lights:flashlights() is our exported entry point function to run the program with. You can also call things like ls() to get a directory listing or cd() to change directories or pwd() to find out where you are. Remember to end any line you enter with a period.
  4. To access the gpio /sys files, erl must be run with sudo. You could fix the file system permissions, but this seems easy enough.
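
Since the terminal screenshot isn’t reproduced in this text, here is a rough sketch of the sequence the notes above describe. It assumes lights.erl and gpio.erl are saved in the current directory and that the shell was started with sudo erl. The comments are my own annotations and the shell’s output is omitted.

```erlang
%% Typed at the erl prompt; each line ends with a period.
c(gpio).                 % compile gpio.erl, producing gpio.beam
c(lights).               % compile lights.erl, producing lights.beam
lights:flashlights().    % call the entry point exported by the lights module
q().                     % exit the shell when finished
```

Compiling gpio first isn’t strictly required, since modules are loaded on demand, but both .beam files need to exist before flashlights() is called.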

Flashing LED Program

Below is my main module to flash the lights. Unlike the C or Fortran version of this program, there is no loop, since loops don’t exist in Erlang. Instead it uses recursion to accomplish the same thing. (Actually in the Erlang tutorials there are design patterns on how to accomplish while or for loops with recursion). Notice that once a variable is assigned it can never be changed. But you accomplish the same thing with recursion by passing a modified version of the variable into a function. Similarly you can preserve variables using function closures, but I don’t do that here. I included edoc comments, which are Erlang’s version of Javadoc. Otherwise this is just intended to give a flavour for the language without going into too much detail.

 

%% @author Stephen Smith
%% @copyright 2018 Stephen Smith
%% @version 1.0.0
%% @doc
%% An Erlang implementation of my flashing lights program
%% for the Raspberry Pi.
%% @end

-module(lights).
-export([flashlights/0]).
-author('Stephen Smith').

flashlights() ->
    Leds = init(),
    flash(Leds, 10).

init() ->
    L0 = gpio:init(17, out),
    L1 = gpio:init(27, out),
    L2 = gpio:init(22, out),
    {L0, L1, L2}.

flash(Leds, Times) when Times > 0 ->
    gpio:write(element(1, Leds), 1),
    timer:sleep(200),
    gpio:write(element(1, Leds), 0),
    gpio:write(element(2, Leds), 1),
    timer:sleep(200),
    gpio:write(element(2, Leds), 0),
    gpio:write(element(3, Leds), 1),
    timer:sleep(200),
    gpio:write(element(3, Leds), 0),

    flash(Leds, Times-1);

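%% Base case: no flashes left, so stop recursing.
%% The underscore prefix avoids an unused-variable compiler warning.
flash(_Leds, Times) when Times =< 0 ->
    true.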
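
The recursion-instead-of-loops pattern used by flash/2 can be tried without any GPIO hardware. The following is a minimal sketch of the same idea; the module name loops and its functions are made up for illustration and are not part of the flashing lights program. Each “iteration” is a recursive call that receives updated values as arguments, and pattern matching on 0 selects the clause that ends the recursion.

```erlang
-module(loops).
-export([sum_to/1, countdown/1]).

%% Sum the integers 1..N. The running total is passed along as an
%% accumulator argument rather than stored in a mutable variable.
sum_to(N) when is_integer(N), N >= 0 ->
    sum_to(N, 0).

%% Clause chosen by pattern matching: the counter reached 0, so stop.
sum_to(0, Acc) -> Acc;
%% Otherwise "loop" again with a smaller counter and a larger total.
sum_to(N, Acc) -> sum_to(N - 1, Acc + N).

%% Build the list [N, N-1, ..., 1], again with no loop construct.
countdown(0) -> [];
countdown(N) when N > 0 -> [N | countdown(N - 1)].
```

After c(loops). in the erl shell, loops:sum_to(10). evaluates to 55 and loops:countdown(3). to [3,2,1].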

Erlang GPIO Library

Rather than write the file access library for the GPIO drivers myself, I did a quick Google search, which revealed several existing ones, including this one from Paolo Oliveira.

 

%% @author Paolo Oliveira <paolo@fisica.ufc.br>
%% @copyright 2015-2016 Paolo Oliveira (license MIT)
%% @version 1.0.0
%% @doc
%% A simple, pure erlang implementation of a module for 
%% <b>Raspberry Pi's General Purpose
%% Input/Output</b> (GPIO), using the standard Linux kernel
%% interface for user-space, sysfs,
%% available at <b>/sys/class/gpio/</b>.
%% @end

-module(gpio).
-export([init/1, init/2, handler/2, read/1, write/2, stop/1]).
-author('Paolo Oliveira <paolo@fisica.ufc.br>').

%% API

% @doc: Initialize a Pin as input or output.
init(Pin, Direction) ->
  Ref = configure(Pin, Direction),
  Pid = spawn(?MODULE, handler, [Ref, Pin]),
  Pid.

% @doc: A shortcut to initialize a Pin as output.
init(Pin) ->
  init(Pin, out).

% @doc: Stop using and release the Pin referenced as file descriptor Ref.
stop(Ref) ->
  Ref ! stop,
  ok.

% @doc: Read from an initialized Pin referenced as the file descriptor Ref.
read(Ref) ->
  Ref ! {recv, self()},
  receive
    Msg ->
      Msg
  end.

% @doc: Write value Val to an initialized Pin referenced as the file descriptor Ref.
write(Ref, Val) ->
  Ref ! {send, Val},
  ok.

%% Internals

configure(Pin, Direction) ->
  DirectionFile = "/sys/class/gpio/gpio" ++ integer_to_list(Pin) ++ "/direction",

  % Export the GPIO pin
  {ok, RefExport} = file:open("/sys/class/gpio/export", [write]),
  file:write(RefExport, integer_to_list(Pin)),
  file:close(RefExport),

  % It can take a moment for the GPIO pin file to be created.
  case filelib:is_file(DirectionFile) of
      true -> ok;
      false -> receive after 1000 -> ok end
  end,

  {ok, RefDirection} = file:open(DirectionFile, [write]),
  case Direction of
    in -> file:write(RefDirection, "in");
    out -> file:write(RefDirection, "out")
  end,
  file:close(RefDirection),
  {ok, RefVal} = file:open("/sys/class/gpio/gpio" ++ integer_to_list(Pin) ++ "/value", [read, write]),
  RefVal.

release(Pin) ->
  {ok, RefUnexport} = file:open("/sys/class/gpio/unexport", [write]),
  file:write(RefUnexport, integer_to_list(Pin)),
  file:close(RefUnexport).

% @doc: Message passing interface, should not be used directly, it is present for debugging purpose.
handler(Ref, Pin) ->
  receive
    {send, Val} ->
      file:position(Ref, 0),
      file:write(Ref, integer_to_list(Val)),
      handler(Ref, Pin);
    {recv, From} ->
      file:position(Ref, 0),
      {ok, Data} = file:read(Ref, 1),
      From ! Data,
      handler(Ref, Pin);
    stop ->
      file:close(Ref),
      release(Pin),
      ok
   end.

%% End of Module.
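
The gpio module above keeps each pin’s open file inside a spawned process and talks to that process by sending messages. The same pattern can be sketched without any hardware. The counter module below is a made-up illustration, not part of the gpio library: a process holds its state as the argument of a recursive receive loop, much like handler/2 does with the file reference.

```erlang
-module(counter).
-export([start/0, add/2, value/1, stop/1, loop/1]).

%% Spawn a process whose state (the count) starts at 0.
%% loop/1 must be exported so spawn/3 can call it.
start() ->
    spawn(?MODULE, loop, [0]).

%% Fire-and-forget message, like gpio:write/2.
add(Pid, N) ->
    Pid ! {add, N},
    ok.

%% Request/reply, like gpio:read/1: send our own pid, then wait.
value(Pid) ->
    Pid ! {get, self()},
    receive
        {count, Value} -> Value
    end.

stop(Pid) ->
    Pid ! stop,
    ok.

%% The process body: handle one message, then recurse with new state.
loop(Count) ->
    receive
        {add, N} ->
            loop(Count + N);
        {get, From} ->
            From ! {count, Count},
            loop(Count);
        stop ->
            ok
    end.
```

Tagging the reply as {count, Value} is a small design choice that differs from the gpio module, which replies with the bare data; the tag keeps value/1 from picking up an unrelated message that happens to be in the caller’s mailbox.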

Summary

Erlang is a very interesting language. If you are interested in functional programming or how to create highly scalable reliable web servers, then Erlang is definitely worth checking out. We only looked at a quick introduction to the language, really by example. There is lots of good documentation, along with tutorials and samples, just a Google search away. Perhaps in a future article we’ll look at processes and messages and how to build such a highly scalable server.

 

Written by smist08

February 18, 2018 at 11:59 pm

Posted in raspberry pi
