Stephen Smith's Blog

Musings on Machine Learning…

Archive for August 2018

Playing with Julia 1.0 on the Raspberry Pi


Introduction

A couple of weeks ago I saw the press release about the release of version 1.0 of the Julia programming language and thought I'd check it out. I saw it was available for the Raspberry Pi, so I booted up my Pi and installed it. Julia was first released in 2012; it was created at MIT by Jeff Bezanson, Stefan Karpinski, Viral Shah, and Alan Edelman as an open source project for mathematical computing.

Why Julia?

Most people doing data science and numerical computing use the Python or R languages. Both of these are open source languages with huge followings, and any new machine learning project needs to integrate with them to get anywhere. Both are very productive environments, so why do we need a new one? The main complaint about Python and R is that they are interpreted languages and as a result very slow compared to compiled languages like C. They both get around this by supporting large libraries of optimized code written in C, C++, Assembler, and Fortran that provide highly optimized, off-the-shelf algorithms. These work great, but if none of them applies and you need to write Python loops to process a large data set, it can get really frustrating. Another frustration with Python is that it doesn't have a built-in array data type and relies on the numpy and pandas libraries. Between these two you can do a lot, but there are holes and strange differences between the two systems.

Julia has a powerful built-in array type, and most of the array manipulation features of numpy and pandas are part of the core language. Further, Julia was designed from scratch around powerful just-in-time (JIT) compiler technology to combine the development speed of an interpreted language with the runtime speed of a compiled language. You don't get the full speed of C, but it's close and a lot better than Python.
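
For example, here is a quick sketch of the kind of array manipulation that comes built in (the values are just illustrative):

A = [1 2 3; 4 5 6]         # a 2x3 matrix literal
row = A[2, :]              # slicing: the second row
B = A .* 2                 # broadcasting: elementwise multiplication
println(sum(A, dims=2))    # sum along each row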

The Julia language borrows a lot of features from Python, and I find programming in it quite similar. There are tuples, sets, dictionaries, and comprehensions. Functions can return multiple values. For loops work very similarly to Python, with ranges built into the language via the : operator rather than the range() function.
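
Here is a small sketch showing a few of these features together (the function and data are made up for illustration):

function minmax_pair(xs)
    return minimum(xs), maximum(xs)   # functions can return multiple values
end

lo, hi = minmax_pair([3, 1, 4, 1, 5])
squares = [x^2 for x in 1:10]         # a comprehension over the range 1:10
ages = Dict("Alice" => 30, "Bob" => 25)
for (name, age) in ages               # blocks are closed with the end keyword
    println(name, " is ", age)
end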

Julia can call C functions directly (and you can get pointers to objects), which has allowed wrappers to be created for other systems such as TensorFlow. This is also why Julia is very precise about the physical representation of its data types and about the ability to get a pointer to any piece of data.
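
For instance, calling the C runtime's strlen needs no wrapper library at all:

len = ccall(:strlen, Csize_t, (Cstring,), "hello")   # calls C's strlen directly
println(len)                                         # prints 5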

Julia uses the end keyword to terminate blocks of code, rather than Python's forced indentation or C's braces. You can use semicolons to put multiple statements on one line, but you don't need one at the end of a line unless you want to suppress that line's output (in the REPL).

Julia has native built-in support for most numeric data types, including complex and rational numbers. It has types for all the common hardware-supported ints and floats, and it also has arbitrary precision types built around GNU's GMP bignum library.
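
A quick sketch of these numeric types in action:

z = (3 + 4im) * (1 - 2im)    # complex arithmetic
r = 2//3 + 1//6              # rational arithmetic: exactly 5//6
n = factorial(big(50))       # arbitrary precision BigInt via GMP
println(z, " ", r, " ", n)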

There are currently 1906 registered Julia packages, and browsing them shows a clear emphasis on scientific computing, along with machine learning and data science.

The creators of Julia always keep performance top of mind. As a result, the parallelization support is exceptional, along with the ability to run Julia code on NVIDIA CUDA graphics cards and to easily set up clusters.
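
As a small taste of the parallel side, here is a minimal sketch using the standard Distributed library that ships with Julia 1.0 (slow_square is just a stand-in for real work):

using Distributed
addprocs(4)                                      # start four worker processes

@everywhere slow_square(x) = (sleep(0.1); x^2)   # define on all the workers
results = pmap(slow_square, 1:100)               # farm the calls out in parallel
println(results[1:5])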

Is Julia Ready for Prime Time?

As of the time of this writing, the core Julia 1.0 language has been released and looks quite good. Many companies have produced impressive working systems with the 0.x versions of Julia. However, right now there are a few problems:

  • Although Julia 1.0 has been released, most of the add-on packages haven't been upgraded to this version yet. In 1.0 you first need to load the Pkg package to add other packages, perhaps to discourage people from using them yet (see the sketch after this list). For instance, the library with GPIO support for the Pi is still at version 0.6, and if you add it to 1.0 you get a syntax error in its include file.
  • They have released the binaries for all the versions of Julia, but these haven't made it into the various package management systems yet. So, for instance, if you do “sudo apt install julia” on a Raspberry Pi, you still get version 0.6.
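
For reference, adding a package in 1.0 looks like this; PiGPIO is the Pi GPIO library mentioned above, so per the first point it may fail to load until it's upgraded:

using Pkg            # in 1.0, package management lives in the Pkg library
Pkg.add("PiGPIO")    # install the Pi GPIO support package
Pkg.status()         # list installed packages and their versions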

Hopefully these problems will be sorted out fairly quickly and are just a result of being too close to the bleeding edge.

I was able to get Julia 1.0 going on my Raspberry Pi by downloading the ARM32 files from Julia's website and then manually copying them over the 0.6 release. Certainly 1.0 works much better than 0.6 (which hits a segmentation fault pretty much every time you make a syntax error). Hopefully they update Raspbian's apt repository shortly.

Julia for Machine Learning

There is a TensorFlow.jl wrapper for using Google's TensorFlow. However, the Julia group put out a white paper criticizing the TensorFlow approach. Essentially, TensorFlow is a separate programming language that you drive from another programming language like Python, which results in a lot of duplication and forces the programmer to operate in two different paradigms at once. To solve this problem, Julia has the Flux machine learning system, written natively in Julia. This is a fairly powerful machine learning system that is really easy to use, shortening the learning curve to getting working models. Hopefully I'll write a bit more about Flux in a future article.
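
To give a flavour, here is a minimal sketch of a small Flux network for MNIST-style digit classification, assuming Flux is installed; the exact layer and training API may vary between Flux versions:

using Flux

model = Chain(
    Dense(784, 512, relu),    # hidden layer with relu activation
    Dropout(0.2),             # dropout for regularization
    Dense(512, 10),           # output layer for the 10 digit classes
    softmax)

loss(x, y) = Flux.crossentropy(model(x), y)
# Flux.train!(loss, Flux.params(model), data, ADAM()) would then fit the model.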

Summary

Julia 1.0 looks really promising. I think in a month or so all the add-on packages should be updated to the 1.0 level and all the binaries should make it out to the various package distribution repositories. In the meantime, it’s a good time to learn Julia and you can accomplish a lot with the core language.

I was planning to publish a version of my LED flashing light program in Julia, but with the PiGPIO package not updated to 1.0 yet, this will have to wait for a future article.


Written by smist08

August 31, 2018 at 7:34 pm

TensorFlow on the Raspberry Pi and Beyond


Introduction

You've been able to use TensorFlow on a Raspberry Pi for a while, but you had to build it yourself. With TensorFlow 1.9, Google added native support, so you can just use pip3 to install precompiled binaries and be up and running in no time. Although you can do this, general TensorFlow usage on the Raspberry Pi is slow. In this article I'll talk about some of the challenges of running TensorFlow on the Raspberry Pi and look at some useful cases that do work. I'll also compare some operations against my Intel i3 based laptop and the rather beefy servers available through Google's subsidiary Kaggle.

Installing TensorFlow on a Pi

I saw the press release about how easy it was to install TensorFlow on a Raspberry Pi, so I read the TensorFlow install page for the Pi, checked the prerequisites, and followed the instructions. All I got were strange, unhelpful error messages about there being no package for my version of Python. The claim on the TensorFlow web page is that Python 3.4 or greater is required, and I was running 3.4.2, so all should have been good. I installed all the prerequisites and dependencies from the TensorFlow script and those all worked, including TensorBoard. But no luck with TensorFlow itself.

After a bit of research, it appeared that the newest version of Raspbian is Stretch, but I was running Jessie. I had assumed that since my operating system was updating itself, it would have installed any newer version of Raspbian. That turns out not to be true. The Raspberry Pi people were worried about breaking things, so they didn't provide an automatic upgrade path; their recommendation is to just install a new image on a new SD card. I could have done that, but I found instructions on the web on how to upgrade from Jessie to Stretch. I followed the instructions available here, and it all worked fine.

To me, this is really annoying, since I wasted quite a bit of time on it. I don't understand why Raspbian didn't at least ask if I wanted to upgrade to Stretch, explaining the risks and trade-offs. At any rate, now I know not to trust “sudo apt-get dist-upgrade”; it doesn't necessarily do what it claims.

After I upgraded to Stretch, doing a “sudo pip3 install TensorFlow” worked quickly and I was up and running.

Giving TensorFlow a Run

To try out TensorFlow on my Raspberry Pi, I just copied the first TensorFlow tutorial into IDLE (the Python IDE) and gave it a run.

import tensorflow as tf

# Load the MNIST handwritten digit dataset (downloaded on first run).
mnist = tf.keras.datasets.mnist
(x_train, y_train),(x_test, y_test) = mnist.load_data()

# Scale the pixel values from 0-255 down to 0.0-1.0.
x_train, x_test = x_train / 255.0, x_test / 255.0

# A simple network: flatten each 28x28 image, a 512-unit hidden layer,
# dropout for regularization, then a 10-way softmax over the digits.
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(512, activation=tf.nn.relu),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train for 5 passes over the training data, then measure test accuracy.
model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)

This tutorial example trains on the MNIST dataset, which is a set of handwritten digits, and then evaluates the test set to see how accurate the model is. This little sample typically achieves 98% accuracy in identifying the digits. The dataset has 60,000 images for training and 10,000 for testing.

I set this running on the Raspberry Pi, and it was still running hours later when I went to bed. My laptop ran this same tutorial in just a few minutes. The first time you run the program it downloads the test data, which was very slow on the Pi; after that the data is cached locally.

Benchmarking

To compare performance, I'll look at a few different factors. The tutorial program really has three parts (each timed as sketched after this list):

  1. Downloading the training and test data into memory (from the local cache)
  2. Training the model
  3. Evaluating the test data
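
Each of these can be timed by wrapping it with Python's standard time module; a minimal sketch, reusing the model and data preparation from the tutorial above:

import time

start = time.time()
(x_train, y_train), (x_test, y_test) = mnist.load_data()
load_time = time.time() - start        # part 1: loading the data

start = time.time()
model.fit(x_train, y_train, epochs=5)  # assumes the normalization step was rerun
fit_time = time.time() - start         # part 2: training the model

start = time.time()
model.evaluate(x_test, y_test)
eval_time = time.time() - start        # part 3: evaluating the test data

print(load_time, fit_time, eval_time)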

Then I’ll compare the Raspberry Pi to my laptop and the Kaggle virtual environment, both with and without GPU acceleration.


                  Load Time (s)   Fit Time (s)   Eval Time (s)
Raspberry Pi           3.6            630             4.7
i3 Laptop              0.6             95             0.5
Kaggle w/o GPU         1.7             68             0.6
Kaggle with GPU        1.1             44             0.6


Keep in mind that my Raspberry Pi is only a 3 and not the newer, slightly faster 3 B+. The GPU in the Kaggle environment is the NVIDIA Tesla K80, and the server is fairly beefy with 16GB of RAM. The Kaggle environment is virtual and shared, so performance varies depending on how much other users are doing.

Results

As you can see, the Raspberry Pi is very slow at fitting a model. The MNIST data is fairly compact as these things go and represents a relatively small data set. If you want to fit a model and only have a Raspberry Pi, I would recommend doing it in a Kaggle environment from a web browser. After all, it is free.

I think the big problem is that the Raspberry Pi only has 1GB of RAM and will be swapping to the SD card, which isn't great for performance. My laptop has 4GB of RAM and a good SSD. I suspect these matter more than the difference between the Intel i3 and the ARM Cortex processor.

So why would you want TensorFlow on the Raspberry Pi then? The use case is running pre-trained models for specific applications. For instance, perhaps you want to make a smart door camera: the camera could be hooked up to a Raspberry Pi, and a TensorFlow image recognition model could be run to determine whether someone approaching the door should be admitted, and if so, send a signal from a GPIO pin to unlock the door.
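
Here is a rough sketch of that idea, assuming a model trained and saved elsewhere; the model file name, the class index for "admit", and the GPIO pin number are all hypothetical:

import numpy as np
import tensorflow as tf
import RPi.GPIO as GPIO

UNLOCK_PIN = 18                       # hypothetical GPIO pin wired to the lock
GPIO.setmode(GPIO.BCM)
GPIO.setup(UNLOCK_PIN, GPIO.OUT)

# Load a model that was trained elsewhere and saved with model.save().
model = tf.keras.models.load_model('door_camera_model.h5')

def check_visitor(image):
    # image: a preprocessed array matching the model's input shape
    prediction = model.predict(np.expand_dims(image, axis=0))[0]
    if np.argmax(prediction) == 0:    # assume class 0 means "admit"
        GPIO.output(UNLOCK_PIN, GPIO.HIGH)   # send the unlock signal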

From the table above you might think that evaluation is still too slow on a Raspberry Pi. However, x_test, which we are evaluating, actually contains 10,000 test images, so performing 10,000 image evaluations in under 5 seconds (roughly 2,000 images per second) is actually pretty decent.

A good procedure would be to train the models on a more powerful computer or in the cloud, then run the model on the Pi to create some sort of smart device utilizing the Pi’s great I/O capabilities.

Summary

The Raspberry Pi, with its great DIY interfacing capabilities combined with the ability to run advanced machine learning applications, provides a great platform for developing smart devices. I look forward to seeing all sorts of new smart projects appearing on the various Raspberry Pi project boards.

Written by smist08

August 17, 2018 at 12:09 am
