Stephen Smith's Blog

Musings on Machine Learning…

The Technology of “Influence” – Part 2 VPN


Introduction

In my novel “Influence”, the lead character J@ck Tr@de performs various hacking tasks. In the book he spends a lot of time securing his connections, hiding his identity and hiding his location. In a series of blog posts, I’m going to talk about the various technologies mentioned in the book like VPN, the Onion Browser, Kali Linux and using VHF radios. I talked about HTTPS in my last post and in this article, we’re going to discuss Virtual Private Networks (VPNs).

For anyone interested, my book is available either as a paperback or as a Kindle download on Amazon.com:

Paperback – https://www.amazon.com/dp/1730927661
Kindle – https://www.amazon.com/dp/B07L477CF6

What is a VPN?

We talked about HTTPS last time as a way to secure the communications protocol that a browser uses to talk to a web server. Now consider a corporate network. People at work have their computers hooked directly into the corporate network. They use this to access email, various internal corporate websites, shared network drives and other centrally deployed applications. All of these services have their own network protocols, all different from HTTP. Some of these protocols have secure variants, some don’t. Some have heavy security, some light. Now suppose you want to access these from home, or from a hotel while on a business trip? You certainly can’t just do this over the Internet, because it’s a public network and anyone can see what you are doing. You need a way to secure all these protocols at once. This is the job of a VPN. When you activate a VPN on your laptop, it creates a secure tunnel from your laptop through the Internet to a server in your secure corporate data center. The security mechanisms a VPN uses are largely the same as HTTPS and quite strong. Using a VPN then allows you to work securely from home or from remote locations while travelling.
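The tunnel can be pictured as encrypt-then-wrap: the original packet, real destination and all, becomes the encrypted payload of an outer packet addressed only to the VPN server. Here is a minimal Python sketch of that idea; the toy HMAC-based keystream cipher and all the addresses are inventions for illustration, not what any real VPN protocol uses (real VPNs use vetted ciphers such as AES-GCM or ChaCha20):

```python
import hashlib
import hmac
import json

def keystream(key: bytes, n: int) -> bytes:
    # Toy keystream: HMAC-SHA256 in counter mode, only to keep the
    # example self-contained. Not a real VPN cipher.
    out, counter = b"", 0
    while len(out) < n:
        out += hmac.new(key, counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:n]

def xor_bytes(data: bytes, ks: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, ks))

def encapsulate(key: bytes, laptop_ip: str, vpn_ip: str, inner: dict) -> dict:
    # The inner packet becomes opaque payload; the outer header is all
    # an eavesdropper on the public network ever sees.
    plaintext = json.dumps(inner).encode()
    return {"src": laptop_ip, "dst": vpn_ip,
            "payload": xor_bytes(plaintext, keystream(key, len(plaintext)))}

def decapsulate(key: bytes, outer: dict) -> dict:
    plaintext = xor_bytes(outer["payload"], keystream(key, len(outer["payload"])))
    return json.loads(plaintext)

key = b"shared-session-key"
inner = {"dst": "mail.example.corp", "data": "fetch inbox"}
outer = encapsulate(key, "203.0.113.7", "198.51.100.1", inner)

print(outer["src"], "->", outer["dst"])   # only the tunnel endpoints are visible
print(decapsulate(key, outer) == inner)   # True: the VPN server recovers the real request
```

Everything protocol-specific (the real destination, the data) rides inside the encrypted payload, which is why all those corporate protocols get protected at once.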

Why Would J@ck Use VPN?

J@ck Tr@de doesn’t work for a corporation. Why does he use a VPN? Whose VPN does he use? In the example above, if I’m connected to my corporate VPN, all my network traffic is tunnelled through the VPN to the corporate server. So if I browse the Internet while connected to the VPN, my HTTPS requests are sent to the corporate server, which then forwards them to the Internet. This extra step slows things down, but it has an interesting side-effect. If I’m not signed into Google and I Google something, Google will see my Internet address as the corporate server rather than my laptop. That means Google won’t know exactly who I am. It also means my location shows up as the location of the corporate server. This hides both my location and my identity, things J@ck is very interested in doing.

But J@ck doesn’t work for a corporation, so whose VPN does he use? This “feature” of hiding identity and location is sufficiently valuable that people like J@ck will pay for it. This has resulted in companies setting up VPNs just for this purpose. Their VPN server doesn’t connect to any corporate network programs, only the Internet. Using one of these VPN services will help hide your identity and location, or at least websites can’t determine these from the address fields in your web network packets.

VPNs are popular with non-hackers as well, as a way to get at geographically locked content. For instance, if you live in Canada, the content you can get from Netflix is different from the content you get in the USA. But if you are in Canada and connect to a US-based VPN server, then Netflix will see you as being located in the USA and will give you the US content while you are connected.

Downsides of VPN

Sounds good, so what’s the catch? One is that these are usually paid services, so you need to pay a monthly fee. Further, you need to authenticate to the VPN service, so the provider knows who you are. It also sees your real IP address, so it can trace who and where you are.

So do you trust your VPN? Here you have to be careful. If the VPN provider is located in the USA, then it’s subject to the Patriot Act and law enforcement can get ahold of its records. If you want US Netflix content, then you have to use a US-based VPN, but at the same time US law enforcement really doesn’t care much about the vagaries of what Netflix allows where. If you are a hacker, then you really do care, and probably want to use a VPN in a country with some legal protections. For instance in Europe, getting a warrant for this is very difficult. Or perhaps use a VPN in the Caribbean, where providers tend to ignore requests from external law enforcement agencies. A bit of Googling can help here. Some hackers use two or three VPNs at once, located in wildly different jurisdictions, to make it even harder to be traced.

Internet bandwidth is expensive, so feeding streaming movies through a VPN can require the provider’s deluxe, more expensive plan. Doing little bits of hacking doesn’t require that much bandwidth, so it can be a little cheaper.

There are free VPNs, but most of these are considered rather suspect, since they must be supporting themselves somehow, perhaps by selling your data. VPNs are illegal in some countries like Iraq or North Korea, and in others, like China and Russia, they are required to be run by the government. So be wary of these.

Summary

VPNs are a way to secure your general Internet communications. They have the desirable side-effect of hiding your Internet address and location. VPNs are absolutely necessary for corporate security, and useful enough that lots of other people use them as well.

Notice that J@ck doesn’t just rely on a VPN by itself; rather, it’s one layer in a series of protections to ensure his anonymity and privacy.


Written by smist08

December 13, 2018 at 12:34 am

The Technology of “Influence” – Part 1 HTTPS


Introduction

In my novel “Influence”, the lead character J@ck Tr@de performs various hacking tasks. In the book he spends a lot of time securing his connections, hiding his identity and hiding his location. In a series of blog posts, I’m going to talk about the various technologies mentioned in the book like VPN, the Onion Browser, Kali Linux and using VHF radios. But first I need to talk about HTTPS which is the normal Internet security mechanism we all use to secure our bank and shopping transactions. I’ll look at what this does protect and what it doesn’t protect. Once we understand the limitations of HTTPS, we can go on to look at why J@ck goes to so much trouble to add so many extra levels of security and misdirection.

For anyone interested, my book is available either as a paperback or as a Kindle download on Amazon.com:

Paperback – https://www.amazon.com/dp/1730927661
Kindle – https://www.amazon.com/dp/B07L477CF6

What is HTTPS?

The communications protocol that browsers use to communicate with web servers is called HTTP (HyperText Transfer Protocol). This is the protocol that gets data for websites and downloads it to your browser to be displayed. The added S stands for Secure, and makes this process secure by encrypting the communications. In the early days of the Web, doing all this encrypting/decrypting was expensive, both for typical personal computers of the day and for websites with quite a high volume of traffic. These days computers are more powerful and can handle this encryption easily, and due to the prevalence of hackers and scammers, the current tendency is to just encrypt all Internet traffic. In fact most modern browsers now warn you when a site uses plain old HTTP, pushing everyone towards the S for security.

HTTPS is actually quite secure. It is very difficult to decrypt with modern computing resources (even cloud based). It authenticates the server via a digital certificate, issued by a certificate authority that validates the identity of the certificate holder. The protocol protects against man-in-the-middle attacks, where someone impersonates one party and relays the information. It also protects against data being tampered with in any way.
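The tamper-protection can be illustrated with a message authentication code. The sketch below uses Python’s standard hmac module; real TLS uses AEAD ciphers (e.g. AES-GCM) that encrypt and authenticate in a single step, so this shows only the integrity half of the idea:

```python
import hashlib
import hmac

TAG_LEN = 32  # bytes in an HMAC-SHA256 tag

def protect(key: bytes, message: bytes) -> bytes:
    # Append an authentication tag computed over the message.
    # Any change to the message invalidates the tag.
    return message + hmac.new(key, message, hashlib.sha256).digest()

def verify(key: bytes, blob: bytes) -> bytes:
    message, tag = blob[:-TAG_LEN], blob[-TAG_LEN:]
    expected = hmac.new(key, message, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("message was tampered with in transit")
    return message

key = b"per-session secret"
blob = protect(key, b"PAY $100 TO ALICE")
print(verify(key, blob))                    # intact message round-trips

tampered = bytes([blob[0] ^ 1]) + blob[1:]  # flip one bit "in transit"
try:
    verify(key, tampered)
except ValueError as err:
    print(err)                              # tampering is detected
```

Even a single flipped bit makes the tag check fail, which is why an attacker sitting on the wire can’t silently alter an HTTPS message.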

Sounds pretty good, and in fact it is pretty good. So why does J@ck feel a need to use VPNs or use the Tor network via the Onion Browser?

Weaknesses of HTTPS

J@ck’s main complaint is that whoever he talks to knows who he is and what he is doing. For instance, all Google searches go through HTTPS, so no one can eavesdrop on what you are searching for. But Google knows. Google logs all your searches and builds a detailed profile of you. Further, Google is an American company, subject to the Patriot Act and other government programs requiring it to hand over your data if requested. Hence if, say, you are Googling hacking techniques, Google could turn that over to the FBI along with your IP address. The FBI can then ask your ISP who owns that IP address, identify you, and come to your door to ask some questions. Of course, if you are signed into your Google account, then they don’t even need to bother with the IP address lookup. J@ck certainly doesn’t want that to happen.

HTTPS has some other weaknesses as well. The process of granting authentication certificates isn’t perfect. One of the most common Windows Updates alters the list of trusted certificate authorities, since authorities are sometimes caught handing out fake certificates to shady operators. Along the same lines, most people don’t check the certificate of who they are talking to. This is how most phishing emails work. They send an email asking you to check your bank account, with a link that is similar to your bank’s, but not the same. The fake link goes to a page that looks like your bank’s login page, but isn’t. If you click on the certificate icon in your browser, you will see that the certificate isn’t your bank’s. But who does this? If you type your username and password into this site, the bad actors can then use them to log in to your real bank account and steal your money.
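A small Python sketch of the kind of hostname check your browser (or you) should be doing on a link before trusting it; the bank domain here is a made-up placeholder:

```python
from urllib.parse import urlparse

REGISTERED_DOMAIN = "mybank.example"   # hypothetical bank domain

def is_bank_link(url: str) -> bool:
    # Accept the bank's domain itself or a genuine subdomain of it.
    # Lookalikes like "mybank.example.attacker.tld" fail because the
    # registered domain (the right-hand end) is what actually matters.
    host = urlparse(url).hostname or ""
    return host == REGISTERED_DOMAIN or host.endswith("." + REGISTERED_DOMAIN)

print(is_bank_link("https://www.mybank.example/login"))         # True
print(is_bank_link("https://www.mybank.example.attacker.tld"))  # False: lookalike
print(is_bank_link("https://mybank-secure.example/login"))      # False: similar name
```

Phishing links exploit exactly the failure modes above: a familiar-looking name buried on the left of a hostname the attacker actually controls.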

Hackers can also learn a bit about the content of HTTPS traffic even though it’s encrypted. For example, they may be able to guess which URI you requested by comparing the lengths of the encrypted messages against the lengths of a site’s known URIs.
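Here is a Python illustration of that length side-channel: a stream cipher hides the bytes but preserves the length, so an observer who knows a site’s URIs can often match the request. The URIs and the toy XOR cipher are invented for the example:

```python
import os

def stream_encrypt(plaintext: bytes) -> bytes:
    # Toy one-time pad: XOR with random bytes. It hides the content
    # completely, yet the ciphertext length equals the plaintext length.
    key = os.urandom(len(plaintext))
    return bytes(a ^ b for a, b in zip(plaintext, key))

# Requests the eavesdropper knows exist on the site (invented examples).
candidates = [b"GET /login", b"GET /admin/secret-dashboard"]

observed = stream_encrypt(b"GET /admin/secret-dashboard")

# No key needed: matching lengths alone narrows down the request.
guess = [c for c in candidates if len(c) == len(observed)]
print(guess)  # [b'GET /admin/secret-dashboard']
```

Real TLS records add some framing overhead, but that overhead is predictable, so lengths still leak information in much the same way.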

Another worry is that often more companies can see your data than you might think. For instance, if you are talking to your bank, then you certainly expect your bank can understand your data. However, your bank might use a third party web hosting company to host the website, and then that company can also see your data. The web hosting company might in turn host the site on a cloud provider like AWS or Azure, and then that group might be able to see your data. Then, often websites protect themselves against DDoS attacks using a service like CloudFlare, and part of that setup lets CloudFlare see the unencrypted data. So suddenly you aren’t just trusting one company, but four. This provides many more vectors of attack and vulnerable points for hackers. Plus the bank might have hired outsourced programmers to set up their website, and those contractors have enough credentials to see unencrypted data. Weak points like these are behind many of the security breaches you read about at large Internet sites.

Summary

HTTPS is a pretty good way to secure Internet traffic, and if you follow some basic good practices you should be OK. For instance, never use a link in an email; always go to the website through another means (like a bookmark or a search). For data you really care about, like your bank account, only access it from a network you trust, not the WiFi at a hotel or coffee shop.

Now that we understand the strength and weaknesses of HTTPS we can look at the extra layers that J@ck uses to stay anonymous and secure.

Written by smist08

December 11, 2018 at 2:33 am

Posted in Security, Writing


“Influence” – My First Novel


Introduction

I am really excited that my first novel is now available for sale on Amazon as either a paperback or as a Kindle download. Here is the synopsis for the novel:

Influence is set in the present, about a punk rock hacker, J@ck Tr@de, who discovers a security backdoor in a large corporate server operating system that gives him access to all of the world’s servers. He uses this illicit access to mine bitcoin and influence local politics via Social Media. He becomes criminally and romantically entwined with Mia, the creator of the backdoor, and their plans escalate to increase their wealth and power. The FBI investigates and chases them in a clumsy cat and mouse game. As the story progresses, J@ck’s Social Media altering Bots become more and more influential. They make J@ck a billionaire through stock market manipulation. The Social Media Bots continue to evolve…

Where It Began

I’ve been writing this blog since January, 2009. That will be ten years of blogging next month! I really enjoy blogging, mostly on my technology interests. This blog started by being all things ACCPAC, since that’s what I worked on originally at Computer Associates, then ACCPAC International and finally Sage. I find I really enjoy writing and was looking to do more. Almost three years ago I retired, and at that point mostly lost my main blogging topic on Sage 300 (ACCPAC).

I’ve always been a big Science Fiction fan; I’ve read Science Fiction since elementary school, starting with books like Isaac Asimov’s Lucky Starr series. When I started at Computer Associates, I lived in Tsawwassen and had a long bus commute downtown. I spent most of this ride voraciously reading all the Science Fiction novels nominated for the Hugo and Nebula Awards, as well as whatever my favorite authors published.

We spent this March down in Yuma, Arizona. One of the things we did while we were down there was attend the “Write On the Edge” writing group. This group gets together weekly at the Yuma Foothills Library to do some writing. They do some sort of writing exercise each meeting. The first time we attended, it was to write a few paragraphs on a topic that a moderator chose. Since Easter was approaching, the topic was “Easter Eggs”. There were a lot of short pieces on people’s favorite Easter family moments (whether real or imagined) and one about horrible carnivorous beasts hatching from the eggs. I took the approach of computer software Easter Eggs meaning little jewels buried in the code. This led to the creation of the J@ck Tr@de character and the few paragraphs around where he finds the W-Server backdoor.

Then, as we did the 24-hour drive back home (over three days), I kept thinking about those few paragraphs and felt I had enough ideas to develop them into a novel. This then led to “Influence”.

A Lot of Writing

When I got home, I put the blogging aside (hence no articles here from March to July) and started my novel. I first wrote a very quick outline: mostly a beginning and an ending, some settings and some notes on characters. I then started writing. I tried to write at least two pages a day, sometimes more, and there were only a couple of days when I didn’t write anything. I participated in a couple of writing groups, one in Gibsons and one in North Vancouver. These were get-togethers at coffee shops where writers bring their laptops and write. My wife, Cathalynn Labonte-Smith, is also an author. She has Creative Writing, Technical Writing and Teaching degrees. She has worked as an editor, and would read what I wrote each day and edit it. I wrote the whole thing in Google Docs, so the collaboration was really easy. I was pretty happy to finish my first and second drafts in July.

It’s Written, Now What?

I then went to publish the book. Most of the bigger publishers only take submissions from literary agents, so I followed the submission guidelines for a number of agencies, but didn’t have any luck. I did quite a bit of online research and talked to quite a few authors. The consensus seemed to be that the publishers were publishing less and less each year, and that they only picked up authors who’ve already made a name for themselves on their own. Further, the published authors didn’t think publishing companies do much to promote their work, and that authors have to do all their own promotion. Meanwhile, self-publishing is getting easier and easier. I chose Kindle Direct Publishing from Amazon, mostly because it’s all online, there are no upfront costs and it was easy. So now my book is on Amazon for sale around the world.

Summary

Buy my book:

Paperback – https://www.amazon.com/dp/1730927661

Kindle – https://www.amazon.com/dp/B07L477CF6

It was a lot of fun writing. I planned this book to be the first book in a trilogy and I’ve already written sixty pages of the second volume.

 

Written by smist08

December 5, 2018 at 7:08 pm

Getting Productive with Julia


Introduction

Julia is a programming language that is used quite extensively by the scientific community. It is open source, it recently reached its version 1.0 milestone after quite a few years of development, and it is nearly as fast as C while offering many features associated with interpreted languages like R or Python.

There don’t seem to be many articles on getting up and running with Julia, so I thought I’d write about some things that I found useful. This is all based on playing with Julia on my laptop running Ubuntu Linux.

Run in the Cloud

One option that avoids any installation hassles is to just run in the cloud. You can do this with JuliaBox. JuliaBox gives you a Jupyter Notebook interface where you can either play with the various tutorials or do your own programming. Just beware that the resources you get for free are quite limited, and JuliaBox makes its money by charging you for additional time and computing power.

Sadly at this point, there aren’t very many options for running Julia in the cloud since the big AI clouds seem to only offer Python and R. I’m hoping that Google’s Kaggle will add it as an option, since the better performance will open up some intriguing possibilities in their competitions.

JuliaBox gives you easy direct access to all the tutorials offered from Julia’s learning site. Running through the YouTube videos and playing with these notebooks is a great way to get up to speed with Julia.

Installing Julia

Julia’s website documents how to install Julia on various operating systems. Generally the Julia installation is just copying files to the right places and adding the Julia executable to the PATH. On Ubuntu you can search for Julia in the Ubuntu Software app and install it from there. Either way, this is usually pretty straightforward. This gives you the ability to run Julia programs by typing “julia sourcefile.jl” at a command prompt. If you just type “julia”, you get the REPL environment for entering commands.

You can do quite a lot in the REPL, but I don’t find it very useful myself, except for things like package management.

If you like programming by coding in your favorite text editor and then just running the program, then this is all you need. For many purposes this works out great.

The Juno IDE

If you enjoy working in full IDEs, then there is Juno, which is based on the open source Atom editor. There are commercial variants of this IDE with full support, but I find the free version works just fine.

To install Juno you follow these instructions. Basically this involves installing the Atom IDE by downloading and running a .deb installation package. Then from within Atom, adding Julia support to the IDE.

Atom has integration with Julia’s debugger Gallium, as well as providing a plot pane and watch variables. The editor is fairly good, with syntax highlighting. Generally not a bad way to develop in Julia.

Jupyter

JuliaBox, mentioned above, uses Jupyter and runs it in the cloud. However, you can install it locally as well. Jupyter is a very popular and powerful notebook metaphor for developing programs, where you see the results of each incremental bit of code as you write it. It is really good at displaying all sorts of fancy output like graphs. It is web based: it runs a local web server that uses the local Julia installation to execute your code. If you develop in Python or R, then you’ve probably already played with Jupyter.

To install it locally, you first have to install Jupyter itself. The easiest way to do this on Ubuntu is “sudo apt install jupyter”. This installs the full Jupyter environment with full Python support. To add Julia support, you need to run Julia another way (like just entering julia to get the REPL) and type “using Pkg” followed by “Pkg.add("IJulia")”. Now next time you start Jupyter (usually by typing “jupyter notebook”), you can create a new notebook based on Julia rather than Python.

Julia Packages

Once you have the core Julia programming environment up and running, you will probably want to install a number of add-on packages. The package manager is called Pkg, and you need to type “using Pkg” before using it. Packages are installed with the Pkg.add("PackageName") command. You only need to add a package once. You will probably want to run “Pkg.update()” now and again to pick up updates to the packages you are using.

There are currently about 1900 registered Julia packages. Not all of them have been updated to Julia version 1.0 yet, so check the compatibility first. There are a lot of useful packages for things like machine learning, scientific calculations, data frames, plotting, etc. Certainly have a look at the package library before embarking on writing something from scratch.

Summary

These are currently the main ways to play with Julia. I’m sure, since Julia is a very open, community driven system, that these will proliferate. I don’t miss using giant IDEs like Visual Studio or Eclipse; these have become far too heavy and slow in my opinion. I find I evenly distribute my time between using Jupyter, Juno and just edit/run. Compared to Python it may appear there aren’t nearly as many tools for Julia, but with the current set, I don’t feel deprived.

 

Written by smist08

October 10, 2018 at 3:55 am

Avoiding Airline Collisions with Julia


Introduction

I was just watching an old episode of “Mayday: Air Crash Investigations”, on the collision of a Russian passenger jet with a DHL cargo plane over Switzerland. In this episode, both planes had onboard collision avoidance systems, but one plane listened to air traffic control rather than its collision avoidance system and went down rather than up, resulting in the collision. Reading about the programming language Julia recently, I had noticed several presentations on the development of the next generation of collision avoidance systems, in Julia. This piqued my interest (along with the fact that my wife is currently getting her pilot’s license), so I took a slightly deeper look into this.

Modern airliners have employed an onboard Traffic Collision Avoidance System (TCAS) since the 1980s. TCAS is required on any passenger airplane that carries more than 19 passengers. These systems work by monitoring the transponders of nearby aircraft and determining when a collision is imminent. At that point it provides a warning to the plane’s pilot along with a course of action. The TCAS systems on the two aircraft communicate, so one plane is ordered to climb and the other to descend.

Generally there are three layers to collision avoidance that operate on different timescales. At the coarsest level, planes travelling in one direction are required to be at a different altitude than planes in the reverse direction. Usually one direction gets even altitudes like 30,000 feet and the reverse gets odd altitudes like 31,000 feet. At a finer level, air traffic control is responsible for keeping the planes apart at medium distances. Then close up (minutes apart), it is TCAS’s job to avoid the collision. This division of labour is partly an aftermath of the Russian/DHL crash, and partly due to a realization that the latency in communications with air traffic control is too great when things get too close for comfort.
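A toy Python sketch of the coarsest and finest layers. The even/odd altitude assignment follows the description above, and the lower-transponder-ID tie-break in the advisory coordination is purely an assumption for illustration, not the actual TCAS coordination protocol:

```python
def cruise_altitudes(heading_deg: int):
    # Coarse layer: opposite directions fly different altitude bands.
    # Here headings 0-179 get odd thousands of feet and 180-359 get
    # even thousands, matching the even/odd scheme described above.
    thousands = range(29, 41)
    if 0 <= heading_deg < 180:
        return [t * 1000 for t in thousands if t % 2 == 1]
    return [t * 1000 for t in thousands if t % 2 == 0]

def resolution_advisories(id_a: int, id_b: int):
    # Finest layer: the two TCAS units coordinate complementary
    # advisories so one aircraft climbs while the other descends.
    # Breaking the tie on the lower transponder ID is an invented
    # rule for this sketch.
    return ("CLIMB", "DESCEND") if id_a < id_b else ("DESCEND", "CLIMB")

print(31000 in cruise_altitudes(90))           # True: this band gets odd thousands
print(resolution_advisories(0x7001, 0x7002))   # ('CLIMB', 'DESCEND')
```

The key property, and the one that failed over Switzerland when a pilot followed air traffic control instead, is that the two advisories are always complementary: both systems agree on who goes up and who goes down.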

Interestingly, it was the collision of two passenger planes over the Grand Canyon in 1956 that caused Congress to create the FAA and started the development of the current TCAS system. It took thirty years to develop and deploy, since it required computers to get much smaller and faster first.

Why Julia

The FAA has funded the development of the next generation of traffic collision avoidance, dubbed ACAS X. This started in 2008, and after quite a bit of study it was decided to use Julia extensively in its development. Reading the reasons why Julia was selected is rather scary when you consider what it highlights about the current TCAS system.

Problem 1 – Specifications

A big problem with TCAS was that the people who defined the system wrote the specification first as English-like pseudo-code, then re-wrote that as a more program-like pseudo-code with variables and such. Then others would take this code and implement it in Matlab to test the algorithms. Then the people who actually made the hardware would take this and re-implement it in C++ or Assembler. When people had a recent look at all this code, they found it to be a big mess, where the different specs and code bases had been maintained separately and didn’t match. There was no automation and very little validation. The idea of fixing this existing code base was rejected: it was considered completely unreliable and impossible to extend with new features.

They wanted the new system to take advantage of modern technologies like satellite navigation, GPS, and on-board radar. This means the new system will work with other planes that don’t have collision avoidance, or perhaps don’t even have a transponder. In fact, they wanted the new system to be easily extensible as new sensor inputs are added. Below is a small example of the reams of pseudo-code that make up TCAS.

The hope with Julia is to unify these different code bases into one. The variable pseudo-code would actually be true Julia code, and the English description would be incorporated into JavaDoc-like comments in the code (actually using LaTeX). This would then eliminate the need to use Matlab to test the pseudo-code. The consensus is that Julia code is easily as readable as the old pseudo-code, but with the advantage of being runnable and testable.

The FAA doesn’t have the authority to mandate that avionics hardware companies run Julia on their ACAS X systems, but the hope is that the performance of Julia is good enough that they won’t bother reimplementing the system in C++, and that everything will run the same Julia code. Current estimates have the Julia code running at about 1.5 times the runtime of equivalent C code, and the thought is that with newer computer chips this should be sufficient. The hope then is that the new system will not have the translation errors that dog TCAS.

Now that the specification is true computer code, many other tools can be written or used to help check correctness, such as the tool below, which generates a flowchart from the Julia code/specification.

Problem 2 – Testing/Validation

With TCAS, implementing the system in Matlab was certainly hard. Worse, Matlab is quite slow, which greatly restricts the number of test cases that can effectively be automated. The TCAS system is based on a huge number of giant decision trees and billions of test cases. A number of test/validation frameworks have been developed to test the new ACAS X system, including theorem proving, probabilistic model checking, adaptive stress testing, simulations and weakest-precondition code analysis.

Now, if the avionics hardware manufacturers run the actual Julia code, there will have been only one code base from specification to deployment, and it will have been very thoroughly developed, tested and validated.

Summary

The new ACAS X system is currently being flight tested and is projected to start being deployed in regular commercial aircraft starting in 2020. Looking at the work that has gone into this system, it looks like it will make flying much safer. Hopefully it also sets the stage for how future large safety-critical systems will be developed. Further it looks like the Julia programming language will play a central part in this.

Written by smist08

October 7, 2018 at 10:28 pm

Julia Flux for Machine Learning


Introduction

Flux is a Neural Network Machine Learning library for the Julia programming language. It is entirely written in Julia and relies on Julia’s built-in support for running on GPUs and providing distributed processing. It makes writing Neural Networks easy and leverages the power and expressiveness of the Julia language to make creating your Neural Network just the same as writing any other Julia expressions.

My last article pointed out some problems with using TensorFlow from Julia, due to many of the newer features being implemented in Python rather than in the core shared library. One recommendation from the TensorFlow folks is that if you want eager execution, then use Flux rather than TensorFlow. The Flux folks claim a real benefit of Flux over TensorFlow is that you only need to know one language to do ML, whereas for TensorFlow you need to know TensorFlow (its graph language) plus a host language like Python. That gets confusing, because there is a lot of duplication and it isn’t always clear in which system to do things, or whether to use a TensorFlow or a Python data type. Flux simplifies all this.

Although this all sounds wonderful, remember that Julia just hit version 1.0 and Flux just hit version 0.67. The main problem I found was excessive memory usage, which I’ll benchmark and discuss later on.

Also note that Flux isn’t a giant compilation of algorithms like SciKit Learn. It is rather specific to Neural Networks. There are other libraries available in Julia for things like Random Forests, but you need to find the correct package and install it. Then each of these may or may not fully support Julia 1.0 yet.

MNIST in Flux

To give a flavour of using Julia and Flux, here are a couple of examples from the FluxML model zoo. You can see it’s very simple to set up the Neural Network layers, perform the training and test the accuracy.

using Flux, Flux.Data.MNIST, Statistics
using Flux: onehotbatch, onecold, crossentropy, throttle
using Base.Iterators: repeated
# using CuArrays

# Classify MNIST digits with a simple multi-layer-perceptron
imgs = MNIST.images()

# Stack images into one large batch
X = hcat(float.(reshape.(imgs, :))...) |> gpu

labels = MNIST.labels()
# One-hot-encode the labels
Y = onehotbatch(labels, 0:9) |> gpu

m = Chain(
  Dense(28^2, 32, relu),
  Dense(32, 10),
  softmax) |> gpu

loss(x, y) = crossentropy(m(x), y)

accuracy(x, y) = mean(onecold(m(x)) .== onecold(y))

dataset = repeated((X, Y), 200)
evalcb = () -> @show(loss(X, Y))
opt = ADAM(params(m))

Flux.train!(loss, dataset, opt, cb = throttle(evalcb, 10))

println("acc X,Y ", accuracy(X, Y))

# Test set accuracy
tX = hcat(float.(reshape.(MNIST.images(:test), :))...) |> gpu
tY = onehotbatch(MNIST.labels(:test), 0:9) |> gpu

println("acc tX, tY ", accuracy(tX, tY))

Here is a more sophisticated model which uses a convolutional Neural Network.

using Flux, Flux.Data.MNIST, Statistics
using Flux: onehotbatch, onecold, crossentropy, throttle
using Base.Iterators: repeated, partition
# using CuArrays

# Classify MNIST digits with a convolutional network
imgs = MNIST.images()

labels = onehotbatch(MNIST.labels(), 0:9)

# Partition into batches of size 1,000
train = [(cat(float.(imgs[i])..., dims = 4), labels[:,i])
         for i in partition(1:60_000, 1000)]

train = gpu.(train)

# Prepare test set (first 1,000 images)
tX = cat(float.(MNIST.images(:test)[1:1000])..., dims = 4) |> gpu
tY = onehotbatch(MNIST.labels(:test)[1:1000], 0:9) |> gpu

m = Chain(
  Conv((2,2), 1=>16, relu),
  x -> maxpool(x, (2,2)),
  Conv((2,2), 16=>8, relu),
  x -> maxpool(x, (2,2)),
  x -> reshape(x, :, size(x, 4)),
  Dense(288, 10), softmax) |> gpu

m(train[1][1])

loss(x, y) = crossentropy(m(x), y)

accuracy(x, y) = mean(onecold(m(x)) .== onecold(y))

evalcb = throttle(() -> @show(accuracy(tX, tY)), 10)
opt = ADAM(params(m))

Flux.train!(loss, train, opt, cb = evalcb)

Performance

One of Julia’s promises is the ease of use of a scripting language like Python with the speed of a compiled language like C. As it stands, Flux isn’t there yet. There are points where Flux goes away for a long time; these might be the garbage collector kicking in, or something else. I find the speed is about the same order of magnitude as other systems (modulo the pauses), but the big problem is memory usage.

Solving MNIST with a convolutional Neural Network in Python using the TensorFlow tutorial runs quite well and uses 400Meg of memory. Running a similar model using Julia and TensorFlow uses 600Meg of memory. Running the simple model above using Julia and Flux takes 2Gig of memory, and running the convolutional model above uses 2.6Gig. The laptop I’m using has 4Gig of RAM and is running Ubuntu Linux, which is why I think the big stalls in performance are garbage collection.

The problem is that MNIST is a nice small dataset, and the model used to solve it isn’t very large as Neural Networks go. If Flux uses six times as much memory as Python, that really diminishes its usefulness as an ML toolkit.

I spent a bit of time looking at the Julia Differential Equations tutorial. It points out that using matrix operations in the Julia expression evaluator leads to lots of unnecessary temporary storage, for instance to evaluate:

D = A + B + C

where A, B and C are all large matrices, Julia has to create a temporary matrix to hold the sum A + B, which is then added to C. This temporary matrix has to be allocated from the heap and later garbage collected. This process seems to be rather inefficient in Julia, at least going by all the workarounds they provide to avoid it. There are SVectors, small vectors that can be allocated on the stack rather than the heap. They recommend using the .+ broadcast operator, which works element by element and is smart enough to fuse operations rather than create lots of temporary values on the heap. I wonder if Flux needs some of the optimisations they put so much time into for the Differential Equations library.
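
To illustrate, here is a small sketch using Julia’s built-in @time macro, which reports heap allocations alongside run time (the matrices here are arbitrary stand-ins, not the Flux models above):

A = rand(1000, 1000)
B = rand(1000, 1000)
C = rand(1000, 1000)
D = similar(A)   # preallocate the result matrix

# Naive form: A + B allocates a temporary matrix on the heap,
# then adding C allocates the final result as well.
@time D = A + B + C

# Fused broadcast: .= with .+ computes element by element and
# writes straight into D, allocating no intermediate matrices.
@time D .= A .+ B .+ C

Note that the first run of each line also includes compilation time, so the steady-state allocation counts only show up on a second run.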

Summary

Julia and Flux make a nice system for Machine Learning in theory. Until the technology matures a bit and problems like memory management are better addressed, using it for large projects is a bit problematic. A lot of the current ML systems being built with Flux are by PhD candidates who are developing Flux as part of their thesis work. Hopefully they improve the memory usage and allow Flux and Julia to live up to their full potential.


Written by smist08

September 24, 2018 at 9:02 pm

TensorFlow from Julia


Introduction

Last time, I gave a quick introduction to the Julia programming language, which has just reached the 1.0 release mark after ten years of development. Julia is touted as the next great thing for scientific computing, machine learning, data science and artificial intelligence. Its hope is to supplant Python, which is currently the go-to language in these fields. The goal is a more unified language; having been developed well after Python, it learned from a lot of Python’s mistakes. It also claims to have the flexibility of Python but with the speed of a true compiled language like C.

I saw that the list of packages included support for using Google’s TensorFlow AI system natively from Julia, so I thought I would give this a try. Although it worked, it revealed some challenges that Julia is going to face in its battle to become a true equal of Python.

Using TensorFlow in Julia

The TensorFlow wrapper/interface for Julia is in a package created by Jon Malmaud, a PhD candidate at MIT. You can add it to Julia using Pkg.add(“TensorFlow”) and view the source code on GitHub. Since I recently wrote an article comparing TensorFlow running on a Raspberry Pi to running on my laptop, I thought I’d use the same example and compare Julia to those cases. I cut and pasted the code into Juno, the Julia IDE, made some syntax changes and gave it a go. It came back that the Keras object was undefined.
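
One wrinkle: under the new package manager in Julia 1.0, Pkg is itself a library you load first, so the install looks like this (or equivalently, type add TensorFlow from the ] pkg REPL mode):

# Julia 1.0 no longer exports Pkg by default
using Pkg
Pkg.add("TensorFlow")

# Then load the package as usual
using TensorFlow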

I then noticed that in the TensorFlow.jl GitHub there were a couple of examples doing predictions on the MNIST dataset, so at least these were solving the same problem as my article, just using different models. I fired these up, but they failed with syntax errors in the code that loads the MNIST dataset. Right now it’s a bit of a problem in Julia that not all libraries have been updated to the Julia 1.0 syntax. I had a look at the library used to load MNIST and noticed that no one had contributed to it in three years; it appeared to be abandoned with no plans to continue it. After a bit more research I found another Julia package called MLDatasets that was maintained and would load MNIST along with several other popular datasets.
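
For anyone following along, loading MNIST through MLDatasets looks roughly like the following. This is a sketch based on the MLDatasets API as I found it (the function names may change), and the first call downloads the dataset:

using MLDatasets

# Auto-accept the DataDeps download prompt for the MNIST files
ENV["DATADEPS_ALWAYS_ACCEPT"] = "true"

# Training set: images come back as a 28x28x60000 array of
# floats in [0,1], labels as a vector of 60000 integers 0-9.
train_x, train_y = MNIST.traindata()

# Test set: 10,000 images and labels in the same layout.
test_x, test_y = MNIST.testdata()

println(size(train_x), " ", size(train_y))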

I logged an issue with the TensorFlow.jl repository that they should fix this. They replied that they didn’t have time, but that if I wanted to fix it, to go ahead. So I fixed it and checked it into the TensorFlow.jl GitHub, and now these MNIST examples work with Julia 1.0. I was happy to have given my small contribution back to this community.

I then thought, why not be ambitious and add the Keras layer to TensorFlow.jl? Well, this led to some interesting revelations about how TensorFlow is architected.

Problems with the Tensorflow Architecture

Looking at some of the issues in the TensorFlow.jl library, there were requests for things like TensorFlow’s eager execution and the TensorFlow layers interface. The answer to these issues was that the Julia interface only talks to the DLL/SO interface to TensorFlow, and these modules don’t exist there; they are in fact written in Python rather than C++. I had a look inside the TensorFlow GitHub and found that their Keras layer is also written in Python.

Originally TensorFlow.jl talked to the TensorFlow Python interface. Julia is really good at interoperability and can easily talk to both Python libraries and C/C++ DLLs/SOs. The problem with talking to Python libraries is that it involves running a Python process and then doing process-to-process communication to execute the code, which tends to be far slower than calling DLLs or SOs. So early on, the TensorFlow.jl library was changed to talk only to the DLL/SO interface for TensorFlow, eliminating all Python dependencies. This lets Julia use the really performant part of TensorFlow and perform all the core operations very quickly.
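
As a tiny illustration of the two interop paths, Julia can call a C shared library function in-process with ccall, while Python calls go through the PyCall package and the Python runtime (a sketch; it assumes PyCall is installed for the second half):

# C interop: call strlen from the standard C library directly,
# with no Python involved at all.
len = ccall(:strlen, Csize_t, (Cstring,), "TensorFlow")
println(len)  # prints 10

# Python interop: every call has to cross into the Python runtime,
# which is much slower than an in-process C call.
using PyCall
math = pyimport("math")
println(math[:sqrt](16.0))  # prints 4.0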

Now the problem seems to be that Google is doing a lot of the new TensorFlow development in Python and not putting the code into the core shared library. Google is also spending a lot of time promoting these new interfaces as the way to go. This means that if you aren’t programming in Python, you are definitely a second-class citizen.

OK, so is this just bad for a newbie language like Julia? Should Julia programmers just use the Julia-native Flux AI library? Well, the other thing Google is promoting is running TensorFlow on things like mobile devices, where you access TensorFlow from Swift on iOS or from Java on Android. Now you have the same problems as the Julia programmer: you only have efficient access to the core low-level APIs for TensorFlow, and all the fancy new high-level access is denied to you. Google’s API block diagram below highlights this.

To me this is a big architectural problem with TensorFlow. It’s great to use from Python, but really limited in other environments. The videos and blogs starting to surface on TensorFlow 2.0 are promoting eager execution and the Keras layer as the default and primary ways to program with TensorFlow. This raises the question of whether these will be moved into the core shared library or will remain as Python code. At this point I haven’t seen this explained, but as we get closer to the 2.0 preview later this year, I’ll be watching keenly.

It would certainly be nice if they moved this Python code into C++ in the shared library so everyone can use it. At that point I think TensorFlow would be much more usable from Julia, Swift, Java, C++, etc. Here’s hoping that is a major upgrade in the 2.0 release.

Julia TensorFlow Code

Just for interest, here is the simplest Julia MNIST example, to give a flavour of the code. This is a simple linear model, so it doesn’t give great results. There is a more complicated example that uses a convolutional neural network and gives far superior results.

using TensorFlow
include("mnist_loader.jl")

loader = DataLoader()

sess = Session(Graph())

x = placeholder(Float32)
y_ = placeholder(Float32)

W = Variable(zeros(Float32, 784, 10))
b = Variable(zeros(Float32, 10))

run(sess, global_variables_initializer())

y = nn.softmax(x*W + b)

cross_entropy = reduce_mean(-reduce_sum(y_ .* log(y), axis=[2]))
train_step = train.minimize(train.GradientDescentOptimizer(.00001), cross_entropy)

correct_prediction = argmax(y, 2) .== argmax(y_, 2)
accuracy=reduce_mean(cast(correct_prediction, Float32))

for i in 1:1000
    batch = next_batch(loader, 100)
    run(sess, train_step, Dict(x=>batch[1], y_=>batch[2]))
end

testx, testy = load_test_set()

println(run(sess, accuracy, Dict(x=>testx, y_=>testy)))

Summary

You can certainly use TensorFlow from Julia. Just beware that you are limited to the lower-level APIs, so anything TensorFlow has implemented in Python isn’t available to you. This means you set up the graph and then execute it, much like you always did in the earlier versions of TensorFlow. It would certainly be nice if Google fixed this problem in TensorFlow 2.0.

Written by smist08

September 22, 2018 at 5:43 pm