Stephen Smith's Blog

Musings on Machine Learning…

Playing with Software Defined Radio


Introduction

Most ham radios these days receive signals through an antenna, convert the signal to digital, process it with a built-in computer, and then convert the result back to analog for the speaker. This trend of doing all the radio signal processing in software instead of with electronic components is called Software Defined Radio (SDR). The ICOM 7300 is built around SDR, as are all the expensive Flex radios.

Inexpensive SDR

Some clever hackers figured out that an inexpensive chip used in boards that receive TV on a computer could actually tune to any frequency. From this discovery, many inexpensive USB dongles have been produced that use this “TV tuner” chip to tune radio instead of TV. This is possible because all the chip does is receive a signal from an antenna and convert it to digital for the computer to process. I purchased the RTL-SDR dongle for around $30, which included a small VHF/UHF antenna.

I run Linux, both on my laptop and on a Raspberry Pi. I looked around for software to use with this device and found several candidates. I chose CubicSDR because it easily installed from the Ubuntu App store on both my laptop and on my Raspberry Pi.

I tried it first on the Pi, but it just didn’t work well. It would keep hanging and the sound was never good. I then tried it on my laptop and it worked great. This led me to believe that the Raspberry Pi just doesn’t have the horsepower to run this sort of system, either due to lack of memory (only 1Gig) or because the ARM processor isn’t quite powerful enough. Doing some reading online, the consensus seemed to be that you can’t run both the radio software and a GUI on the same Pi; you need either two Pis or a command line version of the software. I was disappointed the Pi wasn’t up to the challenge, but got along just fine using my laptop.

Enter the NVidia Jetson Nano

I recently acquired an NVidia Jetson Nano Developer Kit. This is similar to a Raspberry Pi, but with a more powerful quad-core ARM processor, 4Gig of RAM and 128 Maxwell NVidia GPU cores (it also costs $99 rather than $35).

I installed CubicSDR on this, and it worked right away like a charm. I was impressed; getting software for the Nano can sometimes be difficult, since it runs true 64-bit Ubuntu Linux on ARM, so everything needs to be built for that. But CubicSDR was in the App Store and installed with no problem. I fired it up and it recognized both the RTL-SDR and the NVidia Tegra GPUs. It took over ten of the GPU cores for its signal processing and worked really well.

Below is a screenshot of CubicSDR playing an FM radio station.

CubicSDR

CubicSDR is open source and free; it uses GNURadio under the covers (a low-level open source radio processing library). CubicSDR has quite an impressive display. Like fancy high-end radios, you can see what is happening on the frequencies around where you are tuned in. The interface can be a bit cryptic and you need to refer to the documentation to do some things. For instance, the volume doesn’t honor the system setting and you have to use the green slider in the upper right, so knowing what the various sliders do is quite helpful. Tuning frequencies is a bit tricky at first, but once you check the manual and play with it, it becomes easy. Using CubicSDR really is like using a high-end radio, just for a fraction of the cost.

It is certainly helpful to know ham terminology and which radio protocol is used where. For instance, most VHF communications use narrow band FM, most longer wavelength ham communications are upper or lower sideband, aeronautical uses AM, and commercial FM stations use wide band FM.

Antennas

Although the RTL-SDR supports pretty much any frequency, you need the correct antenna for what you are doing. The ham bands that bounce off the ionosphere, allowing you to talk to people halfway around the world, use quite long wavelengths. The longer the wavelength, the larger the antenna you need to receive it. Don’t expect to receive anything from the 20 meter band without a good sized antenna. That doesn’t mean it has to be expensive: you can get good results using a dipole or end-fed antenna, both of which are just made out of wire, but you do have to string them high up and facing the right direction.

What About Transmitting?

The RTL-SDR only receives signals. If you want to transmit as well, you need a more expensive model. These sorts of SDR transmitters are very low power, so if you want to be heard, you will need a good linear amplifier rated for the frequencies you want to use. You will also need a better antenna.

If you transmit, you also require a ham radio license and call sign. You are responsible for not causing interference and for ensuring your signal doesn’t bleed into adjacent channels. Since you are assembling all of this yourself, an advanced license is required.

Summary

SDR is great fun to play with and there are lots of great projects you can create with this and an inexpensive single board computer. It’s too bad the Raspberry Pi isn’t quite up to the task. However, more powerful Pi competitors like the Jetson Nano run SDR just fine.


Written by smist08

April 16, 2019 at 2:08 am

Playing with CUDA on My NVIDIA Jetson Nano


Introduction

I reported last time about my new toy, an NVIDIA Jetson Nano Development Kit. I’m pretty familiar with Linux and ARM processors. I even wrote a couple of articles on Assembler programming, here and here. The thing that intrigued me about the Jetson Nano is its 128 Maxwell GPU cores. What can I do with these? Sure, I can speed up TensorFlow, since it uses them automatically. I could probably do the same with OpenGL programs. But what can I do directly?

So I downloaded the CUDA C Programming Guide from NVIDIA’s website to have a look at what is involved.

Setup

The claim is that the microSD image of 64-bit Ubuntu Linux that NVIDIA provides for this computer has all the NVIDIA libraries and utilities you need pre-installed. The programming guide made it clear that you need to use the NVIDIA C compiler nvcc to compile your work. But when I typed nvcc at a command prompt, I just got an error that the command wasn’t found. A bit of Googling revealed that everything is installed, but the installation happened before your user was created, so you need to add the locations to some paths. Adding:

export PATH=${PATH}:/usr/local/cuda/bin
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/local/cuda/lib64

to my .bashrc file got everything working. These paths also show where CUDA is installed, which is handy since the installation includes a large collection of samples.

Compiling the deviceQuery sample produced the following output on my Nano:

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "NVIDIA Tegra X1"
  CUDA Driver Version / Runtime Version          10.0 / 10.0
  CUDA Capability Major/Minor version number:    5.3
  Total amount of global memory:                 3957 MBytes (4148756480 bytes)
  ( 1) Multiprocessors, (128) CUDA Cores/MP:     128 CUDA Cores
  GPU Max Clock rate:                            922 MHz (0.92 GHz)
  Memory Clock rate:                             13 Mhz
  Memory Bus Width:                              64-bit
  L2 Cache Size:                                 262144 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
  Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 32768
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 1 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            Yes
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device supports Compute Preemption:            No
  Supports Cooperative Kernel Launch:            No
  Supports MultiDevice Co-op Kernel Launch:      No
  Device PCI Domain ID / Bus ID / location ID:   0 / 0 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.0, CUDA Runtime Version = 10.0, NumDevs = 1

Result = PASS

This is all good information, and what all this data means is explained in NVIDIA’s developer documentation (which is actually pretty good). The deviceQuery sample exercises various information APIs in the CUDA library to tell you all it can about what you are running. If you can compile and run deviceQuery from the samples/1_Utilities folder, then you should be good to go.

CUDA Hello World

The 128 NVidia Maxwell cores basically form a SIMD (Single Instruction Multiple Data) computer. This means you have one instruction that they all execute, but on different data. For instance, if you want to add two arrays of 128 floating point numbers, you have one instruction, add, and each processor core adds a different element of the array. NVidia actually calls their processors SIMT (Single Instruction Multiple Threads), since you can partition the processors among threads and have two threads, each with its own collection of processors, doing their SIMD thing at once.

When you write a CUDA program, it has two parts: one runs on the host CPU and the other runs on the NVidia GPUs. The NVidia C compiler, NVCC, adds a number of extensions to the C language to specify what runs where and to provide more convenient syntax for the common things you need to do. For the host parts, NVCC translates its custom syntax into CUDA library calls and then passes the result on to GCC to compile normally. For the GPU parts, NVCC compiles to an intermediate format called PTX. The reason it does this is to support all the various NVidia GPU models. When the NVidia device driver loads this code, it does a just-in-time compile (which it then caches), where the PTX code is compiled to the correct binary code for your particular set of GPUs.

Here is the skeleton of a simple CUDA program:

// Kernel definition
__global__ void VecAdd(float* A, float* B, float* C)
{
    int i = threadIdx.x;
    C[i] = A[i] + B[i];
}

int main()
{
    ...
    // Kernel invocation with N threads
    VecAdd<<<1, N>>>(A, B, C);
    ...
}

 

The __global__ identifier specifies that the VecAdd routine runs on the GPU. One instance of this routine will be downloaded to run on N processors. Notice there is no loop to add these vectors: each processor runs as a different thread, and the thread’s x member is used to choose which array element to add.

Then in the main program we call VecAdd using the VecAdd<<<1, N>>> syntax, which indicates we are calling a GPU function with N threads on these three arrays.

This little example skips the extra steps of copying the arrays into GPU memory and copying the result back out. There are quite a few different memory types, with various trade-offs for using them.

The complete program for adding two vectors from the samples is at the end of this article.

This example also doesn’t show how to handle larger arrays or how to do error processing. For these extra levels of complexity, refer to the CUDA C Programming Guide.

The CUDA program here is very short, just doing an addition. If you wanted to, say, multiply two 10×10 matrices, you would have your CUDA code do the dot product of a row in the first matrix with a column in the second matrix. Then you would have 100 cores execute this code, so the multiplication could be done up to 100 times faster than on the host processor alone. There are lots of examples of matrix multiplication in the samples and documentation.

Newer CUDA Technologies

The Maxwell GPUs in the Jetson Nano are a bit old, and reading and playing with the CUDA libraries revealed a few interesting tidbits about what they are missing. We all know how NVidia has been enhancing their products for gaming and graphics with the introduction of things like real time ray tracing, but of more interest to me is how they’ve been adding features specific to machine learning and AI. Even though Google produces their own hardware for accelerating their TensorFlow product in their data centers, NVidia has added specific features that greatly help TensorFlow and other neural network programs.

One thing the Maxwell GPU lacks is direct matrix multiplication support; newer GPUs can do A * B + C as a single instruction, where A, B and C are all matrices.

Another thing NVidia recently added is direct support for executing computation graphs. If you worked with early versions of TensorFlow, you know that you construct your model by building a computational graph and then training and executing it. The newest NVidia GPUs can now execute these graphs directly. NVidia has a TensorRT library to move parts of TensorFlow to the GPU; this library does work with the Maxwell GPUs in the Jetson Nano, but is probably far more efficient on the newest, bright and shiny GPUs. Even just using TensorFlow without TensorRT is a great improvement and moves the matrix calculations to the GPUs, even on the Nano; it just means the libraries have more work to do.

Summary

The GPU cores in a product like the Jetson Nano can be easily utilized using products that support them like TensorFlow or OpenGL, but it’s fun to explore the lower level programming models to see how things are working under the covers. If you are interested in parallel programming on a SIMD type machine, then this is a good way to go.

 

/**
 * Copyright 1993-2015 NVIDIA Corporation.  All rights reserved.
 *
 * Please refer to the NVIDIA end user license agreement (EULA) associated
 * with this source code for terms and conditions that govern your use of
 * this software. Any use, reproduction, disclosure, or distribution of
 * this software and related documentation outside the terms of the EULA
 * is strictly prohibited.
 *
 */

/**
 * Vector addition: C = A + B.
 *
 * This sample is a very basic sample that implements element by element
 * vector addition. It is the same as the sample illustrating Chapter 2
 * of the programming guide with some additions like error checking.
 */

#include <stdio.h>

// For the CUDA runtime routines (prefixed with "cuda_")
#include <cuda_runtime.h>

#include <helper_cuda.h>

/**
 * CUDA Kernel Device code
 *
 * Computes the vector addition of A and B into C. The 3 vectors have the same
 * number of elements numElements.
 */

__global__ void
vectorAdd(const float *A, const float *B, float *C, int numElements)
{
    int i = blockDim.x * blockIdx.x + threadIdx.x;

    if (i < numElements)
    {
        C[i] = A[i] + B[i];
    }
}

/**
 * Host main routine
 */

int
main(void)
{
    // Error code to check return values for CUDA calls
    cudaError_t err = cudaSuccess;

    // Print the vector length to be used, and compute its size
    int numElements = 50000;
    size_t size = numElements * sizeof(float);
    printf("[Vector addition of %d elements]\n", numElements);

    // Allocate the host input vector A
    float *h_A = (float *)malloc(size);

    // Allocate the host input vector B
    float *h_B = (float *)malloc(size);

    // Allocate the host output vector C
    float *h_C = (float *)malloc(size);

    // Verify that allocations succeeded
    if (h_A == NULL || h_B == NULL || h_C == NULL)
    {
        fprintf(stderr, "Failed to allocate host vectors!\n");
        exit(EXIT_FAILURE);
    }

    // Initialize the host input vectors
    for (int i = 0; i < numElements; ++i)
    {
        h_A[i] = rand()/(float)RAND_MAX;
        h_B[i] = rand()/(float)RAND_MAX;
    }

    // Allocate the device input vector A
    float *d_A = NULL;
    err = cudaMalloc((void **)&d_A, size);

    if (err != cudaSuccess)
    {
        fprintf(stderr, "Failed to allocate device vector A (error code %s)!\n", cudaGetErrorString(err));
        exit(EXIT_FAILURE);
    }

    // Allocate the device input vector B
    float *d_B = NULL;
    err = cudaMalloc((void **)&d_B, size);

    if (err != cudaSuccess)
    {
        fprintf(stderr, "Failed to allocate device vector B (error code %s)!\n", cudaGetErrorString(err));
        exit(EXIT_FAILURE);
    }

    // Allocate the device output vector C
    float *d_C = NULL;
    err = cudaMalloc((void **)&d_C, size);

    if (err != cudaSuccess)
    {
        fprintf(stderr, "Failed to allocate device vector C (error code %s)!\n", cudaGetErrorString(err));
        exit(EXIT_FAILURE);
    }

    // Copy the host input vectors A and B in host memory to the device input vectors in
    // device memory
    printf("Copy input data from the host memory to the CUDA device\n");
    err = cudaMemcpy(d_A, h_A, size, cudaMemcpyHostToDevice);

    if (err != cudaSuccess)
    {
        fprintf(stderr, "Failed to copy vector A from host to device (error code %s)!\n", cudaGetErrorString(err));
        exit(EXIT_FAILURE);
    }

    err = cudaMemcpy(d_B, h_B, size, cudaMemcpyHostToDevice);

    if (err != cudaSuccess)
    {
        fprintf(stderr, "Failed to copy vector B from host to device (error code %s)!\n", cudaGetErrorString(err));
        exit(EXIT_FAILURE);
    }

    // Launch the Vector Add CUDA Kernel
    int threadsPerBlock = 256;
    int blocksPerGrid =(numElements + threadsPerBlock - 1) / threadsPerBlock;
    printf("CUDA kernel launch with %d blocks of %d threads\n", blocksPerGrid, threadsPerBlock);
    vectorAdd<<<blocksPerGrid, threadsPerBlock>>>(d_A, d_B, d_C, numElements);
    err = cudaGetLastError();

    if (err != cudaSuccess)
    {
        fprintf(stderr, "Failed to launch vectorAdd kernel (error code %s)!\n", cudaGetErrorString(err));
        exit(EXIT_FAILURE);
    }

    // Copy the device result vector in device memory to the host result vector
    // in host memory.
    printf("Copy output data from the CUDA device to the host memory\n");
    err = cudaMemcpy(h_C, d_C, size, cudaMemcpyDeviceToHost);

    if (err != cudaSuccess)
    {
        fprintf(stderr, "Failed to copy vector C from device to host (error code %s)!\n", cudaGetErrorString(err));
        exit(EXIT_FAILURE);
    }

    // Verify that the result vector is correct
    for (int i = 0; i < numElements; ++i)
    {
        if (fabs(h_A[i] + h_B[i] - h_C[i]) > 1e-5)
        {
            fprintf(stderr, "Result verification failed at element %d!\n", i);
            exit(EXIT_FAILURE);
        }
    }

    printf("Test PASSED\n");

    // Free device global memory
    err = cudaFree(d_A);

    if (err != cudaSuccess)
    {
        fprintf(stderr, "Failed to free device vector A (error code %s)!\n", cudaGetErrorString(err));
        exit(EXIT_FAILURE);
    }

    err = cudaFree(d_B);

    if (err != cudaSuccess)
    {
        fprintf(stderr, "Failed to free device vector B (error code %s)!\n", cudaGetErrorString(err));
        exit(EXIT_FAILURE);
    }

    err = cudaFree(d_C);

    if (err != cudaSuccess)
    {
        fprintf(stderr, "Failed to free device vector C (error code %s)!\n", cudaGetErrorString(err));
        exit(EXIT_FAILURE);
    }

    // Free host memory
    free(h_A);
    free(h_B);
    free(h_C);

    printf("Done\n");
    return 0;
}






Written by smist08

April 3, 2019 at 6:01 pm

Can NVidia Bake a Better Pi Than Raspberry?


Introduction

I love my Raspberry Pi, but I find its limited 1Gig of RAM can be quite restricting. It is still pretty amazing what you can do with these $35 computers. I was disappointed when the Raspberry Pi Foundation announced that the Raspberry Pi 4 is still over a year away, so I started to look at Raspberry Pi alternatives. I wanted something with 4Gig of RAM and a faster ARM processor. I was considering purchasing an Odroid N2 when I saw the press release from NVidia’s Developer Conference announcing the NVidia Jetson Nano Developer Kit. This board has a faster quad-core ARM A57 processor, 4Gig of RAM, plus the bonus of a 128-core Maxwell GPU. The claim is that this is an ideal DIY computer for those interested in AI and machine learning (i.e. me). It showed up for sale on arrow.com, so I bought one and received it via FedEx in two days.

Setup

If you already have a Raspberry Pi, setup is easy, since you can unplug things from the Pi and plug them into the Nano, namely the power supply, keyboard, monitor and mouse. Like the Pi, the Nano runs from a microSD card, so I reformatted one of my Pi cards and burned it with the variant of Ubuntu Linux that NVidia provides for these boards. Once the operating system was on the microSD card, I plugged it into the Nano and away I went.

One difference from the Pi is that the Nano does not have built-in Wifi or Bluetooth. Fortunately, the room I’m setting this up in has a wired Internet port, so I went into the garage, found a long Ethernet cable in my box of random cables, plugged it in, and was connected to the Internet. You can plug in a USB Wifi dongle if you need Wifi, or there is an M.2 E slot (which is hard to access) for an M.2 Wifi card. Just be careful about compatibility, since the drivers need to be compiled for ARM64 Linux.

The board doesn’t come with a case, but the box folds into a stand to hold the board. For now that is how I’m running. If they sell enough of these, I’m sure cases will appear, but you will need to ensure there is enough ventilation for the huge heat sink.

Initial Impressions

The Jetson Nano certainly feels faster than the Raspberry Pi. This is helped by the faster ARM processor, the quadrupled memory, the use of the GPU cores for graphics acceleration, and the fact that the version of Linux is 64-bit (unlike Raspbian, which is 32-bit). It ran the pre-installed Chromium browser quite well.

As I installed more software, I found that writing large amounts of data to the microSD card can be a real bottleneck, and I would often have to wait for it to catch up. This is more pronounced than on the Pi, probably because everything else is fast enough that the card stands out as the bottleneck. It would be nice if there were an M.2 M interface for an NVMe SSD drive, but there isn’t. I ordered a faster microSD card (over three times faster than what I have) and hope that helps. I can also try putting some things on a USB SSD, but again this isn’t the fastest.

I tried running the TensorFlow MNIST tutorial program. The version of TensorFlow provided for this board is 1.11. If I want to try TensorFlow 2.0, I’ll have to compile it myself for ARM64, which I haven’t attempted yet. Anyway, TensorFlow automatically used the GPU and executed the tutorial orders of magnitude faster than the Pi (a few minutes versus several hours), so I was impressed with that.

This revealed another gotcha: the GPU cores and CPU share the same memory, so when TensorFlow used the GPU, that took a lot of memory away from the CPU. I was running the tutorial in a Jupyter notebook running locally, which meant I was running a web server, Chromium, Python, and then TensorFlow with bits on the CPU and GPU. This tended to use up all the memory, and then things would grind to a halt until garbage collection sorted things out. Running from scratch was fine, but running iteratively felt like it kept hitting a wall. I think the lesson here is that to do machine learning training on this board, I really have to use a lighter Python environment than Jupyter.

The documentation mentions a utility to control the clock speeds of the ARM cores and GPU cores, so you can tune the heat produced. I think this is mostly for when you embed the board inside something, but beware: this sucker can run hot if you keep all the various processors busy.

How is it so Cheap?

The NVidia Jetson Nano costs $99 USD. The Odroid is $79, so the Nano is fairly competitive with other boards trying to be super-Pis. However, it is cheaper than pretty much any NVidia graphics card, and even cheaper than their Nano compute module (which has no ports and costs $129 in quantities of 1000).

The obvious cost saving is no Wifi and no Bluetooth. Another is the lack of a SATA or M.2 M interface. It does have a camera interface, a serial interface, and a Pi-like GPIO block.

The Nano has 128 Maxwell GPU cores. That sounds impressive, but remember that most graphics cards have 700 to 4000 cores. Further, Maxwell is the oldest supported platform (compute capability 5), whereas the newest is the version 7 Volta core.

I think NVidia is keeping the cost low to get the DIY crowd using their technologies. They’ve seen the success of the Raspberry Pi community and want to duplicate it for their various processor boards. I also think they want to be in the ARM board game, so that as better ARM processors come out, they can hope to supplant Intel in producing motherboards for desktop and laptop computers.

Summary

If the Raspberry Pi 4 team can produce something like this for $35, they will have a real winner. I’m enjoying playing with the board and learning what it can do. So far I’ve been pretty impressed. There are some limitations, but given the $100 price tag, I don’t think you can lose. You can play with parallel processing using the GPU cores, interface to robots with the GPIO pins, or play with object recognition via the camera interface.

For a DIY board, there are a lot of projects you can take on.

 

Ghidra


Introduction

In my novel “Influence”, the lead character J@ck Tr@de searches for an easter egg in a server operating system. To do this he uses a disassembler, which converts machine code back into source code. Normally you write computer programs in a programming language, which a compiler (another program) reads through and converts to the bits and bytes computers actually execute. In my novel, the disassembler uses AI to do an especially good job. I don’t know of any disassembler that uses AI yet, but a really powerful disassembler has just been released by the NSA as open source. So I thought I’d spend a bit of time blogging about it; since it’s open source, perhaps someone will add some AI to it, so it is as powerful as the tool J@ck uses.

Most disassemblers either aren’t very good or are quite expensive. This just changed when the NSA released their internally developed tool Ghidra as open source. I don’t know where the name Ghidra comes from, but the icon for the program is a dragon. I downloaded it and gave it a run. It was easy to install, ran well, and looks really powerful. Why did the NSA do this? Don’t they usually guard their internal tools with their lives? They claim it’s to help security researchers at universities and elsewhere do a better job of discovering vulnerabilities in software, making us all safer as a result. I wonder if it’s the NSA trying to get some good publicity, since they are generally distrusted, and most Americans got upset when it was revealed that the NSA could access any photo on any cell phone, including dick-pics. This really upset a lot of people; probably the only good thing a dick-pic has ever done.

For anyone interested, my novel, Influence, is available either as a paperback or as a Kindle download on Amazon.com:

Paperback – https://www.amazon.com/dp/1730927661
Kindle – https://www.amazon.com/dp/B07L477CF6

Installation

Ghidra is a Java program, so you need to have the Java 11 JDK installed first. I’m doing this on Ubuntu and didn’t already have Java installed. The Java installed by default via apt-get is Java 10, so that didn’t work. Installing Java 11 took a bit of Googling, but the following commands worked:

sudo add-apt-repository ppa:linuxuprising/java
sudo apt-get update
sudo apt-get install oracle-java11-installer

This adds an additional repository and then installs Java 11 from it. Then download Ghidra, uncompress it somewhere and run the shell script to start it.

Running

To play around with it, I created a new project and imported the executable file “head” from /usr/bin. This gave me some basic information on the executable.

It then takes a second to analyse the file, and then I can launch the code browser and look through a split screen with the assembler code on the left and the generated C code on the right.

I can view a function call graph of the current function (the functions that call it and the functions that it calls).

I can view a function graph of the entire program that I can zoom in and out and browse around in.

I can annotate the program, adding any insights I have. I can patch the program. All very powerful. Ghidra has full scripting support; the built-in scripting language is Java (after all, it is a Java program), but the API has support for adding other scripting languages. There is a plug-in architecture so you can write extensions. It supports many executable formats and knows about many processor instruction sets.

Trust the NSA?

After I downloaded Ghidra, I watched a couple of YouTube videos on it. One of the presenters ran WireShark to see if Ghidra was making any network connections back to the NSA. After all, could this be a trojan horse that the NSA will use to find out what hackers are up to? At least this presenter didn’t see any network calls while he was running it, but to a real hacker this could be a major concern. As of this writing all the Java code has been open sourced, but some of the extra add-ons that are in other languages still need to be posted, so right now you can’t build Ghidra as it’s distributed; the NSA says this should be remedied in a few weeks.

Summary

Although perhaps not as powerful as what J@ck was using, this is a really powerful tool to reverse engineer programs and even operating systems. The generated source code isn’t great, but it’s helpful compared to raw assembler. I think the expensive commercial disassembler vendors must be pretty upset, as I really don’t see any reason for them to exist now. I think this will be a big boon to black and white hat hackers, as well as to anyone who needs to reverse engineer some code (perhaps to figure out an undocumented API). Happy hacking.

Written by smist08

March 6, 2019 at 9:19 pm

Social Media Bots


Introduction

In my novel “Influence”, the lead character J@ck Tr@de spends a lot of time creating and improving Social Media Bots. I thought in this article I’d spend a bit of time providing some background on these. Social Media Bots weren’t made up by me; they’ve been around for a while. It is estimated that 15-20% of all social media accounts are really bots, and that 15-20% of all posts on social media sites like Twitter and Facebook were created by these bots.

For anyone interested, my book is available either as a paperback or as a Kindle download on Amazon.com:

Paperback – https://www.amazon.com/dp/1730927661
Kindle – https://www.amazon.com/dp/B07L477CF6

What is a Bot?

Bot is short for robot and really means a computer program pretending to be a real human being. Early bots were easy to identify and rather simple, but over time they’ve become more and more sophisticated, even to the extent of being credited with getting Donald Trump elected president of the USA.

The original Bots were spambots, programs that just sent out spam emails. Basically, hackers would take over people's computers and install a program (the spambot) which would then send spam to all the contacts in the poor victim's email. Programmers found these quite effective and took the same idea to social media.

Most of these Bots are quite simple and just work to advocate some idea by posting from a collection of human-created messages. They can be trying to influence political views, direct people to dubious websites, or perhaps just make people mad for the fun of it.

There is an interesting website, Botometer, that will analyse a Twitter account and score how likely it is to be a bot. I ran it on all my Twitter followers and quite a few got a score indicating they were Bots.
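Botometer's real scoring uses machine learning over many account features, but the flavour of the idea can be sketched with a toy heuristic. Everything below (the feature choices, weights and thresholds) is invented for illustration; it is not Botometer's actual model.

```python
# Toy bot-likelihood heuristic, loosely inspired by the kinds of account
# features services like Botometer examine. Weights and thresholds are
# made up for illustration; real bot detectors use trained ML models.

def bot_score(followers, following, tweets_per_day,
              has_profile_photo, account_age_days):
    """Return a rough 0-100 bot-likelihood score from simple account features."""
    score = 0
    if following > 0 and followers / following < 0.1:
        score += 30   # follows far more accounts than follow back
    if tweets_per_day > 50:
        score += 30   # inhumanly high posting rate
    if not has_profile_photo:
        score += 20   # default avatar
    if account_age_days < 30:
        score += 20   # brand-new account
    return min(score, 100)

# A ten-day-old account with a default avatar posting 80 times a day
print(bot_score(followers=12, following=900, tweets_per_day=80,
                has_profile_photo=False, account_age_days=10))   # prints 100
```

An established account with normal behaviour scores 0, while the example above trips every rule. The point is only that bot detection is a scoring problem over observable behaviour, not a yes/no lookup.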

Bots Get More Sophisticated

Like any computer program, Bots keep coming out with new versions, getting more and more sophisticated. They now create quite realistic Internet personas with photos and a history. If you look at such a Bot's Facebook page, you might be hard pressed to tell that it doesn't belong to a real person. Creating social media accounts is pretty easy, with very little verification: you just need a valid email account, the ability to respond to the confirmation email that is sent, and perhaps the ability to fool a simple captcha.

Another newer kind of Bot is the so-called ChatBot. These are programs that can carry on a conversation, using modern, sophisticated machine learning algorithms to discuss a topic like movie reviews. Many companies are trying to deploy ChatBots to automate their customer service, and can purchase ChatBot kits that they customize for their own needs. Often companies use ChatBots to handle their social media accounts: a major company can't answer all the Tweets and Facebook posts it receives, so it automates this with a ChatBot. Sometimes this is effective, sometimes it just pisses people off. The feeling is that getting some sort of answer is better than no answer.

The developers who create Social Media Bots took this same technology and incorporated it into their Bots. Now these Bots don't just post canned messages, they can also carry on limited conversations on their topics. Political campaigns often employ these to give the impression they have far more support than they really do. If you post a comment on a news article on Facebook, you often get responses almost right away, and many of these responses are actually from Social Media Bots using ChatBot technology. The Russians really spearheaded this in the American 2016 election campaign.

As Machine Learning and AI technology gets more and more powerful, these Social Media Bots get harder and harder to distinguish from real people, especially given the low quality of posts from actual real people. When a corporation uses a ChatBot for technical support, it will identify itself as such and often has an option to switch to a real person (perhaps with quite a long wait time), but when you are on Social Media, how do you really know who you are talking to?

In my book, Influence, the main character, J@ck, programs his Bots both to network and to modify their own code. As it is, Bots behave like viruses, spreading maliciously from computer to computer, and current Bots tend to rely on volume to do their damage. But as in Influence, perhaps Bots will start to coordinate their actions and work together to accomplish their goals. Given the number of computing devices connected to the Internet, a successfully spread Bot could harness tremendous computing power to spread its influence. By applying new algorithms for reinforcement and adaptive learning, the programs can get more and more effective out in the wild without requiring additional coding from their creators. Is it really that far fetched that such a network of Bots could become aware or intelligent in some sense?

Summary

Roughly twenty percent of users and twenty percent of posts on Social Media are automated Bots, not real people. Should you believe what you see on Facebook? Should you be influenced by all the tweets you see going by on Twitter? Are your thought processes critical enough to filter out all the automated noise being targeted at you? Are your consumer decisions on what you buy, or your political decisions on how you vote, being controlled by all these Bots? This is definitely something to be aware of: don't just believe it all.

Written by smist08

February 5, 2019 at 9:38 pm

The Technology of “Influence” – Part 5 VHF Radio Modems

leave a comment »

Introduction

In my novel “Influence”, the lead character J@ck Tr@de performs various hacking tasks. In the book he spends a lot of time securing his connections, hiding his identity and hiding his location. In this series of blog posts, I'm going to talk about the various technologies mentioned in the book like VPN, the Onion Browser, Kali Linux and using VHF radios. I've talked about HTTPS, VPNs, the Onion Browser and Kali Linux so far; now we're going to discuss VHF Radio Modems.

Very High Frequency (VHF) is a radio band used by both commercial and amateur radio operators (on different frequencies). If you see people using small handheld radios, chances are they are using VHF. This frequency band works line of sight and doesn't require a very large antenna to work quite well. Like any radio frequency, you can transmit and receive digital data over the air, just like a cell phone does, and you can buy fairly inexpensive VHF radio modems to connect a computer to the Internet via a VHF radio.
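To give a feel for how a computer hands data to one of these radio modems: many amateur-radio modems (TNCs) accept data over a serial link using the simple published KISS protocol, which just wraps each packet in delimiter bytes and escapes any delimiters that occur inside the data. Below is a sketch of that framing; actually keying a radio would additionally need a serial port and a real TNC.

```python
# Sketch of KISS framing, the simple protocol many amateur-radio modems
# (TNCs) use to accept packets from a computer over a serial link.
# Byte values follow the published KISS spec.

FEND, FESC = 0xC0, 0xDB        # frame delimiter and escape byte
TFEND, TFESC = 0xDC, 0xDD      # transposed versions sent after an escape

def kiss_frame(payload: bytes, port_command: int = 0x00) -> bytes:
    """Wrap payload in a KISS data frame, escaping any FEND/FESC bytes."""
    out = bytearray([FEND, port_command])
    for b in payload:
        if b == FEND:
            out += bytes([FESC, TFEND])
        elif b == FESC:
            out += bytes([FESC, TFESC])
        else:
            out.append(b)
    out.append(FEND)
    return bytes(out)

print(kiss_frame(b"HELLO").hex())   # prints c00048454c4c4fc0
```

In practice you'd write these frames to the modem's serial port (e.g. with pyserial) and the TNC takes care of modulating them onto the VHF carrier.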

In this article we’ll look at these in a bit more detail and discuss why J@ck finds these useful.

For anyone interested, my book is available either as a paperback or as a Kindle download on Amazon.com:

Paperback – https://www.amazon.com/dp/1730927661
Kindle – https://www.amazon.com/dp/B07L477CF6

Why Does J@ck Use These?

In the previous articles we had J@ck accessing the Internet from a coffee shop's WiFi using HTTPS, a VPN and the Onion Browser. With all this security, why doesn't J@ck feel secure? As we mentioned before, you want to think of security as an onion, where the more layers you have protecting you, the more secure you can feel. However, good hackers always stay paranoid and worry about being traced. In this case J@ck is worried: what if the NSA, FBI or some other agency can track his Internet usage back to the coffee shop's WiFi?

J@ck doesn't know if anyone can do this, or if anyone is actually looking for him. By having a homeless person plant a Raspberry Pi with a VHF radio outside the coffee shop, and then accessing it via a VHF radio modem attached to his laptop, J@ck can be up to 2 km away from the coffee shop, as long as he has line of sight.

This way, if the people in the black SUVs show up, J@ck can see them, be warned and escape. Most importantly, he will then know someone is looking for him.

The downside for J@ck is that each layer of the security onion adds overhead and latency that slows down his Internet access. With all this security in place J@ck can only access the Internet very slowly.

Strictly speaking, to use these frequencies you should have either a Ham or Commercial Radio license. But if you follow the license rules, you need to identify yourself every 30 minutes, and J@ck is certainly not going to do that. In the scheme of things, J@ck considers the penalties for illegally operating a radio the least of his problems. There are radio modems for UHF and 900 MHz too, and J@ck could use those as well, as long as the radio is cheap enough to be disposable.

Can the NSA Catch J@ck?

If the NSA can trace J@ck's Internet traffic back to the coffee shop, perhaps via a compromised Tor exit node and a compromised VPN, then what can they do?

If the NSA suspects J@ck is using a VHF modem, then rather than sending the SWAT team into the coffee shop, they could quietly move three vehicles with radio direction finding equipment into the area and triangulate J@ck's true location from the emissions of the VHF radio attached to his laptop.
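The geometry behind such a direction-finding fix is simple: each receiver measures a compass bearing to the transmitter, and the transmitter sits where the bearing lines cross. Here's a toy flat-plane version with made-up positions (x/y in km, not real geodesy); real DF gear adds a third receiver and error averaging because bearings are noisy.

```python
import math

# Toy direction-finding fix: two receivers at known flat x/y positions
# each measure a compass bearing (degrees clockwise from north) to a
# transmitter; the transmitter is where the two bearing rays intersect.

def triangulate(pos_a, bearing_a, pos_b, bearing_b):
    """Intersect two bearing rays; return (x, y) of the transmitter."""
    ax, ay = pos_a
    bx, by = pos_b
    # Unit direction vectors: bearing 0 = north (+y), 90 = east (+x)
    dax, day = math.sin(math.radians(bearing_a)), math.cos(math.radians(bearing_a))
    dbx, dby = math.sin(math.radians(bearing_b)), math.cos(math.radians(bearing_b))
    det = dbx * day - dax * dby
    if abs(det) < 1e-9:
        raise ValueError("bearings are parallel; no unique fix")
    # Distance along ray A to the crossing point (Cramer's rule)
    t = (dbx * (by - ay) - dby * (bx - ax)) / det
    return (ax + t * dax, ay + t * day)

# Receiver at (0, 0) hears the signal at 45 degrees, receiver at
# (10, 0) hears it at 315 degrees: the rays cross at (5, 5).
x, y = triangulate((0, 0), 45, (10, 0), 315)
print(round(x, 3), round(y, 3))   # prints 5.0 5.0
```

With a strong, continuous emitter like a laptop-attached VHF radio, a couple of vans taking bearings for a few minutes is enough to narrow J@ck down to a building.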

J@ck's hope is that they wouldn't do this the first time. If the G-men do show up at the coffee shop, he assumes they would either find his Raspberry Pi/radio modem, or guess what he was doing and use the radio vans the second time.

J@ck also limits his time at each coffee shop, so that the Feds have less time to set this all up and trap him.

Summary

Catching hackers is a game of cat and mouse. Since J@ck is the mouse he wants to be as elusive as possible. VHF modems are just another tool to make it harder to trace back to J@ck’s location and catch him.

Written by smist08

January 24, 2019 at 9:12 pm

Open Source Photography Toolkit

leave a comment »

Introduction

Since retiring, I've switched to entirely running open source software. For photography, Adobe Photoshop and Lightroom dominate the scene, and most articles and books are based on these products. The Adobe products have a reputation for being very good, but they are quite expensive, especially since Adobe switched to a subscription pricing model. In this article I'm going to talk about the excellent open source programs that work very well in this space.

Basically there are two streams here: the quicker and easier software, equivalent to Adobe Lightroom, and the more technical and sophisticated software, equivalent to Adobe Photoshop.

I run all these programs on Ubuntu Linux, however they all have versions for the Mac and Windows.

You can download the source code for any open source program and have a look at how the programs work. If you find a bug, you can report it, or if you are a programmer you can fix it. Figuring out enough of a program to work on it is a large undertaking, but I feel comforted that that avenue is open to me if I need it.

digiKam

digiKam is an open source photo management program similar to Adobe's Lightroom. It is easier to use than a full photo editing tool like GIMP or Adobe Photoshop, and has tools to automate processing the large number of photos taken in a typical shoot. It can import all your photos from raw format for further processing, has a pretty good image editor built in, and provides lots of tools for managing your photos: putting them in albums, assigning keywords, and editing the metadata. There is an extensive search tool, so you can find your photos again if you forget where you put them. There are also tools to publish your photos to various photography and social media websites.


Unlike Lightroom, there aren't nearly as many books or tutorials on the product; I only see one book on Amazon. However, the web-based manual for digiKam is pretty good and I find it more than enough. It does peter out near the end, but most of the things that are TBD are also easy to figure out (mostly the specifics of various integrations with third party web sites).

Another difference is that digiKam actually edits your pictures, rather than just storing differences the way Lightroom does, so you need to be aware of that in your management workflows.

Lightroom costs $9.99/month on a subscription basis; digiKam is free. One benefit is you don't have to worry about having your photos held hostage if you get tired of paying month after month, especially if you are an infrequent user.

GIMP

GIMP is very powerful photo-editing software. It is an open source equivalent of Adobe Photoshop. I recently saw a presentation by an author of a book on Photoshop on his workflow for editing photos with Photoshop. I was able to go home and perform the exact same workflows in GIMP without any problems. These involved a lot of use of layers and masks, both of which are well supported in GIMP.


Both Photoshop and GIMP are criticised for being hard to use, but they are the real power tools for photo editing and are both well worth the learning curve to become proficient. There are actually quite a few good books on GIMP as well as many YouTube tutorials on the basic editing tasks.

For 90% of your needs, you can probably use digiKam or Lightroom. But for the really difficult editing jobs you need a tool like this.

Photoshop typically costs $20/month on a subscription basis. GIMP is free.

RawTherapee

GIMP doesn't have the built-in ability to read raw image files. There are plug-ins that you can install, but I've not gotten good results with these; often they work stand-alone, but not from within GIMP. digiKam can process raw files, and doing that en masse is one of its main features.


Sometimes you want a lot of control over this conversion process, and this is where RawTherapee comes in. It is a very sophisticated conversion program, supporting batch processing and offering very fine-grained color processing.

Often in the open source world, components are broken out separately rather than bundled into one giant program. This provides more flexibility to mix and match software and allows the development teams to concentrate on what they are really good at.

Typically you would take all your pictures in your camera’s raw mode, convert these to a lossless file format like TIFF and then do your photo editing in GIMP. This is the harder, but more powerful route as opposed to using digiKam for the entire workflow.

OpenShot

OpenShot is actually movie editing software. I included it here because many photographers like to create slideshows of their work, where the images have nice transitions and change from image to image with the music. OpenShot is an ideal open source program for doing this. If you have a Mac, you can use iMovie for this, but if you don't have a Mac, or want something that works on any computer, then OpenShot is a good choice.


Summary

There are good open source pieces of software that are very competitive with the expensive commercial products. Adobe has a near monopoly in the commercial space and tries to squeeze every dime it can out of you. It's nice that there is a complete suite of alternatives. I only use open source software for my photography, and have found it to easily fill all my needs.

This article only talks about four pieces of software. There are actually many more specialized applications out there that you can easily find by googling. Chances are if you look below the ads in your Google search results, you will find some good free open source software that will do the job for you.


Written by smist08

January 5, 2019 at 10:29 pm