Stephen Smith's Blog

Musings on Machine Learning…

Archive for May 2020

Coffee in the Age of Social Distancing

The Write Cup

By Stephen Smith

Introduction

Here in British Columbia, Canada, COVID-19 restrictions are slowly being relaxed. As they are relaxed, coffee shops are scrambling to re-open while meeting the government regulations for social distancing and cleaning. In this article I’ll discuss the setups and trade-offs different shops are choosing.

Inside vs Outside Seating

It is far easier for coffee shops to offer outside patio seating than inside seating. In both cases social distancing is required and the tables have to be measured to ensure they are sufficiently separated. Many coffee shops don’t have enough room for any inside seating, since they have to keep the people in the counter lineup sufficiently separated. Often setting up the counter lineup takes all their inside floor space. Some have a snaky indoor lineup that leads to the counter, then the pickup area and then an exit door.

BEWARE! None of the washrooms are…

Written by smist08

May 29, 2020 at 1:40 pm

Posted in Uncategorized

Browsing MSDOS and GW-Basic Source Code

Introduction

These days I mostly play around with ARM Assembly Language and have written two books on it.

But long ago, my first job out of University involved some Intel 80186 Assembly Language programming, so I was interested when Microsoft recently posted the source code to GW-Basic, which is written entirely in 8086 Assembly Language. Microsoft posted the source code to MS-DOS versions 1 and 2 a few years ago, and that too is written entirely in 8086 Assembly Language.

This takes us back to the days when C compilers weren’t as good at optimizing code as they are today, processors weren’t nearly as fast and memory was at a far greater premium. If you wanted your program to be useful, you had to write it entirely in Assembly Language. It’s interesting to scroll through this classic code and observe the level of documentation (low) and the programming styles used by the various programmers.

Nowadays, programs are almost entirely written in high-level programming languages and any Assembly Language is contained in a small set of routines that provide some sort of highly optimized functionality, usually involving a coprocessor. But not too long ago, the bulk of many programs was written in Assembly Language.

Why Release the Source Code?

Why did Microsoft release the source code for these? One reason is that they are a part of computer history now and there are historians that want to study this code. It provides insight into why the computer industry progressed in the manner it did. It is educational for programmers to learn from. It is a nice gesture and offering from Microsoft to the DIY and open source communities as well.

The other people who greatly benefit from this are those that are working on the emulators that are used in systems like RetroPie. Here they have emulators for dozens of old computer systems that allow vintage games and programs to be run on modern hardware. Having the source code for the original is a great way to ensure their emulations are accurate and a great help to fixing bugs correctly.

Example

Here is an example routine from find.asm in MS-DOS 2.0 that converts a binary number into an ASCII string. The code in this routine is typical of the code throughout MS-DOS. Remember that back then MS-DOS was 16-bit, so AX is 16 bits wide. Memory addresses are built using two 16-bit registers, one that provides a segment and the other that gives an offset into that 64K segment. Remember too that MS-DOS programs can only use up to 640K of memory (the equivalent of ten such segments).

;--------------------------------------------------------------------
;       Binary to Ascii conversion routine                
;                                                                 
; Entry:                                                          
;       DI      Points to one past the last char in the             
;       AX      Binary number                                       
;             result buffer.                                        
;                                                                   
; Exit:                                                             
;       Result in the buffer MSD first                            
;       CX      Digit count                                         
;                                                                   
; Modifies:                                                         
;       AX,BX,CX,DX and DI                                          
;                                                                   
;--------------------------------------------------------------------
bin2asc:
        mov     bx,0ah
        xor     cx,cx
go_div:
        inc     cx
        cmp     ax,bx
        jb      div_done
        xor     dx,dx
        div     bx
        add     dl,'0'          ;convert to ASCII
        push    dx
        jmp     short go_div
div_done:
        add     al,'0'
        push    ax
        mov     bx,cx
deposit:
        pop     ax
        stosb
        loop    deposit
        mov     cx,bx
        ret

For an 8086 Assembly Language programmer of the day, this would be fairly self-evident code and they would laugh at us if we complained there wasn’t enough documentation. But we’re 40 or so years on, so I’ll give the code again with an explanation of what is going on added in comments.

bin2asc:
        mov     bx,0ah          ; we will divide by 0ah = 10 to get each digit
        xor     cx,cx           ; CX will be the length of the string, initialize it to 0
go_div:
        inc     cx              ; increment the count for the current digit
        cmp     ax,bx           ; is the number < 10 (last digit)?
        jb      div_done        ; if so, go to div_done to process the last digit
        xor     dx,dx           ; DX = 0, the high word of the 32-bit dividend DX:AX
        div     bx              ; AX = DX:AX / BX, DX = remainder
        add     dl,'0'          ; convert to ASCII; the remainder is < 10 so DL is enough
        push    dx              ; push the digit onto the stack
        jmp     short go_div    ; loop for the next digit
div_done:
        add     al,'0'          ; convert the last (most significant) digit to ASCII
        push    ax              ; push it on the stack
        mov     bx,cx           ; save the string length in BX
deposit:
        pop     ax              ; get the next most significant digit off the stack
        stosb                   ; store AL at ES:DI and increment DI
        loop    deposit         ; decrement CX and branch if CX is not zero; falls through when CX=0
        mov     cx,bx           ; put the count back in CX
        ret                     ; return from routine

A bit different than a C routine. The routine assumes the direction flag (DF) is clear, so that stosb increments the memory address; perhaps this is a standard across MS-DOS or perhaps it’s just local to this module. I think the header comment is incorrect and that the start of the output buffer is passed in DI, not one past the last character. The routine uses the stack to reverse the digits, since the divide-by-10 algorithm peels off the least significant digit first and we want the most significant digit first in the buffer. The resulting string isn’t NULL terminated, so perhaps MS-DOS treats strings as a length and buffer everywhere.
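For readers who don't read 8086 Assembly Language, here is a rough Python model of the same algorithm. It is only a sketch under some assumptions: the name bin2asc mirrors the label in the listing, but the bytearray standing in for the memory at ES:DI, the Python list standing in for the hardware stack and the returned tuple are illustrative choices, not anything taken from MS-DOS.

```python
def bin2asc(ax, buffer, di):
    """Model of the bin2asc routine: write the decimal digits of ax into
    buffer starting at offset di, most significant digit first.

    Returns (new_di, digit_count), mirroring DI and CX on exit.
    """
    bx = 10                       # mov bx,0ah
    cx = 0                        # xor cx,cx
    stack = []                    # stands in for the 8086 stack

    while True:                   # go_div:
        cx += 1                   # inc cx
        if ax < bx:               # cmp ax,bx / jb div_done
            break
        ax, dx = divmod(ax, bx)   # div bx: quotient to AX, remainder to DX
        stack.append(dx + ord("0"))   # add dl,'0' / push dx

    stack.append(ax + ord("0"))   # div_done: add al,'0' / push ax

    for _ in range(cx):           # deposit: repeated CX times by loop
        buffer[di] = stack.pop()  # pop ax / stosb stores the low byte
        di += 1                   # stosb advances DI when DF is clear
    return di, cx


buf = bytearray(8)
end, count = bin2asc(1234, buf, 0)
print(buf[:count].decode("ascii"), count)
```

The digits come off the division least significant first, so pushing them and then popping them is what puts the most significant digit at the start of the buffer.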

Comparison to ARM

This code is representative of CISC type processors. The 8086 has few registers and their usage is predetermined. For instance the DIV instruction is only passed one parameter, the divisor. The dividend, quotient and remainder all live in hard-wired registers: DX:AX holds the dividend, AX receives the quotient and DX receives the remainder. RISC type processors like the ARM have a larger set of registers and tend to have three operands per instruction, namely two input registers and an output register.
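The difference in operand style can be sketched in Python. This is an illustrative model only: the function names x86_div16, arm64_udiv and arm64_msub and the dictionary of registers are assumptions made for the example. The behaviour modelled is the 16-bit DIV (DX:AX divided by the operand, quotient to AX, remainder to DX) and the 64-bit ARM UDIV and MSUB instructions, where UDIV only produces a quotient and the remainder needs a second instruction.

```python
MASK16 = 0xFFFF
MASK64 = 0xFFFF_FFFF_FFFF_FFFF


def x86_div16(regs, divisor):
    """8086-style DIV: only the divisor is named. The dividend is the
    32-bit value DX:AX; the quotient lands in AX, the remainder in DX."""
    dividend = (regs["dx"] << 16) | regs["ax"]
    # Python raises ZeroDivisionError here; the real CPU raises a divide error.
    quotient, remainder = divmod(dividend, divisor)
    if quotient > MASK16:
        raise OverflowError("quotient does not fit in AX")
    regs["ax"], regs["dx"] = quotient, remainder


def arm64_udiv(xn, xm):
    """ARM 64-bit style UDIV Xd, Xn, Xm: two inputs, one output.
    Division by zero yields 0 rather than trapping."""
    return 0 if xm == 0 else (xn // xm) & MASK64


def arm64_msub(xn, xm, xa):
    """ARM 64-bit style MSUB Xd, Xn, Xm, Xa: Xd = Xa - Xn * Xm."""
    return (xa - xn * xm) & MASK64


# 8086 style: the caller must set up AX and DX, and both are overwritten.
regs = {"ax": 1234, "dx": 0}
x86_div16(regs, 10)

# ARM style: any registers can be used and the inputs are left untouched.
x0, x1 = 1234, 10
x2 = arm64_udiv(x0, x1)        # quotient
x3 = arm64_msub(x2, x1, x0)    # remainder = x0 - quotient * x1
print(regs, x2, x3)
```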

This code could be assembled for a modern Intel 64-bit processor with little alteration, since Intel has worked hard to maintain a good level of compatibility as it has gone from 16 bits to 32 bits to 64 bits. ARM, on the other hand, redesigned their instruction set when they went from 32 bits to 64 bits. This was a great improvement for ARM and only possible now that the amount of Assembly Language code in use is so much smaller.

Summary

Kudos to Microsoft for releasing this 8086 Assembly Language source code. It is interesting to read and gives insight into how programming was done in the early 80s. I hope more classic programs have their source code released for educational and historical purposes.

Written by smist08

May 25, 2020 at 6:56 pm

Virtual LinuxFest Northwest

Introduction

Last year we packed up the r-pod travel trailer and headed down to Bellingham, WA for some mountain biking and LinuxFest Northwest 2019. It was a really fun and informative show that I blogged about here. I enjoyed it so much that I hoped to return the next year and give a presentation. I applied last fall and was accepted. This looked like it was going to work out really well, since the show coincided with the release of my second computer book: Programming with 64-Bit ARM Assembly Language from Apress.

Enter Covid-19

Things were looking good until February, when events started to lock down and get cancelled due to the Covid-19 outbreak. Eventually LinuxFest Northwest in Bellingham was added to the list of cancelled events. Even if the organizers hadn’t cancelled, the border between Canada and the USA was closed to all non-essential travel. I suspect I would have had a hard time convincing the border guards that my presentation at LinuxFest was essential.

The organizers of LinuxFest Northwest weren’t happy abandoning the show altogether, so they asked all the presenters if they would record their presentations and upload them to be posted on the LinuxFest YouTube channel. Further, they set up a question and answer section on the LinuxFest discussion forums.

It looks like quite a few presenters participated and you can find all the sessions here. The Q&A forums are all here. More specifically my presentation is here.

Summary

I’m disappointed that LinuxFest Northwest didn’t happen live this year. I was looking forward to it. But at least we had the virtual event. I invite you to browse and watch a few of the sessions. Hopefully, we will gather in person next year.

Written by smist08

May 22, 2020 at 1:09 pm

Programming with 64-Bit ARM Assembly Language

Introduction

My first book on Assembly Language is Raspberry Pi Assembly Language Programming, which is all about ARM 32-Bit Assembly Language Programming. This is because the official variant of Linux for the Raspberry Pi, Raspbian, is 32-bit. There are good reasons for this, the most important being that until the Raspberry Pi 4, the maximum memory was 1Gig, which isn’t enough to properly run a 64-bit version of Linux. Yes, you can do it, but it’s rather painful.

Now with the Raspberry Pi 4 supporting 4Gig of RAM and other SBCs like the nVidia Jetson Nano also containing 4Gig of RAM, running 64-bit operating systems makes a lot more sense. Further, in the ARM world, nearly all phones and tablets have moved to 64 bits. All Apple products are 64-bit and all but the very cheapest Android phones are 64-bit.

Hence I felt it made sense to create a 64-bit version of my book and my publisher Apress agreed. This resulted in my newest book: Programming with 64-Bit ARM Assembly Language.

Beyond the Raspberry Pi

Along with teaching how to program 64-bit ARM Assembly Language, the book goes beyond the Raspberry Pi to cover how to add Assembly Language routines to your Apple iOS or Google Android Apps. Every App developer is struggling to get their App noticed among the millions of Apps in the App stores. Having better performance is one great way to get users to recommend your App to their friends.

The book also covers how to write Assembly Language for ARM 64-Bit Linux, including Ubuntu as included with the nVidia Jetson Nano or Kali Linux running on a Raspberry Pi 4. Further, the book covers how to cross compile your code: you compile/assemble it on a powerful Intel/AMD computer and then run it on your target device.

There is a lot of interest in IoT and embedded devices these days. Often these are based on ARM processors and often you need to do some Assembly Language programming to write the device drivers for the various custom pieces of hardware you are developing.

About ARM 64-Bit Assembly Language

When ARM developed the 64-bit version of their processor, they took the time to fix many problems that had built up over the years in the 32-bit versions. The Assembly Language syntax is more streamlined and a lot of little-used features, such as the ability to make nearly every instruction conditional, were removed entirely. As a consequence this new book is a complete rewrite. Although anyone familiar with 32-bit ARM Assembly should find 64-bit Assembly familiar, there are a lot of differences and improvements, such as roughly doubling the number of general-purpose registers.

With the new book you learn how to utilize all the new features that are now available to you, how the instruction syntax is much more uniform across all the coprocessors, and how to use all the new registers you have at your disposal.

The newest generations of the ARM processor all have deep execution pipelines and multiple cores. The new 64-bit instruction set is the foundation that allows the ARM processor to fully exploit these features and get the best performance for the smallest amount of power usage.

Where to Buy

With Covid-19, things are moving a bit slower than normal. The ePub versions of my book are available now from Apress directly. This should flow to all the other retailers shortly; in the meantime they have the book available for presale. The print version is in process, but I’m not sure how long it will take this time around. Here are some sample places where it is listed:

Over the coming weeks, it’ll change from pre-release to shipping now.

Summary

If you are interested in learning 64-Bit ARM Assembly Language, either to optimize your programs or to learn about the architecture of a modern RISC processor, then this book is for you. I hope this book motivates people to use more Assembly Language in their work to produce high performance applications. When people are surveyed for their favorite features in applications, better performance is always top of the list.

Written by smist08

May 2, 2020 at 10:46 am

Benchmarking with Folding@Home

Introduction

There are a lot of different benchmarks to rate the performance of computers, but I’m finding it interesting watching the performance of the various computers around the house running Folding@Home, which I blogged about last time. Folding@Home provides all sorts of interesting statistics that I’ll talk about as well. The main upshot is how much more processing power a GPU has than a CPU.

My Benchmarks

Here are some of the specs and the Folding@Home statistics for a number of the computers lying around my house:

Brand   CPU/GPU            # CPUs   Memory   Points per Day
MSI     Intel i7 9750H         10   16Gig            42,337
MSI     nVidia GTX 1650       896   4Gig            262,389
HP      Intel i3 6300U          4   4Gig             16,178
Apple   Core 2 Duo              2   4Gig              2,073
Dell    Celeron N2840           2   4Gig                565

These are all laptop computers. The Dell is a Chromebook that is running GalliumOS. Here are a few takeaways from these benchmarks:

  • Intel Celerons that are used in many cheap Chromebooks are terribly underpowered.
  • The Core 2 Duo does not contain the newer AVX SIMD instructions that Intel added in later generations of Core processors, and that is why the 2008-era MacBooks do so poorly.
  • GPUs are far more powerful than CPUs for this sort of calculation.
  • On the MSI Gaming Laptop there are 12 logical CPUs, but one is controlling the GPU and one is being used by me to write this article.
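To put numbers on these takeaways, the points per day from the table can be compared against the i7 as a baseline. This is just a small sketch: the figures are copied from the table, while the dictionary layout and the relative_to helper are made up for the illustration.

```python
# (units, points per day) copied from the table; "units" is the "# CPUs" column.
machines = {
    "Intel i7 9750H": (10, 42_337),
    "nVidia GTX 1650": (896, 262_389),
    "Intel i3 6300U": (4, 16_178),
    "Core 2 Duo": (2, 2_073),
    "Celeron N2840": (2, 565),
}


def relative_to(baseline):
    """Return each machine's points per day as a multiple of the baseline's."""
    base_points = machines[baseline][1]
    return {name: points / base_points for name, (_, points) in machines.items()}


ratios = relative_to("Intel i7 9750H")
for name, ratio in ratios.items():
    units, points = machines[name]
    print(f"{name:16s} {ratio:6.2f}x the i7, {points / units:9.1f} points per unit")
```

On these figures the GTX 1650 earns roughly six times the points of the i7 running on 10 logical CPUs, even though each individual GPU core contributes far less than a CPU core.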

Folding@Home Operating System Statistics

Here are the operating system statistics as of May 1, 2020, taken from the Folding@Home statistics page.

OS        AMD GPUs   NVidia GPUs   CPUs        CPU cores    TFLOPS      x86 TFLOPS
Windows    113,145       399,642   1,018,463    6,415,885     911,141    1,832,063
Linux       12,045       127,665   2,001,996   14,315,265     425,332      681,347
macOSX         136             0      96,503      477,058       5,537        5,753
Totals     125,326       527,307   3,116,962   21,208,208   1,342,010    2,519,163

Here are some takeaways from these numbers:

  • There are twice as many Linux CPUs as Windows CPUs. I think this is partly because technically advanced users, who are more likely to use Folding@Home, prefer Linux.
  • NVidia GPUs outnumber AMD GPUs by more than 4 to 1.
  • Most people with GPUs run Windows. This shows the draw of most gaming being on Windows.
  • It’s sad that Apple doesn’t offer GPUs on very many models.
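The ratios behind these takeaways can be checked directly from the table. The snippet below is a minimal sketch: the counts are copied from the statistics above, and the dictionary keys and variable names are invented for the example.

```python
# GPU and CPU counts per operating system, copied from the table.
stats = {
    "Windows": {"amd_gpus": 113_145, "nvidia_gpus": 399_642, "cpus": 1_018_463},
    "Linux": {"amd_gpus": 12_045, "nvidia_gpus": 127_665, "cpus": 2_001_996},
    "macOSX": {"amd_gpus": 136, "nvidia_gpus": 0, "cpus": 96_503},
}

total_amd = sum(row["amd_gpus"] for row in stats.values())
total_nvidia = sum(row["nvidia_gpus"] for row in stats.values())

# How many Linux CPUs there are for every Windows CPU.
linux_to_windows_cpus = stats["Linux"]["cpus"] / stats["Windows"]["cpus"]

# How many NVidia GPUs there are for every AMD GPU.
nvidia_to_amd = total_nvidia / total_amd

# Fraction of all GPUs that are attached to Windows machines.
windows_gpus = stats["Windows"]["amd_gpus"] + stats["Windows"]["nvidia_gpus"]
windows_gpu_share = windows_gpus / (total_amd + total_nvidia)

print(f"Linux/Windows CPUs: {linux_to_windows_cpus:.2f}")
print(f"NVidia/AMD GPUs:    {nvidia_to_amd:.2f}")
print(f"GPUs on Windows:    {windows_gpu_share:.0%}")
```

With the May 1, 2020 figures this works out to about 1.97 Linux CPUs per Windows CPU, about 4.2 NVidia GPUs per AMD GPU, and about 79% of all GPUs running under Windows.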

No ARM Support

It’s sad that Folding@Home doesn’t support the ARM processor. I’d love to add my various Raspberry Pis and nVidia Jetsons to this table. I think this would also greatly increase the number of Linux CPUs shown. Newer ARM CPUs all have NEON SIMD coprocessors and there could be support for common ARM GPUs.

Summary

It would be nice if computer reviews started adding Folding@Home benchmarks to all the other benchmarks they publish. I find it interesting comparing how various CPUs and GPUs do when running this algorithm.

Written by smist08

May 1, 2020 at 11:30 am

Posted in Business
