Stephen Smith's Blog

Musings on Machine Learning…

Posts Tagged ‘vmware’

The Road to TensorFlow – Part 1 Linux

Introduction

There have been some remarkable advancements in Artificial Intelligence type algorithms lately. I blogged on this a little while ago here. Whether it’s computers reading handwriting, understanding speech, driving cars or winning at games like Go, there seems to be a continual flood of stories of new amazing accomplishments. I thought I’d spend a bit of time getting to know how this was all coming about by doing a bit of reading and playing with the various technologies.

I wanted to play with Neural Network technology, so thought the Google TensorFlow open source toolkit would be a good place to start. This led me down the road to quite a few new (to me) technologies. So I thought I’d write a few blog posts on my road to getting some working TensorFlow programs. This might take quite a few articles covering Linux, Python, Python libraries like Pandas, Stock Market technical analysis, and then TensorFlow.

Linux

The first obstacle I ran into was that TensorFlow had no install image for Windows. After a bit of Googling, I found you need to run it on MacOS or Linux. I hadn’t played with Linux in a few years and I’d been meaning to give it a try.

I happened to have just read about a web site, osboxes.org, that provides VirtualBox and VMWare images of all sorts of versions of Linux all ready to go. So I thought I’d give this a try. I downloaded and installed VirtualBox and downloaded a copy of 64-bit Ubuntu Linux. Since I didn’t choose anything special I got Canonical’s Unity Desktop. Since I was trying new things, I figured oh well, let’s get going.

Things went pretty well at first. I figured out how to install things on Ubuntu, which uses APT (Advanced Packaging Tool), a command line utility for installing packages. This worked pretty well, and the only problems I had were particular to installing Python, which I’ll talk about when I get to Python. I got TensorFlow installed and was able to complete the tutorial, I got the IDLE3 IDE for Python going, and I felt I was making good progress.

Then Ubuntu installed an update for me (which, like on Windows, runs automatically by default). This updated many packages on my virtual image and in the process broke the Unity desktop. Now the desktop wouldn’t come up and all I could do was run a single terminal window, so at least I could get my work off the machine. I Googled the problem and many people had it, but none of the solutions worked for me and I couldn’t resolve the problem. I don’t know if it’s just that Unity is finicky and buggy or if it’s a problem with running in a VirtualBox VM. Perhaps something with video drivers, who knows.

Anyway I figured to heck with Ubuntu and switched to Red Hat’s Fedora Linux. I chose a standard simple Gnome desktop and swore to never touch Unity again. I also realized that now I’m retired, I’m not a commercial user, so I can freely use VMWare, so I also switched to VMWare since I wondered if my previous problem was caused by VirtualBox. Anyway installing TensorFlow on Fedora seemed to be quite difficult. The dependencies in the TensorFlow install assume the packages that Ubuntu installs by default, and apparently these are quite different from Fedora’s. So after madly installing things that I didn’t really think were necessary (like the Gnu Fortran compiler), I gave up on Fedora.

So I went back to osboxes.org and downloaded an Ubuntu image with the Gnome desktop. This has been working great. I got everything re-installed quite quickly and was back to being productive. I like Gnome much better than Unity and I haven’t had any problems. Similarly, I think VMWare works a bit better than VirtualBox and I think I get a bit better performance in this configuration.

I have Python along with all the Python scientific and numerical computing libraries working. I have TensorFlow working. I spend most of my time in Terminal windows and the IDLE3 IDE, but occasionally use FireFox and some of the other programs pre-installed with the distribution.
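One quick way to confirm that a setup like this is in place is to ask Python which packages it can find. The sketch below is only illustrative: the package list is a guess at what "scientific and numerical computing libraries" covers (the post names only Pandas and TensorFlow), and `check_packages` is a made-up helper name.

```python
import importlib.util

# Assumed package list; the post only names Pandas and TensorFlow explicitly.
PACKAGES = ["numpy", "scipy", "pandas", "matplotlib", "tensorflow"]


def check_packages(names):
    """Return a dict mapping each top-level package name to True if importable."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_packages(PACKAGES).items():
        print(f"{name:12s} {'installed' if found else 'missing'}")
```

Using `find_spec` rather than a real import keeps the check fast, since nothing as heavy as TensorFlow actually gets loaded.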

I’m greatly enjoying working with Linux again, and I’m considering replacing my currently broken desktop computer with something inexpensive natively running Linux. I haven’t really enjoyed the direction Windows has taken after Windows 7 and I’m thinking of perhaps doing most of my computing on Linux and MacOS.

Summary

I am enjoying using Linux again, in spite of my initial problems with Ubuntu’s Unity Desktop and then with Fedora (running TensorFlow). Now that I have a good system that seems to be stable and working well, I’m pretty happy with it. I’m also glad to be free of things like App stores, and it’s nice to feel in control of my environment when running Linux. Anyway this was the small first step to TensorFlow.

Written by smist08

August 23, 2016 at 11:40 pm

Virtualization

Introduction

There was a discussion on LinkedIn the other day that started because the latest version of Sage 100 ERP only allows one copy of itself to be installed on a given computer. Many programs operate this way, such as most Microsoft products and other Sage products like Sage 300 ERP. The main reason for this is to avoid confusion for users when they are using integration technologies like COM or .Net, since then it’s easy to know what you are talking to when you integrate from another program. This is also how the Windows Installer works, so if you want to use this technology then this is what you get.

But the topic came up as to what to do to support multiple customers? The answer given was to use virtualization. We use this fairly extensively here at Sage for Development, QA and Support. This blog posting is to cover a bit more fully our uses of virtualization and some of the things we have discovered along the way.

VirtualBox

The Sage 100 and Sage X3 groups use Oracle VirtualBox. This one is nice because it’s open source (Oracle acquired it as part of Sun). I’ve run VMs created with this, but have never created one myself and don’t have much experience with it.

VMWare

The Sage 300 team uses VMWare. It used to be that you could use the VMWare player for free, but now it is only free for non-commercial use, but at least it’s fairly cheap. Generally you only need the Player and not the Workstation version. One nice feature is the unity feature which does an amazing job of integrating the virtual environment with your desktop environment which is good for demo purposes.

For server based VMs we use VMWare because our experience is that the memory usage is much better than the Microsoft Windows Server versions (but I haven’t played with Windows Server 2012 yet). The MS Server ones tend to force a lot of locked memory and you can’t run as many VMs. Our support department keeps a library of all supported operating systems times all supported versions installed, so if a client problem comes up say running XX version 3 on Windows XP 32-bit, then we boot up the right VM and try to reproduce the customer’s problem.
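The "operating systems times versions" library described above is a cross product, which can be sketched in a few lines. Everything here is hypothetical: the operating system and version lists are placeholders rather than the actual supported matrix, and `vm_library` and `find_vm` are names invented for the illustration.

```python
from itertools import product

# Hypothetical lists for illustration; not the actual supported matrix.
OPERATING_SYSTEMS = ["Windows XP 32-bit", "Windows 7 32-bit", "Windows 7 64-bit"]
PRODUCT_VERSIONS = ["XX version 2", "XX version 3"]


def vm_library(operating_systems, versions):
    """Build one VM name per (operating system, product version) pair."""
    return {
        (os_name, version): f"{version} on {os_name}"
        for os_name, version in product(operating_systems, versions)
    }


def find_vm(library, os_name, version):
    """Look up the VM to boot for a customer's configuration, or None."""
    return library.get((os_name, version))


library = vm_library(OPERATING_SYSTEMS, PRODUCT_VERSIONS)
print(len(library))  # 3 operating systems x 2 versions = 6 images
print(find_vm(library, "Windows XP 32-bit", "XX version 3"))
```

The point of the sketch is how quickly the count grows: every new supported operating system adds one image per supported version.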

Generally we find it useful to create a base operating system image like Windows 7 (64-bit) and keep a clean copy that we update every now and then with Windows updates. Then when we want a VM we just get a copy of the base operating system and install what we want on top of it. (We also keep some images of popular operating systems with Office and SQL Server as a better starting point.) Generally this gives us a quick way to get running when a need arises.

VirtualPC

We used to use MS VirtualPC a lot, but have moved away from it because MS doesn’t seem to be updating it anymore and it doesn’t support 64-bit client operating systems. This one is included with MSDN subscriptions, so if you have one of these, you probably have access to it.

It seems Microsoft is repurposing its VirtualPC software to their XP Mode feature to allow you to run Windows XP only software easily on Windows 7.

Client Operating System Licenses

Generally all the developers at Sage have an MSDN Universal subscription, so this gives us the licensing to do what we need with the client operating systems. For most development partners, there is also a lot of benefit in having an MSDN subscription yourselves.

Hardware Requirements

One disadvantage of virtual machines in the past has been how large they are (usually around 32Gig). This uses up disk space fast, but with cheap 3TB hard drives, this doesn’t seem to be much of a problem anymore.

I’ve found the main thing you need for good performance in virtual environments is lots of memory. If your computer has 8Gig RAM then you can allocate 4Gig to the VM and still have 4Gig for your base operating system. Even so, I find frequently switching back and forth between things in the VM and things in the base operating system can be slow, so I like to work for longer periods in one or the other.
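The sizing arithmetic in this section is simple enough to check with a couple of helper functions. This is a rough sketch with made-up function names; it assumes 1TB is 1000Gig and ignores file system overhead, snapshots, and anything else that eats disk space.

```python
def split_memory(total_gb, host_reserve_gb):
    """Return the RAM (in GB) left for a VM after reserving some for the host."""
    if host_reserve_gb >= total_gb:
        raise ValueError("host reserve must be smaller than total memory")
    return total_gb - host_reserve_gb


def images_per_drive(drive_gb, image_gb):
    """Return how many whole VM images of a given size fit on a drive."""
    return drive_gb // image_gb


print(split_memory(8, 4))          # 8 GB machine, 4 GB kept for the host -> 4
print(images_per_drive(3000, 32))  # 3 TB drive, 32 GB images -> 93
```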

Also, quite a few laptops have hardware virtualization support turned off by default. Going into the BIOS setup and turning this on can speed up VMs quite a bit.

Summary

To me virtualization software is quite amazing. I’m astounded that I can just run Windows 8 or Linux easily on my Windows 7 laptop. I think virtualization software has come a long way and is still progressing quickly. If you haven’t tried it out recently and you need to keep things separated, then you really should try one of these out. It saves a lot of headaches not having to worry about the installation of one thing messing up something else you have installed.

Choosing Between Cloud Providers

Introduction

It seems that every day there are more cloud providers offering huge cloud based computing resources at low prices. The sort of Cloud providers that I’m talking about in this blog posting are the ones where you can host your application in multiple virtual machines and then the cloud service offers various extension APIs and services like BigData or SQL databases. The extension APIs are there to help you manage load and automatically provision and manage your application. The following are just a few of the major players:

  1. Amazon Web Services. This is the most popular and flexible service. There are many articles on how much web traffic is handled by AWS these days.
  2. Microsoft Azure. Originally a platform for .Net applications, it now supports general virtualization and non-Microsoft operating systems and programs.
  3. Rackspace. Originally a hardware provider, now offers full services with the OpenStack platform.
  4. VMWare. Originally just a virtualization provider, has now branched out to full cloud services.

There are many smaller specialty players as well, like Heroku for Ruby on Rails or the Google App Engine for Java applications. There are also a number of other large players like IBM, Dell and HP going after the general market.

All of these services are looking to easily host, provision and scale your application. They all cater to a large class of applications, whether hosting in the cloud a standard Windows desktop application, or providing the hardware support for a large distributed SaaS web application. Many of these services started out for specific market niches like Ruby or .Net, but have since expanded to be much more general. Generally people are following the work of Amazon to be able to deploy seamlessly anything running in a virtual machine over any number of servers that can scale according to demand.

Generally these services are very appealing for software companies. It is quite expensive and quite a lot of trouble maintaining your own data center. You have to man it 24×7 and you are continually buying and maintaining hardware. You have to have all of this duplicated in different geographies with full failover. These are all activities that distract you from your main focus of developing software. Fewer and fewer web sites are maintaining their own data centers. Even large high volume sites like Netflix or Foursquare run on Amazon Web Services.

Which to Choose?

So from these services which one do you choose, and how do you go about choosing? This is a bit of a game where the customer and the service provider have very different goals.

For a customer (software developer), you want the cheapest service that is the most reliable, high performance and easiest to use. Actually you would always like the cheapest, so if something else comes along, you would like to be easily able to move over. You might even want to choose two providers, so if one goes down then you are still running.

For the service provider, they would like to have you exclusively and to lock you in to their service. They would like to have you reliant on them and to attract you with an initial low price, which then they can easily raise, since switching providers is difficult. They would also like to have additional services that they can offer you down the road to increase your value to them as a customer.

OpenStack

Both Amazon and Azure look to lock you in by offering many proprietary services, which once you are using, makes switching to another service very difficult. These are valuable services, but as always you have to be careful as to whether they are a trap.

Amazon pretty much owns this market right now. New players have been having trouble entering the market. Rackspace suddenly realized that just providing outsourced hardware wasn’t sufficient anymore and that too much new business was going to Amazon. They realized that creating their own proprietary services in competition with Amazon probably wouldn’t work. Rackspace came up with the disruptive innovation of creating an open source cloud platform called OpenStack that it developed in conjunction with NASA. They also realized that many people were already invested in Amazon, so they made it API compatible with several Amazon services.

OpenStack has been adopted by many other Cloud providers and there are 150 companies that are officially part of the OpenStack project.

This new approach has opened up a lot of opportunities for software companies. Previously, to reduce lock-in to a given vendor, you had to keep your application in its own virtual image and then do a lot of the provisioning yourself. With this you can start to automate many processes and use cloud storage without suddenly locking yourself into a vendor or having to maintain several different ways of doing things.

Advantages for Customers

With OpenStack, suddenly customers can start to really utilize the cloud as a utility like electricity. You can:

  1. Get better geographic coverage by using several providers.
  2. Get better fault tolerance. If one provider has an outage, your service is still available via another.
  3. Better utilize spot prices to host via the lowest cost provider and to dynamically switch providers as prices fluctuate.
  4. Have more power and flexibility when negotiating deals with providers.
  5. Go with the provider with the best service and switch as service levels fluctuate.
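The idea of switching to the lowest cost provider that still meets a service level can be sketched as a simple selection over offers. This is not an OpenStack API: the `Offer` class, the provider names, the prices and the uptime figures are all invented for illustration, and a real version would have to pull this data from each provider.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    provider: str          # hypothetical provider name
    price_per_hour: float  # made-up spot price in dollars
    uptime: float          # made-up recent availability, 0.0 to 1.0


def cheapest_offer(offers, min_uptime=0.0):
    """Return the lowest-priced offer whose uptime meets the threshold, or None."""
    eligible = [o for o in offers if o.uptime >= min_uptime]
    return min(eligible, key=lambda o: o.price_per_hour, default=None)


offers = [
    Offer("provider-a", 0.12, 0.999),
    Offer("provider-b", 0.09, 0.990),
    Offer("provider-c", 0.10, 0.999),
]

print(cheapest_offer(offers).provider)                    # provider-b
print(cheapest_offer(offers, min_uptime=0.995).provider)  # provider-c
```

Re-running the selection whenever prices or service levels change is what makes the "switch as things fluctuate" points in the list practical, provided the application can actually run unchanged on each provider.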

One thing that scares software companies is that as soon as they commit to one platform and do a lot of work to support it, a new service suddenly appears that leapfrogs the previous services. Keeping up and switching become a major challenge. OpenStack starts to offer some hope of getting off this treadmill, or at least making running on this treadmill a bit easier.

Is OpenStack Ready?

At this point OpenStack doesn’t offer as many services as Azure or AWS. Its main appeal is flexibility. The key will be how well the major companies backing OpenStack can work together to evolve the platform quickly and how strong their commitment is to keeping this platform open. For instance, will we start to see proprietary extensions in various implementations, rather than commits back to the home open source project?

Amazon and Azure have one other advantage, and that is that they are subsidized by other businesses. For instance Amazon has to have all this server infrastructure anyway in order to handle the Christmas shopping rush on its web store. So it doesn’t really have to charge the full cost, any money it makes off AWS is really a bonus. By the same token Microsoft is madly trying to buy market share in this space. It is taking profits from its Windows and Office businesses and subsidizing Azure to offer very attractive pricing which is very hard to resist.

Apple uses a multi-provider strategy for iCloud, which runs on both Amazon and Azure. This way it isn’t locked into a single vendor, has better performance in more regions, and won’t go down if one of these services goes down (like Azure did on Feb. 29). Generally we are seeing this strategy more and more as people don’t want to put their valuable eggs all in one basket.

Summary

With the sudden explosion of Cloud platform providers, there are huge opportunities for software developers to reduce costs and expand capabilities and reach. But how do you remain nimble and quick in this new world? OpenStack provides a common basis for service that allows people to easily move to new services and respond to the quickly changing cloud environment. It will be interesting to see how the OpenStack players can effectively compete with the proprietary and currently subsidized offerings from Microsoft and Amazon. Within Sage we currently have products on all these platforms. SalesLogix cloud is on Amazon, SageCRM.com is on Rackspace and Sage 200 (UK) is on Azure. It’s interesting to see how these are all evolving.