Stephen Smith's Blog

Musings on Machine Learning…

Archive for the ‘Cloud’ Category

Sage Intelligence Go!


Introduction

During his keynote at Sage Summit, Himanshu Palsule demonstrated a new flavor of Sage Intelligence called Sage Intelligence Go! (SIG). I thought I'd spend a bit of time this week providing a few more details about what it is and where it fits into the scheme of things.

Sage Intelligence is a business intelligence/reporting tool used by many Sage ERP products. Those who have been around the Sage 300 ERP community will recognize it as Alchemex, which Sage purchased a few years ago. The product runs as a Microsoft Excel add-in that extracts data from the ERP so it can be manipulated by the power of the Excel spreadsheet engine. This is a very popular way of doing Financial Reporting, since you often want to present the data graphically, perform complex calculations or slice and dice the data using pivot tables, and Excel provides a great platform for these tasks. The original Financial Reporter bundled with Sage 300 ERP works in a similar manner as an Excel add-in. Sage Intelligence is used for quite a few things beyond Financial Accounting, including Sales Analysis and other predictive type BI functions.

Sage Intelligence has been around for a while and is a good mature product. Since it is written with the full Excel add-in SDK, it must run in Excel and specifically the locally installed version of Excel. This means it isn’t particularly well suited for cloud applications such as Office Online.

Microsoft has now released a newer SDK for developing Office apps. This new SDK is designed to allow you to develop applications that can still run in the locally installed full Microsoft Excel, but can also run inside the cloud/web based Excel Online as well as the new specialty versions of Office like the one for the iPad.

As Sage ERP applications move to the cloud, you want your Financial Reporter to be cloud based as well. You want to be able to edit Financial Reports in the browser and to display and interact with reports on any device, including your tablet or smart phone. This is where Sage Intelligence Go! comes in. It is written with the new Office Apps SDK and will run on the online and device versions of Excel as well as the full locally installed Excel. Thus you don't need a Windows PC at all to use your cloud based ERP and cloud based Financial Reporter, yet you still have all the power of Excel helping you visualize and manipulate your reports.


Sage One Accounting

Sage One Accounting by Sage Pastel is one such cloud based ERP system. This product has Sage Intelligence Go! as an option for performing various reporting needs. When SIG started out here, it was much simpler; the original Office Apps SDK was very basic and didn't have nearly the power of the various older SDKs supported by local Excel. However, as time has gone by, the Office Apps SDK has become much stronger and the functionality is now much more in line with what we expect of such a solution.

Office Apps

Those of you who attended Sage Summit and saw James Whittaker's keynote will have seen the importance Microsoft is placing on the new Office Apps. The basic idea is that instead of using the browser and search, or using apps from a mobile device app store, you get your data directly in your business applications via this new sort of app. If you go to the Office App Store and search on Sage, you will find a Sage One app for Outlook and Sage Intelligence Go! If you search a bit more broadly, you won't find that many non-trivial applications; Sage Intelligence Go! is probably the most sophisticated Office App in the store.

Notice that you don't run SIG reports from the ERP application; they run from Excel. Once you have the XLSX file, you can open it anytime from any version of Excel and have full access to your data. This is really a new paradigm for doing reporting. For SIG this works really well; whether other applications fit this model is yet to be seen.

Web Services

SIG needs to communicate both with the Office API (whether cloud or local) and with the ERP to get the data to process. Both of these functions are accomplished via RESTful web services. The communication with the ERP must be via web services since requests could originate from the SIG Office App running in the Microsoft Office Online cloud or from the Office App running in local Excel. It isn't possible to use traditional methods of database integration like ODBC when the two may be running in completely different locations.

Basically the Sage cloud based ERP exposes a standard set of Web Services that are protected by Sage ID. When the user starts SIG, they get a Sage ID login prompt from within Excel, and this is transmitted through to the Sage cloud ERP, which uses the Sage ID to look up which tenants this user can run as. This information is relayed back to SIG in case it needs a second prompt to ask which tenant (company) to access.
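To make this flow a little more concrete, here is a minimal Python sketch of what the token and tenant lookup might look like from the client side. The endpoints, parameters and field names are entirely hypothetical (the real Sage ID and Sage cloud ERP web services are not documented here); the point is just the shape of the interaction: log in once, get a token, then ask the ERP which tenants that identity can access.

```python
import requests

# Hypothetical endpoints and field names for illustration only; the real
# Sage ID and Sage cloud ERP web services are not documented here.
SAGE_ID_TOKEN_URL = "https://id.example-sage-cloud.com/oauth/token"
ERP_TENANTS_URL = "https://erp.example-sage-cloud.com/api/v1/tenants"

def get_tenants_for_user(username, password):
    # Exchange the user's Sage ID credentials for an access token.
    token_resp = requests.post(SAGE_ID_TOKEN_URL, data={
        "grant_type": "password",
        "username": username,
        "password": password,
    })
    token_resp.raise_for_status()
    access_token = token_resp.json()["access_token"]

    # Ask the ERP's web services which tenants (companies) this user can access.
    tenants_resp = requests.get(
        ERP_TENANTS_URL,
        headers={"Authorization": "Bearer " + access_token},
    )
    tenants_resp.raise_for_status()
    return tenants_resp.json()

# If more than one tenant comes back, SIG prompts the user to pick one.
```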

Sage Intelligence Go! consists of a server component that runs in the cloud as well as the Office App that runs in Excel. The server portion of SIG uses the provided Web Services to load the data to be reported on from the ERP into its in-memory database for processing. This in-memory database is maintained in the cloud where the SIG service runs. The Office App part of SIG then interacts with the server side to present the required data and perform various processing functions.


ERP Integration

This solution requires that the ERP provide Web Services exposed to the Internet in general, so that they can be accessed by Excel running anywhere, whether installed locally, running on an iPad or running as a web application in the cloud. This means these Web Services need to be secure and available 24×7 with very good availability. For this reason you will see SIG first integrating with Sage cloud based ERPs (like the current Sage One Accounting by Sage Pastel) and later via the Sage Data Cloud, which will offer these services on behalf of on-premises installed ERP systems.

Summary

Sage Intelligence Go! is our solution for cloud reporting. Its primary purpose is to provide Financial Reporting capabilities, but it is also capable of handling other BI type reporting needs. It is how we are moving a lot of reporting functions into the cloud.


SEOS


Introduction

Sage ERP Online Services (SEOS) is a system that was developed alongside the online/cloud versions of Sage 200 ERP (UK) and Sage Morano ERP (Spain). Its purpose is to provide a common set of functionality for all Sage online applications, such as creating new tenants and users, managing audit logs, managing cloud credentials and managing cloud resources. The system also handles common functions such as integration with the various Sage billing systems. Although SEOS was originally developed as part of the Sage 200 and Sage Morano online projects, it was always intended to be a general tool used by all Sage cloud products. SEOS has been in successful production with both Sage 200 and Sage Morano for some time now, and it is being adopted by various other Sage cloud products both in Europe and in North America.

A lot of what SEOS does is behind the scenes, and customers and partners generally don't see it as a separate service. But I thought it might be of interest to know what is, or will be, going on behind the scenes in our various cloud products. This article talks about SEOS in general and doesn't make any claims as to which product will be the next to adopt SEOS or any timelines for such adoption. People using Sage 200 or Sage Morano probably already have some experience with the SEOS portal.

One of the key goals of SEOS is to automate everything so there are no manual processes in creating new tenants or users, but additionally it orchestrates disaster recovery, backup/restore and the auto-scaling of resources.

SEOS Services

There are a number of different ways to interact with SEOS. There is a web portal for the DevOps/IS people who run the system, a web portal for customers and partners, and a number of APIs through which the various ERPs communicate. Some of the main components are shown below.

[Diagram: the main SEOS components]

The main job an ERP has to do to integrate with SEOS is to provide a provisioning engine. This engine runs all the time (actually at least two of them run, for high availability) and checks the SEOS job queue to see if it has anything to do. Tasks it might be asked to perform include creating a new tenant, creating a new user or suspending a tenant.
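As a rough illustration, a provisioning engine is essentially a polling loop against the SEOS job queue. The sketch below is Python with a hypothetical queue endpoint and job format, not the actual SEOS API; it just shows the dispatch pattern described above.

```python
import time
import requests

# Hypothetical SEOS job queue endpoint and job format, for illustration only.
SEOS_QUEUE_URL = "https://seos.example.com/api/jobs/next"

def create_tenant(job): ...     # create databases, app servers, etc.
def create_user(job): ...
def suspend_tenant(job): ...

HANDLERS = {
    "CreateTenant": create_tenant,
    "CreateUser": create_user,
    "SuspendTenant": suspend_tenant,
}

def run_provisioning_engine(auth_headers):
    """Poll the SEOS job queue and dispatch each task to a handler."""
    while True:
        resp = requests.get(SEOS_QUEUE_URL, headers=auth_headers)
        if resp.status_code == 204:      # nothing to do right now
            time.sleep(30)
            continue
        job = resp.json()
        handler = HANDLERS.get(job["type"])
        if handler:
            handler(job)
```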

Bootstrapping the System

Creating a new cloud instance of an ERP that is integrated with SEOS just requires the provisioning engine to be running; it is set up either manually or via some PowerShell scripts. Once the provisioning engine is going, everything else is done by the provisioning engine as it gets messages from SEOS. When it gets its first create tenant message, it will see that no application servers are running and create a couple (for high availability), create the database and do anything else that is needed for that tenant to be able to operate. For the first tenant that could mean creating quite a few storage, database, VM or other resources.

Then as tenants are added, the provisioning engine will be monitoring (with the help of SEOS) the usage of the system and will start additional resources (like application servers), if it determines that the system needs to scale to handle the additional load.

Thus nearly all the work of creating a large cloud system, possibly consisting of hundreds of VMs, is brought about automatically via the SEOS/provisioning engine system.

This then helps with disaster recovery: when SEOS switches to the alternate data center, it handles switching the databases and basically bootstraps the whole process the same way. Note that databases are a bit different, since they are already being replicated to the alternate site and you just want to promote the backup to primary and then use that.

Multi-Cloud

When the provisioning engine requires a new resource, like an Azure SQL database or a new Web Role (VM), it doesn't just go to the cloud API to get it (say using the Azure SDK). Instead it asks SEOS for the resource and SEOS creates it for the ERP. This way the ERP isn't using the native cloud API; it just uses the SEOS API. This opens up the possibility of hosting these cloud ERPs in different clouds.
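Here is a hedged sketch of that idea: instead of calling the Azure SDK, the provisioning engine asks a SEOS resource endpoint for what it needs. Again, the URL and field names are made up for illustration; only the pattern (ask SEOS, get back connection details) reflects the description above.

```python
import requests

# Hypothetical SEOS resource API; the point is that the ERP asks SEOS for a
# resource rather than calling the Azure SDK (or any native cloud API) itself.
SEOS_RESOURCES_URL = "https://seos.example.com/api/resources"

def request_sql_database(tenant_id, auth_headers):
    resp = requests.post(SEOS_RESOURCES_URL, headers=auth_headers, json={
        "tenantId": tenant_id,
        "resourceType": "SqlDatabase",   # SEOS decides which cloud fulfils this
        "size": "S1",
    })
    resp.raise_for_status()
    # SEOS returns the connection details once the resource has been created.
    return resp.json()["connectionString"]
```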

Currently all the SMB ERPs are hosted in the Microsoft Azure cloud, because we get very good pricing on this and it meets our needs extremely well. However we don't want to put all our eggs in one basket, and if conditions change dramatically we can much more easily switch to other providers. There are other reasons we may need to do this; for instance, Azure doesn't have a data center in Africa and we have a lot of customers in Africa, so we may need a provider closer than Singapore.

DevOps

DevOps is the group that runs all the various Sage Cloud offerings (its official name varies from region to region, but the idea is the same). Having DevOps manage dozens of cloud products all working different ways with different maintenance procedures would be a huge challenge. SEOS brings all these aspects together into one set of procedures.

Take logging for example. It's easy for any application to generate huge log files for diagnostic and auditing purposes, and there are lots of good APIs for doing this. But managing these logs is a challenge. The logs need to be archived for future reference. Generally there are several types of logs, like the Windows Event Log, an application diagnostic log and an application security log. All these need to be kept in a central spot, including backing them up. There have to be easy ways to search them, say by tenant, event type or date and time. Many people use Elastic Search to search their logs. SEOS provides a uniform way of managing these logs and automates most of the process, so DevOps only needs to know one way of managing logs rather than a separate way for each ERP, and manual procedures and mistakes are avoided.
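For illustration, one common way to make logs searchable by tenant, event type and date is to emit each entry as a structured JSON record carrying those fields. This isn't the SEOS implementation, just a minimal Python sketch of the general approach.

```python
import json
import logging
from datetime import datetime, timezone

# Each record is one JSON object carrying the fields (tenant, event type,
# timestamp) that you will later want to search on.
logger = logging.getLogger("app.audit")
logger.addHandler(logging.StreamHandler())   # in production this would go to a central store
logger.setLevel(logging.INFO)

def log_event(tenant_id, event_type, message, **extra):
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "tenant": tenant_id,
        "eventType": event_type,
        "message": message,
    }
    record.update(extra)
    logger.info(json.dumps(record))

log_event("tenant-42", "UserLogin", "User logged in", user="jsmith")
```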

Billing Engines

Sage is a large company and operates around the world. Most of our products are charged for, and the backend systems that do billing vary around the world. Various geographic regions have extra regulatory requirements and different business rules. SEOS handles providing usage data to the billing engine via a standard adapter; each region has to write an adapter to interface with SEOS. Removing the burden of interfacing with the Sage billing systems is a huge benefit to the ERP teams, who really don't want to have to deal with these details.

Summary

If you are a partner or customer using Sage SMB Cloud offerings in Europe, then you have probably already seen SEOS and know something about it. If you are using Sage SMB Cloud offerings in other parts of the world, then you will probably start to see SEOS appearing over the coming months. It will probably appear more as features or services, but these are being brought to you by the SEOS global initiative.

Written by smist08

August 9, 2014 at 3:34 pm

Sage Summit 2014


Introduction

I'm just back from Sage Summit 2014, which was held at the Mandalay Bay Resort/Hotel in Las Vegas, Nevada. There were over 5200 attendees at the show, a new record for Sage. The Mandalay Bay is a huge complex and I racked up a record number of steps for GCC getting from one place to another. Las Vegas is easy to get to for most people since there are a lot of direct flights from around North America, and you can find really cheap hotel accommodation near the conference (like $29 at the Excalibur, which is connected to Mandalay Bay by a free tram). The only downside to having the conference in Vegas is that smoking is still allowed in many public places, which is really annoying.

The conference had a great many guest speakers, including quite a few celebrities like Magic Johnson and Jessica Alba. The convention trade show wasn't just booths; there were also open speaking theatres that always had something interesting going on, as well as the Sage Innovation Lab exhibit.

There were a great many product breakout sessions as well as a large number of breakout sessions on general business and technology topics. The intent was to make Sage Summit a place to come to for a lot more than just learning new technical details about your Sage product, or promoting new add-ons for you to purchase. A lot of customers attending the show told me that it was these general sessions on accounting, marketing and technology that they found the most useful.

The show was huge and this blog post just covers a few areas that I was directly involved in or attended.

Great General Sessions

Besides the mandatory Sage keynotes, there were quite a few general sessions which were quite amazing. My favorite was Brad Smith's interview with Biz Stone, the co-founder of Twitter and Jelly. Biz provided a lot of interesting context on web startups, as well as a lot of interesting stories about why he left Google and chose the path he chose. It was striking how many of the successful founders left very secure, lucrative careers to struggle for years to get their dreams off the ground. A common theme was the need for persistence so you could survive long enough to eventually get that big break. Another common theme was to follow people and ideas rather than companies and money. Now I'm going to have to read Biz's book: "Things a Little Bird Told Me: Confessions of the Creative Mind".


Another very popular session was the panel discussion with Magic Johnson, CEO of Magic Johnson Enterprises, Jessica Alba, co-founder of the Honest Company, and J. Carrey Smith, CEO of Big Ass Solutions. This discussion concentrated on their current businesses and didn't delve into the celebrity pasts for which at least two panelists are rather well known. There were a lot of good business tips given, and it was interesting to see how Magic Johnson and Jessica Alba have adapted what they did before to becoming quite successful CEOs.


Sage’s Technology Vision

A lot of Sage's technology and product presentations were about our mobile and cloud technology vision. The theme was to aggressively move into these areas with purposeful innovation that still protects the investment our customers have in our current technologies. At the heart of this vision is the Sage Data Cloud. This acts as a hub that mobile solutions can connect to, as well as a way to access data in our existing products, whether in the cloud or installed on premises. Below is the architectural block diagram showing the main components of this.

[Diagram: architectural block diagram of the main components of the Sage technology vision]

This is perhaps a bit theoretical, but we already have products in the market that are filling in key components of this vision. Some of these are included in the next diagram.

[Diagram: products already filling in key components of the vision]

We use the term "hybrid cloud" quite a bit; it indicates that you can have some of your data on premises and some of your data in the cloud. There are quite a few use cases that people desire. Not everyone is sold on trusting all their data to a cloud vendor for safe keeping, and in some industries and countries there are tight regulatory controls on where your data can legally be located. The Hybrid Cloud box in the diagram includes Sage 50 ERP (US and Canadian editions), Sage 100 ERP and Sage 300 ERP.

To effectively operate mobile and web solutions, you need to have your data available 24×7 with a very high degree of uptime and a very high degree of security. Most small or mid-sized business customers cannot afford sufficient IT resources to maintain this in their own data center. One solution to this problem is to synchronize a subset of your on premise ERP/CRM data to the Sage Data Cloud and then have your mobile solutions access this. Then it becomes Sage's responsibility to maintain the uptime, provide 24×7 support and apply all the necessary security procedures to keep the data safe.

Another attraction for ISVs is to integrate their product with the Sage Data Cloud and then let it handle all the details of integrating with the many Sage ERP products. This way they only need to write one integration rather than separate integrations for Sage 50 ERP, Sage 100 ERP, Sage 300 ERP, Sage 300 CRE, etc.

We had a lot of coverage of the Sage 300 Online offering which has been live for a while now. This was introduced last Summit and now offers Sage 300 ERP customers the choice of installing on premise or running in the Azure cloud. Running in the cloud saves you having to back up your data, perform updates or maintain servers or operating systems. This way you can just run Sage 300 and let Sage handle the details. Of course you can get a copy of your data anytime you want and even move between on premise and the cloud.

The Sage Innovation Lab

At the trade show we had a special section for the Sage Innovation Lab. Here you could play with Google Glass, Pebble watches, 3D printers and all sorts of neat toys, and see some prototypes and experiments that Sage is working on with these. We don't know if these will all be productized, but it's cool to get a feel for what the future might look like.

Summary

This really was Sage Summit re-imagined. There were a great many sessions, keynotes and demonstrations on all sorts of topics of interest to businesses. This should be taken even further for next year's Sage Summit, which will be in New Orleans, LA on July 27-30, 2015. Does anyone else remember all those great CA-Worlds in New Orleans back in the 90s?

Written by smist08

August 2, 2014 at 8:07 pm

Google Mobile Trends


Introduction

In a previous blog post I talked about Apple's mobile directions following their annual developer conference. This past week Google held their annual I/O developer conference in San Francisco, so it seems like a good time to comment on Google's mobile trends. There is a lot of similarity between Apple's and Google's directions, but there are also differences, since each company has a slightly different view on what direction to take. Both companies are very large and have their hands in a lot of pies. As a consequence their announcements tend to be quite diverse and it's interesting to see how they try to unify them all.

New Android

Just as Apple announced a new version of iOS, Google announced a new version of Android, namely "L". I don't know what happened to the cute code names like KitKat or Ice Cream Sandwich, but "L" it is. One big feature they've added is 64 bit support. Apple recently introduced 64 bit processors in their latest phones and now Google devices can catch up. This means that most new higher end phones are going to sport 64 bit quad core processors, which is an amazing amount of computing power in such tiny devices.

As with most new operating systems these days, it includes a new UI look. In the new Android this is called Material Design, and it follows the fashion of a flatter, more austere look.

There are many other new features in the new version of Android which will hopefully make people more productive.

Wearables Everywhere

A big trend, in rumored and announced but generally not yet shipping products, is wearables. Google leads the way here with Google Glass. Then there are the endless smart watch announcements, rumors and even the odd actual product.


I have a Garmin GPS Watch with a heart rate monitor. Combined with the website where I upload the data, this is a great device to track my cycling and running. It would be nice if its battery lasted longer and if it was more waterproof so I could swim with it. In the meantime there are many great apps to do the same sort of things with your phone. These all do more than the Garmin watch, but the phone is bulkier and less durable in damp environments.

Having a waterproof watch that can do more than my Garmin and has a longer battery life would be fantastic.

Although it is early days, both Google and Apple see fitness metrics and tracking as a huge potential market. Both companies are partnering with others and developing new sensors to measure new things, like small waterproof sensors for swimming or unique ways to measure other sports such as golf swings. Similarly they are measuring all sorts of biometric data beyond heart rate, including blood pressure, blood glucose levels and all sorts of other things. Eventually these will morph into a Star Trek like medical tricorder.

Home Automation

Google has now purchased a number of home automation companies, including Nest, and is integrating these with Android to provide full control of everything in your home from your phone, including remotely setting the thermostat, receiving smoke detector alerts and monitoring security cameras. Most of these things are available now as separate discrete components, but Google is working especially hard to make this whole area much more unified.


Online Cars

Another big area of interest is integrating into cars. Already most cars can interface with iPhones and Android phones to make calls hands free and play music. Now the goal is to sell Android (and iOS) into the auto industry, to provide better, more connected GPS (with real time traffic updates) and access to all of your music library. Further, many car companies are enabling the use of your car as a Wi-Fi hotspot.

I'm not sure how far this should go, since it all gets very distracting. We already have so many potential distractions in cars, and texting alone is causing many accidents.

Everything in the Cloud

With all of Google's products, the emphasis is on storing all data in the cloud. They will only store things on local devices if there is a huge outcry from people who need to work offline (like on airplanes). Chromebooks really showed that this was possible, and Google has led the way in offering lots of free cloud storage and making sure everything they do interacts seamlessly with these cloud documents.

They tout the convenience of having everything always backed up, so if your laptop is stolen or destroyed you don't lose anything. However critics worry about the privacy implications of storing sensitive data under someone else's control, especially a search provider's. It's rather scary to corporate compliance officers that sensitive corporate documents might start showing up in people's search results. Often this wouldn't be due to Google doing anything malicious, but rather to someone misclicking the visibility of a document and allowing it to be viewed by anyone, where anyone means anyone on the Internet (and not just anyone who finds it on the corporate network).

All that being said, Google, Apple and Microsoft are all pushing this model like mad, and a lot of innovations that are in the pipeline completely rely on the adoption of this philosophy. It certainly is convenient to have all your photos and videos automatically uploaded and not to have to worry about sync’ing things by hand anymore.

Big Data

Google really started the big data revolution when they published the details of the Map Reduce algorithm that the original Google search was built upon. This then spawned a huge industry of open source tools and databases all built around this algorithm. Map Reduce was revolutionary since it let people get instant results when searching over everything. Basically it worked by building a database of everything people might search for, like a giant cache that could return results instantly.
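As a toy illustration of the map/reduce idea (not Google's actual implementation), here is the classic word count expressed as a map phase that emits (word, 1) pairs and a reduce phase that sums them. Google's contribution was running these two simple steps across thousands of machines.

```python
from collections import defaultdict

# Map emits (word, 1) pairs from each document; reduce sums the counts.
def map_phase(documents):
    for doc in documents:
        for word in doc.lower().split():
            yield (word, 1)

def reduce_phase(pairs):
    counts = defaultdict(int)
    for word, count in pairs:
        counts[word] += count
    return counts

docs = ["the quick brown fox", "the lazy dog", "the fox"]
print(reduce_phase(map_phase(docs)))   # {'the': 3, 'fox': 2, ...}
```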

The limitation of Map Reduce is that constructing queries is quite difficult and often requires the database to be rebuilt in a different way. If you don’t do that, although the main query the database solves returns instantly, any other query takes a week to process.

Google is now claiming that Map Reduce, and the whole industry based on it such as Hadoop, is completely obsolete. They were heavily promoting their new Cloud Dataflow service, where they claim the service can do efficient real time analytics as well as preserving the performance of the main functionality.

It will be interesting to see what this new service can really do and whether it will really threaten all the various NoSQL databases like MongoDB.

Summary

There are a lot of interesting things going on in the mobile world. It will be interesting to see if all our phones are replaced by watches or glasses in a couple of years. It will be interesting to see what great things come of all these new cloud big data services.

Written by smist08

June 28, 2014 at 5:28 pm

Troubleshooting Problems in the Cloud


Introduction

Imagine that you are managing a large cloud based solution with lots of moving parts, thousands of users and fairly complicated business logic going on all the time. Now an irate customer phones into support complaining that something bad happened. Sometimes from the explanation you can figure out what happened quite easily. Sometimes what the user described sounds completely inexplicable and impossible. Now what do you do to troubleshoot the problem?

Easy Problems

First let’s review how easy problems are solved and can be dealt with quickly:

  • From the description of the problem, the programmer goes, "oh, of course we should have thought of that," and on reviewing the code finds an easy fix which can be deployed quickly. (Too bad this doesn't happen more often.)
  • The programmer scratches his head, but carefully goes through the customer’s steps and reproduces the bug on his local system, so he can easily investigate in the debugger and solve the problem.
  • The problem is a bit of a mystery, so a programmer and a support analyst use GoToMeeting to watch the customer work. On watching them, they realize what this customer is doing differently to cause the problem, and then the programmer can reproduce and debug it.
  • On examining the standard application logging, an unhandled exception is found in the log around the time the customer reported the problem and from this the code can be examined and the cause of the exception can be determined and fixed.

These are the standard and generally easy problems to fix. But what happens when these methods don’t yield any results?

Harder Problems

Here are some examples of harder problems that can drive people crazy.

  • The problem happens infrequently and can’t be consistently reproduced. But it happens frequently enough that customers are getting rather annoyed.
  • The system works fine most of the time, but every now and again, the system goes berserk for some reason. Then it just magically goes back to normal, or you need to restart the servers to get things back on track.
  • The problem only happens to certain users, perhaps at certain times. You can watch it happening to them via GoToMeeting, but can’t for the life of you get it to happen to you or to happen in your test environment and many other users never experience this.
  • Often problems aren’t outright bugs, but other strange behaviors, like the whole system suddenly slowing down for no apparent reason and then some time later it goes back to normal, again for no apparent reason.

A lot of the time these sorts of things are due to interactions between what people are doing in different parts of the system. Predicting and testing all these situations is very difficult, and they are often a result of emergent phenomena, which we talk about a bit later.

Preventative Measures

Generally the only way to solve the hard problems is to instrument your web application so you know exactly what is going on. Not only does this help with solving hard problems, it also helps make your web site a better experience for users, since you can log what they do, how long it takes and where they are getting stuck. Often usability problems are more serious to users than program failures or other bugs.

Instrumenting your application generally means having good logging and making sure you log a lot of metrics that you can monitor. It can also mean having extra APIs into the running application where you can inquire on the state of various components, for instance to find out how many objects of a given type are currently in use. Often your infrastructure, like your web server, will do a lot of logging for you, so be aware of this as there is no need to log things twice.

Having a dashboard to track these metrics is very helpful. The dashboard can both make reports and graphs from the application logs and use the application's diagnostic API to provide useful information about what is happening. Often you can integrate with third party vendors like New Relic or Microsoft Application Insights, but one way or another you need this.


One objection to logging a lot of information is that doing so can slow things down so much that it becomes unacceptable. This is certainly true if you are logging synchronously to a file (i.e. not continuing on until the data is written). But modern systems get around this problem by logging asynchronously: the system doing the logging just fires the log message at a listener and goes on without waiting for or needing a reply. This makes logging much less of a burden on the application. It moves the work to the listener application that writes out the messages, and if that gets too far behind it usually starts dropping messages. Most common logging infrastructures, like Microsoft's ETW, already provide this ability for you to take advantage of.
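As an example of the asynchronous pattern (using Python's standard library rather than ETW, purely for illustration), the application hands records to an in-memory queue and a background listener does the slow file writes:

```python
import logging
import logging.handlers
import queue

# A minimal sketch of asynchronous logging with the standard library.
log_queue = queue.Queue(-1)                      # unbounded in-memory queue

# The application only pays for a cheap enqueue...
logger = logging.getLogger("app")
logger.addHandler(logging.handlers.QueueHandler(log_queue))
logger.setLevel(logging.INFO)

# ...while a listener thread does the slow file I/O in the background.
file_handler = logging.FileHandler("app.log")
listener = logging.handlers.QueueListener(log_queue, file_handler)
listener.start()

logger.info("Order %s posted in %d ms", "INV-001", 42)

listener.stop()                                  # flush on shutdown
```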

Some things you must track via APIs or logging:

  • Any resource used in the system. Java and C# programmers often think they can’t have leaks because of garbage collection, but garbage collection isn’t very smart and often important resources will leak due to unexpected references or a circular dependency. In a 24×7 web application that needs to essentially run forever in a busy state, this is incredibly important.
  • Generally what users are doing, so you know what the possible interactions might be and what are the concurrent things you may need to do to replicate the problem.
  • Any exceptions or errors thrown by the program. This tends to be a given, but make sure you include programmatically handled errors since these can be useful as well.
  • Make sure you log performance metrics; if something takes longer than expected, log it (see the sketch after this list).
  • Any calls to external systems and the results (with performance stats).
  • Assert type conditions where the program finds itself in an unexpected state (i.e. make sure these get logged and are not compiled out for production).
  • Anything the programmer considers a sensitive area of the program that they are worried about.
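Here is the small sketch referenced in the performance-metric bullet above: a Python timing helper that logs how long an operation took and escalates the log level when it exceeds an expected threshold. The operation names and thresholds are invented for the example.

```python
import logging
import time
from contextlib import contextmanager

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("app.perf")

@contextmanager
def timed(operation, expected_ms=500):
    # Time the wrapped block and log it, warning if it was slower than expected.
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000
        level = logging.WARNING if elapsed_ms > expected_ms else logging.INFO
        logger.log(level, "%s took %.1f ms", operation, elapsed_ms)

with timed("PostInvoice", expected_ms=200):
    time.sleep(0.05)   # stand-in for the real work, e.g. a call to an external system
```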

There are also some things that must not be logged; these include any sort of passwords, decryption keys or sensitive customer data (i.e. nearly all customer data). Generally a lot of people need to work with the logs, and there can't be any sensitive information in them as this would be considered a major security problem.

Make sure you have some good software (either home grown, open source or commercial) to search and analyze your logs. These logs will get incredibly big and will need to be carefully managed (usually archiving them each day). There are many good packages to make this chore far easier.

Emergent Behaviors

Emergent behavior refers to complex unexpected behaviors arising from large sets of much simpler processes. Modern web applications are getting bigger and bigger, with many moving parts and many interactions between different systems and subsystems. We are already starting to see emergent behavior arising. Nothing on the scale of the system suddenly becoming intelligent, but on the scale where predicting what will happen in all cases has become impossible. It doesn't matter how much QA you apply; chaos theory and mathematics quite clearly state that the system is beyond simple prediction. That doesn't mean it is unstable. If done properly your system should still be quite stable, meaning small changes won't cause radically different things to happen.

The key point is to keep this in mind when building your system, and to make sure you have plenty of instrumentation in place so that even if you can't predict what will happen, you can still see what is happening and act on it.

Summary

Diagnosing and solving hard problems in a large web application with thousands of concurrent users can be a lot of fun and very challenging. Having good preventative measures in place can make life a lot easier. You don’t want to be continually pushing out newer versions of your application just to add more logging because you can’t figure out what is going on.

Apple Mobile Trends


Introduction

With Apple’s WWDC conference just wrapping up, I thought it might be a good time to meditate on a few of the current trends in the mobile world. I think the patent wars are sorting themselves out as Google and Apple settle and we are seeing a lot more competitive copying. Apple added a lot of features that competitors have had for a while as well as adding a few innovations unique to Apple.

The competitive fervor being shown in both the Google and Apple mobile camps is impressive and making it very hard for any other system to keep up.


Cloud Storage

Apple has had iCloud for a while now, but with this version we are really seeing Apple leverage it. When Google introduced the Chromebook, they used a video to show the power of keeping things in the web. This idea has been copied somewhat by Microsoft. But now Apple has taken it to the next level by allowing you to continue from device to device seamlessly, so you can easily start an e-mail on your phone and then continue working on it on your MacBook. No more having to e-mail things to yourself; it all just works seamlessly.

Apple also copied some ideas from Google Drive and DropBox to allow copying files across non-Apple devices like Windows as well as sharing documents between applications. So now this is all a bit more seamless. It’s amazing how much free cloud storage you can get by having Google, Microsoft, Apple and Dropbox accounts.

Generally this is just the beginning, as companies figure out neat things they can do when your data is in the cloud. If you are worried about privacy or the NSA reading your documents, you might try a different solution, but for many things the convenience of this outweighs the worries. Perhaps a bigger worry than the FBI or NSA is how advertisers will be allowed to use all this data to target you. Apple has added some features to really enable mobile advertising; whether these become too intrusive and annoying has yet to be seen.

Copying is the Best Compliment

Apple has also copied quite a few ideas from Google, Blackberry and Microsoft into the new iOS. There is a lot more use of transparency (as introduced in Windows Vista). There is now a customizable and predictive keyboard, adding ideas from Blackberry and Microsoft; keyboard entry has been one of Apple's weaknesses and this is an attempt to address it. Similarly the drive option in iCloud is rather late to the game.

Apps versus the Web

There is a continuing battle between native applications and web applications for accessing web sites. People often complain that the native mobile application only gives them a subset of what is available on the full web site, but then on the other hand the consensus is that the native mobile apps give a much better experience.

True web applications give a unified experience across all devices and give the same functionality and the same interaction models. This is also easier for developers since you only need to develop once.

However Apple is having a lot of success with apps. Generally people seem to find things easier in the Apple App store than in browsing and bookmarking the web. Apple claims that over half of mobile Internet traffic is through iOS apps now (but I’m not sure if this is skewed by streaming video apps like Netflix that use a disproportionate amount of bandwidth).

Yet another Programming Language

Apple also unveiled a new programming language, Swift, for mobile development. One of the problems with C, C++ and Objective-C programming is the use of pointers, which tends to make programming harder than it needs to be. Java was an alternative object oriented extension of C with no pointers, and then C# came along copying much of Java. Most scripting languages like Basic, JavaScript, Python, etc. have never had pointers.

Rather than go down the road of Java and C#, Swift has tried to incorporate the ease of use of scripting languages, but still give you full control over the iOS API. How this all works out is yet to be seen, but it will be interesting if it makes iPhones and iPads really easy to program similar to the early PCs back in the Basic days.


The Internet of Things

Apple introduced two new initiatives, Health Kit and Home Kit. Health Kit is mostly to encourage adding medical sensing devices to your iPhone, whereas Home Kit is to extend iOS into devices around the home and to control them all from your iPhone.

Health Kit is designed to gather all your health related information in one central place. There is getting to be quite a catalog of sensors and apps to continuously track your location, speed, heart rate, pulse, blood pressure, etc. If you are an athlete, this is great information on your fitness level and how you are doing. Garmin really pioneered this with their GPS watches with attached heart rate monitors. I have a Garmin watch and it provides a tremendous amount of information when I run or cycle. I don't think this is much use for the iPhone, which I always leave behind since I don't want to risk it getting wet, but this might really take off if Apple really does release a smart watch this fall like all the rumors say.

Home Kit is a bit of a reaction to Google buying Nest, the maker of the intelligent thermostat. Basically you can control all your household items from your phone, so you can warm up the house as you are driving home, or turn the lights on and off remotely. We have a cottage with in-floor heating; it would be nice if we could remotely tell the house to start heating up in the winter a few hours before we arrive, since right now it's a bit cold when we first get there and turn on the heat. However, with zoned heating we would need four thermostats, and at $250 each that is rather excessively expensive. I think the price of these devices has to come down quite a bit to create real adoption.

There is a lot of concern about having all of these hacked and interfered with, but if they get the security and privacy correct, then these are really handy things to have.

Summary

Apple has introduced some quite intriguing new directions. Can Swift become the Basic programming language for mobile devices? Will Health Kit and Home Kit usher in a wave of new wonderful intelligent devices? Will all the new refinements in iOS really help users have an even better mobile experience? Will native apps continue to displace web sites, or will web sites re-emerge as the dominant online experience? Lots of questions to be answered over the next few months, but it should be fun playing with all these new toys.

Written by smist08

June 7, 2014 at 4:27 pm

Elastic Search


Introduction

We've been working on an interesting POC recently that involved Google like search. After evaluating a few alternatives we chose Elastic Search. Search is a fascinating topic, often associated with Big Data, NoSQL and all sorts of other technologies. In this article I'm going to look at a few aspects of Elastic Search and how you might use it.


Elastic Search is an open source search product based on Apache Lucene. It’s all written in Java and installing it is just a matter of copying it to a directory and then as long as you already have Java installed, it’s ready to go. An alternative to Elastic Search is Apache Solr which is also based on Lucene. Both are quite good, but we preferred Elastic Search since it seemed to have added quite a bit of functionality beyond what Solr offers.

Elastic Search

Elastic Search is basically a way of searching JSON documents. It has a RESTful API and you use this to add JSON documents to be indexed and searched. Sounds simple, but how is this helpful for business applications? There is a plugin that allows you to connect to databases via JDBC and to set these up to be imported and indexed on a regular schedule. This allows you to perform Google like searches on your database, only it isn't really searching your database; it's searching the data it extracted from your database.
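For a feel of the API, here is a minimal Python sketch that indexes one JSON document and searches it back, assuming a local Elastic Search node on localhost:9200 and using the index/type/id style URLs of the 1.x releases current at the time of writing. The index and field names are invented for the example.

```python
import requests

ES = "http://localhost:9200"          # a local Elastic Search node

# Index a JSON document (index "customers", type "customer", id 1).
doc = {"name": "Stephen Smith", "city": "Vancouver", "balance": 1200.50}
requests.put(ES + "/customers/customer/1?refresh=true", json=doc)

# Search it back with a simple match query.
query = {"query": {"match": {"name": "smith"}}}
resp = requests.post(ES + "/customers/_search", json=query)
for hit in resp.json()["hits"]["hits"]:
    print(hit["_score"], hit["_source"]["name"])
```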

Web Searches

This is similar to how web search services like Google work. Google dispatches thousands of web crawlers that spend their time crawling around the Internet and adding anything they find to Google's search database. This is the real search process. When you do a search from Google's web site, it really just does a read on this database (it actually breaks up what you are searching for and does a bunch of reads). The bottom line, though, is that when you search, it is really just doing a direct read on its database, and this is why it's so fast. You can see why this is Big Data, since this database contains the results for every possible search query.

This is quite different than a relational database where you search with SQL and the search goes out and rifles through the database to get the results. In a SQL database putting the data into the database is quite fast (sometimes) and then reading or fetching it can be quite slow. In NoSQL or BigData type databases much more time goes into adding the data, so that certain predefined queries can retrieve what they need instantly. This means the database has to be designed ahead of time to optimize these queries and then often inserting the data takes much longer (often because this is where the searching really happens).

Scale Out

Elastic Search is designed to scale out from the beginning; it automatically does most of the work of starting and creating clusters. It makes adding nodes with processing and databases really easy, so you can easily expand Elastic Search to handle your growing needs. This is why you find Elastic Search as the engine behind so many large Internet sites like GitHub, StumbleUpon and Stack Overflow. A big part of Elastic Search's popularity is how easy it is to deploy, scale out, monitor and maintain, certainly much easier than deploying something based on Hadoop.


Analyzers

When you index your data, it's fed through a set of analyzers which do things like convert everything to lower case, split sentences up into words, reduce words to their roots (walking -> walk, Steve's -> Steve), deal with special characters, handle other language peculiarities, etc. Elastic Search has a large set of configurable analyzers so you can tune your search results based on knowledge of what you are searching.
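As an example of configuring analyzers (again assuming a local 1.x node, with invented index and field names), the settings below define a custom analyzer that lower-cases terms and stems them to their roots, and apply it to the name field:

```python
import requests

ES = "http://localhost:9200"

# Create an index whose analyzer lower-cases terms and stems them to their
# roots (so "walking" matches "walk"), as described above.
settings = {
    "settings": {
        "analysis": {
            "analyzer": {
                "english_stems": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "porter_stem"],
                }
            }
        }
    },
    "mappings": {
        "customer": {
            "properties": {
                "name": {"type": "string", "analyzer": "english_stems"}
            }
        }
    },
}
requests.put(ES + "/customers", json=settings)
```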

Fuzzy Search

One of the coolest features is fuzzy search: you might not know exactly what you are searching for, or you might spell it wrong, and Elastic Search magically finds the correct values. When ranking the matches, Elastic Search uses something called Levenshtein (edit) distance to rank which values give the best results. The real trick is how Elastic Search does this without going through the entire database computing and ranking this distance for everything. The answer is that it applies some sophisticated transformations to what you entered to limit the number of reads it needs to do to find matching terms; combined with the good analyzers above, this turns out to be extremely effective and very performant.
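A quick sketch of what a fuzzy query looks like against the hypothetical customers index from the earlier examples: the search term is misspelled, but documents containing "smith" still come back, ranked by edit distance.

```python
import requests

ES = "http://localhost:9200"

# "smiht" is misspelled, yet the fuzzy match still finds "smith".
query = {
    "query": {
        "match": {
            "name": {
                "query": "smiht",
                "fuzziness": "AUTO",
            }
        }
    }
}
resp = requests.post(ES + "/customers/_search", json=query)
for hit in resp.json()["hits"]["hits"]:
    print(hit["_score"], hit["_source"]["name"])
```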

Real Time

Notice that since these search engines don't search the data directly, they won't be real time. The search database is only updated infrequently, so if data is being rapidly added to the real database, it won't show up in these types of searches until the next update processes it. Often these synchronization updates are only performed once per day. You can tune them to be more frequent, and you can write code to insert critical data directly into the search database, but generally it's easier to just let it be updated on a daily cycle.

Security

When searching enterprise databases, some care has to be taken in applying security, ensuring the user has the right to search through whatever they are searching. Controls have to be added so that the enterprise API can only search indexes that the user has the rights to see. Even if the search results don't display anything sensitive, they could still leak information. For instance, if you don't have rights to see a customer's orders but can search on them, then you could probably figure out how many orders each customer places, which could be quite useful information.

Certainly when returning search results you wouldn't reveal anything sensitive like, say, salaries; to get these you would need to click a link to drill down into the application's UI, where full security screening is done.

Summary

Elastic Search is a very powerful search technology that is quite easy to deploy and quite easy to integrate into existing systems. A lot of this is due to a powerful RESTful API to perform operations.