Stephen Smith's Blog

Musings on Machine Learning…

Voice Input and Concierge Services



Some of the most exciting new technologies appearing on mobile phones involve voice recognition and concierge or personal assistant type applications. These include ambitious applications like Apple’s Siri, along with a number of initiatives from Google, including Google Now and Google Voice Search.

Voice recognition by itself is a truly amazing technology, but it is only a fraction of the story. After the voice input is recognized, the service combines the query with other input, like your location, to establish the context for what you are asking about. It then identifies the problem domain and returns a meaningful answer, along with relevant data, in response to your query.

Of all the technologies on Star Trek, we don’t see any sign of a working warp drive or transporter, but we do seem to have the ability to ask a computer about almost any topic and get a good answer. So perhaps if Star Trek IV had been set another ten years ahead, Scotty wouldn’t have had any trouble interacting with our primitive computers.


Device or Service?

A common but incorrect assumption is that you can integrate apps running on your phone with these services. This is the wrong way to think about how they work. They aren’t voice recognition/query engines running on your device. In fact, they send the (nearly) raw input to a major data center for processing. Even though there isn’t a device API for accessing Siri, developers have found clever ways around this, such as putting special entries in the contact list and constructing special text messages, but this is really just using Siri as voice recognition software. The real intent of Siri is much deeper; it’s really a task completion engine.

These engines take your voice input and map it to various problem domains, which then talk to many APIs on the backend. The goal isn’t to run an app and provide a voice recognition engine that translates voice commands into regular app commands, as if the user had typed them. The goal is that you don’t need device apps at all. When you ask Siri a question, you don’t need a matching app running. If you ask about airline info, it gets it; if you ask about weather, it gets it. You don’t need to run the right app.
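As a rough illustration of mapping a recognized query to a problem domain and then to a backend, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the domain names, the keyword lists and the two backend functions are hypothetical, and real engines use far richer language understanding than keyword matching.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical backend lookups; a real service would call remote APIs here.
def weather_backend(query: str, location: str) -> str:
    return f"Weather forecast for {location}"

def flight_backend(query: str, location: str) -> str:
    return f"Flight departures near {location}"

Backend = Callable[[str, str], str]

# Each problem domain is identified by a few trigger words (an assumption).
DOMAINS: Dict[str, Tuple[Tuple[str, ...], Backend]] = {
    "weather": (("weather", "rain", "forecast"), weather_backend),
    "flights": (("flight", "airline", "departure"), flight_backend),
}

def identify_domain(query: str) -> Optional[str]:
    """Return the first domain whose trigger words appear in the query."""
    words = [word.strip("?.,!") for word in query.lower().split()]
    for name, (keywords, _backend) in DOMAINS.items():
        if any(word.startswith(key) for word in words for key in keywords):
            return name
    return None

def answer(query: str, location: str) -> str:
    """Route recognized text plus context (location) to a domain backend."""
    domain = identify_domain(query)
    if domain is None:
        return "Sorry, I can't help with that yet."
    _keywords, backend = DOMAINS[domain]
    return backend(query, location)
```

Calling answer("What airlines fly today?", "Vancouver") routes the question to the flights backend without any flight app being installed on the device, which is the point of the task completion approach.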

In a way, a limitation of current mobile phones is the need to download and install so many apps. Do you really need all of these? Most of the apps on my phone are specialized information gathering apps for things like weather and news. The real beauty of these new personal assistant type applications is that they eliminate the need for all these other apps. Wouldn’t a phone or tablet be much easier to use if you didn’t need to find and install all these apps? Isn’t this the original appeal of the Internet to PC users? You didn’t need to install dozens of applications (which had become more and more painful); all you needed was a Browser and nothing else. To some degree these personal assistant applications become a workable Browser for mobile devices, where you no longer need all these apps. Sure, there are some special purpose apps for playing games and performing specialized functions, but generally you can just use Siri, Google Voice Search or Google Now for most things that you probably use Apps for now. Sure, these aren’t perfect yet, just like the original Netscape Browser wasn’t perfect, but they are getting there very quickly.

Integrating to ERP and CRM

OK, so we don’t integrate with these new services via Apps talking to APIs on devices. So if we want to integrate our CRM or ERP into, say, Siri, how do we do it? Suppose we want to ask Siri for the status of an Order from a vendor, or for the credit limit of a customer we are about to visit.

The key is to have this information available on the Internet via RESTful Web Services like SData. The reason for using RESTful Web Services is that they allow discovery by search engine spiders. Generally, a shorter URL returns a list of what can be appended to it to build longer URLs, which allows a general engine to walk the URLs and discover all the data. RESTful Web Services have become a widely used Internet convention, and these services are built to interact with them.
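To make the discovery idea concrete, here is a small Python sketch of a spider walking a RESTful service from its root URL. It is only an illustration under stated assumptions: the host name, the resource names and the FAKE_SERVICE dictionary (which stands in for real HTTP responses) are made up, and the URLs do not reproduce the real SData URL layout.

```python
from typing import Dict, List

# Stand-in for HTTP GET responses: each URL maps to the child path segments
# it advertises. All names here are hypothetical.
FAKE_SERVICE: Dict[str, List[str]] = {
    "https://erp.example.com/sdata": ["customers", "orders"],
    "https://erp.example.com/sdata/customers": ["C100", "C200"],
    "https://erp.example.com/sdata/orders": ["ORD1"],
}

def fetch_children(url: str) -> List[str]:
    """Return the segments a URL lists; leaf resources list nothing."""
    return FAKE_SERVICE.get(url, [])

def discover(root: str) -> List[str]:
    """Walk from the root, building longer URLs from each shorter one."""
    found: List[str] = []
    pending = [root]
    while pending:
        url = pending.pop()
        found.append(url)
        for child in fetch_children(url):
            pending.append(f"{url}/{child}")
    return sorted(found)
```

Starting from the single root URL, discover reaches every customer and order URL without knowing any of them in advance, which is what lets a general engine index the data.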

The key is for vendors (like Sage) to make the right agreements with these services, so that the data can be accessed in a secure way and you aren’t doing something like exposing all your ERP data to the Internet in general. Security and the rules for who can access what are crucial. Standard authorization mechanisms like OAuth are going to have to be used.
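The "who can access what" rule can be pictured as a scope check on each request. The Python sketch below is not an OAuth implementation; it assumes tokens have already been issued by some authorization server, and the token strings, scope names and resource names are all hypothetical.

```python
from typing import Dict, Set

# Hypothetical token table. In a real OAuth deployment an authorization
# server issues and validates tokens; this dictionary only imitates the result.
TOKEN_SCOPES: Dict[str, Set[str]] = {
    "token-sales-rep": {"customers:read"},
    "token-purchasing": {"orders:read", "vendors:read"},
}

def required_scope(resource: str) -> str:
    """Scope needed to read a resource (an assumed naming convention)."""
    return f"{resource}:read"

def can_read(token: str, resource: str) -> bool:
    """Allow the read only if the token carries the matching scope."""
    return required_scope(resource) in TOKEN_SCOPES.get(token, set())
```

With these assumptions a sales rep’s token can read customer credit limits but not vendor orders, and an unknown token can read nothing.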

The other thing is that all this data must be in a central location. This means that any ERP or CRM data that is going to be available to these services must be sync’ed to a central cloud location. This fits in with Sage’s connected services strategy of sync’ing key on-premise data to the cloud (of course, if you are already running your CRM or ERP in the cloud then you can skip this step). I blogged about Sage’s Hybrid Cloud here. From Sage’s Hybrid Cloud we can expose the correct data via SData Web Services for anyone who wants to participate in these services. Then Sage can make the correct deals with the services and is responsible for ensuring that all the security concerns are set up correctly.

This can then let a company’s employees and customers make general inquiries of these services. For the right questions, the service maps the inquiry to a problem domain in the ERP or CRM space, and the backend systems provide answers with relevant data added from the Hybrid Cloud.

None of these services would look into the Hybrid Cloud in real time. They all operate like Search Engines, which continuously poll sites and update their master databases. For performance reasons, the real queries are then handled as highly optimized Big Data queries against a master search database, so that questions appear to be answered instantly.
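The poll-then-query pattern can be sketched in a few lines of Python. This is a toy model under stated assumptions: the MasterIndex class, the record keys and the hybrid_cloud dictionary are hypothetical stand-ins, and a real service would poll over the network and handle deleted records, which this sketch does not.

```python
from typing import Dict

class MasterIndex:
    """Toy master database that answers queries from its own copy."""

    def __init__(self) -> None:
        self._records: Dict[str, str] = {}
        self.polls = 0

    def poll(self, source: Dict[str, str]) -> None:
        """Copy the source's current records into the index (one pass)."""
        self._records.update(source)
        self.polls += 1

    def query(self, key: str) -> str:
        """Answer from the index only; the source is never contacted here."""
        return self._records.get(key, "unknown")

# Hypothetical data sync'ed to the cloud.
hybrid_cloud = {"order:ORD1": "shipped"}

index = MasterIndex()
index.poll(hybrid_cloud)

hybrid_cloud["order:ORD1"] = "delivered"  # the source changes after the poll
stale = index.query("order:ORD1")         # answered from the older copy

index.poll(hybrid_cloud)                  # the next poll picks up the change
fresh = index.query("order:ORD1")
```

The trade-off is visible in the two queries: answers come back without touching the Hybrid Cloud, but they are only as current as the last poll.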

Over time, the questions answered can become more and more sophisticated, incorporating more and more sources of business data. Perhaps you can ask Siri: What’s the best way to increase my company’s revenue? And then get back a useful answer.


I think these personal assistant type applications are going to become more and more prevalent in the mobile world (or even on regular computers). To me it’s exciting to consider participating in this and to think about all the questions that we can help answer.


