Stephen Smith's Blog

Musings on Machine Learning…


Some Thoughts on Security


Introduction

The recent Heartbleed exploit in the OpenSSL library has focused a lot of attention on how vulnerable our computer systems have become to data theft. With so much data travelling over the Internet and over wireless networks, the security of these systems has never been more important. And with the general move towards an Internet of Things, all of our devices, whether our fridge or our car, become possible targets for hackers.

I'll talk about Heartbleed a bit later, but first a bit of history from my own experiences with secure computing environments.

Physical Isolation

My last co-op work term was at DRDC Atlantic in Dartmouth, Nova Scotia. In order to maintain security, they had a special mainframe for handling classified data and performing classified processing. This computer was located inside a bank vault along with all its disk drives and tape units. It was only turned on after the door was sealed, and it was completely cut off from the outside world. Technicians were responsible for monitoring the vault from the outside to ensure that there was absolutely no leakage of RF radiation while classified processing was in progress.

After graduating from university, my first job was with Epic Data. One of the projects I worked on was a security system for a General Dynamics fighter aircraft design facility. The entire building was built as a giant Faraday cage. The entrances weren't sealed, but you had to travel through a twisty corridor to enter the building, ensuring there was no straight line for radio waves to escape. Surrounding the building was a large protected parking lot where only authorized cars were allowed in.

Generally these facilities didn't believe you could secure connections with the outside world: if such a connection existed, then no matter how good the encryption and security measures, a hacker could penetrate it. The hackers they were worried about weren't just bored teenagers living in their parents' basements, but well-trained, well-financed hackers working for foreign governments, something like the Russian or Chinese version of the NSA.

Van Eck Phreaking

A lot of attention goes to securing Internet connections, but historically data has been stolen through other means. Van Eck phreaking is a technique for listening to the RF radiation from a CRT or LCD monitor and reconstructing the image from that radiation. Using this sort of technique, a van parked on the street with sensitive antenna equipment can reconstruct what is being viewed on your monitor, even though you are using a wired connection from your computer to the monitor. In this case, how up to date your software is or how secure your cryptography is just doesn't matter.

Everything is Wireless

It seems that every now and then politicians forget that cell phones are really just radios and that anyone with the right sort of radio receiver can listen in. This seems to lead to a scandal in BC politics every couple of years. It is really just a reminder that unless something is specifically marked as using a secure connection or cryptography, it probably isn't secure, and if it isn't, anyone can listen in.

It might seem that most communications are secure nowadays. Even Google search has switched to always using https, a securely encrypted channel that keeps your search terms a secret between you and Google.

But think about all the other communication channels in use. If you use a wireless mouse or a wireless keyboard, these are really just short-range radios. Is that communication encrypted and secure? Similarly, if you use a wireless monitor, it's even easier to eavesdrop on than Van Eck phreaking.

What about your Wi-Fi network? Is that secure? Or is all non-https traffic easy to eavesdrop on? People are getting better and better at hacking into Wi-Fi networks.

And in your car, if you are using your cell phone via Bluetooth, is that another place where eavesdropping can occur?

Heartbleed

Heartbleed is an interesting bug in the OpenSSL library that has caused a lot of concern recently. The following XKCD cartoon gives a good explanation of how a bug in validating an input parameter led to leaking a lot of data to the web.

[Image: XKCD cartoon explaining the Heartbleed bug]

At the first level, any program that receives input from untrusted sources (i.e. random people out on the Internet) should very carefully and thoroughly validate that input. In the heartbeat request you tell the server what to echo back and the length of that reply. If you claim a length much longer than the data you actually sent, the server leaks whatever random contents of memory happened to be located there.
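To make the mechanics concrete, here is a minimal Python sketch of that missing bounds check. This is purely illustrative and not OpenSSL's actual C code; the memory layout, secrets and function names are all made up for the example.

```python
# Conceptual sketch of a Heartbleed-style bug (illustrative only, not OpenSSL code).
# The server's memory is simulated as one flat buffer where the incoming request
# payload happens to sit right next to unrelated secret data.

SECRETS = b"user=admin;password=hunter2;private_key=..."

def handle_heartbeat(memory: bytes, payload_offset: int, claimed_length: int) -> bytes:
    # Vulnerable version: trusts the length claimed by the client and never
    # checks it against the payload actually received, so it can read past
    # the payload into adjacent memory.
    return memory[payload_offset:payload_offset + claimed_length]

def handle_heartbeat_fixed(memory: bytes, payload_offset: int,
                           claimed_length: int, actual_length: int) -> bytes:
    # Fixed version: drop requests whose claimed length doesn't match the
    # payload that was really sent.
    if claimed_length > actual_length:
        return b""
    return memory[payload_offset:payload_offset + claimed_length]

if __name__ == "__main__":
    payload = b"HAT"                     # the client really sent 3 bytes...
    memory = payload + SECRETS           # ...which happen to sit next to secrets
    print(handle_heartbeat(memory, 0, 500))                       # leaks the secrets
    print(handle_heartbeat_fixed(memory, 0, 500, len(payload)))   # returns nothing
```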

At the second level, this is an API design flaw: there should never have been a function with parameters that could be abused in this way.

At the third level, what allows this to go so badly wrong is a performance optimization that was put into the OpenSSL library to provide faster buffer management. Before that enhancement, this bug would simply have caused an application fault. That would still have been bad, but it would have been easy to detect and wouldn't have leaked any data; at worst it might have allowed some short-lived denial of service attacks.

Mostly, exploiting this security hole just returns the attacker a bunch of random garbage. The trick is to automate the attack and repeatedly try it in thousands of places until, by fluke, you find something valuable, perhaps a private digital key or a password.


Complacency

The open source community makes the claim that open source code is safer because anyone can review the source code and find bugs, and people are invited to do this for OpenSSL. I think Heartbleed shows that security researchers became complacent and weren't examining this code closely enough.

The code that caused the bug was checked in by a trusted coder and was code reviewed by someone knowledgeable. Mistakes happen, but for something like this, perhaps there was a bit too much trust. I think it was an honest mistake and not deliberate sabotage by hackers or the NSA. The source code change logs give a pretty good audit trail of what happened and why.

Should I Panic?

In spite of what some reporters are saying, this isn't the worst security problem that has surfaced. The holy grail for hackers is to find a way to root computers (take them over with full administrator privileges). This attack only has a small chance of providing something that helps along that way and isn't a full exploit in its own right. Bugs in Java, IE, SQL Server and Flash have all allowed hackers to take over people's computers; some didn't require anything else, and some just required tricking the user into browsing a bad web site. Similarly, e-mail and flash drive viruses have caused far more havoc than this particular problem is likely to. Another ongoing security weakness is caused by government regulations restricting the strength of encryption or forcing the disclosure of keys; these measures do little to help the government, but they really make the lives of hackers easier. I suspect the biggest source of identity theft, though, is data recovered from stolen laptops and other devices.

Another aspect is the idea that we should be like gazelles and rely on the herd to protect us: if we are in a herd of 100 and a lion comes along to eat one of us, then there is only a 1 in 100 chance that it will be me.

This attack does highlight the importance of good security practices, such as changing important passwords regularly (every few months) and using sufficiently complex or long passwords.

All that being said, nearly every website makes you sign in. For web sites that I don't care about I just use a simple password, and if someone discovers it, I don't really care. For sites like personal banking I take much more care, and for sites like Facebook I take medium care. Generally, don't provide accurate personal information to sites that don't need it: if they insist on your birthday, enter it a few days off; if they want a phone number, make one up. That way, if the site is compromised, all they get is a bunch of inaccurate data about you. Most sites ask for way too much; resist answering, or answer inaccurately. Also avoid overly nosey surveys; they may be private and anonymous, but only until they're hacked.

The good thing about this exploit seems to be that it was discovered and fixed mostly before it could be exploited; I haven't seen real cases of damage being done. Some sites (like the Canada Revenue Agency) are trying to blame Heartbleed for unrelated security lapses.

Generally, the problems that you hear about are the ones that you don't need to worry about so much. But again, it is safe practice to use this as a reminder to change your passwords and to minimize the amount of personally identifiable data out there; after all, dealing with things like identity theft can be pretty annoying. It also helps protect against the problems that black hat hackers know about and are already using, but that haven't been discovered yet.

Summary

You always need to be vigilant about security, but it doesn't help to be overly paranoid. Follow good on-line practices and you should be fine. The diversity of computer systems out there helps: not every system is affected by a given problem, and the sites that are affected are generally good about notifying their users. A little paranoia and good sense go a long way on-line.


Written by smist08

April 26, 2014 at 6:51 pm

Disaster Recovery


Introduction

In a previous blog article I talked about business continuity: what you need to do to keep Sage 300 ERP up and running with little or no downtime. However, I mushed together two concepts, namely keeping a service highly available and having a disaster recovery plan. In this article I want to pull these two concepts apart and consider them separately.

We've had to give these two concepts a lot of thought when crafting our Sage 300 Online product offering, since we want this service to be available as close to 100% of the time as possible and, if something truly catastrophic happens, back on its feet as quickly as possible.

Terminology

There is some common terminology which you always see in discussions on this topic:

RPO – Recovery Point Objective: this is the maximum tolerable period over which data might be lost due to a major incident. For instance, if you have to restore from a backup, how long ago was that backup made? If you only back up nightly, your RPO is at best 24 hours.

RTO – Recovery Time Objective: this is the duration of time within which a business process must be restored after a disaster (or disruption) in order to avoid unacceptable consequences. For instance, if a computer fails, how long can you wait to replace it?

HA – High Availability: usually concerns keeping a system running with little or no downtime. This doesn’t include scheduled downtime and it usually doesn’t include a major disaster like an earthquake eating a datacenter.

DR – Disaster Recovery: this is the set of processes, policies and procedures related to preparing for the recovery or continuation of the technology infrastructure that is vital to an organization after a natural or human-induced disaster.

High Availability

HA means creating a system that can keep running when individual components fail (no single point of failure), such as a motherboard frying, a power supply failing or a hard disk dying. These are reasonably rare events, but systems in data centers often run on dozens of individual computers, things do fail, and you don't want to be down for a day waiting for a new part to be delivered.

Of course, if you don't mind being down for a day or two when things fail, then there is no point spending the money to protect against this, which is why most businesses set RPO and RTO targets for these types of failures.

Some of this comes down to procedures as well. For instance, if you have fully redundant components but then run Windows Update on them all at once, they will all reboot at the same time, bringing your system down. You could schedule a maintenance window for this, but generally, if you have redundant components, you can run Windows Update on the first one and, once it's back up and healthy, update the secondary.

If you are running Sage ERP on a newer Windows Server and using SQL Server as your database, then there are good hardware/software combinations of all the standard components that give you really solid high availability. I talked about some of these in this article.

Disaster Recovery

This usually refers to having a tested plan to spin up your IT infrastructure at an alternate site in the case of a major disaster, like an earthquake or hurricane, wiping out your currently running systems.


Again, your RPO/RTO requirements will determine how much money you spend on this. For instance, do you purchase backup hardware and have it ready to go in an alternate geographic region (far enough away that the same disaster couldn't take out both locations)?

For sure you need complete backups of everything, stored far enough away, that you can recover from. Then it's a matter of acquiring the hardware and restoring all your backups. People often store these backups in the cloud these days, because cloud storage has become quite inexpensive and most cloud storage solutions provide redundancy across multiple geographies.
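As a rough sketch of what storing backups in the cloud can look like, here is a small Python example that uploads a database backup file to Azure Blob Storage using the azure-storage-blob package. The connection string, container name and file name are hypothetical, and the cross-geography redundancy mentioned above would come from choosing a geo-redundant storage account.

```python
# Minimal sketch: upload a nightly backup to Azure Blob Storage.
# Requires: pip install azure-storage-blob
# The connection string, container and file names below are hypothetical.
import datetime
from azure.storage.blob import BlobServiceClient

CONNECTION_STRING = "DefaultEndpointsProtocol=https;AccountName=...;AccountKey=..."
CONTAINER = "dr-backups"
BACKUP_FILE = "sage300_full.bak"

def upload_backup():
    service = BlobServiceClient.from_connection_string(CONNECTION_STRING)
    container = service.get_container_client(CONTAINER)
    # Name each backup by date so older recovery points are retained.
    blob_name = f"{datetime.date.today():%Y-%m-%d}/{BACKUP_FILE}"
    with open(BACKUP_FILE, "rb") as data:
        container.upload_blob(name=blob_name, data=data, overwrite=True)

if __name__ == "__main__":
    upload_backup()
```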

The key point here is to test your procedure. If your DR plan isn’t tested then chances are it won’t work when it’s needed. Performing a DR drill is quite time consuming, but really essential if you are serious about business continuity.

Azure

One of the attractions of the cloud is having a lot of these things done for you. Sage 300 Online handles setting up all of its systems for HA, as well as having a tested DR plan ready to implement. Azure helps by having many data centers in different locations and a lot of HA and DR features built into its components (especially the PaaS ones). This removes a lot of management and procedural headaches from running your business.

Hard Decisions

If a data center is completely wiped out, then the decision to execute the DR plan is easy. The harder decision comes when the primary site has been down for a few hours, people are working hard to restore service, but it seems to be dragging on. Then you have a hard call: kick in the DR plan, or wait to see if the people recovering the primary site can succeed. These sorts of outages are often caused by electrical problems or by problems with large SANs.

One option is to start spinning up the alternate site, restoring backups if necessary and getting everything ready, so that when you do make the decision you can do the switchover quickly. This way you can often delay the hard decision and give the people fixing the problem a bit more time.

Dangers

Having a good, tested DR plan is the first step, but businesses need to realize that if a major disaster like an earthquake wipes out a lot of data centers, then many companies are going to activate their DR plans at once. That scenario won't have been tested. We could easily see a cascading outage, where the surge in usage causes many other sites to go down until the initial wave passes. Generally, businesses have to be prepared to receive degraded service until everyone is moved over and things settle down again.

Summary

Responsible companies should have solid plans for both high availability and disaster recovery. At the same time, they need to weigh the cost of these measures against how long they can afford to be down and the probability of these scenarios happening to them. Due to the costs and complexities of handling these scenarios, many companies are moving to the cloud to offload these concerns to their cloud application provider. Of course, when choosing a cloud provider, make sure you check the RPO and RTO that they provide.

Written by smist08

April 12, 2014 at 5:56 pm

Elastic Search


Introduction

We've been working on an interesting POC recently that involved Google-like search. After evaluating a few alternatives we chose Elastic Search. Search is an interesting topic, often associated with Big Data, NoSQL and all sorts of other technologies. In this article I'm going to look at a few aspects of Elastic Search and how you might use it.


Elastic Search is an open source search product based on Apache Lucene. It's all written in Java, and installing it is just a matter of copying it to a directory; as long as you already have Java installed, it's ready to go. An alternative to Elastic Search is Apache Solr, which is also based on Lucene. Both are quite good, but we preferred Elastic Search since it seemed to have added quite a bit of functionality beyond what Solr offers.

Elastic Search

Elastic Search is basically a way of searching JSON documents. It has a RESTful API, and you use this to add JSON documents to be indexed and searched. Sounds simple, but how is this helpful for business applications? There is a plugin that allows you to connect to databases via JDBC and set them up to be imported and indexed on a regular schedule. This allows you to perform Google-like searches on your database, only it isn't really searching your database, it's searching the data it extracted from your database.
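As a rough illustration of how the RESTful API is used, the following Python sketch indexes a JSON document over HTTP and then searches for it using the requests library. The index, type and field names are made up for the example, and a node is assumed to be running locally on the default port 9200.

```python
# Minimal sketch of indexing and searching JSON documents in Elastic Search.
# Assumes a local node on the default port 9200; index/type/field names are made up.
import requests

BASE = "http://localhost:9200"

# Index (add) a JSON document. The URL pattern is /index/type/id.
doc = {"customer": "ACME Corp", "order_number": "ORD-1042", "total": 1250.00}
requests.put(f"{BASE}/erp/orders/ORD-1042", json=doc)

# Search the index with a simple match query.
# (A newly indexed document becomes searchable after the next index refresh,
# typically within about a second.)
query = {"query": {"match": {"customer": "acme"}}}
resp = requests.post(f"{BASE}/erp/orders/_search", json=query)

for hit in resp.json()["hits"]["hits"]:
    print(hit["_score"], hit["_source"])
```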

Web Searches

This is similar to how web search services like Google work. Google dispatches thousands of web crawlers that spend their time crawling around the Internet and adding anything they find to Google's search database; this is where the real search work happens. When you do a search from Google's web site, it really just does a read on that database (it actually breaks up what you are searching for and does a bunch of reads). The bottom line is that when you search, it is really just doing a direct read on its database, and this is why it's so fast. You can see why this is Big Data, since the database contains the results for every possible search query.

This is quite different from a relational database, where you search with SQL and the query goes out and rifles through the database to get the results. In a SQL database, putting the data into the database is quite fast (sometimes) and then reading or fetching it can be quite slow. In NoSQL or Big Data type databases, much more time goes into adding the data, so that certain predefined queries can retrieve what they need instantly. This means the database has to be designed ahead of time to optimize these queries, and inserting the data often takes much longer (often because this is where the searching really happens).

Scale Out

Elastic Search is designed to scale out from the beginning; it automatically does most of the work of starting and creating clusters. It makes adding nodes with processing and storage really easy, so you can expand Elastic Search to handle your growing needs. This is why you find Elastic Search as the engine behind so many large Internet sites like GitHub, StumbleUpon and Stack Overflow. A big part of Elastic Search's popularity is how easy it is to deploy, scale out, monitor and maintain, certainly much easier than deploying something based on Hadoop.

[Image: Elastic Search cluster topologies]

Analyzers

When you index your data, it's fed through a set of analyzers which do things like convert everything to lower case, split sentences into individual words, reduce words to their roots (walking -> walk, Steve's -> Steve), deal with special characters, handle other language peculiarities, etc. Elastic Search has a large set of configurable analyzers, so you can tune your search results based on knowledge of what you are searching.
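As an example of the kind of configuration involved, the sketch below creates an index with a custom analyzer that lowercases tokens and stems English words, so that "walking" matches "walk". The index, analyzer and field names are illustrative, and the mapping follows the conventions of the Elastic Search 1.x releases current when this was written.

```python
# Sketch: create an index whose "description" field uses a custom analyzer
# (standard tokenizer, lowercase filter, Porter stemmer). Names are made up.
import requests

settings = {
    "settings": {
        "analysis": {
            "analyzer": {
                "my_english": {
                    "type": "custom",
                    "tokenizer": "standard",                 # split sentences into words
                    "filter": ["lowercase", "porter_stem"],  # normalize case, stem to roots
                }
            }
        }
    },
    "mappings": {
        "products": {
            "properties": {
                "description": {"type": "string", "analyzer": "my_english"}
            }
        }
    },
}

requests.put("http://localhost:9200/catalog", json=settings)
```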

Fuzzy Search

One of the coolest features is fuzzy search: you might not know exactly what you are searching for, or you might spell it wrong, and Elastic Search still magically finds the correct values. When ranking the results, Elastic Search uses the Levenshtein distance to determine which values match best. The real trick is how Elastic Search does this without going through the entire database computing and ranking this distance for everything. The answer is that it applies some sophisticated transformations to what you entered to limit the number of reads it needs to do to find matching terms; combined with the analyzers above, this turns out to be extremely effective and very performant.
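For reference, the Levenshtein distance is just the minimum number of single-character insertions, deletions and substitutions needed to turn one string into another. A small Python implementation (not Elastic Search's internal code, just the textbook dynamic-programming version) looks like this:

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits (insert, delete, substitute)
    needed to turn string a into string b."""
    # Classic dynamic-programming solution, computed one row at a time.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]

# "elastick" is one edit away from "elastic", so a fuzzy query would still match it.
print(levenshtein("elastick", "elastic"))  # 1
print(levenshtein("kitten", "sitting"))    # 3
```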

Real Time

Notice that since these search engines don't search the data directly, they won't be real time. The search database is only updated periodically, so if data is being rapidly added to the real database, it won't show up in these types of searches until the next update processes it. Often these synchronization updates are only performed once per day. You can tune them to be more frequent, and you can write code to insert critical data directly into the search database, but generally it's easier to just let everything be updated on a daily cycle.

Security

When searching enterprise databases, some care has to be taken in applying security, ensuring the user has the rights to search whatever they are searching through. Controls have to be added so that the enterprise API can only search indexes that the user has the rights to see. Even if the search results don't display anything sensitive, they could still leak information. For instance, if you don't have rights to see a customer's orders but can search on them, then you could probably figure out how many orders each customer places, which could be quite useful information.
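One simple way to apply such a control, sketched below, is for the enterprise API to restrict every search to the indexes the signed-in user is allowed to see. The permission lookup and index names here are hypothetical; this is just one possible approach, not a built-in Elastic Search security feature.

```python
# Sketch: the enterprise API only ever searches indexes the user may see.
# get_allowed_indexes() stands in for a permission lookup in the ERP's security tables.
import requests

BASE = "http://localhost:9200"

def get_allowed_indexes(user_id: str) -> list:
    # Hypothetical: return the indexes this user is entitled to search.
    return ["erp_orders", "erp_customers"]

def secure_search(user_id: str, text: str) -> dict:
    allowed = get_allowed_indexes(user_id)
    if not allowed:
        return {"hits": {"hits": []}}   # no rights, no results
    # Only the permitted indexes appear in the URL, so nothing else can match.
    url = f"{BASE}/{','.join(allowed)}/_search"
    query = {"query": {"match": {"_all": text}}}
    return requests.post(url, json=query).json()

print(secure_search("steve", "acme"))
```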

Certainly, when returning search results you wouldn't reveal anything sensitive, like say salaries; to get those you would need to click a link to drill down into the application's UI, where full security screening is done.

Summary

Elastic Search is a very powerful search technology that is quite easy to deploy and quite easy to integrate into existing systems. A lot of this is due to its powerful RESTful API for performing operations.