The Road to DevOps Part 1
Historically Development and Operations have been two separate departments that rarely talk to each other. Development produces a product using some sort of development methodology and when they’ve finished it, they release it and hand it off to Operations. Operations then installs the product, upgrades the users from the old version and then maintains the system, fixing hardware problems, logging software defect complaints and keeping backups. This worked fine when Development produced a release every 18 months or so, but doesn’t really work in the modern Web world where software can often be released hourly.
In the new world Development and Operations need to be combined into one team and one department. The goal being to remove any organizational or bureaucratic barriers between the two. It also recognizes that the goal of the company isn’t just to produce the software but to run it and to keep it available, up to date and healthy for all its customers. Operations has to provide feedback immediately to Development and Development has to quickly address any issues and provide updates quickly back to Operations.
In this posting I’m covering the aspect of DevOps concerned with frequently rolling out new versions to the cloud. In a future posting I’ll cover the aspects of DevOps concerned with the normal running of a cloud based service, like provisioning new users, monitoring the system, scaling the system and such.
Agile to DevOps
We have transitioned our software development methodology from Waterfall to Agile. A key idea of Agile is that you break work into small stories that you complete in a single short sprint (in our case two weeks). A key goal is that during the sprint each story is done completely including testing, documentation and anything else required. This then leads to the product being in a releasable state at the end of each sprint. In an ideal world you could then release (or deploy) the product at the end of each sprint.
This is certainly an improvement over Waterfall where the product is only releasable every year or 18 months or so. But generally with Agile this is the ideal, but not really what is quite achieved. Generally a release consists of the outcome of a number of sprints (usually around 10) followed by a short regression, held concurrently with a beta test followed by release.
So the question is how do we remove this extra overhead and release continuously (or much more frequently)? This is where DevOps comes in. It is a methodology that has been developed to extend Agile Development and principles straight through to actually encompassing the deployment and maintenance of the product. DevOps does require Development change the way it does things as it requires Operations to change. DevOps requires a much more team bases approach to doing things and requires organizational boundaries be removed. DevOps requires a razor focus on automating all manual processes in the build to deploy to monitor to get feedback cycle.
Most development processes have an automated build system, usually built on something like Jenkins. The idea is that when you check in source code, the build system sees you checked it in, then it has rules that tell it what module it is part of, and rebuilds those modules, then it sees what modules depend on those modules and rebuilds those and so on. As part of the build process, unit tests are run and various automated tests are set off. The idea is that if anything goes wrong, it is very clear which check-in caused it and things can be quickly fixed and/or rolled back.
This is a good starting point but for effective DevOps it needs to be refined. Most modern source control systems support branching (most famously git). For DevOps it becomes even more crucial that the master branch of the product is always in releasable state and can be deployed at any time. The way this is achieved is by developing each feature as a separate branch. Then when a feature is completely ready for release it can be pulled into the master branch, which means it can be deployed at any time. Below is a diagram of how this process typically works:
Obviously in this environment, it isn’t going to work if for every frequent release, you need to run a complete thorough manual test of the entire product. In an ideal world you have very complete test coverage using various levels of automated testing. I covered some aspects of this here. You still want to do some manual testing to ensure that things still feel right, but you don’t want to have to be repeating a lot of functional testing.
Operations can then take any release and in consultation with the various stakeholders release it to production. Operations is then in charge of this part of things and ensures the new versions in installed, data is converted and everything is running smoothly.
Some organizations release everything that is built which means their web site can be updated hourly. Github is a good example of this. But generally for ERP or CRM type software we want to be a bit more controlled. Generally there is a schedule of releases (perhaps one release every two weeks) and then a schedule of when things need to be done to be included in that release, which then controls which branches get pulled into the master branch. This is to ensure that there aren’t any disruptions to business customers. Again you can see that this process is blending elements of QA along with Operations which is why the DevOps team approach works so well.
A key idea that has emerged is the concept: “Infrastructure as Code”. In this world all Operations tasks are performed by code. This way things become much more repeatable and much more automated. It’s really this whole idea that you build your infrastructure by writing programs that then do the real work, that has largely led to movement of Developers into Operations.
This is where Development and Operations must be working as a team. Operations has to let Developers know what tools (scripts) they require to deploy the new version. They need automated procedures to roll out the new version, convert data and such. They have to be working together very closely to develop all these scripts in the various tools like Jenkins, PowerShell, Maven, Ant, Chef, Puppet or Nexus.
Performing all this work takes a lot of effort. It has to be people’s full time job, or it just won’t get done properly. If people aren’t fully applied to this, manual processes will start to creep in, productivity and quality will suffer.
Beyond successfully deploying the software, this team has to handle things when they go wrong. They need to be able to rollback a version. Reverse the database schema changes and return to a known stable good state.
DevOps is a whole new profession. Combining many of the skills of Development with those of Operations. People with these skill sets will be in high demand as this is becoming an area that is making or breaking companies on the Web and in the Cloud. No one likes to have outages; no one likes to roll out bad upgrades. In today’s fast paced world, these can put huge pressures on a company. DevOps as a profession and set of operating procedures is a good way to alleviate this pressure while keep up to the fast pace.