Sam Sharpe is an Operations Engineer at GDS, working within our Hosting and Infrastructure Team. Others also contributed to this post; see the change history for more details.
After a recent talk by Jordan and Tom at the OpenTech 2013 Conference, I was asked if the number of technologies we use scares us. I thought this was an excellent question, and it raises several interesting points about diversity that I would like to explore here.
GOV.UK is a diverse collection of individual applications and supporting services. We use at least five different programming languages, three separate database types, two versions of an operating system and other variations too numerous to count. When issues arise, the first challenge is often to determine which of these many components is involved.
If all you have is a hammer, everything looks like a nail
The reason we operate such a diverse ecosystem is that we are focused on solving real problems. Our first task is to understand the problem or need we are solving and then to choose the best tool for the job. If we restrict ourselves to moulding the need to the tools we already have, then we risk not solving the initial problem in the best way possible for the user. By restricting software diversity or enforcing rigid organisational standards on a project, there is a possibility of descending into a cargo cult, where we simply repeat the same patterns and mistakes in everything we make.
GOV.UK is designed as a set of modular applications that each fulfil a defined set of needs. The code that generates the maternity pay calculator is completely different from the code to publish information from government departments. By having these independent pieces, we make sure that the application is suitable for the job and we also allow ourselves to increase our scale of operation by having different people or teams work on something independently.
The advantages and challenges of code diversity
- using the best tool for the job
Sometimes the best tool for a job is not the one you are currently using. My colleague Nick Stenning recently prototyped, in Go, a new router that directs requests to the right applications. Yes, it could have been done in Ruby, Python or another language we use elsewhere (we've created a router in Scala before), but Go is designed for massive concurrency, which is a feature our router needs; the code also has fewer dependencies. Implementing something similar in one of those other languages would require many more lines of code (compare the Scala and Go versions), and more lines of code increase the chance of errors.
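To make the router's job concrete, here is a minimal sketch of the core decision such a router makes: picking a backend application from the request path. It is written in Python purely for illustration and is not the Go prototype described above. The route table, the backend names and the `pick_backend` function are all hypothetical, and longest-prefix matching is an assumed strategy rather than a description of how the real router works.

```python
# Hypothetical route table: path prefix -> backend application name.
# None of these names come from the real GOV.UK router.
ROUTES = {
    "/": "frontend",
    "/government": "publishing-app",
    "/maternity-pay": "calculators",
}


def pick_backend(path, routes=ROUTES):
    """Return the backend whose route prefix is the longest match for path."""
    best = None
    for prefix in routes:
        # Match whole path segments only, so "/government" does not
        # accidentally capture "/governmental". The "/" prefix is a catch-all.
        matches = (
            prefix == "/"
            or path == prefix
            or path.startswith(prefix + "/")
        )
        if matches and (best is None or len(prefix) > len(best)):
            best = prefix
    if best is None:
        raise LookupError(f"no route for {path!r}")
    return routes[best]
```

A real router would then proxy the request to the chosen application and handle many such requests at once, which is where Go's concurrency support comes in.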
- minimising the risk of one tool causing a problem across the entire site
We have patched some of our applications several times in the last few months due to vulnerabilities in the Rails application framework. If every component application on GOV.UK were written in Rails, we would need to upgrade every single application each time a new vulnerability was discovered. It's true that using lots of frameworks means more possible bugs, but each of those bugs poses a smaller risk to the site, because it affects only the applications built on that framework.
- encouraging a wide variety of contributions and skills
Developers who are comfortable in more than one language are often more creative, choosing the right tool and code pattern for the job. If you only know one language then you're likely to shoehorn the job into the code patterns you already know, which is probably not the best approach.
At GDS we concentrate on hiring good developers who display a wide variety of skills. We are not biased towards people who have already mastered the tools we use. A good developer who can already use more than one language should have no difficulty grasping another. We may also learn something new from them and the team will become stronger as a result.
Challenges and how to mitigate them
- breadth of knowledge required to operate
GOV.UK embraces the DevOps model. The developers of an application are actively involved in supporting that application in production. While that means that the breadth of knowledge required is still large, the depth of that knowledge sits with the application team, who provide additional support. As an Operator, I need to understand roughly how each application works, but for the detail I can defer to the experts.
- large number of pieces that can break
Yes, we have a lot of things that can break, but when one does, large parts of the site and the content available to the public (which is ultimately why we exist) will live on. Although breaking a very complex site down into a number of small components does increase complexity, it also reduces the risk that a single error can cause overall failure.
- unfamiliarity with tools leading to poor quality
If a team chooses to implement something in a new tool that none of them are familiar with, it may lead to poor quality code. At GDS, we like Pair Programming - two people collaborating on something generally increases its quality, and can allow those with more familiarity to teach others a new tool. Once that piece of work has finished, the pair can split up and pair with two new people, meaning that the number of developers who are familiar with the code doubles quickly.
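The doubling described above can be shown with a small worked example. This is an idealised sketch, not a model of how GDS teams actually rotate: it assumes that after each piece of work every developer who knows the code pairs with someone who does not, and the function name `familiar_after` and its parameters are made up for illustration.

```python
def familiar_after(rotations, team_size, start=2):
    """Developers familiar with the code after a number of pair rotations.

    Assumes every familiar developer pairs with someone new on each
    rotation, so the count doubles until the whole team is covered.
    """
    familiar = min(start, team_size)
    for _ in range(rotations):
        # Each familiar developer brings one new person up to speed.
        familiar = min(familiar * 2, team_size)
    return familiar
```

Under these assumptions a single pair grows to 4, then 8, then 16 familiar developers, so a team of 16 is covered after three rotations.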
We follow other best practices too, such as code reviews, which ensure that the output of the pair is understandable and easily maintained by others. We also use Test Driven Development to confirm that our code works and to build up a regression test suite over time.
Standardisation is still important
Prior to launch we used two separate search tools, Apache Solr and ElasticSearch. Each one does roughly the same job, and the effort to convert an application from one to the other is relatively low. For that reason we standardised to reduce complexity - shortly after launch, all of our applications were converted to use ElasticSearch.
Imposing loose standards can sometimes reduce the support burden. We use Ubuntu Linux, but the version (10.04 LTS) we were using prior to launch didn't run some of the software we needed. Rather than pick a completely different distribution of Linux, we decided to install a more recent version of Ubuntu Linux (12.04 LTS) on some of our machines. This allows us to share tools and knowledge across both versions.
So is it scary?
Yes, software diversity can be scary, but sometimes scary things make us better. If you like software diversity, maybe you should work for GDS!