https://gds.blog.gov.uk/2011/05/12/a-brief-overview-of-technology-behind-alpha-gov-uk/

A brief overview of technology behind Alpha.gov.uk

Categories: GOV.UK, Technology

The blank slate the Alpha.gov.uk team was handed was a huge privilege, but threw up some unusual technical challenges. Normally when embarking on building web apps you'd expect to have some sense of the corpus of content or the core functionality early on and could make the initial technology choices based on that. Here we were starting with some notions, a few sketches, and a determination that we not constrain our user-focus by early commitments to specific technical solutions.

We were lucky to have a development team with experience of a number of different languages and frameworks, and quickly agreed that rather than try to settle on a single one of those we'd build each tool using whichever technology would most quickly get us to a relatively durable prototype, and then "federate" them. We started with the Python-based framework Django for the Department pages, added Ruby on Rails for a suite of tools focussed on specific tasks, and used Sinatra (another Ruby framework) to glue together our search. If we'd continued for longer and expanded the scope of what we were building, it's quite likely that list would have grown.

That got us a few small pieces built, but we needed to join them somehow. While we're building for Google-as-the-homepage, and are committed to consistency-not-uniformity, we needed a reasonable way for our front end developers to introduce that consistency, and to overlay a few site-wide components, such as the geo tools that let us target information once you've told us where you are. As we anticipated early on, this was where most of the development pain lay.

Our first pass was a custom proxy that all visits were fed through. It knew which pages should go to which apps, applied a standard template, and tied in those site-wide elements. But as time went on it became clear that it was awkward to develop with, rather slow, and quite brittle. Eventually we broke it into several pieces, and then replaced them each in turn.
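To make the idea concrete, here is a minimal Python sketch of the two jobs such a proxy does: picking a backend app from the URL and wrapping the app's output in a standard template. It is illustrative only and not the Alpha.gov.uk code; the URL prefixes, app names, and template markup are all assumptions.

```python
# Hypothetical route table: URL prefix -> name of the backend app.
# Prefixes and app names are illustrative, not the real site layout.
ROUTES = [
    ("/departments", "django-departments"),
    ("/search", "sinatra-search"),
    ("/", "rails-tools"),  # catch-all, so it must come last
]

# Hypothetical shared page template applied to every response.
TEMPLATE = "<html><body><header>Site header</header>{content}</body></html>"


def pick_backend(path, routes=ROUTES):
    """Return the backend for the first prefix that matches the path."""
    for prefix, backend in routes:
        # Match the prefix exactly or as a whole leading path segment,
        # so "/departmentsx" does not match "/departments".
        if path == prefix or path.startswith(prefix.rstrip("/") + "/"):
            return backend
    raise LookupError(f"no backend for {path!r}")


def wrap(fragment, template=TEMPLATE):
    """Place an app's HTML fragment inside the shared template."""
    return template.format(content=fragment)
```

Putting both jobs in one process is convenient at first, but every request then depends on a single custom component, which is consistent with the slowness and brittleness described above.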

We now have a couple of reusable components that each of our apps can include as middleware to provide templating and shared services. Above that we're using an excellent package called Varnish to direct each visit to the correct app behind the scenes. Varnish is primarily used for caching web requests so that you don't have to do expensive computation or talk to your database every time you display a rarely changing web page, but it can also be configured to do the rest of what we needed.
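As a rough illustration of the middleware approach, the sketch below shows a Python WSGI middleware that wraps any HTML response from the app it surrounds in shared header and footer markup. This is an assumption-laden example, not the project's actual component: the class name, the header and footer strings, and the demo app are all hypothetical, and the real apps span both Ruby and Python frameworks.

```python
class SharedTemplateMiddleware:
    """Wrap HTML responses from the inner app in shared site markup."""

    def __init__(self, app,
                 header="<header>Site header</header>",
                 footer="<footer>Site footer</footer>"):
        self.app = app
        self.header = header.encode("utf-8")
        self.footer = footer.encode("utf-8")

    def __call__(self, environ, start_response):
        captured = {}

        def capture(status, headers, exc_info=None):
            # Hold back the status and headers until the body is rewritten.
            captured["status"] = status
            captured["headers"] = list(headers)
            # WSGI expects a write callable; this sketch ignores such writes.
            return lambda data: None

        body_iter = self.app(environ, capture)
        try:
            body = b"".join(body_iter)
        finally:
            if hasattr(body_iter, "close"):
                body_iter.close()

        headers = captured["headers"]
        content_type = ""
        for name, value in headers:
            if name.lower() == "content-type":
                content_type = value

        if content_type.startswith("text/html"):
            body = self.header + body + self.footer
            # The body length changed, so replace any Content-Length header.
            headers = [(n, v) for n, v in headers
                       if n.lower() != "content-length"]
            headers.append(("Content-Length", str(len(body))))

        start_response(captured["status"], headers)
        return [body]


def demo_app(environ, start_response):
    """Hypothetical inner app returning a bare HTML fragment."""
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
    return [b"<p>Department page</p>"]
```

Wrapping an app is then a one-liner, `application = SharedTemplateMiddleware(demo_app)`, which is what lets each app opt in to the shared look without a central proxy sitting in front of everything.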

Everything's hosted on Amazon's EC2 cloud servers (in their EU-west cluster). We're also using their Elastic Load Balancer, and we've got some of our data stored on S3. Using those services has meant that we could experiment with our server configurations, add in more as needed, and quickly scale up where necessary. To co-ordinate it all we're using a tool called Puppet, which lets us rapidly change the configuration of a whole suite of servers with a single command.

As with every aspect of Alpha.gov.uk, the code that's been written is intended as a proof of concept. We've got some APIs in place (but not yet well documented) for a few of our tools (and our use of ScraperWiki has established quite a few more), we've established a reasonable architecture for our code, and given useful clues as to how a more mature federated system would evolve. But the real measure of what we've done is the degree to which it's allowed the whole team to stay focussed on real user needs, rapidly iterating all the way.

We're aiming to have some of that code ready to open source within the next couple of weeks, and will be giving more detail on some specific components as time goes on.

Update: We've also produced a colophon, so that's the place to go if you want a fairly exhaustive list of the different tools.

Sharing and comments

34 comments

  1. Comment by Dezmembrari posted on

    I should add that we had a thorough penetration test run on the site. The report was very positive but also made a number of recommendations, most of which we have subsequently implemented.

  2. Comment by Gov.uk – From Alpha to Beta | Government Digital Service posted on

    [...] will be developing a flexible, adaptable, scalable, modern technology platform. I barely need mention that we’ll be continuing to use open source software, not because [...]

  3. Comment by imwilliam posted on

    The UK has laws, policies and so on regarding the usability/accessibility of government websites. Best practice suggests this should be addressed early in the design process; e.g., I can see problems with document structure as some headings are not in logical order. What approaches have you taken to ensure conformance with, let's say, WCAG 2.0? Or, "universal design?" Your methodology in this regard ought to be documented as part of the colophon.

  4. Comment by Mark Johnson posted on

    I'd be interested in your methodology for your route to live for the things that Puppet manages. I'm considering it myself and looking for real-world examples of Puppet's use in mission-critical environments like this.

    I would also be interested in the rationale behind the architectural decision to use a US-owned public cloud.

    • Replies to Mark Johnson>

      Comment by James Stewart posted on

      We based a lot of our puppet setup on some scripts created by an agency called "Go Free Range". You can see theirs at https://github.com/freerange/freerange-puppet - rather than having a dedicated puppet server we have a capistrano script that will bootstrap puppet on any given node(s) and apply our configuration. We then used EC2 security groups to distinguish the different types of nodes.

      We used EC2 because it was the convenient option. The tools were in place to make it do what we wanted and to save us spending too much time on the infrastructure. While Amazon is US-owned, we used their EU-West region.

  5. Comment by T.J. posted on

    I appreciate this post and the colophon are really about the main alpha.gov.uk site, but why no mention of WordPress?

    • Replies to T.J.>

      Comment by James Stewart posted on

      You're right, we should have mentioned wordpress in the colophon and I've amended that.

      We've not talked about wordpress because this blog is the only place it's used and that's quite independent of the main site (separate server, separate database, etc). Simon Dickson wrote about the blog setup at http://puffbox.com/2011/05/03/alphagov-blog-open-for-business/

      • Replies to James Stewart>

        Comment by T.J. posted on

        Thanks for the quick reply, (sorry mine is so much later).

        Thanks for the link to the info too.

        Will the link to the blog stay as prominent on the alpha site when it goes fully live?

        At the moment it isn't really very clear how to get back to the main site from the blog and people may end up viewing more pages in the blog than the main site.

  6. Comment by Bill posted on

    It's very good to see that a start has been made on getting things done with the Government sites. And what's even more impressive is that good use is being made of what's available. You may think that this is a rather strange comment, but after managing to still be in one of the last IT sections in a main Government Dept where the staff actually know how to program, set up a Web server, know which is the best O/S to have on the server, understand the difference between the Web Server (software - IIS, Apache, etc) and the Web Server (hardware) and basically have up-to-date knowledge about "hands-on" IT, it isn't as weird as it sounds. Most IT sections within Government are populated by Project Managers who wouldn't know the difference between Apache and Tonto (that's the Lone Ranger's sidekick, by the way). They can come up with buzzwords and business speak but believe that if they don't understand something they can get a consultant to explain it at our expense. I can say that because, and here's the punchline, the Government don't have any money. It's the taxpayers' money and they are paying consultants and out-sourced IT providers a fortune at our expense. Don't get me wrong, I believe that IT firms such as yours are the way to go, but experienced, hands-on IT professionals within the Government Depts who can take over if things go wrong or, more to the point, can oversee the development process are essential.

  7. Comment by Philip Hands posted on

    I presume that the underlying Operating System you are running all this on is GNU/Linux, but I see no mention of the particular distribution in use.

    Is it a secret?

    Not that it makes a lot of difference, once you've automated it into submission with puppet, but as a Debian Developer I'd be intrigued to know what you chose, and how you made that decision.

    • Replies to Philip Hands>

      Comment by James Stewart posted on

      It's Ubuntu - we've produced a Colophon which is doing the rounds on twitter but hasn't been clearly linked to here (I'll update the post to link to it). All of the dev team have been using Ubuntu for some time so it seemed the obvious choice for us.

    • Replies to Philip Hands>

      Comment by Bill posted on

      If you want to check on the O/S and the Web server used on a site, check out Netcraft, who do an add-on for Firefox. It usually works but I found one odd anomaly when checking a Microsoft site and saw IIS running on Linux???!!! It seems that it's all to do with headers: the O/S of the caching server is returned, if one is used, and the Web server is also reported. So the Web server is running on Windows (not reported) and the caching servers are running Linux (reported). Seems even Microsoft are coming around to the benefits of Open Source even if the caching servers aren't Microsoft's.

  8. Comment by Lawrence posted on

    Out of curiosity, are you using any online project management tools?

    • Replies to Lawrence>

      Comment by James Stewart posted on

      We've got all the code stored in github and initially used their issues tracker to keep on top of things. We then moved to lighthouse for bug tracking. But the project management was largely done without specific online tools other than google docs. There's a bit more detail on the project management side in Jamie's entry at http://blog.alpha.gov.uk/blog/agile-does-work-in-government

  9. Comment by Pete C posted on

    Please don't recommend Amazon hosting for deployment of real sites. My qualm is not technological.

    I can see the headline now: UK Government disappears from Web as Amazon arm-twisted by powerful US lobbyists opposed to proposed banking reforms.

    I kid you not.

    Also, promoting more outsourcing to the US will add insult to injury to the UK's IT industry. If a gov cannot rely on/invest in its own infrastructure…

    The hosting of police.uk on Amazon is really not helpful. Neither is their use of google maps. Or anyone else building government sites. Time for Ordnance Survey to build and openly release a decent service – alpha.gov.uk's excellent integration of maps is just embarrassing with google's logo splattered everywhere, not least because the mapping data is originally crown copyright. The irony!

    • Replies to Pete C>

      Comment by Matt T posted on

      >> The hosting of police.uk on Amazon is really not helpful.
      Cripes ! Really ? That does seem a little alarming.
      Just goes to show how far people will go for something trendy..!

  10. Comment by Jim Thomas posted on

    Have you considered using something like http://www.interstateapp.com to make the development of Alpha.gov.uk more transparent?

  11. Comment by Bèr Kessels posted on

    The Open Source CMS Drupal is used for many government-sites, including several in the UK. Most famous, however, being whitehouse.gov in the US. For many governments Drupal has become almost a default choice for their sites.

    Did you consider Drupal at all? You seem to have chosen several RAD frameworks, rather than ready-made CMSes. Was a CMS an option as a basis at all, or did it have to be a RAD?

    From Open Source POV, are you considering sharing or releasing some gems, or Django packages?

    Thanks a lot for sharing the inside-information already. This openness is refreshing and enlightening. Probably even more so than the resulting website, which already is fantastic!

    • Replies to Bèr Kessels>

      Comment by James Stewart posted on

      I've used drupal a few times in the past so it was definitely on our radar, but with a very strong emphasis on tools, application frameworks fitted our needs better. Django's built-in admin tools served us pretty well for those elements that were most like a traditional CMS.

      We're definitely planning to release some of our geo tools. We just need a bit of time for the dust to settle so that we can add a bit more documentation, pull a few pieces together, etc. There are a few other pieces we'd like to release too if time allows.

  12. Comment by Mark King posted on

    Interested to see that in order to make a suggestion (that location should allow BFPO or at least 'abroad') I would have to register with something else. The first notes that everything will be done under the law of the US. (And I see this page is happily keeping google-analytics informed...) Surely it should be possible to use without having to sign up to anything in the US.

    • Replies to Mark King>

      Comment by Tom Loosemore posted on

      Yes, we're looking hard at whether we've adopted the right approach to feedback using GetSatisfaction. It's a very fine, simple, tool for feedback in situations like this - we're getting incredibly useful feedback if you look on http://getsatisfaction.com/alphagov - but I'm not comfortable with the need to sign up, and I'm not sure we're being transparent enough in how we signal to people what's occurring.

      We're putting our heads together as I type... it's a tricky balance.

      • Replies to Tom Loosemore>

        Comment by Paul Annett posted on

        To clarify, our Get Satisfaction page does not require a Get Satisfaction account to contribute. People can log in with their Twitter, Facebook, Google or Microsoft Live accounts.

        Also, you can interact with our Get Satisfaction page via our Facebook page (they use the same content), so you never even have to go to the Get Satisfaction account, which fits in where Martha's report encouraged Government to be out and about on the internet where citizens already interact.

        That said, you do need to have one of those accounts to participate.

        We're not using the fact that you have to log in to gather any personal information or monitor your usage.

        • Replies to Paul Annett>

          Comment by Andy McGarry posted on

          I'm really interested how you implemented identity so that "People can log in with their Twitter, Facebook, Google or Microsoft Live accounts."

          I think this is the most important area for new government sites, so that they interact with people instead of treating everyone as anonymous.

          Keep up the good work!

          • Replies to Andy McGarry>

            Comment by Paul Annett posted on

            That's only for the feedback page – Get Satisfaction – which is a third-party system, and is only being used for the prototype. It wouldn't be used on a live Government site.

  13. Comment by Andrew B posted on

    Who's responsible for the security of the site, and all the frameworks you mentioned ?

    All those technologies will require prudent patch management to ensure the integrity of the site and all its accessible data.

    I have a picture in my mind of your security manager banging his head off a wall somewhere =)

    HTTP compression could make the site snappier too; including compressing your JS files and make 'em cacheable. Images are large too.

    A.

    • Replies to Andrew B>

      Comment by James Stewart posted on

      At present, I'm responsible. We've got a lot of the updates automated and are working on some tools to help us better keep track of changes to components. Right now, despite the number of tools we're using, there's not a huge amount to keep track of, but obviously security will play a part in any future decisions.

      I should add that we had a thorough penetration test run on the site. The report was very positive but also made a number of recommendations, most of which we have subsequently implemented.

      We had HTTP compression in place but it doesn't work well with Varnish when using Edge Side Includes. There's a new version of Varnish coming soon that addresses that. We also considered doing more to minimise and compile JS/CSS/etc but it wasn't enough of a priority to fit into our very tight schedule.

  14. Comment by Tom O posted on

    I hope you're using multiple availability zones on Amazon EC2, and not putting all your eggs in one basket.

    • Replies to Tom O>

      Comment by James Stewart posted on

      We are indeed. Our load balancers are spread evenly between availability zones, our two database servers are also in different zones and the app servers are scattered between them. We were watching the recent US-East EC2 outage very carefully...

  15. Comment by Matt T posted on

    OK thanks. I ask because as we all know, dedicated hosting tends to be expensive (and govt. departments tend to have multiple hosting contracts). Amazon has the potential to be much cheaper but I'm wondering how it works out in practice.
    If anybody has the time, it would be interesting to know how the first bill works out, whenever it comes in (presumably there will be one, at some point!)

  16. Comment by SamWM posted on

    Purely out of curiosity, how feasible would it be to have some kind of stats page, showing amount of traffic, pages visited etc (e.g. stats.alpha.gov.uk)? Live, or if that would be too resource intensive, daily or weekly. Sites that are big (or may likely be at some point) don't tend to have any real detailed statistics

    • Replies to SamWM>

      Comment by James Stewart posted on

      It's something we discussed but haven't had time to implement. Definitely on the table for the future.

  17. Comment by James Stewart posted on

    Amazon charge based on usage, so the total fee will be calculated based on how many servers we bring up, how long we keep them running for, and how much storage we use. Which means I can't give you an accurate figure right now as it's a moving target.

    You can see all amazon's pricing details at http://aws.amazon.com/ec2/#pricing

  18. Comment by Matt T posted on

    Out of interest, how much does the amazon hosting cost (per month - or however they charge..)?