What was the evidence? Users’ information needs and analytics

Categories: GOV.UK, User research

As mentioned in other blogs, one of the first and primary tasks for the Alphagov project was to build a site that provided the information and services users need. So where were we going for the evidence? Within the timescale and resources of the project we saw value in taking a quantitative approach - looking at analytics and particularly search analytics across the central government web estate.

Writing this, I’m aware of Leisa Reichelt’s blog on UX and Alphagov, but here I’m going to talk about the learnings we got from analytics about what people are doing. Of course, this should be informed, qualified and enriched by qualitative studies to understand intent.

How do visitors get to central government information?

We looked at where users were coming from when arriving at central government websites. Hitwise provides information on which referring (upstream) properties send traffic to a site. Unfortunately, it only reports on the top twenty referrers. Grouping these together, we see that search is the main driver:

Upstream Websites visited before Government – Jan 2011

| Referrer group | All govt sites | Central govt sites |
|---|---|---|
| Search Engines | 47.40% | 47.50% |
| Govt sites | 3.80% | 4.00% |
| Social | 4.30% | 4.30% |
| Portals | 2.50% | 2.40% |
| Others* | 4.40% | 4.30% |
| Total of top 20 | 62.40% | 62.50% |

* Others include BBC, Wikipedia
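
As a rough check on the figures above, here is a minimal Python sketch that re-adds the "All govt sites" column and works out two derived numbers: how much traffic falls outside the top twenty referrers, and how large search is within the top twenty. It assumes the percentages are shares of all upstream visits; the function name `summarise` is illustrative and not part of any Hitwise tooling.

```python
# Figures copied from the "All govt sites" column of the table above
# (percent of visits). Assumption: each figure is a share of all upstream visits.
ALL_GOVT = {
    "Search Engines": 47.40,
    "Govt sites": 3.80,
    "Social": 4.30,
    "Portals": 2.50,
    "Others": 4.40,
}


def summarise(shares):
    """Return (top-20 total, share outside the top 20, search share within the top 20)."""
    top20_total = round(sum(shares.values()), 2)
    outside_top20 = round(100 - top20_total, 2)
    search_within_top20 = round(100 * shares["Search Engines"] / top20_total, 1)
    return top20_total, outside_top20, search_within_top20


top20_total, outside_top20, search_within_top20 = summarise(ALL_GOVT)
print(f"Top 20 referrers: {top20_total}% of visits")
print(f"Outside the top 20: {outside_top20}% of visits")
print(f"Search as a share of top-20 traffic: {search_within_top20}%")
```

The grouped rows add up to the reported total, and the remainder is traffic that the top-twenty report does not attribute to any referrer group.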


Upstream Websites visited before 2 major departments – Jan 2011

| Referrer group | Department a | Department b |
|---|---|---|
| Search Engines | 53.15% | 39.45% |
| Govt sites | 8.59% | 24.58% |
| Social | 2.98% | 2.54% |
| Portals | 5.78% | 3.42% |
| Others* | 3.29% | 1.70% |
| Total of top 20 | 73.79% | 71.69% |

* Others include BBC, Wikipedia

We also had access to a variety of site metrics, and here are some figures for Directgov:

Upstream Websites visited before Directgov – Jan 2011

| Source | Share of visits |
|---|---|
| Search | 61.90% |
| Brand aware | 5.40% |
| Affiliate (govt sites & other partners) | 13.60% |
| Organic referrer (other links) | 18.90% |
| Other | 0.20% |

So, of course, it's vital to provide other routes to information, but the evidence points to the importance of Google, Bing and social websites as the starting point of people's access to government information. It's therefore useful to review what people are searching for.

Users' information needs expressed through search

To identify candidate user needs for Alphagov we reviewed data from a variety of sources:

  • Referring terms from web search engines to central government websites
  • Referring terms from web search engines to Directgov, and a number of Government Departments
  • Terms entered in site search for Directgov, and a number of Government Departments

I'm a fan of Louis Rosenfeld's approach to using search analytics, and the attraction for me is that search logs provide a direct expression of many, many users' information needs. The data are usually readily available and the main cost is in interpreting them.

Firstly, some characteristics of search data:

  • The average length of a search phrase is around three words. Users are trying to express a wealth of complex needs and intent in a short phrase.
  • Search term frequencies show a classic Zipf curve, with a short head (relatively few terms searched for a lot) and a long tail (very many terms searched for by only one or two users).
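
To make these two characteristics concrete, here is a minimal Python sketch that ranks search phrases by frequency, splits them into head, torso and tail, and computes the average phrase length. The sample log, the function names and the cut-off values (`head_size`, `tail_max_count`) are all hypothetical choices for illustration; they are not the data or thresholds used on the project.

```python
from collections import Counter


def rank_terms(queries):
    """Count phrases after normalising case and whitespace; most frequent first."""
    counts = Counter(" ".join(q.lower().split()) for q in queries)
    return counts.most_common()


def split_head_torso_tail(ranked, head_size=2, tail_max_count=1):
    """Split ranked (phrase, count) pairs into short head, middle torso and long tail.

    head_size and tail_max_count are arbitrary illustrative cut-offs.
    """
    head = ranked[:head_size]
    rest = ranked[head_size:]
    torso = [(term, count) for term, count in rest if count > tail_max_count]
    tail = [(term, count) for term, count in rest if count <= tail_max_count]
    return head, torso, tail


def mean_words(queries):
    """Average number of words per search phrase."""
    return sum(len(q.split()) for q in queries) / len(queries)


# Hypothetical search log, invented for this example.
LOG = (
    ["council tax"] * 6
    + ["car tax"] * 5
    + ["council tax bands"] * 3
    + ["pay council tax online"] * 2
    + ["council tax moving house", "renew car tax disc"]
)

ranked = rank_terms(LOG)
head, torso, tail = split_head_torso_tail(ranked)
print("head:", head)
print("torso:", torso)
print("tail:", tail)
print("mean words per phrase:", round(mean_words(LOG), 2))
```

On a real log the cut-offs would need tuning by eye, since where the head ends and the tail begins is a judgement call rather than a fixed rule.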

Just looking at the short head tells you the most important search tasks, but only at a simplistic level. Much richer data starts to appear in the 'middle torso', where you see not only new phrases but also added nuances or facets of the top phrases, which can give a clue to what the intent expressed in three short words might really mean. For example, Council Tax is a popular search term, but digging deeper into the long tail, a number of facets become apparent:

  • council tax bands/council tax rates
  • council tax exemptions
  • how much is council tax?
  • pay council tax/ pay council tax online
  • council tax benefit
  • council house discounts
  • council tax moving house
  • Registering for Council Tax
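
One way to surface facets like those listed above is to take every phrase that contains the head term and tally the words that accompany it. The following Python sketch does that under stated assumptions: the query counts are invented for illustration, and `facets_for` is a hypothetical helper, not the method the team actually used.

```python
from collections import Counter


def facets_for(head_term, query_counts):
    """Tally the extra words that appear alongside head_term, weighted by search count."""
    head = head_term.lower().split()
    n = len(head)
    facets = Counter()
    for query, count in query_counts.items():
        words = query.lower().split()
        # Look for the head term as a contiguous run of words in the query.
        for i in range(len(words) - n + 1):
            if words[i:i + n] == head:
                modifier = " ".join(words[:i] + words[i + n:])
                if modifier:  # skip the bare head term itself
                    facets[modifier] += count
                break
    return facets.most_common()


# Hypothetical phrase counts, invented for this example.
QUERY_COUNTS = {
    "council tax": 120,
    "car tax": 90,
    "council tax bands": 40,
    "pay council tax": 25,
    "pay council tax online": 15,
    "council tax exemptions": 10,
}

council_tax_facets = facets_for("council tax", QUERY_COUNTS)
for modifier, count in council_tax_facets:
    print(f"{modifier}: {count}")
```

The output is only a starting list: deciding which modifiers are variations of one need and which deserve their own answer still takes human review.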

These are just the main concepts that need distinct answers; there are many more variations of phrases. So our approach was to go deeper into the middle and build up a series of top concepts, with important variations that probably need a separate answer. Going back to Council Tax, here are some of the landing pages on Alphagov:

  • How much are Council tax bands?
  • Apply for a Council Tax band reduction
  • Pay Council Tax
  • Check your Council Tax balance

This data was then reviewed against top landing page data from Directgov, the Local Directgov service and a number of departmental sites, and by talking to stakeholders about business priorities, to establish a list of top 'tasks'.

Now, tasks could be a range of things:

  • Completing a transaction (renew car tax)
  • Using an online tool or look-up (calculate holiday entitlement)
  • Finding a quick answer (when is the next bank holiday?)
  • Finding out some information, to orientate or find out more (child tax credit). This could be provided by a guide or even a disambiguation page.

And people's intent changes according to where they are on their search journey. As people progress along the journey their search terms tend to be more specific and more focused:

| Task | Type of search term | Example |
|---|---|---|
| Informational | broad | Diabetes |
| Compare resources | narrower | Diabetes symptoms |
| Compare options | narrower | Managing diabetes |
| Transactional | specific | Needles |
| Locational | With a location | Pharmacists in SE1 |

So, from our perspective, this data gave a strong steer to the approach Richard Pope outlined in his earlier blog, It’s all about the nodes and what lives at them.

It's the landing page or node which is important, whether people are coming from web search, site search, deep links or social media. Increasingly, in future, people will be consuming content and tools elsewhere, through syndication and APIs. Where they land up needs to make sense as a granular, standalone item, with a strong 'scent of information' from the link to the answer.

Signposting to related information and services is of course very important. People may need to know more, or it may be to their benefit to be pointed to additional information. I hope future iterations will experiment with leveraging metadata and concepts of relatedness to enhance this experience.

Many thanks to colleagues for contributing to this analytics work, especially Helen Lippell, who provided valuable insight based on user-centred taxonomy development.

Photo credit: Ian Britton, used under Creative Commons licence.

Sharing and comments


  1. Comment by Tristan posted on

    I think it is a great idea to learn where your traffic is coming from and how to help them in finding the information they need.

  2. Comment by Nick Moon posted on

    Well don't those numbers tell a very different story to the claims being made by the team developing Reading through this blog I was under the impression that the focus was on search because 90% of people arrive at sites via search.

    Turns out it's barely half. So you are using statistics gleaned from half the user base - people who search - to design the site for all users. Half of whom don't search. Now that would be OK if you thought the people who didn't arrive via search have similar wants/wishes/motivations/mental attitudes to those that did use search. But that ain't going to be the case. These people are going to be very, very different.

    The suspicion grows that you've chosen search because it's easy. Government information is, let's face it, in a right old mess. It's almost impossible to catalogue or structure. So let's not try. Bung it all on a site in any old bloody order and just give the users a search box.

    • Replies to Nick Moon>

      Comment by Different Anon posted on

      Why do you sound so surprised, Nick? That's what the whole of Alphagov has been about - pick off the low-hanging fruit, deal with the easy stuff, make it shiny and white and non-IE6-compatible because that's what impresses the other geeks out there.

      Meanwhile, all the really hard work is yet to be done - structuring info, working out how it gets maintained, simplifying processes, dealing with the politics of it all, sorting out the contracts with external suppliers that guarantee them profits for the next five years, and so on. (Actually, I bet it's worse than that. Some of that work has probably *already* been done, but Alpha chose not to use it because it wasn't sexy enough).

      So when Alphagov never turns into anything more than a prototype, all those people involved in it - who by then will have moved on to other projects, partly trading off the cachet of having been involved with Alpha - will throw their hands up in horror and say, "But that's not our fault! Look how bright and shiny our site was, when we didn't have to be accountable to anyone or worry about whether it worked in the real world".

      • Replies to Different Anon>

        Comment by Doris Gray posted on

        Hi Different Anon, why are you posting anonymously? Do you work for an organisation who isn't allowed to criticise Directgov, like me?

        You've hit the nail on the head there. No one will stand up and say "Yes, I'm responsible for Directgov. I ran some customer engagement, made decisions and launched a product I stand by." Everything's done by committee, using hired guns, which is why Project Austin, the enhanced templates and all the other grand redesigns ran out of steam.

        Alphagov currently consists of an html website with a search box. It cost £271,000, but I don't see any evidence of a brave new dawn.

  3. Comment by peter posted on

    Definitely. If you follow the paradigm that the search engine is the home page, then I think it's a good thing to provide marked up information, that Google or Bing etc. can use and display in the search results.

    So, for example, if you just want to know what's the minimum wage, why not get a little table served up in Google or Bing? I think we need to think about kite marks or some proof it's correct information though.

  4. Comment by Will posted on

    Fascinating, thanks for the post.

    Thinking aloud, if the majority of Govt users arrive via search engines is it worth looking at ways of exposing more info to them?

    I'm thinking of the tie-ups Google has done to embed cinema times, recipes, snow reports and more in results.

  5. Comment by Steph Gray posted on

    Really interesting post, Peter. I'm curious as to what data you looked at as part of this analysis - did you get traffic/search/Hitwise data for some of the departmental and big agency sites too, and if so, do they show similar or different Zipf curves, search patterns etc?

    I appreciate Alphagov sensibly started with the main citizen transactions and information goals, but as it moves on to assimilate departmental sites, does the same logic apply? And can you serve nodes with really specific policy questions about EU regulations etc in the same way?

    • Replies to Steph Gray>

      Comment by peter posted on

      Given the limited time frame we used Hitwise to get a view of the cross-government piece, and looked at data from Directgov, and several departments. The Zipf curve is pretty consistent, but where there are smaller volumes of traffic the data is less rich to look for different patterns.

      For the citizen/business audiences, my personal thoughts are that analytics and insight provide strong information that helps prioritise the important stuff - but you still need to make sure the longer tail of info is findable.

      For departmental/agency info, perhaps the "supply" side of the equation has to have more emphasis, in that there will be statutory needs to publish a wide range of material. I still think the same approach can give a strong steer to priorities, but to comment further I'd like to better understand professional audience needs.