https://gds.blog.gov.uk/2012/10/11/no-link-left-behind/

No link left behind

From October the 17th GOV.UK will replace Directgov and Business Link as the best place to find government services and information, but what will that actually mean for people following links to these sites, or visiting bookmarked pages?

URLs – links and Web addresses – are the ‘strands’ in the Web metaphor. When URLs change, the strands break.

The advantages and implications of preserving links have been nicely explained by the inventor of the Web, Sir Tim Berners-Lee, in his essay “Cool URIs don’t change”. But preserving URLs isn’t just about being a good citizen of The Web, it’s about putting users first.

Links in the wild

Directgov URLs abound on posters in Job Centres, official government stationery, mugs and even calculators handed out over the years. The same is true for Business Link, which has been amassing inbound links to its collection of Web sites since 2004.

All of these are likely to be found out in the wild for some time to come. For example, there’s a Directgov URL direct.gov.uk/workplacepensions in the current Workplace Pensions campaign:

http://www.youtube.com/watch?v=9gcd8lj4v3M

Somewhat ironically, that video has been deleted from YouTube!

When GOV.UK is released we don’t want people to visit these and find broken or abandoned websites. We also want to make sure that departments don’t have to throw away their old stationery or get every leaflet reprinted simply because the link has changed.

So, from October the 17th people following links or bookmarks to Directgov and Business Link will be automatically redirected to an appropriate page on GOV.UK.

GOV.UK has been live (as a beta) for 10 months, so the launch on the 17th isn’t so much about releasing new software; it’s about removing the beta warnings and decommissioning a number of existing Web sites.

Redirection Day

Many organisations decide that redirecting sites is an onerous task, so they either redirect all the old links to the front page of the new site, or simply switch the site off in its entirety. We’re not doing that. Instead we’ve created individual redirections for each and every page on Directgov and Business Link to an equivalent page on GOV.UK.

This means that someone looking for the Directgov Bank Holidays page will be taken straight to the new GOV.UK Bank Holidays page.

Proxy

To ensure we redirect as many URLs as possible, we’ve collected log files from the many machines which run Directgov and Business Link.

The logs themselves are reasonably large; a single day from one Directgov server contains in the order of 25 million lines, covering 9 million distinct URLs. These URLs have proven to be useful test data, which we replay against our redirection service as a part of our continuous integration environment.
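
As a rough illustration of that replay step, here is a minimal Python sketch. It assumes the logs are in Common Log Format and that the redirection service can be called as a function returning an HTTP status code; neither detail is stated in this post, and the sample lines, paths and function names are made up for the example.

```python
from collections import Counter

def distinct_paths(log_lines):
    """Count requests per path, assuming Common Log Format-style lines."""
    counts = Counter()
    for line in log_lines:
        # The request is the first quoted field, e.g. "GET /path HTTP/1.1"
        fields = line.split('"')
        if len(fields) < 3:
            continue  # skip lines without a quoted request
        request = fields[1].split()
        if len(request) < 2:
            continue  # skip requests without a path
        counts[request[1]] += 1
    return counts

def unmapped_paths(paths, resolve):
    """Replay each path against `resolve` and list those that return 404."""
    return sorted(path for path in paths if resolve(path) == 404)

# Illustrative log lines, not real Directgov traffic.
SAMPLE_LOG = [
    '10.0.0.1 - - [01/Oct/2012:10:00:00 +0000] "GET /bankholidays HTTP/1.1" 200 1043',
    '10.0.0.2 - - [01/Oct/2012:10:00:01 +0000] "GET /bankholidays HTTP/1.1" 200 1043',
    '10.0.0.3 - - [01/Oct/2012:10:00:02 +0000] "GET /workplacepensions HTTP/1.1" 200 2210',
    'malformed line',
]
```

Running `distinct_paths` over a day of logs gives the set of URLs to test, and a continuous integration job could fail the build whenever `unmapped_paths` returns anything.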

Map all the things

What we then do is map those URLs: create a table that matches an old URL with a new URL, and assigns an HTTP status code to each so we know how to treat each one.
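
A mapping table of this kind could be sketched in Python as below. This is an assumption-laden illustration rather than our actual implementation: the `Mapping` type, the `normalise` rule (ignoring the scheme and a leading `www.`), the use of 301 for redirects and the `RetiredPage` URL are all hypothetical. The 410 and 404 codes follow the behaviour described in the comments on this post.

```python
from typing import NamedTuple, Optional
from urllib.parse import urlsplit

class Mapping(NamedTuple):
    old_url: str
    status: int                    # HTTP status that says how to treat the old URL
    new_url: Optional[str] = None  # only set when the status is a redirect

def normalise(url):
    """Key a URL on host and path, ignoring scheme and a leading 'www.' (an assumption)."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path

def build_table(mappings):
    """Index the mappings by normalised old URL."""
    return {normalise(m.old_url): m for m in mappings}

def resolve(old_url, table):
    """Return (status, new_url); a URL with no mapping is treated as 404."""
    mapping = table.get(normalise(old_url))
    if mapping is None:
        return 404, None
    return mapping.status, mapping.new_url

MAPPINGS = [
    Mapping("http://www.direct.gov.uk/workplacepensions", 301,
            "https://www.gov.uk/workplace-pensions"),
    # Hypothetical page with no user need on GOV.UK.
    Mapping("http://www.direct.gov.uk/en/Hypothetical/RetiredPage", 410),
]
```

Keeping the status code next to each mapping means a single lookup answers both questions: where the user should go, and how the old URL should be treated.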

Assembling the mappings and ensuring they redirect people to the correct place has been an enormous undertaking, involving hundreds of people across dozens of organisations. For each redirection we have to collectively decide on which user-need is being met by the old page, and how best to satisfy the need on GOV.UK.

Some pages are straightforward to map, but in some cases a mapping may not be obvious. This is particularly true for pages which cover multiple topics, or serve to help people navigate several pieces of content. People might have bookmarked these pages for any number of reasons, so we’ve reached a decision based in part on data from use of the sites, and in part based on the judgement of the team at GDS and beyond.

Departments were given tools to review the mappings side-by-side onscreen, so we could get feedback straight away if the mapping appeared incorrect.

Not every road leads to GOV.UK

As Etienne mentioned earlier this week, there are some pages for which no user need has been identified on GOV.UK.

In those cases we’re going to show users a notice that tells them what has happened to the page, and offers them a link to a copy of that page on the National Archives website. In some cases we might also show users a suggested link to the canonical source – another website that meets the user need represented by the old page.
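
The notice described above might be assembled along these lines. This Python sketch assumes the archive copy lives at the `webarchive.nationalarchives.gov.uk/+/` prefix followed by the old host and path (the pattern visible in other National Archives links), and the notice wording, function names and example URL are hypothetical.

```python
from urllib.parse import urlsplit

# Assumed prefix for archived copies on the National Archives website.
ARCHIVE_PREFIX = "http://webarchive.nationalarchives.gov.uk/+/"

def archive_url(old_url):
    """Build a link to the archived copy from the old page's host and path."""
    parts = urlsplit(old_url)
    return ARCHIVE_PREFIX + parts.netloc + parts.path

def gone_notice(old_url, suggested_url=None):
    """Return (status, text) for a page with no equivalent on GOV.UK."""
    lines = [
        "This page is no longer available.",  # hypothetical wording
        "Archived copy: " + archive_url(old_url),
    ]
    if suggested_url:
        # Only some pages get a suggested link to a canonical source.
        lines.append("Suggested link: " + suggested_url)
    return 410, "\n".join(lines)
```

Calling `gone_notice("http://www.direct.gov.uk/en/Hypothetical/Page")` would return a 410 status with a notice pointing at the archived copy, and no suggested link.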

We’re confident that all this effort, hard work and collective attention to detail from so many people will prove to have been worthwhile. Come October the 17th, rather than waking up to a broken Web, you’ll find it just that little bit better.

37 comments

  1. Terence Eden

    This is a really refreshing attitude – and you are to be commended for it.

    One question – when someone is redirected, will they see a banner saying “You are here because…”?
    I can imagine that someone who has been faithfully following a specific link for the last 10 years may think that the site is broken, or they’re being phished etc.

    Which, I guess, leads to a supplementary question – is the above also true for files as opposed to web pages?

    Keep up the good work.

  2. David Bennett

    Will you pick up content that for some reason is shown as webarchive pages – such as http://webarchive.nationalarchives.gov.uk/+/www.dh.gov.uk/en/Healthcare/Entitlementsandcharges/OverseasVisitors/Browsable/DH_074373 ?

  3. Mike O'Neill

    Does that mean 1st party cookies that are placed by directgov etc. sites will still be there? Have you mentioned that in your cookie policy and will citizens be able to opt-out of the user identifying ones?

  4. John Greenaway (@JohnGreenaway)

    “Awaiting Content 418”. HTTP 418? Well played ;)

    • seo wales

      A short and stout answer for sure but shouldn’t that be a 404?

      • Paul Downey

        Using this code allows us to distinguish between redirections which don’t yet have a destination URI, and missing mappings to our CI environment. Sadly we don’t ship if we’ve 418s left in our configuration, even though it’s a perfectly legitimate use of a fun 4xx status code.

  5. Paul Downey

    Terence: GOV.UK does have a “formerly DirectGov or Business Link” notice on the page, but we are concerned about people worrying about being phished. We did consider a staging page for the redirection, but that fared poorly in “guerrilla” user-testing and can have an adverse impact on search engines following links. I personally think the shorter, cleaner http://www.gov.uk domain will help that, greatly.

    As for “files”, we’re concerned not to break applications which may rely upon DirectGov or Business Link assets, such as images, JavaScript, CSS and PDFs, so we’ll continue to serve many of those for some time to come.

    DEFRA aren’t a part of the October launch, but many departments are moving to GOV.UK as a part of the Inside Government programme.

    David, dh.gov.uk isn’t a part of the GOV.UK launch. The DirectGov and Business Link sites have been repeatedly indexed by The National Archives, so links to the webarchive will continue to work, and you’ll be able to find old copies of any pages we redirect or remove.

    Mike, your browser won’t pass cookies from direct.gov.uk to http://www.gov.uk thanks to “Same Origin Policy”. A lot of thought lies behind GOV.UK’s implementation of cookie law, making it an exemplar. There’s a fairly clear explanation of our policy here: https://www.gov.uk/help/cookies

  6. Rory Gibson

    The link to the National Archives doesn’t work – no protocol, so it’s being treated as relative.

    • Louise Kidney

      Hi Rory,

      I think someone came along and fixed this just as you posted as I can’t see an issue with the link from this end either in the tag or on the click outcome.

      Louise

  7. This week at GDS | Government Digital Service

    [...] working hard to make sure that when GOV.UK goes live no Business Link or Directgov URLS are broken. Paul wrote yesterday about the work the Transition team have been doing, and they’ve been hard at work to make sure that mappings and redirections are ready in time [...]

  8. Blogging for Queen and country « Matthew Sheret

    [...] there have been 23 posts published, covering topics as varied as what it means to code in the open, redirections, how the Finance team work and a load more. For the writers, it takes just a little bit of time [...]

  9. Guy

    Is this project the reason that the HMRC site is down for the entire weekend?

    I love the approach you’re taking to creating the new GOV.UK structure and I look forward to pulling information on completing my tax return this coming week. It’s just a shame that the HMRC site was taken down for the entire weekend on what would be the last real weekend before filing deadline for paper returns.

    Hopefully the downtime is worth it and searching the new GOV.UK site for the notes pages that are referenced in the assessment forms will actually return the notes. The current (former?) HMRC search function doesn’t return anything useful.

    • Paul Downey

      Guy, no, the HMRC outage was planned, and not related to the DirectGov and Business Link transition to GOV.UK.

      I also noticed the outage from our continuous integration tests which were failing, as we plan to redirect a number of Business Link pages to HMRC.

      • Guy

        Thanks for the update Paul. Odd choice of weekend for a planned outage of HMRC.
        It’s great that you’re exposing the work behind the scenes on the renovation of the government portals. I had a poke around in the beta site to see how the tax disc purchasing process was changing. I don’t think the process designer owned a car as the user must click some variation of ‘buy a tax disc’ 5 times before even entering any information. Is your project tackling process improvements such as this?

  10. Robbie Clutton (@robb1e)

    URL and not URI? You’ve changed your tune =p

  11. rossc0

    The example url redirects and 404s :(

    http://direct.gov.uk/workplacepensions – I think it should redirect to: https://www.gov.uk/workplace-pensions

    • Paul Downey

      Crikey! Fixed! Should be in your browser, subject to caching trickling out to your browser. Thanks!

  12. GOV.UK – One day in | Government Digital Service

    [...] that Directgov and Business Link had on Monday, and a slight increase on Tuesday, which suggests all the hard work done to ensure users were redirected from the old sites to the new has paid off. A snapshot of a [...]

  13. Rhys

    I take it this doesn’t apply to the Welsh language pages then.
    Just tried three old links

    http://www.direct.gov.uk/cy/index.htm (works, as it goes to the Welsh home page)

    http://www.direct.gov.uk/cy/CaringForSomeone/MoneyMatters/DG_10038111CY (redirects to a page saying ‘Directgov has been replaced by GOV.UK’ and points you to the gov.uk home page)
    http://www.direct.gov.uk/cy/HomeAndCommunity/YourlocalcouncilandCouncilTax/CouncilTax/DG_10037422CY (redirects to a page saying ‘Directgov has been replaced by GOV.UK’ and points you to the gov.uk home page)

  14. Phil

    • Paul Downey

      Phil, all of those pages are for user-needs better met elsewhere, and so have been deliberately marked “410 Gone”. Use the feedback form if you spot missing mappings, which are returned as 404.

      • Phil

        Why not link to those pages where user-needs are better met elsewhere?

        • Paul Downey

          Phil, there are suggested links for pages which had a reasonable amount of inbound links or are high-traffic, but adding links to every Gone page would have demanded even more discussion across departments, time better spent on creating good mappings for pages which contain information where government is the canonical source.

          Also, the government citing another source demands great care and attention to detail which will need to be maintained over time as suggested sites come and go.

          Anointing one site when there are many easily discovered alternatives also introduces risks. The reductio ad absurdum of this issue is a page with greener advice for holding a BBQ, where we could have linked to any one of a number of different NGOs, or garden centres, all of whom could legitimately claim to be the best source of such advice.

      • Phil

        “No link left behind” isn’t entirely true then.

        • Paul Downey

          Well the title is a snowclone on “no marine left behind” which wasn’t strictly true, either, but I think linking to the original in the National Archives for pages where there is no clear user-need on a government site, means it isn’t left behind.

  15. Hello GOV.UK | Dafydd Vaughan

    [...] the beta messages from the site, then a team worked through the early hours of Wednesday morning to redirect as many Directgov and Business Link URLs to GOV.UK as [...]

  16. How is GOV.UK performing? | Government Digital Service

    [...] visitors end up being too low then we need to understand where people are getting lost and improve our redirects and search engine [...]

  17. Sharing the news about GOV.UK | Government Digital Service

    [...] Paul’s explained previously on the blog we worked hard to make sure URLs didn’t break when we transitioned from the Directgov [...]

  18. Chris Osborne

    Loving this blog, and being able to follow the story of, and approach taken to, the development of gov.uk – it’s been great to watch. Just to provide a bit of citizen-provided QA on this re-direct topic, the directgov tax disc URL (http://www.direct.gov.uk/taxdisc) shown on the middle image in “Links in the wild” in this post isn’t re-directing to gov.uk currently, and is ending up on a directgov page here: https://www.taxdisc.direct.gov.uk/EvlPortalApp/app/home/intro.

    • Paul Downey

      Thanks Chris, that link is now pointing directly at the application for a taxdisc, rather than the old DirectGov page which then linked to the DVLA application.

      Confusingly taxdisc remains on a subdomain of direct.gov.uk — the post originally explained how there would be some orange on transactions for some time to come, but was edited for length. Sorry for the confusion this may have caused.

  19. SEO performance on launch: comparing GOV.UK with Directgov | Government Digital Service

    [...] to GOV.UK, so it’s a double-win. You can read the transition team strategy here in the blog post No link left behind. The mappings have worked, as users coming from search were redirected to 2,380 different [...]

  20. This week at GDS | Government Digital Service

    [...] that made the Directgov and Businesslink transitions so smooth for users was the effort put into making sure users were redirected to the right place on GOV.UK. In between working the same magic for departmental websites the team responsible have [...]

  21. Testing the redirections | Government Digital Service

    [...] Now that GOV.UK has replaced Directgov and BusinessLink, and departments are moving to Inside Government, we want to make sure that people visiting links to the old sites get where they need to be. We want them to be redirected to the correct page on GOV.UK, with no link left behind. [...]

  22. The internet is fragile | Dafydd Vaughan

    [...] Service is currently migrating a large number of websites to GOV.UK. My colleague Paul Downey wrote about how they are putting lots of effort into stopping broken links and redirecting users as much [...]

  23. Alasdair

    My only comment could be considered a rather pedantic one, but it is worth making. As the article itself talks about using redirects from the old sites to replacement pages, the overuse of ‘URL’ is incorrect, as these web addresses do not point to where an object can actually be found. Rather, they point to a resource that may or will redirect you to a relevant resource for the topic you are looking for. Therefore the article should really be discussing URIs and not URLs, as the web addresses are not physical locations of a resource. Most servers these days use redirects, URI rewriting modules or directory aliasing to some extent, making the term ‘URL’ incorrect for most web address use, and the IEEE has for some time now been advising the adoption of URI as the default term, although not many actually pay any attention to it.

    • Paul Downey

      Alasdair, as the doodler responsible for The URI Is The Thing, I hope you’ll forgive me for not using URI given they all may be dereferenced and URL is the common parlance of the intended audience of this post.

      • Alasdair

        Hi Paul given that universities rarely get this correct I am happy to forgive you. Just it would be good to see the move towards the right use of terminology as in a lot of topics the common parlance tends to take ownership and turn things into buzzwords.

