https://gds.blog.gov.uk/2015/09/01/registers-authoritative-lists-you-can-trust/

Registers: authoritative lists you can trust

[Illustration of registers by Paul Downey]

We’ve mentioned registers a few times on this blog, most recently in relation to the work of the Land Registry building on the steel thread, the brilliant new Companies House public beta, and their importance for building platforms.

For the past few months we’ve been exploring what is meant by “register”. That means we’ve been:

  • conducting user research
  • talking to colleagues across government who manage data (in organisations like the Food Standards Agency, the Land Registry, Companies House, MoJ, DVLA)
  • processing lessons learnt building services during the digital transformation programme, and
  • testing hypotheses by experimenting with software

We’ll show some of the things we’ve been doing in future posts, but here are some of the things we’ve discovered so far about registers.

Why we need registers

A register is an authoritative list of information you can trust. A canonical source of truth. Registers are important, and there are already many of them. A search of “register” on GOV.UK finds nearly 11,000 pages, and a similarly high number of documents on legislation.gov.uk contain the word “register”.

There are different kinds of register:

  • open registers contain public data, and are open to everyone
  • closed registers ask you to do something before you can access the data, for example pay a fee (as with seeing a Land Registry title) or provide a token (such as your driver number when using the view my driving record service)
  • private registers contain sensitive information, but may be able to provide answers to simple questions, such as “Is this person registered as a potential organ donor?”, or “Is the registered keeper of this vehicle over 21 years of age?” without revealing further details about the individual
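
As a rough sketch of the private register idea, the Python below answers a yes/no question without handing back the underlying record. Everything in it is hypothetical and for illustration only: the `KeeperRecord` fields, the `keeper_is_at_least` method and the sample data are assumptions, and "over 21" is read here as having reached a 21st birthday.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class KeeperRecord:
    """Hypothetical private-register entry; field names are illustrative."""
    vehicle: str
    keeper_name: str
    keeper_date_of_birth: date


class PrivateRegister:
    def __init__(self, records):
        # Index by vehicle identifier; records are never returned to callers.
        self._records = {r.vehicle: r for r in records}

    def keeper_is_at_least(self, vehicle: str, years: int, today: date) -> bool:
        """Answer a yes/no question without revealing name or date of birth."""
        dob = self._records[vehicle].keeper_date_of_birth
        # Subtract one if this year's birthday has not happened yet.
        had_birthday_yet = (today.month, today.day) >= (dob.month, dob.day)
        age = today.year - dob.year - (0 if had_birthday_yet else 1)
        return age >= years


register = PrivateRegister(
    [KeeperRecord("AB12CDE", "A. Keeper", date(1990, 5, 17))]
)
answer = register.keeper_is_at_least("AB12CDE", 21, today=date(2015, 9, 1))
```

The caller only ever sees `True` or `False`; the name and date of birth stay inside the register.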

Many registers are kept by government because the law instructs it to “establish”, “maintain” or “keep” a register. A typical example is the Gangmasters (Licensing) Act 2004 which says:

The Authority shall establish and maintain a register of persons licensed under this Act.

This has resulted in the Gangmasters Licensing Authority keeping a public list, maintained using specialised software and run on its own website.

Registers can also emerge to meet the operational needs of service providers. For example, because a visa may be sponsored by one of a number of different types of organisation, UK Visas and Immigration publish lists of sponsors such as sports governing bodies on GOV.UK in a PDF document.

Teams building digital services

We’ve spent a lot of time looking at a large number of such lists, and what’s become apparent is that they’re all held and maintained quite differently.

That causes problems.

Services have no standardised way of accessing the data in these lists, so they need to develop bespoke software to do it. Where a snapshot of a list is periodically published, a service may need to notice there’s a new list, and then download and process a copy. Not having direct access to the data through an API introduces potential errors, and a lag between a change being made to the data and that change being available to users of the service.

More importantly, a service team needs to be able to trust the integrity of the data. They need to know that the list will be kept up to date, and not disappear or change shape, breaking their service. Understandable documentation including clear licensing can help people use the data, but registers demand an owner, a registrar. That’s a person responsible for maintaining the list, with a clear process for quickly fixing issues with the data.

People providing services shouldn’t have to worry about data integrity. That’s the registrar’s job. That means registers should be designed so that data is always maintained and fixed at source, simplifying the design of services.

To reduce errors and reduce duplication, data in a register should reference other registers. To make that possible, registers should use standard names and formats, and each register entry should have a unique, stable identifier. Ideally, a register of limited companies should be able to trust Companies House to be the canonical source of company directors, and cite the Companies House company number rather than just a company name, which may change and can be easily misspelt (or spelled in different ways).
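
To illustrate why citing an identifier is more robust than copying a name, here is a minimal Python sketch. Both registers, the company number `01234567`, the licence ID and the function name are made up for the example; it does not describe any real register's data model.

```python
# Hypothetical company register, keyed by a stable company number (made-up value).
company_register = {
    "01234567": {"name": "Example Widgets Ltd"},
}

# A second, hypothetical register cites the company number, not the name.
licence_register = {
    "L-0001": {"company": "01234567", "status": "licensed"},
}


def licence_holder_name(licence_id: str) -> str:
    """Look up the current company name by following the cited identifier."""
    company_number = licence_register[licence_id]["company"]
    return company_register[company_number]["name"]


name_before = licence_holder_name("L-0001")

# The company is renamed at source; the licence entry does not need to change.
company_register["01234567"]["name"] = "Example Widgets (UK) Ltd"
name_after = licence_holder_name("L-0001")
```

Because the licence entry holds only the identifier, the rename is made once, at source, and every register that cites the company picks it up.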

A registrar, responsible for a register

Maintaining an authoritative, canonical register can be quite onerous. Currently, this often means building or procuring an ill-suited product or bespoke system, or having to remember to periodically upload documents to GOV.UK.

We want to change that. We have started thinking about what a register product might look like. It should support registrars, providing them with a standard way to establish and maintain a register and assure them that it’s being kept in good order. In particular, it should be able to prove the data hasn’t been tampered with.

Here’s a screenshot of an entry in a basic prototype (this will change as we iterate further):

[Screenshot of an entry in a basic prototype]

Simplifying and standardising how the data is stored frees registrars to concentrate on tasks specific to their domain, such as assessing and processing requests to change their register, or monitoring data quality.

Registers are for everybody

It’s not just people building services, or those in charge of data who are users of registers. A register should store a history of changes to itself and be open to independent scrutiny.

Government issues lots of artifacts, certificates, licences and other totems for information held in registers, which we should be able to check should things go wrong. We call these digital proofs: a digital register may supersede or expire your permission to do something, but it shouldn’t be able to later refute that permission was ever issued to you. Every change to a register is recorded with a digital fingerprint, and every fingerprint can be verified independently.
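
One way to picture these fingerprints is a simple hash chain. The Python sketch below is an assumption for illustration, not a description of the prototype: it uses SHA-256, and the function names and licence entries are hypothetical. Each fingerprint covers an entry plus the fingerprint before it, so anyone holding a copy of the log can recompute the chain and spot a rewritten entry.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder fingerprint used before the first entry


def fingerprint(previous: str, entry: dict) -> str:
    """SHA-256 over the previous fingerprint plus a canonical JSON encoding."""
    payload = previous + json.dumps(entry, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(log: list, entry: dict) -> None:
    """Record a change by appending; existing records are never overwritten."""
    previous = log[-1]["fingerprint"] if log else GENESIS
    log.append({"entry": entry, "fingerprint": fingerprint(previous, entry)})


def verify_log(log: list) -> bool:
    """Recompute every fingerprint from the start and compare with the log."""
    previous = GENESIS
    for record in log:
        if fingerprint(previous, record["entry"]) != record["fingerprint"]:
            return False
        previous = record["fingerprint"]
    return True


log = []
append_entry(log, {"licence": "L-0001", "status": "issued"})
append_entry(log, {"licence": "L-0001", "status": "expired"})
intact = verify_log(log)

# Rewriting history (pretending the licence was never issued) breaks the chain.
tampered = [dict(record) for record in log]
tampered[0] = {
    "entry": {"licence": "L-0001", "status": "never issued"},
    "fingerprint": log[0]["fingerprint"],
}
tampering_detected = not verify_log(tampered)
```

Here the expiry is recorded as a new entry rather than an overwrite, so the record that the licence was issued stays in the log.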

We believe moving from periodically publishing data to operating more standardised, open data will help everyone build better services – cheaper and faster, and across government, not just within a single organisation. We’ll continue to work with colleagues in departments and beyond, and we’ll blog regularly about what we are learning along the way.


24 comments

  1. Stefan

    This is all excellent stuff, but I was slightly confused by

    "Government issues lots of artifacts, certificates, licences and other totems for information held in registers, which we should be able to check should things go wrong."

    Are you saying that if things go wrong with the register, the totem (token?) can be used to correct it? Or that we can check the register if things should go wrong with the tokens? I hope you mean the second, but the wording is ambiguous. If the token is canonical, that would have all sorts of unhelpful implications.

    • Paul Downey

      Thanks Stefan. The ambiguity comes from our still being in discovery for the user needs for digital proofs.

      Technically we are aware of how protocols such as the blockchain demonstrate how proofs could be distributed, and certificate transparency demonstrates how to use distributed copies to highlight where a canonical source of truth has been tampered with, but these are only two of a number of different models for increasing trust in the integrity of records that we've started to explore.

  2. Sian Thomas

    Delighted FSA is involved in this. Helping make improvements to our 'registers' as well as looking at the Food Hygiene Ratings Scheme API, aspects of which others use as a proxy for a real register.

    Great to see you making progress, Paul and team, and we look forward to building services on the back of them.

    • Paul Downey

      Thanks for all your help Sian, we're still learning a lot from the great work you're leading at the Food Standards Agency!

  3. Ian Litton

    Would be great to see this linked up with the work we have done on attribute exchange - see:

    http://oixuk.org/wp-content/uploads/2015/08/WCC-2-alpha-white-paper-final-draft.pdf
    http://oixuk.org/wp-content/uploads/2015/08/WCC-2-alpha-technical-paper-final-draft.pdf
    http://oixuk.org/wp-content/uploads/2014/09/WCC-2-white-paper-FINAL.pdf

    There is a close fit between this and the API-enabled registers you describe

  4. Tim Blackwell

    "...but it shouldn’t be able to later refute that permission was ever issued to you. Every change to a register is recorded with a digital fingerprint, and every fingerprint can be verified independently."

    These very useful concepts might fruitfully be applied to all content published on GOV.UK.

  5. Will Avery

    It would be great to have an api for http://reports.ofsted.gov.uk/

    There's far more there than most people might guess.

    • Paul Downey

      Thanks Will, that's an excellent resource. We particularly like the way Ofsted share the same identifiers for educational establishments with the Department for Education and the Skills Funding Agency.

  6. William

    This is great. But wouldn't it be better if the standard verified attributes in the registers could be exchanged with individuals using secure personal data stores? That way I get to hold digital government-validated proofs (eg of verified address, driving licence, right to work, fishing licence, whatever). I can share and adjust my organ donation preferences, or prove I'm certified by HMG as over 21 without revealing anything else about myself.

    This is an inspiring post. "Organisation shall speak unto organisation via API" is a huge and transformational shift. But it's more inspiring (less Kafkaesque, and legally on surer ground re consent etc) when you add "organisation shall also speak direct unto individual via API". That's the data architecture we need for digital by default public services to give individuals real convenience and control.

    To achieve what you set out here will be huge.

    • David Moss

      Suppose that you are the proud owner of one of William's personal data stores and that your PDS includes proof that you are licensed by the CFA to offer your services as an investment manager.

      That attribute of yours can be checked by a prospective employer or client, William says, by interrogating your PDS.

      But that's not right.

      Because it is one of William's PDSs, you can turn on permission for the CFA to update your PDS and you can turn it off. If the CFA revoke your licence but you have taken away their permission to update your PDS, it will look to an interrogator as if you are still licensed when, in fact, you aren't.

      In that case, the only sensible course open to the interrogator is to check with the CFA. Not with your PDS. The usefulness of the PDS in attribute exchange is thereby diminished. Possibly to nil.

      You could say that the PDS isn't "canonical". It can't be canonical as long as you have the power to grant update permission and to revoke it. If you lose that power, then the PDS could be canonical but you would no longer be in control.

      But being in control is precisely the attribute that William offers. It's an enticing offer. Control over your own data. But is it in William's gift?

      • Adrian Hope-Bailie

        The ability to control your personal information need not prevent 3rd parties from being able to verify any claims you make in that information.

        As someone wishing to verify that William is licensed by the CFA I can query his PDS and be given a credential, digitally signed by the CFA, that attests to this. That credential should have a unique identifier that I can use to query the CFA directly to verify that this credential is still valid.

        Loosely coupled, but verifiable, links between credentials, their subjects and their issuers are important but need to be part of a framework that allows the subject (owner) to share them selectively and also prevent the issuer from knowing who the owner is sharing them with.

        This last point is key to preserving the subject's privacy and preventing a credential issuer or storage service (who's to say William's PDS is physically held by him?) from tracking what he does online.

        I would highly recommend anyone interested in this field look at (or better participate in) the work being incubated at the World Wide Web Consortium (W3C) regarding credentials: http://opencreds.org/

  7. Nimish Patel

    Thanks for an interesting blog. I do, however, feel that some clarification on terminology might be useful. You refer to viewing a Land Registry title as an example of a 'closed register' because some action is required before accessing the information.
    However from Land Registry's perspective, we describe it as an 'open register' because the information is available to anyone, albeit after paying a small fee. The Land Register was 'closed' prior to 1990, when information was only accessible under certain conditions.

    • Paul Downey

      Thanks Nimish, I understand your point and think it's an excellent way of highlighting the difficulties we face standardising names and terms across government! I know the Land Registry maintains an "open register", but having to pay to access individual entries, or indeed to find title numbers for an entry, makes it somewhat less open than having all the entries available on the web and downloadable in bulk. Maybe "throttled" or "controlled" might be a better word than "closed" for this case.

  8. David Moss

    Paul

    A.
    Registers are important, you say, in building platforms. Services must be able to trust the integrity of the registers they depend on. Interoperability requires registers to be standardised in many ways. For any given attribute, there should be just one canonical register. By "canonical", I take it that the register is not only accurate and up to date but also complete and authorised.

    I think that A is a fair summary of your blog post but I am very much open to correction.

    B.
    Consider GOV.UK Verify, the putative identity assurance platform. Being a platform, it requires a register. In fact it has four registers at the moment, maintained by Experian, Digidentity, the Post Office and Verizon. We have no way of assessing the integrity of their registers. As far as we know, the four so-called "identity providers" do not even aspire to interoperability through standardisation, let alone offer it. And there is no single canonical identity assurance register.

    It seems legitimate to infer that GOV.UK Verify is not a platform. Or have I misunderstood?

    • Paul Downey

      Verify is a platform shared by multiple services; there's no implication that a platform must use Registers.

      • Steve Jones

        "Verify is a platform shared by multiple services; there's no implication that a platform must use Registers."

        This is getting very confusing.

        Your post says Platforms need Registers (and Registrars), but your comment says they don't. Which is it?

  9. Eastmad

    Would people really look at a hash? Convert to ascii art or something?

    • Paul Downey

      Hi David, the register stores and presents data in its rawest form, and allows services to show hashes and other data in whatever form is discovered to work best for users.

  10. Paul Davidson

    This is a crucial development for Local Authorities looking to improve services, and transparency, using digital and platform techniques. A consistent use of ‘registers’ can allow local data and services to be discovered, joined up, and re-used.

    It's all about the standards. Standards for
    • data quality characteristics
    • persistent and unique identifiers
    • semantics and syntax

    … so that Local Authorities, local partner agencies, community and voluntary agencies, and local people and business, can confidently both publish and consume registers.

    Keen to get the Local Authority perspective fed in at the right time.

  11. Gesche Schmid

    Hi Paul,
    Local authorities are required to produce a range of registers. However, they often differ and cannot be combined or compared. Hence, local government realised many years ago that common standards, vocabularies and classifications are important for making registers comparable.

    The Local Government Association has therefore worked with local authorities to establish a set of lists that underpin and enable the linking and comparability of local government data.
    The LG standards include lists of local government functions, services, powers, duties, procurement classifications etc which are interlinked through common identifiers. Please see http://standards.esd.org.uk/. Those standards are currently managed by the LGA on behalf of the local government sector.
    These lists form de facto standards which are used by local authorities in registers for planning applications, inventory of datasets, registers of licensed premises, etc. (see http://opendata.esd.org.uk).
    We would welcome the chance to work with GDS to further extend and promote the use of those lists in registers.

  12. Phil Allen

    Thanks, Paul. I love things that make everyday life just a bit better. Are there any special properties of registers that you can exploit to offer consistency and timeliness for their subscribers?

    • Paul Downey

      Hi Phil, that's quite a big subject, but we see a commitment to maintaining a Register, and evolving registers in a way which doesn't break existing subscribers ("teams building digital services") as being essential.

  13. Michael Kunstel

    Hi Paul, interesting concepts you've been exploring.
    We've also been looking at similar new technologies for our needs.
    Some seem to fit some of the needs and use cases that you have.
    Would be great to meet up and compare notes.
