Innovations in data science provide huge opportunities to improve the way we develop policy, and design and deliver public services.
We need to make sure that we use data responsibly. So, last year, GDS published the beta version of the Data Science Ethical Framework.
The framework
Teams should feel confident that they are using data science responsibly. Policy makers also need to be well informed in order to take advantage of the insights data science provides. The framework helps them do this.
The current framework brings together good practice guidance in data science and gives advice beyond data protection legislation. It stipulates that data science work should adhere to 6 principles:
- Start with clear user need and public benefit
- Use data and tools which have the minimum intrusion necessary
- Create robust data science models
- Be alert to public perceptions
- Be as open and accountable as possible
- Keep data secure
Over the past year, we have collaborated with experts both inside and outside of government to improve the beta version of the framework. We worked with government data scientists as well as academics, civil society and the wider industry.
Working with the public sector
One organisation helping us to develop the framework is Essex County Council (ECC), as part of its work on the Essex Data programme, an Essex Partners initiative. Like others in the public sector, Essex Partners is exploring how to share and use data to improve services.
They are investigating whether they can develop an approach to predicting risk which might enable early intervention.
Liz Ridler, Delivery and Evaluation Lead in Public Service Reform at ECC, is leading this sensitive work in the Essex Data Programme to link pseudonymised personal data between partner organisations.
The information being shared is personal, from vulnerable groups in challenging circumstances, so ECC and its stakeholders recognise that it is vital to treat the data ethically.
Our work with Liz has taught us a great deal about the considerations users need to weigh. A good example is a key question Liz highlighted:
"If we use data to predict where those at high risk of domestic abuse were, before it happened, we could act to reduce the risk. Because we're talking about risk, are we pre-empting and labelling groups that haven’t actually experienced this outcome? Are we affecting an outcome that would otherwise not have occurred? On the other hand, failure to share data could lead to a negative outcome that could have been avoided – which is the ethical thing to do?"
Giving appropriate attention to the ethical considerations of research projects is not new for ECC, but doing so in the context of data science and predictive analytics is. Essex Partners has used the framework to address this challenging new area.
The checklist which forms part of the framework has been particularly useful. It helped the team to analyse the ethical issues around their ECC Domestic Abuse prototype and the Gangs, Violence and Vulnerability prototype.
The framework helped them structure conversations with key stakeholders involved in the project. As a result of these conversations, they’ve been able to demonstrate that the right modifications were built into the project approach. For example, to answer the question 'what is the quality of the data?', they explored issues around potential bias when making data-driven decisions.
Instead of analysing one dataset to produce recommendations for decision makers, they combined that dataset with other sources of information to create a more accurate analytical output. This is because a single dataset may contain a skewed sample that does not represent the wider population well enough.
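To make the point about skewed samples concrete, here is a minimal Python sketch. It is an illustration only and not the method Essex Partners used. The datasets, the "area" field, the reference shares and the function names are all hypothetical. It compares how far each group's share in a dataset sits from its share in the population served, first for one dataset alone and then for two datasets combined.

```python
from collections import Counter

def group_shares(records, key):
    """Return each group's share of the records as a fraction of the total."""
    counts = Counter(record[key] for record in records)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}

def max_share_gap(shares, reference):
    """Largest absolute gap between observed and reference group shares."""
    groups = set(shares) | set(reference)
    return max(abs(shares.get(g, 0.0) - reference.get(g, 0.0)) for g in groups)

# Hypothetical reference shares for the population served (invented figures).
reference = {"north": 0.5, "south": 0.5}

# Hypothetical dataset A: over-represents the "north" area.
dataset_a = [{"area": "north"}] * 8 + [{"area": "south"}] * 2
# Hypothetical dataset B: from another source, mostly covers "south".
dataset_b = [{"area": "north"}] * 2 + [{"area": "south"}] * 8

gap_a = max_share_gap(group_shares(dataset_a, "area"), reference)
gap_combined = max_share_gap(group_shares(dataset_a + dataset_b, "area"), reference)

print(f"Gap for dataset A alone: {gap_a:.2f}")
print(f"Gap for A and B combined: {gap_combined:.2f}")
```

In this made-up case the combined data matches the reference shares more closely than dataset A does on its own. Combining sources will not always help: if the extra source is skewed in the same direction, or repeats the same people, the gap may stay the same or grow, so a check like this is a prompt for discussion and not a guarantee.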
It’s essential that the programme sparks a shift in the perception of data projects and sharing data across Essex Partners. The framework is a practical tool that has helped the team do this while keeping the ethical issues at play in view, so that innovation does not compromise ethics.
What’s next?
The collaboration and feedback from people like Liz at ECC have been invaluable. We’ve learnt how important the framework is for structuring conversations between data science practitioners and policy makers. This promotes a culture change – essential for making better use of data across the public sector.
We’ve also held a series of roundtables to discuss the update. We have spoken to people across the industry and academics in the fields of data science, law and philosophy. We have also spoken to our primary users – government data scientists and policy makers.
We worked closely with the British Academy and Royal Society reviewers of the 'Data management and use: governance in the 21st century' report to learn from their expertise.
This process has allowed us to identify key areas for improvement in the framework.
We also want to clarify the framework’s purpose – to share the standards we expect data projects to follow, and to communicate them clearly in order to maintain trust.
A new iteration of the framework will be published soon. However, there will never be a ‘final’ version. It will continue to evolve as technology evolves.
We want to stay at the forefront of that conversation not only in the UK but globally, too. We will therefore continue to work with departments to ensure that the framework adds value.
Email Sarah if you'd like to feed in ideas for the next iteration of the framework.