9 January 2024

Why the UK census should not be replaced with alternative sources of data

By Richard Harris

Every ten years since 1801 – save for a wartime interruption in 1941 – the UK government has conducted a national census of England and Wales. This is a big event. The data collated in the last survey, in 2021, is still being published, with final reports only scheduled for 2025. Yet, doubts have emerged about whether the next one – in 2031 – will actually take place.

The Office for National Statistics (ONS) is currently preparing recommendations on the back of a public consultation about the future of population and migration statistics in England and Wales, conducted over four months in 2023. Scholars have expressed concern that the government intends to scrap the census altogether, in favour of using other, administrative sources of population data.

The issue is not whether administrative data can supplement census data – it undoubtedly can. However, the 85 scholars, signatories to an open letter published in October 2023, say that the government has not convincingly made the argument for administrative data being able to replace all the functions of the census.

They argue that using only alternative sources of information without the census to compare them against could ultimately lead to inaccuracies.

An unparalleled resource

The ten-yearly census aims to collect information about everyone who is resident on census night, the last of which was March 21, 2021, in England, Wales and Northern Ireland, and a year later (March 20, 2022) in Scotland, where it had been delayed due to Covid-19. It is this ambition to collect data about the complete population that makes the census unique, and unmatched by the much smaller social surveys.

The data collected in this way is crucial to understanding the changing social and demographic geographies of the UK. It is used by organisations, businesses, local authorities and academics to inform business and service planning, to map who is living where and to allocate funds in response to changing demands and needs.

But collecting it, then processing, storing and publishing it, is expensive. The ONS puts the cost of the 2021 census at around £900 million. Spread over the ten-year cycle, that may only work out at around £1.50 per person per year, but it is still a large sum of money – one that has not escaped the notice of the governments that fund it.

As well as the finances, there are also questions around the survey’s efficacy, when multiple organisations already gather citizen data as a matter of routine. As National Statistician for England and Wales Ian Diamond has put it, ‘We have reached a point where a serious question can be asked about the role the census plays in our statistical system.’

Alternative sources

The first question that arises is whether a form of data collection that originated in the 19th century might be radically modernised in the 21st. There have, of course, been changes to the census over the decades. Most of the data is now collected and disseminated electronically rather than on paper. There are also ever more ways to freely explore and visualise the data.

Further, the questions the census asks are updated over time. In 1991, an ethnicity question was included and, in 2001, a religious affiliation question was added. In 2021, changes were made to the gender identity variable.

The more important issue, however, is whether, in an era when other data about people and places is routinely collected by public (and private) organisations, we need the census at all.

The ONS’s consultation document, entitled ‘The future of population and migration statistics in England and Wales’, suggests that various sources of administrative data can be linked and collated to create what is, in effect, a pseudo-census. This is not a new idea.

Usefully, there is no reason to restrict these linkages to a ten-yearly cycle of updates. We could have more timely data reflecting changes to society as they happen, instead of waiting a decade or more for the next census to collect the data and make it available for analysis. That would be extremely useful for studying, understanding and mapping social and demographic change.

Potential for inaccuracy

A lot of effort has been made by the ONS to explore what it terms ‘census alternatives’ and to understand their potential advantages and disadvantages.

However, assuming that the census is the gold standard of population statistics – not perfect, but with data that provides information on all neighbourhoods in the UK and their populations – then, without that standard, it becomes harder to calibrate other sources of data and ensure that what they measure is an accurate reflection of social patterns and trends.

Imagine, for example, that we used the national pupil database to estimate the ethnic composition of neighbourhoods. As it records which schools pupils attend and which ethnic group they belong to, this very rich source of data has been used to show that ethnic segregation – the possibility that different ethnic groups choose different schools from one another – is falling in England.

It also records where these pupils live, so it has been used to calculate the percentages of pupils by ethnic group in any given neighbourhood. The obvious problem is that this calculation applies only to those who are of school age. The less obvious problem is that the national pupil database does not include information on fee-charging schools. In other words, the data it contains is incomplete.

There are ways, of course, to weight (that is, adjust) data, and to link it to other data, to improve its accuracy. And the ONS is unlikely to release anything it knows to be misleading. Generally, however, the more we zoom into such smaller data-sets to explore neighbourhood-level patterns and differences, the greater the possibility of inaccuracy.
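To make the point about coverage and weighting concrete, here is a minimal sketch in Python. Every figure in it is invented for illustration: the neighbourhood, the two groups "A" and "B", the school types and the benchmark totals are all hypothetical, and the helper functions are not drawn from any ONS method. It shows how a source that only covers state schools can misstate a group's share of local pupils, and how scaling to a census-style benchmark can correct that, provided such a benchmark exists.

```python
# Hypothetical neighbourhood. All counts below are invented for illustration.
# Keys are (ethnic group, school type); values are numbers of pupils.
true_counts = {
    ("A", "state"): 600,
    ("A", "fee_paying"): 200,
    ("B", "state"): 200,
    ("B", "fee_paying"): 0,
}


def share_of_group(counts, group):
    """Return the given group's share of all pupils in `counts`."""
    total = sum(counts.values())
    in_group = sum(n for (g, _), n in counts.items() if g == group)
    return in_group / total


def calibrate(counts, benchmark):
    """Scale each group's counts so its total matches the benchmark total."""
    group_totals = {}
    for (g, _), n in counts.items():
        group_totals[g] = group_totals.get(g, 0) + n
    return {
        key: n * benchmark[key[0]] / group_totals[key[0]]
        for key, n in counts.items()
    }


# An administrative source that, by assumption, only covers state schools.
admin_counts = {key: n for key, n in true_counts.items() if key[1] == "state"}

# A census-style benchmark: total school-age children per group (assumed known).
benchmark = {"A": 800, "B": 200}

true_share = share_of_group(true_counts, "A")
admin_share = share_of_group(admin_counts, "A")
weighted_share = share_of_group(calibrate(admin_counts, benchmark), "A")

print(f"true share of group A:     {true_share:.2f}")
print(f"admin-only estimate:       {admin_share:.2f}")
print(f"estimate after weighting:  {weighted_share:.2f}")
```

The correction only works here because the benchmark totals are taken as known. Without a complete count to supply them, there is nothing to weight the partial source against, which is the calibration problem described above.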

The great strength of the census is that it provides geographically granular data that is hard to replicate through other sources (and, of course, doing so also encounters issues of personal data protection).

Conversely, the census’s weakness is that it is not temporally granular. It provides a lot of geographically detailed data about people and places, but that information is updated infrequently.

We could, of course, have both: a traditional census and a range of administrative and survey data to draw upon too. As a geographer, interested in detailed understandings of where people live, that is my preference.

This would not reduce the cost of the census. But there are also social and economic costs in using data that lacks the geographical coverage that the census provides. Administrative data is good for measuring parts of the population, but it remains unclear whether those various parts come together well enough to measure the whole.

Even if they do, data that is reliable for use at a national, regional or sub-regional scale does not automatically offer accurate portrayals of specific local and community conditions. I agree with the signatories of the open letter that the government has not convincingly argued for scrapping the census.

This article is republished from The Conversation under a Creative Commons license.

Richard Harris is Professor of Quantitative Social Geography at the University of Bristol.

Columns are the author's own opinion and do not necessarily reflect the views of CapX.