
9 September 2022

Using machine learning to extract information about loneliness in older people from English adult social care administrative records

Sam Rickman, London School of Economics and Political Science, United Kingdom


The purpose of this paper is to establish the feasibility of using computational natural language processing methods to extract information from adult social care free text records. We start by extracting information about loneliness and social isolation, as this is an important area where data is currently limited.

Loneliness and social isolation are associated with worse health and care outcomes: they are as predictive of mortality as smoking, obesity or hypertension. Internationally, the WHO estimated following the Covid pandemic that 20-34% of older people in Europe, Latin America, China and the USA are lonely. The UK government published a loneliness strategy in 2018 aiming to “embed loneliness as a consideration across government policy”. At the same time, there have been changes in the commissioning of long-term care interventions in England that aim to increase social connectedness for older people with care needs, such as day centres.

It is difficult to robustly estimate from national survey data the proportion of individuals in England receiving statutory adult social care services who are lonely or socially isolated. Researchers have looked at local authority administrative care records, such as Care Act assessment forms, to extract information that cannot be gathered from national surveys. However, local authority administrative records do not generally record this information in structured form, such as a checkbox indicating that a person is lonely or socially isolated.

A significant proportion of local authority care records are recorded as free text notes. These notes often include information about social circumstances. This paper analyses 1 million free text case notes (5 million sentences) recorded over a 10-year period about a cohort of more than 3,000 users of statutory care services aged 65 and over. The goal is to classify whether each sentence states that an individual is lonely or socially isolated.

We use a variety of quantitative text processing methods, from simple word count models to more complex transformer-based neural network architectures developed by Google Brain and Facebook AI. Analysis is still ongoing, but early results indicate that it is possible to train a model to identify loneliness and social isolation in free text data with a high degree of accuracy. We plan to extend this model architecture to domains other than loneliness, and to conduct quantitative analysis of service receipt data using loneliness as an independent variable.
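
To make the sentence-level classification task concrete, the following is a minimal sketch of what a simple word count model could look like. It is an illustration only, not the method used in the paper: the abstract does not say which word count model was used, so a multinomial naive Bayes classifier with add-one smoothing is assumed here. The class name `WordCountClassifier`, the helper `tokenize` and the six training sentences are all hypothetical and invented for this example; none of them come from the study's data.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into word tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


class WordCountClassifier:
    """Hypothetical multinomial naive Bayes over word counts (add-one smoothing)."""

    def fit(self, sentences, labels):
        self.classes = sorted(set(labels))
        self.class_counts = Counter(labels)
        # Count how often each word appears in sentences of each class.
        self.word_counts = {c: Counter() for c in self.classes}
        for sentence, label in zip(sentences, labels):
            self.word_counts[label].update(tokenize(sentence))
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        return self

    def log_scores(self, sentence):
        """Return an unnormalised log-probability for each class."""
        total = sum(self.class_counts.values())
        scores = {}
        for c in self.classes:
            score = math.log(self.class_counts[c] / total)
            denom = sum(self.word_counts[c].values()) + len(self.vocab)
            for word in tokenize(sentence):
                if word in self.vocab:  # words never seen in training are ignored
                    score += math.log((self.word_counts[c][word] + 1) / denom)
            scores[c] = score
        return scores

    def predict(self, sentence):
        scores = self.log_scores(sentence)
        return max(scores, key=scores.get)


# Invented toy sentences (NOT real case notes).
# Label 1 = sentence states loneliness or social isolation, 0 = it does not.
TRAIN = [
    ("She says she feels lonely since her husband died", 1),
    ("He is isolated and has no visitors", 1),
    ("She feels lonely and rarely leaves the house", 1),
    ("Carer visited and assisted with medication", 0),
    ("Occupational therapy assessment completed today", 0),
    ("Family visited and assisted with shopping", 0),
]

model = WordCountClassifier().fit(
    [sentence for sentence, _ in TRAIN],
    [label for _, label in TRAIN],
)

print(model.predict("He feels lonely and isolated"))
print(model.predict("Assessment completed and medication reviewed"))
```

A model of this kind only counts words, so it cannot distinguish "she is lonely" from "she is not lonely"; this is one reason a study of this type might compare it against transformer-based models that take word order and context into account.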
