On 3 February 2020 the Department for Work and Pensions (the DWP) announced that full implementation of the redesigned welfare system ‘Universal Credit’ would be delayed until 2024. This announcement followed widespread and consistent criticism of DWP benefit policies, which have been linked to rising food bank use, at least 69 suicides, and multiple deaths of disabled benefit claimants. One such case was that of Errol Graham, a disabled man who starved to death when his benefits were stopped. When his body was discovered by the bailiffs who had come to his home to evict him, he weighed little over four stone, echoing the case of Stephen Smith (reported in 2019), who weighed just six stone and could barely walk when he was found ‘fit to work’ by the DWP.
Such cases prompted Professor Philip Alston (the United Nations Special Rapporteur on extreme poverty and human rights) to visit the UK in 2018 to report on the impact that the austerity measures implemented since 2010 had had on the disabled. One of the aspects of these measures that he repeatedly emphasised in his resulting report was the extent to which the system for claiming employment and support allowance (ESA) had been designed to be difficult. This difficulty, he explained, was generated by the system’s ‘digital by default’ design, which was intended to work as a ‘digital barrier’ both to put off applicants from seeking the help they needed and to help ensure application failure. He further commented that:
There is nothing inherent in Artificial Intelligence and other technologies that enable automation that threatens human rights and the rule of law. The reality is that governments simply seek to operationalize their political preferences through technology; the outcomes may be good or bad. (p.11)
Alston is right. As I argue in my book Measuring difference: numbering normal, setting the standard for disability in the interwar period, technologies can and do feature innate political preferences. This is a crucial contention of much pioneering work in the field of science and technology studies: technologies are neither neutral nor objective and can even be overtly political. For example, biases are embedded in dataset compilation through choices of categories for inclusion and exclusion. We are becoming more aware of the potential ramifications of this in machine-learning software, which amplifies the stereotypes and biases embedded in datasets.
Technology perceived as unbiased or ‘objective’ has been used to control and manage the messy variability of human bodies. As I have previously discussed in both The Lancet Respiratory Medicine and the British Journal for the History of Science, the spirometer was a crucial tool used to moderate compensation for respiratory disability in miners. Spirometers were used from the late nineteenth century to measure how much air an individual could exhale and to express this in numerical terms as ‘vital capacity’, taken to represent lung capacity. These measurements were later developed to account for the air that remains in the lungs, and spirometric tests are now timed, but initially the spirometer simply measured exhalatory ability. It was seized upon as a tool to provide evidence of levels of respiratory disease in numerical terms which could be used in the complex compensation network for miners.
The fact that mine-workers were unduly impacted by pulmonary disease had been recognised since the early nineteenth century, but the cause of so-called ‘black lung disease’ was often disputed or merged with tuberculosis. Miners were not offered compensation for industrial disease until 1929 with the enactment of the Various Industries (Silicosis) Scheme of 1928, far later than workers in other industries who had largely been protected by the 1906 UK Workmen’s Compensation Act. Moreover, the 1928 scheme only covered cases of death or total disability until 1931, when the introduction of partial disability necessitated development of a new diagnostic classification scheme. This was a challenge for the medical community because of the difficulty of correlating breathlessness with physiological markers.
This is still a challenge, especially now, as reports circulate about Covid pneumonia patients with deathly low oxygen saturation but without any apparent respiratory distress. Breathlessness is a uniquely challenging experience to capture and measure. The Wellcome-funded Life of Breath project has been designed to explore this as part of its investigation into how the humanities can shed light on the breath and breathing. As a result, many of its publications have illuminated the personal and intangible nature of breathlessness. Such research is indicative of the increasing awareness of the disconnect between the personal and subjective individuality of breathlessness and its numerical correlation. Recent neuroimaging studies have shown that an individual’s past experiences, expectations and personal psychology determine their experience of breathlessness. Both mind and body process breathlessness, and its severity does not often correlate with disease stage. This was a huge problem for the Medical Research Council in 1936, when they were asked by the Home Office and the Mines Department to examine pulmonary disease in coalminers working on the South Wales coalfields.
Between 1936 and 1942 medical surveys were carried out at an anthracite colliery in South Wales under the supervision of Dr Philip D’Arcy Hart and Dr Edward Aslett. They inspected 560 of the men working there, both radiologically and through the use of clinical tests, including spirometric measurements of lung volume. Their resulting report conclusively demonstrated that there was a link between exposure to coal dust and respiratory disability, and from that point on there was widespread recognition of a disease due to coal dust that was distinct from tuberculosis or silicosis.
At this point, the spirometer’s measurements were often used as though they were representative of a measure of normal breathing. Yet using this measure as representative of health or even levels of breathlessness was problematic. Moreover, using spirometry to diagnose pneumoconiosis (black lung disease) necessitated a definition of ‘normal’ with which to make comparison. The normal adult male subjects used as controls for these determinations were in fact taken from miners working in the colliery of the investigation. That is, the data sets used a normal standard set by apparently healthy miners rather than a non-mining control group. The spirometry test took its measure of normalcy from the very population in which abnormality was already apparent.
This disguised the levels of illness present within that population and, since causation from dust or disease could not be established, it followed that the problem of black lung must be related to the essential constitution of the miner. Attempts to clarify normal reference values were marked by attempts to explain variability in lung function through racial difference. The MRC’s original investigation considered whether the Welsh could be a separate racial group, which would explain their high levels of lung disease. This idea was only rejected because some of the men at the colliery they examined had English parentage.
Attributing variability in lung function to racial difference was eventually enshrined in spirometric measurements by the MRC Pneumoconiosis Research Unit in South Wales. In 1974 the MRC refined their measurements to allow them to ‘correct’ for racial difference using a scaling or correction factor of 13%. This reinforced the idea that white lung function was normal lung function and, as historian Lundy Braun has established in her work Breathing Race into the Machine, this has had far-reaching effects in both the compensation system and in the promotion of the thesis that inequality between the races was biological rather than environmental. Normal breathing was never for all; rather, the spirometer was employed to enhance the differences between us. Spirometric data sets were specifically constructed around groups, which promoted the idea of normalcy within certain social groups (such as coalminers). This has had drastic consequences in impeding the availability of occupational compensation for respiratory disability.
Disability data gaps have been similarly distorting of normal capabilities. Such omissions are evident in the data sets used in the emerging field of audiometry during the interwar period. Data sets which excluded those with imperfect hearing meant that the average threshold, which represented normalcy, was distorted. Thus, the line of normalcy was artificially high, and the expanse of those categorised as deaf was too broad. This situation stemmed not from innovations in medicine, but rather from the British Post Office, the massive bureaucracy then in charge of a nationalised telephone service. To test the effectiveness of their telephone service, the Post Office developed an ‘artificial ear’. However, the artificial ear’s representation of normal was in fact ‘the ideal’ (eight normal men with good hearing) to the detriment of those at the outer edges of a more representative average curve. This device was designed as a way of efficiently and objectively measuring and reproducing sound quality without human involvement. This allowed the Post Office to manage the variability of hearing and standardise the norms of human hearing. However, designating the standard of normal hearing in such a narrow mechanistic fashion resulted in increased disconnect between the objective measurement of hearing (and deafness) and the subjective correlate.
Diagnosis of deafness has historically been difficult, and, as historian Jaipreet Virdi explains, one of the ways that nineteenth-century aurists attempted to legitimise their work as scientific was by appealing to their use of technology. Later in the century otology established itself as a specialism and its practitioners began using tuning forks to establish both frequency loss and type of hearing loss. However, it was the 1879 invention of the audiometer by Welsh scientist David Edward Hughes that ushered in the means to make large-scale quantitative surveys on hearing levels in the population.
Crucially, the audiometer produced an audiogram, through which otologists could establish at last the ‘facts’ of ‘normal hearing’. The audiogram rendered the quantitative measurement of hearing in graphical form, allowing for the recording, reproduction, and graphical comparison of hearing. For some of those advocating for its use, the value of this inscription for research purposes lay precisely in the fact that it bypassed the need to rely on the testimony of those with hearing loss. Related to this was the audiogram’s status as an ‘objective test’ that could be used to detect malingering, with which deafness had long been associated. The suspicion attached to deafness in the inter-war years was compounded by its invisibility, alongside the difficulty of measuring hearing or its absence ‘objectively’.
Hearing loss thus posed a problem related both to the subjectivity of the individual body and its invisibility outside of the individual body. The lack of tests to catch supposed malingerers became particularly problematic in the First World War because of the difficulty of diagnosing so-called ‘hysterical’ deafness in soldiers. Wartime hearing loss was considered by many psychiatrists to be psychosomatic or ‘functional’, which meant that the underlying pathological cause could not be seen, but, crucially, was supposed to exist. Such an ideology was in line with psychiatry’s long-standing insistence that there was a fundamental bodily cause for all mental illness. As it became increasingly apparent that shell shock had psychological origins, the idea that hearing loss was also psychological gained increased credence within psychiatry. This led to a wave of new theories about the causes of hearing loss, and corresponding new treatments to test and treat the malingerers. However, differentiating between malingering and hysteria was itself difficult. The category of ‘hysterical’ deafness was particularly fraught in the military context, as medical officers were generally highly suspicious of ‘malingering’ and hysteria was often considered to simply be a form of unconscious malingering.
The audiometer offered a more reliable way to test for malingering, and instructions to this effect were given as part of the kit of a 1940s commercial Amplivox audiometer now held by the Thackray Medical Museum in Leeds. Amplivox was a successful hearing aid company and would have used an audiometer primarily to prescribe hearing aids. However, its accompanying instructions emphasised its usefulness to those involved in moderating contested hearing loss and explained that ‘malingerers feign deafness in various degrees and for various reasons’. Several strategies could be adopted to detect and confound the malingerer, including hiding the front panel from the patient’s view, using the tone interrupter so the patient would be unable to remember previous tone intensities, and switching tones from ear to ear and from air to bone conduction, which ‘easily confuses the subject and makes it practically impossible to deceive the operator’.
Similarly, the audiometer enabled detection of those who could ‘pass’ as hearing by lip-reading. This was especially valuable in the testing of schoolchildren, who could be tested rapidly and in larger groups with the audiometer. This resulted in an increase in the more ‘precise’ sorting of children, ‘who were found to be defective’ and so ‘re-classified’. In this way, the audiometer worked as an inscription device, an apparatus whose end-product was the audiogram and the creation of ‘Normal Hearing’ and, crucially, its counterpart. We can see in this way how the power of the audiometer as a classifying device influenced the social construction of disability, which was reinforced as the audiometer created more data on ‘normal hearing’.
Thanks in part to the Post Office and its artificial ear, it was during the inter-war period that the audiometer was repeatedly lauded as a significant advance in the diagnosis and classification of deafness. The audiometer’s ability to provide quantitative units of hearing ability and to allow for their comparison was especially valuable in achieving this. This was a clear advantage compared to the more commonly used voice test, which could not be standardised and necessitated the involvement of the clinicians’ own (variable) body. The audiometer was also critical to the inter-war commitments of deaf educators and especially fuelled the legitimacy of the commitment to oralism, an educational method that prioritised speech and lip-reading to ‘normalise’ deaf children and force their integration into the hearing world.
It was further embraced by the industrial and military fields as a way of judging who should get compensation for hearing loss, as it was a useful tool for identifying ‘impaired hearing’ and thus deciding on the suitability of insurance candidates, Army and Navy recruits, and those working in specific industries such as on railways or steamships. Indeed, the field of audiometry frequently defines its origins as truly beginning in the Second World War, when it was embraced by the U.S. military to provide a numerical assessment of hearing loss before and after service, which was critical to managing compensation claims. In the military it was necessary to test many people quickly and have a numerical result that could be compared before and after service in order to award or refuse compensation for noise-induced hearing loss. Audiology then solidified as a field through the work done with deafened ex-servicemen during the war. That compensation required the creation of numbers to indicate critical thresholds of disability is a key component of my book’s main thesis, as this process worked to categorise disability, and did so in a way which omitted the need for individual testimony.
The audiometer was embraced as an objective tool to define noise limits and thresholds. Its utilisation of fixed thresholds for the normal ranges of hearing was elevated as a tool for testing both noise levels and hearing loss, I argue, because it provided an objective numerical inscription, which could be used to guard against malingering and to negotiate compensation claims for hearing loss.
In the interwar period, breathlessness and hearing loss that did not clearly correlate with clinical evidence were often dismissed as ‘hysterical’. Tools like the spirometer and audiometer defined disability as measurable pathology which trusted instruments could measure with objectivity and accuracy. This determination to consider bodily processes as quantifiable was driven by the need to compensate for respiratory disability and hearing loss occasioned by warfare or industry. Spirometry and audiometry were therefore embraced as objective ways of testing, which could confound malingerers and allow for testing of large groups of people.
The standards embedded in these instruments created strict, but ultimately random thresholds of normalcy and abnormalcy. Considering these standards from a long historical perspective reveals how these dividing lines shifted when pushed. The necessary pressure was brought to bear by diverse and varied impacts: different datasets, newly created categorisation systems, updated technologies, and the conscious and unconscious manipulation of political actors working to negotiate compensation frameworks. Highlighting these issues is not intended to undermine or call into question the necessary procedures of biomedicine without offering any kind of solution, an oft-repeated criticism of medical humanities research. Nonetheless, the prevalent assumption in the clinic is that patients’ sensory experience of a symptom is directly related to measurable physiological disease. We trust that the relationship between symptom experience and measurement is accurate. Historically, our faith in this accurate relationship has been misguided.
Indeed, amidst the Covid-19 pandemic we have been encouraged to ‘count symptoms’ rather than tests as it has become clear that we are unable to rely on objective biomedical testing. Health Secretary Matt Hancock’s response has been to announce a new NHS app to track coronavirus symptoms and trace those who have been in contact with infected carriers. This kind of big data has been the lingua franca of the pandemic as we count cases, count deaths, and compare numbers. Simultaneously, divergent standards between countries in the manner of calculating death rates have been linked to political expediency. In Britain, for instance, the original decision to separate care home deaths from hospital deaths was criticised for artificially lowering the death rate.
But numerical data alone can be reductive or incomplete. For instance, countries like France do not gather statistics based on ethnicity, so the extent to which Covid-19 has disproportionately affected BAME populations is not apparent there in the way it is in the UK. However, it is critical that these numbers do not elide the realities of the lives that they represent. Historically, the use of racial categories in the analysis of bodies has tended to justify and cement the position of ‘inferior’ bodies. And there is still a risk that these statistics will entrench biological essentialism and allow us to ignore the ways in which the environment and processes of discrimination impact on health inequities. Similarly, preliminary data from China, Italy, and South Korea have indicated that men have a higher fatality rate from Covid-19, and while some researchers initially highlighted biological factors such as a stronger female immune response, others have argued that this difference comes down to the gendered differences in uptake of smoking in those countries.
Understanding health inequity does not come from statistics alone. If it did, the linking of austerity to 130,000 deaths would have prompted the reversals of cuts we are only now seeing. Some numbers have a greater impact than others, and this is partly linked to the way we contextualise these numbers with stories. We cannot return to ‘normal’ without reversing the policies that have devastated the disabled community over the last ten years. This reckoning must be done through the elevation and prioritisation of the voices of the disabled, who should be integral to the way the DWP works in the future and paid appropriately for their expert contributions.
There is now greater public awareness that ‘following the science’ does not and cannot promise objectivity or fairness. As I have demonstrated here, scientific objectivity has long been wielded by large bureaucracies, often acting as arms of government, in order to systematically deny benefits to the disabled whilst maintaining the pretence of fairness in assessment. Alston’s astute and incisive report was a damning indictment of the negative impact that government austerity measures have had on the disabled in Britain and the way that the DWP’s procedures have been designed to present obstacles to compensation. Those denied welfare are then again denied recourse to complain against these impersonal, supposedly ‘objective’ tools of the state. Algorithmic unfairness and ‘digital by default’ systems have exacerbated these historical discriminations. Nationalised broadband would help, but an app will not save us. We need stories just as much as we need statistics. Non-digital complementary methods designed to reveal the people behind the numbers should be developed in consultation with those impacted by these policies. This should be supplemented by changes to DWP staffing policies, which ought to follow the model of the NHS: frontline staff imbued by their managers and by DWP’s Secretary of State (currently Thérèse Coffey) with compassion, not suspicion, for their vulnerable fellow citizens.
Note on terminology: I use ‘disabled’ rather than ‘people with disabilities’ in line with practices from disability studies intended to highlight the ways in which we are disabled (e.g. by people, practices, workplaces) and so as not to perpetuate the idea that the word is a pejorative.
Braun, L., Breathing Race into the Machine: The Surprising Career of the Spirometer from Plantation to Genetics (Minneapolis: University of Minnesota Press, 2014).
Carel, H., Macnaughton, J., and Dodd, J., ‘Invisible suffering: breathlessness in and beyond the clinic’, The Lancet: Respiratory Medicine, 3:4 (2015), 278-279.
Faull, O. K., Hayen, A., and Pattinson, K. T. S., ‘Breathlessness and the body: Neuroimaging clues for the inferential leap’, Cortex, 95 (2017), 211-221.
McGuire, C., ‘“X-Rays don’t tell lies”: The Medical Research Council and the Measurement of Respiratory Disability 1936-1945’, The British Journal for the History of Science, 52:3.
McGuire, C., ‘The Categorisation of Hearing Loss in Inter-War Telephony’, in G. Balbi and C. Berth (eds), Special Issue ‘A New History of The Telephone’, Journal of History and Technology, DOI:10.1080/07341512.2019.1652435.
Virdi [Virdi-Dhesi], ‘Curtis’s Cephaloscope: Deafness and the Making of Surgical Authority in London, 1816-1845’, p.349.