How Algorithms Discriminate Based on Data They Lack: Challenges, Solutions, and Policy Implications

Betsy Anne Williams, Catherine F. Brooks, Yotam Shmargad

Research output: Contribution to journal › Article › peer-review

45 Scopus citations


Organizations often employ data-driven models to inform decisions that can have a significant impact on people’s lives (e.g., university admissions, hiring). In order to protect people’s privacy and prevent discrimination, these decision-makers may choose to delete or avoid collecting social category data, like sex and race. In this article, we argue that such censoring can exacerbate discrimination by making biases more difficult to detect. We begin by detailing how computerized decisions can lead to biases in the absence of social category data and, in some contexts, may even sustain biases that arise by random chance. We then show how proactively using social category data can help illuminate and combat discriminatory practices, using cases from education and employment that lead to strategies for detecting and preventing discrimination. We conclude that discrimination can occur in any sociotechnical system in which someone decides to use an algorithmic process to inform decision-making, and we offer a set of broader implications for researchers and policymakers.

Original language: English (US)
Pages (from-to): 78-115
Number of pages: 38
Journal: Journal of Information Policy
Issue number: 1
State: Published - Mar 2018


  • Algorithmic discrimination
  • Omitted variables
  • Personal data
  • Signaling
  • Statistical discrimination

ASJC Scopus subject areas

  • Sociology and Political Science
  • Public Administration

