Learning from human-generated lists

Kwang Sung Jun, Xiaojin Zhu, Burr Settles, Timothy T. Rogers

Research output: Contribution to conference › Paper › peer-review

3 Scopus citations


Human-generated lists are a form of non-iid data with important applications in machine learning and cognitive psychology. We propose a generative model - sampling with reduced replacement (SWIRL) - for such lists. We discuss SWIRL's relation to standard sampling paradigms, provide the maximum likelihood estimate for learning, and demonstrate its value with two real-world applications: (i) In a "feature volunteering" task where non-experts spontaneously generate feature⇒label pairs for text classification, SWIRL improves the accuracy of state-of-the-art feature-learning frameworks. (ii) In a "verbal fluency" task where brain-damaged patients generate word lists when prompted with a category, SWIRL parameters align well with existing psychological theories, and our model can classify healthy people vs. patients from the lists they generate.
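
The abstract does not spell out the sampling procedure, so the following is only a minimal sketch of one reading of "sampling with reduced replacement". It assumes that each item has a positive weight, that items are drawn with probability proportional to their current weight, and that a drawn item's weight is then multiplied by a discount factor `alpha` in [0, 1]. The function name `swirl_sample`, its parameters, and the fixed list length are illustrative assumptions, not the authors' implementation.

```python
import random


def swirl_sample(weights, length, alpha, rng=None):
    """Draw up to `length` items; a drawn item's weight is multiplied by `alpha`.

    Assumed reading of "reduced replacement" (not taken from the abstract):
    alpha = 1 behaves like sampling with replacement,
    alpha = 0 behaves like sampling without replacement.
    """
    rng = rng or random.Random()
    current = dict(weights)  # copy so the caller's weights are untouched
    drawn = []
    for _ in range(length):
        # Only items with remaining positive weight can still be drawn.
        candidates = [item for item, w in current.items() if w > 0]
        if not candidates:
            break
        pick = rng.choices(
            candidates, weights=[current[item] for item in candidates], k=1
        )[0]
        drawn.append(pick)
        current[pick] *= alpha  # discount the item just produced
    return drawn


if __name__ == "__main__":
    animals = {"dog": 5.0, "cat": 4.0, "lion": 2.0, "okapi": 0.5}
    print(swirl_sample(animals, length=6, alpha=0.2, rng=random.Random(0)))
```

Under these assumptions, intermediate values of `alpha` allow repeats but make them less likely, which is one way the model could sit between the two standard sampling paradigms the abstract mentions.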

Original language: English (US)
Number of pages: 9
State: Published - 2013
Externally published: Yes
Event: 30th International Conference on Machine Learning, ICML 2013 - Atlanta, GA, United States
Duration: Jun 16 2013 to Jun 21 2013


Conference: 30th International Conference on Machine Learning, ICML 2013
Country/Territory: United States
City: Atlanta, GA

ASJC Scopus subject areas

  • Human-Computer Interaction
  • Sociology and Political Science

