Attention-guided classification of abnormalities in semi-structured computed tomography reports

Khrystyna Faryna, Fakrul I. Tushar, Vincent M. D'Anniballe, Rui Hou, Geoffrey D. Rubin, Joseph Y. Lo

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Scopus citations

Abstract

Lack of annotated data is a major challenge for machine learning algorithms, particularly in radiology. Algorithms that can extract labels quickly and precisely are in high demand. Weak supervision is a compromise solution, particularly for imaging modalities such as computed tomography (CT), where the number of slices can reach 1000 per case. Radiology reports store crucial information about clinicians' findings and observations in CT slices. Automatically generating labels from CT reports is not trivial because of the complexity of sentences and the diversity of expression in free-text narration. In this study, we focus on abnormality classification in the lungs, liver and kidneys. First, a rule-based model is used to extract weak labels at the case level. Next, an attention-guided recurrent neural network (RNN) is trained to perform binary classification of radiology reports, that is, to decide whether the organ is normal or abnormal. Additionally, a multi-label RNN with an attention mechanism is trained to perform binary classification by aggregating its outputs for four representative diseases per organ (lungs: emphysema, mass-nodule, effusion and atelectasis-pneumonia; liver: dilatation, fatty infiltration-steatosis, calcification-stone-gallstone, lesion-mass; kidneys: atrophy, cyst, stone-calculi, lesion) into a single abnormal class. Performance was evaluated using the area under the receiver operating characteristic (ROC) curve (AUC) on 274, 306 and 278 reports for lungs, liver and kidneys, respectively, manually annotated by radiology experts. The change in performance was also evaluated for different training dataset sizes for lungs. The AUCs were as follows: multi-label pretrained models: lungs 0.929, liver 0.840, kidneys 0.844; multi-label models: lungs 0.903, liver 0.848, kidneys 0.906; binary pretrained models: lungs 0.922, liver 0.826, kidneys 0.928.
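
The aggregation and evaluation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not state how the four per-disease outputs are combined, so taking the maximum probability is an assumption made here for illustration, and the function names `aggregate_abnormal` and `roc_auc` and all numeric values are hypothetical. The AUC is computed from its pairwise definition (the fraction of abnormal/normal report pairs ranked correctly, with ties counted as one half).

```python
from typing import Dict, List, Sequence


def aggregate_abnormal(disease_probs: Dict[str, float]) -> float:
    """Collapse per-disease probabilities into one 'abnormal' score.

    Assumption: max-pooling. The abstract does not specify the rule.
    """
    return max(disease_probs.values())


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC via the pairwise definition; tied pairs count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one abnormal and one normal report")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up multi-label outputs for four lung reports (illustrative only).
reports: List[Dict[str, float]] = [
    {"emphysema": 0.05, "mass-nodule": 0.90, "effusion": 0.10, "atelectasis-pneumonia": 0.20},
    {"emphysema": 0.02, "mass-nodule": 0.04, "effusion": 0.03, "atelectasis-pneumonia": 0.01},
    {"emphysema": 0.60, "mass-nodule": 0.10, "effusion": 0.30, "atelectasis-pneumonia": 0.20},
    {"emphysema": 0.10, "mass-nodule": 0.70, "effusion": 0.05, "atelectasis-pneumonia": 0.15},
]
# Hypothetical expert annotations: 1 = abnormal, 0 = normal.
expert_labels = [1, 0, 1, 0]

scores = [aggregate_abnormal(r) for r in reports]
auc = roc_auc(expert_labels, scores)
print(f"AUC = {auc:.2f}")
```

In this toy example one normal report scores higher than one abnormal report, so three of the four abnormal/normal pairs are ranked correctly. Other aggregation rules, such as a noisy-OR over the disease probabilities, would be equally plausible readings of the abstract.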

Original language: English (US)
Title of host publication: Medical Imaging 2020
Subtitle of host publication: Computer-Aided Diagnosis
Editors: Horst K. Hahn, Maciej A. Mazurowski
Publisher: SPIE
ISBN (Electronic): 9781510633957
DOIs
State: Published - 2020
Externally published: Yes
Event: Medical Imaging 2020: Computer-Aided Diagnosis - Houston, United States
Duration: Feb 16 2020 to Feb 19 2020

Publication series

Name: Progress in Biomedical Optics and Imaging - Proceedings of SPIE
Volume: 11314
ISSN (Print): 1605-7422

Conference

Conference: Medical Imaging 2020: Computer-Aided Diagnosis
Country/Territory: United States
City: Houston
Period: 2/16/20 to 2/19/20

Keywords

  • attention rnn
  • computed tomography
  • rule-based model
  • weak supervision

ASJC Scopus subject areas

  • Electronic, Optical and Magnetic Materials
  • Atomic and Molecular Physics, and Optics
  • Biomaterials
  • Radiology, Nuclear Medicine and Imaging
