High-Throughput Extraction of Phase–Property Relationships from Literature Using Natural Language Processing and Large Language Models

Luca Montanelli, Vineeth Venugopal, Elsa A. Olivetti, Marat I. Latypov

Research output: Contribution to journal › Article › peer-review

Abstract

Consolidating published research on aluminum alloys into insights about microstructure–property relationships can simplify alloy design and reduce its cost. For many heat-treatable alloys that derive superior properties from precipitation, one critical design consideration is the phases present as key microstructure constituents, because they can have a decisive impact on the engineering properties of alloys. Here, we present a computational framework for high-throughput extraction of phases and their impact on properties from scientific papers. Our framework includes transformer-based and large language models to identify sentences with phase–property information in papers, recognize phase and property entities, and extract phase–property relationships and their “sentiment.” We demonstrate the application of our framework on aluminum alloys, for which we build a database of 7,675 phase–property relationships extracted from a corpus of almost 5,000 full-text papers. We comment on the extracted relationships based on common metallurgical knowledge.
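
The three stages named in the abstract (sentence identification, entity recognition, and relationship extraction with sentiment) can be illustrated with a minimal Python sketch. This is not the authors' implementation: simple keyword rules stand in for the transformer-based and large language models, and every name here (`PHASES`, `PROPERTIES`, the cue lists, `PhasePropertyRelation`, `extract_relations`) is a hypothetical placeholder chosen for illustration only.

```python
import re
from dataclasses import dataclass

# Hypothetical vocabularies and cue words, assumed for illustration only.
# The paper uses learned models, not keyword lists.
PHASES = ("Mg2Si", "Al2Cu", "Al3Zr")
PROPERTIES = ("hardness", "ductility", "corrosion resistance", "strength")
POSITIVE_CUES = ("increase", "improve", "enhance")
NEGATIVE_CUES = ("decrease", "reduce", "degrade", "deteriorate")


@dataclass(frozen=True)
class PhasePropertyRelation:
    phase: str
    property_name: str
    sentiment: str  # "positive", "negative", or "neutral"
    sentence: str


def split_sentences(text):
    """Naive splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def find_entities(sentence, vocabulary):
    """Stand-in for entity recognition: case-insensitive substring match."""
    lowered = sentence.lower()
    return [term for term in vocabulary if term.lower() in lowered]


def classify_sentiment(sentence):
    """Stand-in for sentiment extraction: first matching cue list wins."""
    lowered = sentence.lower()
    if any(cue in lowered for cue in POSITIVE_CUES):
        return "positive"
    if any(cue in lowered for cue in NEGATIVE_CUES):
        return "negative"
    return "neutral"


def extract_relations(text):
    """Keep sentences naming both a phase and a property, then pair them."""
    relations = []
    for sentence in split_sentences(text):
        phases = find_entities(sentence, PHASES)
        properties = find_entities(sentence, PROPERTIES)
        if not phases or not properties:
            continue  # stage 1: sentence carries no phase-property information
        sentiment = classify_sentiment(sentence)
        for phase in phases:
            for prop in properties:
                relations.append(
                    PhasePropertyRelation(phase, prop, sentiment, sentence)
                )
    return relations


if __name__ == "__main__":
    sample = (
        "Fine Mg2Si precipitates increase hardness. "
        "Coarse Al2Cu particles reduce corrosion resistance. "
        "The alloy was cast in 2019."
    )
    for relation in extract_relations(sample):
        print(relation.phase, relation.property_name, relation.sentiment)
```

The sketch pairs every phase with every property found in the same sentence and assigns one sentiment per sentence, which is a deliberate simplification; resolving which phase affects which property within a sentence is the kind of task the abstract assigns to the language models.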

Original language: English (US)
Pages (from-to): 396-405
Number of pages: 10
Journal: Integrating Materials and Manufacturing Innovation
Volume: 13
Issue number: 2
State: Published - Jun 2024

Keywords

  • Aluminum alloys
  • Large language models
  • Natural language processing
  • Phase–property relationship

ASJC Scopus subject areas

  • General Materials Science
  • Industrial and Manufacturing Engineering
