High-Throughput Extraction of Phase–Property Relationships from Literature Using Natural Language Processing and Large Language Models

Luca Montanelli, Vineeth Venugopal, Elsa A. Olivetti, Marat I. Latypov

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

Consolidating published research on aluminum alloys into insights about microstructure–property relationships can simplify alloy design and reduce its cost. For many heat-treatable alloys that derive superior properties from precipitation, one critical design consideration is the phases present as key microstructure constituents, because they can have a decisive impact on the engineering properties of the alloy. Here, we present a computational framework for high-throughput extraction of phases and their impact on properties from scientific papers. Our framework uses transformer-based and large language models to identify sentences with phase–property information in papers, recognize phase and property entities, and extract phase–property relationships and their “sentiment.” We demonstrate the application of our framework on aluminum alloys, for which we build a database of 7,675 phase–property relationships extracted from a corpus of almost 5,000 full-text papers. We comment on the extracted relationships based on common metallurgical knowledge.
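The three stages named in the abstract (sentence identification, entity recognition, and relationship extraction with sentiment) can be illustrated with a minimal sketch. This is not the authors' implementation: the paper uses transformer-based and large language models, whereas the sketch below substitutes plain keyword matching for each stage so that it runs without any model. The vocabularies, cue words, and all function and class names are hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical vocabularies for illustration only; not taken from the paper.
PHASES = {"Mg2Si", "Al2Cu", "Al3Zr"}
PROPERTIES = {"hardness", "ductility", "strength", "corrosion resistance"}
POSITIVE_CUES = {"increases", "improves", "enhances"}
NEGATIVE_CUES = {"decreases", "reduces", "degrades"}


@dataclass(frozen=True)
class Relationship:
    phase: str
    prop: str
    sentiment: str  # "positive", "negative", or "neutral"


def split_sentences(text):
    """Naive splitter: break on whitespace that follows ., ! or ?."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def find_entities(sentence, vocabulary):
    """Keyword stand-in for a learned entity recognizer (substring match)."""
    lowered = sentence.lower()
    return sorted(term for term in vocabulary if term.lower() in lowered)


def classify_sentiment(sentence):
    """Cue-word stand-in for a learned sentiment model.

    Positive cues are checked first, so they win if both kinds appear.
    """
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    if words & POSITIVE_CUES:
        return "positive"
    if words & NEGATIVE_CUES:
        return "negative"
    return "neutral"


def extract_relationships(text):
    """Keep sentences naming both a phase and a property, then pair them."""
    results = []
    for sentence in split_sentences(text):
        phases = find_entities(sentence, PHASES)
        props = find_entities(sentence, PROPERTIES)
        if not phases or not props:
            continue  # sentence carries no phase-property information
        sentiment = classify_sentiment(sentence)
        results.extend(
            Relationship(p, q, sentiment) for p in phases for q in props
        )
    return results


sample = (
    "Precipitation of Mg2Si increases hardness. "
    "The samples were polished. "
    "Coarse Al2Cu reduces ductility."
)
for rel in extract_relationships(sample):
    print(rel.phase, rel.prop, rel.sentiment)
```

On the sample text, the second sentence is filtered out because it names neither a phase nor a property, which mirrors the role of the sentence-identification stage. Keyword matching of this kind would miss paraphrases and unseen phase names, which is the gap that learned models are meant to close.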

Original language: English (US)
Pages (from-to): 396-405
Number of pages: 10
Journal: Integrating Materials and Manufacturing Innovation
Volume: 13
Issue number: 2
State: Published - Jun 2024

Keywords

  • Aluminum alloys
  • Large language models
  • Natural language processing
  • Phase–property relationship

ASJC Scopus subject areas

  • General Materials Science
  • Industrial and Manufacturing Engineering
