TY - GEN
T1 - Inferring missing metadata from environmental policy texts
AU - Bethard, Steven
AU - Laparra, Egoitz
AU - Wang, Sophia
AU - Zhao, Yiyun
AU - Al-Ghezi, Ragheb
AU - Lien, Aaron
AU - López-Hoffman, Laura
N1 - Funding Information:
This material is based upon work supported by the National Science Foundation under Grant No. 1831551. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
Publisher Copyright:
© 2019 Association for Computational Linguistics. All rights reserved.
PY - 2019
Y1 - 2019
N2 - The National Environmental Policy Act (NEPA) provides a trove of data on how environmental policy decisions have been made in the United States over the last 50 years. Unfortunately, there is no central database for this information, and it is too voluminous to assess manually. We describe our efforts to enable systematic research over US environmental policy by extracting and organizing metadata from the text of NEPA documents. Our contributions include collecting more than 40,000 NEPA-related documents and evaluating rule-based baselines that establish the difficulty of three important tasks: identifying lead agencies, aligning document versions, and detecting reused text.
AB - The National Environmental Policy Act (NEPA) provides a trove of data on how environmental policy decisions have been made in the United States over the last 50 years. Unfortunately, there is no central database for this information, and it is too voluminous to assess manually. We describe our efforts to enable systematic research over US environmental policy by extracting and organizing metadata from the text of NEPA documents. Our contributions include collecting more than 40,000 NEPA-related documents and evaluating rule-based baselines that establish the difficulty of three important tasks: identifying lead agencies, aligning document versions, and detecting reused text.
UR - http://www.scopus.com/inward/record.url?scp=85119425293&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85119425293&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85119425293
T3 - LaTeCH@NAACL-HLT 2019 - 3rd Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, Proceedings
SP - 46
EP - 51
BT - LaTeCH@NAACL-HLT 2019 - 3rd Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, Proceedings
PB - Association for Computational Linguistics (ACL)
T2 - 3rd Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, LaTeCH@NAACL-HLT 2019
Y2 - 7 June 2019
ER -