TY - GEN
T1 - Improving Toponym Resolution with Better Candidate Generation, Transformer-based Reranking, and Two-Stage Resolution
AU - Zhang, Zeyu
AU - Bethard, Steven
N1 - Publisher Copyright: © 2023 Association for Computational Linguistics.
PY - 2023
Y1 - 2023
N2 - Geocoding is the task of converting location mentions in text into structured data that encodes the geospatial semantics. We propose a new architecture for geocoding, GeoNorm. GeoNorm first uses information retrieval techniques to generate a list of candidate entries from the geospatial ontology. Then it reranks the candidate entries using a transformer-based neural network that incorporates information from the ontology such as the entry's population. This generate-and-rerank process is applied twice: first to resolve the less ambiguous countries, states, and counties, and second to resolve the remaining location mentions, using the identified countries, states, and counties as context. Our proposed toponym resolution framework achieves state-of-the-art performance on multiple datasets. Code and models are available at https://github.com/clulab/geonorm.
AB - Geocoding is the task of converting location mentions in text into structured data that encodes the geospatial semantics. We propose a new architecture for geocoding, GeoNorm. GeoNorm first uses information retrieval techniques to generate a list of candidate entries from the geospatial ontology. Then it reranks the candidate entries using a transformer-based neural network that incorporates information from the ontology such as the entry's population. This generate-and-rerank process is applied twice: first to resolve the less ambiguous countries, states, and counties, and second to resolve the remaining location mentions, using the identified countries, states, and counties as context. Our proposed toponym resolution framework achieves state-of-the-art performance on multiple datasets. Code and models are available at https://github.com/clulab/geonorm.
UR - http://www.scopus.com/inward/record.url?scp=85175399557&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85175399557&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85175399557
T3 - Proceedings of the Annual Meeting of the Association for Computational Linguistics
SP - 48
EP - 60
BT - StarSEM 2023 - 12th Joint Conference on Lexical and Computational Semantics, Proceedings of the Conference
A2 - Palmer, Alexis
A2 - Camacho-Collados, Jose
PB - Association for Computational Linguistics (ACL)
T2 - 12th Joint Conference on Lexical and Computational Semantics, StarSEM 2023, co-located with ACL 2023
Y2 - 13 July 2023 through 14 July 2023
ER -