TY - JOUR
T1 - Combining joint models for biomedical event extraction.
AU - McClosky, David
AU - Riedel, Sebastian
AU - Surdeanu, Mihai
AU - McCallum, Andrew
AU - Manning, Christopher D.
N1 - Funding Information:
We thank the BioNLP shared task organizers for creating the shared task along with its surrounding resources and for their quick responses to questions. This work was supported in part by the Center for Intelligent Information Retrieval and in part by UPenn NSF medium IIS-0803847. We gratefully acknowledge the support of the Defense Advanced Research Projects Agency (DARPA) Machine Reading Program under Air Force Research Laboratory (AFRL) prime contract no. FA8750-09-C-0181. This article has been published as part of BMC Bioinformatics Volume 13 Supplement 11, 2012: Selected articles from BioNLP Shared Task 2011. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcbioinformatics/supplements/13/S11.
PY - 2012
Y1 - 2012
N2 - We explore techniques for performing model combination between the UMass and Stanford biomedical event extraction systems. Both sub-components address event extraction as a structured prediction problem, and use dual decomposition (UMass) and parsing algorithms (Stanford) to find the best scoring event structure. Our primary focus is on stacking where the predictions from the Stanford system are used as features in the UMass system. For comparison, we look at simpler model combination techniques such as intersection and union which require only the outputs from each system and combine them directly. First, we find that stacking substantially improves performance while intersection and union provide no significant benefits. Second, we investigate the graph properties of event structures and their impact on the combination of our systems. Finally, we trace the origins of events proposed by the stacked model to determine the role each system plays in different components of the output. We learn that, while stacking can propose novel event structures not seen in either base model, these events have extremely low precision. Removing these novel events improves our already state-of-the-art F1 to 56.6% on the test set of Genia (Task 1). Overall, the combined system formed via stacking ("FAUST") performed well in the BioNLP 2011 shared task. The FAUST system obtained 1st place in three out of four tasks: 1st place in Genia Task 1 (56.0% F1) and Task 2 (53.9%), 2nd place in the Epigenetics and Post-translational Modifications track (35.0%), and 1st place in the Infectious Diseases track (55.6%). We present a state-of-the-art event extraction system that relies on the strengths of structured prediction and model combination through stacking. Akin to results on other tasks, stacking outperforms intersection and union and leads to very strong results. The utility of model combination hinges on complementary views of the data, and we show that our sub-systems capture different graph properties of event structures. Finally, by removing low precision novel events, we show that performance from stacking can be further improved.
UR - http://www.scopus.com/inward/record.url?scp=84876207025&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84876207025&partnerID=8YFLogxK
U2 - 10.1186/1471-2105-13-S11-S9
DO - 10.1186/1471-2105-13-S11-S9
M3 - Article
C2 - 22759463
AN - SCOPUS:84876207025
SN - 1471-2105
VL - 13 Suppl 11
SP - S9
JO - BMC Bioinformatics
JF - BMC Bioinformatics
M1 - S9
ER -