TY - GEN
T1 - Domain adaptation for authorship attribution
T2 - 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016
AU - Sapkota, Upendra
AU - Solorio, Thamar
AU - Montes-y-Gómez, Manuel
AU - Bethard, Steven
N1 - Publisher Copyright: © 2016 Association for Computational Linguistics.
PY - 2016
Y1 - 2016
N2 - We present the first domain adaptation model for authorship attribution to leverage unlabeled data. The model includes extensions to structural correspondence learning needed to make it appropriate for the task. For example, we propose a median-based classification instead of the standard binary classification used in previous work. Our results show that punctuation-based character n-grams form excellent pivot features. We also show how singular value decomposition plays a critical role in achieving domain adaptation, and that replacing (instead of concatenating) non-pivot features with correspondence features yields better performance.
UR - http://www.scopus.com/inward/record.url?scp=85012034460&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85012034460&partnerID=8YFLogxK
U2 - 10.18653/v1/p16-1210
DO - 10.18653/v1/p16-1210
M3 - Conference contribution
AN - SCOPUS:85012034460
T3 - 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers
SP - 2226
EP - 2235
BT - 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers
PB - Association for Computational Linguistics (ACL)
Y2 - 7 August 2016 through 12 August 2016
ER -