TY - GEN
T1 - Self-Supervised Behavior Cloned Transformers are Path Crawlers for Text Games
AU - Wang, Ruoyao
AU - Jansen, Peter
N1 - Publisher Copyright:
© 2023 Association for Computational Linguistics.
PY - 2023
Y1 - 2023
N2 - In this work, we introduce a self-supervised behavior cloning transformer for text games, which are challenging benchmarks for multi-step reasoning in virtual environments. Traditionally, Behavior Cloning Transformers excel in such tasks but rely on supervised training data. Our approach auto-generates training data by exploring trajectories (defined by common macro-action sequences) that lead to reward within the games, while determining the generality and utility of these trajectories by rapidly training small models then evaluating their performance on unseen development games. Through empirical analysis, we show our method consistently uncovers generalizable training data, achieving about 90% performance of supervised systems across three benchmark text games.
AB - In this work, we introduce a self-supervised behavior cloning transformer for text games, which are challenging benchmarks for multi-step reasoning in virtual environments. Traditionally, Behavior Cloning Transformers excel in such tasks but rely on supervised training data. Our approach auto-generates training data by exploring trajectories (defined by common macro-action sequences) that lead to reward within the games, while determining the generality and utility of these trajectories by rapidly training small models then evaluating their performance on unseen development games. Through empirical analysis, we show our method consistently uncovers generalizable training data, achieving about 90% performance of supervised systems across three benchmark text games.
UR - http://www.scopus.com/inward/record.url?scp=85183299788&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85183299788&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85183299788
T3 - Findings of the Association for Computational Linguistics: EMNLP 2023
SP - 5555
EP - 5565
BT - Findings of the Association for Computational Linguistics
PB - Association for Computational Linguistics (ACL)
T2 - 2023 Findings of the Association for Computational Linguistics: EMNLP 2023
Y2 - 6 December 2023 through 10 December 2023
ER -