Abstract
I highlight a simple failure mode of state-of-the-art machine reading systems: they fail when a context does not align with commonly shared beliefs. For example, machine reading systems fail to correctly answer 'What did Elizabeth want?' given the context 'My kingdom for a cough drop, cried Queen Elizabeth.' Biased by co-occurrence statistics in the training data of pretrained language models, systems predict 'my kingdom' rather than 'a cough drop'. I argue that such biases are analogous to human belief biases and present a carefully designed challenge dataset for English machine reading, called Auto-Locke, to quantify such effects. Evaluations of machine reading systems on Auto-Locke show the pervasiveness of belief bias in machine reading.
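For illustration, this failure mode can be probed with an off-the-shelf extractive question-answering model. The sketch below is not the paper's experimental setup: the Hugging Face `pipeline` API and the `distilbert-base-cased-distilled-squad` checkpoint are assumed stand-ins, and the actual prediction depends on the model used.

```python
# Minimal sketch (assumed setup, not the authors' method): probing an
# extractive QA model with the paper's running example. The checkpoint
# is an illustrative choice; predictions may vary across models.
from transformers import pipeline

qa = pipeline(
    "question-answering",
    model="distilbert-base-cased-distilled-squad",  # assumed checkpoint
)

context = "My kingdom for a cough drop, cried Queen Elizabeth."
question = "What did Elizabeth want?"

result = qa(question=question, context=context)

# A belief-biased model may extract "my kingdom" (the span favored by
# co-occurrence statistics in pretraining data) rather than the
# contextually correct "a cough drop".
print(result["answer"], result["score"])
```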
Original language | English
---|---
Title of host publication | Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Publisher | Association for Computational Linguistics
Publication date | 2021
Pages | 8240–8245
Publication status | Published - 2021
Event | 2021 Conference on Empirical Methods in Natural Language Processing, 7 Nov 2021 → 11 Nov 2021