Abstract
I highlight a simple failure mode of state-of-the-art machine reading systems: they fail when contexts do not align with commonly shared beliefs. For example, machine reading systems fail to answer the question What did Elizabeth want? correctly in the context of 'My kingdom for a cough drop, cried Queen Elizabeth.' Biased by co-occurrence statistics in the training data of pretrained language models, systems predict 'my kingdom' rather than 'a cough drop'. I argue that such biases are analogous to human belief biases and present a carefully designed challenge dataset for English machine reading, called Auto-Locke, to quantify such effects. Evaluations of machine reading systems on Auto-Locke show the pervasiveness of belief bias in machine reading.
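As a minimal illustration of the failure mode described above, the sketch below probes an off-the-shelf extractive QA model with the paper's running example. It assumes the HuggingFace Transformers pipeline API; the checkpoint named here is an arbitrary SQuAD-tuned model chosen for illustration, not necessarily one of the systems evaluated in the paper.

```python
# Minimal probe of belief bias in extractive QA (illustrative sketch).
# Assumes the HuggingFace Transformers library; the checkpoint below is
# an arbitrary SQuAD-tuned model, not one named in the paper.
from transformers import pipeline

qa = pipeline(
    "question-answering",
    model="distilbert-base-cased-distilled-squad",
)

result = qa(
    question="What did Elizabeth want?",
    context="My kingdom for a cough drop, cried Queen Elizabeth.",
)

# A belief-biased system tends to extract the frequent collocation
# "my kingdom" instead of the contextually correct "a cough drop".
print(result["answer"], round(result["score"], 3))
```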
Original language | English
---|---
Title | Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Publisher | Association for Computational Linguistics
Publication date | 2021
Pages | 8240–8245
DOI |
Status | Published - 2021
Event | 2021 Conference on Empirical Methods in Natural Language Processing - Duration: 7 Nov 2021 → 11 Nov 2021
Conference

Conference | 2021 Conference on Empirical Methods in Natural Language Processing
---|---
Period | 07/11/2021 → 11/11/2021