Abstract
We evaluate the performance disparity of the Whisper and MMS families of ASR models across the VoxPopuli and Common Voice multilingual datasets, with an eye toward intersectionality. We report two main findings. First, and surprisingly, model size correlates logarithmically with worst-case performance disparity, meaning that larger (and better) models are less fair. Second, intersectionality matters: in particular, models often exhibit significant performance disparity across binary gender for adolescents.
Original language | English |
---|---|
Title | Findings of the Association for Computational Linguistics: NAACL 2024 |
Publisher | Association for Computational Linguistics |
Publication date | 2024 |
Pages | 2213–2226 |
Status | Published - 2024 |
Event | 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2024 - Hybrid, Mexico City, Mexico. Duration: 16 Jun 2024 → 21 Jun 2024 |
Conference
Conference | 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2024 |
---|---|
Country/Territory | Mexico |
City | Hybrid, Mexico City |
Period | 16/06/2024 → 21/06/2024 |
Sponsor | Baidu, Capital One, Grammarly, Megagon Labs, Otter.ai, et al. |