Evaluating Input Feature Explanations through a Unified Diagnostic Evaluation Framework

Publication: Contribution to book/anthology/report; Conference contribution in proceedings; Research; peer-reviewed

Abstract

Explaining the decision-making process of machine learning models is crucial for ensuring their reliability and transparency for end users. One popular form of explanation highlights key input features, such as i) tokens (e.g., Shapley Values and Integrated Gradients), ii) interactions between tokens (e.g., Bivariate Shapley and Attention-based methods), or iii) interactions between spans of the input (e.g., Louvain Span Interactions). However, these explanation types have only been studied in isolation, making it difficult to judge their respective applicability. To bridge this gap, we develop a unified framework, comprising four diagnostic properties, that facilitates an automated and direct comparison between highlight and interactive explanations. We conduct an extensive analysis of these three types of input feature explanations, each instantiated with three different explanation techniques, across two datasets and two models, and reveal that each explanation type has distinct strengths across the different diagnostic properties. Nevertheless, interactive span explanations outperform the other types of input feature explanations on most diagnostic properties. Because interactive span explanations remain relatively understudied, our analysis underscores the need for further research to improve the methods that generate them. Additionally, integrating them with other explanation types that perform better on certain properties could further enhance their overall effectiveness.

Original language: English
Title: Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Editors: Luis Chiruzzo, Alan Ritter, Lu Wang
Number of pages: 19
Publisher: Association for Computational Linguistics (ACL)
Publication date: 2025
Pages: 10559-10577
ISBN (Electronic): 9798891761896
DOI
Status: Published - 2025
Event: 2025 Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2025 - Hybrid, Albuquerque, USA
Duration: 29 Apr 2025 to 4 May 2025

Conference

Conference: 2025 Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2025
Country/Territory: USA
City: Hybrid, Albuquerque
Period: 29/04/2025 to 04/05/2025
Sponsor: Adobe, Baidu, Bloomberg, ...[et al.], Megagon Labs, Toloka

Bibliographical note

Publisher Copyright:
© 2025 Association for Computational Linguistics.
