Attention-Driven Dropout: A Simple Method to Improve Self-supervised Contrastive Sentence Embeddings

Fabian Stermann, Ilias Chalkidis, Amihossein Vahidi, Bernd Bischl, Mina Rezaei*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceedingArticle in proceedingsResearchpeer-review

2 Citations (Scopus)

Abstract

Self-contrastive learning has proven effective for vision and natural language tasks. It aims to learn aligned data representations by encoding similar and dissimilar sentence pairs without human annotation. Therefore, data augmentation plays a crucial role in the learned embedding quality. However, in natural language processing (NLP), creating augmented samples for unsupervised contrastive learning is challenging since random editing may modify the semantic meanings of sentences and thus affect learning good representations. In this paper, we introduce a simple, still effective approach dubbed ADD (Attention-DrivenDropout) to generate better-augmented views of sentences to be used in self-contrastive learning. Given a sentence and a Pre-trained Transformer Language Model (PLM), such as RoBERTa, we use the aggregated attention scores of the PLM to remove the less “informative” tokens from the input. We consider two alternative algorithms based on NaiveAggregation across layers/heads and AttentionRollout [1]. Our approach significantly improves the overall performance of various self-supervised contrastive-based methods, including SimCSE [14], DiffCSE [10], and InfoCSE [33] by facilitating the generation of high-quality positive pairs required by these methods. Through empirical evaluations on multiple Semantic Textual Similarity (STS) and Transfer Learning tasks, we observe enhanced performance across the board.

Original languageEnglish
Title of host publicationMachine Learning and Knowledge Discovery in Databases. Research Track - European Conference, ECML PKDD 2024, Proceedings
EditorsAlbert Bifet, Jesse Davis, Tomas Krilavičius, Meelis Kull, Eirini Ntoutsi, Indrė Žliobaitė
Number of pages18
PublisherSpringer
Publication date2024
Pages89-106
ISBN (Print)9783031703409
DOIs
Publication statusPublished - 2024
EventEuropean Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2024 - Vilnius, Lithuania
Duration: 9 Sept 202413 Sept 2024

Conference

ConferenceEuropean Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2024
Country/TerritoryLithuania
CityVilnius
Period09/09/202413/09/2024
SeriesLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume14941 LNAI
ISSN0302-9743

Bibliographical note

Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.

Cite this