Revisiting Wedge Sampling for Budgeted Maximum Inner Product Search

Stephan S. Lorenzen, Ninh Pham*

*Corresponding author af dette arbejde

Publikation: Bidrag til bog/antologi/rapportKonferencebidrag i proceedingsForskningpeer review

4 Citationer (Scopus)
6 Downloads (Pure)

Abstract

Top-k maximum inner product search (MIPS) is a central task in many machine learning applications. This work extends top-k MIPS with a budgeted setting, that asks for the best approximate top-k MIPS given a limited budget of computational operations. We investigate recent advanced sampling algorithms, including wedge and diamond sampling, to solve budgeted top-k MIPS. First, we show that diamond sampling is essentially a combination of wedge sampling and basic sampling for top-k MIPS. Our theoretical analysis and empirical evaluation show that wedge is competitive (often superior) to diamond on approximating top-k MIPS regarding both efficiency and accuracy. Second, we propose dWedge, a very simple deterministic variant of wedge sampling for budgeted top-k MIPS. Empirically, dWedge provides significantly higher accuracy than other budgeted top-k MIPS solvers while maintaining a similar speedup.

OriginalsprogEngelsk
TitelMachine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2020, Proceedings
RedaktørerFrank Hutter, Kristian Kersting, Jefrey Lijffijt, Isabel Valera
Antal sider17
ForlagSpringer
Publikationsdato2021
Sider439-455
ISBN (Trykt)9783030676575
DOI
StatusUdgivet - 2021
BegivenhedEuropean Conference on Machine Learning and Knowledge Discovery in Databases, ECML PKDD 2020 - Virtual, Online
Varighed: 14 sep. 202018 sep. 2020

Konference

KonferenceEuropean Conference on Machine Learning and Knowledge Discovery in Databases, ECML PKDD 2020
ByVirtual, Online
Periode14/09/202018/09/2020
NavnLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Vol/bind12457 LNAI
ISSN0302-9743

Citationsformater