TY - GEN
T1 - Revisiting Wedge Sampling for Budgeted Maximum Inner Product Search
AU - Lorenzen, Stephan S.
AU - Pham, Ninh
PY - 2021
Y1 - 2021
N2 - Top-k maximum inner product search (MIPS) is a central task in many machine learning applications. This work extends top-k MIPS with a budgeted setting, that asks for the best approximate top-k MIPS given a limited budget of computational operations. We investigate recent advanced sampling algorithms, including wedge and diamond sampling, to solve budgeted top-k MIPS. First, we show that diamond sampling is essentially a combination of wedge sampling and basic sampling for top-k MIPS. Our theoretical analysis and empirical evaluation show that wedge is competitive (often superior) to diamond on approximating top-k MIPS regarding both efficiency and accuracy. Second, we propose dWedge, a very simple deterministic variant of wedge sampling for budgeted top-k MIPS. Empirically, dWedge provides significantly higher accuracy than other budgeted top-k MIPS solvers while maintaining a similar speedup.
AB - Top-k maximum inner product search (MIPS) is a central task in many machine learning applications. This work extends top-k MIPS with a budgeted setting, that asks for the best approximate top-k MIPS given a limited budget of computational operations. We investigate recent advanced sampling algorithms, including wedge and diamond sampling, to solve budgeted top-k MIPS. First, we show that diamond sampling is essentially a combination of wedge sampling and basic sampling for top-k MIPS. Our theoretical analysis and empirical evaluation show that wedge is competitive (often superior) to diamond on approximating top-k MIPS regarding both efficiency and accuracy. Second, we propose dWedge, a very simple deterministic variant of wedge sampling for budgeted top-k MIPS. Empirically, dWedge provides significantly higher accuracy than other budgeted top-k MIPS solvers while maintaining a similar speedup.
KW - Budgeted maximum inner product search
KW - Sampling
U2 - 10.1007/978-3-030-67658-2_25
DO - 10.1007/978-3-030-67658-2_25
M3 - Article in proceedings
AN - SCOPUS:85103258797
SN - 9783030676575
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 439
EP - 455
BT - Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2020, Proceedings
A2 - Hutter, Frank
A2 - Kersting, Kristian
A2 - Lijffijt, Jefrey
A2 - Valera, Isabel
PB - Springer
T2 - European Conference on Machine Learning and Knowledge Discovery in Databases, ECML PKDD 2020
Y2 - 14 September 2020 through 18 September 2020
ER -