TY - GEN
T1 - Sparse Regression via Range Counting
AU - Cardinal, Jean
AU - Ooms, Aurélien
PY - 2020
Y1 - 2020
AB - The sparse regression problem, also known as the best subset selection problem, can be cast as follows: Given a set S of n points in ℝ^d, a point y ∈ ℝ^d, and an integer 2 ≤ k ≤ d, find an affine combination of at most k points of S that is nearest to y. We describe an O(n^{k-1} log^{d-k+2} n)-time randomized (1+ε)-approximation algorithm for this problem with d and ε constant. This is the first algorithm for this problem running in time o(n^k). Its running time is similar to the query time of a data structure recently proposed by Har-Peled, Indyk, and Mahabadi (ICALP'18), while not requiring any preprocessing. Up to polylogarithmic factors, it matches a conditional lower bound relying on a conjecture about affine degeneracy testing. In the special case where k = d = O(1), we provide a simple O_δ(n^{d-1+δ})-time deterministic exact algorithm, for any δ > 0. Finally, we show how to adapt the approximation algorithm to the sparse linear regression and sparse convex regression problems with the same running time, up to polylogarithmic factors.
U2 - 10.4230/LIPIcs.SWAT.2020.20
DO - 10.4230/LIPIcs.SWAT.2020.20
M3 - Article in proceedings
T3 - Leibniz International Proceedings in Informatics, LIPIcs
SP - 1
EP - 17
BT - 17th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT 2020)
A2 - Albers, Susanne
PB - Schloss Dagstuhl - Leibniz-Zentrum für Informatik
T2 - 17th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT 2020)
Y2 - 22 June 2020 through 24 June 2020
ER -