Massively-parallel best subset selection for ordinary least-squares regression

Fabian Gieseke, Kai Lars Polsterer, Ashish Mahabal, Christian Igel, Tom Heskes

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

3 Citations (Scopus)

Abstract

Selecting an optimal subset of k out of d features for linear regression models given n training instances is often considered intractable for feature spaces with hundreds or thousands of dimensions. We propose an efficient massively-parallel implementation that selects such optimal feature subsets in a brute-force fashion for small k. By exploiting the enormous compute power of modern parallel devices such as graphics processing units, it can handle thousands of input dimensions even on standard commodity hardware. We evaluate the practical runtime on artificial datasets and sketch the applicability of our framework in the context of astronomy.
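The brute-force strategy described in the abstract can be illustrated with a minimal, sequential sketch: enumerate all k-element feature subsets, fit an ordinary least-squares model on each, and keep the subset with the lowest residual sum of squares. This is an illustrative CPU version only, not the paper's GPU implementation; the function name `best_subset_ols` and its interface are assumptions for this example.

```python
from itertools import combinations

import numpy as np


def best_subset_ols(X, y, k):
    """Exhaustive best subset selection for OLS regression.

    Enumerates all k-element subsets of the d feature columns of X,
    fits an ordinary least-squares model on each subset, and returns
    the subset (as a tuple of column indices) with the smallest
    residual sum of squares, together with that RSS value.
    """
    n, d = X.shape
    best_rss, best_subset = np.inf, None
    for subset in combinations(range(d), k):
        Xs = X[:, subset]
        # Least-squares fit restricted to the selected columns
        coef, *_ = np.linalg.lstsq(Xs, y, rcond=None)
        rss = float(np.sum((y - Xs @ coef) ** 2))
        if rss < best_rss:
            best_rss, best_subset = rss, subset
    return best_subset, best_rss
```

Since the number of candidate subsets grows as d choose k, this sequential loop is only feasible for small k; the independence of the per-subset fits is what makes the problem amenable to the massive parallelization the paper proposes.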
Original language: English
Title of host publication: 2017 IEEE Symposium Series on Computational Intelligence (SSCI) Proceedings
Number of pages: 8
Publisher: IEEE
Publication date: 2017
Pages: 1-8
ISBN (Electronic): 978-1-5386-2726-6
DOIs
Publication status: Published - 2017
Event: 2017 IEEE Symposium Series on Computational Intelligence (SSCI) - Honolulu, United States
Duration: 27 Nov 2017 – 1 Dec 2017

Conference

Conference: 2017 IEEE Symposium Series on Computational Intelligence (SSCI)
Country/Territory: United States
City: Honolulu
Period: 27/11/2017 – 01/12/2017

Keywords

  • graphics processing units
  • least squares approximations
  • optimisation
  • parallel processing
  • regression analysis
  • sensitivity analysis
  • input dimensions
  • linear regression models
  • massively-parallel best subset selection
  • optimal feature subsets
  • optimal subset
  • ordinary least-squares regression
  • subset selection
  • Computational modeling
  • Graphics processing units
  • Instruction sets
  • Optimization
  • Runtime
  • Task analysis
  • Training
