Non-sequential Pipelines and Tuning

Martin Binder, Florian Pfisterer, Marc Becker, Marvin N. Wright

Publication: Contribution to book/anthology/report; Contribution to book/anthology; Research; peer-reviewed

Abstract

Real-world applications often require complicated pipelines that do not progress sequentially. For example, many experiments have demonstrated that bagging is a powerful method to improve model performance. Bagging can be thought of as a non-sequential pipeline where a learner is replicated, each separate learner is trained and makes predictions, and their results are combined. This is non-sequential because data does not flow through the pipeline in a single chain but is instead passed to all learners (which may then subsample the data) and then recombined, thus creating a pipeline where operations have multiple inputs and outputs. Pipeline operations also have hyperparameters that can be set and tuned to improve model performance. Moreover, the choice of operations to include in a pipeline can also be tuned, a problem known as combined algorithm selection and hyperparameter optimization (CASH). This chapter looks at more advanced uses of mlr3pipelines. This is put into practice by demonstrating how to build bagging and stacking pipelines from scratch, as well as how to access common pipelines that are readily available in mlr3pipelines. The chapter then looks at tuning pipelines and CASH.

Original language: English
Title: Applied Machine Learning Using mlr3 in R
Editors: Bernd Bischl, Raphael Sonabend, Lars Kotthoff, Michel Lang
Number of pages: 22
Publisher: CRC Press
Publication date: 2024
Pages: 174-195
Chapter: 8
ISBN (Print): 978-1-032-51567-0, 978-1-032-50754-5
ISBN (Electronic): 978-1-003-40284-8
Status: Published - 2024

Bibliographical note

Publisher Copyright:
© 2024 selection and editorial matter, Bernd Bischl, Raphael Sonabend, Lars Kotthoff, and Michel Lang. All rights reserved.
