Abstract
The ikpls software package provides fast and efficient tools for PLS (Partial Least Squares)
modeling. This package is designed to help researchers and practitioners handle PLS modeling
faster than previously possible, particularly on large datasets. The PLS implementations in
ikpls use the fast IKPLS (Improved Kernel PLS) algorithms (Dayal & MacGregor, 1997),
providing a substantial speedup compared to scikit-learn’s (Pedregosa et al., 2011) PLS
implementation, which is based on NIPALS (Nonlinear Iterative Partial Least Squares) (H.
Wold, 1966). The ikpls package also offers an implementation of IKPLS combined with
the fast cross-validation algorithm by O.-C. G. Engstrøm (2024), significantly accelerating
cross-validation of PLS models, especially when using a large number of cross-validation splits.
ikpls offers NumPy-based CPU and JAX-based CPU/GPU/TPU implementations. The
JAX implementations are also differentiable, allowing seamless integration with deep learning
techniques. This versatility enables users to handle diverse data dimensions efficiently.
In conclusion, ikpls empowers researchers and practitioners in machine learning, chemometrics,
and related fields with efficient, scalable, and end-to-end differentiable tools for PLS modeling,
facilitating optimal component selection and preprocessing decisions by offering implementations
of
1. both variants of IKPLS for CPUs;
2. both variants of IKPLS for GPUs, both of which are end-to-end differentiable, allowing
integration with deep learning models;
3. IKPLS combined with a cross-validation algorithm that yields a substantial speedup
compared to the classical cross-validation algorithm.
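To make the kernel idea behind IKPLS concrete, the following is a minimal NumPy sketch of IKPLS Algorithm 1 as described by Dayal & MacGregor (1997). It is not the ikpls package's API: the function name `ikpls_algorithm1_sketch` and its arguments are hypothetical and chosen only for illustration, and the sketch assumes that X and Y have already been column-centered and that the requested number of components does not exceed the rank of X.

```python
import numpy as np


def ikpls_algorithm1_sketch(X, Y, n_components):
    """Illustrative IKPLS Algorithm 1 (hypothetical helper, not the ikpls API).

    X: (N, K) centered predictors. Y: (N,) or (N, M) centered responses.
    Returns regression coefficients B of shape (K, M) using n_components.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if Y.ndim == 1:
        Y = Y[:, None]
    K, M = X.shape[1], Y.shape[1]

    P = np.zeros((K, n_components))  # X loadings
    Q = np.zeros((M, n_components))  # Y loadings
    R = np.zeros((K, n_components))  # weights that map X directly to scores

    # The kernel matrix X^T Y is computed once and deflated in each iteration;
    # X and Y themselves are never deflated.
    XtY = X.T @ Y
    for a in range(n_components):
        if M == 1:
            w = XtY[:, 0].copy()
        else:
            # Dominant eigenvector of the small (M, M) matrix (X^T Y)^T (X^T Y);
            # eigh returns eigenvalues in ascending order.
            _, eigvecs = np.linalg.eigh(XtY.T @ XtY)
            w = XtY @ eigvecs[:, -1]
        w /= np.linalg.norm(w)

        # Orthogonalize against earlier components so that t = X @ r.
        r = w.copy()
        for j in range(a):
            r -= (P[:, j] @ w) * R[:, j]

        t = X @ r
        tt = t @ t
        p = X.T @ t / tt
        q = XtY.T @ r / tt

        XtY -= tt * np.outer(p, q)  # deflate the kernel matrix only
        P[:, a], Q[:, a], R[:, a] = p, q, r

    return R @ Q.T
```

With centered data, predictions follow as `X_new_centered @ B`. When all K components are used on a full-rank X, the coefficients coincide with the ordinary least squares solution, which gives a simple sanity check for the sketch.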
Original language | English |
---|---|
Article number | 6533 |
Journal | The Journal of Open Source Software |
Volume | 9 |
Issue number | 99 |
Number of pages | 6 |
ISSN | 2475-9066 |
DOIs | |
Publication status | Published - 2024 |