Abstract
Matrix factorization methods are widely used to understand complex data. In this paper, we introduce the cross-product penalized component analysis (X-CAN), a matrix factorization that optimizes a loss function allowing a trade-off between variance maximization and structural preservation, with a focus on highlighting differences between groups of observations and/or variables. The approach builds on previous developments, notably (i) the Sparse Principal Component Analysis (SPCA) framework based on the LASSO, (ii) extensions of SPCA that constrain both modes of the factorization, such as co-clustering or the Penalized Matrix Decomposition (PMD), and (iii) the Group-wise Principal Component Analysis (GPCA) method. The result is a flexible modeling approach that can be used for data exploration in a wide variety of problems. We demonstrate its use with applications from different disciplines.
Original language | English |
---|---|
Article number | 104038 |
Journal | Chemometrics and Intelligent Laboratory Systems |
Volume | 203 |
Number of pages | 16 |
ISSN | 0169-7439 |
Publication status | Published - 2020 |
Keywords
- Data interpretation
- Group-wise principal component analysis
- Principal component analysis
- Sparse principal component analysis
- Sparsity