Abstract
The proposed All-at-once Nesterov-like extrapolation PARAFAC2 algorithm is expected to greatly advance GC-MS data analysis across scientific areas.
In recent years, multi-way data analysis, also called tensor analysis, has gained widespread acceptance and attracted considerable research interest in chemometrics, owing to the rapid development of advanced analytical instruments. Tensor decomposition techniques such as PARAllel FACtor analysis (PARAFAC) and PARAllel FACtor analysis 2 (PARAFAC2) can be regarded as multi-way generalizations of two-way component analysis. Compared with PARAFAC, PARAFAC2 can additionally handle multi-way data in which observations have different lengths or measured profiles shift slightly in position. Gas chromatography-mass spectrometry (GC-MS) data are a typical example, since elution profiles can vary and shift between experimental runs. The most commonly used algorithm for fitting the PARAFAC2 model is PARAFAC2-Alternating Least Squares (PARAFAC2-ALS), because it is simple to implement and offers a good trade-off between computational cost and solution quality. However, PARAFAC2-ALS converges very slowly, especially when it encounters ‘swamps’ or ‘bottlenecks’ in the data. In this research, we propose novel implementations of extrapolation-based PARAFAC2 algorithms. Besides the standard PARAFAC2-ALS algorithm, the PARAFAC2-Hierarchical Alternating Least Squares (PARAFAC2-HALS) algorithm is established and investigated here for the first time. We focus on 12 PARAFAC2 algorithms in total, composed of two PARAFAC2 fitting methods, PARAFAC2-ALS and PARAFAC2-HALS, in combination with five different extrapolation acceleration schemes: Enhanced line search, PLS_Toolbox line search, N-way toolbox line search, Nesterov-like extrapolation and All-at-once Nesterov-like extrapolation.
The strengths and weaknesses of the 12 PARAFAC2 algorithms are quantified in terms of algorithm speed, number of local minima, convergence ability and fitting process, using both simulated and real GC-MS datasets. The results show that the newly proposed All-at-once Nesterov-like extrapolation PARAFAC2-ALS algorithm converges fastest whilst maintaining a low fraction of local-minimum solutions. It significantly outperforms the latest extrapolation-accelerated PARAFAC2 algorithms available in the literature and is therefore expected to greatly advance GC-MS data analysis in various scientific areas.
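For orientation, the PARAFAC2 model and a generic extrapolation step can be sketched as below. This uses common textbook notation rather than notation from this work: the symbols $X_k$, $A$, $B$, $P_k$, $D_k$, $\theta$ and $\beta_t$ are assumptions introduced here for illustration, and the abstract does not state the exact extrapolation or restart rules of the proposed algorithms.

```latex
\begin{align}
% PARAFAC2 model for run k = 1, ..., K: X_k is I_k x J, and I_k may differ between runs.
% A is J x R, B is R x R, P_k is I_k x R with orthonormal columns, D_k is R x R diagonal.
X_k &\approx B_k D_k A^{\top}, \qquad B_k = P_k B, \qquad P_k^{\top} P_k = I_R \\
% Least-squares problem that ALS-type algorithms minimise one block at a time
\min_{A,\,B,\,\{P_k\},\,\{D_k\}} \;& \sum_{k=1}^{K} \left\| X_k - P_k B D_k A^{\top} \right\|_F^2 \\
% Generic Nesterov-like extrapolation of an iterate \theta (illustrative form only)
\hat{\theta}^{(t)} &= \theta^{(t)} + \beta_t \left( \theta^{(t)} - \theta^{(t-1)} \right), \qquad \beta_t \ge 0
\end{align}
```

The constraint on $P_k$ keeps $B_k^{\top} B_k = B^{\top} B$ the same for every run while letting the profiles $B_k$ themselves differ, which is what allows shifted elution profiles to be modelled. One plausible reading of "all-at-once" is that $\theta$ collects all factor matrices and is extrapolated once per full ALS sweep rather than block by block; that reading is an assumption, not a statement of the authors' method.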
Original language | English |
---|---|
Publication date | 2020 |
Publication status | Published - 2020 |
Event | Annual Conference of the Federation of Analytical Chemistry and Spectroscopy Societies - USA Duration: 12 Oct 2020 → … |
Conference
Conference | Annual Conference of the Federation of Analytical Chemistry and Spectroscopy Societies |
---|---|
Location | USA |
Period | 12/10/2020 → … |