TY - JOUR
T1 - Check-all-that-apply data analysed by Partial Least Squares regression
AU - Rinnan, Åsmund
AU - Giacalone, Davide
AU - Frøst, Michael Bom
PY - 2015
Y1 - 2015
N2 - This paper discusses the application of Partial Least Squares regression (PLS) to handle sensory data from check-all-that-apply (CATA) questions in a rapid, statistically reliable, and graphically efficient way. We start by discussing the theory behind CATA data and how these data are normally analysed by multivariate techniques. CATA data can be analysed by setting them either as the X or as the Y. The former is the PLS-Discriminant Analysis (PLS-DA) version, while the latter is the ANOVA-PLS (A-PLS) version. We investigated the difference between these two approaches, concluding that there is none. This is followed by a discussion of how to get a good estimate of the uncertainty of the model parameters in the PLS model. For a PLS model this is often assessed by leave-one-respondent-out cross-validation. We show, however, that this gives overly optimistic uncertainty estimates, and that a repeated split-half approach should be used instead. Finally, we discuss the shortcomings of using univariate techniques such as Cochran’s Q test, and even the uncertainty estimates based on the jackknifed regression coefficients, compared to the multivariate reality of the loading weights in PLS-DA. Overall, this paper provides a formal introduction to how to utilize PLS-DA and cross-validation with resampling for the investigation of CATA data.
AB - This paper discusses the application of Partial Least Squares regression (PLS) to handle sensory data from check-all-that-apply (CATA) questions in a rapid, statistically reliable, and graphically efficient way. We start by discussing the theory behind CATA data and how these data are normally analysed by multivariate techniques. CATA data can be analysed by setting them either as the X or as the Y. The former is the PLS-Discriminant Analysis (PLS-DA) version, while the latter is the ANOVA-PLS (A-PLS) version. We investigated the difference between these two approaches, concluding that there is none. This is followed by a discussion of how to get a good estimate of the uncertainty of the model parameters in the PLS model. For a PLS model this is often assessed by leave-one-respondent-out cross-validation. We show, however, that this gives overly optimistic uncertainty estimates, and that a repeated split-half approach should be used instead. Finally, we discuss the shortcomings of using univariate techniques such as Cochran’s Q test, and even the uncertainty estimates based on the jackknifed regression coefficients, compared to the multivariate reality of the loading weights in PLS-DA. Overall, this paper provides a formal introduction to how to utilize PLS-DA and cross-validation with resampling for the investigation of CATA data.
U2 - 10.1016/j.foodqual.2015.01.018
DO - 10.1016/j.foodqual.2015.01.018
M3 - Journal article
VL - 42
SP - 146
EP - 153
JO - Food Quality and Preference
JF - Food Quality and Preference
SN - 0950-3293
ER -