TY - JOUR
T1 - Exploratory data structure comparisons
T2 - three new visual tools based on principal component analysis
AU - Petersen, Anne Helby
AU - Markussen, Bo
AU - Christensen, Karl Bang
PY - 2021
Y1 - 2021
N2 - Datasets are sometimes divided into distinct subsets, e.g. due to multi-center sampling, or to variations in instruments, questionnaire item ordering or mode of administration, and the data analyst then needs to assess whether a joint analysis is meaningful. The Principal Component Analysis-based Data Structure Comparisons (PCADSC) tools are three new non-parametric, visual diagnostic tools for investigating differences in structure for two subsets of a dataset through covariance matrix comparisons by use of principal component analysis. The PCADSC tools are demonstrated in a data example using European Social Survey data on psychological well-being in three countries: Denmark, Sweden, and Bulgaria. The data structures are found to be different in Denmark and Bulgaria, and thus a comparison of, for example, mean psychological well-being scores is not meaningful. However, when comparing Denmark and Sweden, very similar data structures, and thus comparable concepts of well-being, are found. Therefore, inter-country comparisons are warranted for these countries.
AB - Datasets are sometimes divided into distinct subsets, e.g. due to multi-center sampling, or to variations in instruments, questionnaire item ordering or mode of administration, and the data analyst then needs to assess whether a joint analysis is meaningful. The Principal Component Analysis-based Data Structure Comparisons (PCADSC) tools are three new non-parametric, visual diagnostic tools for investigating differences in structure for two subsets of a dataset through covariance matrix comparisons by use of principal component analysis. The PCADSC tools are demonstrated in a data example using European Social Survey data on psychological well-being in three countries: Denmark, Sweden, and Bulgaria. The data structures are found to be different in Denmark and Bulgaria, and thus a comparison of, for example, mean psychological well-being scores is not meaningful. However, when comparing Denmark and Sweden, very similar data structures, and thus comparable concepts of well-being, are found. Therefore, inter-country comparisons are warranted for these countries.
KW - covariance matrix
KW - data structure
KW - exploratory data analysis
KW - principal component analysis
U2 - 10.1080/02664763.2020.1773772
DO - 10.1080/02664763.2020.1773772
M3 - Journal article
C2 - 35706572
AN - SCOPUS:85086333334
VL - 48
SP - 1675
EP - 1695
JO - Journal of Applied Statistics
JF - Journal of Applied Statistics
SN - 0266-4763
IS - 9
ER -