TY - GEN
T1 - Fashion-focused creative commons social dataset
AU - Loni, Babak
AU - Menendez Blanco, Maria
AU - Georgescu, Mihai
AU - Galli, Luca
AU - Massari, Claudio
AU - Altingovde, Ismail Sengor
AU - Martinenghi, Davide
AU - Melenhorst, Mark
AU - Vliegendhart, Raynor
AU - Larson, Martha
PY - 2013
Y1 - 2013
N2 - In this work, we present a fashion-focused Creative Commons dataset, which is designed to contain a mix of general images as well as a large component of images that are focused on fashion (i.e., relevant to particular clothing items or fashion accessories). The dataset contains 4810 images and related metadata. Furthermore, a ground truth on image's tags is presented. Ground truth generation for large-scale datasets is a necessary but expensive task. Traditional expert based approaches have become an expensive and non-scalable solution. For this reason, we turn to crowdsourcing techniques in order to collect ground truth labels; in particular we make use of the commercial crowdsourcing platform, Amazon Mechanical Turk (AMT). Two different groups of annotators (i.e., trusted annotators known to the authors and crowdsourcing workers on AMT) participated in the ground truth creation. Annotation agreement between the two groups is analyzed. Applications of the dataset in different contexts are discussed. This dataset contributes to research areas such as crowdsourcing for multimedia, multimedia content analysis, and design of systems that can elicit fashion preferences from users.
AB - In this work, we present a fashion-focused Creative Commons dataset, which is designed to contain a mix of general images as well as a large component of images that are focused on fashion (i.e., relevant to particular clothing items or fashion accessories). The dataset contains 4810 images and related metadata. Furthermore, a ground truth on image's tags is presented. Ground truth generation for large-scale datasets is a necessary but expensive task. Traditional expert based approaches have become an expensive and non-scalable solution. For this reason, we turn to crowdsourcing techniques in order to collect ground truth labels; in particular we make use of the commercial crowdsourcing platform, Amazon Mechanical Turk (AMT). Two different groups of annotators (i.e., trusted annotators known to the authors and crowdsourcing workers on AMT) participated in the ground truth creation. Annotation agreement between the two groups is analyzed. Applications of the dataset in different contexts are discussed. This dataset contributes to research areas such as crowdsourcing for multimedia, multimedia content analysis, and design of systems that can elicit fashion preferences from users.
U2 - 10.1145/2483977.2483984
DO - 10.1145/2483977.2483984
M3 - Article in proceedings
SP - 72
EP - 77
BT - Proceedings of the 4th ACM Multimedia Systems Conference
ER -