Abstract
Digitalizing the visible world: Scanning the environment using a mobile with an AI app
Knowledge of objects in spatial environments is obtained through multisensorial resources such as sight, hearing, touch and smell (Mondada, 2019). For most people, the visual sense takes priority in the perception of objects’ spatial relation to the sensing body and the social world (Workman, 2016). We mostly use our eyes to perceive objects and their position in the environment without noticing that we do so (e.g. Goodwin & Goodwin, 1996). Visually impaired people (VIP), however, have no or very limited access to the visual aspects of these spatial relations when orienting towards their immediate surroundings. The emergence of computer vision and natural language processing (NLP) in accessible mainstream technology, such as smartphones, enables the device to “translate” the visual world into language descriptions.
In this paper we explore the use of the app Microsoft SeeingAI, which enables VIP to receive computerized descriptions of “visual” information about objects, people and places by scanning the environment, much like using a flashlight to obtain information about objects in the dark. Based on a video-ethnographic collection of VIP scanning the shelves when grocery shopping, we investigate how this ‘digitalizing’ of an everyday practice is done in situ. Through ethnomethodological multimodal conversation analysis (Streeck et al., 2011), we investigate environment scanning as a specific aspect of distributed perception (Due, 2021a), that is, focusing on the co-operative actions between sense-able agents. As VIP rely on multisensorial resources other than sight alone, the question is how the practice of scanning is achieved in situ. Our analysis shows that scanning inanimate, distant objects requires complex coordination of the scanning device in relation to the object(s) being scanned, with regard to distance, the angle of the device, and the location of the relevant feature of the object. The paper focuses on three particular phenomena: i) scanning nearby surroundings for the location and identification of objects, ii) scanning specific objects to obtain information about them, and iii) scanning text on the object. The paper contributes to research into blind and visually impaired people, the senses and perception in interaction, shopping activities (Due, 2017), and interactions with non-human robotic agents (cf. Due, 2021b).
Due, B. L. (2017). Respecifying the information sheet: An interactional resource for decision-making in optician shops. Journal of Applied Linguistics and Professional Practice, 14(2), 127–148. https://doi.org/10.1558/jalpp.33663
Due, B. L. (2021a). Distributed Perception: Co-Operation between Sense-Able, Actionable, and Accountable Semiotic Agents. Symbolic Interaction, 44(1), 134–162. https://doi.org/10.1002/symb.538
Due, B. L. (2021b). RoboDoc: Semiotic resources for achieving face-to-screenface formation with a telepresence robot. Semiotica, 238, 253–278. https://doi.org/10.1515/sem-2018-0148
Goodwin, C., & Goodwin, M. H. (1996). Formulating Planes: Seeing as a Situated Activity. In D. Middleton & Y. Engeström (Eds.), Cognition and Communication at Work (pp. 61–95). Cambridge University Press.
Mondada, L. (2019). Rethinking Bodies and Objects in Social Interaction: A Multimodal and Multisensorial Approach to Tasting. In U. T. Kissmann & J. van Loon (Eds.), Discussing New Materialism: Methodological Implications for the Study of Materialities (pp. 109–134). Springer Fachmedien. https://doi.org/10.1007/978-3-658-22300-7_6
Streeck, J., Goodwin, C., & LeBaron, C. D. (2011). Embodied Interaction: Language and Body in the Material World. Cambridge University Press.
Workman, J. (2016). Phenomenology and Blindness: Merleau-Ponty, Levinas, and an Alternative Metaphysical Vision. Electronic Theses and Dissertations. https://digitalcommons.du.edu/etd/1210
| Original language | English |
| --- | --- |
| Publication date | 2021 |
| Publication status | Published - 2021 |
| Event | 17th International Pragmatics Conference, Online, 27 Jun 2021 → 2 Jul 2021 (Conference number: 17) |
Conference
| Conference | 17th International Pragmatics Conference |
| --- | --- |
| Number | 17 |
| Location | Online |
| Period | 27/06/2021 → 02/07/2021 |