Multisensory Perception and Learning: Linking Pedagogy, Psychophysics, and Human–Computer Interaction

1. Introduction

Teaching and learning technologies are increasingly present in classrooms, but children and teachers do not yet widely accept and use them. Although significant effort is invested in developing new digital environments for education, attention to a few essential design and development considerations would improve their educational effectiveness. First, digital learning environments are typically predominantly visual and fail to exploit the other sensory modalities, such as audition and touch (including active touch, or haptics), that are critical for perception and learning. When we speak about learning, we refer to the theory of embodied cognition, which does not separate perceptual and conceptual learning. The theory argues that conceptual understanding is grounded in perceptual experience and that perceptual and conceptual learning are inextricably intertwined. The implication of this is that "The richer the perceptual environment using multiple sensory modalities (e.g., using visuals, voiceovers, and movement) during initial learning, the better the student learning, understanding and motivation" (Segal et al., 2014).

Recent research highlights the role of multisensory integration and how different sensory modalities influence children's learning. For example, the visual channel is not always the best sensory modality or the most effective channel for developing perceptual skills (Barrett and Newell, 2015; Gori et al., 2008, 2010, 2012a, 2014a). Indeed, for some features, such as perceiving temporal properties, the auditory modality is the more accurate sense (McGovern et al., 2017). This dominance holds for adults and for children with and without disabilities (Gori et al., 2012b, 2017). Similarly, the haptic modality seems crucial for processing size properties in the visual domain (Gori et al., 2008, 2012c). Perception can therefore be influenced by the sensory modality best suited to the task, sometimes referred to as the modality-appropriateness hypothesis (Welch and Warren, 1980).

Second, most technological solutions are insufficiently grounded in educational practice. They do not adequately consider teachers' and students' needs and expectations, involve practitioners in the innovation and transformation process, or integrate into appropriate innovative pedagogical paradigms. Moreover, the technical design is typically not based on neuroscientific and pedagogical results. The current review speaks to these topics by bringing together an interdisciplinary team of pedagogical, educational, neuroscientific, and technological experts to present the state of the art in research findings across these disciplines. In particular, the review builds upon previous work devoted to presenting the concept (Volpe and Gori, 2019) and methodology of our research (Price and Duffy, 2018), and extends it by identifying how this approach, grounded in neuroscientific and pedagogical evidence, provides a unique contribution to the development of new technological solutions for school learning. It mainly focuses on the authors' laboratory work: the studies carried out and the results obtained by applying the methodology that the project consortium proposed in the framework of the EU-H2020-ICT project 'weDRAW'.
In the first part of the review, we discuss how the challenges in pedagogical practices and recent neuroscientific evidence may present novel opportunities for multisensory technology design and significant benefits to the learning process. In the second part, we present the scientific results and technological development of the authors' work through a single use case for learning angles through sounds and body movements (these data summarize the data reported in Gori et al., 2021).

2. Pedagogical Approach: the Role of Sensory Modalities and Multisensory Inputs in Learning

There has been a longstanding interest in how learning can be supported by using representations that engage multiple sensory modalities. For example, the Montessori education tradition uses artifacts like sandpaper letters that children trace with their fingers to develop the physical skill of learning to write. Papert (1980) discussed the idea of body-syntonic reasoning in learning about geometry. This refers to students projecting an experiential understanding of how their own bodies move onto a representation of a turtle moving on a screen. In this case, turtle movements reflect those of the student: if the student moves forward, so too does the turtle. This approach, which has been extended using sensors to record and represent movement, helps children experience and view graphs of their own movements over time to foster experiential knowledge of concepts such as rate of change (Ackermann, 2004). Indeed, recent advances in theories of embodied cognition highlight the important role of the sensory body, experience, emotion and social interaction for learning and development (e.g., Alibali and Nathan, 2012; Barsalou, 2008; Varela et al., 1991). A central argument is that meaning and conceptual representation are grounded in perceptual and motor experience. In other words, in situ sensory experiences provide sensorimotor representations, or 'embodied tools' (through action, gesture and analogical mapping), that one can use later in reasoning (Weisberg and Newcombe, 2017).

A substantial tranche of research, particularly in mathematics, shows the value of embodied and multisensory forms of interaction for learning. In mathematics, the body's role (embodied cognition) and proprioceptive feedback are backed by significant evidence in the literature (Abrahamson, 2011; Cress et al., 2010). In particular, fostering 'mindful' movement supports mathematical problem solving (Ma, 2017; Rosenfeld, 2017; Shoval, 2011). For example, Shoval (2011) showed that children who were taught about angles through meaningful movements tended to outperform those taught verbally by a teacher. The notion of 'meaningful' here refers to bodily activities that are related to a learning task in a conceptually meaningful way, such as swinging the forearm with the elbow fixed when learning about pendulum motion (Lindgren and Johnson-Glenberg, 2013). The authors argue that performing movements allowed learners to encode concepts and externalise thoughts and understandings, cementing their learning. "In effect, one way of knowing a mathematical relationship is by being the relationship. In particular, learners can enact and therefore become mathematical relations by using gestures, an important type of body-based action" (Walkington et al., 2014).
Bautista et al. (2011) go as far as to state that interacting with mathematical principles 'in the flesh' and incorporating kinetic body movement might be required for children to develop abstract mathematical knowledge. Further evidence using a control group that separates motor activity from other modalities would be useful.

Another key area of evidence for embodied and multisensory forms of interaction derives from gesture studies. Goldin-Meadow and colleagues (2001) suggest that gestures can 'offload' some aspects of memory, augmenting working memory and thereby improving performance. Furthermore, research indicates that gestures simulate actions from mental representations of concepts and that those gestures are based on sensorimotor interaction. Results point to the notion of 'gestural congruency' (Segal et al., 2014), where the child's action is linked to the underlying conceptual idea or representation (Lindgren and Johnson-Glenberg, 2013). Indeed, actions congruent with the nature of the underlying process of the task seem to improve performance and the likelihood of transferability of the skills in children (Schwarz and Martin, 2008). In addition, teacher gesture in the classroom may play an important role in conveying conceptual ideas to children (Alibali and Nathan, 2007). Representational gestures (those that convey aspects of their meaning, literally or metaphorically) demonstrate simulations of actions among mathematical objects (for instance, conveying the slope on a graph), and metaphoric gestures demonstrate body-based conceptual metaphors, for example, 'arithmetic is collecting objects'. Collectively, this work suggests that both using and observing congruent gestures seem beneficial for learning. The notion of conceptual metaphor draws on Lakoff and Johnson's (1980a, b) work, which considers the role of sensorimotor experiences to be central in the foundation of human language and our conceptualisation of experienced phenomena (Hampe, 2005). Physical experiences form metaphorical analogies for abstract ideas. For example, the metaphor of 'love is a journey' is based on physical experience: similarly, one might say 'it's been a long road' or one has 'reached a crossroads', metaphors that use physical experience to indicate social, professional, or emotional experiences that have nothing to do with actual roads.

An embodied perspective inherently embraces the idea of multimodal and multisensory learning, given its foundation in the body's sensory experience. Research in cognitive science has shown the different representational affordances of images, text, animation and changing digital interfaces for conveying ideas and concepts (e.g., Price, 2002; Stenning, 1998); "Diagrammatic representations also typically display information that is only implicit in sentential representations and that therefore has to be computed, sometimes at great cost, to make it explicit for use" (Larkin and Simon, 1987, p. 65). Furthermore, research in social semiotics and on the role of multimodal or multiple representations offers new ways of engaging with learning ideas (e.g., Ainsworth, 2006; Kress, 2009; Moreno and Mayer, 1999). For example, combining tangible and visual materials emphasizes mathematical principles and seems to promote reflection on the connection between these principles and the actions the students are performing (Cramer and Antle, 2015).
In addition, haptic information about molecular forces in conjunction with a 3D visual protein model improves protein–ligand understanding (Bivall et al., 2011). Finally, other theoretically grounded research exploring the cognitive impacts of multimodal learning materials (e.g., Moreno and Mayer, 1999) suggests that coherently integrating verbal and pictorial information to reduce the cognitive load on different processing channels (e.g., auditory and visual) may be pivotal.

Understanding how best to design and integrate multisensory approaches into education is complex. One limitation of studies in this area may be a lack of experimental designs that include control groups of children performing similar activities with different modalities and feedback. Such designs would help confirm the benefits provided by embodied and sensory stimulation. Neuroscientific research can inform this design through a better understanding of the development of sensory modalities across childhood and of multisensory integration.

3. Neuroscientific Findings: Multisensory Integration and Sensory Interaction During Development

Researchers in many fields have considered the role of each of the sensory modalities on perception during child development. Such fields include developmental psychology, psychophysics, and neuroscience. This research provides evidence for the importance of using multiple senses and shows how specific sensory modalities are fundamental for developing the perception of specific environmental properties. This highlights the value of understanding the role of the senses for different tasks at varying developmental stages to inform the design of new educational technology.

The same environmental features may be measured by more than one sensory system, so the brain must integrate redundant sensory information about a particular ecological property. For example, it can combine the visual and haptic features of a grasped object (Ernst et al., 2007; Newell et al., 2001; Woods et al., 2008) or a spatial array (Newell et al., 2005; Woods and Newell, 2004) into a coherent percept. Recent results show that individual sensory modalities communicate with each other to influence perception during the first years of life (see, e.g., Bremner et al., 2012). Other findings show that in adults, multiple sensory inputs can benefit performance by improving multisensory precision compared to unisensory precision (Alais and Burr, 2004; Ernst et al., 2007). Developmental studies investigating the emergence of this ability demonstrate that multisensory integration abilities are task-specific and develop at different ages for different tasks. For example, during the first years of primary (i.e., elementary) school, children begin integrating multisensory information for simple localization tasks (Rohlf et al., 2020). For more complex tasks such as spatial bisection, vertical localization, navigation, size and orientation, multisensory integration develops after 8–10 years (Gori et al., 2008, 2012b, c; Nardini et al., 2008, 2014; Petrini et al., 2015). Researchers have also investigated how the senses become calibrated against each other, such as how the visual modality influences the development of audition or touch (Burr and Gori, 2012; Gori, 2015). Indeed, some have hypothesized that cross-modal calibration is an important developmental mechanism that occurs in some cases before the ability to integrate multisensory information emerges (Burr and Gori, 2012; Gori, 2015; Gori et al., 2008).
This process is based on the idea that specific sensory modalities are more accurate (even if they are less precise) during development and are used to 'train' the development of other sensory systems. Cross-modal calibration is important because, in a developing system, the sensory modalities and different body parts grow at different rates, and multisensory integration may be less useful than having a reference sensory system that can help maintain sensory stability (Burr and Gori, 2012; Gori, 2015; Gori et al., 2008). For example, the visual modality can calibrate haptic orientation perception: vision is the more stable system for orientation and can serve as a reference for the more variable haptic system. The study of multisensory integration has offered results in support of cross-modal calibration both during typical development (Cuturi and Gori, 2017; Gori et al., 2008) and in children with a disability (e.g., Cappagli et al., 2017; Gori et al., 2010; Tinelli et al., 2015). Cross-modal calibration during development is also task-specific. Indeed, for some tasks (i.e., orientation, size, and space-bisection perception), cross-modal calibration occurs during the first period of life, until the end of primary school and before multisensory integration develops (Gori et al., 2008, 2012b, c). In contrast, other forms of cross-modal calibration, such as audio-motor temporal recalibration and the localization of stimuli, start later in life (after 8–10 years of age), after multisensory integration abilities have begun to emerge (Rohlf et al., 2020; Vercillo et al., 2015). For example, the differing results typically observed between horizontal localization and horizontal bisection tasks might be related to stimulus complexity. For the localization task, participants must localize a single stimulus source; for the bisection task, they must report the spatial position of the middle stimulus relative to two lateral stimuli (i.e., whether it is closer to the first or to the second). In particular, the bisection task requires the stimulus to be processed based on its relative metric, while for localization the estimation of the position in space requires less complex spatial computation. For this reason, the role of visual calibration for the two mechanisms might differ.

Studies have also recently clarified the importance of specific sensory modalities for the development of specific concepts. For example, children typically use haptic information to perceive the size of objects, whereas they use the visual system to understand their orientation (Cuturi and Gori, 2019; Gori et al., 2008, 2012b, c). Given the importance of sensory calibration during development, one line of research has investigated how the lack of one sensory modality affects the development of the others. Results suggest that the absence of one sensory input can impact other modalities. For example, lack of vision affects the development of haptic orientation perception, while motor impairment affects visual size perception (Gori et al., 2010, 2012a). These results suggest that sensory modalities play an essential role in how our brains learn specific environmental properties, and that understanding differences in multisensory and sensory abilities between typically developing and impaired children might help improve interventions for rehabilitation (Cappagli et al., 2017; Gori et al., 2016) and education.
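For readers less familiar with this literature, the precision benefit of integration referred to above is standardly modelled as maximum-likelihood (reliability-weighted) cue combination (Alais and Burr, 2004; Ernst et al., 2007). For two unisensory estimates with variances σ1² and σ2²:

```latex
\hat{S}_{12} = w_1 \hat{S}_1 + w_2 \hat{S}_2, \qquad
w_i = \frac{1/\sigma_i^2}{1/\sigma_1^2 + 1/\sigma_2^2}, \qquad
\sigma_{12}^2 = \frac{\sigma_1^2 \, \sigma_2^2}{\sigma_1^2 + \sigma_2^2}
  \le \min\!\left(\sigma_1^2, \sigma_2^2\right)
```

Because the combined variance can never exceed that of the better unisensory estimate, adult-like integration predicts a measurable multisensory precision benefit; the developmental findings summarized above amount to the observation that, for many tasks, children younger than 8–10 years do not yet show this predicted benefit.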
Understanding how and when these perceptual skills develop during the primary (elementary) school years may provide fundamental knowledge for developing useful educational technology. For example, in the first years of primary school, combining audio and visual information is often more suitable than presenting one signal alone (Rohlf et al., 2020), especially for some tasks such as stimulus localization but not for others such as bisection (Gori et al., 2012b).

As we discussed above, there is evidence for a supporting role of movement, particularly gestures, in learning mathematical concepts. Evidence also indicates that motor representations play a major role in multisensory processing (Fogassi and Gallese, 2004) and that motor planning can modulate multisensory integration in adults (Sober and Sabes, 2003). However, the role that action and action planning play in multisensory perception during development is still unclear. Psychologists and philosophers have long argued for the essential role of sensorimotor interaction with the world in cognitive development (Clark and Chalmers, 1998; Piaget, 1972; Vygotsky, 1978). For example, experts have recognized the fundamental role of external tools in shaping activity and mediating cognition (Vygotsky, 1978). Indeed, sensorimotor experience and interaction with the environment are fundamental to meaning-making and conceptual understanding, providing the basis for learning and playing an essential role in knowledge construction. One recent example is Cooney et al.'s (2021) demonstration of children's representation of number lines along different axes. In general, the study of sensory preferences for different tasks can improve technology for teaching specific concepts. For example, the haptic modality is important for size processing (e.g., Gori et al., 2012a), which underlines the importance of technological solutions that include haptic information in the teaching of size (e.g., by providing haptic feedback). Similarly, the auditory modality is important for rhythm perception (Gori et al., 2012b), suggesting technology that associates rhythm with mathematical concepts (e.g., fractions). These approaches may also provide an alternative means of enhancing teaching and learning. In the next section, we present new technological solutions that could be used to develop educational technology providing multisensory inputs.

Another essential feature is the potential to sense the child's affective state during the learning process and tailor multisensory feedback to the child's needs. As technology becomes a partner in the learning process, it is essential that it can sense children's related affective and cognitive states. While well-designed multisensory feedback can foster confidence in the child, a child might also experience moments of frustration, confusion, and even boredom. It then becomes crucial that technology adapts to children's psychological needs. While negative affective states can be detrimental to learning if they remain unaddressed, positive conditions are also significant. A sign of excitement and curiosity may indicate that the child is ready to be challenged, while over-confidence may lead a child to overlook details (for a review, see Olugbade et al., 2017). The affective computing field has shown that detecting emotional states from expressions is possible.
We argue that, in the current study's context, body expressions are particularly interesting because the body is engaged in exploring learning concepts and possibly expresses how the child feels about what they are enacting. For example, a child's arms may hesitate while being extended to form an angle. A growing body of work in psychology, neuroscience, and computing confirms body expression as a primary channel for understanding how a person feels (Kleinsmith and Bianchi-Berthouze, 2013), as powerful as facial expression and even more so for certain states. Aviezer et al. (2012) argue that body expressions are more informative than facial expressions for intense emotion. De Gelder (2009) also points out that body expressions provide information about how a person feels. They also inform about whether a person is ready to respond to the emotional state, such as whether a confused learner is prepared to invest more effort or is disengaging from the task.

4. Engineering Results: Multisensory Technology and Feedback, Affective State, and Flexibility

As we have argued, multisensory information and specific senses can be essential for developing environmental perception and learning. In this section, we discuss how recent technological development enables new ways of supporting multisensory learning. Multisensory technologies (e.g., haptic, visual, and auditory interfaces) lie at the base of this new approach and can provide novel forms of multisensory interaction that foster new ways of teaching and learning. Indeed, new technological solutions facilitate more bodily-based interaction compared with desktop computing devices. Recent developments in computing extend opportunities to enhance sensorimotor interaction with learning experiences, bringing research interest to the interplay between mind, body, and digital tools: embodied interaction and embodied cognition. Multisensory technology can support concept exploration and understanding, and foster children's confidence in their capability to explore and learn concepts. Below we present three essential features of multisensory technology that can support translating neuroscientific findings into new educational technological solutions.

4.1. Multisensory Feedback

Contemporary work in human–computer interaction (HCI) within learning suggests ways in which whole-body and haptic interaction systems might support learning through multisensory feedback. Whole-body interactive technologies foster learning of abstract concepts (Antle et al., 2013; Malinverni and Pares, 2014), combined with visual (e.g., Smith et al., 2014) and aural augmentation (e.g., Bakker et al., 2011). For example, Moving Sound (MoSo), a set of tangible artefacts that afford different kinds of movement (forwards, backwards, closer, further) mapped metaphorically onto the pitch, volume, and tempo of sounds, enabled children to explore and learn about the abstract qualities of sound (Bakker et al., 2011). Indeed, Walkington and colleagues (2014) report greater transfer to novel problems when using full-body gestures/actions.

More recently, haptic technologies have introduced new tactile experiences to learning. Through force and tactile feedback, haptic technologies enable simulated tactile sensations of an object's hardness, shape, and texture.
Haptic augmented feedback is beneficial for recall, inference, and transfer in elementary learning contexts, particularly for learning about how gears work (Han and Black, 2011) and developing psychomotor skills (Zacharia and Michael, 2015). In the context of mathematics, haptics brings new sensory experiences to the concepts being explored, including partial or unfamiliar perspectives of shapes (Davis et al., 2017), and can foster children's dimensional deconstruction of shape in ways that underpin later enactive 3D shape communication (Price et al., 2020a). Research has explored haptic interaction systems for individuals with visual impairments, often focusing on supporting wayfinding, navigation and orientation using wearable haptic artefacts (e.g., He et al., 2020; Kammoun et al., 2012; Mattheiss et al., 2017; Ross and Blasch, 2000). Other work has focused on designing interfaces that support the visually impaired in engaging with graphics through 3D tactile systems (e.g., Memeo et al., 2021; Siu et al., 2021) to support science learning (Han, 2020) and object recognition (Dhaher and Clements, 2017). Within the context of geometry, some initial work has explored means of engaging children with shape or topological configurations (e.g., Buzzi et al., 2015), primarily using touch screens. However, few projects have explored their use in classroom-based environments.

Furthermore, the recent availability to the general public of cheap devices for virtual reality (VR) and augmented reality (AR) has made these technologies suitable for large-scale employment in education. VR and AR are useful in sensorimotor stimulation when they are associated not only with visual and audio feedback but also with tactile stimulation (e.g., vibration on the hands) or interfaced with haptic robotic platforms (e.g., Omni or Phantom devices). These robotic devices can be used to support the drawing of virtual shapes or to present haptic scenarios (e.g., a virtual Cartesian plane) that one can experience haptically through the definition of virtual forces and positional cues. Although pioneering work (e.g., Kaufmann and Schmalstieg, 2003) dates to the early 2000s, research and applications of VR and AR for mathematical teaching have only recently become more widespread (for recent examples, see Salman et al., 2019; Simonetti et al., 2020; Stranger-Johannessen, 2018). These technologies would benefit from an interdisciplinary approach that identifies the learning concepts the technologies can best convey and the perceptual mechanisms they can leverage.

One essential feature of multisensory technology for learning is the potential to provide sensory feedback associated with specific body movements or sensory interaction. Conceptual metaphors and interactional mappings between what computer designers call 'input actions' and 'output responses' provide a foundation for designing more intuitive sensorimotor interaction interfaces that foster meaningful action with respect to learning concepts (Lindgren and Johnson-Glenberg, 2013).
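As a concrete illustration of such an input-action-to-output-response mapping, consider the sketch below. It is a hypothetical, simplified example rather than any system described in this review: the feature names, ranges, and scaling constants are all assumptions, but its structure, one movement feature per perceptually salient sound dimension, reflects the mappings discussed in this section.

```python
# Hypothetical sketch: mapping normalized movement features onto sound parameters,
# one movement dimension per sound dimension, so the mapping stays legible to a child.

def movement_to_sound(energy: float, expansion: float, fluidity: float) -> dict:
    """Map movement features (assumed pre-normalized to [0, 1]) to sound parameters.

    energy    -> loudness    (more vigorous movement sounds louder)
    expansion -> granularity (a more expanded posture gives a denser sound texture)
    fluidity  -> rhythm      (smoother movement gives steadier, more regular pulses)
    """
    clamp = lambda x: max(0.0, min(1.0, x))
    return {
        "loudness_db": -30.0 + 30.0 * clamp(energy),         # quiet (-30 dB) .. full (0 dB)
        "grains_per_second": 2.0 + 48.0 * clamp(expansion),  # sparse .. dense texture
        "pulse_regularity": clamp(fluidity),                 # 0 = jittery, 1 = metronomic
    }

# An expansive, energetic, fluid gesture yields a loud, dense, steady sound.
print(movement_to_sound(energy=0.9, expansion=0.8, fluidity=0.7))
```

In a real system these parameters would drive a synthesis engine; the design point is that each motor feature controls a single sound dimension, so that the child can discover the relation through exploration.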
Grounding the interaction with technology (e.g., how input is provided, how feedback is conveyed, how feedback is produced for any given input, and so on) in metaphors of real life, i.e., utilizing mechanisms that somewhat mimic real life, supports the development of more intuitive interfaces. In digital experience design, Bakker and colleagues (Bakker et al., 2012) drew on musical concepts such as tempo, volume, and pitch, mapped to movement in terms of speed, proximity, and flow, to design a series of physical interactive artifacts that constrain movement in specific ways to support children learning about abstract qualities of sound. Other work has focused on understanding how embodied interaction with the physical world can help in designing technologies to learn more abstract forms of knowledge. For example, Howison and colleagues (Howison et al., 2011) investigated the role that bodily operations play in learning about proportional equivalence, showing that physical enaction of proportional equivalence supports the development of understanding of this concept. Furthermore, Ma (2017) showed how the body supports communication and negotiation of geometrical mathematical ideas. However, few classroom technologies have moved beyond the predominance of the visual modality (e.g., tablet applications for learning mathematics; see Note 1) to integrate information from multiple sensory modalities and foster congruent actions, in other words, meaningful actions with respect to the concept being learned (Segal et al., 2014).

New technological solutions enable accurate and real-time mapping of motor behaviour onto multiple facets of sound, haptics, and visual media. In mathematics, sound and music content can be associated with arithmetic concepts. Mapping motor behaviour onto single dimensions of sound morphology (e.g., pitch, intensity, granularity, rhythm, and so on) can enable simple associations with arithmetic concepts (e.g., less/more, counting, summing, subtracting, etc.). The concept of order (i.e., less/more) can be conveyed by associating a motor feature (kinetic energy or postural expansion are some examples) with sound granularity. Thus, as the child moves more (or the more the child expands their body), the sound content becomes richer and denser. The concept of fractions can be associated with the musical pulse of a percussive sound, so that smaller fractions relate to faster (i.e., shorter in time) pulses, as happens in music, in which a half note has a duration twice that of a quarter note. In more traditional approaches (e.g., making sounds with objects or musical instruments), a precise mapping of motor features onto sound and music features would be impossible, since acting upon a sound source would simultaneously affect many audio parameters. However, current technology can support the mapping of motor behaviour onto multiple sound-morphology dimensions. Technology can also enable one to map different movement features onto varying features of sound morphology (e.g., energy onto intensity, expansion onto granularity, fluidity onto rhythm, and so on).

4.2. Affective State

Advances have been made in the computational field to automatically classify the emotional states of a person from their body. This recent but fast-growing body of work is due to the current availability of low-cost, full-body sensing technology in everyday contexts (e.g., Kinect, wearable devices).
Bianchi-Berthouze and colleagues (Bianchi-Berthouze, 2003) and Camurri and colleagues (Camurri et al., 2003) pioneered this field by showing the possibility of capturing dimensional facets of emotional states (valence, arousal, control, and avoidance). By investigating these expressions in naturalistic settings, they have also shown that the automatic recognition of realistic body expressions reaches performances similar to those observed for the automatic identification of facial expressions (Aung et al., 2015; Griffin et al., 2015). Body expressions are an informative affective channel in intelligent tutoring systems (Cooper et al., 2011; D'Mello and Graesser, 2010). These studies show that many learning-related affective states (e.g., boredom, confidence, confusion, engagement, excitement, flow, frustration) are detectable using this modality. Work has also examined children engaged in game activities (Sanghvi et al., 2011), showing reliable performance. However, technology for detecting affect in education has so far been considered only in very controlled situations, mainly seated ones, where the body is not explicitly engaged as a source of multisensory experience within the learning tasks.

4.3. Enabling Teachers

Another essential feature is the potential flexibility of multisensory technology in allowing teachers to configure the learning experiences. Teachers can choose, for example, which motor features are mapped onto which sound parameters, supporting a child in quickly achieving fine-grained control of the sound parameters, control that would otherwise require many years of practice, as with a traditional musical instrument. This approach, originally exploited for musical purposes, can easily be re-adapted to educational technology design, allowing effective exploitation of an embodied and enactive pedagogical approach. This can foster effectiveness by using the best modality for each specific concept to be taught, improve personalization and flexibility for teachers and students in the learning process, and be easily re-adapted for children with impairments.

Because of the properties discussed above, we argue herein that multisensory interfaces can support engagement with learning concepts in active and participative ways, encouraging exploration, discovery, and experimentation through manipulation and tactile feedback as well as visual and aural representations. These experiential modes have the potential to bridge the gap between concrete and abstract understanding, where features of different sensory modalities can be exploited to foster interpretation and meaning-making. For example, the auditory modality might be better than the visual modality for understanding rhythm, and the former might be linked to mathematical concepts that are proximate to musical concepts, like fractions. Considering these aspects, multisensory technologies are ideal for effectively supporting a pedagogical approach that exploits specific sensory modalities to teach different concepts, as suggested by neuroscientific findings (e.g., the possibility of developing new technological methods that associate audio rhythm with visual fractions to facilitate the comprehension of this concept).

4.4. Considerations About Multisensory Technological Solutions

Current digital technologies used in the classroom commonly rely on the visual modality for conveying learning concepts.
For example, there are multiple tablet applications for learning mathematics (Note 2), all relying on 'seeing' digital content rather than physically interacting with mathematical ideas. While theories of embodied cognition commonly underpin more contemporary design and development of digital learning environments that better exploit sensorimotor and multisensory interaction, such design is typically not theoretically grounded in psychophysical or neuroscientific research findings. In the following sections, we discuss key challenges and considerations in designing and developing technology for education.

4.4.1. The First Consideration Is That Technological Development Should Be More Child- and Teacher-Centred

Although this is a common goal for developing educational technology in general, here it assumes a particular relevance. It means that the designer can assess (i) the most effective sensory modality for the child to learn a specific concept and (ii) whether specific impairments require exploiting alternative sensory modalities. For example, recent studies showed that musical training could be used as a therapeutic tool for treating children with dyslexia (Overy, 2003; Overy et al., 2003). In our view, the teacher plays a central part as a mediator in employing technology. This means adopting an iterative methodology of design, development, and evaluation within a participatory-design framework involving both teachers and students, one that considers usability, pedagogical effectiveness and customisability. As mediators of learning, teachers need customizable options that enable them to choose the modalities and features related to the concept they are teaching. Following an initial evaluation phase, we may then identify the best modalities linked to different concepts, personalize the technology to exploit the selected sensory modality, and evaluate learning-process outcomes. As a side issue, technology that measures affective state may also help with screening for behavioural problems and addressing them.

4.4.2. The Second Consideration Is That a More Embodied and Enactive Pedagogical Approach Should Be Used to Develop New Technological Solutions

With this approach, teachers can use different sensorimotor feedback (audio, haptic, proprioceptive and visual) to teach new concepts to primary-school children. Such an approach would be more direct, natural and intuitive, since it is based on the experience itself and on perceptual responses to motor acts. Moreover, using movement and sensorimotor interaction for learning deepens and strengthens education and retention (Shoval, 2011), and multisensory designs foster new engagement with mathematical ideas (Ma, 2017; Price et al., 2020a, b; Yiannoutsou et al., 2018). For example, previous studies suggest that body movement enhances spatial perception across visual and haptic modalities (Pasqualotto et al., 2005). Moreover, both space perception and spatial awareness can be improved through the association of body movement and sounds (Cappagli et al., 2017; Cuppone et al., 2018; Finocchietti et al., 2015a, b). A more active role for the user might also improve the engagement of the child in the task.

4.4.3. The Third Consideration Is That Multiple Sensory Feedback and Inputs Should Be Considered in New Applications

Multisensory technologies can overcome the major challenge of the consolidated hegemony of vision in current educational practice.
Focusing too much on one single sensory channel may represent a severe issue for the effectiveness and personalization of the learning process and for the inclusion of children with impairments (such as visual impairment), or indeed children who benefit from alternative routes into learning. If we consider multisensory platforms (e.g., systems that provide audio and visual information simultaneously), we can break through these barriers because both visually impaired and sighted children can use the same system to learn, based on different sensory signals. Concerning effectiveness, learning may be impaired by inappropriate, incorrect, or excessive use of vision, which is not always the most effective modality for communicating certain concepts to children, since different modalities communicate different information. As for personalization, a pedagogical methodology based almost exclusively on the visual modality does not consider children's learning potential and routes of access to learning, which exploit the different modalities in ways that more comprehensively convey different kinds of information (e.g., the haptic modality is often better than vision for the perception of texture). Every child could use a different learning approach mediated by the sensory signal most effective for that specific person (e.g., for learning shape, one child might prefer the visual signal and another the haptic signal). In the example above, visually impaired as well as typically sighted children can use the audio modality, or both the visual and auditory modalities simultaneously, based on their individual predisposition.

4.4.4. The Fourth Consideration Is That New Technological Solutions Should Be Grounded on Pedagogical and Neuroscientific Needs

From a pedagogical perspective, multisensory technology should provide effective means to teach specific concepts that would benefit from digital augmentation or mediation. These could be concepts that are particularly difficult for children to understand, or concepts for which sensory modalities other than vision can enhance critical ideas. In a recent survey we conducted with teachers (see Cuturi et al., 2021a), we observed that which concepts are challenging depends on school level. In primary-school-aged children, in the geometrical context, concepts related to mental transformations are problematic, while angles are of medium and high difficulty for students at levels 3–4 (8–10 years old). Moreover, teachers indicated that other sensory modalities (such as haptics) can benefit the understanding of 'isometric transformations': flipping, translation, and rotation. All this information may offer guidance toward improving teaching strategies and technology design. From a neuroscientific perspective, multisensory technology should leverage, according to scientific evidence, the sensorial, perceptual, and cognitive capabilities that children possess. A technology capable of detecting specific motor behaviours in a target population of children (e.g., primary school) makes sense only if scientific evidence shows that children in that population can display such actions. The same holds for feedback: multisensory technology can provide specific feedback (e.g., based on auditory pitch).
This approach is appropriate if (i) children can perceive the feedback (e.g., they have developed the perception of pitch), and (ii) an association exists, or a new association can be trained, between the feedback and the concept to be communicated (e.g., the association between pitch and the size of objects, or between audio and body movements, as in the ABBI device; Gori et al., 2017).

In the second part of this review, we use this approach to address these critical points in an example application. To that end, we investigated children's sensory preferences for sound and visual angles. We then studied, from pedagogical and affective standpoints, the child's ability to learn angles through sounds. Finally, we describe a new flexible technology that we developed for understanding angles through body movements, with sounds as feedback. We expect that utilizing movements (e.g., gestures) that are conceptually congruent with the knowledge being learned increases the child's performance, learning, understanding and motivation (Segal et al., 2014). In this new technological approach, called 'RobotAngle', children make concept-congruent movements with their bodies that correspond with changes in perceptual feedback, using both vision and audition. Moreover, these sensorimotor associations improve space representation and facilitate the link between the body and space (as shown with the ABBI device: Cappagli et al., 2017; Cuppone et al., 2018; Finocchietti et al., 2015c).

4.4.5. One Example of Our Approach: the Use of Sounds and Body Movements to Learn Angles

We have considered the issues outlined above in designing and developing a new digital environment to support children's learning of angles using the association between audition and body movement. This effort was conducted as part of our collaborative project, 'weDRAW'. In the next paragraphs, we present our results and the interdisciplinary process of work we adopted from pedagogical, psychophysical, and HCI points of view.

4.4.6. Pedagogical Inputs

From a pedagogical point of view, we first needed to better understand the role of the body in children's experience of mathematical ideas (e.g., shapes and angles in geometry). We used this understanding to inform the design of a digital game that effectively fosters meaningful bodily enactment for young children (aged 6–11) learning geometric concepts. It was necessary for us to understand how to encourage meaningful or congruent action and how to design useful reflective feedback that augments bodily interaction to support practical mathematical thinking. Accordingly, we collected data on children's spontaneous, intuitive body actions and bodily representations for engaging with and creating angles and shapes, to understand how whole-body sensory experiences can engage children in meaning-making around ideas of shape and angle (Price and Duffy, 2018). This study engaged 29 students from 7 to 11 years of age to examine the kinds of bodily movements children make and how they interpret and use them to experience different angles and shapes. Doing so reveals the benefits and limitations of bodily exploration, informing the design of augmented sensory interaction to support effective mathematical thinking. Drawing on Henderson and Taimiņa (2005), activities were purposefully designed to ensure a clear maths concept was explored, with whole-body movement integral to the task.
The study aligned activities with the school curriculum in the UK (also consistent with pedagogical goals in other educational systems across Europe). It defined angles as a geometric figure (a pair of rays with a common endpoint), a dynamic figure (a turn or rotation), and a measure (Price and Duffy, 2018). Children worked in groups of 3 or 4 on three tasks: (i) using their bodies to make angles; (ii) using their bodies to make shapes; and (iii) using their bodies to create symmetry of shapes.

Facilitators encouraged the children to think aloud as they worked, using questions such as "what is the new angle?", "is it bigger or smaller than the first angle?" and "what is the combined angle in total?" The study collected video data for qualitative analysis to examine moment-by-moment bodily interactions through a focus on gesture, action, facial expression, body posture, and talk (Jewitt, 2015). The aim of the research was to inform how one can use the body to enact and engage with mathematical ideas. The research showed that several considerations must be accounted for when using the body as a learning resource. Fundamental bodily limitations became more apparent as children used various bodily postures to represent different angles and communicate them to others (Fig. 1A). For example, stretching beyond acute angles to create obtuse or reflex angles was physically challenging.

Figure 1. (A) Children exploring new ways to represent angles with the body. (B) Groups of children working to create a shape composed of multiple angles.

Differences in children's physical bodies (e.g., the lengths of their arms or legs) led to differences in collaborative shape formation (e.g., two intended equilateral sides were of different lengths) and to potential misconceptions about the properties of regular polygons. Figure 1B presents a group standing in a circle to form a triangle with joined arms. Getting the right length was a challenge due to differences in participants' arm lengths. This challenge created opportunities for reflection on the group's aim, on the discrepancy between that aim and their body make-up, and on ways to achieve appropriate lengths, such as overlapping or 'shortening' their arms. This result provides evidence that the children were thinking about the critical features of equilateral triangles.

Finally, differences between how children perceived or 'felt' the positioning of their bodies in space and how this looked to others suggest the need to foster better awareness of the body in space and to make the links between 'felt' and 'overt' bodily experience explicit.

4.5. Psychophysical Inputs

Starting from these pedagogical results, we investigated, from a psychophysical point of view, how alternative sensory modalities could be associated with angle processing. To achieve this goal, we investigated the phenomenon of perceptual correspondence during development (Cuturi et al., 2019, 2021b). Developmental studies show that children can associate visual size with apparently unrelated non-visual stimuli, such as pure-tone frequencies or proprioception (Holmes et al., in prep.). So far, most of the literature has focused on audiovisual size associations, showing that children can associate low pure-tone frequencies with big objects and high pure-tone frequencies with small ones. We investigated whether sound frequency could offer information about angle size.
Toward this goal, we investigated how cross-modal audiovisual associations develop during primary-school age, from 6 to 11 years old (Cuturi et al., 2019). To unveil such patterns, we took advantage of a range of pure auditory tones and tested how primary-school children match sounds with visually presented shapes. We tested 66 children (6–11 years old) in an audiovisual matching task involving a range of pure-tone frequencies. The visual stimuli were angles of different sizes (see Fig. 2A).

Figure 2. (A) Visual angles used in the task. From the left, the angles were 100°, 60°, 40°, 20° and 10°; each line composing the angles is 6.5 cm long. Participants were asked to indicate the visual stimulus corresponding to the heard auditory stimulus. (B) Each data point indicates the average of each response rating's probability distribution corresponding to the response options. Error bars indicate the 95% confidence interval. One asterisk (*) indicates p < 0.05, two asterisks (**) indicate p < 0.01. With permission from Cuturi et al., 2020.

Participants were asked to indicate the shape matching the sound they heard. We present the results in Fig. 2B. All children associated large angles with low-pitch sounds and small angles with high-pitch sounds. Interestingly, older children made greater use of intermediate visual sizes than younger children in providing their responses. Audiovisual associations for finer stimuli might develop later, likely depending on the maturation of supramodal size-perception processes. In light of this result, we suggest that these natural audiovisual size correspondences can be used for educational purposes to support the learning of relative size, including angles. Moreover, the effectiveness of such correspondences can be optimised according to children's specific developmental stages.

4.6. Inputs From Affective Technology: the Engineering Approach

It is crucial to consider the user's state in the design of technology. To investigate this point, we also studied children's affective state when using sounds and body movements to learn angles with technology. Olugbade and colleagues (Olugbade et al., 2020) present a system that automatically infers a child's affective and cognitive states as they engage in mathematical games, exploring mathematical concepts through their body and sound feedback. In recent work (Volta et al., 2019), we extended this approach to children with visual impairment. While these works are preliminary, they show the potential that new multi-sensing and affect-aware technology offers for learning in uncontrolled, naturalistic settings. This research also offers an in-depth analysis of how children express certain learning-related affective and cognitive states through their body and other modalities. We considered three categories of features in the movement analysis: low-level features (e.g., velocity, energy, postural configurations), spatial/temporal features (e.g., trajectory length, distance covered), and motion descriptors (directness, smoothness, impulsivity). These types of features have been effective in affect detection (De Silva and Bianchi-Berthouze, 2004; Griffin et al., 2015). Furthermore, they are related to features that experts have used to assess self-efficacy, curiosity, and reflectivity in other contexts (Olugbade et al., 2018). Studies also indicate that postural configurations capture various states across contexts and cultures (Kleinsmith et al., 2005).
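To make these three feature categories concrete, the sketch below computes representative examples of each from the sampled 3D trajectory of a single tracked joint. It is a simplified, hypothetical illustration (the feature definitions follow common conventions in the movement-analysis literature; the actual pipeline, built on EyesWeb XMI, is richer):

```python
# Hypothetical sketch: representative movement features from one tracked joint.
# trajectory: list of (x, y, z) positions in metres, sampled at rate fs (Hz).
import math

def dist(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def movement_features(trajectory, fs=30.0):
    dt = 1.0 / fs
    steps = [dist(p, q) for p, q in zip(trajectory, trajectory[1:])]
    speeds = [s / dt for s in steps]                    # low-level: instantaneous speed
    mean_speed = sum(speeds) / len(speeds)
    energy = sum(v * v for v in speeds) / len(speeds)   # low-level: ~kinetic energy (unit mass)
    path_length = sum(steps)                            # spatial/temporal: trajectory length
    net_distance = dist(trajectory[0], trajectory[-1])  # spatial/temporal: distance covered
    directness = net_distance / path_length if path_length > 0 else 1.0  # motion descriptor
    return {
        "mean_speed": mean_speed,
        "energy": energy,
        "path_length": path_length,
        "net_distance": net_distance,
        "directness": directness,  # near 1 = straight, confident reach; near 0 = wandering
    }

# Example: a curved, hesitant hand path gives low directness; a direct reach is near 1.
path = [(0.0, 0.0, 0.0), (0.1, 0.05, 0.0), (0.2, 0.12, 0.0), (0.3, 0.05, 0.0), (0.4, 0.0, 0.0)]
print(movement_features(path))
```

Descriptors such as smoothness and impulsivity are typically computed analogously from higher derivatives (e.g., jerk) of the same trajectories.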
Additionally, using more advanced machine-learning techniques, we let the algorithms identify patterns of movement that relate to such states and possibly better capture differences due to learning tasks and children's idiosyncrasies.

5. New Technological Solution Based on Previous Inputs: the Engineering Approach

Starting from the results in the previous sections, we developed a new technological solution that enables the real-time association of visual and auditory feedback with body movements and angle processing. We considered the pedagogical input, discussed in the pedagogical section above, of providing students with a means of exploring the concept of angles through full-body movement. We focused on angles because they were among the most challenging concepts revealed by the survey conducted with teachers (for more details, see Cuturi et al., 2021a). Teachers had to indicate the mathematical concepts that they perceived as difficult to teach with the visual modality, and which alternative inputs (audio or touch) could effectively teach a particular concept when the visual modality is missing, as in the case of visually impaired individuals. The results suggest that angles are particularly difficult for children to understand. Indeed, angle comprehension presented medium and high difficulty levels for students in the 8–10-year-old age group (in-class years specific to their national educational system). This result agrees with scientific evidence showing that the angle concept is probably difficult to learn because angle size is hard to disentangle from overall figure size (Dillon and Spelke, 2018; Gibson and Maurer, 2016; Mitchelmore and White, 2000). We implemented the results obtained through the psychophysical tests by employing specific auditory pitches to convey angle size. Finally, we included the analysis of the three categories of movement features highlighted in the abovementioned affective study. As a result, we developed a full-body activity in which different proprioceptive skills and sensory modalities are needed to solve a mathematical problem concerning angles.

The activity can potentially be used both in the classroom and at home. Its setup consists of a range-imaging sensor device [in particular, we used Kinect v.2 (Microsoft Corporation, Redmond, WA, USA)] connected to a personal computer. The software was implemented in the EyesWeb XMI platform (Volpe et al., 2016) and Unity (Unity Technologies, San Francisco, CA, USA). In this process, a major difficulty consisted of designing a clear relation between the geometrical concept and the child's embodied experience, a relation strong enough to be useful for learning. We addressed this challenge by involving teachers, pedagogues, and psychologists in the design process and by conducting early testing of the produced prototypes with children. We applied a user-centred, game-based, non-invasive, and ecological approach using simple and natural stimuli, adapting the training language to the subject rather than forcing the opposite process. We ran workshops during the technological development wherein children engaged with the designs while the teachers evaluated the results. This iterative, interactive approach involving all users, with the technology adapted based on user feedback and experience, allowed us to fine-tune the methods.
Throughout this process, an important relation between sound and body movement emerged and was highlighted by our psychophysical experiments (e.g., sound pitch should be associated with the size of the angle aperture).

After a short introduction explaining the rules at the beginning of the activity, the system asks the child to move their arms in space to represent a specific angle. The system tracks the child's movements to compute which angle is defined. Once the child reproduces the required angle correctly, the system proposes a new angle. The goal consists of reconstructing all the angles contained in a complex shape (e.g., the house represented in Fig. 3). To facilitate the child's adjustment of their movements, two kinds of feedback are provided: (i) visual feedback (i.e., two lines drawing the angle created by the arms) and (ii) auditory feedback (i.e., a different sound for each angle). A sound model maps each angle to a different sound: starting from a reference sound (associated with 0°), the pitch decreases as the angle size increases.

Figure 3. (A) A screenshot of the activity developed for exploring angles. The child must reconstruct the angles contained in a complex shape (the house). The visual feedback provided is represented: the white angle is the angle the child forms with her arms. The final composition of angles creates a house. (B) A child engaged in the activity.

From a technological point of view, the Kinect sensor is used to acquire the 3D coordinates of the child's head and hands. These data are used to control the production of the visual and auditory feedback in real time, using EyesWeb XMI and Unity (the core of this computation is sketched below). Features of the activity (e.g., the auditory and visual feedback content) are adaptable to each child. For example, we developed a version for visually impaired children by adapting the visual feedback with higher contrast, allowing them to perform the same activity as sighted children. During the activity, we stored movement data to perform movement and affective analyses to assess the child's behaviour, performance, and progression.
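For concreteness, the following is a schematic, hypothetical reconstruction of this per-frame computation, not the actual EyesWeb XMI/Unity code. It derives the inter-arm angle from the tracked 3D positions of the two hands relative to a vertex point on the body (assumed here; the real skeleton joints used may differ) and maps the angle onto a feedback pitch that falls as the angle opens, consistent with the large-angle/low-pitch correspondence reported above (the reference frequency and pitch range are illustrative):

```python
# Hypothetical sketch: inter-arm angle from tracked joints, mapped to feedback pitch.
import math

def arm_angle_deg(vertex, hand_left, hand_right):
    """Angle (degrees) at `vertex` between the two vertex->hand vectors (3D points)."""
    u = [h - v for h, v in zip(hand_left, vertex)]
    w = [h - v for h, v in zip(hand_right, vertex)]
    dot = sum(a * b for a, b in zip(u, w))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in w))
    cos_angle = max(-1.0, min(1.0, dot / norm))  # clamp against rounding errors
    return math.degrees(math.acos(cos_angle))

def angle_to_pitch_hz(angle_deg, f_ref=880.0, octaves_span=2.0):
    """Map 0..180 degrees onto a pitch that falls from f_ref as the angle opens.

    At 0 degrees the reference pitch sounds; at 180 degrees the pitch is
    `octaves_span` octaves lower (all constants here are illustrative).
    """
    return f_ref * 2.0 ** (-octaves_span * angle_deg / 180.0)

# Example frame: vertex at the chest, hands raised to form a roughly right angle.
vertex, left, right = (0.0, 1.2, 2.0), (-0.5, 1.7, 2.0), (0.5, 1.7, 2.0)
a = arm_angle_deg(vertex, left, right)
print(f"angle = {a:.1f} deg, feedback pitch = {angle_to_pitch_hz(a):.1f} Hz")
```

Run on every tracking frame, such a mapping lets the child hear the pitch glide continuously as the arms open and close.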
In our study, we performed a measure of number estimation, a measure of proportional reasoning, and a measure of visuo-spatial abilities. In the number estimation test, we required participants to localize the position of a specific number (e.g., 17) on a bold horizontal number line from 0 to 100 (or from −100 to 0) in each trial. For the proportional-reasoning measure, we used the previously developed Proportional Reasoning Task (Boyer et al., 2008): we asked participants to select a proportion that matched a target juice mixture, with proportionality determined by the relative quantities of juice and water parts. For the measure of visuo-spatial abilities, we used a validated Italian battery of tests specifically developed to link visuo-spatial abilities with geometrical knowledge (Mammarella et al., 2012).

The results suggest that children of different ages improved in different tasks after different training. For example, the only test children improved upon after training with the RobotAngle was the visuo-spatial abilities test. The improvement was specific to children of 9 years and not of 7 years, suggesting that the effect of the training is age-specific and perhaps associated with developmental knowledge. By contrast, the 7-year-old children improved their performance in the number line task after both training activities (i.e., RobotAngle and BodyFraction). A possible explanation of this result is that younger children might benefit more from associating numbers, arm aperture, and sound. Both training activities provided this association: in the RobotAngle activity, the relationship was created between numbers and increasing quantities represented as angle apertures with corresponding sounds and arm apertures; in the BodyFraction activity, the relationship was created in the correspondence between increasing/decreasing numbers and arm/leg aperture and rhythm. Thus, 7-year-old children, who are less familiar with the concept of numbers along a continuum, might benefit more from the training. In contrast, for the geometrical test, the RobotAngle activity is more directly related to geometry and may therefore be useful at 9 years of age but too complex for younger children to internalize well. This result suggests that there might be a relationship between the school level and learning stage of the selected groups of children, indicating that training effects might differ based on age. Age appropriateness of specific task–modality combinations is needed to ensure the usefulness of sensorimotor training. We can speculate that the benefit is maximal within the window between the emerging need to understand a concept and the cognitive basis required to learn it: 7-year-old children improved more in the number line task and 9-year-old children in the geometrical test because these tests present complex, relatively unfamiliar concepts for their respective ages, while children must already have the foundation necessary to learn the concepts through the training. Indeed, geometrical concepts are too complex for 7-year-old children, and even with the aid of the training they do not master them; the same is not true for the 9-year-old children.

The findings suggest that informative sounds associated with body movements can be a powerful tool for improving some perceptual concepts related to mathematics learning (see Gori et al., 2021, for more details).
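As an aside on scoring, number-line estimation tasks are commonly scored with the percentage absolute error of each placement; a minimal sketch follows. Gori et al. (2021) report the actual analysis, so this particular metric is an assumption on our part.

```python
def percent_absolute_error(target, response, line_min=0.0, line_max=100.0):
    """Percentage absolute error for one number-line trial:
    |response - target| expressed as a percentage of the line's range."""
    return 100.0 * abs(response - target) / (line_max - line_min)

# A child asked to place 17 on the 0-100 line who marks the point at 24:
print(percent_absolute_error(target=17, response=24))  # 7.0
```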
Admittedly, a limitation of this work is that we were unable to compare the benefit the technology provides against standard teaching methods that use only visual aids (e.g., a classroom lesson with angles drawn on a blackboard). In future work, it will be essential to include a control group that learns angles with the standard teaching approach. In this context, future work must also evaluate, possibly with longitudinal studies, whether these enhancements and the sensorimotor association are maintained after the training ends, and whether the learning generalizes to other sensory information. Moreover, future experiments should clarify how much of the improvement results from the sensorimotor experience compared to the visual and/or auditory experience alone. To support wider usage of the application among students, we developed a new, optimized version of the game to distribute in schools. It runs on the interactive whiteboards (IWBs) already in place in most classrooms, without requiring a Kinect and a PC. The activity includes a first part devoted to the multisensory exploration of angles and a second part of training on the understanding of angles using body movements. It is freely available online (see this link for more details: https://s3a.deascuola.it/wedraw/WEDRAW.html; the link is permanent, and the activity is in Italian).

6. Conclusion

Drawing on recent advances in the literature, we argue for the beneficial use of different sensorimotor signals such as audition, haptics, vision, and movement to teach primary-school children mathematical concepts in novel ways. This review has presented a new approach that links pedagogy with neuroscience and engineering to develop new technology for learning mathematical concepts at primary (i.e., elementary) schools. We discussed the limitations of existing technological solutions and proposed key points that stand to improve learning activities. As a result of this interdisciplinary approach, we presented a new platform based on audiovisual feedback and body movement to teach the geometrical concept of angles to children. In future work, it will be crucial to identify a quantitative method to better measure the improvement that standard procedures and new technological training engender. To conclude, we propose that multisensory technology offers essential inputs for developing new applications and technology for learning, and that a concerted, multidisciplinary approach will allow the field to reach this critical goal.

1.IntroductionTeaching and learning technologies are increasingly present in classrooms, but children and teachers do not yet widely accept and use them. Although significant effort is invested in developing new digital environments for education, essential design and developmental considerations would improve their educational effectiveness. First, digital learning environments are typically predominantly visual and fail to exploit the other sensory modalities, such as audition and touch (including active touch or haptics), that are critical for perception and learning. When we speak about learning, we refer to the embodied cognition theory that does not separate perceptual and conceptual learning. The theory argues that conceptual understanding is grounded in perceptual experience and that perceptual and conceptual learning are inextricably intertwined. The implication of this is that “The richer the perceptual environment using multiple sensory modalities (e.g., using visuals, voiceovers, and movement) during initial learning, the better the student learning, understanding and motivation” (Segal et al., 2014).Recent research highlights the role of multisensory integration and how different sensory modalities influence children’s learning. For example, the visual channel is not always the best sensory modality or the most effective channel to develop perceptual skills (Barrett and Newell, 2015; Gori et al., 2008, 2010, 2012a, 2014a). Indeed, for some features, such as perceiving temporal properties, the auditory modality is the more accurate sense (McGovern et al., 2017). This dominance is true for adults and children with and without disabilities (Gori et al., 2012b, 2017). Similarly, the haptic modality seems crucial for processing size properties in the visual domain (Gori et al., 2008, 2012c). Perception can therefore be influenced by the sensory modality best suited for the task, sometimes referred to as the modality-appropriateness hypothesis (Welch and Warren, 1980).Second, most technological solutions are insufficiently grounded in educational practice. They do not adequately consider both the teachers’ and students’ needs and expectations, involve practitioners in the innovation and transformation process, or integrate into appropriate innovative pedagogical paradigms. Moreover, the technical design is typically not based on neuroscientific and pedagogical results. The current review speaks to these topics by bringing together an interdisciplinary team composed of pedagogical, educational, neuroscientific, and technological experts to present state of the art in research findings from across these disciplines. In particular, the review builds upon previous work devoted to presenting the concept (Volpe and Gori, 2019) and methodology of our research (Price and Duffy, 2018) and extends it by identifying how this approach — grounded in neuroscientific and pedagogical evidence — provides a unique contribution to the development of new technological solutions for school learning. It mainly focuses on the authors’ lab work, the studies carried out and the results obtained by applying the methodology that the project consortium proposed in the framework of the EU-H2020-ICT project ‘weDRAW’. In the first part of the review, we discuss how the challenges in pedagogical practices and recent neuroscientific evidence may present novel opportunities for multisensory technology design and significant benefits to the learning process. 
In the second part of the review, we present the scientific results and technological development of the work activity of the authors through a single-use case for learning angles through sounds and body movements (these data are a summary of the data reported in Gori et al., 2021).2.Pedagogical Approach: the Role of Sensory Modalities and Multisensory Inputs in LearningThere has been a longstanding interest in how learning can be supported by using representations that engage multiple sensory modalities. For example, the Montessori education tradition uses artifacts like sandpaper letters that children trace with their fingers to develop the physical skill of learning to write. Papert (1980) discussed the idea of body-syntonic reasoning in learning about geometry. This refers to students projecting an experiential understanding of how their own bodies move onto a representation of a turtle moving on a screen. In this case, turtle movements reflect those of the student: if the student moves forward, so too does the turtle. This approach, which has been extended using sensors to record and represent movement, seems to help children to experience and view graphs of their own movements over time to foster experiential knowledge of concepts such as rate of change (Ackermann, 2004). Indeed, recent advances in theories of embodied cognition highlight the important role of the sensory body, experience, emotion and social interaction for learning and development (e.g., Alibali and Nathan, 2012; Barsalou, 2008; Varela et al., 1991). A central argument is that meaning and conceptual representation are grounded in perceptual and motor experience. In other words, in situ sensory experiences provide sensorimotor representations, or ‘embodied tools’ (through action, gesture and analogical mapping) that one can use later in reasoning (Weisberg and Newcombe, 2017).A substantial tranche of research, particularly in mathematics, shows the value of embodied and multisensory forms of interaction for learning. In mathematics, the body’s role (embodied cognition) and proprioceptive feedback are backed by significant evidence in the literature (Abrahamson, 2011; Cress et al., 2010). In particular, fostering ‘mindful’ movement supports mathematical problem solving (Ma, 2017; Rosenfeld, 2017; Shoval, 2011). For example, Shoval (2011) showed that children who were taught about angles through meaningful movements seemed to outperform those who a teacher taught verbally. The notion of ‘meaningful’ here refers to when bodily activities are related to a learning task in a conceptually meaningful way, such as swinging the forearm with the elbow fixed when learning about pendulum motion (Lindgren and Johnson-Glenberg, 2013). The authors argue that learners performing movements allowed them to encode concepts and externalise thoughts and understandings, cementing their learning. “In effect, one way of knowing a mathematical relationship is by being the relationship. In particular, learners can enact and therefore become mathematical relations by using gestures, an important type of body-based action” (Walkington et al., 2014). Bautista et al. (2011) go as far as to state that interacting with mathematical principles ‘in the flesh’ and incorporating kinetic body movement might be required for children to develop abstract mathematical knowledge. 
Further evidence using a control group that separates motor activity from other modalities would be useful.Another key area of evidence for embodied and multisensory forms of interaction derives from gesture studies. Goldin-Meadow and colleagues (2001) suggest that gestures can ‘offload’ some aspects of memory, augmenting working memory and thereby improving performance. Furthermore, research indicates that gestures simulate actions from mental representations of concepts and that those gestures are based on sensorimotor interaction. Results point to the notion of ‘gestural congruency’ (Segal et al., 2014), where the child’s action is linked to the underlying conceptual idea or representation (Lindgren and Johnson-Glenberg, 2013). Indeed, actions congruent with the nature of the underlying process of the task seem to improve performance and the likelihood of transferability of the skills in children (Schwarz and Martin, 2008). In addition, teacher gesture in the classroom may play an important role in conveying conceptual ideas to children (Alibali and Nathan, 2007). Representational gestures (those that convey aspects of their meaning, literally or metaphorically) demonstrate simulations of actions among mathematical objects (for instance conveying the slope on a graph), and metaphoric gestures demonstrate body-based conceptual metaphors. For example, ‘arithmetic is collecting objects’. Collectively, this work suggests that both using and observing congruent gestures seem beneficial for learning. The notion of conceptual metaphor draws on Lakoff and Johnson’s (1980a, b) work, which considers the role of sensorimotor experiences to be central in the foundation of human language and our conceptualisation of experienced phenomena (Hampe, 2005). Physical experiences form metaphorical analogies for abstract ideas. For example, the metaphor of ‘love is a journey’ is based on physical experience: similarly, one might say ‘it’s been a long road’ or one has ‘reached a cross-roads’ as metaphors using physical experience to indicate social, professional, or emotional experiences that have nothing to do with actual roads.An embodied perspective inherently embraces the idea of multimodal and multisensory learning, given its foundation in the body’s sensory experience. Research in cognitive science has shown the different representational affordances of images, text, animation and changing digital interfaces for conveying ideas and concepts (e.g., Price, 2002; Stenning, 1998); “Diagrammatic representations also typically display information that is only implicit in sentential representations and that therefore has to be computed, sometime at great cost, to make it explicit for use” (Larkin and Simon, 1987, p. 65). Furthermore, research in social semiotics and the role of multimodal or multiple representations offer new ways of engaging with learning ideas (e.g., Ainsworth, 2006; Kress, 2009; Moreno and Mayer, 1999). For example, combining tangible and visual materials emphasizes mathematical principles and seems to promote reflection on the connection between these principles and the actions the students are performing (Cramer and Antle, 2015). In addition, haptic information about molecular forces in conjunction with a 3D visual protein model improves protein–ligand understanding (Bivall et al., 2011). 
Finally, other theoretically grounded research exploring the cognitive impacts of multimodal learning materials (e.g., Moreno and Mayer, 1999) suggests that coherently integrating verbal and pictorial information to reduce the cognitive load for different processing channels (e.g., auditory and visual) may be pivotal.Understanding how best to design and integrate multisensory approaches into education is complex. One limitation of studies in this area may be a lack of experimental designs that include control groups of children performing similar activities with different modalities and feedback. This might contribute to confirming the benefits provided by embodied and sensory stimulation. Neuroscientific research can inform this design through better understanding the development of sensory modalities across childhood and multisensory integration.3.Neuroscientific Findings: Multisensory Integration and Sensory Interaction During DevelopmentResearchers in many fields have considered the role of each of the sensory modalities on perception during child development. Such fields include developmental psychology, psychophysics, and neuroscience. This current research provides evidence for the importance of using multiple senses and how specific sensory modalities are fundamental to develop specific perceptual environmental properties. This highlights the value of understanding the role of the senses for different tasks at varying developmental stages to inform the design of new educational technology.The same environmental features may be measured by more than one sensory system, so the brain must integrate redundant sensory information of a particular ecological property. For example, it can use the visual and haptic features of a grasped object (Ernst et al., 2007; Newell et al., 2001; Woods et al., 2008) or a spatial array (Newell et al., 2005; Woods and Newell, 2004) into a coherent percept. Recent results show that individual sensory modalities communicate with each other to influence perception during the first years of life (see, e.g., Bremner et al., 2012). Other findings show that in adults, multiple sensory inputs can benefit performance by improving multisensory precision compared to unisensory precision (Alais and Burr, 2004; Ernst et al., 2007). Developmental studies investigating the emergence of this ability demonstrate that multisensory integration abilities are task-specific and develop at different ages for different tasks. For example, during the first years of primary (i.e., elementary) school, children begin integrating multisensory information for simple localization tasks (Rohlf et al., 2020). For more complex tasks such as spatial bisection, vertical localization, navigation, size and orientation, multisensory integration develops after 8–10 years (Gori et al., 2008, 2012b, c; Nardini et al., 2008, 2014; Petrini et al., 2015). Researchers have also investigated how sensory information becomes calibrated against each other, such as how visual modality influences the development of audition or touch (Burr and Gori, 2012; Gori, 2015). Indeed, some have hypothesized that cross-modal calibration is an important developmental mechanism that occurs in some cases before the ability to integrate multisensory information occurs (Burr and Gori, 2012; Gori, 2015; Gori et al., 2008). 
This process is based on the idea that specific sensory modalities are more accurate (even if they are less precise) during development and one uses them to ‘train’ the development of other sensory systems. Cross-modal calibration is important because, in a developing system, the sensory modalities and different body parts grow at different rates and multisensory integration may be less useful than having a reference sensory system that can help maintain sensory stability (Burr and Gori, 2012; Gori, 2015; Gori et al., 2008). For example, using the visual modality to calibrate haptic orientation perception might help create a more stable system for orientation (i.e., vision) that can be used as a reference by more variable haptic systems. The study of multisensory integration during typical development has offered results in support of cross-modal calibration (Cuturi and Gori, 2017; Gori et al., 2008) and in the case of children with a disability (e.g., Cappagli et al., 2017; Gori et al., 2010; Tinelli et al., 2015). Cross-modal calibration during development for some tasks is task-specific. Indeed, some kind of cross-modal calibration occurs during the first period of life until the end of primary school and before multisensory integration development, i.e., orientation, size and space bisection perception (Gori et al., 2008, 2012b, c). In contrast, other forms of cross-modal calibration start later in life (after 8–10 years of age) after the beginning of multisensory integration abilities, such as audio-motor temporal recalibration and the localization of stimuli (Rohlf et al., 2020; Vercillo et al., 2015). For example, differing results that are typically observed between horizontal localization and horizontal bisection tasks might be related to stimulus complexity. For the localization task, the participants must localize a single stimulus source, and for the bisection task they must report the spatial position of the stimulus in the middle between two lateral stimuli (i.e., if closer to the first or to the second). In particular, the stimulus should be processed based on its relative metric for the bisection task, while for localization the estimation of the position in space requires less complex spatial calculations. For this reason, the role of visual calibration for the two mechanisms might differ.Studies have also recently clarified the importance of specific sensory modalities for specific concepts’ development. For example, children typically use haptic information to perceive the size of objects, whereas they use the visual system to understand their orientation (Cuturi and Gori, 2019; Gori et al., 2008, 2012b, c). Given the importance of sensory calibration during development, an object of study has been to investigate how the lack of one sensory modality affects the development of other sensory modalities. Results suggest that the absence of one sensory input can impact other modalities. For example, lack of vision affects the development of haptic orientation perception, while motor impairment affects visual size perception (Gori et al., 2010, 2012a). These results suggest that sensory modalities play an essential role in how our brains learn specific environmental properties and the understanding of differences in multisensory and sensory abilities across typical and impaired children might help improve interventions for rehabilitation (Cappagli et al., 2017; Gori et al., 2016) and education. 
Understanding how and when these perceptual skills develop during the primary (elementary) school years may provide fundamental knowledge for developing useful educational technology. For example, in the first years of primary school, combining audio and visual information presentation is often more suitable than presenting one signal alone (Rohlf et al., 2020), especially for some tasks such as stimulus localization but not for others such as bisection (Gori et al., 2012b).As we discussed above, evidence for a supporting role of movement, particularly gestures, with regard to learning mathematical concepts has been reported. There is evidence that indicates motor representations play a major role in multisensory processing (Fogassi and Gallese, 2004) and motor planning can modulate multisensory integration in adults (Sober and Sabes, 2003). However, the role that action and action planning play for multisensory perception during development is still unclear. Psychologists and philosophers have long argued for the essential role of sensorimotor interaction with the world for cognitive development (Clark and Chalmers, 1998; Piaget, 1972; Vygotsky, 1978). For example, experts have recognized the fundamental role of external tools in shaping activity and mediating cognition (Vygotsky, 1978). Indeed, sensorimotor experience and interaction with the environment are fundamental to meaning-making and conceptual understanding, providing the basis for learning and playing an essential role in knowledge construction. One recent example is the demonstration of children’s representation of number lines along different axes that Cooney et al. (2021) reported. In general, the study of sensory preferences for different tasks can improve technology for teaching specific concepts. For example, the haptic modality is important for size processing (e.g., Gori et al., 2012a), which enhances the importance of technological solutions that include haptic information in the teaching of size (e.g., by providing haptic feedback). Similarly, the audio modality is important for rhythm perception (Gori et al., 2012b) and technology that can associate rhythm with mathematical concepts (e.g., fractions). These may also provide an alternative means of enhancing teaching and learning. In the next section, we present new technological solutions that one could use to develop educational technology, providing multisensory inputs.Another essential feature is the potential to sense the child’s affective state during the learning process and tailor multisensory feedback to the child’s needs. As technology becomes a partner in the learning process, it is essential that it can sense children’s related affective and cognitive states. While well designed multisensory feedback can foster confidence in the child, a child might also experience moments of frustration, confusion, and even boredom. It then becomes crucial that technology adapts to children’s psychological needs. While negative affective states can be detrimental to learning if they remain unaddressed, positive conditions are also significant. A sign of excitement and curiosity may indicate that the child is ready to be challenged, while over-confidence may lead a child to overlook details (for a review, see Olugbade et al., 2017). The affective computing field has shown that detecting emotional states from expressions is possible. 
We argue that in the current study’s context, body expressions are particularly interesting because the body is engaged in exploring learning concepts and possibly expressing how the child feels about what is enacting through their body. For example, a child’s arm may be hesitating while s/he extends them to form an angle. A growing body of work in psychology, neuroscience, and computing confirms body expression as a primary channel for understanding how a person feels (Kleinsmith and Bianchi-Berthouze, 2013) as powerful as facial expressions and even more in certain states. Aviezer et al. (2012) argue that body expressions are more informative than facial expressions for intense emotion. De Gelder (2009) also points out that body expressions provide information about how a person feels. They also inform about whether a person is ready to respond to the emotional state, such as whether a confused learner is prepared to invest more effort or is disengaging with the task.4.Engineering Results: Multisensory Technology and Feedback, Affective State, and FlexibilityAs we have argued, multisensory information and specific senses can be essential for developing environmental perception and learning. In this section, we discuss how recent technological development enables new ways of supporting multisensory learning. There are multisensory technologies at the base of this new approach (e.g., haptic, visual, and auditory interfaces) that can provide novel forms of multisensory interaction that can foster new ways of teaching and learning. Indeed, new technological solutions facilitate more bodily-based interaction compared with desktop computing devices. Recent developments in computing extend opportunities to enhance sensorimotor interaction with learning experiences bringing research interest to understanding the interaction between mind, body, and digital tools: embodied interaction and embodied cognition. Multisensory technology can support concept exploration and understanding, and foster children’s confidence in their capability to explore and learn concepts. Below we present three essential features of multisensory technology that can support inputting neuroscientific findings into new educational, technological solutions.4.1.Multisensory FeedbackContemporary work in human–computer interaction (HCI) within learning suggests ways in which whole-body and haptic interaction systems through multisensory feedback might support learning. Whole-body interactive technologies foster learning of abstract concepts (Antle et al., 2013; Malinverni and Pares, 2014), combined with visual (e.g., Smith et al., 2014) and aural augmentation (e.g., Bakker et al., 2011). For example, Moving Sound (MoSo), a set of tangible artefacts that afford different kinds of movement (forwards, backwards, closer, further) mapped metaphorically onto the pitch, volume, and tempo of sounds, enabled children to explore and learn about the abstract qualities of sound (Bakker et al., 2011). Indeed, Walkington and colleagues (2014) report greater transfer to novel problems when using full-body gestures/action.More recently, haptic technologies have introduced new tactile experiences to learning. Through force and tactile feedback, haptic technologies enable simulated tactile sensations of an object’s hardness, shape, and texture. 
Haptic augmented feedback is beneficial for recall, inference, and transfer in elementary learning contexts, particularly for learning about how gears work (Han and Black, 2011) and developing psychomotor skills (Zacharia and Michael, 2015). In the context of mathematics, haptics brings new sensory experiences on concepts being explored, including partial or unfamiliar perspectives of shapes (Davis et al., 2017), and can foster children’s dimensional deconstruction of shape in ways that underpin later enactive 3D shape communication (Price et al., 2020a). Research has explored haptic interaction systems among individuals with visual impairments, often focusing on supporting wayfinding, navigation and orientation using wearable haptic artefacts (e.g., He et al., 2020; Kammoun et al., 2012; Mattheiss et al., 2017; Ross and Blasch, 2000). Other work has focused on designing interfaces to support the visually impaired to engage with graphics through 3D tactile systems (e.g., Memeo et al., 2021; Siu et al., 2021) to support science (Han, 2020) and object recognition (Dhaher and Clements, 2017). Within the context of geometry, some initial work has explored means of engaging children with shape or topological configurations (e.g., Buzzi et al., 2015), primarily using touch screens. However, few projects have explored their use in classroom-based environments.Furthermore, the recent availability to the general public of cheap devices for virtual reality (VR) and augmented reality (AR) made these technologies suitable for large-scale employment in education. VR and AR are useful in sensorimotor stimulations where they are associated not only with visual and audio feedback but also with tactile stimulation (e.g., vibration on the hands) or interfaced with haptic robotic platforms (e.g., Omni or Phantom devices). These robotics devices can be used to support the drawing of virtual shapes or present haptic scenarios (e.g., virtual cartesian plane) that one can experience haptically through the definition of virtual forces and positional cues.Although pioneering work (e.g., Kaufmann and Schmalstieg, 2003) dates to the early 2000s, research and applications of VR and AR for mathematical teaching have only recently become more widespread (for recent examples, see Salman et al., 2019; Simonetti et al., 2020; Stranger-Johannessen, 2018). These technologies would benefit from an interdisciplinary approach that identifies learning concepts the technologies can best convey and the perceptual mechanisms they can leverage.One essential feature of multisensory technology for learning is the potential to provide sensory feedback associated with specific body movements or sensory interaction. Conceptual metaphors and interactional mappings between what computer designers call ‘input actions and output responses’ provide a foundation for designing more intuitive sensorimotor interaction interfaces that foster meaningful action concerning learning concepts, such as (Lindgren and Johnson-Glenberg, 2013). 
In other words, grounding the interaction with technology (e.g., how input is provided, how feedback is conveyed, how feedback is produced for any given input, and so on) on metaphors of real life, i.e., utilizing mechanisms that somewhat mimic real life and support the development of more intuitive interfaces.In digital experience design, Bakker and colleagues (Bakker et al., 2012) drew on musical concepts such as tempo, volume, and pitch mapped to movement in terms of speed, proximity, and flow to design a series of physical interactive artifacts constraining movement in specific ways to support children learning about abstract qualities of sound. Other work has focused on understanding how embodied interaction with the physical world can help in designing technologies to learn more abstract forms of knowledge. For example, Howison and colleagues (Howison et al., 2011) investigated the role that bodily operations play in learning about proportional equivalence, showing that physical enaction of proportional equivalence supports one in developing understanding of this concept. Furthermore, Ma (2017) showed how the body supports communication and negotiation of geometrical mathematical ideas. However, few classroom technologies have moved beyond predominance of the visual modality (e.g., tablet applications for learning mathematics — see Note 1) to integrate information from multiple sensory modalities and foster congruent actions, in other words: meaningful actions concerning the concept being known (Segal et al., 2014).New technological solutions enable accurate and real-time mapping of motor behaviour onto multiple facets of sound, haptics, and visual media. In mathematics, sound and music content can be associated with arithmetic concepts. Mapping motor behaviour onto single dimensions of sound morphology (e.g., pitch, intensity, granularity, rhythm, and so on) can enable simple associations with arithmetic concepts (e.g., less/more, counting, summing, subtracting, etc.). The concept of order (i.e., less/more) is conveyable by associating a motor feature (kinetic energy or posture expansion are some examples) with sound granularity. Thus, as the child moves more (or the more the child expands their body), the sound content becomes richer and denser. The concept of fractions can be associated with the musical pulse of a percussive sound so that smaller fractions relate to faster (i.e., shorter in time) pulses, as happens in music in which a half note has a duration that is twice as long as that of a quarter note. In more traditional approaches (e.g., making sounds with objects or musical instruments), a precise mapping of motor features onto sound and music features would be impossible, since acting upon a sound source would simultaneously affect many audio parameters. However, current technology can support the mapping of motor behaviour onto multiple sound morphology dimensions. Technology can also enable one to map different movement features onto varying features of sound morphology (e.g., energy onto intensity, expansion onto granularity, fluidity onto rhythm, and so on).4.2.Affective StateAdvances have been made in the computational field to automatically classify the emotional states of a person from their body. The recent but fast-growing body of work for this affective state is due to the current availability of low-cost, full-body sensing technology and its availability in an everyday context (e.g., Kinect, wearable devices). 
Bianchi-Berthouze and colleagues (Bianchi-Berthouze, 2003) and Camurri and colleagues (Camurri et al., 2003) have pioneered this field by showing the possibility of capturing dimensional (valence, arousal, control, and avoidance) and certain facets of emotional states. By investigating these expressions in naturalistic settings, they have also shown that the automatic recognition of realistic body expressions reaches performances similar to those observed for automatic identification of facial expressions (Aung et al., 2015; Griffin et al., 2015). Body expressions are an informative affective channel in intelligent tutoring systems (Cooper et al., 2011; D’Mello and Graesser, 2010). These studies show that many learning-related affective states (e.g., boredom, confidence, confusion, engagement, excitement, flow, frustration) are detectable using this modality. Work has also been done to examine children (Sanghvi et al., 2011) engaged in in-game activities showing reliable performances. However, technology for detecting affect in education is being considered in only very controlled situations, mainly seated, and where the body is not explicitly engaged as a source of multisensory experience within the specified learning tasks.4.3.Enabling TeachersAnother essential feature is the potential for multisensory technological flexibility in allowing teachers to configure the learning experiences. Teachers can choose, for example, which motor features are mapped onto which sound parameters, supporting a child to quickly achieve fine-grained control on the sound parameters. This requires many years of practice, such as with a traditional musical instrument. One can easily re-adapt this approach exploited for musical aspects for education technology design, allowing for effective exploitation of an embodied and enactive pedagogical approach. This can foster effectiveness by using the best modality for each specific concept to teach, improve personalization and flexibility for teachers and students in the learning process, and be easily re-adapted for children with impairments.Because of the properties discussed above, we argue herein that multisensory interfaces can support engagement with learning concepts in active and participative ways, encouraging exploration, discovery, and experimentation through manipulation and tactile feedback as well as visual and aural representations. These experiential modes have potential to bridge the gap between concrete and abstract understanding, where one can exploit features of different sensory modalities to foster interpretation and meaning-making. For example, the auditory modality might be better for understanding rhythm than the visual modality, and the former might be linked to some mathematical concepts that are proximate to musical concepts, like fractions. Considering these aspects, multisensory technologies are ideal for effectively supporting a pedagogical approach that exploits specific sensory modalities to teach different concepts, as suggested by neuroscientific findings (e.g., the possibility of developing new technological methods that associate audio rhythm with visual fractions to facilitate the comprehension of this concept).4.4.Considerations About Multisensory Technological SolutionsCurrent digital technologies used in the classroom commonly rely on the visual modality for conveying learning concepts. 
For example, there are multiple applications for tablets for learning mathematics (Note 2), all relying on ‘seeing’ digital content rather than physically interacting with mathematical ideas. While theories of embodied cognition commonly underpin more contemporary design and development of digital learning environments that better exploit sensorimotor and multisensory interaction, it is not theoretically grounded on psychophysical or neuroscientific research findings. In the following section, we discuss key challenges and considerations in designing and developing technology for education.4.4.1.The First Consideration Is That Technological Development Should Be More Child- and Teacher-CentredAlthough a common goal for developing educational technology in general, here it assumes a particular relevance. It means that the designer can assess (i) the effective sensory modality for the child to learn a specific concept and (ii) whether specific impairments require exploiting sensory modalities. For example, recent studies showed that one could use musical training as a therapeutic tool for treating children with dyslexia (Overy, 2003; Overy et al., 2003). In our view, the teacher plays a central part as a mediator in employing technology. This means that an iterative methodology of design, development, and evaluation through a framework of a participatory design involving both teachers and students must consider usability, pedagogical effectiveness and customisability. As a mediator of learning, teachers need customizable options that enable them to choose modalities and features related to their teaching concept. Following an initial evaluation phase, we may then identify the best modalities linked to different concepts, personalize the technology to exploit the selected sensory modality, and evaluate learning process outcomes. As a side issue, technology that measures affective state may also help with screening for behavioural problems and addressing them.4.4.2.The Second Consideration Is That a More Embodied and Enactive Pedagogical Approach Should Be Used to Develop New Technological SolutionsWith this approach, teachers can use different sensorimotor feedback (audio, haptic, proprioception and visual) to teach new concepts to primary-school children. Such an approach would be more direct, natural and intuitive since it is based on the experience itself and perceptual responses to motor acts. Moreover, using movement and sensorimotor interaction for learning deepens and strengthens education and retention (Shoval, 2011), and multisensory designs to foster new engagement with mathematical ideas (Ma, 2017; Price et al., 2020a, b; Yiannoutsou et al., 2018). For example, previous studies suggest that body movement enhances spatial perception across visual and haptic modalities (Pasqualotto et al., 2005). Moreover, both space perception and spatial awareness are improvable through the association of body movement and sounds (Cappagli et al., 2017; Cuppone et al., 2018; Finocchietti et al., 2015a, b). A more active role of the user might also improve the engagement of the child in the task.4.4.3.The Third Consideration Is That Multiple Sensory Feedback and Inputs Should Be Considered in New ApplicationsMultisensory technologies can overcome the major challenge of the consolidated hegemony of vision in current educational practice. 
Focusing too much on one single sensory channel may represent a severe issue for the effectiveness and personalization of the learning process and the inclusion of children with impairments (such as visual impairment) or indeed children who benefit from alternative routes into learning. If we consider multisensory platforms (e.g., systems that provide audio and visual information simultaneously) we can break through these barriers because both visually impaired and sighted children can use the same system to learn based on different sensory signals. Concerning effectiveness, this may be impacted by inappropriate, incorrect, or excessive vision usage, which is not always the most effective modality for communicating certain concepts to children since different modalities communicate different information. As for personalization, a pedagogical methodology based almost exclusively on the visual modality would not consider the learning potential and routes of access for learning in children, exploiting the different modalities in ways that more comprehensively convey different kinds of information (i.e., the haptic modality is often better for the perception of texture than vision). Every child could use a different learning approach mediated by the most effective sensory signal for the specific person (e.g., for learning shape, one child would prefer to use the visual signal and another the haptic signal). In the example above, visually impaired but also typical children can use the audio modality or both visual and auditory simultaneously based on their own individual predisposition.4.4.4.The Fourth Consideration Is That New Technological Solutions Should Be Grounded on Pedagogical and Neuroscientific NeedsMultisensory technology should provide effective means to teach specific concepts that would benefit from digital augmentation or mediation from a pedagogical perspective. These could be particularly difficult concepts for children to understand or where sensory modalities other than vision can enhance critical ideas. In a recent survey we conducted with teachers (see Cuturi et al., 2021a), we observed that the challenging concepts depend on school level. We observed that, in primary-school-aged children, in geometrical context, concepts related to mental transformations are problematic while angles are medium and high difficulty for students at levels 3–4 (8–10 years old). Moreover, teachers showed that other sensory modalities (such as haptic) can benefit the understanding of ‘isometric transformations’: tipping, translation, and rotation. All this information may offer guidance toward improving teaching strategies and technology design. According to scientific evidence, from a neuroscientific perspective, multisensory technology should leverage the sensorial, perceptual, and cognitive capabilities that children possess. A technology capable of detecting specific motor behaviours in a target population (e.g., primary school) of children makes sense only if scientific evidence shows that children in in that population can display such actions. The same holds for feedback: multisensory technology can provide specific feedback (e.g., based on the auditory pitch). 
This approach is right if (i) children can perceive it (e.g., they developed the perception of tone), and (ii) an association exists or a new association is trained between the feedback and the concept to be communicated (e.g., the association between pitch and size of objects or audio and body movements as in the ABBI device — Gori et al., 2017).In the second part of this review, we use this approach to address critical points in an example of an application we discuss. To that end, we investigated the sensory preferences of children for sound and visual angles. We then studied the pedagogical and affective ability of the child to learn angles through sounds. Finally, we describe a new flexible technology that we developed to understand angles through body movements and sounds as feedback. We expect that utilizing movements (e.g., gestures) that are conceptually congruent with the knowledge being learned increases the child’s performance, learning, understanding and motivation (Segal et al., 2014). In the context of this new technological approach, called the ‘RobotAngle’, children make concept-congruent movements with their bodies that correspond with changes in perceptually enhanced abilities, using both vision and audition. Moreover, these sensory motor associations improve space representation and facilitate the link between the body and space (as shown in the ABBI device — Cappagli et al., 2017; Cuppone et al., 2018; Finocchietti et al., 2015c).4.4.5.One Example of Our Approach: the Use of Sounds and Body Movements to Learn AnglesWe have considered the issues outlined above in designing and developing a new digital environment to support children’s learning of angles using the association between audition and body movement. This effort was conducted as part of our collaborative project, ‘weDRAW’. In the next paragraph, we present our results and the interdisciplinary process of work we adopted from a pedagogical, psychophysical, and HCI point of view.4.4.6.Pedagogical InputsFrom a pedagogical point of view, we first needed to better understand the role of the body in children’s experience of mathematical ideas (e.g., shapes and angles in geometry). We use this to inform a digital game design that effectively fosters meaningful bodily enactment for young children (aged 6–11) learning of geometric concepts. It was necessary for us to understand how to encourage meaningful or congruent action and design useful reflective feedback, such as designing appropriate feedback that augments bodily interaction to support practical mathematical thinking. Accordingly, we collected data on childrens’ spontaneous, intuitive body actions and bodily representation for engaging with and creating angles and shapes to understand how whole-body sensory experiences can engage children in meaning-making around ideas of shape and angles (Price and Duffy, 2018). This study engaged 29 students from 7 to 11 years of age to examine the kinds of bodily movements children make and how they interpret and use them to experience different angles and shapes. Doing so informs the benefits and limitations of bodily exploration. This allows us to inform the design of augmented sensory interaction to support effective mathematical thinking. Drawing on Henderson and Taimiða (2005), activities were purposefully designed to ensure a clear maths concept was explored, where the whole-body movement was integral to the task. 
The study aligned activities with the school curriculum in the UK (also consistent with pedagogical goals in other educational systems across Europe). It defined angles as a geometric figure (a pair of rays with a common endpoint), a dynamic figure (a turn or rotation), and a measure (Price and Duffy, 2018). Children worked in groups of 3 or 4 with three tasks: (i) to use their bodies to make angles; (ii) use their bodies to make shapes; and (iii) use their bodies to create symmetry of shapes.Facilitators encouraged the children to think aloud as they worked, using questions such as “what is the new angle?”, “is it bigger or smaller than the first angle?” and “what is the combined angle in total?” The study collected video data for qualitative data analysis to examine moment-by-moment bodily interactions through a focus on gesture, action, facial expression, body posture, and talk (Jewitt, 2015). The aim of the research was to inform how one can use the body to enact and engage with mathematical ideas. Research showed that it is necessary to account for several considerations when using the body as a learning resource. Fundamental bodily limitations became more apparent as children used various bodily postures to represent different angles and communicate them to others (Fig. 1A). For example, stretching beyond acute angles to create obtuse or reflex angles was physically challenging.Figure 1.(A) Picture of children exploring new ways to represent angles with the body. (B) Groups of children working to create a shape composed of multiple angles.Differences in children’s physical bodies (e.g., the lengths of their arms or legs) led to differences in collaborative shape formation (e.g., two intended equilateral sides were different lengths) and potential misconceptions of properties of regular polygons. Figure 1B presents a group standing in a circle to form a triangle with joined arms. Getting the right length was a challenge due to differences in participants’ arm lengths. This challenge enabled opportunities for reflection about their aim, the discrepancy between that and their body make-up, and their exploring ways to achieve appropriate lengths, such as overlapping their arms or ‘shortening’ their arms. This result provides evidence that the children were thinking about the critical features of equilateral triangles.Finally, differences in how children perceived or ‘felt’ the positioning of their bodies in space compared to how this looked to others suggest the need to foster better awareness of the body in space and make the links between ‘felt’ and ‘overt’ bodily experience explicit.4.5.Psychophysical InputsStarting from these pedagogical results, from a psychophysical point of view, we investigated how alternative sensory modalities could be associated with angle processing. To achieve this goal, we investigated the phenomenon of perceptual correspondence during development (Cuturi et al., 2019, 2021b). Developmental studies show that children can associate visual size with non-visual stimuli that are apparently unrelated, such as pure-tone frequencies or proprioception (Holmes et al., in prep.). So far, most literature has focused on audiovisual size associations by showing that children can associate low pure-tone frequencies with big objects and high pure-tone frequencies with small ones. We investigated whether the sound frequency could offer information about angle size. 
Toward this goal, we investigated how cross-modal audiovisual associations develop during primary-school age, from 6 to 11 years old (Cuturi et al., 2019). To unveil such patterns, we took advantage of a range of pure auditory tones and tested how primary-school children match sounds with visually presented shapes. We tested 66 children (6–11 years old) in an audiovisual matching task involving a range of pure-tone frequencies. Visual stimuli were angles of different sizes (see Fig. 2A).Figure 2.(A) Visual angles used in the task. From the left, the angles were: 100°, 60°, 40°, 20°, 10°; each line composing the angles is 6.5 cm long. Participants were asked to indicate the visual stimulus that corresponds to the heard auditory stimulus. (B) Each data point indicates the average of each response rating’s probability distribution corresponding to the response options. Error bars indicate the 95% confidence interval. One asterisk (*) indicates p<0.05, two asterisks (**) indicate p<0.01. With permission from Cuturi et al., 2020.Our study asked participants to indicate the shape matching the sound they heard. We present the results in Fig. 2B. All children associated large angles with low-pitch and small angles with high-pitch sounds. Interestingly, older children made greater use of intermediate visual sizes than younger children to provide their responses. Audiovisual associations for finer stimuli might develop later, likely depending on the maturation of supramodal size perception processes. Upon consideration of this result, we suggest that these natural audiovisual size correspondences are useable for educational purposes by supporting the learning of relative size, including angles. Moreover, the effectiveness of such correspondences is optimisable according to children’s specific developmental stages.4.6.Inputs From Affective Technology: the Engineering ApproachIt is crucial to consider the user state in the role of technology. To investigate this point, we also studied children’s affective state in using sounds and body movements to learn angles with technology. Olugbade and colleagues (Olugbade et al., 2020) present a system that automatically infers a child’s affective and cognitive states as they are engaged in mathematical games by exploring mathematical concepts through their body and sound feedback. In recent work (Volta et al., 2019) we extend this approach to children with visual impairment. While these works are preliminary, they show the potential new multi-sensing and affect-aware technology offers to learn during uncontrolled naturalistic settings. Their research also offers an in-depth analysis of how children express certain affective and cognitive states related to learning through their body and other modalities. We considered three categories of features in the movement analysis: low-level features (e.g., velocity, energy, postural configurations), spatial/temporal features (e.g., trajectory length, distance covered), and motion descriptors (directness, smoothness, impulsivity). These types of features have been effective in affect detection (De Silva and Bianchi-Berthouze, 2004; Griffin et al., 2015). Furthermore, they are related to features that experts have used to assess self-efficacy, curiosity, and reflectivity in other contexts (Olugbade et al., 2018). Studies also indicate that postural configurations capture various states across contexts and cultures (Kleinsmith et al., 2005). 
Additionally, using more advanced machine-learning techniques, we let the algorithms identify patterns of movement that relate to such states, potentially better capturing differences due to learning tasks and children's idiosyncrasies.

5. New Technological Solution Based on Previous Inputs: the Engineering Approach

Starting from the results in the previous sections, we developed a new technological solution that associates visual and auditory feedback with body movements and angle processing in real time. We built on the pedagogical input, discussed above, of providing students with a means of exploring the concept of angles through full-body movement. We focused on angles because they were among the most challenging concepts revealed by the survey conducted with teachers (for more details see Cuturi et al., 2021a). In the survey, teachers indicated the mathematical concepts they perceived as difficult to teach with the visual modality, and which alternative inputs (audio or touch) could effectively teach a particular concept when the visual modality is missing, as in the case of visually impaired individuals. The results suggest that angles are particularly difficult for children to understand: teachers rated angle comprehension at medium to high difficulty for students aged 8–10 years (class years specific to their national educational system). This is in agreement with scientific evidence showing that the angle concept is probably difficult to learn because angle size is hard to disentangle from overall figure size (Dillon and Spelke, 2018; Gibson and Maurer, 2016; Mitchelmore and White, 2000). We implemented the results obtained through the psychophysical tests by employing specific auditory pitches to convey angle size. Finally, we included the analysis of the three categories of movement features highlighted in the affective study above. As a result, we developed a full-body activity in which different proprioceptive skills and sensory modalities are recruited to solve a mathematical problem concerning angles.

The activity can potentially be used both in the classroom and at home. Its setup consists of a range-imaging sensor [in particular, we used a Kinect v.2 (Microsoft Corporation, Redmond, WA, USA)] connected to a personal computer. The software was implemented in the EyesWeb XMI platform (Volpe et al., 2016) and Unity (Unity Technologies, San Francisco, CA, USA). A major difficulty in this process was designing a relation between the geometrical concept and the child's embodied experience that was clear and strong enough to be useful for learning. We addressed this challenge by involving teachers, pedagogues, and psychologists in the design process and by testing early prototypes with children. We applied a user-centred, game-based, non-invasive, and ecological approach, using simple and natural stimuli to adapt the training language to the subject rather than forcing the opposite process. During technological development we ran workshops in which children engaged with the designs while teachers evaluated the results. This iterative interaction with all users, and the adaptation of the technology based on their feedback and experience, allowed us to fine-tune the methods.
Throughout this process, an important relation between sound and body movement emerged and was highlighted by our psychophysical experiments (e.g., the sound pitch should be associated with the size of the angle aperture).

After a short introduction explaining the rules, the system asks the child to move their arms in space to represent a specific angle. The system tracks the child's movements to compute which angle is formed. Once the child reproduces the required angle correctly, the system proposes a new angle. The goal is to reconstruct all the angles contained in a complex shape (e.g., the house represented in Fig. 3). To help the child adjust their movements, the system provides two kinds of feedback: (i) visual feedback (two lines drawing the angle created by the arms) and (ii) auditory feedback (a different sound for each angle). A sound model maps each angle to a different sound: starting from a reference sound (associated with 0°), the pitch decreases as the angle size increases.

Figure 3. (A) A screenshot of the activity developed for exploring angles. A child must reconstruct the angles contained in a complex shape (the house). The visual feedback is shown: the white angle is the angle the child forms with her arms. The final composition of angles creates a house. (B) A child engaged in the activity.

From a technological point of view, the system uses the Kinect sensor to acquire the 3D coordinates of the child's head and hands. These data control, in real time, the production of the visual and auditory feedback; EyesWeb XMI and Unity are used for this purpose (an illustrative sketch of these computations follows below). Features of the activity (e.g., the auditory and visual feedback content) can be adapted to each child. For example, we developed a version for visually impaired children with higher-contrast visual feedback, allowing them to perform the same activity as sighted children. During the activity, we stored movement data for movement and affective analysis, assessing the child's behaviour, performance, and progression.
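As a concrete illustration of the computations described above, the following minimal Python sketch estimates the angle formed by the child's arms from the tracked 3D coordinates of head and hands, and maps that angle to a pitch that falls as the angle opens. Using the head as the angle vertex, the 880 Hz reference, and the two-octave span are illustrative assumptions; the actual system implements its mappings in EyesWeb XMI and Unity, not in this code.

```python
import numpy as np

def arm_angle_deg(head, left_hand, right_hand):
    """Angle (degrees) between the head-to-hand vectors.

    Hypothetical sketch: the real system tracks head and hands with a
    Kinect v2; taking the head as vertex is a simplifying assumption."""
    v1 = np.asarray(left_hand, float) - np.asarray(head, float)
    v2 = np.asarray(right_hand, float) - np.asarray(head, float)
    cos_a = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return float(np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0))))

def angle_to_pitch(angle_deg, f_ref=880.0, octaves=2.0):
    """Map an angle to a pitch that decreases as the angle opens.

    The text states only that pitch falls from a 0-degree reference as
    angle size grows; the reference frequency and the exponential
    two-octave span over 0-180 degrees are illustrative assumptions."""
    return f_ref * 2.0 ** (-octaves * angle_deg / 180.0)

# Example: a roughly right angle formed with the arms.
head, lh, rh = [0, 0, 0], [0.5, 0.0, 0.0], [0.0, 0.5, 0.0]
angle = arm_angle_deg(head, lh, rh)   # ~90 degrees
print(angle, angle_to_pitch(angle))   # ~90.0, ~440 Hz
```

In a real-time loop, the computed pitch would drive a continuous tone while the two head-to-hand segments are rendered as the visual angle.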
Twenty-four children aged 7 (n = 12) and 9 (n = 12) years used the application. We divided the children into two groups: one trained with the RobotAngle activity (n = 12) and the other with another audio-motor activity, unrelated to angles but focused on fractions (BodyFraction; n = 12). Both groups underwent pre- and post-evaluation tests before and after five activity sessions, with the same duration and structure of activity for both groups. The experiment is extensively described in Gori et al. (2021); here we briefly describe the paradigm and methods and partially present the results. Children began the pre-test in the first session and finished it in the second. The training took place in the second part of the second session, in the third session, and in the first part of the fourth session. Children completed the post-tests in the second part of the fourth session and in the fifth session (see Gori et al., 2021, for more details). The pre- and post-evaluation sessions comprised three tests investigating proportional reasoning, numerosity, and general geometrical knowledge: a measure of number estimation, a measure of proportional reasoning, and a measure of visuo-spatial abilities.

In the number-estimation test, participants localised the position of a specific number (e.g., 17) on a bold horizontal number line running from 0 to 100 (or from −100 to 0) in each trial (a hypothetical scoring sketch appears below). For the proportional-reasoning measure, we used the previously developed Proportional Reasoning Task (Boyer et al., 2008), in which participants select a proportion matching a target juice mixture, with proportionality determined by the relative quantities of juice and water parts. For the measure of visuo-spatial abilities, we used a validated Italian battery of tests specifically developed to link visuo-spatial abilities with geometrical knowledge (Mammarella et al., 2012).

The results suggest that children of different ages improved in different tasks after different training. The only test in which children improved after training with RobotAngle was the visuo-spatial abilities test, and the improvement was specific to the 9-year-olds, not the 7-year-olds, suggesting that the effect of the training is age-specific and perhaps tied to developmental knowledge. Conversely, the 7-year-old children improved their performance in the number-line task after both training activities (i.e., RobotAngle and BodyFraction). A possible explanation is that younger children benefit more from associating numbers, arm aperture, and sound, an association provided by both activities: in RobotAngle, increasing quantities were represented as angle apertures corresponding to sounds and arm apertures; in BodyFraction, the correspondence was between increasing or decreasing numbers and arm/leg aperture and rhythm. Thus, 7-year-old children, who are less familiar with the concept of numbers along a continuum, might benefit more from this training. In contrast, the RobotAngle activity is more directly related to geometry and may therefore be useful at 9 years of age but too complex for younger children to internalise well. This suggests a relationship between the school level and learning stage of the selected groups of children: training effects may differ with age, and task–modality combinations need to be age-appropriate for sensorimotor training to be useful. We can speculate that the benefit is maximal in the window between the need to improve understanding of a concept and the presence of the cognitive basis to learn it. Seven-year-olds improved more in the number-line task and 9-year-olds in the geometrical test because these tests present complex, relatively unfamiliar concepts for their respective ages; at the same time, children must already possess the foundation necessary to learn the concepts through the training. Indeed, the geometrical concepts were too complex for the 7-year-olds to learn, and even with the aid of the training they did not reach this ability, whereas the same was not true for the 9-year-olds.

The findings suggest that informative sounds associated with body movements can be a powerful tool for improving some perceptual concepts related to mathematics learning (see Gori et al., 2021, for more details).
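As a purely illustrative aside, Gori et al. (2021) should be consulted for the exact scoring of these tests; a metric commonly used for number-line estimation tasks is percent absolute error (PAE). The sketch below assumes such a score fits the 0–100 line described above, which is an assumption rather than a description of the authors' analysis.

```python
def percent_absolute_error(estimate, target, line_min=0.0, line_max=100.0):
    """Percent absolute error for a single number-line trial:
    |estimate - target| / line range * 100. A standard metric in the
    number-line estimation literature; whether it was used in this
    study is an assumption, not stated in the text."""
    return abs(estimate - target) / (line_max - line_min) * 100.0

# Example: the child marks 25 when asked to place 17 on a 0-100 line.
print(percent_absolute_error(25, 17))  # 8.0 (percent of the line length)
```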
Admittedly, a limitation of this work is that we were unable to compare the benefit provided by the technology with standard teaching methods using only visual aids (e.g., a classroom lesson with angles drawn on a blackboard). In future work, it will be essential to include a control group that learns angles with the standard teaching approach. Future studies, possibly longitudinal, must also evaluate whether these enhancements and the sensorimotor association are maintained after the training ends, and how well the learning generalises to other sensory information. Moreover, future experiments should clarify how much of the improvement results from the sensorimotor experience compared with the visual and/or auditory experience alone.

To support wider usage of the application among students, we developed a new, optimised version of the game for distribution in schools. It runs on an interactive whiteboard (IWB), already in place in most classrooms, without requiring a Kinect and a PC. The activity includes a first part of multisensory exploration of angles and a second part of training on angle understanding using body movements. It is freely available online (see https://s3a.deascuola.it/wedraw/WEDRAW.html; this link is permanent, and the activity is in Italian).

6. Conclusion

Drawing on recent advances in the literature, we argue for the beneficial use of different sensorimotor signals, such as audition, haptics, vision, and movement, to teach primary-school children mathematical concepts in novel ways. This review has presented a new approach that links pedagogy with neuroscience and engineering to develop new technology for learning mathematical concepts in primary (i.e., elementary) schools. We discussed the limitations of existing technological solutions and proposed key points that stand to improve learning activities. As a result of this interdisciplinary approach, we presented a new platform based on audiovisual signals and body movement to teach the geometrical concept of angles to children. In future work, it will be crucial to identify a quantitative method to better measure the improvement engendered by standard procedures and by new technological training. To conclude, we propose that multisensory technology offers essential inputs for developing new applications and technology for learning, and that a multidisciplinary approach will allow the field to reach this critical goal.
