ACM DL Proceedings

IMX '25: Proceedings of the 2025 ACM International Conference on Interactive Media Experiences

SESSION: Applications and Use Cases

Metaverse as Agile Team Workplace: An Evaluation on the Perspective of Agile Software Developers

  • Fabricio Malta de Oliveira
  • Luciana Zaina

In recent years, the work of software development teams has shifted to remote mode. Although software teams have used digital tools for virtual face-to-face communication (e.g., Google Meet), these tools have usually not been adequate to promote the engagement and collaboration of software team members. The Metaverse, an immersive virtual environment, has emerged as a promising platform to replicate physical presence and address these challenges. In this study, we proposed design requirements to support the development of an immersive environment tailored for agile teams, specifically focusing on sprint planning ceremonies. Based on these requirements, we developed a prototype to support sprint planning ceremonies. To evaluate the environment, we conducted three rounds of sprint planning sessions with twelve agile software developers. Our findings revealed that participants experienced a high sense of immersion and that the environment's features enhanced their engagement. These results suggest that the Metaverse can be seen as a potential workplace for agile team collaboration.

Who Killed Helene Pumpulivaara?: Exploring Narrative Design and Extended Reality in Industrial Heritage Interpretation

  • Janset Shawash
  • Mattia Thibault
  • Juho Hamari

This paper presents a Research Through Design investigation exploring innovative approaches to industrial heritage interpretation through extended reality (XR) technologies and narrative design. Using the historic Finlayson Factory in Tampere, Finland, as a case study, we developed "Who Killed Helene Pumpulivaara?", an interactive heritage experience that combines a crime mystery narrative with XR technology to address key challenges in digital heritage interpretation, particularly decontextualization and accessibility. Our methodology used emerging AI tools and accessible technologies to create a multi-platform experience implemented across web and VR environments. The research revealed several significant findings: first, the varying fidelity levels between AI-generated and authentic content created an unexpected advantage in distinguishing historical fact from creative interpretation; second, combining accessible technologies with AI-assisted content generation can democratize the creation of engaging digital heritage experiences; and third, spatial relationships and narrative progression can effectively counter decontextualization concerns in digital heritage. The implementation points toward spatial augmented reality as a promising direction for future heritage interpretation, suggesting possibilities for deeper integration with physical spaces through interactive projections. This research contributes to ongoing discussions about democratizing heritage interpretation and demonstrates how thoughtful integration of emerging technologies can enhance rather than diminish authentic heritage experiences, while making industrial history more engaging for contemporary audiences.

Transforming Design Reviews with XR: A No-Code Media Experience Creation Strategy for Manufacturing Design

  • Sahir Sharma
  • Conor Keighrey
  • Shane Gilligan
  • James Lardner
  • Niall Murray

Extended Reality (XR) has proven effective in reducing cognitive load, enhancing spatial perception, and improving decision-making within mechanical engineering environments. However, hardware and software limitations have slowed its widespread adoption in industrial manufacturing. In particular, there is a notable gap in enabling non-programming stakeholders to create immersive, process-oriented experiences from CAD models for use in design meetings. Furthermore, the cost of traditional XR development workflows has proven prohibitive for industry-wide implementation. This paper details the design, implementation, and evaluation of a no-code workflow that enables the creation of Virtual Reality (VR) experiences from CAD models for early-stage design reviews. Although developed to meet the rigorous requirements of equipment design engineering, the underlying philosophy is adaptable to other manufacturing domains working with CAD data. By integrating an enterprise-grade CAD-to-mesh translation tool with a freely available, cross-platform XR Software Development Kit, we have developed a reusable VR software container that allows CAD models to be imported without any programming expertise. Documented no-code instructions and pre-programmed XR components simplify the creation of VR-based Equipment Design Review (VR-EDR) experiences. Our research demonstrates that transforming industrial equipment design reviews using XR is feasible and efficient when supported by a container-based software solution and context-specific no-code guidelines, as validated through continuous qualitative assessments.

Immersive Learning at Scale: Exploring the Feasibility of VR in Education

  • Negin Soltani
  • Asreen Rostami

Immersive technologies are increasingly seen as transformative tools in education, offering interactive and engaging learning experiences. While much research has focused on VR’s role in content delivery, less attention has been given to its large-scale integration and systemic challenges. This study examines how VR is currently used in education, its future potential, and barriers to adoption. Through expert interviews with education developers in [anonymous country], we explore real-world challenges, including infrastructure, content availability, and usability. Additionally, we highlight VR’s potential to re-engage students who have left the school system. Our findings reveal both the promise and complexity of VR in education, emphasizing the need for systemic readiness and pedagogical alignment.

SESSION: Media, Content and Artistic Experiences

Understanding Creative Potential and Use Cases of AI-Generated Environments for Virtual Film Productions: Insights from Industry Professionals

  • Pauline Leininger
  • Christoph Johannes Weber
  • Sylvia Rothe

Virtual production (VP) is transforming filmmaking by integrating real-time digital elements with live-action footage, offering new creative possibilities and streamlined workflows. While industry experts recognize AI’s potential to revolutionize VP, its practical applications and value across different production phases and user groups remain underexplored. Building on initial research into generative and data-driven approaches, this paper presents the first systematic pilot study evaluating three types of AI-generated 3D environments—Depth Mesh, 360° Panoramic Meshes, and Gaussian Splatting—through the participation of 15 filmmaking professionals from diverse roles. Unlike commonly used 2D AI-generated visuals, our approach introduces navigable 3D environments that offer greater control and flexibility, aligning more closely with established VP workflows. Through expert interviews and literature research, we developed evaluation criteria to assess their usefulness beyond concept development, extending to previsualization, scene exploration, and interdisciplinary collaboration. Our findings indicate that different environments cater to distinct production needs, from early ideation to detailed visualization. Gaussian Splatting proved effective for high-fidelity previsualization, while 360° Panoramic Meshes excelled in rapid concept ideation. Despite their promise, challenges such as limited interactivity and customization highlight areas for improvement. Our prototype, EnVisualAIzer, built in Unreal Engine 5, provides an accessible platform for diverse filmmakers to engage with AI-generated environments, fostering a more inclusive production process. By lowering technical barriers, these environments have the potential to make advanced VP tools more widely available. This study offers valuable insights into the evolving role of AI in VP and sets the stage for future research and development.

Spatial Media Controller: Exploring the Potential of Augmented Reality, Virtual Reality, Volumetric Capture and 360 Videos for News Broadcasting

  • Charles Bailly
  • Lucas Pometti
  • Joseph Fadel
  • Juliette Vauchez
  • Julien Castet

Traditional news broadcasting is facing declining viewer interest. To address this issue, new approaches are being explored, including immersive storytelling and real-time data integration, to make news broadcasts more engaging and responsive to audience preferences. This paper investigates the expectations, visions, and barriers that news production teams associate with Augmented Reality (AR), Virtual Reality, volumetric capture, and 360° video technologies. We first present the results of two workshops we conducted to collect this feedback from experienced professionals. We then detail the design of Spatial Media Controller, an immersive application we implemented based on the scenario created with the professionals. Finally, we propose and compare three Mixed Reality interaction techniques for performing travelling shots with virtual cameras during a news broadcast simulated through a simplified version of Spatial Media Controller.

Tapest[o]ry: Promoting Marine Noise Pollution Awareness through an Interactive Tangible Tapestry

  • Laura Santos
  • Sandra M Câmara Olim
  • Pedro Campos
  • Mara Dionisio

Our ancestors communicated stories through tapestries, using them to adorn public and private spaces. Traditionally, these tapestries were static artworks hanging on walls without audience interaction. Nevertheless, textiles offer a versatile medium, crafted from diverse materials and colours and manipulated to produce unique, touch-responsive textures. Our work integrates the tactile capabilities of weaving with capacitive sensor technology to create an interactive art installation that fosters non-formal learning and raises awareness about marine noise pollution and its harmful effects on marine ecosystems. Marine animals, particularly cetaceans, rely heavily on sound for communication, navigation, and hunting. Noise pollution disrupts these essential functions, presenting severe threats to their survival. By harnessing the storytelling potential of tapestries and the power of capacitive sensors connected to a microcontroller, we developed a unique and immersive art installation that educates the public and decision-makers about these pressing issues. This study evaluates the impact of this interactive tapestry on its audience. Qualitative and quantitative results suggest valuable insights, showcasing that Tapest[o]ry engages participants through its tangible storytelling interface, fostering emotional responses such as empathy and compassion. Participants shared their thoughts about awareness of marine noise pollution, demonstrating the possible use of interactive art installations as storytelling tools. Our findings highlight the potential of integrating traditional textile techniques with interactive technology and storytelling to create powerful educational experiences outside formal learning environments, encouraging personal reflection and support for environmental issues.

Drawing-in-Steps: Supporting Creative Goals through User Engagement via Hierarchical Image Generation

  • Christoph Johannes Weber
  • Jenny Huang
  • Sylvia Rothe

Artificial intelligence (AI) is rapidly transforming creative arts, offering new possibilities while raising challenges related to creative control and authorship. Artistic creation is a deeply personal process that reflects an individual’s unique vision and style. However, existing AI-driven text-to-image tools often provide limited control, making it difficult for users to express their intentions fully. These tools can either fall short of expectations or produce results that feel detached from the creator’s vision, ultimately diminishing their sense of ownership and pride in the final outcome. This paper introduces Drawing-in-Steps, an application that supports a hierarchical drawing process, where stages such as ideation, detailing, colorization, refinement, and finishing build on each other to progressively shape the final artwork. This structured approach fosters collaboration between the user and AI, ensuring that artistic intent is preserved while taking advantage of AI’s generative capabilities. An explorative user study with 30 participants investigated whether a step-by-step workflow could enhance feelings of artistic ownership, collaboration, and pride compared to traditional text-to-image methods. Our findings suggest that stepwise guidance, with varying levels of user interaction, may contribute to a stronger sense of ownership and user involvement in the creative process. Participants tended to report feeling more connected to their work, experiencing greater satisfaction with their artistic contributions, and achieving better alignment with their original vision. Notably, stages that involved higher user input appeared to enhance the sense of ownership, while more automated stages required careful balancing to maintain user involvement. These results indicate that integrating AI as a collaborative partner rather than a dominant force could empower users by striking a balance between automation and creative control. Our study offers valuable insights into how AI might be meaningfully incorporated into artistic workflows to support user experience, enhance creative expression, and strengthen feelings of artistic ownership.

SESSION: Inclusion and Social Impact

A Study on Immersive Behavioral Therapy for Individuals with Intellectual Disabilities with Fear of Stairs

  • Marta Goyena
  • Carlos Cortés
  • Marta Orduna
  • Matteo Dal Magro
  • Ainhoa Fernández-Alcaide
  • María Nava-Ruiz
  • Jesús Gutiérrez
  • Pablo Perez
  • Narciso García

The increasing deployment of immersive technologies is opening up opportunities in the field of behavioral therapies. In particular, eXtended Reality technologies allow the development of therapies that expose users to a remote location without the need to travel there, which is especially useful in phobia treatment. Specifically, in this paper we present a complete pilot study of an immersive therapy, based on the systematic desensitization technique, for treating the fear of stairs in individuals with intellectual disabilities. The study involved: 1) the selection, classification, and creation of the immersive stimuli; 2) the development of an immersive tool for stimuli visualization and a tool for therapists to monitor and control the sessions; 3) the design of a methodology for assessing the user experience, including biosensors and adapted questionnaires, scales, and scoring to evaluate factors such as discomfort, fear level, and technical aspects of the system; 4) the therapy study itself, carried out with ten users over six months; and 5) the analysis of the results, showing the benefits of the immersive therapy. The therapists were involved in all of these steps, guaranteeing the ecological validity of the study. The results show that the immersive therapy helps users overcome their fear of stairs. In addition, the success and holistic approach of this study can encourage and guide future studies of immersive therapies.

Co-creating with Older Adults is a Challenging Task! Lessons Learned from Real-Life Experiences and Challenges at an Ageing Lab

  • Francisco Regalado
  • Carlos Santos
  • Ana Isabel Veloso

As the world's population ages, the use of digital technologies and games grows. However, the needs and preferences of this demographic group are not always considered. With this research study, we therefore present a set of insights and recommendations to follow when conducting co-creation sessions for developing digital games with older adults. From 27 sessions with 27 individual participants, we defined a set of eight recommendations: (1) build trust and rapport; (2) promote social gatherings; (3) tailor the sessions to individual preferences; (4) consider diverse motivations for engaging with digital games; (5) establish an iterative feedback mechanism; (6) account for the multiple impacts of age-related impairments; (7) favor the use of touchscreens; and (8) build a routine. We hope these outcomes will contribute to ageing-inclusive design not only for today's older adults but also for generations to come.

Livras: An App to Help Women at Risk

  • Luciana Rocha Palhanos
  • José Viterbo
  • Aura Conci

Gender-based violence affects millions of women worldwide, transcending cultural, economic, and social boundaries. According to a UN survey, home is the most dangerous place for women. In many cases, even when local telephone helplines are available, at-risk women cannot make a call simply because the aggressor is close to them. In the context of accessible communication for the deaf, there are tools for teaching American Sign Language (ASL), which is also used in countries other than the United States. The Signal for Help, created by the Canadian Women’s Foundation, allows women to silently send an SOS. The idea presented in this work combines a decoy app for teaching sign language with a way to use this signal without being noticed by a nearby aggressor: a silent and effective means for at-risk women to request help. After the first version was implemented, users tested it and suggested significant improvements, leading to a second version. This version was also sent to users for experimentation, and an update to the current version is presented here. The application thus offers functionalities suited to daily use, such as voice-activated access, login with facial recognition, an interface with contrasting colors, and the possibility of setting the language of the screen texts and menu words, among others. The geographical location of the woman sending the signal helps locate her, allows quick contact with the protection service in her region, and helps regional public security authorities identify higher-risk areas via a geographic tracking map.

SESSION: Immersive Experiences and Embodiment

Transitions between Realities: A Systematic Review on the Usage of XR Systems for Bridging Reality and Virtuality

  • Tippayaporn Pavavimol
  • Aleksandr Ometov
  • Mikko Valkama
  • Mattia Thibault

Transitions between "realities" play an important role in designing XR experiences, as they significantly influence user experience. However, integrating these transitions into XR applications poses a significant challenge on multiple levels, including both technical and design aspects. Although this challenge has been tackled over the past decade, existing efforts seem to be fragmented, often examining different issues in isolation. This paper aims to address this issue by conducting a systematic literature review of research related to transitions between realities using XR technology. The study seeks to provide an overview of these transitions, enhancing our understanding of the relevant elements and structures involved in the process. The review covers 38 papers from the Scopus and ACM DL databases. To characterize the current state of research, we classify the literature into three main themes: research topics, application domains, and transition entities. Our findings reveal three research gaps in this area: 1) limited exploration of XR transitions across diverse domains, 2) contradictions in the transition metaphor, and 3) a lack of comprehensive understanding of XR transitions across multiple scales. In conclusion, we outline future research opportunities aimed at advancing knowledge in the field.

The Impact of Segmentation Methods for Avatar Representation on User Experience in a Task-Based XR Experience

  • Carlos Cortés
  • David Barbero García
  • Jesus Gutiérrez
  • Narciso García

As eXtended Reality (XR) continues to expand in applications ranging from gaming to education and training, enhancing user Quality of Experience (QoE) remains a critical focus. A key factor in XR environments is how users perceive their own bodies during interactive tasks, such as object manipulation and navigation. The accuracy and fidelity of these self-representations can significantly affect immersion, realism, and overall user engagement. This paper explores the influence of egocentric segmentation techniques on user experience within XR. Using the Quest 3 headset, we developed immersive experiences in which users interact with virtual elements through representations of their hands, arms, and upper body. Participants performed two interactive tasks, one static and one translational, under three hand visualization conditions that differed in level of detail and fidelity. Subjective assessments measured global and visual QoE, sense of presence, and perceived performance across these conditions. All software developed is publicly available, ensuring transparency and reproducibility. While no significant differences were observed in overall subjective measurements across the representation variations, users reported heightened immersion in all conditions, showing that these methods can blend physical and virtual realities without degrading the experience. However, the results indicate that combining a purely virtual environment with a photorealistic body representation can affect the consistency of the experience. In particular, the involvement factor for the translational task suggests degradation when only the hand is shown rather than the whole arm and occlusions are not well managed, so that photorealism-based manual interactions do not meet expectations. Furthermore, visual results suggest a perceived difference between virtual and photorealistic hands, possibly due to their contrast with a non-realistic environment. Future research should focus on improving the fidelity of virtual environments so that the body representation matches the representation of the environment.

Collective Embodiment, or the Social Nature of the Sense of Embodiment in Social VR

  • Eugene Kukshinov
  • Lennart E. Nacke

In Social Virtual Reality (SVR), mediated social communication blends with simulated virtual environments and bodies. However, little is known about how these social and physical (or embodied) affordances intersect in user experiences. We bridge this research gap by applying phenomenological analysis to SVR user interviews to reveal embodiment in SVR based on their lived experiences. We contribute empirical evidence to the concept of “collective embodiment,” described as a mutually maintained feeling of embodiment that SVR provides beyond individual experiences and avatar-related sensations. This involves intertwined senses of agency, location, and appearance influenced by the presence and actions of others in the virtual environment. We also observed SVR users’ difficulties controlling avatar visual representations and communication or social functions. This research uncovers the multifaceted nature of collective embodiment in SVR, offering insights into its social dynamics, design, and user experience implications.

SESSION: Analysis of Audiences and their Interactions

VIBES: Exploring Viewer Spatial Interactions as Direct Input for Livestreamed Content

  • Michael Yin
  • Robert Xiao

Livestreaming has rapidly become a popular online pastime, with real-time interaction between streamer and viewer being a key motivating feature. However, viewers have traditionally had limited opportunity to directly influence the streamed content; even when such interactions are possible, they have relied on text-based chat. We investigate the potential of spatial interaction on the livestreamed video content as a form of direct, real-time input for livestreamed applications. We developed VIBES, a flexible digital system that registers viewers’ mouse interactions on the streamed video, i.e., clicks or movements, and transmits them directly into the streamed application. We used VIBES as a technology probe, first designing possible demonstrative interactions and using them to explore streamers’ perception of viewer influence and possible challenges and opportunities. We then deployed applications built using VIBES in two livestreams to explore its effects on audience engagement and investigate viewers' relationships with the stream, the streamer, and fellow audience members. The use of spatial interactions enhances engagement and participation and opens up new avenues for both streamer-viewer and viewer-viewer participation. We contextualize our findings around a broader understanding of motivations and engagement in livestreaming, and we propose design guidelines and extensions for future research.

Understanding Fans' Attitudes Toward AI-generated Fan Content About Their Favorite Musician

  • Donghee Yvette Wohn
  • Tongxin Li
  • Sharon Oh
  • Vineeth Kanpa
  • Ian Cho

In online fan communities, generative AI is providing new ways for fans to engage with and produce fan content related to their favorite celebrities. In this study, we examined how people feel about AI-generated fan art and AI-generated music covers of their favorite musical artists. Through an online survey (N=200), we explored how parasocial relationships with their favorite musicians and the self-determination factors of autonomy, competence, and relatedness were associated with attitudes towards AI-generated fan content. Parasocial relationships were positively associated with favorable attitudes, but when the self-determination factors were taken into consideration, participants’ sense of choice in controlling what they see on social media was the only factor that explained favorable sentiments towards AI-generated fan content.

Spoiler Alert! Understanding and Designing for Spoilers in Social Media

  • Ville Mäkelä
  • Kartik Chinda
  • Irtika Khan

Spoilers are segments of information that unexpectedly reveal crucial details about storylines. Social media is rich with discussions and content about TV shows, movies, and similar entertainment, often exposing users to spoilers. However, we lack an understanding of how users encounter and experience spoilers in social media, how they strategize to avoid them, and what kinds of tools could help control spoilers. We conducted an online survey (n = 400) to understand users’ experiences and strategies, and explored designs for spoiler control in a user study (n = 18). Our results show that spoilers are often encountered while scrolling through social media feeds, that they evoke strong negative emotions and decrease people’s interest in and experience of the spoiled title, and that fear of spoilers makes people avoid social media. Users require better tools in social media platforms for controlling spoilers, and we present UI strategies for blocking and presenting potential spoiler content.

Entertainers Between Real and Virtual - Investigating Viewer Interaction, Engagement, and Relationships with Avatarized Virtual Livestreamers

  • Michael Yin
  • Chenxinran Shen
  • Robert Xiao

Virtual YouTubers (VTubers) are avatar-based livestreamers that are voiced and played by human actors. VTubers have been popular in East Asia for years and have more recently seen widespread international growth. Despite their emergent popularity, research has been scarce into the interactions and relationships that exist between avatarized VTubers and their viewers, particularly in contrast to non-avatarized streamers. To address this gap, we performed in-depth interviews with self-reported VTuber viewers (n=21). Our findings first reveal that the avatarized nature of VTubers fosters new forms of theatrical engagement, as factors of the virtual blend with the real to create a mixture of fantasy and realism in possible livestream interactions. Avatarization furthermore results in a unique audience perception regarding the identity of VTubers — an identity which comprises a dynamic, distinct mix of the real human (the voice actor/actress) and the virtual character. Our findings suggest that each of these dual identities both individually and symbiotically affect viewer interactions and relationships with VTubers. Whereas the performer’s identity mediates social factors such as intimacy, relatability, and authenticity, the virtual character’s identity offers feelings of escapism, novelty in interactions, and a sense of continuity beyond the livestream. We situate our findings within existing livestreaming literature to highlight how avatarization drives unique, character-based interactions as well as reshapes the motivations and relationships that viewers form with livestreamers. Finally, we provide suggestions and recommendations for areas of future exploration to address the challenges involved in present livestreamed avatarized entertainment.

SESSION: Enhancing and Evaluating Interactive Experiences

An AI-driven Music Visualization System for Generating Meaningful Audio-Responsive Visuals in Real-Time

  • Jenny Huang
  • Christoph Johannes Weber
  • Sylvia Rothe

Music visualizations are visual representations or interpretations of music that often dynamically respond to audio. They have the potential to enhance the immersive and engaging qualities of music, and can improve accessibility to music experiences for individuals with hearing loss. However, common music visualizers, such as those integrated into media players, often rely on simplistic algorithms that only react to instantaneous changes in the audio signal. Consequently, these systems tend to produce repetitive visual patterns that fail to encapsulate more complex musical attributes that evolve over time, such as mood and emotion. On the other hand, music visualization concepts from scholars primarily focus on novel data visualization techniques to represent musical information and overlook aesthetic appeal. To address these limitations, we developed an advanced AI-driven system that integrates Music Information Retrieval, Large Language Models, and Image Generation Models to produce meaningful, audio-reactive visualizations from real-time audio input. A study involving hearing participants as well as deaf and hard-of-hearing (DHH) participants was conducted to evaluate the system’s effectiveness in conveying musical emotion and enhancing the music experience. The valence-arousal ratings of visual, audio, and audiovisual stimuli highlighted the subjective nature of music perception, with participants giving varied responses to the same stimuli. Nevertheless, a moderate correlation was observed between the emotional responses to the music and those evoked by the visualizations, with arousal exhibiting a stronger correlation than valence. Furthermore, the combined experience of music and visuals led to greater consensus in participants’ ratings. Survey responses indicated a strong potential for music visualization systems to enhance music experiences. However, opinions among DHH participants were more diverse and generally more moderate than those of hearing participants, emphasizing the need for customization options to better accommodate individual preferences in future developments.

Winter is Coming… To Your Commute: Adapting Passenger ARTV Ornamentations to Match On-screen Visuals For Enhanced Immersion in Fantasy Drama TV Content

  • Iain Christie
  • Graham Wilson
  • Mark McGill
  • Stephen Anthony Brewster

Augmented Reality Television (ARTV) affords new ways to experience media – from displaying auxiliary information to increasing immersion with additional 3D models. Many people consume video content when travelling or commuting; accordingly, the benefits of ARTV in this context merit investigation. We conducted a study (N=20) exploring the effects of using ARTV Ornamentations when watching Game of Thrones on a train. We used Virtual Reality to simulate use of AR in a train carriage and evaluated four different levels of ARTV Ornamentation, from a simple floating panel through to themed models and interior textures that change to match the on-screen visuals. Results showed that Ornamentations matching the show’s genre were sufficient to provide an engaging and immersive experience, but Ornamentations themed to the show which changed based on the scene further elevated the experience. This means it is possible for passengers to have an immersive experience while commuting without significantly reducing their awareness of their environment. We also reflect on the implications of our work for an existing ARTV framework and propose the inclusion of several new aspects to encourage designers to consider the unique challenges and affordances of ARTV Ornamentations.

WristFlick: Design and Evaluation of a Smartwatch-Based System for Interacting with Smart Televisions

  • Yuan Ren
  • Ahmed Sabbir Arif

WristFlick is a smartwatch-based system designed to interact with smart televisions. It allows users to navigate channels, control media, and access additional content, such as cast and soundtrack details, using taps and flicks on the smartwatch display. It also supports text input to search for specific channels or titles by writing directly on the smartwatch screen. Its design was refined through multiple pilot studies and evaluated in a three-session user study. The results revealed that WristFlick is significantly faster, requires fewer actions, and leads to fewer errors compared to a traditional remote control. Furthermore, participants preferred WristFlick over the remote control and experienced a greater sense of flow during usage. In search tasks, WristFlick achieved comparable speeds with significantly fewer actions. Participants also demonstrated improved performance over time, with faster input speeds in later blocks. These findings suggest that WristFlick is an effective alternative to traditional remote controls for operating and controlling media on smart televisions.

Is 'quick and dirty' good enough? An analysis of the usability evaluation practices for learning environment design

  • Sunny Prakash Prajapati
  • Syaamantak Das

Measuring usability is crucial for understanding the effectiveness of learning systems, with the System Usability Scale (SUS) being widely employed as a standard tool for usability evaluation. However, phygital learning systems (PLS), characterized by their use of multiple devices spanning physical and digital media, introduce complexities in learning workflows that challenge traditional usability assessment methods. This raises the question: Is SUS an appropriate scale for evaluating such systems? To address this question, we analyzed studies that report on the measurement of usability for various learning systems, focusing on their methodologies and interpretations of SUS results. From our findings on the practices of using SUS for technologically enhanced learning environments, we highlight the strengths and limitations of SUS in capturing the nuances of complex, distributed workflows inherent in PLS.

SESSION: Demos of the ACM IMX

Astroroll - The Rolling Space Hero!: A VR Game Using Wheelchairs

  • Felipe Barreto Vimieiro Barbosa
  • Pedro Henrique Mendes Pereira
  • Esteban Walter Gonzalez Clua
  • Daniela Gorski Trevisan
  • Debora Christina Muchaluat-Saade
  • Gabriel Daher Monteiro Bastos Nascimento
  • Nathan Pinheiro Baptista

This article presents Astroroll, a Virtual Reality (VR) game in which the player assumes the role of the first wheelchair-using astronaut at the Lunar Station. The game was designed to be played using only eye gaze for interaction and a wheelchair for movement, through a technique called Redirect Walk (RW). RW modifies movement in VR relative to the real world, so that the path taken in the virtual world differs from the path taken in the physical world. Applied correctly, this technique makes it possible to enlarge the virtual space while keeping gameplay within a small area in the real world. The game was developed to be exhibited at Casa da Descoberta, the public science museum of the Fluminense Federal University (UFF) in the city of Niterói, in a space that demands a compact play area to accommodate all visitors.

Autonomous Last Mile: Speculative Futuring in Virtual Reality

  • Kasper Karlgren
  • Negin Soltani
  • Asreen Rostami
  • Donald McMillan

Delta Fleet AI is a speculative VR demo exploring the hidden human labor behind autonomous mobility systems. Set in a future where autonomous vehicles rely on remote human “RoadRunners” to manage edge cases, users engage in high-stakes scenarios including route selection, manual driving, and obstacle disambiguation. Framed as a corporate onboarding simulation, the experience critiques the myth of full autonomy and exposes the cognitive and emotional demands placed on undervalued human operators. Built for Meta Quest 3, the project showcases VR’s potential for provocative speculative design, inviting reflection on automation, labor, and the ethics of human-autonomy teaming.

Demonstrating the Screenless Optical Theremin with Tremolo (ScOTT)

  • Michael Gancz
  • Justin Berry
  • Shu Wei
  • Kimberly Hieftje
  • Asher Marks

The potential of audio-only interfaces remains under-researched, in part because visual elements in traditional immersive experiences can overshadow the impact of audio. We demonstrate how a gestural input system facilitates expressivity and social interaction in screenless immersive settings. We showcase ScOTT, a novel screenless XR musical instrument, in the context of cooperative composition between in-headset and out-of-headset musicians. This demonstration highlights the potential for general-purpose screenless gestural interfaces, opening avenues for further exploration in accessible and immersive interaction design.

Falconry Heritage in Mixed Reality: An Interactive Experience for Digital-Native Tourists

  • Yahia Boray
  • Rain Alkai
  • Noora Fetais

This demonstration presents an interactive mixed reality (MR) experience utilizing HoloLens 2 to authentically simulate traditional falconry hunting, addressing the need for immersive representations of intangible cultural heritage. The project investigates how MR technologies can serve as an effective medium for preserving and conveying such practices by enabling participants to engage interactively in falconry hunting scenarios. The experience uniquely bridges historical practices with contemporary digital engagement. Designed specifically for exhibitions and cultural events, this MR experience fosters historical awareness, ecological stewardship, and inter-generational dialogue. User tests at public exhibitions demonstrated high engagement and educational value, prompting ongoing development to extend the experience to iOS platforms for broader accessibility and interaction fidelity.

GeoModel: Technology and Innovation for a Diverse and Interactive Geometric Education

  • Fernando Farias Dimas
  • Vitor Hugo Barbosa Melo
  • Leonardo da Conceição Estevam
  • Flávio Trindade Moura
  • Renato Araujo Lima
  • Walter Junior
  • Diego Lisboa Cardoso
  • Marcos César da Rocha Seruffo

Mathematics education in Brazil faces persistent challenges, particularly in the effective teaching and learning of geometry. Traditional methods often lack the interactivity and engagement needed to capture students’ interest, and many schools struggle with limited access to quality teaching resources. In response to these issues, the integration of emerging technologies such as Augmented Reality (AR) and Artificial Intelligence (AI) offers promising new pathways for educational innovation.

Hands-On Orchestra: Hand-based Interactive Manipulation of Spatial 3D Audio in Mixed Reality

  • Luis Quintero
  • Elias Bennaceur
  • Lucas Ahlnäs
  • Michael Bjorn

Using spatial audio in digital media reduces the user’s cognitive load and improves their sense of embodiment and perceived immersion. However, there is a lack of systems that enable interactive manipulation of spatial audio through hand-tracking in see-through head-mounted displays (HMDs). We contribute a novel single-user experience to explore real-time music-making through an HMD, proposing a solution that maps audio characteristics to hand movements and gestures. This work aims to support studies on audio quality perception and spatial audio design for mixed-reality (MR) applications, focusing mainly on upcoming scenarios where HMDs with dynamic spatial mapping must adapt visual and audio feedback as users move through physical spaces with varying shapes and materials.

Immersive Cognitive Training for Job Integration with People with Intellectual Disabilities

  • Marta Goyena
  • Matteo Dal Magro
  • Martina Merolli
  • David Barbero García
  • Carlos Cortés
  • Marta Orduna
  • Ainhoa Fernández-Alcaide
  • María Nava-Ruiz
  • Jesus Gutiérrez
  • Pablo Perez
  • Narciso García

The combination of eXtended Reality (XR) technologies with physiological signals offers innovative approaches to cognitive training for individuals with intellectual disabilities. This study presents a tool designed for cognitive training aimed at job integration through interactive and immersive environments. The tool was developed in collaboration with therapists from the Juan XXIII Foundation, an occupational center, and focuses on enhancing cognitive skills through serious games for job training. These games simulate realistic environments, such as a cafeteria and a supermarket, where users must complete various tasks. These tasks are designed to progressively increase in complexity as users advance, allowing them to improve their skills. Therapists can monitor and control the training sessions in real-time through an application, while users engage with the immersive scenarios. Additionally, the system incorporates non-invasive biosensors to track physiological data, such as heart rate and eye movement, providing valuable insights into the emotional and cognitive states of users during the sessions, enabling a more comprehensive assessment of progress and well-being. Overall, the system offers an innovative solution that highlights the potential of XR technologies in cognitive training and job placement for people with intellectual disabilities. A demo video is available at https://www.youtube.com/watch?v=3fdXp8vJkhQ

Learn to Hunt like a San: A Virtual Reality Experience Blending Traditional Practices with Formal Education

  • Selma Auala
  • Heike Winschiers-Theophilus

Concerned with incorporating indigenous knowledge and the learning of traditional practices in formal education, we have created a “Learn to Hunt like a San” VR application in collaboration with primary school students in Namibia. The application is based on a previously developed and culturally validated VR hunting experience with a San partner community. We have extended the application with two features facilitating intentional learning, namely a poster board and a non-player character guiding players throughout the hunting experience.

NetVerse Edu: Exploring Collaborative Learning in Computer Networks through Virtual Reality

  • Erberson Evangelista Vieira
  • Francisco Petrônio Alencar de Medeiros

Information and Communication Technologies play a crucial role in education, and immersive virtual laboratories enable experimentation, overcoming limitations in access to physical laboratories. In this regard, this study investigated the use of an immersive virtual environment for teaching Computer Networks, evaluating student acceptance with the Technology Acceptance Model (TAM) and workload with the NASA-TLX.

PhysioDrum: Bridging Physical and Digital Realms in Immersive Musical Interaction

  • Rômulo Vieira
  • Debora Christina Muchaluat-Saade
  • Pablo Cesar

The Internet of Multisensory, Multimedia, and Musical Things (Io3MT) bridges computer science, humanities, and arts, fostering transmedia services and creative applications. This demo research applies these principles alongside extended reality (XR) to enhance PhysioDrum, an immersive, multimodal system that blends physical and digital aspects to expand musical expression in virtual environments. Using a smart musical instrument (SMI) and electronic pedals as interfaces, users interact with a virtual drum kit through gestures while receiving haptic feedback. By integrating sound and multimedia elements, PhysioDrum aims to reduce cognitive load and the learning curve, merging traditional drumming practices with immersive XR. The demo emphasizes design strategies that enhance playability, accessibility, and creative potential for users of all skill levels.

Ready Expert One: Universal 3D Workbench for Remote Industrial Training

  • Simon N.B. Gunkel
  • Tessa Klunder
  • Gianluca Cernigliaro

The lack of a skilled workforce creates new challenges, specifically for the training and (re)certification of needed professionals. Extended Reality (XR) technology has emerged as a promising way to improve and enhance education and training. However, given the complexity of technical tasks in manufacturing and other industries, it is not always clear how XR can help in different professional trainings. Therefore, we present our prototype "Ready Learner One" as a means to support both the teaching and quality evaluation of complex technical tasks in a generic way. Our approach utilizes 3D stereoscopic capture of the student’s and teacher’s work surfaces and presents a fused (blended) view in an Augmented Reality headset (HoloLens 2) for remote interaction. We use a Lego block building task as an example to illustrate our approach. Finally, our prototype will be demonstrated at the ACM IMX 2025 conference as a remote connection between Rio de Janeiro (Brazil) and The Hague (Netherlands).

Star Rocks: An Educational Outer Space Game

  • Victor Sassi
  • Esteban Clua
  • Lucas Sigaud
  • Erica Nogueira

Star Rocks is an educational virtual reality (VR) game that invites players to explore the differences in gravity across celestial bodies through an immersive experience. Created as part of a public outreach initiative at Casa da Descoberta, the science museum of Universidade Federal Fluminense, the game lets users experience how gravity changes on Earth, the Moon, and Mars. Through the mechanics of hitting targets while adjusting to different gravitational pulls, the game introduces fundamental physics concepts without needing formal instruction to play. Featuring diegetic user interfaces and realistic terrain models, Star Rocks balances authenticity with playability, making scientific concepts accessible and enjoyable.

The Arborist: A Collective Bloom Through Physiological Data in Mixed Reality

  • Shu Wei
  • Barnabas Lee
  • Michael Gancz
  • Asher Marks
  • Kimberly Hieftje

The mixed reality installation The Arborist transforms physiological data (heart rate [HR], galvanic skin response [GSR], and temperature) into collaborative digital flora using Meta Quest 3 and Shimmer3 sensors. Each user’s biometrics generate unique flowers—color tied to HR, bloom size to arousal—that populate a Tree of Shared Breath. The sculpture evolves through collective contributions, accompanied by hummingbirds and an HR-driven ambient soundtrack. By converting autonomic signals into artistic inputs, The Arborist creates a persistent ecosystem where individual physiology becomes part of a luminous, collective whole. The work reframes biometric data as a medium for poetic connection between bodies and environment without compromising data privacy.

Transitional Portals for Participatory Co-Located Cross-Reality Experiences

  • Luis Quintero
  • António Miguel Beleza Maciel Pinheiro Braga
  • Noak Petersson
  • Uno Gh Fors

Research in cross-reality systems has studied transitional interfaces that let an immersive system adjust the degree of virtuality of objects, from physical reality to virtual reality (VR). Portals have been effective metaphors to enable transitions between the real world, mixed reality (MR), and VR. However, there is a lack of implementations that enable co-located collaboration with immersive systems. This paper describes a cross-reality system for synchronous co-located collaborative tasks. It is instantiated in a performative experience where users perform interactive immersive activities. The novelty lies in the possibility of users sharing the physical space while being perceptually separated in different states of the reality-virtuality continuum. This work aims to advance immersive collaborative design with cross-reality interaction and understand how to design participatory performative experiences that can switch between physical and virtual environments.

VR Bicycle to Teach Science

  • João Guilherme Beltrão
  • Esteban Clua
  • Erica Nogueira
  • Daniela Trevisan
  • Lucas Sigaud

With the popularization of virtual reality (VR) devices, there is a growing demand for new forms of interaction in immersive environments, especially in the educational context. External accessories, widely used in electronic entertainment devices, have shown promise in improving the user experience in VR. In this project, we conducted an innovative experiment in which the user rides a bicycle as an interface to explore the surfaces of celestial bodies in a space-themed virtual environment.

VR WindSurf to Teach Environmental Conservation

  • Katlin Coutinho Santos
  • Fred Lopes
  • Esteban Clua
  • Erica Nogueira
  • Victor Sassi
  • Michelle Tizuka
  • Lucas Sigaud

Virtual Reality (VR) provides immersive and interactive experiences that support learning and skill acquisition. This paper introduces WindSurf VR, a simulation-based VR game set in a protected natural reserve in Camboinhas, Niterói. The game merges realistic windsurfing mechanics with an environmental conservation mission, challenging players to collect floating debris. By integrating physical engagement with ecological consciousness, WindSurf VR seeks to foster sustainability education through an engaging and competitive experience.

Who Killed Helene Pumpulivaara?: AI-Assisted Content Creation and XR Implementation for Interactive Built Heritage Storytelling

  • Janset Shawash
  • Mattia Thibault
  • Juho Hamari

This demo presents "Who Killed Helene Pumpulivaara?", an innovative interactive heritage experience that combines crime mystery narrative with XR technology to address key challenges in digital heritage interpretation. Our work makes six significant contributions: (1) the discovery of a "Historical Uncanny Valley" effect where varying fidelity levels between AI-generated and authentic content serve as implicit markers distinguishing fact from interpretation; (2) an accessible production pipeline combining mobile photography with AI tools that democratizes XR heritage creation for resource-limited institutions; (3) a spatial storytelling approach that effectively counters decontextualization in digital heritage; (4) a multi-platform implementation strategy across web and VR environments; (5) a practical model for AI-assisted heritage content creation balancing authenticity with engagement; and (6) a pathway toward spatial augmented reality for future heritage interpretation. Using the historic Finlayson Factory in Tampere, Finland as a case study, our implementation demonstrates how emerging technologies can enrich the authenticity of heritage experiences, fostering deeper emotional connections between visitors and the histories embedded in place.

SESSION: Work-in-Progress

A Multi-Agent Digital Twin Framework for AI-Driven Fitness Coaching

  • Monica (Monireh) Vahdati
  • Kamran Gholizadeh HamlAbadi
  • Fedwa Laamarti
  • Abdulmotaleb El Saddik

We introduce DTAIFC, a modular Digital Twin AI Fitness Coaching system that delivers personalized feedback through multimodal interaction. The system combines OpenPose-based skeletal tracking with a Crew-inspired multi-agent architecture to analyze user posture and provide biomechanically grounded coaching in natural language and voice. At its core, an Orchestrator Agent coordinates Feedback and Recommendation Agents, leveraging short-term memory (Redis) for real-time session context and long-term memory (PostgreSQL) for user-specific historical insight. Language generation is powered by GPT-4, enabling adaptive, context-aware feedback through prompt-driven reasoning. DTAIFC operates asynchronously through a lightweight web interface, supporting input via static images, voice commands, and text queries. Unlike real-time systems that depend on continuous video or wearables, DTAIFC offers a scalable, privacy-conscious solution for intelligent fitness guidance in virtual environments. This framework establishes a new paradigm for memory-augmented, agentic AI coaching, advancing the integration of digital twins in human-centered applications.

A Quality of Experience Evaluation of an Omnidirectional Treadmill for Fitness in Virtual Reality

  • Piotr Warkocki
  • Niall Murray
  • Conor Keighrey

Recent advancements in Virtual Reality (VR) technologies have significantly enhanced immersive experiences, bridging the gap between physical and digital worlds. Modern Head-Mounted Displays (HMDs) and motion capture systems such as the KAT VR treadmill have transformed VR into a dynamic platform for applications ranging from entertainment to fitness. These innovations offer opportunities to integrate physical activity within virtual environments, promoting engaging and interactive experiences. This paper presents an overview of a work-in-progress quality of experience (QoE) evaluation of a VR fitness system that explores the utility of a VR locomotion system. Explicit measures will be captured using traditional post-experience questionnaires, whilst implicit measures of QoE in the form of heart rate, electrodermal activity, and skin temperature will be captured using the EmotiBit wristband. In parallel, the system will monitor objective metrics, such as the number of calories burned and distance travelled. The findings aim to highlight the potential of VR as a tool for promoting physical health while contributing to the growing field of fitness-focused social VR.

A Scalable and Reproducible ML Pipeline for Cancer Prediction using IR Images

  • Leonardo Reigoto
  • Aura Conci

Breast cancer remains a significant global health challenge, where early detection critically improves patient outcomes. Thermography presents a promising non-invasive imaging modality for screening. This paper details a work-in-progress on a computer vision application utilizing thermography images for breast cancer prediction. We focus on the implementation of a scalable, reproducible, and accessible machine learning pipeline built with Apache Airflow, Docker Compose, FastAPI, Gradio, MLflow, and PyTorch Lightning. The objective is to bridge the gap between research prototypes and practical deployment, facilitating easier testing and use, particularly in resource-constrained settings. We present the system architecture, initial methodological considerations addressing data leakage found in related work, and preliminary results, seeking feedback for further development.

AI-mediated Collaborative Crowdsourcing for Social News Curation: The Case of Acropolis

  • Daniel Schneider
  • Ramon Chaves
  • Ana Paula Pimentel
  • Marcos Antonio de Almeida
  • Jano Moreira De Souza
  • António Correia

Collaborative crowdsourcing has evolved over recent years as a problem-solving model that builds on interactions that take place within crowds or self-organizing teams that form and dissolve in response to an open call. Unlike microtasking, its effectiveness depends primarily on collective actions rather than individual efforts, particularly for non-decomposable tasks that require a certain level of interdependency. Although social news curation has gained attention as a means of empowering both citizens and journalists, there is a need to understand the components and constraints that shape this online crowdsourcing activity when mediated by artificial intelligence (AI). In response to this gap, this paper reimagines the traditional social news curation model through a design fiction prototype consisting of a personalized AI system and user-driven social curation activities. In a highly polarized political climate, where propaganda, misinformation, and extreme polarization dominate public framing and generate social chaos, the envisioned solution seeks to model healthy and trustworthy socio-algorithmic interactions towards a more democratic and relevant form of news curation.

An LLM-Enhanced Framework for Bridging Simulators and Game Engines towards Realistic 3D Simulations

  • Luiza Martins de Freitas Cintra
  • Elisa Ayumi Masasi de Oliveira
  • Rafael Teixeira Sousa
  • Valdemar Vicente Graciano
  • Arlindo Galvão
  • Gustavo Webster
  • Sofia Larissa da Costa Paiva

Modeling and Simulation (M&S) is a fundamental approach for engineering disruptive solutions, widely applied in critical domains such as aerospace, military, healthcare, traffic, and energy. These domains often exhibit high complexity, uncertainty, and nonlinearity, making simulation and visualization particularly challenging. While M&S allows for the evaluation and analysis of innovative systems, enhancing the experience for analysts and engineers through complementary tools is crucial for improving realism, accuracy, and trustworthiness. In this paper, we present preliminary results on the development of a framework that integrates M&S with advanced visualization capabilities. Our approach bridges the Discrete-Event System Specification (DEVS) formalism with the Unity engine to provide a realistic graphical representation of simulations. By leveraging a Large Language Model (LLM) trained for this purpose, we automatically generate both simulation and animation codes, ensuring synchronized execution. The generated models exchange data, with the simulator serving as the computational engine and Unity as the visualization platform. Additionally, we provide a library of pre-defined simulation models for reuse in various domains. This integration enhances the interpretability of simulations, offering a more intuitive and immersive experience for system analysis.

Authoring Tool for Immersive Content Leveraging Cloud-based Rendering System: Toward Virtuous Cycle between Viewing and Creating Content

  • Shuichi Aoki
  • Yasutaka Matsuo

Immersive media is an advanced form of application, building on Virtual Reality (VR), Augmented Reality (AR), and eXtended Reality (XR), that allows users to move freely in a video space and view images from any location. It is a six-degrees-of-freedom (6DoF) medium in which users can feel as though they are actually in the video space. We are engaged in research on content formats for immersive media and cloud-based rendering systems to achieve immersive media that can be viewed on various types of devices. In this paper, we present an authoring tool we developed to leverage a rendering server on a cloud in order to create immersive content easily. The authoring tool, which makes use of a cloud-based rendering system that has been developed to view immersive content, enables users to arrange video objects including compressed volumetric video and 360-degree video in a three-dimensional space of content while checking images viewed from various positions and directions. It is also possible to configure a viewing area in which users can move around in the three-dimensional space and to record the recommended viewing position and direction. The produced scene description can be output in a streaming format. We confirmed that immersive content consisting of a scene description produced with the authoring tool can be viewed on a head-mounted display, tablet device, and PC with controller. When such a tool becomes widely used, it is expected to create a virtuous cycle between viewing and creating immersive content, leading to the spread of more content.

Child Remains: Engaging Grief, Memory, and Cultural Heritage through XR and the Tactility of Absence

  • Gabriella Di Feola
  • Asreen Rostami

This work-in-progress presents and reports on the design of Child Remains, an artistic, practice-based project that investigates how XR technologies can be used to create affective, immersive museum experiences that support public engagement with personal and collective grief. Developed in collaboration with the Lilla Ånggården museum in Sweden, the project bridges digital heritage with contemporary experience. The installation combines VR, storytelling, tactile artefacts, and dialogical encounters to explore how immersive environments can create space for conversation, memory, and emotional processing within museum settings. This paper outlines the design rationale, methodological approach, and preliminary insights, contributing to HCI discourses on cultural heritage, XR, and designing with grief.

Co-creating a VR Narrative Experience of Constructing a Food Storage Following OvaHimba Traditional Practices

  • Kaulyaalalwa Peter
  • Isaac Makosa
  • Selma Auala
  • Laurinda Ndjao
  • Donovan Maasz
  • Uariaike Mbinge
  • Heike Winschiers-Theophilus

As part of an attempt to co-create a comprehensive virtual environment in which one can explore and learn traditional practices of the OvaHimba people, we have co-designed and implemented a VR experience to construct a traditional food storage. In collaboration with the OvaHimba community residing in Otjisa, we have explored culturally valid representations of the process. We have further investigated different techniques such as photogrammetry, generative AI and manual methods to develop 3D models. Our findings highlight the importance of context, process, and community-defined relevance in co-design, the fluidity of cultural realities and virtual representations, as well as technical challenges.

Emanation and Extinction: exploring the possibilities of real-time generative AI for interactive experiences

  • Maxim Safioulline
  • Minyoung Joo
  • Shinye Kim

This paper describes the context, artistic strategies, technical implementation and user experience of an interactive media art installation “Emanation and Extinction”, a participatory experience that explores the aesthetic, interactive and technical possibilities of real-time generative AI in interactive media art. The installation uses pose and hand recognition, video processing and real-time image generation based on diffusion machine learning models to engage the users in an interactive experience centered on aesthetic discovery. The paper also outlines the uncovered limitations, both technical and experiential, and examines possible future explorations of the use of real-time image generation in interactive media art.

Experiencing Art Museum with a Generative Artificial Intelligence Chatbot

  • Huan Wang
  • Andrii Matviienko

Generative Artificial Intelligence (GenAI) chatbots are beginning to change museum visitors’ experiences with art by making them more interactive and engaging. However, it remains underexplored how GenAI chatbots influence visitors’ in-field experience and interaction at art museums regarding finding information, engagement, and enjoyment compared to existing museum tour-guide applications. In this paper, we contribute the design and implementation of a smartphone-based chatbot that detects artwork, generates textual and auditory information, and interactively answers visitors’ questions. To explore visitors’ experience with it, we conducted a field experiment (N=30) at the National Art Museum, comparing it to the existing museum application. Our results indicate that visitors showed higher artwork engagement with the chatbot than with the museum application. Moreover, they enjoyed an interactive experience using the chatbot to learn about the art collection and equally preferred textual and auditory information representation.

Immersive Virtual Museums with Spatially-Aware Retrieval-Augmented Generation

  • Elisa Ayumi Masasi de Oliveira
  • Rafael Teixeira Sousa
  • Andressa Araújo Bastos
  • Luiza Martins de Freitas Cintra
  • Arlindo Rodrigues Galvão Filho

Virtual Reality has significantly expanded possibilities for immersive museum experiences, overcoming traditional constraints such as space, preservation, and geographic limitations. However, existing virtual museum platforms typically lack dynamic, personalized, and contextually accurate interactions. To address this, we propose Spatially-Aware Retrieval-Augmented Generation (SA-RAG), an innovative framework integrating visual attention tracking with Retrieval-Augmented Generation systems and advanced Large Language Models. By capturing users’ visual attention in real time, SA-RAG dynamically retrieves contextually relevant data, enhancing the accuracy, personalization, and depth of user interactions within immersive virtual environments. The system’s effectiveness is initially demonstrated through our preliminary tests within a realistic VR museum implemented using Unreal Engine. Although promising, comprehensive human evaluations involving broader user groups are planned for future studies to rigorously validate SA-RAG’s effectiveness, educational enrichment potential, and accessibility improvements in virtual museums. The framework also presents opportunities for broader applications in immersive educational and storytelling domains.
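
The core idea behind SA-RAG, using what the visitor is looking at to scope retrieval before generation, can be sketched in a few lines. This is a hypothetical illustration, not the authors' implementation: the toy knowledge base, embeddings, and function names are ours, and a real system would use learned embeddings and an LLM rather than hand-written vectors.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy knowledge base: per-artwork snippets with toy 2-D embedding vectors.
KB = [
    {"artwork": "sunflowers", "text": "Painted in Arles, 1888.", "vec": [1.0, 0.1]},
    {"artwork": "sunflowers", "text": "Part of a series of still lifes.", "vec": [0.9, 0.2]},
    {"artwork": "starry_night", "text": "Painted at Saint-Remy.", "vec": [0.1, 1.0]},
]

def retrieve(gazed_artwork: str, query_vec, k: int = 1):
    """Spatial awareness step: filter the KB to the artwork under the user's
    gaze, then rank the remaining snippets by similarity to the query."""
    candidates = [d for d in KB if d["artwork"] == gazed_artwork]
    candidates.sort(key=lambda d: cosine(d["vec"], query_vec), reverse=True)
    return [d["text"] for d in candidates[:k]]

# The retrieved context would then be prepended to the LLM prompt.
context = retrieve("sunflowers", query_vec=[1.0, 0.0], k=1)
prompt = f"Context: {context[0]}\nAnswer the visitor's question using this context."
```

The design point the sketch makes is that gaze acts as a hard filter before similarity ranking, so the generator only ever sees context about the artwork currently holding the user's attention.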

Lemmings: Creating Re-mappable Gestures For More Accessible Immersive Experiences

  • Justin Berry
  • Michael Gancz
  • Asher Marks
  • Kimberly Hieftje

We are building an open-source code library, called Lemmings, for creating re-mappable gestures within Unity. We want to test the viability of a new paradigm for gestural interactions that focuses on accessibility and expressivity. Gestural interactions are becoming more broadly integrated into immersive platforms, but developing for them requires tools that are biased towards particular outcomes. The most commonly supported method of creating custom gestures is pose detection, but the kinds of semantic gestures this method supports are inaccessible and resistant to expressive interpretation. Without tools that enable easier experimentation, developers are left to explore unique alternatives that take time to build and, even when successful, may not be available to the community for extension and iteration. The full technical breakdown of Lemmings will be available with the public release and its associated documentation, so the focus of this paper is to consolidate our conceptual approach and methodology in the hopes of fostering discourse. New paradigms are not exemplified by solutions themselves; they are measured by the extent to which they reshape the collective imagination to include such solutions. We hope to engage the community and learn from its diverse perspectives in order to ensure that the work we are doing is part of an ongoing and evolving inquiry.

On Breast Reconstruction using IR Images by AI Techniques

  • Fernando Pereira Gonçalves de Sá
  • Aura Conci

This paper presents a novel approach for 3D breast reconstruction using infrared (IR) images, leveraging 3D Gaussian Splatting (3D-GS) techniques to improve diagnostic accuracy in breast cancer detection. The proposed method adopts the 3D-GS-based method pixelSplat to generate detailed 3D models of breast tissue, aiding in the precise detection of abnormalities. The application of pixelSplat allows for real-time, memory-efficient rendering, improving the resolution of 3D models while overcoming common challenges in medical imaging, such as the handling of sparse data and the absence of camera pose information. By utilizing infrared thermography, this work addresses the challenge of enhancing early detection and treatment planning, ultimately contributing to more effective and personalized care within the public health system. This innovative approach holds significant potential for advancing breast cancer detection and treatment strategies in clinical practice.

See Through My Eyes: Using Multimodal Large Language Model for Describing Rendered Environments to Blind People

  • Angelo Coronado
  • Sergio T Carvalho
  • Luciana Berretta

Extended Reality (XR) is quickly expanding “as the next major technology wave in personal computing”. Nevertheless, this expansion and adoption could also exclude certain disabled users, particularly visually impaired people (VIP). According to the World Health Organization (WHO) in its 2019 publication, there were at least 2.2 billion people with visual impairment, a number estimated to have increased in recent years. It is therefore important to include disabled users, especially visually impaired people, in the design of Head-Mounted Displays and Extended Reality environments. This objective can be pursued by incorporating Multimodal Large Language Model (MLLM) technology, which can assist visually impaired people. As a case study, this work employs different prompts that produce environment descriptions from an MLLM integrated into a virtual reality (VR) escape room. Six potential prompts were engineered to generate valuable outputs for visually impaired users inside a VR environment, and these outputs were evaluated using the G-Eval and VIEScore metrics. Although the results show that the prompt patterns provided descriptions that align with the user’s point of view, we strongly recommend evaluating these outputs against “expected outputs” from Orientation and Mobility Specialists and Sighted Guides. The subsequent step is to have the outputs evaluated by visually impaired people themselves to identify the most effective prompt pattern.
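
A prompt pattern of the kind the study engineers can be sketched as a small template builder. The wording, function name, and structure below are hypothetical examples of ours, not one of the paper's six prompts; they only illustrate how scene context (room, heading) might be folded into an MLLM request alongside the rendered view.

```python
def build_description_prompt(room_name: str, heading_deg: int) -> str:
    """Assemble a hypothetical scene-description prompt for an MLLM.

    The rendered frame from the VR headset would be attached as the
    image input; this function only produces the text part.
    """
    return (
        f"You are a sighted guide. The user is in the '{room_name}' room, "
        f"facing {heading_deg} degrees. Describe the attached rendered view "
        "in 3 short sentences: first nearby obstacles, then interactable "
        "objects with clock-face directions, then exits."
    )

prompt = build_description_prompt("escape room - library", 90)
```

Fixing the output order (obstacles, interactables, exits) and using clock-face directions mirrors conventions from orientation-and-mobility practice, which is one reason the abstract recommends validating outputs against specialists' expected descriptions.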

Social eXtended Reality (XR) and Virtual Production: Toward New Engaging Immersive Experiences

  • Mario Montagud Climent
  • Álvaro Egea Benavente
  • Marc Martos Cabré
  • Javier Montesa
  • Francisco Ibañez
  • Sergi Fernández

Social eXtended Reality (XR) is poised to become a dominant medium for remote communication, social interaction, and collaboration in the near future. However, its potential can be significantly magnified by integrating additional technological enablers. On the one hand, support for realistic, volumetric holographic representations of users, captured in real time via affordable setups, will enhance the quality of interaction, co-presence, and trustworthiness compared to synthetic avatar-based user representations. On the other hand, the availability of high-quality multi-modal virtual production tools will enrich the immersion, interaction, and storytelling possibilities, while also allowing such engaging services to be provided to large-scale audiences via 2D video distribution platforms. This paper presents a strategic vision toward a modular and seamless integration of innovative Social XR, holographic communication, and virtual production tools to exploit all these potential advantages. In particular, it proposes a high-level architectural framework to cohesively integrate these technological enablers, accommodating novel features to meet newly derived requirements. It then outlines potential applicability scenarios and reports preliminary results as initial evidence of the full achievable potential.

The Bitter Taste of Confidence: Exploring Audio-Visual Taste Modulation in Immersive Reality

  • Pooria Ghavamian
  • Jan Henri Beyer
  • Sophie Orth
  • Mia Johanna Nona Zech
  • Florian Müller
  • Andrii Matviienko

Extended Reality (XR) technologies present innovative ways to augment sensory experiences, including taste perception. In this study, we investigated how augmented reality (AR) visual filters and synchronized audio cues affect gustation through a controlled experiment with 18 participants. Our findings revealed unexpected crossmodal interactions: while a pink visual filter, typically associated with sweetness, reduced perceived bitterness in isolation, it paradoxically enhanced bitterness perception when combined with a sweet-associated audio cue. Furthermore, we observed an inverse correlation between participants’ confidence levels and their perception of taste intensities across multiple dimensions, highlighting confidence as an overlooked factor in sensory experience design. These findings inform the design of nuanced multisensory experiences in immersive media, where subtle crossmodal interactions significantly influence user perception.

vER: Virtual Human Companionship for Patients in Preoperative Care

  • Aditya Pandhare
  • Darko Skulikj
  • Junwoo Lee
  • Pierre Mikhiel
  • Domna Banakou

Surgical procedures often cause significant anxiety, particularly in the preoperative phase. While virtual reality (VR) has been explored as a potential tool to reduce stress in surgical patients, existing studies typically focus on familiarizing patients with the surgical environment to alleviate fear responses by providing more information about their procedure, or aim to reduce stress by exposing patients to virtual environments, such as those designed for meditation. This project investigates VR as an innovative approach to reducing preoperative stress, with a focus on virtual human companionship. We developed an application to examine how VR-based social presence and interactions with virtual humans can help alleviate anxiety, exploring the role of social facilitation in a VR context. Preliminary findings indicated that VR companionship can reduce stress and enhance comfort and engagement in preoperative settings. However, participants reported low perceptions of realism in the VR environment, and stress reduction was not statistically significant, underscoring the need for improved system design. Current enhancements include refining AI agent interaction models and introducing a collaborative VR environment that allows patients' friends to join remotely.

SESSION: Doctoral Consortium

Emofusion: Toward Emotion-Driven Adaptive Computational Design Workflows

  • Albert Luganga

Emotion-aware systems are emerging as a frontier in interactive media design, promising adaptive, personalized experiences based on the user’s internal state. This doctoral research investigates how peripheral physiological signals—such as heart rate (HR) and electrodermal activity (EDA)—captured through wearable sensors can be used to infer emotional states in real time and drive dynamic adaptations in media content and interface design. While affective computing has made substantial progress in emotion recognition, most systems rely on static user profiles or intrusive biosensing (e.g., EEG), limiting real-world applicability. My research proposes a lightweight, real-time emotion-adaptive framework that uses peripheral physiological data to support emotionally intelligent design systems. This work sits at the intersection of affective computing, adaptive systems, and computational design, with the goal of moving beyond static personalization toward continuous emotional responsiveness and user-centric experience modulation.

Exploring Peak Experiences through Multi-Sensory Extended Reality: Implications for Therapeutic Practice and Interactive Media Experiences

  • Sunil Thaker
  • Maxi Heitmayer
  • Terry Hanley

As rates of poor mental health rise, technology may support healing and personal growth by affording people non-ordinary states of consciousness. This doctoral research will investigate the potential for Multi-Sensory Extended Reality (MSXR) to induce Peak Experiences (PEs) and altered states of consciousness (ASCs), contributing to therapeutic innovation and user engagement in interactive media. Drawing on Abraham Maslow’s humanistic framework and notions of PE, this study will explore whether immersive experiences combining sound, vision, and somatic interaction can evoke transformative states of consciousness and moments of PE. Using a qualitative phenomenological design, the project will capture participants’ lived experiences through MSXR engagements and investigate perceived therapeutic benefits. By positioning MSXR within human-centred computing and Digitally Induced Altered States of Consciousness (DIAL), this research aims to expand the role of immersive technologies in mental health interventions, contributing to the fields of counselling psychology, interactive media, and consciousness studies.

Internet of Multisensory, Multimedia and Musical Things (Io3MT): Framework Design, Use Cases, and Analysis

  • Rômulo Vieira

The Internet of Multisensory, Multimedia, and Musical Things (Io3MT) is a new concept that emerges from the convergence of several fields in computer science, arts, and humanities, aiming to bring together, in a single environment, devices and data that explore the five human senses, as well as multimedia aspects and musical content. This doctoral research proposes the first framework related to this area, defining an architecture, protocol stack, data types, communication network requirements, and tools that can be employed in its development. The goal is to provide guidelines so that interested stakeholders can practically implement Io3MT environments. To validate the proposal, three proof-of-concept implementations were developed, and tests were conducted to assess the quality of service (QoS) and quality of experience (QoE) of users regarding a prototype device, an immersive application, and a real-time artistic performance scenario. The experiments confirm the technical feasibility of the proposal while ensuring the presence of aesthetic and expressive factors.

Proposal for an Open-source Robotics Framework and Platform for the Development of Affective Social Robots

  • Marcelo Rocha
  • Jesus Favela
  • Débora Muchaluat-Saade

Affective computing enables robots to understand, interpret, and respond to human emotions, which is fundamental to creating more natural, human-like interactions. In healthcare, emotionally intelligent robots can reduce patient anxiety and improve treatment outcomes by providing empathic interactions. In educational settings, these robots can create learning environments that adapt to students’ emotions, enhancing the educational experience. Despite the good results reported in the literature on affective social robots, their full use in education and healthcare has often been hindered by their high cost. In addition, there is a need for open-source robotic platforms that can be customized according to the needs and resources of human-robot interaction (HRI) researchers, developers, and target users. This thesis presents a proposal for an open-source robotics framework and platform for the development of affective social robots, which includes a new affective social robot named FRED, an open-source affective human-robot interaction (AHRI) research platform. This platform can help HRI researchers conduct studies with social robots, especially those aimed at answering affective-robotics questions involving affective understanding and affective generation. As a proof of concept, the proposed framework was instantiated in two real-world scenarios on two different robotic platforms. Finally, we ran an experiment with FRED using the framework as a research instrument in HRI.

Tell me why: how Explanation can affect Recommender Systems

  • Leticia Freire de Figueiredo
  • Antonio A. de A. Rocha
  • Aline Paes

Recommender systems play a crucial role in helping users decide what to watch or purchase by suggesting relevant items. These systems can enhance the media experience by considering user preferences, inferring behavior, and delivering personalized recommendations. However, users often do not understand why a particular item was recommended to them. Explainable recommender systems aim to clarify the reasoning behind recommendations, increasing user trust and confidence. Despite advancements, gaps remain in the literature, particularly in evaluating these systems. This thesis will explore and propose new metrics to better assess explanation methods, investigating why and how current explanations fall short in evaluations. Additionally, we aim to examine whether explanations can reveal if recommender systems create filter bubbles, and to explore ways to mitigate this issue based on user preferences.

Vibrotactile Digital Therapy Using Haptic Interfaces in Preschoolers with ASD

  • Luis Miguel Zamudio Fuentes

This research proposes the development of a haptic interface that uses vibrotactile feedback to study a digital phenotype for preschoolers diagnosed with autism spectrum disorder (ASD). Sensory processing difficulties are prevalent in children with ASD, affecting their ability to engage with their surroundings and benefit from therapeutic interventions. The study aims to capture detailed touch interaction data through a sensory tablet application, analyzing parameters such as pressure, stroke duration, and gesture patterns. These data could serve as digital biomarkers, guiding the personalization of therapeutic activities. This approach has the potential to enhance sensory integration, improve engagement, and offer objective insights into therapeutic progress, representing a significant step toward scalable and data-driven therapeutic tools for ASD interventions.