Table of Contents
IMX ’22: ACM International Conference on Interactive Media Experiences
SESSION: Storytelling, Media & Engagement
As data literacy skills are increasingly important in today’s society, scholars have been exploring strategies to engage people with data, for example through storytelling and familiar media such as video. In this paper, we present the design of a video-based data storytelling application that prompts children and their families to explore and interpret historical weather data through a personalised weather forecast. The application was displayed at a 2-month summer exhibition of a popular television channel. In a controlled comparative study, we investigated how the application triggered reflection, as well as emotional and narrative engagement, of families at home and at the exhibition. We combined this approach with an in-the-wild study, in which we observed spontaneous interactions of visitors. Our findings indicate that data engagement is encouraged when family interactions occur, which may be facilitated by external environmental conditions and internal story design. From these findings, we derive five design recommendations for data video storytellers.
The creation, consumption and commercialisation of podcasts have increased rapidly in recent years, yet there is limited research exploring the creators who are often the source of the products in this relatively new medium, as well as the workflows they utilise in making podcasts. Based on semi-structured interviews with sixteen professional podcast creators, and subsequent thematic analysis, this paper 1) codifies and quantifies the activities involved in podcast creation, 2) distils the archetypal podcast production workflow, 3) finds that this workflow is remarkably consistent across podcast genres and creator affiliations (independent or part of a media organisation), and 4) sheds light on the “creator” role that has become a distinctive feature of the medium. This snapshot of the inner workings of the creation process, in the evolution of a highly engaging medium, could form the basis for identifying potential innovations that would increase the interactivity of podcasts in the future.
Exploring Effect of Level of Storytelling Richness on Science Learning in Interactive and Immersive Virtual Reality
Immersive and interactive storytelling in virtual reality (VR) is an emerging creative practice that has been thriving in recent years. Educational applications using immersive VR storytelling to explain complex science concepts have very promising pedagogical benefits because on the one hand, storytelling breaks down the complexity of science concepts by bridging them to people’s everyday experiences and familiar cognitive models, and on the other hand, the learning process is further reinforced through rich interactivity afforded by the VR experiences. However, it is unclear how different amounts of storytelling in an interactive VR storytelling experience may affect learning outcomes due to a paucity of literature on educational VR storytelling research. This preliminary study aims to add to the literature through an exploration of variations in the designs of essential storytelling elements in educational VR storytelling experiences and their impact on the learning of complex immunology concepts.
SESSION: Immersive-Multisensory Experiences
People are typically involved in different activities while eating, particularly when eating alone, such as watching television or playing games on their phones. Previous research in Human-Food Interaction (HFI) has primarily focused on studying people’s motivations and analyzing the media content watched while eating. However, the impact of such media on human behavioral and cognitive processes, particularly flavor perception and its attributes, remains underexplored. We present a user study to investigate the influence of six types of videos, including mukbang – a new food video genre, on flavor perceptions (taste sensations, liking, and emotions) while eating plain white rice. Our findings revealed that participants perceived positive emotional changes and reported significant differences in their augmented taste sensations (e.g., spicy and salty) with different food-based videos. Our findings provide insights into using our approach to promote digital commensality and healthier eating (digital augmentation without altering the food), highlighting the scope for future research.
The preliminary results of immersive virtual reality music therapy experiences are presented, carried out with people between the ages of 72 and 96, residents of a nursing home, who suffer from mild mental health problems, potential dementia, moderate cognitive disorders, depressive symptoms, or severe dependency. The experiences are carried out in 15-minute sessions, streaming 180° and 360° music videos of genres that the elderly like, recorded for this purpose and performed by musical groups that collaborate voluntarily. The design of the developed application (the web interface, the application logic and the HMD player) is simple so that it is easy to use during music therapy sessions. Preliminary results show that the elderly experience similar levels of sense of presence as reported by other studies in younger populations (< 55), and only 19% of them suffer from simulator sickness. These results support the use of immersive media for music therapy in elderly patients.
Augmented Reality (AR) and Virtual Reality (VR) have shown potential benefits in managing healthy behavior. This paper presents a systematic review of AR- or VR-driven interventions for promoting healthy behaviors. The review investigates the design and implementation of the interventions, persuasive strategies, intervention platforms, underlying technologies, current trends, and research gaps. Our review of the past 10 years of work in the area reveals that 1) the considered papers focused on seven main healthy behaviors, where “alcohol use” emerged as the most commonly considered behavior; 2) trustworthiness emerged as the most commonly used persuasive strategy; 3) youth are the most targeted audience; 4) VR is more common than AR; and 5) most AR- or VR-driven interventions are perceived to be effective in motivating healthy behavior in people. We also examine how these interventions use Artificial Intelligence and Object Tracking. Finally, we identify gaps and offer recommendations for advancing research in this area.
SESSION: XR Artistic & Creative Experiences
We present TeleFest, a novel system for live-streaming mixed reality 360° videos to online streaming services. TeleFest allows a producer to control multiple cameras in real time, providing viewers with different locations for experiencing the concert, and an intermediate software stack allows virtual content to be overlaid with coherent illumination that matches the real-world footage. TeleFest was evaluated by live-streaming a concert to almost 2,000 online viewers, allowing them to watch the performance from the crowd, the stage, or via a curated experience controlled by a producer in real time that included camera switching and augmented content. The results of an online survey completed by virtual and physical attendees of the festival are presented, showing positive feedback for our setup and suggesting that the addition of virtual and immersive content to live events could lead to a more enjoyable experience for viewers.
As Augmented Reality Television (ARTV) transitions out of the feasibility phase, it is crucial to understand the impact of design decisions on the viewers’ ARTV experiences. In a previous study, six ARTV design dimensions were identified by relying on insights from existing prototypes. However, the set of possible dimensions is likely to be broader. Building on previous work, we create an ARTV design space and present it in a textual cheat sheet. We subsequently evaluate the cheat sheet in a between-subjects study (n = 10), with participants with wide-ranging expertise. We identified six new dimensions (genre, broadcast mode, audience demographics, cartoonish vs. photoreal representation, modality, and privacy), and a new aspect (360°) for the display dimension. In light of our observations, we provide an updated ARTV design space and observe that asking participants to write ARTV scenarios can be an effective method for harvesting novel design dimensions.
This paper presents Fractured Objects for the design of virtual and mixed-reality experiences. Drawing on the qualitative analysis of three weeks of artistic activities within a residency program, we present six types of Fractured Objects that were used in sketching a mixed-reality performance. Building on these Fractured Objects, as they were articulated by the artists, we present speculative designs for their use in scenarios inspired by research within the IMX community. In discussion, we look to expand the concept of Fractured Objects by relating it to other design concepts such as Seamful Design and Wabi-Sabi, and explore the relationship to the temporality of interaction. We introduce Kintsugi VR with Fractured Objects, drawing on the concept of ‘golden repair’ in which the act of reconnecting fractured parts improves the resulting whole object.
SESSION: Technologies, Systems & Interfaces
The ability to make videos for different aspect ratios (known as video retargeting) contributes to an optimal viewing experience on different video platforms. In this paper, we present an idiom-based tool for retargeting videos from the most common 16:9 aspect ratio into narrower aspect ratios. In contrast to earlier retargeting approaches, which distort the video and are completely automated, our tool enables cropping and panning with user input and oversight. Users can select and order idioms from six cinematic idioms to control video retargeting, and the tool applies the selected idioms in order and generates the retargeting results. We performed a pilot study to assess the feasibility of the tool, and conducted quantitative analysis to inform further work on crafting intelligent cropping and panning tools. In addition, we interviewed an experienced video editor on how retargeting is done manually and on the quality of the tool’s output.
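Not part of the paper itself, but the core geometry that any cropping-and-panning retargeting tool has to solve can be sketched in a few lines (the function name and defaults are illustrative assumptions, not the authors’ implementation):

```python
def crop_window(src_w, src_h, target_aspect, center_x=None):
    """Compute a crop rectangle (x, y, w, h) inside a source frame that
    matches target_aspect (width/height), keeping the full frame height.
    center_x lets an idiom (e.g. subject tracking) shift the window."""
    crop_h = src_h
    crop_w = round(crop_h * target_aspect)
    crop_w = min(crop_w, src_w)          # never exceed the source width
    if center_x is None:
        center_x = src_w / 2             # default: centered crop
    # clamp so the window stays fully inside the frame
    x = int(min(max(center_x - crop_w / 2, 0), src_w - crop_w))
    return x, 0, crop_w, crop_h

# Retarget a 1920x1080 (16:9) frame to 9:16 (vertical video)
print(crop_window(1920, 1080, 9 / 16))  # (656, 0, 608, 1080)
```

An idiom-driven tool would then vary `center_x` over time, for example following a tracked subject for a panning idiom or holding it fixed for a static-shot idiom.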
TCP-Based Distributed Offloading Architecture for the Future of Untethered Immersive Experiences in Wireless Networks
Task offloading has become a key term in the field of immersive media technologies: it can enable lighter and cheaper devices while providing them with greater remote computational capabilities. In this paper we present our TCP-based offloading architecture. The architecture has been specifically designed for immersive media offloading tasks, with particular care taken to reduce any processing overhead that could degrade network performance. We tested the architecture for different offloading scenarios and conditions on two different wireless networks: WiFi and 5G millimeter wave technologies. In addition, to test the network on alternative millimeter wave configurations not currently available in actual 5G millimeter wave rollouts, we used a 5G Radio Access Network (RAN) real-time emulator. This emulator was also used to test the offloading architecture for a simulated immersive user sharing network resources with other users. We provide insights into the importance of user prioritization techniques for successful immersive media offloading. The results show strong performance for the tested immersive media scenarios, highlighting the relevance of millimeter wave technology for the future of immersive media applications.
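The paper’s architecture is not reproduced here, but the length-prefixed TCP framing that such offloading systems typically build on can be sketched as follows (the message format, names, and the toy “processing” step are illustrative assumptions, not the authors’ design):

```python
import socket
import struct
import threading

def send_msg(sock, payload: bytes):
    # Length-prefix framing: 4-byte big-endian size, then the payload,
    # so the receiver knows exactly how many bytes belong to one task.
    sock.sendall(struct.pack("!I", len(payload)) + payload)

def recv_exact(sock, n: int) -> bytes:
    # TCP is a byte stream: recv() may return partial data, so loop.
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed")
        buf += chunk
    return buf

def recv_msg(sock) -> bytes:
    (size,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, size)

def offload_server(server_sock, process):
    # Accept one client, run the offloaded task, return the result.
    conn, _ = server_sock.accept()
    with conn:
        send_msg(conn, process(recv_msg(conn)))

# Example: offload an "expensive" transform (here, just upper-casing)
srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)
port = srv.getsockname()[1]
t = threading.Thread(target=offload_server, args=(srv, bytes.upper))
t.start()

cli = socket.create_connection(("127.0.0.1", port))
send_msg(cli, b"frame data")
result = recv_msg(cli)
cli.close()
t.join()
srv.close()
print(result)  # b'FRAME DATA'
```

A real immersive-media pipeline would replace the toy transform with pose estimation or rendering, and the framing overhead per task stays at a constant four bytes, in line with the paper’s concern about processing overhead.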
SESSION: Platforms & Complexities of Sharing
The current TV ecosystem is being disrupted by four major fragmentations: linear vs. on-demand content, traditional vs. newer operators, curated vs. algorithmic selection, and TV set vs. additional visualization devices. These fragmentations are changing the TV ecosystem from being channel-based to one that is app-based. However, this multiple-app approach makes for a much more complex user experience for the TV viewer. This paper presents five specific strategies, both technological and business-related, to address this challenge, with the potential to create a win-win scenario for both pay-TV operators and their customers. These strategies were derived from the experience gained during the development of the UltraTV research project, which aimed to create a content unification paradigm to improve the TV viewer user experience, and from its subsequent successful transfer to a commercial offer.
Accounts and profiles on media platforms often discount the real-world complexities of sharing by naively designing technology that supports one user per account. While recently popular media platforms like Netflix and Disney+ have started responding here, the solutions are simplistic. Within research, despite recent interest, a holistic narrative of the practicalities of sharing, and of support that could be effectively leveraged for the future design of media platforms that support sharing, is yet to be developed. This paper reports a set of user focus groups and expert interviews that surface a holistic set of five challenges confronting this space. We further situate these findings in two practical examples of sharing, one current and one near-future, to demonstrate their real-world applicability, both explicating the challenges and using them as a lens to start exploring responses in each case.
SESSION: User-Centered XR
The occlusive nature of VR headsets introduces significant barriers to a user’s awareness of their surrounding reality. While recent research has explored systems to facilitate a VR user’s interactions with nearby people, objects, and more, we lack a fundamental understanding of user attitudes towards, and expectations of, these systems. We present the results of a card sorting study (N=14) which investigated attitudes towards increasing a VR user’s reality awareness (awareness of people, objects, audio, pets, and systems to manage and moderate personal usage) whilst in VR. Our results confirm that VR headsets should be equipped with systems to increase a user’s awareness of reality. However, opinions vary on how increased awareness should be achieved, as our results also highlight differing expectations regarding persistent vs. temporary notification design, notification content, and when, why, and how awareness should be increased.
Augmented Reality Television (ARTV) can take many forms, from AR content displayed outside the TV frame to video-projected TV screens to social TV watching in VR to immersive holograms in the living room. While the user experience (UX) of individual forms of ARTV has been documented before, “journeys” as transitions between such forms have not. In this work, we examine the UX of watching TV when switching between various levels of augmentation. Our findings from an experiment with fourteen participants reveal a UX characterized by high perceived usability, captivation, and involvement, with a low to medium workload and a moderate feeling of dissociation from the physical world. We interpret our results in the context of Garrett’s established five-plane model of UX—strategy, scope, structure, skeleton, and surface—and propose a sixth plane, “switch,” which conceptually separates the design of user journeys in ARTV from the specifics of the other UX planes.
Unveiling Behind-the-Scenes Human Interventions and Examining Source Orientation in Virtual Influencer Endorsements
A growing number of computer-generated virtual influencers are being used as alternatives to human endorsers in brand advertising. Because these virtual influencers are not real people, who gets the credit when an endorsement succeeds? And who takes the blame when it fails? In this study, we investigated how and to what extent consumers attribute responsibility to virtual influencers, as well as to the behind-the-scenes human interventions (i.e., the influencer company and the endorsed brand), based on an internal versus external causality for endorsement failure and success—and how their attributions differ from those in human influencer cases. We also examined consumers’ attitudes and behavioral intentions toward influencers and endorsed brands under the given situations. We conducted a 2 (type of influencer: human versus virtual) × 2 (endorsement outcome: success versus failure) × 2 (locus of causality: influencer versus brand) between-subjects online experiment. The results showed that virtual influencers were attributed less blame for an endorsement failure caused by an influencer’s misbehavior than human influencers. However, virtual influencers’ companies and endorsed brands were attributed significantly more responsibility than their human counterparts’. Finally, we discuss the theoretical and practical implications in this paper.
SESSION: Telepresence & Bodily Interactions
We propose a system that displays audience eye gaze and nod reactions to enhance synchronous remote communication. Recently, we have had increasing opportunities to speak to others remotely. In contrast to offline situations, however, speakers often have difficulty observing audience reactions at once in remote communication, which makes them feel more anxious and less confident in their speeches. Recent studies have proposed methods of presenting various audience reactions to speakers, but since these methods require additional devices to measure audience reactions, they are not appropriate for practical situations. Moreover, these methods do not present overall audience reactions. In contrast, we design and develop CalmResponses, a browser-based system which measures audience eye gaze and nod reactions with only a built-in webcam and collectively presents them to speakers. The results of our two user studies indicated that the number of fillers in speakers’ speech decreases when audiences’ eye gaze is presented, and their self-rating scores increase when audiences’ nodding is presented. Moreover, comments from audiences suggested benefits of CalmResponses for them in terms of co-presence and privacy concerns.
Tele-education has long been a solution for people who cannot attend lessons in person (for example, due to the inaccessibility of rural areas or illness). However, COVID has revealed problems in tele-education with current technology, causing adolescents and children to slow down their learning curves and to experience social distancing from their classmates. This paper presents a user study to validate an immersive communication system for tele-education purposes. This system streams a class in real time using 360-degree cameras, allowing remote students to explore the whole scene and improving the feeling of being in the classroom with their colleagues. Additionally, the prototype notifies the remote students about events that occur outside their viewport (such as changes in the teacher’s presentation or classmates raising their hands), indicating in which direction they should move their heads to see them.
To validate the system and investigate its possible added value, socio-emotional factors such as presence, perceived quality, usability, and usefulness of the notifications were evaluated through a user test using questionnaires. The results show that immersive tele-education systems can improve presence, and demonstrate the benefits of the notifications for the experience of remote students.
Cybersickness encompasses all the adverse effects that can occur during a Virtual Reality (VR) immersion, which can compromise the quality of the user experience and limit the usability, functionality, and duration of use of VR systems. Standardised protocols help detect stimuli that may cause cybersickness in multiple users but do not fully discriminate which specific users experience cybersickness. Of the biometric measures used to monitor cybersickness in an individual, Heart Rate Variability (HRV) is one of the most used in previous work. However, previous studies only considered its temporal components and did not allow for rest periods between sessions, even though these can affect users’ immersion. Our analysis addresses these limitations: changes in HRV can measure specific levels of discomfort or “alertness” associated with the initial cybersickness stimulus induced by the 360° videos. Primarily, our empirical results show significant differences in the frequency components of HRV in response to cybersickness stimuli. These initial measurements can compete with standard subjective assessment protocols, especially for detecting whether a subject responds to a VR immersion with cybersickness symptoms.
SESSION: ACM IMX demos
This work presents the Co-Creation Space, a multilingual platform for professional and community artists to 1) generate raw artistic ideas, and 2) discuss and reflect on the shared meaning of those ideas. The paper describes the architecture and the technology behind the platform, and how it was used to facilitate the communication process during several user trials. By supporting facilitator-guided ideation sessions around media items and allowing users to express themselves and be part of the creation of an artistic product, the platform enabled participants to access new cultural spaces and take part in the creative process.
Video for Health (V4H) Platform. A Secure Video Suite Platform for Online Care, Teleconsultation and Tele-orientation
One of the major problems faced by public health managers around the globe is the lack of specialized professionals in remote locations to meet the health demands of society. To address these problems, the authors developed the Video for Health (V4H) Platform for Brazil’s National Education and Research Network (RNP), a research and development agency of the Brazilian Ministry of Science, Technology and Innovation (MCTI), which is the topic of this proposed demo. V4H is a system that aims to reduce the distance between health professionals and the population that needs primary care. The original proposal for V4H was developed by two teams, Working Group phase 1 (WG1) and Working Group phase 2 (WG2), at the Federal University of Paraíba (UFPB). The first working group designed the system, which was tested at the Telehealth Center and the Federal University of São Paulo (Unifesp), at the São Paulo Area Military Hospital (HMASP), and in the TeleDentistry project at the University of São Paulo’s (USP) Dentistry School (FOUSP). In the second phase of the project we developed more complex features, such as blockchain for video preservation, billing, time control, and accessibility for teleconsultations. The platform was tested and integrated with the University of São Paulo’s Heart Institute (InCor). WG2 was coordinated by Prof. Guido Lemos (UFPB) and Prof. Marco Antonio Gutierrez (InCor). The V4H Platform supports synchronous and confidential video streaming, with a scalable architecture to simplify integration with telehealth, teleconsulting, and Electronic Health Record (EHR) platforms.
The V4H system allows the authentication of the participants in a transmission, as well as the recording, retrieval, and preservation of the transmitted content, using signature technologies with digital certificates and blockchain to ensure that the content remains immutable and to guarantee the integrity and authenticity of the persisted data. The main focus of the solution is to offer a synchronous video service for platforms that support the electronic health record, where the recorded and preserved content can be attached to the patient’s data, serving as legal evidence of the healthcare provided. These contents can also be used as a data source for teleconsulting, tele-diagnosis, and preceptorship activities for health professionals from all areas, with an initial focus on basic and primary health care in locations that lack specialized health professionals.
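The V4H preservation scheme itself is not detailed here, but the blockchain idea it relies on — each stored record carrying the hash of its predecessor, so any retroactive edit is detectable — can be sketched in a few lines (a toy illustration under that assumption, not the V4H implementation):

```python
import hashlib
import json

def add_block(chain, record):
    """Append a record; each block stores the previous block's hash,
    so altering any earlier record invalidates everything after it."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    digest = hashlib.sha256(
        json.dumps({"record": record, "prev": prev_hash},
                   sort_keys=True).encode()).hexdigest()
    chain.append({"record": record, "prev": prev_hash, "hash": digest})

def verify(chain) -> bool:
    # Recompute every hash from the start; any mismatch means tampering.
    prev_hash = "0" * 64
    for block in chain:
        expected = hashlib.sha256(
            json.dumps({"record": block["record"], "prev": prev_hash},
                       sort_keys=True).encode()).hexdigest()
        if block["hash"] != expected or block["prev"] != prev_hash:
            return False
        prev_hash = block["hash"]
    return True

chain = []
add_block(chain, "consultation A: video digest …")
add_block(chain, "consultation B: video digest …")
print(verify(chain))          # True
chain[0]["record"] = "tampered"
print(verify(chain))          # False
```

A production system would additionally sign each block with a digital certificate, as the V4H description mentions, so that authorship as well as integrity can be proven.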
This demo paper proposes to exhibit the pilot episode of XR Ulysses, a creative project investigating the possibilities for live performance using three-dimensional volumetric video (VV) techniques via virtual reality (VR) technologies. XR Ulysses is part of a series of innovative performance experiments hybridizing theatre and extended reality (XR) technologies. Conference attendees are invited to don an HMD, embody the character of Stephen Dedalus, and engage Buck Mulligan in the famous opening scene of Joyce’s book, situated on the top of the Martello Tower at Sandycove (Dublin). This scene enables individuals to experience a live-action re-enactment of James Joyce’s Ulysses in VR.
We present a virtual learning space station, a design space co-designed by 63 schoolchildren from three countries on three continents: Namibia, Malaysia, and Finland. The design space station is developed on the Ohyay platform for children, by children, to plan, interact, and conduct co-design activities. In our hands-on demonstration, attendees can organise or join a design session and test the online collaboration and facilitation tools provided by the space station. Our demo contributes directly to the IMX 2022 theme of “Interactive Media Brings us Together”.
The Internet of Musical Things (IoMusT) is an interdisciplinary area that aims to improve the relationship between musicians and their peers, as well as between musicians and audience members, creating new forms of interaction in concerts, studio productions, and music learning. Although emerging, this field already faces challenges such as a lack of privacy and security and, above all, a lack of standardization and interoperability between its devices. We therefore propose an environment design, called Sunflower, which contributes to solving the most recurrent problems in this area by specifying an architecture pattern, a protocol, and sound features that allow heterogeneity in these systems. Its practical implementation resulted in an interoperable, multimedia, and interactive environment. This paper therefore demonstrates how Sunflower works in the accomplishment of an artistic presentation, also highlighting its approach, the technologies that support it, and the advances it can bring to the area of IoMusT.
Existing communication technologies have displayed a lack of affordances in supporting social-emotional connections, which is of particular interest in educational settings. We are therefore developing a live sensory immersive 3D video technology, built on a previously developed platform. Pilot trials in a Finnish school have yielded promising findings. In parallel, we continue to advance the state-of-the-art platform with regard to 3D capture quality and data compression algorithms. Current developments entail joint investigations and evaluations of affordances to support emotional, social, motivational, and achievement impacts with learners and teachers from a Namibian and a Finnish school. Participants can experience “remote presence” wearing the HoloLens 2 while others are live-streamed from another country, captured by two cameras.
As we move further into a spatial future, the demand for content creation tools is immense. Social media, gaming platforms and e-commerce have been converging into interactive spaces that involve spatial representations of the world the viewer is occupying, including digital humans. Creating digital representations of humans, or holograms, can be achieved through volumetric capture technologies. This method of bringing people into virtual three-dimensional environments has been rapidly increasing in popularity, but still lacks a key element: interactivity. In this paper we describe our work on producing interactive volumetric video that responds to viewers’ actions in real time. We present the Fushimi Inari project: a commercial use case pushing the boundaries of what can be achieved with volumetric video, and describe how our spatial content creation tools allow interactive films to be created. Our contributions include blending volumetric clips, skeletonizing captures and applying multi-bone retargeting. We also provide means to integrate this into game engines for real-time photorealistic and interactive stories to be enjoyed by any viewer.
This demonstration presents ScenaProd, a tool that allows people to produce multisensory scenagrams (multisensory exercises or interactive media). All participants can create their own scenagrams with the tool or test a more complete one that has already been created. A scenagram can be defined as an interaction between a human being and different devices. For example, a robot asks a question while displaying a visual clue on a screen. The participant can then respond by pressing a large, colored contactor. In the case of a correct answer, the robot plays a short victory song and a light turns green; for a wrong answer, the system reacts differently.
SESSION: Work-in-Progress – Future Immersive Media Application
This paper presents a proof-of-concept robotic teleoperation system that provides the human operator with a thermal sense in addition to the visual sense. With a sensor suite comprising a stereo camera, a 360° camera, and a long-wave infrared camera, our demonstrator pushes the boundaries of virtual-reality situational awareness by bringing not only 3D visual content but also a 360° thermal experience to the operator. The visual channel of our robotic teleoperation system is presented through a head-mounted display, and the thermal channel is displayed through directional heaters in the operator cockpit and a thermal glove. Initial tests showed that an operator successfully experienced a 360° remote environment, correctly distinguished between and interacted with hot and cold objects, and could notice the presence of nearby people outside her direct field of view based on their emitted heat.
Modern operating rooms (ORs) are equipped with several ceiling- and wall-mounted screens that display surgical information. These physical displays are restricted in placement, limiting the surgeons’ ability to freely position them in the environment. Our work addresses this issue by exploring the feasibility of using an augmented reality (AR) headset (Microsoft HoloLens 2) as an alternative to traditional surgical screens, leading to a reduced OR footprint and improved surgical ergonomics. We developed several prototypes using state-of-the-art hardware and software and conducted various neurosurgery-related exploratory studies. Initial feedback from users suggests that the coloration and resolution of the holographic feed were adequate; however, surgeons frequently commented on tactile/visual asynchrony. This emphasizes the need for novel, more efficient hardware/software solutions to support fine motor tasks in the OR.
Invasions of locust swarms have affected crops in many countries in Africa and Asia, posing a significant threat to food security. Different approaches are therefore adopted to monitor and control locust swarms to save the crops. Furthermore, various studies have shown that technology can help agriculture through drones, real-time data monitoring, or teaching farmers with the latest tools. Following the UN sustainability goals for food security, this research presents a Virtual Reality (VR)-based educational application to teach sustainable locust management strategies. Using the hand-tracking technology of the Oculus Quest, it lets users learn how farmers can deal with locusts without pesticides. Based on a storytelling approach, the methods presented are profitable for farmers and pose no harm to crops, supporting food security. This application can help motivate the adoption of these sustainable locust control strategies in broader interventions for environmental recovery.
While augmented reality television (ARTV) is being investigated in research labs, the high cost of AR headsets makes it difficult for audiences to benefit from the research. However, the relative affordability of virtual reality (VR) headsets provides ARTV researchers with opportunities to test their prototypes in VR. Additionally, as VR becomes an acceptable medium for watching conventional TV, augmenting such viewing experiences in VR creates new opportunities. We prototype a nature documentary ARTV experience in VR and conduct a remote user study (n = 10) to investigate six points on the visual display design dimension of presenting a lifelike programme-related hologram. We manipulated the starting point and the movement behaviour of the hologram to gain insight into viewer preferences. Our findings highlight the importance of personal preferences and of the hologram’s perceived role in relation to the underlying TV content, suggesting there may not be a single way to augment a TV programme. Instead, creators may need to provide audiences with capabilities to customise ARTV content.
As digitization has transformed media and Augmented Reality (AR) is evolving from a research area to a commodity, museums are creating interactive AR experiences to digitally enhance their collections and increase audience engagement. Head-worn AR experiences, though, face interaction challenges, as they are often employed in busy spaces and need intuitive multimodal interfaces for users on the move. This paper presents an innovative, work-in-progress multimodal AR experience that integrates non-obtrusive dialogue, music, and sound as well as gesture- and gaze-based interaction while the user wears a head-worn AR display. Users are motivated to explore and interact with digital cultural artefacts superimposed onto the real-world museum setting and its physical artefacts while moving around the museum. We first analyze interactive AR experiences to identify specific user requirements related to head-worn AR, and then apply these requirements to the design of an interactive, multimodal AR experience in a museum setting.
This work-in-progress presents the design of an XR prototype for teaching basic cybersecurity concepts. We have designed an experimental virtual-reality cyberspace that visualises data traffic over a network, enabling the user to interact with VR representations of data packets. Our objective was to help the user better conceptualise abstract cybersecurity topics such as encryption and decryption, firewalls, and malicious data. Additionally, to better stimulate the sense of immersion, we used Peltier thermoelectric modules and an Arduino Uno to experiment with multisensory XR. Furthermore, we reflect on an early evaluation of this experimental prototype and present potential paths for future improvements.
Several social VR platforms support virtual entertainment events; however, their value for post-show activities remains unclear. Through a user-centered approach, we design a social VR lobby experience to enrich four motivations of theatre-goers: social, intellectual, emotional, and spiritual engagement. We ran a context-mapping focus group session with professionals (N=6) to conceptualize the social VR space for digital opera experiences. Based on our findings, we propose a social VR lobby consisting of four rooms: 1) a Bar for social engagement, 2) an Info Booth for intellectual engagement, 3) a Photo Zone for emotional engagement, and 4) an Interactive Stage for spiritual engagement. Based on this work, we plan to experimentally evaluate audience experiences in each room in order to create a social VR lobby template for theater experiences.
SESSION: Work-in-Progress – Technical Advancements for Extended Reality Technologies
This paper presents and discusses the introduction of 3D functionalities into an existing web-based multimodal video annotation tool. Over the past years, we have developed a multimodal web video annotation tool that now combines 3D models and 360° content with more traditional annotation types (e.g., text, drawings, images), offering users the possibility of adding extra information to their annotation work. We show how 3D models augment the annotation work and add advantages such as viewing or exploring objects in detail and from different angles. The paper reports detailed feedback from a pilot study in the form of a workshop with traditional dance experts to whom these new features were presented. We conclude with an outlook on future iterations of the video annotator based on the experts’ feedback.
Classification of the Video Type and Device Used in 360-Degree Videos from the Trajectories of its Viewers’ Orientations with LSTM Neural Network Models
360-degree videos are consumed on diverse devices: some with immersive interfaces, such as Virtual Reality headsets, and some with non-immersive interfaces, such as a computer with a pointing device or mobile devices with touchscreens. In prior work, we found significant differences in user behavior across these devices. From a dataset of the trajectories of users’ head orientations in 775 video reproductions, we classify which kind of video was played (two values) and which of the four possible devices was used to play it. We found that recurrent neural network models based on LSTM layers are able to classify the video type and the playback device with an average accuracy of over 90% from only four seconds of trajectory. We are convinced that this knowledge can improve techniques to predict future viewports in viewport-adaptive streaming when diverse devices are used.
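The core idea, classifying a device from a short head-orientation trajectory with an LSTM, can be sketched as follows. This is a minimal illustrative model only: the abstract does not specify layer sizes, sampling rate, or features, so the 10 Hz yaw/pitch input, hidden size, and random weights here are all assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_forward(seq, params):
    """Run a single-layer LSTM over seq (T, d) and return the final hidden state."""
    Wx, Wh, b = params                       # gate weights stacked: i, f, g, o
    h_dim = Wh.shape[0]
    h = np.zeros(h_dim)
    c = np.zeros(h_dim)
    for x in seq:
        z = Wx.T @ x + Wh.T @ h + b          # all four gates at once, shape (4*h_dim,)
        i, f, g, o = np.split(z, 4)
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        c = f * c + i * np.tanh(g)           # cell state update
        h = o * np.tanh(c)                   # hidden state update
    return h

# Assumed setup: 4 s of yaw/pitch samples at 10 Hz -> one (40, 2) trajectory.
rng = np.random.default_rng(0)
h_dim, d = 8, 2
params = (rng.normal(size=(d, 4 * h_dim)) * 0.1,
          rng.normal(size=(h_dim, 4 * h_dim)) * 0.1,
          np.zeros(4 * h_dim))
W_out = rng.normal(size=(h_dim, 4)) * 0.1    # linear readout over 4 device classes

traj = rng.normal(size=(40, d))              # stand-in for one viewer's trajectory
h_last = lstm_forward(traj, params)
predicted_device = int(np.argmax(h_last @ W_out))
```

In practice such a model would be trained on the labeled trajectories (e.g. with a deep learning framework); the sketch only shows how a fixed-length orientation window maps to a class score through the recurrent state.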
Virtual reality (VR) has created a new and rich medium for people to meet each other digitally. In VR, people can choose from a broad range of representations. In several cases, it is important to provide users with avatars that are a lifelike representation of themselves, to increase the user experience and effectiveness of communication. In this work, we propose a pipeline for generating a realistic and expressive avatar from a single reference image. The pipeline consists of a blendshape-based avatar combined with two deep learning improvements. The first improvement module runs offline and improves the texture map of the base avatar. The second module runs inference in real-time at the rendering stage and performs a style transfer to the avatar’s eyes. The deep learning modules effectively improve the visual representation of the avatar and show how AI techniques can be integrated with traditional animation methods to generate realistic human avatars for social VR.
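The blendshape base of such an avatar pipeline can be illustrated with a minimal sketch: the face is a neutral mesh plus weighted per-expression vertex deltas. The mesh size, the "smile"/"blink" shape names, and the weights below are all hypothetical; the paper's actual rig and deep learning modules are not reproduced here.

```python
import numpy as np

# Illustrative neutral mesh of 5 vertices (x, y, z) and two hypothetical blendshapes.
neutral = np.zeros((5, 3))
smile_delta = np.array([[0.0, 0.2, 0.0]] * 5)   # assumed "smile" displacement
blink_delta = np.array([[0.0, -0.1, 0.0]] * 5)  # assumed "blink" displacement

def apply_blendshapes(base, deltas, weights):
    """vertices = base + sum_i clamp(w_i) * delta_i, with weights clamped to [0, 1]."""
    out = base.copy()
    for delta, w in zip(deltas, weights):
        out += np.clip(w, 0.0, 1.0) * delta
    return out

# One animation frame: half a smile, a quarter of a blink.
frame = apply_blendshapes(neutral, [smile_delta, blink_delta], [0.5, 0.25])
```

The deep learning modules described in the abstract would then refine this classic representation, one improving the texture map offline and one restyling the eyes at render time.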
SESSION: Work-in-Progress – Personalized Entertainment
”I want to be independent. I want to make informed choices.”: An Exploratory Interview Study of the Effects of Personalisation of Digital Media Services on the Fulfilment of Human Values
From the landing page of a shopping website to a tailored layout on a video streaming app, digital media experiences are becoming increasingly personalised, and no two of us have the same experience. We report on a series of in-depth interviews with UK media users from 19 to 68 years old, exploring their awareness, feelings, expectations and concerns about digital media being personalised ’for them’, and the language they use when talking about it. Our repeatable, extensible methodology develops insights aligned to a framework of fundamental human values.
Supporting the Creation and Evaluation of Interactive Touch Kiosks Designed for the Elderly: A Compilation of Requisites Acknowledging Physical and Psycho-sociological Age-related Changes
Aiming at cognitive stimulation and physical exercise, interactive touch kiosks designed for the elderly seem to be promising options to promote active aging, ensuring e-health and well-being services. They need to be created and improved according to the elderly population’s real needs; however, recommendations for developing these solutions are scattered across several guidelines and standards. In this study, standards regarding physical and psycho-sociological age-related changes, such as vision, hearing, cognition, communication, and gross and fine motor skills, were gathered; physical and social factors were also considered. A total of 107 items were found, and the following categories were defined: Terminals, Interface, Content, and Other. The proposal can be used as: a) a list to guide the creation of services and systems; b) a grid to be colour-coded according to the severity of problems found during usability evaluations. This is a contribution to experts, who can easily recognize the items that need to be improved in services and systems to better support the experience of elderly users.
Recently, the data management model using a personal data store (PDS), a mechanism for users to store and manage their personal data, has been discussed in response to stricter privacy protection worldwide. In this regard, we developed a data-driven personalization method for broadcasting services. Various personal data, such as program viewing history and Internet service usage history, are stored centrally in the PDS on the user’s side. This personal data can be utilized under user control when using various services, allowing cross-industry service collaboration while maintaining a high level of transparency for the user. This enables users to use broadcasting services more widely and conveniently by linking them with various Internet services. In this study, we developed a prototype system that implements end-to-end components from acquisition to utilization of broadcast program viewing history. The system consists of a set of functions that acquires viewing history from broadcast and Internet streaming, stores it in the PDS, and uses it in applications. The PDS implementation utilizes open-source software based on web standards to facilitate data linkage with a variety of Internet services. As an effective example for system evaluation, we designed and prototyped on-demand video viewing services in which separate applications of different service providers are linked via the PDS.
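The key property of the PDS model, that viewing history stays on the user's side and is only disclosed to providers the user has approved, can be sketched as follows. The record schema, the consent list, and the provider name are all assumptions for illustration, not the prototype's actual data model.

```python
from dataclasses import dataclass, field, asdict

@dataclass
class ViewingRecord:
    """Illustrative user-side viewing-history entry (fields are assumed, not the paper's schema)."""
    program_id: str
    source: str                                       # e.g. "broadcast" or "internet_streaming"
    watched_seconds: int
    shared_with: list = field(default_factory=list)   # providers the user consented to

pds_store = []                                        # stand-in for the user-side PDS

def add_record(record):
    pds_store.append(record)

def records_for(provider):
    """A service provider only sees records the user explicitly shared with it."""
    return [asdict(r) for r in pds_store if provider in r.shared_with]

add_record(ViewingRecord("news-0415", "broadcast", 1800, shared_with=["vod_app"]))
add_record(ViewingRecord("drama-07", "internet_streaming", 2400))  # private by default
visible = records_for("vod_app")
```

The point of the sketch is the control flow: data linkage between the broadcaster and an Internet service happens only through the user-controlled store, never provider-to-provider.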
Podcasts as a media format have become increasingly popular in recent years. The ease of access to this format has contributed to its success. However, the creation of podcasts requires specific hardware and software for recording and editing. Some platforms have emerged that aim to ease this creation process, namely by introducing Text-to-Speech (TTS) technologies that remove the need for capturing and editing voice, reducing the effort necessary to produce this format; yet no platform offers TTS in European Portuguese while retaining the scope of a “podcast creation platform”. With these limitations in mind, we present the proposal of an all-in-one podcast creation platform with TTS technology in European Portuguese. The paper describes user experience (UX) testing of the platform using three methodologies: Self-Assessment Manikin (SAM), System Usability Scale (SUS) and AttrakDiff, with promising results regarding its usability and desirability.
Movies are one of the most important and impactful forms of entertainment and a powerful vehicle for culture and education, due to their cognitive and emotional impact on viewers, and technology has been making them more accessible through pervasive services and devices. The huge number of movies we can access, and the important role emotions play in our lives, make the ability to access, visualize and search movies based on their emotional impact ever more pertinent. In this paper, we characterize the challenges and approaches in this scenario, and present interactive means to visualize and search movies based on their dominant and actual emotional impact along the movie, with different models and modalities: in particular, through emotional highlights and trajectories, the user’s emotional state, or a piece of music being played. Music contributes greatly to the emotional impact of movies, and it can also be a trigger that draws us into one of them in serendipitous moments.
Scenario-based Exploration of Integrating Radar Sensing into Everyday Objects for Free-Hand Television Control
We address gesture input for TV control, for which we examine mid-air free-hand interactions that can be detected via radar sensing. We adopt a scenario-based design approach to explore possible locations in the living room where radar sensors could be integrated, e.g., in the TV set, the couch armrest, or the user’s smartphone, and we contribute a four-level taxonomy of locations relative to the TV set, the user, personal robot assistants, and the living room environment, respectively. We also present preliminary results about an interactive system using a 15-antenna ultra-wideband 3D radar, for which we implemented a dictionary of six directional swipe gestures for the control of dichotomous TV system functions.
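A dictionary of six directional swipes mapped to dichotomous TV functions can be sketched as nearest-direction matching: each gesture class is a prototype direction, and a detected hand displacement is assigned to the prototype it aligns with best. The six axis directions and the command labels below are illustrative assumptions, not the paper's actual gesture set or radar pipeline.

```python
import numpy as np

# Hypothetical prototypes: unit vectors along left/right, up/down, toward/away.
PROTOTYPES = {
    "volume_up":    np.array([0.0,  1.0,  0.0]),
    "volume_down":  np.array([0.0, -1.0,  0.0]),
    "channel_next": np.array([1.0,  0.0,  0.0]),
    "channel_prev": np.array([-1.0, 0.0,  0.0]),
    "power_on":     np.array([0.0,  0.0,  1.0]),
    "power_off":    np.array([0.0,  0.0, -1.0]),
}

def classify_swipe(displacement):
    """Assign the command whose prototype direction best matches the hand motion."""
    v = displacement / (np.linalg.norm(displacement) + 1e-9)   # normalize direction
    return max(PROTOTYPES, key=lambda name: float(v @ PROTOTYPES[name]))

# A mostly-upward hand displacement, e.g. estimated from radar point tracks.
command = classify_swipe(np.array([0.1, 0.9, 0.05]))
```

A real radar pipeline would first extract the hand's motion track from the 3D point data; the sketch only covers the final direction-to-command step.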
The increase in the consumption of digital formats has, in many cases, penalized traditional media companies. In the adaptation to digital, the transformation of written news into audio formats, which guarantee spatio-temporal flexibility in their consumption, is one of the differentiating options. Artificial intelligence tools can help accelerate and automate these digitalization processes. The objective of this paper is therefore to evaluate the integration of Text-to-Speech (TTS) technology in the process of creating news podcasts. The study comprised two surveys: the first to validate TTS services in European Portuguese, and the second to validate three models of news podcasts containing human voice, synthesized voice via TTS, and a hybrid model with both TTS and human voice. The results point to a general acceptance of the integration of TTS-generated voices in news podcasts without prejudice to the consumer experience.
Party Mascot is an experimental design for a dynamic, interactive prop used in “actual play” streaming. Taking the form of a talking mechanical bird, the Party Mascot extends audience participation on the Twitch platform from its native chat interface to the physical playspace. Building on a critical review of frame analytical approaches to role-playing game studies and supported by an ethnographic study of actual play performers, the Party Mascot is designed to “flicker” between social, gameplay, and fictional frames of interaction. It can accommodate any number of participants and adapts to multiple roles within new mediated performance contexts. Shifting spectatorship from the screen to the physical world, the Party Mascot can reconfigure audience/performer relationships, open new avenues for game design, and engage the genre of actual play as a new site of experimentation and innovation between the producers and consumers of media.
SESSION: Work-in-Progress – Physiology and User’s Perception
Deeply immersive experiences are intrinsically rewarding; evoking them for another is a cornerstone of success in artistic or design practice. At the same time, modern interfaces have created a state of ’continuous partial attention’, and frequent self-interruption is more common than ever. In this paper, we propose a smart-glasses-based interaction to quantify self-interruption dynamics in naturalistic settings, in which a slowly changing peripheral LED is monitored as a secondary task by the user. We demonstrate that this interaction captures useful information about a user’s state of engagement in real-world conditions. These data can provide designers and artists novel, objective insight into the depth of immersive experience evoked in real-world settings.
Professional theatre actors are highly specialized in controlling their own expressive behaviour and non-verbal emotional expressiveness, so they are of particular interest in fields of study such as affective computing. We present Acting Emotions, an experimental protocol to investigate the physiological correlates of emotional valence and arousal in professional theatre actors. Ultimately, our protocol examines the physiological agreement of valence and arousal amongst several actors. Our main contribution lies in the open selection of the emotional set by the participants, based on a set of four categorical emotions, which are self-assessed at the end of each experiment. The experiment protocol was validated by analyzing the inter-rater agreement (> 0.261 arousal, > 0.560 valence) and the continuous annotation trajectories, and by comparing the box plots for different emotion categories. Results show that the participants successfully induced the expected emotion set, with statistically significant distinctions between valence and arousal distributions.
The Stroop Colour-Word task has been widely used as a cognitive task, and computerised and Virtual Reality versions of it are in common use. The emotional version, called the Emotional Stroop Colour-Word task, is commonly used to induce certain emotions in a person. We are developing an application that brings the Emotional Stroop Colour-Word task into Virtual Reality. The aim of this application is to elicit different stress levels in the user and to record the associated brain, heart and skin activity using wearable sensors. It is an immersive application that includes a tutorial, artificial-intelligence-generated audio instructions, and a logging system for user activity.