Immersive audio is an underutilized and often forgotten element for creating great XR experiences. (XR includes virtual reality, augmented reality, mixed reality, merged reality, and 360-degree video.) To understand audio’s potential impact within XR, consider whether you’d rather learn a new language such as Spanish from an XR experience with detailed and realistic 360-degree video but one-dimensional and poorly recorded audio, or simple animated video but lifelike, three-dimensional, and interactive audio.

Most people would choose the latter scenario because it’s more likely to create a feeling of presence, despite the animated visual field. This article explores what immersive audio is and why it’s so important to creating great XR experiences.

What is immersive audio?

Immersive audio is 360-degree audio that mirrors real-world soundscapes within XR experiences. Sounds emanate from all directions and are nonlinear (i.e., interactive). For example, if you see a dog barking to your right in a VR simulation, immersive audio will match the location of the barking sound with the dog’s visual location even as you move around in the virtual space. Designers can also use nonlinear audio cues, such as voices, music, or even a barking dog, to direct a visitor’s attention to points of interest. This helps visitors focus, stay engaged, and navigate through an XR experience. Ultimately, immersive audio considerably increases the sense of presence in XR experiences (the feeling of really being there), which, in turn, enhances learning.
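To make the barking-dog example concrete, here is a minimal sketch of the geometry involved: working out where a sound source sits relative to the direction a visitor is facing. The function name, the flat two-dimensional coordinates, and the sign convention (positive angles to the left, negative to the right) are assumptions chosen for illustration, not part of any particular XR toolkit.

```python
import math

def relative_azimuth_deg(listener_xy, heading_deg, source_xy):
    """Angle of the source relative to where the listener faces, in degrees.

    Assumed convention for this sketch: heading and bearing are measured
    counterclockwise from the +x axis, so a positive result means the
    source is to the listener's left and a negative result means right.
    """
    dx = source_xy[0] - listener_xy[0]
    dy = source_xy[1] - listener_xy[1]
    bearing = math.degrees(math.atan2(dy, dx))
    # Wrap the difference into the range [-180, 180).
    return (bearing - heading_deg + 180.0) % 360.0 - 180.0

dog = (1.0, 0.0)  # hypothetical position of the barking dog

# Visitor stands at the origin facing "north" (+y): the dog is to the right.
print(relative_azimuth_deg((0.0, 0.0), 90.0, dog))

# Visitor walks past the dog, still facing north: the dog is now to the left.
print(relative_azimuth_deg((2.0, 0.0), 90.0, dog))
```

An audio engine would recompute an angle like this continuously and use it to decide how the bark should sound in each ear, which is what keeps the sound attached to the dog as the visitor moves.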

Binaural audio and Ambisonics are common types of immersive audio recordings. Binaural audio captures sounds similar to how we hear in the real world, where sounds reach one ear slightly before the other. However, it is not responsive to visitor input. Ambisonics, originally developed in the 1970s, captures full 360-degree spherical sound (i.e., front to back, left to right, and top to bottom) and responds to visitor movement and rotation.
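As a rough illustration of why Ambisonics can respond to head rotation, the sketch below encodes a single mono sample into first-order Ambisonic channels and then counter-rotates the sound field when the visitor turns their head. It assumes the traditional B-format convention (channels W, X, Y, Z with W scaled by 1/√2, azimuth measured counterclockwise so that negative angles are to the right); the function names are hypothetical, and real tools may use different channel orderings and normalizations.

```python
import math

def encode_first_order(sample, azimuth_deg, elevation_deg=0.0):
    """Encode one mono sample into first-order B-format (W, X, Y, Z).

    Assumes the traditional convention with W scaled by 1/sqrt(2).
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2)
    x = sample * math.cos(az) * math.cos(el)  # front-back
    y = sample * math.sin(az) * math.cos(el)  # left-right
    z = sample * math.sin(el)                 # up-down
    return (w, x, y, z)

def compensate_head_yaw(bformat, head_yaw_deg):
    """Rotate the sound field opposite to the listener's head yaw.

    W (omnidirectional) and Z (vertical) are unchanged by a turn about
    the vertical axis; only X and Y mix.
    """
    w, x, y, z = bformat
    t = math.radians(head_yaw_deg)
    x_rot = x * math.cos(t) + y * math.sin(t)
    y_rot = -x * math.sin(t) + y * math.cos(t)
    return (w, x_rot, y_rot, z)

# A dog barking directly to the right (azimuth -90 degrees).
dog = encode_first_order(1.0, -90.0)

# The visitor turns 90 degrees to the right to face the dog.
facing_dog = compensate_head_yaw(dog, -90.0)
print(facing_dog)  # the energy moves from the Y channel to the X (front) channel
```

Because the whole sound field is stored rather than a fixed left and right ear signal, this rotation can be applied at playback time. A binaural recording has the listener's orientation baked in, which is why it does not respond to visitor input.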

Immersive audio uses microphones (e.g., Ambisonic, binaural, or omnidirectional single-channel) placed on people, objects, or cameras to capture and pinpoint voices, the ambient environment (i.e., background), and/or other close and distant sound sources.

Creating immersive audio for XR

Recording, mixing, editing, and experiencing audio for XR is (relatively) new and uncharted territory. Options for recording immersive audio can be expensive and somewhat limited. Much of the recording technology is do-it-yourself or experiment-as-you-go. For example, you can experiment with omnidirectional lavalier or Ambisonic microphones in new ways, choosing the combinations that best meet your budget or space constraints.

Similarly, you may need to edit immersive audio using software that is not specifically intended for XR experiences. For example, you could use game engine software such as Unity or Unreal Engine 4, or you could use sound and video software such as Adobe Premiere Pro, G’Audio Works and Craft, and Steam Audio. You may need to test multiple options to see what works best for your own XR experiences and level of technical expertise.

Note that while you can, in principle, record any number of audio channels, anything beyond third-order Ambisonics, or 16 channels, doesn’t improve the perceived sound quality. Sound quality is also limited by the output hardware that visitors will use for most XR scenarios, which is likely a normal pair of headphones, or two channels.
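The 16-channel figure follows from how full-sphere Ambisonics scales: a recording of order N uses (N + 1)² channels. The short sketch below, with a hypothetical helper name, shows how quickly the channel count grows with order.

```python
def ambisonic_channel_count(order: int) -> int:
    """Number of channels in a full-sphere Ambisonic signal of a given order."""
    if order < 0:
        raise ValueError("order must be non-negative")
    return (order + 1) ** 2

# First order needs 4 channels; third order needs the 16 mentioned above.
for order in range(1, 5):
    print(order, ambisonic_channel_count(order))
```

Each step up in order adds storage and processing cost, which is worth weighing against the fact that most visitors will ultimately listen on two-channel headphones.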

A tool worth remembering

Research conducted by Harold Pashler et al. has shown that students are not primarily auditory, visual, or kinesthetic learners, as is commonly believed, but a unique blend of each. Their research has concluded that our “primary focus should be on identifying and introducing the experiences, activities, and challenges that enhance everybody’s learning.” As L&D professionals, we can increase learning retention, engagement, and accessibility for more people by creating XR experiences with a combination of high-quality immersive audio, visuals, haptics, and other multi-sensory elements.

Immersive audio is a valuable tool worth remembering when creating XR experiences. While not every XR experience needs professionally produced, 16-channel Ambisonic audio tracks from start to finish, many training simulations and other XR learning experiences could certainly benefit from higher-quality immersive audio. Paying just a little more attention to the audio design in your XR experiences could dramatically increase the sense of presence. And the potential impact of presence on learning is a big part of why we create XR experiences in the first place.

Reference

Pashler, Harold, Mark McDaniel, Doug Rohrer, and Robert Bjork. “Learning Styles: Concepts and Evidence.” Psychological Science in the Public Interest, A Journal of the Association for Psychological Science, Vol. 9, No. 3. December 2008. https://www.psychologicalscience.org/journals/pspi/PSPI_9_3.pdf