Qinke Di is a teaching assistant at Guangdong Songshan Polytechnic in Shaoguan, China. She holds a Bachelor's degree (with First Class Honours) in Cinema and Television from UIC and a Master's degree (with distinction) from the University of Warwick. She has a long-standing interest in digital media and the cultural industries. Her recent research focuses on digital platforms, labour, and gender issues. Her ongoing project investigates female live-streamers' cultural production and labour practices in rural China.
Siyi Li holds a Master's degree in Media Communication from Hong Kong Baptist University. She specializes in social media and the metaverse, having authored several articles on these topics. Her expertise lies in exploring the impact and potential of emerging virtual environments on communication and interaction. ORCID 0009-0008-0530-4667.
ERICA Yuhan Song is an accomplished director and producer in the CG industry, holding a master's degree in CGA from the New York Institute of Technology. She leads a team that provides solutions for clients' varied visual needs. ERICA's personal virtual reality project "Qing Dynasty Costume Exhibition", created in collaboration with the Palace Museum, has been invited to New York Tech Week and other events. Her works combine creativity and technology, bringing unique visual experiences to audiences. ORCID 0009-0004-3761-7263.
This study examines the critical importance of audio design in crafting immersive experiences within the metaverse. The analysis focuses on key technologies driving this field, with particular emphasis on spatial audio techniques such as Head-Related Transfer Functions (HRTF) and Ambisonics, which enable precise three-dimensional sound positioning.
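To make the positioning principle concrete, the core of first-order Ambisonics is a simple encoding of a mono signal into four B-format channels (W, X, Y, Z) from a source direction. The sketch below assumes the traditional FuMa B-format convention (W scaled by 1/√2); it is an illustration of the technique, not code from any system discussed in the study.

```python
import math

def encode_foa(sample: float, azimuth_deg: float, elevation_deg: float):
    """Encode one mono sample into first-order Ambisonics B-format
    (FuMa convention): W omnidirectional, X front-back, Y left-right,
    Z up-down."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample * (1.0 / math.sqrt(2.0))        # omnidirectional component
    x = sample * math.cos(az) * math.cos(el)   # front-back axis
    y = sample * math.sin(az) * math.cos(el)   # left-right axis
    z = sample * math.sin(el)                  # vertical axis
    return w, x, y, z

# A source straight ahead (azimuth 0, elevation 0) excites only W and X.
w, x, y, z = encode_foa(1.0, 0.0, 0.0)
```

Decoding these channels to a loudspeaker array or, via HRTF filtering, to headphones is what yields the three-dimensional positioning described above.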
The research investigates adaptive audio rendering, highlighting tools such as Microsoft's Spatial Audio Unity Plugin, which facilitates dynamic soundscape adjustments based on user interactions. Furthermore, the application of artificial intelligence in audio design is explored, with a discussion on the potential of Generative Adversarial Networks (GANs) for synthetic sound production and personalized audio experiences.
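A minimal sketch of the kind of dynamic adjustment such tools perform is distance-based attenuation: as the user moves relative to a sound source, gain is recomputed each frame. The inverse-distance-clamped model below is a common convention in game audio engines; the function name and parameters are illustrative and not part of Microsoft's plugin API.

```python
def distance_gain(distance: float, ref_distance: float = 1.0,
                  rolloff: float = 1.0) -> float:
    """Inverse-distance attenuation, clamped so that a source closer
    than the reference distance is not boosted above unity gain."""
    d = max(distance, ref_distance)  # clamp: no amplification up close
    return ref_distance / (ref_distance + rolloff * (d - ref_distance))

# At the reference distance the source plays at full gain; at twice
# that distance (rolloff = 1.0) the gain halves.
near = distance_gain(1.0)   # 1.0
far = distance_gain(2.0)    # 0.5
```

In a real engine this per-frame gain would be combined with spatialization filters and occlusion effects to keep the soundscape responsive to user interaction.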
The study presents two significant case studies: the partnership between TCG World, STYNGR, and Downtown for interactive sonic environments, and research on Audio Augmented Reality (AAR) in art galleries. Ethical considerations, including privacy, accessibility, and the psychological impact of immersive audio, are critically examined.
The research concludes by exploring future directions, such as cross-modal integration and emotional AI systems in metaverse audio design, emphasizing the necessity for responsible development practices. Through this comprehensive analysis, the study aims to provide insights into the challenges and opportunities presented by audio design in virtual environments, contributing to the evolving landscape of metaverse technology.
Begault D.R., 3-D sound for virtual reality and multimedia, NASA, Cambridge 2000.
Bosman I.D.V., Buruk O., Jørgensen K., Hamari J., The effect of audio on the experience in virtual reality: a scoping review, in «Behaviour & Information Technology», n. 42, 2, 2023, pp. 165-199.
Chen Y., Huang L., Gou T., Applications and Advances of Artificial Intelligence in Music Generation: A Review, in «arXiv», 2024, arXiv:2409.03715.
Dam A., Lee Y., Siddiqui A., Lages W.S., Jeon M., Audio augmented reality using sonification to enhance visual art experiences: Lessons learned, in «International Journal of Human-Computer Studies», 2024, https://doi.org/10.1016/j.ijhcs.2024.103329.
Défossez A., Copet J., Synnaeve G., Adi Y., High fidelity neural audio compression, in «arXiv», 2022, arXiv:2210.13438.
Engel J., Agrawal K.K., Chen S., Gulrajani I., Donahue C., Roberts A., GANSynth: Adversarial neural audio synthesis, in «arXiv», 2019, arXiv:1902.08710.
Farinati L., Firth C., The Force of Listening, Errant Bodies Press, Berlin 2017.
Goodfellow I., Pouget-Abadie J., Mirza M., Xu B., Warde-Farley D., Ozair S., Courville A., Bengio Y., Generative adversarial nets, in «Advances in Neural Information Processing Systems», n. 27, 2014.
Goodman S., Sonic Warfare: Sound, Affect, and the Ecology of Fear, The MIT Press, Cambridge 2010.
Herre J., Hilpert J., Kuntz A., Plogsties J., MPEG-H 3D Audio—The New Standard for Coding of Immersive Spatial Audio, in «IEEE Journal of Selected Topics in Signal Processing», n. 9, 5, 2015, pp. 770-779.
Kern A.C., Ellermeier W., Audio in VR: Effects of a soundscape and movement-triggered step sounds on presence, in «Frontiers in Robotics and AI», n. 7, 2020, p. 20.
Larsson P., Väljamäe A., Västfjäll D., Tajadura-Jiménez A., Kleiner M., Auditory-induced presence in mixed reality environments and related technology, in P. Dubois (ed.), The Engineering of Mixed Reality Systems, Springer, London 2010, pp. 143-163.
Lazaro M.J., Lee J., Chun J., Yun M.H., Kim S., Multimodal interaction: Input-output modality combinations for identification tasks in augmented reality, in «International Journal of Human-Computer Studies», n. 167, 2022, p. 102897.
Lee G.W., Lee J.H., Kim S.J., Kim H.K., Directional Audio Rendering Using a Neural Network Based Personalized HRTF, in «INTERSPEECH 2019: Show & Tell Contribution», 2019, pp. 2364-2365, https://www.isca-archive.org/interspeech_2019/lee19b_interspeech.pdf (accessed 20 April 2024).
Nordahl R., Nilsson N.C., The sound of being there: presence and interactive audio in immersive virtual reality, in K. Collins (ed.), The Oxford Handbook of Interactive Audio, Oxford University Press, Oxford 2014, pp. 213-233.
Peters B., Hoban N., Yu J., Xian Z., Improving meeting room acoustic performance through customized sound scattering surfaces, in «Proceedings of the International Symposium on Room Acoustics», Amsterdam, 15-17 September 2019, https://d-nb.info/1212269292/34 (accessed 20 April 2024).
Picone M., Mariani S., Virdis A., Castagnetti P., Digital twin & blockchain: Technology enablers for metaverse computing, in «IEEE», 2023, https://ieeexplore.ieee.org/document/10271919 (accessed 20 April 2024).
Serafin S., Geronazzo M., Erkut C., Nilsson N.C., Nordahl R., Sonic interactions in virtual reality: state of the art, current challenges, and future directions, in «IEEE Computer Graphics and Applications», n. 38, 2, 2018, pp. 31-43.
Tsingos N., Gallo E., Drettakis G., Perceptual audio rendering of complex virtual environments, in «ACM Transactions on Graphics (TOG)», n. 23, 3, 2004, pp. 249-258.
Wang P., Zhang X., Ai X., Wang S., Modulation of EEG Signals by Visual and Auditory Distractors in Virtual Reality-Based Continuous Performance Tests, in «IEEE Transactions on Human-Machine Systems», n. 53, 6, 2023, pp. 1001-1011.
Xie B., Head-related transfer function and virtual auditory display, J. Ross Publishing, Plantation 2013.
Zotter F., Frank M., Ambisonics: A practical 3D audio theory for recording, studio production, sound reinforcement, and virtual reality, Springer Nature, Cham 2019.