Hariyady Hariyady and Ag Asri Ag Ibrahim and Jason Teo and Ahmad Fuzi Md Ajis and Azhana Ahmad and Fouziah Md Yassin and Carolyn Salimun and Ng Giap Weng (2025) Harmonizing emotion and sound: a novel framework for procedural sound generation based on emotional dynamics. International Journal on Informatics Visualisation, 8. pp. 2479-2488. ISSN 2549-9610
Full text: FULL TEXT.pdf (3MB). Restricted to registered users.
Abstract
The present work proposes SONEEG, a novel framework for emotion-driven procedural sound generation. The framework merges emotion recognition with dynamic sound synthesis to enhance user engagement in interactive digital environments. It uses physiological and emotional data to generate emotion-adaptive sound, leveraging datasets such as DREAMER and EMOPIA. The primary innovation of the framework is its ability to capture emotions dynamically and map them onto a circumplex model of valence and arousal for precise classification. A Transformer-based architecture then synthesizes sound sequences conditioned on this emotional information. In addition, the framework incorporates a procedural audio generation module that combines granular synthesis, wavetable synthesis, and physical modeling with machine learning approaches to produce adaptive, personalized soundscapes. A user study with 64 subjects evaluated the framework through subjective ratings of sound quality and emotional fidelity. Analysis revealed differences in sound quality among samples, with some receiving consistently high ratings and others mixed reviews. The emotion recognition model reached 70.3% overall accuracy; it distinguished high-arousal emotions well but struggled to separate emotions of similar arousal. The framework can be applied in fields such as healthcare, education, entertainment, and marketing, where real-time emotion recognition can deliver personalized adaptive experiences. Future work includes incorporating multimodal emotion recognition and further exploiting physiological data to better understand users' emotions.
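The abstract's central idea, mapping recognized emotions onto a circumplex model of valence and arousal, can be illustrated with a minimal sketch. The quadrant labels and the assumption of a continuous valence-arousal representation in [-1, 1] are illustrative and not taken from the paper itself.

```python
# A minimal sketch of circumplex-model emotion classification.
# Assumption: valence and arousal are continuous scores in [-1, 1];
# the quadrant labels below are illustrative, not the SONEEG label set.

def classify_emotion(valence: float, arousal: float) -> str:
    """Map a (valence, arousal) pair to a coarse circumplex quadrant."""
    if valence >= 0 and arousal >= 0:
        return "happy/excited"   # high valence, high arousal
    if valence < 0 and arousal >= 0:
        return "angry/afraid"    # low valence, high arousal
    if valence < 0 and arousal < 0:
        return "sad/bored"       # low valence, low arousal
    return "calm/content"        # high valence, low arousal

# Example: a mildly positive, strongly aroused reading.
print(classify_emotion(0.4, 0.7))  # -> "happy/excited"
```

A quadrant scheme like this also makes the reported failure mode plausible: emotions in the same arousal half-plane differ only along valence, a harder boundary to learn than the arousal axis.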
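Of the procedural techniques the abstract names, granular synthesis is the simplest to sketch. The following is a minimal NumPy illustration of the general technique, not the paper's implementation; grain size, density, and the random scattering policy are all assumptions.

```python
# A minimal granular-synthesis sketch: scatter short Hann-windowed
# grains copied from a source buffer into an output buffer.
# All parameters here are illustrative assumptions.
import numpy as np

def granular(source: np.ndarray, sr: int, out_secs: float,
             grain_ms: float = 50.0, density: float = 20.0,
             seed: int = 0) -> np.ndarray:
    """Build an `out_secs`-long texture from random grains of `source`."""
    rng = np.random.default_rng(seed)
    grain_len = int(sr * grain_ms / 1000)
    window = np.hanning(grain_len)           # fade each grain in and out
    out = np.zeros(int(sr * out_secs))
    for _ in range(int(density * out_secs)):  # `density` grains per second
        start = rng.integers(0, len(source) - grain_len)  # random read point
        grain = source[start:start + grain_len] * window
        pos = rng.integers(0, len(out) - grain_len)       # random write point
        out[pos:pos + grain_len] += grain                 # overlap-add
    return out / max(1.0, np.max(np.abs(out)))            # normalize peak

# Usage: granulate one second of a 220 Hz sine into a 2-second texture.
sr = 22050
t = np.arange(sr) / sr
texture = granular(np.sin(2 * np.pi * 220 * t), sr, out_secs=2.0)
```

In an emotion-adaptive setting, parameters such as grain density or pitch could plausibly be driven by the arousal score, though the abstract does not specify the mapping.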
Item Type: | Article |
---|---|
Keyword: | Affective computing, emotion recognition, artificial emotional intelligence, procedural sound generation |
Subjects: | Q Science > QA Mathematics > QA1-939 Mathematics > QA71-90 Instruments and machines > QA75.5-76.95 Electronic computers. Computer science; T Technology > T Technology (General) > T1-995 Technology (General) |
Department: | FACULTY > Faculty of Computing and Informatics |
Depositing User: | SITI AZIZAH BINTI IDRIS |
Date Deposited: | 28 Apr 2025 14:54 |
Last Modified: | 28 Apr 2025 14:54 |
URI: | https://eprints.ums.edu.my/id/eprint/43600 |