Audio Deep Dive

How To Use the Effects of Audio on Emotions in Your Game

By: Colin Billiau

 

Abstract

This deep dive explores the underlying determinants of why a sound evokes an emotion. The factors that influence a sound's impact may be physical, psychological, spatial, and more. I relate each of these determinants to a situation that might take place in a game, so that when designing your own sounds and sound systems you can refer to them to enhance the sound experience of your game. This topic is incredibly complex, so this paper is not a comprehensive analysis of everything that might influence the effect of a sound on the listener's emotions. However, it will hopefully open your mind to thinking about the psychology around sounds and their impact on people.


Can Audio Affect our Moods and Emotions?

Though it might seem like a silly question, it is important to cover: if audio couldn’t affect our moods and emotions, the basis for this whole deep dive would be moot. With that said, audio absolutely can affect the listener’s moods and emotions (Asutay et al., 2012; Tajadura-Jiménez & Västfjäll, 2008), and it is incredibly effective at doing so. Since we can’t turn off hearing the way we can turn off sight by closing our eyes, we use it as a warning system and as a way to survey changes in our environment (Tajadura-Jiménez, 2008). Paired with visual stimuli, audio can evoke a wide range of emotions. Even without any visuals it can evoke emotion (Schäfer et al., 2015); however, paired with visual and physical stimulation, the evoked emotion can be even more salient. In fact, without audio, films and games can seem distant and unrelatable (Ekman, 2008). It is no surprise, then, that audio can affect our moods and emotions: simply muting your favourite movie or game can dramatically reduce its impact.

So that we are on the same page about what an emotion is, Ekman (2008) lays it out clearly and concisely:

-       Emotion is usually caused by a person (consciously or unconsciously) evaluating an event of some significance in relation to a goal or situation.

-       The core of an emotion is a readiness to act, and emotion influences actions by preparing for certain types of actions and causing a sense of urgency.

-       An emotion is usually experienced as a distinctive type of mental state, sometimes accompanied, or followed by bodily changes, expressions, or actions.

By these definitions, audio causes the listener to react in at least one of these ways, regardless of the intention behind the audio clip.

As far as music is concerned, researchers long disagreed over whether music can induce emotion at all, or at least the basic emotions evoked by everyday activities in the real world, because this did not fit appraisal theory: the theory “that emotions arise, and are distinguished, on the basis of a person’s subjective evaluation of an event on appraisal dimensions such as novelty, urgency, goal congruence, coping potential, and norm compatibility” (Juslin & Västfjäll, 2008). Today it is agreed that music does evoke emotions and physiological responses such as changes in heart rate, skin temperature, and hormone production (Juslin & Västfjäll, 2008). Like all media, audio is capable of inducing emotions that are not "real", in the sense that the stimulus that evokes them comes from a fictional world; however, the emotion is still being evoked in the participant’s mind.

What Qualities of a Sound Contribute to the Evoked Emotion?

Different factors can change how a sound is perceived and how it makes someone feel. Research papers on the subject often call these factors determinants. They can be grouped into five categories: physical, psychological, spatial, cross-modal, and other (Tajadura-Jiménez & Västfjäll, 2008). Of these, physical determinants have been researched the most. Keeping the influence of each determinant on the listener’s emotions in mind is important when designing your audio or game.

Physical determinants

Physical determinants are the measurable components of a sound, such as its loudness, frequency, and tonal or time structure (Tajadura-Jiménez, 2008). These components can be used together to influence how pleasant or arousing the audience finds a sound. However, physical determinants alone cannot capture the full reason for the emotional response. Some authors have argued that only 20% of the annoyance of a sound is caused by its physical determinants (Tajadura-Jiménez, 2008). It is the association of those physical determinants with a physical object that gives the sound more or less arousal or pleasantness. For example, an aperiodic, low-frequency sound that is getting louder may indicate the presence of a large mammalian predator, which will increase arousal in the listener and be unpleasant, as danger may be approaching. This statement by Gibson (1966, as cited in Tajadura-Jiménez, 2008) expresses the complicated nature of the effect of a sound’s physical determinants on the listener.

“The result is said to be a meaningless sensation having pitch corresponding to the frequency, a loudness corresponding to the amplitude, and a certain duration. Meaningful sounds, however, vary in much more elaborate ways than merely in pitch, loudness and duration. Instead of simple duration, they vary in abruptness of beginning and ending, in repetitiveness, in rate, in regularity of rate, or rhythm, and in other subtleties of sequence. Instead of simple pitch, they vary in timbre or tone quality, in combinations of tone quality, and in changes of all these in time… it is just these variables that are distinguished naturally by an auditory system. Moreover, it is just these variables that are specific to the source of sound – the variables that identify the wind in the trees or the rushing of water, the cry of the young or the call of the mother …”

 

Understanding physical determinants is quite important to audio design for games, because with this knowledge you can shape a sound to fit the emotional response you want it to evoke.


Psychological determinants

This determinant depends solely on the listener: it is the emotion the listener associates with the sounds they are hearing. This factor can almost completely override the impact of the physical determinants, as people “emotionally react to ‘objects’ and ‘events’ and not only to ‘sounds’” (Tajadura-Jiménez, 2008). This means that even though a sound might be pleasant at first, if the in-game action associated with it is negative, the player will no longer find the sound pleasant. The same applies to arousal: how arousing a sound is depends on whether the in-game action associated with it is important. Pairing a sound that has arousing physical properties, such as a relatively loud, high-pitched sound with an abrupt start, with an enemy attack may give the player more feedback on when the enemy is attacking, hopefully leading the player to associate that sound with an attack and act accordingly.

Furthermore, “If people feel that a sound could be avoided, it is judged more annoying” (Guski, 1997). I believe this has many implications for game design. A great example is the repetitive sound that plays in some games when you are at low health. These sounds are often designed to be annoying in their physical attributes, but they become even more annoying and arousing because the player can heal to remove the sound and thus avoid it. Designers use this to annoy the player into noticing their low health and doing something about it. By leveraging avoidable noise to draw attention, you can get the player to do what you intended. I would use this sparingly in any game, as it might be overwhelming if too many sounds are played this way. Reserve it for things that are important, such as health.
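
As an illustration of the low-health warning described above, here is a minimal Python sketch. It is not taken from any particular game or engine: the threshold, the beep intervals, and the function name warning_interval are all assumptions chosen for the example. The idea it shows is that the sound only exists while the player can still do something about it, and that it becomes more insistent as health drops.

```python
from typing import Optional

# All three values are assumptions for illustration, not tuned numbers.
LOW_HEALTH_THRESHOLD = 0.30  # warn below 30% health
SLOWEST_INTERVAL_S = 1.2     # seconds between beeps just under the threshold
FASTEST_INTERVAL_S = 0.4     # seconds between beeps at zero health


def warning_interval(health_fraction: float) -> Optional[float]:
    """Return the seconds between warning beeps, or None when no warning plays."""
    health = min(max(health_fraction, 0.0), 1.0)
    if health >= LOW_HEALTH_THRESHOLD:
        # Healing above the threshold is how the player "avoids" the sound.
        return None
    # urgency is 0.0 at the threshold and 1.0 at zero health.
    urgency = 1.0 - health / LOW_HEALTH_THRESHOLD
    return SLOWEST_INTERVAL_S - urgency * (SLOWEST_INTERVAL_S - FASTEST_INTERVAL_S)
```

A game loop would call warning_interval each frame with the current health and schedule the next beep accordingly. Keeping this behaviour to a single cue, such as health, follows the advice above about using avoidable sounds sparingly.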

It has also been shown that listening to self-representation sounds, such as breathing or heartbeats, causes listeners to become more aware of their own bodies (Tajadura-Jiménez & Västfjäll, 2008). The sound of a heartbeat can affect the listener's own heartbeat: rapid beats can be highly arousing, potentially increasing the listener's heart rate.

Sounds affording imaginative immersion help the player identify with the characters and actions in the FPS game. Proprioceptive sounds, like the character's breathing, are especially potent immersive cues particularly if the game engine allows for a change in breathing rate following the speed and exertions of the character. Exteroceptive sounds, such as footsteps, provide similar affordances as they respond directly to player input. (Grimshaw-Aagaard, 2007)

All this to say that adding sounds that help the player better connect with the character they’re playing, such as the examples above, will increase their immersion and, in turn, their emotional availability to the game, as the character may feel more relatable.
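
The breathing-rate idea in the Grimshaw-Aagaard passage can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the resting and maximum rates, the recovery time, and the function names are hypothetical, and a real game would feed the result into whatever audio system it uses.

```python
# Assumed values for illustration only.
REST_BPM = 14.0  # breaths per minute when the character is idle
MAX_BPM = 40.0   # breaths per minute at full exertion


def target_breath_rate(exertion: float) -> float:
    """Map exertion in the range 0..1 to a breathing rate in breaths per minute."""
    exertion = min(max(exertion, 0.0), 1.0)
    return REST_BPM + exertion * (MAX_BPM - REST_BPM)


def step_breath_rate(current_bpm: float, exertion: float, dt: float,
                     recovery_s: float = 4.0) -> float:
    """Move the current rate toward the target so breathing lags behind the action."""
    target = target_breath_rate(exertion)
    blend = min(dt / recovery_s, 1.0)  # fraction of the gap closed this step
    return current_bpm + (target - current_bpm) * blend
```

Calling step_breath_rate once per frame, with dt as the frame time, makes the character keep panting for a few seconds after a sprint ends instead of snapping back to calm breathing. The same shape of mapping could drive a heartbeat sound.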

Spatial determinants

This determinant is quite straightforward, and luckily modern game engines do the heavy lifting. At its simplest it depends on how far or close the source of the sound is, and with 3D sound enabled this is largely handled for you at the flick of a switch. There are two more components to this determinant. The first is whether the sound is static or dynamic, in other words still or moving. Sounds that are moving closer are more likely to have a larger emotional impact than receding or immobile ones (Tajadura-Jiménez, 2008). Moving objects can also tell us their direction, which may be important, further increasing the impact on the listener. The second component is the acoustic space, that is, how the sound bounces around the environment. People are very aware of this and can tell the size of a space from the sound they’re hearing (Tajadura-Jiménez, 2008).

Since people can tell the size of the room by sound relatively well, they’ll be able to pick up on any discrepancy with the size of the room in the game in relation to the sounds they’re hearing. Having a large discrepancy may cause the player to have trouble getting immersed in the world. For a first-person game, you may want to be aware of this fact even more as first-person games are quite reliant on the immersive nature of the viewpoint.
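
To make the distance and movement components above concrete, here is a small Python sketch. It uses a common inverse-distance attenuation model as a stand-in for what an engine's 3D audio would do; the reference distance and the function names are assumptions for the example, not any engine's API.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def distance_gain(distance: float, reference_distance: float = 1.0) -> float:
    """Inverse-distance attenuation: full volume inside the reference distance."""
    return reference_distance / max(distance, reference_distance)


def closing_speed(listener: Vec3, source: Vec3, source_velocity: Vec3) -> float:
    """Positive when the source moves toward the listener, negative when it recedes."""
    to_listener = tuple(l - s for l, s in zip(listener, source))
    dist = math.sqrt(sum(c * c for c in to_listener))
    if dist == 0.0:
        return 0.0  # source is on top of the listener; direction is undefined
    # Project the source velocity onto the direction toward the listener.
    return sum(v * c for v, c in zip(source_velocity, to_listener)) / dist
```

A positive closing_speed identifies the approaching sounds that the research suggests have the larger emotional impact, so a designer could use it to decide which sources deserve extra emphasis in the mix.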

Cross-modal determinants

Cross-modal means the crossing between two modes; in this case, the modes are two different senses. It might seem obvious that what you see and hear combine to influence your emotions, but the claim is nonetheless backed by research: “Multisensory research shows that human sensory systems are interlinked and that information available in one sensory modality influences the percepts from other sensory modalities.” (Tajadura-Jiménez, 2008). Games rely heavily on the player linking visual and auditory cues. An arousing sound paired with arousing visuals is very likely to catch the player's attention. Ensuring that sounds are linked with visuals can greatly increase the effectiveness of both pieces of feedback. In an action game reliant on quick reaction times, a distinct sound that plays at the beginning of an attack will better prepare the player to react, as “the mean [reaction time] to detect visual stimuli is approximately 180–200 milliseconds, whereas for sound it is around 140–160 milliseconds” (Jain et al., 2015).
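
To put the reaction-time figures quoted above into game terms, the following Python sketch converts the difference into frames. The ranges come from the Jain et al. (2015) quote; taking their midpoints, and assuming that these laboratory averages carry over to players in a game, are simplifications made for the example. The function names are hypothetical.

```python
from typing import Tuple

VISUAL_RT_MS = (180.0, 200.0)  # range quoted from Jain et al. (2015)
AUDIO_RT_MS = (140.0, 160.0)   # range quoted from Jain et al. (2015)


def midpoint(bounds: Tuple[float, float]) -> float:
    low, high = bounds
    return (low + high) / 2.0


def head_start_frames(frame_rate_hz: float) -> float:
    """Extra frames of reaction window an audio cue gives over a visual-only cue."""
    advantage_ms = midpoint(VISUAL_RT_MS) - midpoint(AUDIO_RT_MS)
    frame_ms = 1000.0 / frame_rate_hz
    return advantage_ms / frame_ms
```

With these midpoints the audio advantage is about 40 milliseconds, which at 60 frames per second is roughly 2.4 frames. In a game with tight attack windows, that can be the difference between a readable attack and an unfair one.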

There’s also a strong connection between auditory and vibrotactile perception (Tajadura-Jiménez, 2008), which for games means the vibrations from the controller. If you have reservations about adding vibration to your game, I would strongly recommend reconsidering, especially for any action or story-heavy game. The impact of intense moments can be greatly improved by adding vibrations that appropriately represent what’s happening on screen. They don’t even need to be perfectly appropriate, just somewhat relevant. Granted, this advice is not especially useful for PC or strategy games, but it’s still good to know when making action games.

Other Determinants

There are many other determinants of a sound, though many aren’t very pertinent to games. One that is pertinent: our pre-existing emotional state can influence how we judge a sound (Tajadura-Jiménez, 2008). “For instance, the startle reflex, associated with a defensive behavior, may be augmented when evoked during an ongoing negative emotional state” (Tajadura-Jiménez, 2008). By using the other aspects of game design to establish the desired emotional state, we can get the reaction we want out of a piece of audio.

Conclusion

Using your intuition when designing sounds isn’t a bad idea. A lot of game sounds were made using intuition, and I really recommend you continue to do so. However, understanding the underlying reasons why a sound elicits an emotion is still important to keep in mind when designing audio. Taking into consideration the numerous determinants that influence the listener’s reaction to a sound may improve the effectiveness of your sounds and sound systems.

To quickly recap the determinants of a sound's impact on the listener:

The broadest category, and the one you as a designer can influence the most, is the physical determinants: tone, timbre, amplitude (loudness), and time structure (rate, abruptness, and rhythm). However, the physical traits of a sound may account for only around 20% of its impact. Despite that, making the most of that 20% is very important for effective sounds, since physical determinants work regardless of the person's prior experiences. As a designer, you can never know what the player has gone through, so leveraging what is widely universal greatly increases the chances of your sounds doing their job.

Psychological determinants are the listener’s associations with the sound they are hearing. People “emotionally react to ‘objects’ and ‘events’ and not only to ‘sounds’” (Tajadura-Jiménez, 2008). This means that if someone hears footsteps slowly walking down the hall, they might feel little arousal; however, if the footsteps are their boss's and they’re louder than usual, the listener's arousal and stress will be higher.

Spatial determinants are, simply put, the size of the room the sound travels through and how that affects the quality of the sound, along with the movement of the sound in space. Our subconscious is great at telling the size of a room from sound, as well as the direction a sound is coming from and the direction it is travelling. The direction of travel has an impact on the listener’s arousal: moving sounds tend to be more arousing, especially when they are approaching the listener.

Cross-modal determinants are, simply put, the connections between a person’s senses. Games primarily work on the visual and auditory senses, but also on touch when using a controller. These senses are intertwined in our brain and influence each other. Pairing visual and auditory feedback in your game can greatly increase the immersion and effectiveness of both your audio and visuals.

Hopefully, with this in mind, you’ll be more conscious of the audio design in your game. Don’t forget to use intuition as a guide, but also try to ground your design in some scientific findings.

 

Bibliography

Ekman, I. (2008). Psychologically motivated techniques for emotional sound in computer games. Proc. AudioMostly, 20–26.

Grimshaw-Aagaard, M. (2007). Sound and Immersion in the First-Person Shooter. Games Computing and Creative Technologies: Conference Papers (Peer-Reviewed).

Guski, R. (1997). Psychological methods for evaluating sound quality and assessing acoustic information. Acta Acustica United with Acustica, 83(5), 765–774.

Jain, A., Bansal, R., Kumar, A., & Singh, K. (2015). A comparative study of visual and auditory reaction times on the basis of gender and physical activity levels of medical first year students. International Journal of Applied and Basic Medical Research, 5(2), 124–127. https://doi.org/10.4103/2229-516X.157168

Tajadura-Jiménez, A. (2008). Embodied psychoacoustics: Spatial and multisensory determinants of auditory-induced emotion. Chalmers University of Technology Gothenburg.

Tajadura-Jiménez, A., & Västfjäll, D. (2008). Auditory-induced emotion: A neglected channel for communication in human-computer interaction. Affect and Emotion in Human-Computer Interaction: From Theory to Applications, 63–74.