I want to talk a little more about authenticity in voiceover, but from a slightly different angle.

Is less more, or is more more when it comes to voice?

Imagine you were having a conversation with a stranger, but you weren’t allowed to see them. It might feel a bit awkward, wouldn’t it? So many of the cues we pick up from people are visual. This may explain why so many of us dread talking on the phone. Sure, some of us have very emotive voices, but many of us don’t.

I bring this up because when you remove the visual element and voice becomes the main carrier of emotion, suddenly we have more room for vocal expression. We’ve all heard radio commercials with exaggerated performances (for better or worse). In fact, it could be argued that when you take away visual cues, exaggeration is essential to defining the emotional context and creating a coherent narrative. It’s a bit like someone developing their upper body when they’ve lost the use of their legs. And this doesn’t just apply to commercial voiceover or even cartoons. Puppets also need that additional emotional thrust. Puppets (with some exceptions, of course) can’t alter their facial expressions. Their faces tend to be fixed. The funny thing is that we feel like they have facial expressions because their vocal identities are so well defined. The exaggerated vocal performances compensate for the lack of facial cues. They become the main driver of the character’s personality.

Check out Cookie Monster and Sir Ian here. Ian is pretty relaxed, while still maintaining great presence. Cookie Monster, on the other hand, has to turn everything up to eleven just to keep up.

I tend to go through a warm-up exercise with voiceover performers where we exaggerate the prosody. If you’re not familiar with the word, it simply refers to the melodic, dynamic qualities of your voice when you speak. No matter how serious the script, I’ll have them warm up by reading it as if they were announcing a game show or a used car ad. This works well because it’s much easier to go to your limit and then relax than it is to start at zero and gradually bring it up.

What often surprises people about this exercise is that it feels absolutely ridiculous while you’re doing it, but it doesn’t sound nearly as bad when you hear it played back. This is mainly because no one can see you. The listener is getting only half the emotional cues. Now if they could see you, that might be cause for embarrassment.

Pro Tip: Install some curtains over the vocal booth window. You never know when they might make for a better vocal performance.

Now I’d like to turn this around a bit.

Contrary to what we would like to believe, emotional cues are actually not universal. They vary from person to person and from culture to culture. For more on this, check out How Emotions Are Made by Lisa Feldman Barrett. If you’d like something quicker, or if reading isn’t your thing, there’s also her TED Talk.

I bring up Dr. Barrett’s research because we often create media under the assumption that emotional cues are universal when in fact they are often cultural and learned. I’m not saying that we should create media devoid of emotion, nor should we refrain from creating distinctive characters and performances, but we should be cognizant of the other side of heightened vocal performance: stereotypes.

As we’ve established, vocal performances, especially character performances, tend to be exaggerated. For example, when we can’t see that Nonna character, her performance may need to sound a bit more Italian than normal. This dynamic is challenging enough when it comes to ethnicity and nationality. It’s a much bigger deal when we talk about race.

The voice world has never been kind to people of colour. In the early days of radio, black roles were often given to white actors, essentially amounting to vocal blackface. For more on this, check out The Sonic Color Line by Jennifer Lynn Stoever.

To this day, white voices are still seen as the default, while voices of colour are often relegated to niche roles. In addition, black voice performers are often asked to sound ethnic or urban, which is often just code for stereotype. In the last couple of years there has been a big push in the VO industry to cast more voices of colour. Studio Resonate has done a great job getting that ball rolling.

I’m not going to completely unpack this subject here, nor am I the most qualified person to do so. I’d much rather direct you to voices who can. There were some great points made in this discussion hosted by Global Voice Acting Academy. It’s a longer one, but there’s lots of good stuff here. If you don’t have two hours to spare, maybe skip ahead to about 29 minutes. Jazzy Frizzle really drives the point home.

I hope all this shows why authenticity is so important. It goes beyond simple communication, and even connection. Yes, there’s been a big push in recent years for more authentic voices. But authenticity isn’t just fashionable; it’s ethical.

And yes, it is still a performance, but there should also be a certain amount of verisimilitude involved. If you want someone with a certain sound, get someone who naturally sounds that way. The animation world figured this out a long time ago. It’s rare these days to hear “cartoon voices”. Studios now tend to cast people whose voices are naturally distinct.

Should a voice performance be bigger than real life? Often, yes. But it may also be worth developing a sense of when it’s gone too far.