Long before characters spoke on screen, cinema audiences were guided by voices from outside the picture. Before the invention of synchronized sound, the voice was already a tool of storytelling, used by live musicians and stage actors to provide context for the images. The craft developed further through the radio dramas of the early twentieth century, when the BBC began producing serialized stories that relied entirely on the imagination of the listener. The cast of the Sierra Leonean radio soap opera Atunda Ayenda shows that this tradition continues to thrive, and that a voice can build an entire world without a single visual cue. Voice acting is the art of performing a character or providing information to an audience with one's voice, a craft that has expanded from simple announcements to complex emotional performances.
The Character Behind The Mask
The voices for animated characters are provided by voice actors, who must create a personality without the aid of their own facial expressions or body language. In live-action productions, voice acting often involves reading the parts of computer programs, radio dispatchers, or other characters who never appear on screen but can still be important to the narrative. Producers and agencies look for many styles of voice, such as booming voices for dramatic productions or cute, young-sounding voices for trendier markets. Other voices sound like natural, everyday people. All of these have their place in the voice-over world, provided they are used in the right context. The role of a voice actor may involve singing, most often when playing a fictional character, although a separate performer is sometimes enlisted as the character's singing voice. A voice actor may also undertake motion-capture acting at the same time, combining physical performance with vocal performance to create a digital character.
The Narrator's Burden
In the context of voice acting, narration is the use of spoken commentary to convey a story to an audience, serving as the bridge between the creator's vision and the listener's imagination. A narrator is a personal character or a non-personal voice that the creator of the story develops to deliver information about the plot to the audience. The voice actor who plays the narrator is responsible for performing the scripted lines assigned to them, often carrying the entire emotional arc of a documentary or audiobook. In traditional literary narratives such as novels, short stories, and memoirs, narration is a required story element; in other types of narratives such as plays, television shows, video games, and films, narration is optional but often essential for clarity. The voice actor must master the rhythm and tone of the text to ensure the story flows naturally, transforming written words into a living, breathing experience for the audience.
One of the most common uses of voice acting is commercial advertising, where the voice actor is hired to voice a message associated with the advertisement. Advertising has several sub-genres, such as television, radio, film, and online advertising, each with its own style and purpose. For example, television commercials tend to be voiced with a narrow, flat inflection pattern, whereas radio commercials, especially local ones, tend to be voiced with a very wide inflection pattern in an almost over-the-top style. The market is large: total advertising spend in the UK was forecast to be £21.8 billion in 2017. Commercial voice-over was traditionally the only area of voice acting where de-breathing was used, that is, artificially removing breaths from the recording so that nothing distracts the audience from the commercial message. Removal of breaths has since become increasingly common in many other types of voice acting, altering the natural sound of the human voice to fit the demands of modern media.
The Global Dub
Dub localization is the practice of translating and re-voicing a foreign-language film or television series, in which voice actors replace the original dialogue to make the work accessible to a new audience. Voice-over translation is a different audiovisual translation technique: unlike in dub localization, the actors' voices are recorded over the original audio track, which can still be heard in the background. This method is most often used in documentaries and news reports to translate the words of foreign-language interviewees. In Japan, voice actors' work includes performing roles in anime, audio dramas, and video games, and performing voice-overs for dubs of non-Japanese movies. Japan has approximately 130 voice acting schools, as well as troupes of voice actors who usually work for a specific broadcast company or talent agency. Voice actors often attract their own fans, who watch shows specifically to hear their favorite performer. Many Japanese voice actors branch into music, often singing the opening or closing themes of shows in which they star, or become involved in non-animated side projects such as audio dramas or image songs.
The Silent Revolution
Automated dialogue replacement (ADR), also known as looping or a looping session, is the process of re-recording dialogue by the original actor after filming to improve audio quality or reflect dialogue changes. ADR is also used to change original lines recorded on set to clarify context, improve diction or timing, or to replace an accented vocal performance. In the UK, it is also called post-synchronization or post-sync. Automated announcements are another form of voice acting, in which voice artists record the individual sample fragments that a computer plays back to form an announcement. At its simplest, each recording consists of a short phrase that is played back when necessary, such as the "mind the gap" announcement introduced on the London Underground in 1969, which is currently voiced by Emma Clarke. In a more complicated system, such as a speaking clock, the announcement is reassembled from fragments such as "minutes past", "eighteen", and "p.m.", and a single fragment can serve several announcements. For example, the word "twelve" can be used for both "Twelve O'Clock" and "Six Twelve".
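The reassembly of an announcement from shared fragments can be illustrated with a minimal Python sketch. It is not how any particular speaking clock is built: the fragment names, the "oh" fragment for single-digit minutes, and the functions number_fragments and time_fragments are all hypothetical, and a real system would map each fragment name to a recorded audio clip rather than a string.

```python
# Hypothetical fragment library: each name stands in for one recorded clip.
NUMBER_WORDS = {
    1: "one", 2: "two", 3: "three", 4: "four", 5: "five", 6: "six",
    7: "seven", 8: "eight", 9: "nine", 10: "ten", 11: "eleven",
    12: "twelve", 13: "thirteen", 14: "fourteen", 15: "fifteen",
    16: "sixteen", 17: "seventeen", 18: "eighteen", 19: "nineteen",
    20: "twenty", 30: "thirty", 40: "forty", 50: "fifty",
}


def number_fragments(n):
    """Return the fragment names needed to say a number from 1 to 59."""
    if n in NUMBER_WORDS:
        return [NUMBER_WORDS[n]]
    tens, ones = divmod(n, 10)
    return [NUMBER_WORDS[tens * 10], NUMBER_WORDS[ones]]


def time_fragments(hour24, minute):
    """Return the ordered fragment names for a 24-hour time."""
    period = "a.m." if hour24 < 12 else "p.m."
    hour12 = hour24 % 12 or 12  # 0 and 12 are both spoken as "twelve"
    fragments = number_fragments(hour12)
    if minute == 0:
        fragments.append("o'clock")
    else:
        if minute < 10:
            fragments.append("oh")  # assumed convention, e.g. "nine oh five"
        fragments.extend(number_fragments(minute))
    fragments.append(period)
    return fragments


# The same "twelve" fragment serves as an hour and as a minute value.
print(" ".join(time_fragments(12, 0)))
print(" ".join(time_fragments(6, 12)))
```

In this sketch, 23 number recordings plus a handful of connecting fragments cover every minute of the day, which is the economy the fragment approach is meant to provide.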
The Synthetic Voice
Since the late 2010s, software to modify and generate human voices has become more popular. In 2019, AI startup Dessa created a computer-generated voice of Joe Rogan using thousands of hours of audio from his podcast, while video game developer Ubisoft used speech synthesis to give thousands of characters distinct voices in its 2020 game Watch Dogs: Legion. That same year, Google announced its own system for generating human-like speech from text. Most voice actors and others in the entertainment industry have reacted negatively to this development because of the threat it poses to their livelihoods. The 2023 SAG-AFTRA strike included negotiations between the union and Hollywood studios about the regulation of AI, as well as discussions with video game studios about new terms that would protect voice actors who specialize in that field. SAG-AFTRA heralded the deal it struck with AI company Replica Studios as a breakthrough that would give actors more control over licensing their voices and how they may be used. The deal nonetheless received backlash from prominent voice actors such as Steve Blum, Joshua Seth, Veronica Taylor, and Shelby Young, who criticized its lack of protections.
The Deepfake Crisis
AI voices have caused concern because they make it possible to create believable audio deepfakes in which celebrities or other public figures appear to say things they never said. In October 2023, at the start of the British Labour Party's conference in Liverpool, an audio deepfake of Labour leader Keir Starmer was released that falsely portrayed him verbally abusing his staffers and criticizing Liverpool. That same month, an audio deepfake of Slovak politician Michal Šimečka falsely claimed to capture him discussing ways to rig the upcoming election. In January 2024, voters in the New Hampshire Democratic presidential primary received phone calls featuring an AI-generated voice of U.S. President Joe Biden that tried to discourage them from voting. The use of AI voices in video games and animation has also been criticized by voice actors such as Jennifer Hale, David Hayter, Maile Flanagan, and Ned Luke, who argue that the technology undermines the human connection that is the foundation of the profession.