AUGUST 2025
In this interview, we chat to Robert Mann, Voice Lead at Larian Studios and MASA juror, to explore the craft and creativity behind voice design in games. From building VO pipelines and directing actors, to balancing technical workflows with emotional authenticity, Robert shares insights on what it takes to bring characters to life - and what makes a truly memorable performance stand out…
MASA: What initially drew you to working in dialogue and voice design?
ROBERT: I've always been fascinated by how a single vocal nuance can completely shift the emotional tone of a scene. Coming from a film-sound background, I was drawn to the challenge of capturing, shaping and implementing voice performances that feel both natural and dynamic within interactive worlds.
Being in ‘voice’ in audio means we get a lot of access to the recording sessions themselves, and I’ve had the pleasure of directing and working with actors on previous projects, which has always been a joy for me as it feels like you’re at the heart of performance and creativity!
MASA: Love that - you’ve hit on both the artistry and the craft: treating even the smallest vocal shift as something that can transform an experience. Can you tell us a bit about what you’ve been working on since joining Larian Studios?
ROBERT: I can’t really say anything except that I have been working with a number of our internal teams to develop new VO production pipelines, building out internal workflows for performance capture, animation and audio, all in preparation for our next project!
MASA: Building the structure so creativity can happen on top - exciting. On to the next question - can you share an example where the interplay of dialogue and sound design significantly elevated a scene or player experience?
ROBERT: On F1 Manager 23 we wanted to replicate the experience of being on the pit wall, so we devised a system that would play back the actual real-life radio messages from Formula 1 drivers and their engineers when the relevant in-game events were triggered. Using dialogue in this way gave the player a feeling of quick feedback and dynamic response whilst managing their team.
To achieve this, we first had to have an editor cut and master all the team radio audio from the previous season of Formula 1 (it is delivered as one large recording per driver).
These individual files then had to be tagged with information about what each radio message contains. We then had to create a system to decide which one to play back while maintaining the correct driver and engineer pairing, as well as when not to play one back at all. This priority system meant that VO would play more accurately in relation to game speed (players could speed races up), which events were firing, and the stage of the race. It also took into account information about all the other drivers on the grid, as sometimes they would be mentioned by name, e.g. “[Lewis] Hamilton ahead”.
The entirety of this system worked well with our sound design, particularly the engine sound design. The radio audio is not clean and already has a lot of engine sound in it, so that actually helped elevate and ‘worldise’ the F1 team radios.
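As a rough illustration of the kind of tag-based selection described in this answer, here is a minimal Python sketch. It is not the F1 Manager 23 implementation: the `RadioClip` fields, the `pick_clip` function, the event names and the game-speed thresholds are all hypothetical, chosen only to show how tags, the driver and engineer pairing, named rivals and game speed could combine to decide which clip plays, or whether nothing plays at all.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RadioClip:
    """One edited team-radio file plus its tags (all fields are hypothetical)."""
    path: str
    driver: str
    engineer: str
    event: str                          # e.g. "car_ahead", "pit_now"
    race_stage: str                     # "start", "middle", "end" or "any"
    priority: int                       # higher means more important
    mentions: frozenset = frozenset()   # other drivers named in the clip


def pick_clip(clips, driver, engineer, event, race_stage, game_speed,
              rivals_involved=frozenset()) -> Optional[RadioClip]:
    """Return the best matching clip, or None when nothing should play."""
    # Assumed rule: the faster the race is running, the higher the bar
    # a clip must clear, so sped-up races are not flooded with chatter.
    if game_speed <= 1:
        min_priority = 1
    elif game_speed <= 4:
        min_priority = 2
    else:
        min_priority = 3

    candidates = [
        c for c in clips
        # Keep the real driver + engineer pairing intact.
        if c.driver == driver and c.engineer == engineer
        and c.event == event
        and c.race_stage in (race_stage, "any")
        # A clip that names a rival is only valid if that rival is involved.
        and c.mentions <= frozenset(rivals_involved)
        and c.priority >= min_priority
    ]
    if not candidates:
        return None
    # Deterministic for the sketch; a game would likely randomise among ties.
    return max(candidates, key=lambda c: c.priority)


clips = [
    RadioClip("hamilton_ahead.wav", "driver_a", "engineer_a", "car_ahead",
              "any", 2, frozenset({"Hamilton"})),
    RadioClip("car_ahead_generic.wav", "driver_a", "engineer_a", "car_ahead",
              "any", 1),
    RadioClip("box_box.wav", "driver_a", "engineer_a", "pit_now", "any", 3),
]

named = pick_clip(clips, "driver_a", "engineer_a", "car_ahead", "middle",
                  game_speed=1, rivals_involved={"Hamilton"})
generic = pick_clip(clips, "driver_a", "engineer_a", "car_ahead", "middle",
                    game_speed=1)
suppressed = pick_clip(clips, "driver_a", "engineer_a", "car_ahead", "middle",
                       game_speed=8, rivals_involved={"Hamilton"})
```

In this sketch the clip that names a rival is chosen only when that rival is actually the car ahead, the generic clip covers every other case at normal speed, and at high game speed the lower-priority "car ahead" messages are dropped entirely.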
MASA: That’s brilliant - using real radio moments adds so much authenticity and the roughness makes it feel even more immersive. Can you share an example where actors had to significantly adapt their voice?
ROBERT: On Warhammer: Age of Sigmar - Realms of Ruin, we worked with actors to embody towering, non-human characters. We pushed vocal boundaries in both direction and processing, often having to explore new tones and cadences that would still sit comfortably in the mix after creature voice design.
One particular highlight was working with our Orruk characters. We had to workshop how an Orruk would sound, how they would move, and how they would vocalise and ‘snap’ at each other, with lots of animalistic growls mixed in with their dialogue. This really helped separate them out as a different species in the world.
We also had a host of demonic characters whose actors significantly changed their voices to match the brief, for characters like ‘Tzeentch, The Lord of Change’, and to better support the vocal processing we applied to those characters.
MASA: Fascinating - essentially helping actors transform into something beyond human but still believable enough for players to connect with. When directing other voice actors, what strategies do you use to help them find the right performance?
ROBERT: When directing voice actors, I tend to first focus on providing clear emotional and narrative context so they understand the stakes, relationships, and tone of the scene. I will avoid rigid line reads and instead try to use evocative language to help an actor discover the performance themselves. I’ve often found that different actors require different direction techniques, sometimes to suit each actor’s style, sometimes to suit the style of dialogue.
It is imperative to listen closely to their delivery, and try to keep feedback concise and supportive to maintain momentum and morale, especially during long sessions. It’s a heck of a difficult job being in the booth with a bunch of people looking at you, hearing and analysing every word. So above all, it is important to create a calm, positive and trusting environment for the actor so that they feel empowered to deliver authentic, compelling performances.
MASA: What does ‘voice design’ mean to you and how do you approach designing the vocal identity of a character or project?
ROBERT: Voice design is about building a character’s presence sonically. This stems from how they speak, breathe, emote and exist in the world. It involves everything from casting and performance direction to technical choices like microphone selection, mastering decisions, processing, spatial placement and voice-over post-processing.
MASA: How do you collaborate with sound designers and writers to ensure voice design aligns with the project vision?
ROBERT: I advocate for early collaboration, where voice sits alongside narrative and sound. By working closely with writers and sound teams from the outset, we ensure the character’s voice contributes to their identity and that transitions between spoken word and sonic space are seamless.
Comprehensive VO systems and workflows unlock developers and designers, giving them more time to iterate and improve the content of the game rather than relying on workarounds. So I find it an integral part of my role to consider the tech that drives those VO systems and pursue areas where it can be improved.
MASA: That early collaboration point seems so key. It feels like voice becomes stronger when it’s thought of as part of the ecosystem, not a late addition. On a practical level, how do you go about organising and directing larger crowd or group recordings?
ROBERT: The 3 P’s… Preparation, Preparation and Preparation! A good crowd session feels loose but is built on very tight planning.
These sessions are often a bit of controlled chaos on the day, so it’s very important to ensure that all the prep work for identifying the crowd assets you require is done ahead of time. That includes all the reaction types, layers, vocal groupings, gender groupings, etc.
A key workflow point for me is to write in contexts, keywords, examples and scenarios for each group of assets to help provide clear direction. Have this printed out so that, as you move through the ‘cues’, you have a plan you can quickly look down and refer to. Crowd sessions often have a tonne of assets to capture, so having it printed out also helps you maintain a good pace through all that content!
There should also be some room left for spontaneity within takes. As I said, it should feel loose and relaxed for the crowd artists there on the day! We want them bouncing off each other so that the room has a great creative vibe. It shouldn’t feel regimental.
MASA: Tight preparation that still leaves room for those magical, unexpected moments in-session. Looking at another side of the process, how do you collaborate with engineers and designers to ensure dialogue playback is seamless and immersive?
ROBERT: As a voice lead, I work closely with production to ensure the recording pipeline is as good as it can be, and then with coders and designers to ensure dialogue playback is smooth, responsive and emotionally engaging.
With our development team, I help define the technical systems that trigger and manage dialogue, making sure voice lines play correctly in different gameplay contexts like combat, exploration or cutscenes. We collaborate on tools, middleware (like Wwise) and tagging systems to support seamless playback, localisation and syncing with facial animation, with a particular focus on systemic gameplay dialogue.
On the design side, I coordinate with narrative and level designers to process VO in ways that enhance gameplay and storytelling without disrupting flow. We fine-tune timing, context and variation so dialogue feels natural and immersive. I often review narrative beats, test the dialogue, and adjust implementation based on feedback, acting as a bridge between creative goals and technical needs.
MASA: Sounds like you’re really sitting at the crossroads - translating creative intent into technical execution. And of course with so much going on sonically, how do you keep dialogue clear and impactful?
ROBERT: We design from the ground up with headroom and priority in mind. Mixing strategies like EQ carving, ducking, and reverb management help keep voices front and centre without overpowering the environment or music. This all starts with a solid understanding of the type of dialogue we are recording, so we can ensure that the projection level of dialogue recorded is mastered correctly to provide the correct amount of dynamics between VO types.
Every game will also have some kind of VO priority system, in which dialogue lines are sorted into priority levels and, depending on what is happening in the game at any given moment, only certain lines will play back.
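To make the idea of a VO priority system concrete, here is a minimal Python sketch of one possible arbitration rule. It does not describe any specific game or middleware: the `VOPriority` levels, the `Line` type and the `DialogueArbiter` class are hypothetical, and the rule shown (a new line plays only if nothing is playing or it outranks the current line) is just one common assumption.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class VOPriority(IntEnum):
    """Hypothetical priority levels, lowest to highest."""
    AMBIENT = 1     # background chatter, crowd walla
    SYSTEMIC = 2    # combat barks, gameplay callouts
    CINEMATIC = 3   # story-critical dialogue


@dataclass(frozen=True)
class Line:
    line_id: str
    priority: VOPriority


class DialogueArbiter:
    """Decides whether a requested line may play right now."""

    def __init__(self) -> None:
        self.current: Optional[Line] = None

    def request(self, line: Line) -> bool:
        """Return True if the line starts playing, False if it is suppressed."""
        if self.current is None or line.priority > self.current.priority:
            # A real game would also stop or fade out the interrupted line here.
            self.current = line
            return True
        return False

    def finished(self) -> None:
        """Call when the current line has finished playing."""
        self.current = None


arbiter = DialogueArbiter()
played_bark = arbiter.request(Line("guard_bark_01", VOPriority.SYSTEMIC))
played_ambient = arbiter.request(Line("tavern_chatter_03", VOPriority.AMBIENT))
played_story = arbiter.request(Line("villain_reveal", VOPriority.CINEMATIC))
```

Here the ambient line is suppressed because a systemic bark is already playing, while the cinematic line interrupts it. Equal-priority lines do not interrupt each other in this sketch; whether they should is a design decision each project makes for itself.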
MASA: Any advice for teams looking to better integrate voice acting with narrative design?
ROBERT: Try to bring voice into the conversation early. VO is often treated as an add-on, when in reality it should influence how scenes are written and experienced. Close collaboration between narrative and VO leads to far more authentic storytelling.
Secondly, don’t neglect the little details such as effort sounds. Focusing on these really elevates a cinematic scene, or any gameplay dialogue, as they are often considered the glue that holds the dialogue together. Think of the small detail of someone catching their breath after an intense scene, or a slight gasp of surprise at a character pulling out a dagger. All these little nuances add character, complexity and depth to scenes. They humanise the characters and make them feel authentic to the players.
On the flipside of this, you can’t have a combat cinematic or action without those effort sounds. These require careful collaboration with the audio team to ensure sword clashes are punctuated with efforts, not drowned out by them (and vice versa). Know when to highlight the voice, and when to highlight sound design, and when they should both work together.
MASA: That’s such a valuable takeaway. Finally, what makes a voice performance worthy of a Music+Sound Award?
ROBERT: It’s the alchemy of authenticity, emotion, timing and design: a performance that not only fits the character but elevates the entire experience. Just like after a good film, a winning voice performance lingers with the player long after they've finished playing the game.
MASA: Thank you, Robert, for such an engaging conversation and for giving us a glimpse into the craft of voice design. We’re delighted to have you as part of the MAS Awards judging panel this year, where your perspective and experience will help spotlight the very best in the industry!