How to Use ElevenLabs Audio Tags for Voiceover - Artlist Blog

Highlights

Unlock cinematic-level control over AI voice performance with ElevenLabs audio tags.
Learn how simple bracketed directions can transform a flat read into expressive, emotional, character-driven audio.
See real prompt examples and start directing voiceovers inside the Artlist AI Toolkit like a pro.


Eleven v3 is a highly expressive, performance-driven text-to-speech model from ElevenLabs. It is best for advanced voice acting, emotional depth, and directorial control, giving you the tools to make your audio feel alive without spending hours in a recording booth.

You can use Eleven v3 through AI Voiceover in the Artlist AI Toolkit to shape delivery, inflection, timing, and mood with near-cinematic precision.

One of its most powerful features is audio tags.

These tags let you guide the performance in ways that weren’t possible before. They’re supported across all 71 languages available on Artlist, so you have range, flexibility, and total creative freedom.

In this tutorial, we’ll walk through what audio tags are, why they matter, and how to use them effectively. You’ll see real examples from our Artlist AI audio experts, so you can apply the same techniques in your own projects.

What are audio tags?

Audio tags are simple text instructions you place inside [brackets] directly in your prompt. Think of them as short, clear direction notes — the same type of cues you’d give a voice actor in the studio. 

You can guide emotion, pacing, intensity, character, tone, and even small physical actions like breathing or laughter.

You can add absolutely anything, but tags work best when they describe a sound, emotion, vocal quality, or type of delivery. Keep them purposeful and tied to performance.

[surprised] [whispers] [sigh] [gunshot] [accent] [clapping] [explosion]
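
To make the bracket convention concrete, here is a minimal Python sketch that pulls the tags out of a prompt. It is only an illustration of how tags sit inside the text: the function name `extract_tags` and the sample prompt are hypothetical, and nothing here is part of Artlist or ElevenLabs tooling.

```python
import re

# Matches one bracketed tag such as [whispers]; nested brackets are not handled.
TAG_PATTERN = re.compile(r"\[([^\[\]]+)\]")

def extract_tags(prompt: str) -> list[str]:
    """Return the text of every [bracketed] tag, in order of appearance."""
    return [match.strip() for match in TAG_PATTERN.findall(prompt)]

# Illustrative prompt written for this sketch, using tags from the list above.
prompt = "[surprised] You came back? [whispers] I thought you were gone. [sigh]"
print(extract_tags(prompt))  # ['surprised', 'whispers', 'sigh']
```

Listing the tags this way is a quick way to review the direction notes in a long script before you generate.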

Below, we’ll show you how powerful they can be.

Example 1: From flat to performance ready  

This example shows how fast audio tags can transform a flat read into something dynamic, textured, and genuinely funny. When you push performance direction into the prompt, Eleven v3 responds with a level of expressiveness that feels recorded, not generated.

Example 1 without audio tags

Prompt: Oh my god! I can’t, I can’t breathe! Oh my god, he just went “excuse me, miss” like a crazy person!

Example 1 with audio tags

Prompt: [dying of laughter] Oh my god! [laughing] I can’t [between laughter] I can’t breathe [laughing] [hilarious] Oh my god, [very fast] he just went [doing deep voice, mocking] “excuse me miss” [laughing] like a crazy person [laughing]

Example 1 with audio tags, using another voice

Example 2: Emotion control  

In this next example, the audio tags control the emotion, the pauses, and the physical state of the voice (the breathing, sighing, and crying), and the model adapts naturally. It shows why this technique is one of the most powerful tools you have for emotional storytelling with AI voiceover.

Example 2 without audio tags

Prompt: I don’t know why I’m crying this hard… it just feels like a lot right now. I know I’ll be okay, I just need a minute to let it out.

Example 2 with audio tags

Prompt: [sobbing] I don’t know why I’m [sniff] [sniff] crying this hard… [crying] it just feels like [sigh] a lot right now.
[clear throat] I know I’ll be okay, I just need a minute to [sigh] let it out.
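
If you keep only the tagged version of a script, you can recover the plain read for a side-by-side comparison by stripping the tags. The sketch below assumes tags never nest and uses a hypothetical helper name, `strip_tags`. It is plain string handling, not an Artlist or ElevenLabs feature.

```python
import re

# Matches one bracketed tag; nested brackets are assumed not to occur.
TAG_PATTERN = re.compile(r"\[[^\[\]]+\]")

def strip_tags(prompt: str) -> str:
    """Remove [bracketed] tags and collapse the leftover whitespace."""
    without_tags = TAG_PATTERN.sub(" ", prompt)
    return " ".join(without_tags.split())

# First sentence of Example 2, typed with plain ASCII punctuation.
tagged = (
    "[sobbing] I don't know why I'm [sniff] [sniff] crying this hard... "
    "[crying] it just feels like [sigh] a lot right now."
)
print(strip_tags(tagged))
```

Run on the first sentence of Example 2, this returns the same words as the untagged prompt, which makes it easy to generate both versions from one script and hear the difference.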

Example 3: Dramatic range 

Intensity is only half the story. What makes a performance land is contrast: the shift from a shout to a whisper, from fury to grief. This example shows how audio tags let you choreograph that range, beat by beat, like a director calling the shots in real time. Without tags, the delivery stays in one gear. With them, the same script becomes a scene. Listen to how the voiceover changes in the examples below.

Example 3 without audio tags

Prompt: Look me in the eyes and tell me I’m wrong!!! Tell me you’re not the rat who’s been talking behind our backs!
Everyone in this room kept their mouth shut, except one person. So why does it smell like it’s you?
If you are the rat, you better confess now! Before the silence in this room turns into something you won’t walk away from. 

Example 3 with audio tags

Prompt: [shouting] Look me in the eyes and tell me I’m wrong!!! [Keep screaming] Tell me you’re not the rat who’s been talking behind our backs!

[Quietly almost whispering] Everyone in this room kept their mouth shut, [Inhale and pauses] except one person. So why does it smell like it’s you?

[Talking in a sad way] If you are the rat, [breath again] [Shout] you better confess now! [Quiet again] before the silence in this room turns into something you won’t walk away from.
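
One way to plan a scene like this beat by beat is to write each beat as a direction plus a line, then join them into a single prompt. The sketch below is an assumption about workflow, not something Artlist or ElevenLabs provides, and `build_prompt` is a hypothetical name.

```python
def build_prompt(beats: list[tuple[str, str]]) -> str:
    """Join (direction, line) pairs into one tagged prompt string."""
    parts = []
    for direction, line in beats:
        # An empty direction means no tag is added in front of this line.
        parts.append(f"[{direction}] {line}" if direction else line)
    return " ".join(parts)

# A shortened version of the scene, with directions adapted from Example 3.
beats = [
    ("shouting", "Look me in the eyes and tell me I'm wrong!!!"),
    ("quietly, almost whispering", "Everyone in this room kept their mouth shut,"),
    ("inhales and pauses", "except one person."),
]
print(build_prompt(beats))
```

Keeping directions and lines separate makes it easy to swap a single direction, regenerate, and compare takes without retyping the script.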

How to access ElevenLabs on Artlist 

Eleven v3 is available directly inside AI Voiceover in the Artlist AI Toolkit. There’s no setup, no syncing, no extra steps. You choose your voice, drop in your script, add your tags, and generate.

Step-by-step guide to using audio tags with Artlist AI Voiceover:

Step 1

Open the Artlist AI Toolkit and select AI Voiceover.

Step 2

Pick Eleven v3 from the voiceover model dropdown.

Step 3

Choose the voice you want to start with.

Step 4

Write your script in the text box. Add audio tags in [brackets] anywhere you need to guide delivery. Keep tags short and clear: emotion, tone, pacing, character, or action.

Step 5

Generate and listen. 

Step 6

If you want more character, push your tags further. If you want less intensity, dial them back. You control the performance in real time, just like directing an actor.

Step 7

When you are happy with your final voiceover, download it. You can also find it in My Voices.
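
Before pasting a long script into the text box in Step 4, a quick check can catch a missing bracket or a tag that has grown into a full sentence. The sketch below is a hypothetical pre-flight helper: `check_tags` is an invented name, and the five-word limit is an assumption chosen for illustration, not a limit set by Artlist or ElevenLabs.

```python
import re

# Assumed threshold for a "short" tag; not an Artlist or ElevenLabs limit.
MAX_TAG_WORDS = 5

def check_tags(script: str) -> list[str]:
    """Return human-readable warnings about the tags in a script."""
    warnings = []
    # A mismatch in bracket counts usually means a tag was left open.
    if script.count("[") != script.count("]"):
        warnings.append("unbalanced brackets")
    for tag in re.findall(r"\[([^\[\]]*)\]", script):
        if not tag.strip():
            warnings.append("empty tag")
        elif len(tag.split()) > MAX_TAG_WORDS:
            warnings.append(f"long tag: [{tag}]")
    return warnings

# The closing bracket is missing from the second tag.
print(check_tags("[whispers] Stay close. [sigh"))  # ['unbalanced brackets']
```

An empty list means the script passed these two basic checks. It says nothing about how the model will perform the tags, so you still need to generate and listen.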


Start directing your own voiceovers: try audio tags now 

Audio tags give you directorial control and open up a huge creative space for storytelling, character work, and dynamic voiceover, all without touching a microphone. Feel free to steal our prompt examples and try out audio tags for your own AI voices now with Eleven v3.  


About the author

Deborah Blank is the Artlist Blog Editor, with over 15 years of experience shaping content for global brands. An expert in AI models, video, and image generation, she’s passionate about empowering creators to tell better stories. Contact her on LinkedIn — she wants to hear from you!