#How well will a professional voice clone sound of an English speaker who speaks with a non-English
Short answer: Very well—if you train the Professional Voice Clone (PVC) with English recordings of that speaker in their own accent.
Key points verified in ElevenLabs docs:
- Accent comes from the voice data. If you clone a speaker using English audio, the PVC will keep that speaker’s English accent (e.g., French‑accented English, Scottish English). You can’t switch the accent after the clone is created; changing the training samples is the only way to influence accent. (help.elevenlabs.io)
- PVC is the recommended method for accurately reproducing unique or less common accents; Instant Voice Cloning may struggle with such accents. (help.elevenlabs.io)
- For best fidelity, provide sufficient English training audio: minimum ~30 minutes; ~2–3 hours yields the most accurate result. (elevenlabs.io)
- If you later generate in a different language than the one used for cloning, the voice may retain its original accent or mispronounce words. Train the PVC in the target language/accent you intend to use. (elevenlabs.io)
- PVC supports multiple English variants (e.g., US, UK, AU, CA), so English with non‑US accents is within supported scope. (help.elevenlabs.io)
In practice: A PVC built from English recordings of an English speaker with a non‑US accent will reproduce that accent closely, provided you supply enough clean, representative audio. (elevenlabs.io)
Sources:
How do I select the language and accent? – ElevenLabs
Why does my voice or accent not sound correct after cloning? – ElevenLabs
Professional Voice Cloning | ElevenLabs Documentation
What languages are supported with Professional Voice Cloning (PVC)? – ElevenLabs
What is the best type of content to record as source material to train a PVC?
Best source material for a Professional Voice Clone (PVC)
- Record material that matches your intended use case and keep the delivery consistent. For example, read long‑form narrative if you want an audiobook voice, or ad copy if you’ll use it for commercials. The PVC guide explicitly recommends “keep the style consistent” and training on the same style you plan to generate. (elevenlabs.io)
- Use only clean, single‑speaker recordings with no music, cross‑talk, or room echo; the model will learn any artifacts it hears. (elevenlabs.io)
- Provide enough runtime: minimum ~30 minutes, with better results from 2–3 hours of high‑quality audio. Prioritize consistent, high‑quality clips over sheer length. (elevenlabs.io)
- Align the content’s language (and accent) with how you’ll use the clone; training on a different language can lead to accent drift or mispronunciations. (elevenlabs.io)
- If you’re recording inside ElevenLabs, you can use the provided sample scripts (narrative, conversational, advertising). Choose the one that fits your target use and stick to that style. (elevenlabs.io)
- Additional official guidance: ensure most dialogue in your dataset matches the speaking style and intonation you want; avoid segments that deviate from that style. (help.elevenlabs.io)
- Optional (from ElevenLabs blog): if you specifically need broader expressive range within a project, capture varied but clean, well‑directed performances; for nuanced needs, consider separate sub‑clones per style/emotion rather than mixing very different deliveries into one dataset. (elevenlabs.io)
In short: record clean, single‑speaker audio in the exact style and language you plan to generate, for 30–180+ minutes, favoring consistency over variety. (elevenlabs.io)
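As a rough illustration of the runtime guidance above, the following sketch totals clip lengths and compares them with the quoted thresholds (about 30 minutes minimum, 2 to 3 hours recommended). The function name, the status labels, and the choice of 120 minutes as the cut-off for the recommended range are assumptions made for this example, not ElevenLabs terminology.

```python
# Thresholds taken from the guidance quoted above.
MIN_MINUTES = 30       # approximate minimum runtime
TARGET_MINUTES = 120   # lower end of the 2-3 hour range (cut-off chosen for this sketch)


def assess_runtime(clip_seconds):
    """Return (total_minutes, status) for a list of clip lengths in seconds."""
    total_minutes = sum(clip_seconds) / 60
    if total_minutes < MIN_MINUTES:
        status = "below minimum"
    elif total_minutes < TARGET_MINUTES:
        status = "usable"
    else:
        status = "in recommended range"
    return total_minutes, status


# Example: twelve 5-minute clips give one hour of audio.
print(assess_runtime([300] * 12))
```

Runtime is only one factor; the guidance above still favors clean, consistent clips over sheer length.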
Sources:
Professional Voice Cloning | ElevenLabs Documentation
Are there any tips to get good-quality cloned voices? – ElevenLabs
ElevenLabs Voice Cloning: 7 Tips for Pro Audio Quality
What is a sub-clone?
I can’t find “sub-clone” defined anywhere in ElevenLabs’ public documentation or help center. The official voice-related terms you’ll see are:
- Professional Voice Clone (PVC) for high‑fidelity, verified voice models,
- Instant Voice Clone (IVC) for quick clones from short samples, and
- Voice Remixing for creating editable variants from an existing voice. (elevenlabs.io)
If you saw “sub-clone” in the UI or an email, please share a screenshot or the exact context and I’ll map it to the corresponding documented feature.
Sources:
Professional Voice Cloning | ElevenLabs Documentation
It appears as the last line in tip five of 7 tips for creating a professional-grade voice clone in ElevenLabs.
In ElevenLabs’ blog, “sub-clone” is used informally to mean creating additional, specialized versions of your Professional Voice Clone for specific traits—“For nuanced needs, build sub-clones tuned to accent, emotion, or age.” (elevenlabs.io)
It isn’t a distinct product type in the docs; you’d achieve this by creating separate PVCs or by using Voice Remixing, which generates a new voice variant without altering the original. (elevenlabs.io)
Sources:
ElevenLabs Voice Cloning: 7 Tips for Pro Audio Quality
Voice remixing | ElevenLabs Documentation
The account is limited to one PVC. Can you explain how the remixing to create variants works?
Short answer: Yes—you can use Voice Remixing on your single PVC to create additional saved “variants” of that voice. Remixing does not retrain or replace your PVC; it generates a new voice entry while leaving the original unchanged. This feature is available in the web app and via API and is currently in alpha. (elevenlabs.io)
What you can remix
- Allowed bases: voices you own—your PVC, your IVCs, and your Voice Design voices. The Text to Speech guide also notes you can remix default voices. You cannot remix voices from the community Voice Library. (elevenlabs.io)
How remixing works (UI)
- In Text to Speech, select your base voice, open Voice Remixing, describe the changes (e.g., “slightly higher pitch, softer delivery”), optionally provide a test script, generate previews, then save your preferred preview as a new voice. The original voice remains unchanged. (elevenlabs.io)
How remixing works (API)
- Two-step flow:
  - POST /v1/text-to-voice/remix with voice_id and a voice_description (plus options like text, auto_generate_text, seed, loudness, guidance_scale, and prompt_strength when using v3). You receive one or more previews, each with a generated_voice_id. (elevenlabs.io)
  - Create the new saved voice by calling the create endpoint with the chosen generated_voice_id. (elevenlabs.io)
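The two-step flow above can be sketched as plain request bodies, without sending anything over the network. This is a minimal sketch under stated assumptions: the field names voice_id, voice_description, text, auto_generate_text, prompt_strength, and generated_voice_id come from the description above, while the "previews" response key, the "voice_name" field, and all helper names are hypothetical. Whether voice_id belongs in the URL path or in the body, and the exact path of the create endpoint, should be checked against the API reference.

```python
def build_remix_request(voice_id, voice_description, text=None, prompt_strength=None):
    """Step 1: body for a remix-preview request on a voice you own."""
    body = {"voice_id": voice_id, "voice_description": voice_description}
    if text is None:
        # No test script supplied, so ask the service to generate one.
        body["auto_generate_text"] = True
    else:
        body["text"] = text
    if prompt_strength is not None:
        body["prompt_strength"] = prompt_strength
    return body


def pick_generated_voice_id(remix_response, index=0):
    """Read the generated_voice_id of one preview from the step 1 response."""
    # Assumption: previews arrive as a list under a "previews" key.
    return remix_response["previews"][index]["generated_voice_id"]


def build_create_request(generated_voice_id, voice_name):
    """Step 2: body for saving the chosen preview as a new voice."""
    # Assumption: the create endpoint accepts a "voice_name" field.
    return {"generated_voice_id": generated_voice_id, "voice_name": voice_name}


# Dry run with a stand-in response; the IDs here are placeholders.
remix_body = build_remix_request("YOUR_PVC_VOICE_ID", "slightly higher pitch, softer delivery")
stub_response = {"previews": [{"generated_voice_id": "PREVIEW_ID_1"}]}
create_body = build_create_request(pick_generated_voice_id(stub_response), "My PVC (softer)")
print(remix_body)
print(create_body)
```

Keeping the body builders separate from the HTTP call makes it easy to inspect exactly what would be sent before spending credits on previews.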
Behavior, cost, and limits
- Original voice safety: “Remixing creates a new voice variant. Your original voice remains unchanged.” (elevenlabs.io)
- Iteration: No hard limit—you can iteratively remix variants to refine style (e.g., accent, pacing, tone). (elevenlabs.io)
- Cost: Charged based on the length of the test script used for remixing. (elevenlabs.io)
- Plan context: Your PVC slot limit governs how many distinct PVCs you can train (e.g., 1 on Creator/Pro/Scale). Variants created via Remixing are saved as additional voices; note that “the total number of custom voices you can have (including PVCs) depends on your subscription tier.” (help.elevenlabs.io)
If you’d like, I can walk you through the exact UI clicks or provide a minimal API
Yes, walk through the UI clicks.
Below is the documented, UI-level flow to create remixed variants (“sub‑clones”) of a voice you own:
- Go to Voices > My Voices in the ElevenLabs web app and pick a voice you personally own (your PVC, an IVC, or a Voice Design voice). You can’t remix voices from the public Voice Library; only voices in your own library are eligible. (help.elevenlabs.io)
- Open Voice Remixing for that voice (the feature is available in the core web platform). In the Remixing view, enter a natural‑language prompt describing the changes you want (e.g., accent, style, pacing, gender, audio quality). (elevenlabs.io)
- Choose a script for the preview:
  - Use a default script, or
  - Enter your own text and, if you’re using the v3 model, you may include audio tags like [laughs], [whispers], [sighs]. (elevenlabs.io)
- Set Prompt Strength (how strongly the remix follows your prompt) and generate previews. Start stronger if you want to explore a wider range, then iterate. (elevenlabs.io)
- Review the previews, select the one you like, and save it as a new voice. Remixing creates a new variant and does not alter your original voice. (elevenlabs.io)
Notes and constraints from the docs:
- Only voices you own can be remixed; public library voices cannot. (elevenlabs.io)
- Remixing produces a separate variant (“new voice”); your source PVC remains available unchanged. (elevenlabs.io)
- The ElevenLabs blog recommends creating targeted variants (“sub‑clones”) for accent, emotion, or age when you need nuanced differences. (elevenlabs.io)
If you need the programmatic equivalent, the API workflow is: generate preview(s) via the Remix endpoint for a given voice_id, then create the new voice from the chosen generated_voice_id—mirroring the UI’s preview-then-save steps. (elevenlabs.io)
The official docs do not publish the exact button labels/positions in the UI, but the steps above reflect the documented flow for Voice Remixing in the web app. (elevenlabs.io)
Sources:
What is My Voices? – ElevenLabs
Voice remixing | ElevenLabs Documentation
Remix a voice | ElevenLabs Documentation
ElevenLabs Voice Cloning: 7 Tips for Pro Audio Quality
When recording source material for a PVC, should non-English words that appear in the English language be recorded? I am specifically thinking of Irish words that are common in English as spoken in Ireland but not necessarily in other English-speaking jurisdictions.
Short answer: The docs don’t require you to record non‑English words for a Professional Voice Clone. ElevenLabs recommends training your PVC with audio in the same language and accent you plan to generate with. So for Irish English, focus on English recordings spoken in your Irish accent. (help.elevenlabs.io)
If you expect frequent Irish loanwords/code‑switches:
- Multilingual use is best supported by providing training material per language; the ElevenLabs guide suggests 30–45 minutes per language and, for nuanced cases, creating sub‑clones tuned to accent. Note: only do this for languages the model supports. (Irish/Gaeilge is not listed among supported languages as of today.) (elevenlabs.io)
For occasional Irish words in otherwise English text:
- You can enforce correct pronunciations without adding separate Irish training audio by:
  - Using SSML phoneme tags (IPA or CMU) on supported models. (help.elevenlabs.io)
  - Adding entries in Studio’s Pronunciations Editor/Dictionary. (help.elevenlabs.io)
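To make the phoneme-tag option concrete, here is a small sketch that wraps a word in an SSML phoneme element. The helper name is hypothetical, and the alphabet values "ipa" and "cmu-arpabet" as well as the example transcription of "craic" (pronounced like "crack") are assumptions to verify against the pronunciation help article before use.

```python
from xml.sax.saxutils import escape, quoteattr

# Assumed alphabet identifiers; confirm the exact spelling in the help article.
SUPPORTED_ALPHABETS = ("ipa", "cmu-arpabet")


def phoneme_tag(word, pronunciation, alphabet="ipa"):
    """Wrap a word in an SSML <phoneme> element with XML-safe escaping."""
    if alphabet not in SUPPORTED_ALPHABETS:
        raise ValueError(f"unsupported alphabet: {alphabet}")
    return (
        f"<phoneme alphabet={quoteattr(alphabet)} ph={quoteattr(pronunciation)}>"
        f"{escape(word)}</phoneme>"
    )


# Illustrative only: an ARPAbet rendering of the Irish loanword "craic".
line = "The " + phoneme_tag("craic", "K R AE1 K", "cmu-arpabet") + " was mighty."
print(line)
```

Tagging only the handful of Irish words that are mispronounced keeps the rest of the script as ordinary English text.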
Generation tips:
- On the website, avoid mixing multiple languages in a single prompt because language detection is automatic. Via API, you can set language_code explicitly for ambiguous text. (help.elevenlabs.io)
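For the API route mentioned above, a request body with an explicit language might look like the sketch below. The helper name is hypothetical; the "text" and "model_id" field names and the model ID used in the example are assumptions to check against the Text to Speech API reference, including which models accept language_code.

```python
def build_tts_body(text, model_id, language_code=None):
    """Build a Text to Speech request body, optionally pinning the language."""
    body = {"text": text, "model_id": model_id}
    if language_code is not None:
        # Pin the language instead of relying on automatic detection.
        body["language_code"] = language_code
    return body


# Irish English text is still English, so the code stays "en".
print(build_tts_body("We had great craic at the céilí.", "eleven_flash_v2_5", "en"))
```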
Related guidance:
- The “7 tips” article recommends testing bilingual voices with mixed‑language lines to assess code‑switching, and suggests sub‑clones for accent‑specific needs. Use this for evaluation; for production, follow the language/accent and pronunciation controls above. (elevenlabs.io)
Sources:
How do I select the language and accent? – ElevenLabs
ElevenLabs Voice Cloning: 7 Tips for Pro Audio Quality
How can I force a certain pronunciation of a word or name? – ElevenLabs
How do I use the Pronunciations Editor in Studio? – ElevenLabs
Is IPA or CMU available for V3 model?