how to use this : pronunciation_dictionary_locators | ElevenLabs

hoary brambleBOT Jan 9, 2025, 8:43 AM

#

AI-generated Answer for hasna.chouikhi_04357

To use pronunciation_dictionary_locators in your text-to-speech requests, you need to specify the pronunciation dictionary ID and version ID. Here's a Python example using the ElevenLabs API:

import requests

# Set your API key and base URL
XI_API_KEY = "API_KEY_HERE"
BASE_URL = "https://api.elevenlabs.io/v1"
VOICE_ID = "TxGEqnHWrfWFTfGW9XjX"

# Headers for the request (the endpoint returns audio, not JSON)
headers = {
    "Accept": "audio/mpeg",
    "xi-api-key": XI_API_KEY
}

def text_to_speech(text, pronunciation_dictionary_id, version_id):
    """ Sends a text to speech request using a pronunciation dictionary. """
    # Define the URL for the text-to-speech endpoint
    url = f"{BASE_URL}/text-to-speech/{VOICE_ID}"

    # Payload for the request
    payload = {
        "model_id": "eleven_monolingual_v1",
        "pronunciation_dictionary_locators": [
            {
                "pronunciation_dictionary_id": pronunciation_dictionary_id,
                "version_id": version_id
            }
        ],
        "text": text,
        "voice_settings": {
            "stability": 0.5,
            "similarity_boost": 0.8,
            "style": 0.0,
            "use_speaker_boost": True
        }
    }

    # Make the POST request
    response = requests.post(url, json=payload, headers=headers)

    # Check the response status
    if response.status_code == 200:
        print("Audio file generated successfully.")
        with open("output_audio.mp3", "wb") as audio_file:
            audio_file.write(response.content)
    else:
        # Include the response body, which usually explains the failure
        print("Error:", response.status_code, response.text)

def main():
    # Example text and dictionary IDs (replace with actual values)
    text = "Hello, world! I can now use pronunciation dictionaries."
    pronunciation_dictionary_id = "PD_ID_HERE"
    version_id = "VERSION_ID_HERE"

    # Call the text to speech function
    text_to_speech(text, pronunciation_dictionary_id, version_id)

if __name__ == "__main__":
    main()

This code demonstrates how to send a text-to-speech request using a pronunciation dictionary by specifying the pronunciation_dictionary_id and version_id in the payload. Make sure to replace placeholders like API_KEY_HERE, PD_ID_HERE, and VERSION_ID_HERE with your actual API key and dictionary details.
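One way to catch leftover placeholders before any request is sent is to build the payload in a small helper. The following is a minimal sketch, not ElevenLabs code: the names build_locator and build_tts_payload and the placeholder check are hypothetical, the IDs are made up, and the limit of three locators per request is an assumption that should be confirmed against the API reference.

```python
# Hypothetical helpers: not part of the ElevenLabs SDK.
PLACEHOLDERS = {"PD_ID_HERE", "VERSION_ID_HERE"}
MAX_LOCATORS = 3  # assumption: per-request limit; confirm in the API reference


def build_locator(pronunciation_dictionary_id, version_id):
    """Return one entry for pronunciation_dictionary_locators."""
    for value in (pronunciation_dictionary_id, version_id):
        if not value or value in PLACEHOLDERS:
            raise ValueError(f"Replace the placeholder or empty value: {value!r}")
    return {
        "pronunciation_dictionary_id": pronunciation_dictionary_id,
        "version_id": version_id,
    }


def build_tts_payload(text, locators, model_id="eleven_monolingual_v1"):
    """Build a JSON body shaped like the one in the example above."""
    if len(locators) > MAX_LOCATORS:
        raise ValueError(f"At most {MAX_LOCATORS} locators are assumed per request")
    return {
        "model_id": model_id,
        "text": text,
        "pronunciation_dictionary_locators": list(locators),
    }


# "dict_123" and "ver_456" are made-up IDs for illustration only.
payload = build_tts_payload("Hello, world!", [build_locator("dict_123", "ver_456")])
print(payload)
```

The resulting dict can be passed as json=payload to requests.post in the function above; voice_settings can be added to it in the same way.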

glass forge Jan 9, 2025, 8:47 AM

#

Can you explain what values pronunciation_dictionary_id and version_id should have?

hoary brambleBOT Jan 9, 2025, 8:47 AM

#

AI-generated Answer for hasna.chouikhi_04357

The pronunciation_dictionary_id and version_id are identifiers you receive after uploading a pronunciation dictionary to the ElevenLabs API. These IDs are essential for using the pronunciation dictionary in text-to-speech requests.

To obtain these IDs, you need to upload your pronunciation dictionary file using the API. Here's a brief overview of how you can do this:

Upload the Pronunciation Dictionary: Use the API to upload your pronunciation dictionary file. You can refer to the Add from file documentation for detailed instructions and example code.
Receive the IDs: After a successful upload, the API will return a pronunciation_dictionary_id and a version_id. These are used to reference the specific dictionary and its version in subsequent API requests.
Use in Text-to-Speech: When making a text-to-speech request, include these IDs in the pronunciation_dictionary_locators parameter to apply the pronunciation rules from your dictionary.

Here's a code snippet demonstrating how to use these IDs in a text-to-speech request:

import requests

# Set your API key and base URL
XI_API_KEY = "API_KEY_HERE"
BASE_URL = "https://api.elevenlabs.io/v1"
VOICE_ID = "TxGEqnHWrfWFTfGW9XjX"

# Headers for the request (the endpoint returns audio, not JSON)
headers = {
    "Accept": "audio/mpeg",
    "xi-api-key": XI_API_KEY
}

def text_to_speech(text, pronunciation_dictionary_id, version_id):
    """ Sends a text to speech request using a pronunciation dictionary and saves the audio to output_audio.mp3. """
    url = f"{BASE_URL}/text-to-speech/{VOICE_ID}"
    payload = {
        "model_id": "eleven_monolingual_v1",
        "pronunciation_dictionary_locators": [
            {
                "pronunciation_dictionary_id": pronunciation_dictionary_id,
                "version_id": version_id
            }
        ],
        "text": text,
        "voice_settings": {
            "stability": 0.5,
            "similarity_boost": 0.8,
            "style": 0.0,
            "use_speaker_boost": True
        }
    }
    response = requests.post(url, json=payload, headers=headers)
    if response.status_code == 200:
        print("Audio file generated successfully.")
        with open("output_audio.mp3", "wb") as audio_file:
            audio_file.write(response.content)
    else:
        # Include the response body, which usually explains the failure
        print("Error:", response.status_code, response.text)

def main():
    text = "Hello, world! I can now use pronunciation dictionaries."
    pronunciation_dictionary_id = "PD_ID_HERE"
    version_id = "VERSION_ID_HERE"
    text_to_speech(text, pronunciation_dictionary_id, version_id)

if __name__ == "__main__":
    main()

Replace "API_KEY_HERE", "PD_ID_HERE", and "VERSION_ID_HERE" with your actual API key, pronunciation dictionary ID, and version ID.

glass forge Jan 9, 2025, 8:55 AM

#

how can I use this : client.pronunciation_dictionary.add_from_file()

hoary brambleBOT Jan 9, 2025, 8:55 AM

#

AI-generated Answer for hasna.chouikhi_04357

To use client.pronunciation_dictionary.add_from_file(), you need to follow these steps:

Set Up Your API Key: Ensure you have your API key ready. You can find it in the 'Profile' tab on the ElevenLabs website.
Prepare Your PLS File: Create a lexicon .pls file that contains the pronunciation rules you want to apply. This file will be used to initialize your pronunciation dictionary.

Use the Function: You can use the function in your code as follows:

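from elevenlabs import ElevenLabs

# Initialize the client with your API key
client = ElevenLabs(api_key="YOUR_API_KEY")

# Add a pronunciation dictionary from a file.
# The with-block closes the file handle once the upload has finished.
with open("path_to_your_file.pls", "rb") as pls_file:
    response = client.pronunciation_dictionary.add_from_file(
        name="Your Dictionary Name",
        file=pls_file,
        description="Description of your dictionary",
    )

# Keep the response: it carries the IDs needed for later requests
print(response)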

Parameters:
- name: A string used for identifying your pronunciation dictionary.
- file: The .pls file containing your pronunciation rules.
- description: An optional string to describe your dictionary.
Response: Upon successful execution, you will receive a response containing the pronunciation_dictionary_id and version_id, which you can use in subsequent API calls.

For more detailed information, you can refer to the Add from file - ElevenLabs documentation.
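To connect the upload step with a later text-to-speech call, the IDs from the response can be turned into a locator entry. This is a minimal sketch that assumes the response has already been converted to a plain dict. The function name locator_from_upload is hypothetical, and the field names are an assumption: this thread calls the first field pronunciation_dictionary_id, while the REST response may expose it as id, so the helper accepts either. The sample values are made up.

```python
def locator_from_upload(response):
    """Turn an upload response (as a dict) into a locator entry.

    Assumption: the dictionary ID is stored under "pronunciation_dictionary_id"
    or "id"; check the actual response returned by your SDK version.
    """
    dictionary_id = response.get("pronunciation_dictionary_id") or response.get("id")
    version_id = response.get("version_id")
    if not dictionary_id or not version_id:
        raise KeyError("response is missing the dictionary ID or version_id")
    return {
        "pronunciation_dictionary_id": dictionary_id,
        "version_id": version_id,
    }


# Made-up values standing in for a real upload response.
sample_response = {"id": "dict_123", "version_id": "ver_456", "name": "Your Dictionary Name"}
locator = locator_from_upload(sample_response)
print(locator)
```

The returned dict has the shape expected inside the pronunciation_dictionary_locators list of a text-to-speech payload.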

glass forge Jan 9, 2025, 8:57 AM

#

what is the structure of this file

hoary brambleBOT Jan 9, 2025, 8:57 AM

#

AI-generated Answer for hasna.chouikhi_04357

The file used with client.pronunciation_dictionary.add_from_file() must be in the .PLS format (Pronunciation Lexicon Specification). This format is an XML-based structure that defines pronunciation rules for specific words or phrases. Below is an example of a valid .PLS file structure:

<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0"
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>tomato</grapheme>
    <phoneme>/tə'meɪtoʊ/</phoneme>
  </lexeme>
  <lexeme>
    <grapheme>Tomato</grapheme>
    <phoneme>/tə'meɪtoʊ/</phoneme>
  </lexeme>
</lexicon>

Key Points:

File Format: The file must be in .PLS format, not .txt or any other type.
Phoneme Alphabet: You can use either the IPA (International Phonetic Alphabet) or CMU Arpabet for the <phoneme> tag.
Case Sensitivity: Entries in the <grapheme> tag are case-sensitive. For example, "tomato" and "Tomato" must be defined separately if both are needed.
Structure: Each word or phrase is defined within a <lexeme> tag, with the <grapheme> specifying the word and the <phoneme> or <alias> specifying its pronunciation or replacement.

This file can be uploaded using the client.pronunciation_dictionary.add_from_file() method to create a pronunciation dictionary.
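For longer word lists, the file can be generated instead of written by hand. The sketch below uses only Python's standard library (ET.indent needs Python 3.9 or later) and mirrors the structure shown above; the function name build_pls and the tuple format are hypothetical, and the optional xsi:schemaLocation attribute is left out for brevity.

```python
import xml.etree.ElementTree as ET

PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"  # serialized as xml:lang


def build_pls(entries, lang="en-US", alphabet="ipa"):
    """Return PLS XML text for a list of (grapheme, kind, value) tuples.

    kind is "phoneme" or "alias", matching the two rule tags described above.
    """
    ET.register_namespace("", PLS_NS)  # emit xmlns="..." without a prefix
    lexicon = ET.Element(
        f"{{{PLS_NS}}}lexicon",
        {"version": "1.0", "alphabet": alphabet, XML_LANG: lang},
    )
    for grapheme, kind, value in entries:
        if kind not in ("phoneme", "alias"):
            raise ValueError(f"unknown rule type: {kind!r}")
        lexeme = ET.SubElement(lexicon, f"{{{PLS_NS}}}lexeme")
        ET.SubElement(lexeme, f"{{{PLS_NS}}}grapheme").text = grapheme
        ET.SubElement(lexeme, f"{{{PLS_NS}}}{kind}").text = value
    ET.indent(lexicon)  # pretty-print; requires Python 3.9+
    body = ET.tostring(lexicon, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


pls_text = build_pls([
    ("tomato", "phoneme", "/tə'meɪtoʊ/"),
    ("Tomato", "phoneme", "/tə'meɪtoʊ/"),  # graphemes are case-sensitive
    ("UN", "alias", "United Nations"),
])
print(pls_text)
```

When saving the result, write it with UTF-8 encoding (for example open("dictionary.pls", "w", encoding="utf-8")) so the bytes match the XML declaration.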

glass forge Jan 9, 2025, 9:06 AM

#

What if I want to change the language to Arabic, Hindi, or Turkish?

hoary brambleBOT Jan 9, 2025, 9:06 AM

#

AI-generated Answer for hasna.chouikhi_04357

Currently, Pronunciation Dictionaries are only supported for the English language when using the Turbo v2 model. Unfortunately, they do not support multi-language pronunciation, including Arabic, Hindi, or Turkish.

For multilingual models, you can use "Alias tags" to specify pronunciation by substituting words or phrases. For example, you can define that "UN" should be read as "United Nations." However, for non-English words, the only way to adjust pronunciation is by spelling the word phonetically to achieve the desired result.

If you need further assistance, feel free to ask!

glass forge Jan 9, 2025, 9:07 AM

#

and for flash 2.5 ?

hoary brambleBOT Jan 9, 2025, 9:07 AM

#

AI-generated Answer for hasna.chouikhi_04357

The Flash v2.5 model does not support phoneme-based pronunciation rules in pronunciation dictionaries. However, you can use the alias tag to specify pronunciation using alternative words or phrases. For example, you can define that "UN" should be read as "United Nations" when encountered in a project.

If you want to change the language to Arabic, Hindi, or Turkish, you can use the alias tag to specify replacements in the desired language. Ensure that the dictionary file is in the correct .PLS format and includes the appropriate language code in the xml:lang attribute. Here's an example structure for a .PLS file:

<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0"
         xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon
         http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
         alphabet="ipa" xml:lang="ar-SA">
   <lexeme>
     <grapheme>مرحبا</grapheme>
     <alias>Hello</alias>
   </lexeme>
   <lexeme>
     <grapheme>भारत</grapheme>
     <alias>India</alias>
   </lexeme>
</lexicon>

In this example:

xml:lang="ar-SA" specifies the language as Arabic (Saudi Arabia). You can change this to the appropriate language code for Hindi (hi-IN) or Turkish (tr-TR).
The grapheme tag contains the original word, and the alias tag specifies the replacement.
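Since Flash v2.5 is described above as handling alias rules but not phoneme rules, it can help to scan a .PLS file for leftover <phoneme> entries before uploading it. This is a rough sketch using only the standard library; the function name phoneme_graphemes and the sample lexicon are made up for illustration.

```python
import xml.etree.ElementTree as ET

NS = {"pls": "http://www.w3.org/2005/01/pronunciation-lexicon"}


def phoneme_graphemes(pls_text):
    """Return the graphemes of lexemes that contain a <phoneme> rule."""
    root = ET.fromstring(pls_text.encode("utf-8"))
    found = []
    for lexeme in root.findall("pls:lexeme", NS):
        if lexeme.find("pls:phoneme", NS) is not None:
            found.extend(g.text for g in lexeme.findall("pls:grapheme", NS))
    return found


# Illustrative lexicon with one alias rule and one phoneme rule.
sample = """<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0"
         xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
         alphabet="ipa" xml:lang="ar-SA">
  <lexeme><grapheme>UN</grapheme><alias>United Nations</alias></lexeme>
  <lexeme><grapheme>tomato</grapheme><phoneme>/tə'meɪtoʊ/</phoneme></lexeme>
</lexicon>"""

print(phoneme_graphemes(sample))  # prints ['tomato']
```

Any grapheme this reports would need an <alias> rule instead if the dictionary is meant for a model without phoneme support.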

Let me know if you need further clarification!

glass forge Jan 9, 2025, 10:06 AM

#

Why is the dictionary not being applied? I still get the same pronunciation error.

hoary brambleBOT Jan 9, 2025, 10:06 AM

#

AI-generated Answer for hasna.chouikhi_04357

Mispronunciations can occur due to text and voice dependencies. To address this, you can use the Pronunciation Dictionary feature in the Projects tool to specify how words should be pronounced consistently. However, it's important to note that pronunciation dictionaries are currently only available for the English language on our Turbo v2 model. This limitation might be why you're experiencing issues with other languages.

If you're using the English V1 or Turbo V2 models, you can also use SSML phoneme tags to control pronunciation more precisely. Additionally, ensure that the voice is properly cloned in the language you want to maintain consistency. If regenerating specific sections is consuming more credits, check your project settings to ensure that only the necessary parts are being regenerated. For more detailed steps, refer to the Pronunciation Dictionary section of our guide to Projects.
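A cheap first check when a dictionary seems to have no effect is whether any of its graphemes occur in the input text with exactly the same casing, since entries are case-sensitive as noted earlier in the thread. The sketch below is a rough local diagnostic only: it uses plain substring matching, which is an assumption and may differ from how the service matches words, and the function name grapheme_report and the sample data are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"pls": "http://www.w3.org/2005/01/pronunciation-lexicon"}


def grapheme_report(pls_text, text):
    """Split dictionary graphemes into those found / not found in text.

    Uses exact, case-sensitive substring matching as a rough diagnostic.
    """
    root = ET.fromstring(pls_text.encode("utf-8"))
    graphemes = [g.text for g in root.iterfind(".//pls:grapheme", NS) if g.text]
    matched = [g for g in graphemes if g in text]
    unmatched = [g for g in graphemes if g not in text]
    return matched, unmatched


# Illustrative lexicon; the alias spellings are made up.
sample_pls = """<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0"
         xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
         alphabet="ipa" xml:lang="en-US">
  <lexeme><grapheme>tomato</grapheme><alias>tuh-may-toe</alias></lexeme>
  <lexeme><grapheme>Tomato</grapheme><alias>tuh-may-toe</alias></lexeme>
</lexicon>"""

matched, unmatched = grapheme_report(sample_pls, "Tomato soup is on the menu.")
print("matched:", matched)
print("unmatched:", unmatched)
```

If nothing matches, the dictionary cannot change the output regardless of the model. If entries do match and the audio is still unchanged, the model and rule type (phoneme versus alias) are the next things to check.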

hoary brambleBOT Jan 27, 2025, 3:02 PM

#

AI-generated Answer for ricasco

To use the alias tag with the Turbo v2.5 model for correcting the pronunciation of numbers in Italian, you need to replace the <phoneme> tags in your dictionary with <alias> tags. The alias tag allows you to specify alternative words or phrases for pronunciation, which is supported by the Turbo v2.5 model.

Here’s how your updated dictionary_italian.pls file should look:

<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0"
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="ipa" xml:lang="it-IT">
  <lexeme>
    <grapheme>0</grapheme>
    <alias>zero</alias>
  </lexeme>
  <lexeme>
    <grapheme>1</grapheme>
    <alias>uno</alias>
  </lexeme>
  <lexeme>
    <grapheme>2</grapheme>
    <alias>due</alias>
  </lexeme>
  <lexeme>
    <grapheme>3</grapheme>
    <alias>tre</alias>
  </lexeme>
  <lexeme>
    <grapheme>4</grapheme>
    <alias>quattro</alias>
  </lexeme>
  <lexeme>
    <grapheme>5</grapheme>
    <alias>cinque</alias>
  </lexeme>
  <lexeme>
    <grapheme>6</grapheme>
    <alias>sei</alias>
  </lexeme>
  <lexeme>
    <grapheme>7</grapheme>
    <alias>sette</alias>
  </lexeme>
  <lexeme>
    <grapheme>8</grapheme>
    <alias>otto</alias>
  </lexeme>
  <lexeme>
    <grapheme>9</grapheme>
    <alias>nove</alias>
  </lexeme>
  <lexeme>
    <grapheme>+</grapheme>
    <alias>più</alias>
  </lexeme>
</lexicon>

Key Changes:

Replaced <phoneme> with <alias>: The alias tag specifies the replacement text directly, which is supported by Turbo v2.5.
Updated xml:lang to it-IT: This ensures the dictionary is recognized as Italian.
Mapped numbers and symbols to their Italian equivalents: For example, 0 is mapped to "zero," + is mapped to "più."

Steps to Use:

Upload this updated .PLS file to your project settings in the ElevenLabs platform.
The AI will now replace numbers and symbols with their specified Italian pronunciations during text-to-speech generation.

This approach ensures that numbers and symbols are pronounced correctly in Italian. Let me know if you need further assistance!

#

AI-generated Answer for ricasco

If you are uploading the .PLS file to your project settings in the ElevenLabs platform, you don't necessarily need to modify the main.py file unless you plan to dynamically use the pronunciation dictionary in your application. However, if you want to ensure that the uploaded pronunciation dictionary is applied during text-to-speech generation, you may need to include the pronunciation_dictionary_locators in your API calls.

Here’s an example of how you could modify your main.py file to include the pronunciation dictionary in your text-to-speech requests:

import os
import logging
from fastapi import FastAPI, Request, Body
from fastapi.middleware.cors import CORSMiddleware
from pydantic import BaseModel
from CheckRestaurantHours import check_restaurant_hours
from typing import Optional
import json
import requests

logging.basicConfig(level=logging.INFO)

class DeliveryAddressRequest(BaseModel):
    address: str

app = FastAPI()
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

ELEVENLABS_API_KEY = os.getenv("ELEVENLABS_API_KEY")
BASE_URL = "https://api.elevenlabs.io/v1"
PRONUNCIATION_DICTIONARY_ID = "YOUR_DICTIONARY_ID"  # Replace with your dictionary ID
VERSION_ID = "YOUR_VERSION_ID"  # Replace with your dictionary version ID

def generate_text_to_speech(text: str, voice_id: str):
    """
    Generate text-to-speech audio using ElevenLabs API with a pronunciation dictionary.
    """
    url = f"{BASE_URL}/text-to-speech/{voice_id}"
    headers = {
        "Accept": "audio/mpeg",  # the endpoint returns audio, not JSON
        "xi-api-key": ELEVENLABS_API_KEY
    }
    payload = {
        "model_id": "eleven_turbo_v2_5",
        "pronunciation_dictionary_locators": [
            {
                "pronunciation_dictionary_id": PRONUNCIATION_DICTIONARY_ID,
                "version_id": VERSION_ID
            }
        ],
        "text": text,
        "voice_settings": {
            "stability": 0.5,
            "similarity_boost": 0.8,
            "style": 0.0,
            "use_speaker_boost": True
        }
    }
    response = requests.post(url, json=payload, headers=headers)
    if response.status_code == 200:
        return response.content  # This is the audio file
    else:
        logging.error(f"Error generating TTS: {response.status_code}, {response.text}")
        return None

@app.get("/api/check-restaurant-hours")
async def api_check_restaurant_hours(timezone: str = "Europe/Rome"):
    """
    Check if the restaurant is currently open and get scheduling information.
    """
    logging.info(f"Request to check restaurant hours for timezone: {timezone}")
    return await check_restaurant_hours(timezone)

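from fastapi import HTTPException, Response  # imported here so this endpoint stands on its own

@app.post("/api/generate-tts")
async def api_generate_tts(text: str, voice_id: str = "YOUR_VOICE_ID"):
    """
    Generate text-to-speech audio with the pronunciation dictionary.
    """
    logging.info(f"Request to generate TTS for text: {text}")
    audio = generate_text_to_speech(text, voice_id)
    if audio:
        # Raw audio bytes cannot be serialized into a JSON body, so return them directly
        return Response(content=audio, media_type="audio/mpeg")
    # Signal the upstream failure with an error status instead of a 200 response
    raise HTTPException(status_code=502, detail="Failed to generate TTS")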

Key Changes:

Added generate_text_to_speech Function: This function sends a request to the ElevenLabs API, including the pronunciation_dictionary_locators to apply your uploaded .PLS file.
Environment Variables: Ensure your API key is stored securely in an environment variable (ELEVENLABS_API_KEY).
New Endpoint: Added a /api/generate-tts endpoint to handle text-to-speech requests.

Steps to Implement:

Replace YOUR_DICTIONARY_ID and YOUR_VERSION_ID with the actual ID and version of your uploaded pronunciation dictionary.
Replace YOUR_VOICE_ID with the ID of the voice you want to use.
Ensure the .PLS file is correctly uploaded to your project settings in the ElevenLabs platform.

This setup will allow you to use the alias tags and other pronunciation rules defined in your .PLS file during text-to-speech generation.
