#How does the speech to text feature work? I just clicked "Transcribe files", uploaded an audio fil
ElevenLabs does not currently offer a speech-to-text feature. The platform focuses on text-to-speech, speech-to-speech, voiceover, and dubbing functionalities. If you're looking to transcribe audio files into text, you may need to explore other platforms that specialize in transcription services.
If you have further questions or need help with other features, feel free to ask!
Um ... what? Yes — you do
To use the speech-to-text feature, you can upload your audio file, and the system should automatically generate a text transcript. The transcript will typically appear in your "History" tab once the process is complete. If nothing happens after uploading your file and you see a greyed-out name, it might indicate an issue.
Here are a few steps you can try to resolve this:
- Refresh the Page: Sometimes, refreshing the page can help the system process the file.
- Check File Size and Length: Ensure your audio file is under 50MB and less than 5 minutes long. If it's longer, break it into smaller sections.
- Clear Cache and Cookies: Clear your browser's cache and cookies, then try again.
- Switch Browsers or Devices: Test the feature on a different browser or device to see if the issue persists.
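The size and length check in the list above can be done locally before uploading. The following is a minimal sketch, not an official tool: the 50 MB and 5 minute limits are taken from the reply above and are not independently verified, the function names are hypothetical, and it only reads WAV files (via Python's standard `wave` module), so other formats would need a different duration reader.

```python
import os
import wave

# Limits quoted in the reply above; treating "50MB" as 50 * 1024 * 1024 bytes is an assumption.
MAX_BYTES = 50 * 1024 * 1024
MAX_SECONDS = 5 * 60


def wav_duration_seconds(path):
    """Return the duration of a WAV file in seconds (frames / sample rate)."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def check_upload_limits(path):
    """Return a list of problems with the file; an empty list means it is within both limits."""
    problems = []

    size = os.path.getsize(path)
    if size >= MAX_BYTES:
        problems.append(f"file is {size} bytes; limit is under {MAX_BYTES}")

    duration = wav_duration_seconds(path)
    if duration >= MAX_SECONDS:
        problems.append(f"audio is {duration:.0f} s; limit is under {MAX_SECONDS} s")

    return problems
```

Calling `check_upload_limits("recording.wav")` (a placeholder file name) returns an empty list when the file passes both checks, or one message per limit it exceeds, which tells you whether the file needs to be split before trying again.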
If the problem continues, you may want to record the issue, including network data, for troubleshooting. You can use tools like Jam.dev to create a screen recording. Once recorded, share the details with the support team for further assistance.
Where can I find this History tab?
The location of the History tab depends on the mode you're using:
- Simple Mode: The History tab is located on the left side of the screen, next to the "Generate" button.
- Advanced Mode: The History tab is in the panel on the right side of the screen. On narrow screens, it will appear on the left side, next to "Generate."
From the History tab, you can access and download your previously generated files.
Simple and Advanced mode? What? Where are those? I'm in the Speech to Text product. There's nothing like that