#When running speech-to-text the srt part of the transcript is missing a lot of words. This issue ha

elder saddleBOT
#
AI-generated Answer for lkoerkemeier

🔍 Searching the web...

spiral hazel
#

Ok, what did you find?

elder saddleBOT
#
AI-generated Answer for lkoerkemeier

I don’t have enough context yet. What ElevenLabs topic should I investigate? Examples:

  • API: text-to-speech (REST/WebSocket/streaming), speech-to-speech, Speech Synthesis Markup Language support, rate limits, or Projects API
  • Voices: cloning requirements/consent, Voice Design, Voice Library usage
  • Pricing/quotas/billing
  • Platform features: Dubbing/Projects, Audio Native, safety/policy

Tell me the exact question or endpoint (e.g., “streaming TTS over WebSocket in Node,” “voice cloning consent rules,” or “dubbing limits”), and I’ll look it up in the official docs and cite the sources.

spiral hazel
#

When running speech-to-text, the SRT part of the transcript is missing a lot of words. This issue has been reported before and it still seems to be a problem.

elder saddleBOT
#
AI-generated Answer for lkoerkemeier

Thanks for flagging this. I can’t find any official notice in the ElevenLabs docs or Help Center that SRT exports are a known open issue today. If you’re still seeing missing words, here’s how to verify and improve results using documented features:

  • Verify what the API returned vs. the SRT export

    • Retrieve the transcript JSON and compare the words array (with timestamps) to the SRT in additional_formats. The JSON is the ground truth for what the model recognized: if a word appears in the words array but not in the SRT, the drop happens in the export step rather than in recognition. See Get transcript (additional_formats in the response). (elevenlabs.io)
    • When creating transcripts, you can request extra export formats (e.g., SRT) via additional_formats; confirm you’re requesting SRT there. (elevenlabs.io)
  • Improve recognition of important terms

    • If specific words are dropped (brand names, jargon), use Scribe v2 with Keyterm Prompting (up to 100 terms) to bias the model toward those words. (elevenlabs.io)
  • Realtime tuning (if you’re streaming)

    • Enable include_timestamps to receive committed_transcript_with_timestamps and adjust VAD settings (vad_threshold, vad_silence_threshold_secs, min_speech_duration_ms, min_silence_duration_ms) to reduce premature segment cuts that can omit short words. (elevenlabs.io)
  • Subtitle-grade timing

    • If you already have the intended text (or the model’s full text) but need precise timings, use the Forced Alignment API to align the audio to that text and then render SRT from the returned word/character timestamps. This is explicitly recommended for matching subtitles to video. (elevenlabs.io)
  • UI export

    • If you’re using the Transcript Editor, confirm you’re exporting SRT from the export menu and compare it against the on-screen transcript. (elevenlabs.io)
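To make the JSON-versus-SRT comparison from the first bullet concrete, here is a minimal Python sketch. It assumes each entry in the words array is a dict with a `text` field and an optional `type` field (`"word"`, `"spacing"`, and so on), and that you have already extracted the SRT content from additional_formats as a plain string. The helper names (`normalize`, `srt_tokens`, `json_tokens`, `missing_from_srt`) are hypothetical and not part of any ElevenLabs SDK.

```python
import re
from collections import Counter

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:02,500".
TIMING_LINE = re.compile(
    r"\d{2}:\d{2}:\d{2}[,.]\d{3}\s*-->\s*\d{2}:\d{2}:\d{2}[,.]\d{3}"
)


def normalize(token: str) -> str:
    """Lowercase and strip punctuation so 'Hello,' and 'hello' compare equal."""
    return re.sub(r"[^\w']+", "", token.lower())


def srt_tokens(srt_text: str) -> list[str]:
    """Collect normalized words from the text lines of every SRT cue."""
    tokens = []
    cleaned = srt_text.replace("\r\n", "\n").strip()
    for block in re.split(r"\n\s*\n", cleaned):
        lines = block.split("\n")
        # The cue text is everything after the timing line of the block.
        for i, line in enumerate(lines):
            if TIMING_LINE.search(line):
                for text_line in lines[i + 1:]:
                    tokens.extend(
                        t for t in map(normalize, text_line.split()) if t
                    )
                break
    return tokens


def json_tokens(words: list[dict]) -> list[str]:
    """Collect normalized words from the transcript's words array.

    Assumption: entries whose "type" is not "word" (spacing, audio
    events) should be ignored; entries without a "type" count as words.
    """
    tokens = []
    for entry in words:
        if entry.get("type", "word") != "word":
            continue
        tokens.extend(t for t in map(normalize, entry["text"].split()) if t)
    return tokens


def missing_from_srt(words: list[dict], srt_text: str) -> dict[str, int]:
    """Return {word: count} for words present in the JSON but absent in the SRT."""
    diff = Counter(json_tokens(words)) - Counter(srt_tokens(srt_text))
    return dict(diff)
```

A non-empty result from `missing_from_srt` means the words were recognized but lost during SRT generation, which is useful evidence to attach to a support ticket. The comparison ignores word order, so it reports dropped words but not reordered ones.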

If you can share a minimal example (transcription_id and a short audio sample) I can help pinpoint whether the drop happens in recognition or formatting. For a formal investigation, please submit a ticket and include the transcription_id and files. (help.elevenlabs.io)
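If the words array is complete but the exported SRT is not, one possible workaround is to render the SRT yourself from the word timestamps. The sketch below is an illustration under assumptions, not the official export logic: it assumes each word entry has `text`, `start`, and `end` (seconds) plus an optional `type`, and the cue limits (`max_chars`, `max_duration`) are arbitrary defaults chosen for the example. The function names are hypothetical.

```python
def format_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words: list[dict], max_chars: int = 42,
                 max_duration: float = 5.0) -> str:
    """Group word entries into cues and render them as SRT text.

    Assumption: only entries with type "word" (or no type) carry text
    that belongs in the subtitles; words are joined with single spaces.
    """
    cues = []  # each cue is (start_seconds, end_seconds, text)
    texts, start, end = [], 0.0, 0.0
    for entry in words:
        if entry.get("type", "word") != "word":
            continue
        token = entry["text"].strip()
        if not token:
            continue
        if texts:
            too_long = len(" ".join(texts)) + 1 + len(token) > max_chars
            too_slow = entry["end"] - start > max_duration
            if too_long or too_slow:
                # Close the current cue before starting a new one.
                cues.append((start, end, " ".join(texts)))
                texts = []
        if not texts:
            start = entry["start"]
        texts.append(token)
        end = entry["end"]
    if texts:
        cues.append((start, end, " ".join(texts)))

    blocks = [
        f"{i}\n{format_timestamp(s)} --> {format_timestamp(e)}\n{t}"
        for i, (s, e, t) in enumerate(cues, start=1)
    ]
    return "\n\n".join(blocks) + ("\n" if blocks else "")
```

Because every word entry is written into exactly one cue, no recognized word can be dropped at this stage. Joining with spaces suits space-delimited languages; scripts without word spacing would need a different join rule.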

Sources:
Get transcript | ElevenLabs Documentation
Create transcript | ElevenLabs Documentation
Transcription | ElevenLabs Documentation
Realtime | ElevenLabs Documentation
Forced Alignment | ElevenLabs Documentation
Transcripts (product guide) | ElevenLabs Documentation
How can I contact support? – ElevenLabs

spiral hazel
#

The words are dropped only in the additional_formats output; the main JSON is correct.