#When running speech-to-text, the SRT part of the transcript is missing a lot of words


unreal ivyBOT
AI-generated Answer for ssaue_19420

Short answer: this is expected if you’re exporting “subtitles” (SRT) rather than the raw transcript.

Details and fixes:

  • Subtitles vs transcript: in ElevenLabs, subtitles are a separate object from the transcript. Subtitles must obey formatting constraints (characters per line, lines on screen, cue duration) and do not allow overlapping segments. Because of these constraints, auto-generated SRT may contain fewer words than the full transcript. Changes you make to the transcript do not change the subtitles (and vice versa).
  • Get a more complete SRT: export the SRT from the Transcript view (not the Subtitles view). Transcript exports are not bound by the subtitle editor’s constraints. In the Transcript Editor, use Export → SRT.
  • If you do want to keep using the Subtitle Editor: relax the rules so more words fit per cue (three‑dot menu → Edit rules), and use split/merge cues to restructure where needed. Subtitles do not support overlapping segments, so resolve overlaps to avoid content being dropped.
  • If timing drift is the cause (words exist in the transcript but don’t appear where expected in SRT), you can realign text to audio with the Forced Alignment API, then export to SRT.

If you can share whether you exported SRT from the Transcript or Subtitles view, I can point to the exact steps.

random jacinth

We are using the speech-to-text scribe_v1 API.

unreal ivyBOT
AI-generated Answer for ssaue_19420

Thanks—Scribe v1 is the right model for high‑accuracy batch STT and it supports exporting captions like SRT via the API.

To help diagnose “missing words” in the SRT export, please confirm the following:

  • Do the missing words appear in the JSON transcript (text/words array) but not in the SRT? Scribe v1 returns word‑level timestamps in JSON; comparing these to the SRT will tell us whether this is an export/formatting issue rather than a recognition issue. (elevenlabs.io)
  • Are you requesting SRT via the additional_formats parameter on POST /v1/speech-to-text, and then reading it from the additional_formats array when you GET the transcript? The content of an additional_formats item may be base64‑encoded (check the item’s is_base64_encoded flag), so be sure you’re decoding the payload when needed before parsing. (elevenlabs.io)
  • Is your source audio multichannel? If so, are you using use_multi_channel=true? In multichannel mode, channels are transcribed independently; ensure you’re generating/combining captions from all channels if needed. (elevenlabs.io)
  • If the audio language is known, are you setting language_code? Providing it can improve recognition quality. (elevenlabs.io)
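
To make the JSON-vs-SRT comparison concrete, here is a minimal Python sketch. It assumes each additional_formats item is a dict with `content` and `is_base64_encoded` fields, and that each entry in the words array has `text` and `type` fields (with `type` being `"word"` for spoken words). The helper names `decode_format_item`, `srt_tokens`, and `missing_words` are hypothetical and not part of any SDK. The comparison is a simple multiset match that ignores word order, so treat the result as a hint, not a precise diff.

```python
import base64
import re
from collections import Counter


def decode_format_item(item: dict) -> str:
    """Return the text content of one additional_formats item."""
    content = item["content"]
    # Assumption: the item carries an `is_base64_encoded` flag; decode only when set.
    if item.get("is_base64_encoded"):
        return base64.b64decode(content).decode("utf-8")
    return content


def normalize(token: str) -> str:
    """Lowercase and strip surrounding punctuation so 'Hello,' matches 'hello'."""
    return re.sub(r"^\W+|\W+$", "", token.lower())


def srt_tokens(srt_text: str) -> list:
    """Extract normalized word tokens from SRT text, skipping cue numbers and timing lines."""
    tokens = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        # Drop everything up to and including the timing line of this cue.
        for i, line in enumerate(lines):
            if "-->" in line:
                lines = lines[i + 1:]
                break
        for line in lines:
            tokens.extend(t for t in (normalize(w) for w in line.split()) if t)
    return tokens


def missing_words(words: list, srt_text: str) -> list:
    """Return JSON word entries that have no remaining match in the SRT text."""
    remaining = Counter(srt_tokens(srt_text))
    missing = []
    for w in words:
        if w.get("type", "word") != "word":
            continue  # skip spacing and audio-event entries
        token = normalize(w["text"])
        if not token:
            continue
        if remaining[token] > 0:
            remaining[token] -= 1
        else:
            missing.append(w)
    return missing
```

If `missing_words` returns entries that carry timestamps, checking whether they cluster around particular times (for example long pauses or cue boundaries) can help narrow down where the SRT export drops text.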

Notes and references:

  • Scribe v1 provides precise word‑level timestamps, diarization, and supports 99 languages (useful for building captions from the words array if needed). (elevenlabs.io)
  • ElevenLabs STT supports exporting subtitles, including SRT and VTT. (elevenlabs.io)
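
If the exported SRT keeps dropping words, one workaround is to build the captions yourself from the words array. The sketch below is one possible approach, not the API's own SRT logic. It assumes each word entry has `text`, `start`, and `end` (in seconds) plus an optional `type` field. The function names and the `max_chars` and `max_duration` limits are hypothetical choices for illustration.

```python
def format_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def words_to_srt(words: list, max_chars: int = 42, max_duration: float = 5.0) -> str:
    """Group word entries into cues and render them as SRT text.

    Every entry of type "word" ends up in exactly one cue, so no words are dropped.
    """
    cues = []     # each cue is a list of word dicts
    current = []  # words in the cue being built
    for w in words:
        if w.get("type", "word") != "word":
            continue  # skip spacing and audio-event entries
        if current:
            text_len = len(" ".join(x["text"] for x in current)) + 1 + len(w["text"])
            too_long = text_len > max_chars
            too_slow = w["end"] - current[0]["start"] > max_duration
            if too_long or too_slow:
                cues.append(current)
                current = []
        current.append(w)
    if current:
        cues.append(current)

    blocks = []
    for i, cue in enumerate(cues, start=1):
        start = format_timestamp(cue[0]["start"])
        end = format_timestamp(cue[-1]["end"])
        text = " ".join(x["text"] for x in cue)
        blocks.append(f"{i}\n{start} --> {end}\n{text}\n")
    return "\n".join(blocks)
```

A new cue starts whenever adding the next word would exceed the character limit or stretch the cue past the duration limit, so tuning those two values trades caption length against readability.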

If you can share whether the words are present in the JSON but absent in SRT, plus an example transcription_id, I can suggest exact next steps based on the API’s fields. (elevenlabs.io)

Sources:
Speech to Text | ElevenLabs Documentation
Create transcript | ElevenLabs Documentation
Convert Voice to Text with AI — Fast Voice to Text Converter
Get transcript | ElevenLabs Documentation

random jacinth

The missing words appear in the JSON transcript, but not in the additional_formats SRT. We are not using multichannel and are not setting a language. Here is an example transcript: VFW5eqK6WeAam9FCx7km