#Despite using STT with languageCode set to 'ja' for Japanese transcription, I sometimes receive tra
AI-generated Answer for ttiger4287
Short answer: This is expected when Scribe v2 detects multiple languages. Setting languageCode to ja does not hard-lock output to Japanese; Scribe v2 automatically detects and transcribes whatever language it hears, and Scribe v2 Realtime can switch languages mid-stream.
What the docs say
- Scribe v2 provides automatic multi‑language transcription; a single file can contain multiple languages and the model will detect and transcribe them accordingly. (elevenlabs.io)
- Batch STT (POST /v1/speech-to-text) accepts an optional language_code (ISO‑639‑1 or ‑3, e.g., ja or jpn). The same page documents temperature, but nowhere states that language_code forces single-language output. (elevenlabs.io)
- Realtime STT exposes language_code as an optional query param and supports automatic language detection; you can also request the detected language in each committed_transcript_with_timestamps event. Realtime additionally supports “text conditioning” via the previous_text field (which can bias the next output). (elevenlabs.io)
- Product guide: you may select a primary language, or leave Detect on; “any languages within the audio will be automatically detected.” (elevenlabs.io)
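As a rough illustration of the batch parameters described above, the following sketch builds the form fields for a request. It is not taken from the docs: the host in STT_URL, the model_id value "scribe_v2", and the helper name build_batch_fields are all assumptions made for this example. Only the path, language_code, and temperature come from the answer.

```python
from typing import Dict

# The path comes from the answer above; the host is an assumption.
STT_URL = "https://api.elevenlabs.io/v1/speech-to-text"


def build_batch_fields(language_code: str = "ja",
                       temperature: float = 0.0,
                       model_id: str = "scribe_v2") -> Dict[str, str]:
    """Return form fields for a batch transcription request (hypothetical helper)."""
    code = language_code.strip().lower()
    # The answer says ISO-639-1 or ISO-639-3, i.e. two or three ASCII letters.
    if not (code.isascii() and code.isalpha() and len(code) in (2, 3)):
        raise ValueError(f"expected an ISO-639-1/-3 code, got {language_code!r}")
    return {
        "model_id": model_id,             # assumed identifier, check the API reference
        "language_code": code,            # a hint to the model, not a hard lock
        "temperature": str(temperature),  # 0 for more deterministic output
    }


fields = build_batch_fields("ja")
```

The resulting dict would be sent as multipart form data together with the audio file and an API key header; those parts are left out here because they need credentials and network access.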
How to keep output Japanese-only (workarounds)
- Batch: Pass language_code=ja (or jpn) on POST /v1/speech-to-text, and keep temperature at 0 if you want more deterministic output. The language code is a hint to the model; it does not disable detection of other languages present in the audio. (elevenlabs.io)
- Realtime:
  - Set language_code=ja in the websocket query. Enable include_language_detection=true and drop any committed_transcript_with_timestamps segments whose language_code isn’t ja. (elevenlabs.io)
  - Avoid sending previous_text in English/Italian; this conditions the model and can steer output toward that language. (elevenlabs.io)
- If your recordings actually contain English/Italian utterances (intros, code‑switching, background speech), Scribe v2 will transcribe those as such by design. Remove those sections or segment audio before transcription. (elevenlabs.io)
- If certain Japanese terms are being misrecognized into English/Italian, add them as keyterms to bias recognition. (elevenlabs.io)
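The realtime workaround above (drop committed segments whose detected language is not Japanese) could look roughly like this on the client side. This is a minimal sketch that assumes each committed event has already been parsed into a dict with "language_code" and "text" keys; those key names and the helper keep_japanese are illustrative assumptions, so check them against the actual event payload.

```python
from typing import Any, Iterable, List, Mapping

# Accept both the ISO-639-1 and ISO-639-3 spellings mentioned in the answer.
JAPANESE_CODES = {"ja", "jpn"}


def keep_japanese(events: Iterable[Mapping[str, Any]]) -> List[str]:
    """Return the text of committed segments detected as Japanese.

    Events without a language_code are dropped as well, because there is
    no way to confirm their language from the payload alone.
    """
    kept: List[str] = []
    for event in events:
        code = str(event.get("language_code", "")).lower()
        if code in JAPANESE_CODES:
            kept.append(str(event.get("text", "")))
    return kept


sample_events = [
    {"language_code": "ja", "text": "こんにちは"},
    {"language_code": "en", "text": "hello"},
    {"language_code": "jpn", "text": "ありがとう"},
    {"text": "segment with no detected language"},
]
japanese_only = keep_japanese(sample_events)
```

Dropping segments loses whatever was said in them, so this only suits cases where non-Japanese output is known to be a misdetection rather than real speech in another language.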
Limitations
- There is no documented “force single language” or “no language switching” flag beyond providing language_code; automatic detection remains active. (elevenlabs.io)
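Since no such flag is documented, one possible client-side fallback is to check whether a returned transcript is written mostly in Japanese script and to flag or retry it otherwise. This is not something the docs describe; it is a heuristic sketch, and the helper names and the 0.5 threshold are arbitrary assumptions. It also cannot tell Japanese kanji from Chinese text, since both use the same CJK block.

```python
def is_japanese_char(ch: str) -> bool:
    """True for Hiragana, Katakana, or CJK Unified Ideographs."""
    cp = ord(ch)
    return (
        0x3040 <= cp <= 0x309F      # Hiragana
        or 0x30A0 <= cp <= 0x30FF   # Katakana
        or 0x4E00 <= cp <= 0x9FFF   # CJK Unified Ideographs (kanji)
    )


def japanese_ratio(text: str) -> float:
    """Share of alphabetic characters that are in a Japanese script block."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    return sum(is_japanese_char(ch) for ch in letters) / len(letters)


def looks_japanese(text: str, threshold: float = 0.5) -> bool:
    """Heuristic check; the threshold is an arbitrary assumption."""
    return japanese_ratio(text) >= threshold
```

A transcript that fails looks_japanese could be logged and retried, or sent for manual review, rather than silently discarded.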
References
- Transcription overview and features (Scribe v2) (elevenlabs.io)
- Scribe v2 blog: automatic multi‑language transcription, keyterms, entity detection (elevenlabs.io)
- Batch STT API (POST /v1/speech
What's the best way to force the language code, or to come as close to forcing it as possible?