# Japanese TTS Issue: Initial Kanji misidentified as Chinese (Context Bias at Sentence Start)


junior scarab

I’ve identified a specific pattern regarding the mispronunciation of Japanese Kanji. When a sentence begins with a word that exists in both Chinese and Japanese (e.g., "毎日", "京都"), the model defaults to a Chinese-influenced pronunciation for that initial word, even when the rest of the sentence is clearly Japanese.

Key Observation:
The issue seems to be triggered by the position of the word. If these Kanji appear at the very beginning of the input string, the language identification or phonetic mapping defaults to Chinese.

Examples:

Case 1: 毎日、スマホの画面を見つめて...

Result: "毎日" is pronounced as "Měirì" (Chinese) instead of "Mainichi" (Japanese).

Case 2: 京都に行きたい。

Result: "京都" is pronounced as "Jīngdū" (Chinese) instead of "Kyōto" (Japanese).

Technical Hypothesis:
The model may not have enough look-ahead context at the start of the sequence to distinguish between CJK languages for shared Hanzi/Kanji characters, so the first word may get resolved as Chinese before any kana has been seen.
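To make the trigger condition concrete, here is a rough Python sketch that flags inputs starting with Han characters before any kana appears. It only inspects the text and knows nothing about how the model actually does language identification; the function names are made up for illustration, and the Unicode ranges cover just the basic CJK block, Extension A, and the hiragana/katakana blocks.

```python
def is_kana(ch: str) -> bool:
    # Hiragana (U+3040-U+309F) and Katakana (U+30A0-U+30FF) blocks.
    return "\u3040" <= ch <= "\u30ff"


def is_han(ch: str) -> bool:
    # CJK Unified Ideographs (U+4E00-U+9FFF) and Extension A (U+3400-U+4DBF).
    return "\u4e00" <= ch <= "\u9fff" or "\u3400" <= ch <= "\u4dbf"


def leading_han_run(text: str) -> str:
    """Return the run of Han characters at the very start of text (may be empty)."""
    run = []
    for ch in text.lstrip():
        if not is_han(ch):
            break
        run.append(ch)
    return "".join(run)


def first_kana_index(text: str) -> int:
    """Return the index of the first kana character, or -1 if there is none."""
    for i, ch in enumerate(text):
        if is_kana(ch):
            return i
    return -1


def starts_ambiguous(text: str) -> bool:
    """True if the input opens with Han characters, i.e. before any kana."""
    return bool(leading_han_run(text))
```

Both cases above would be flagged by `starts_ambiguous`, while a sentence that opens with kana would not.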

Environment:

Model: [s2-pro]

Language: Japanese (JA)

primal rain

Can't reproduce this. It looks like an issue with your reference audio.


You can also add a control tag to improve stability.


Like this: [japanese] 毎日、スマホの画面を見つめて
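If you are sending text programmatically, prepending the tag could be sketched like this in Python. The `[japanese]` syntax is copied from the example above; the helper name `with_language_tag` is hypothetical and is not part of any model SDK, and whether other tag names are accepted is an assumption you would need to check against the model's documentation.

```python
def with_language_tag(text: str, tag: str = "japanese") -> str:
    """Prepend a control tag such as "[japanese]" unless it is already there.

    The tag format mirrors the example in this thread; it is not verified
    against the model's documentation.
    """
    prefix = f"[{tag}]"
    stripped = text.lstrip()
    # Avoid stacking the same tag twice if the caller already added it.
    if stripped.lower().startswith(prefix.lower()):
        return stripped
    return f"{prefix} {stripped}"


if __name__ == "__main__":
    print(with_language_tag("毎日、スマホの画面を見つめて"))
```

The helper leaves already-tagged input unchanged, so it should be safe to apply to every request.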