I made a request like this:
const response = await fetch("https://api.fish.audio/v1/tts", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${process.env.FISH_API_KEY}`,
    "Content-Type": "application/json",
    "model": "s1",
  },
  body: JSON.stringify({
    text: "かえす",
    format: "wav",
    reference_id: voice,
  }),
});
I expected to hear かえす (ka-e-su) spoken once.
The actual result is a 47-second audio clip that repeats the word over and over (WAV attached).
The bug occurs in only about 5% of generations. The voice ID is "63bc41e652214372b15d9416a30a60b4".
Is this a known issue? Are there any known ways to ensure quality or avoid it?
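
As a stopgap, I'm considering a client-side guard that measures the returned clip and retries when it is implausibly long. This is only a sketch, not a Fish Audio feature: `wavDurationSeconds`, `generateWithDurationGuard`, the `generate` callback, and the duration threshold are hypothetical names and values of my own, and it assumes the response body is an uncompressed PCM WAV.

```javascript
// Duration in seconds of a PCM WAV held in a Uint8Array (or Node Buffer).
function wavDurationSeconds(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const tag = (off) =>
    String.fromCharCode(bytes[off], bytes[off + 1], bytes[off + 2], bytes[off + 3]);
  if (bytes.byteLength < 12 || tag(0) !== "RIFF" || tag(8) !== "WAVE") {
    throw new Error("not a RIFF/WAVE file");
  }
  let byteRate = 0;
  let offset = 12; // first chunk starts after "RIFF", size, "WAVE"
  while (offset + 8 <= bytes.byteLength) {
    const id = tag(offset);
    const size = view.getUint32(offset + 4, true);
    const body = offset + 8;
    if (id === "fmt ") {
      // byteRate sits 8 bytes into the fmt chunk body
      byteRate = view.getUint32(body + 8, true);
    } else if (id === "data") {
      if (byteRate === 0) throw new Error("fmt chunk missing before data");
      const available = bytes.byteLength - body;
      // Streamed WAVs may declare a placeholder size (0 or 0xFFFFFFFF),
      // so never trust more than the bytes actually present.
      const dataBytes = Math.min(size, available) || available;
      return dataBytes / byteRate;
    }
    offset = body + size + (size % 2); // chunks are padded to even length
  }
  throw new Error("data chunk not found");
}

// Calls generate() until it yields a clip no longer than maxSeconds.
// generate is a hypothetical callback that resolves to the WAV bytes.
async function generateWithDurationGuard(generate, maxSeconds, maxAttempts = 3) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const bytes = await generate();
    if (wavDurationSeconds(bytes) <= maxSeconds) return bytes;
  }
  throw new Error(`no clip under ${maxSeconds}s after ${maxAttempts} attempts`);
}
```

Here `generate` would wrap the request above and return `new Uint8Array(await response.arrayBuffer())`. For a three-mora word like かえす, a limit of a few seconds seems reasonable, but that number is a guess on my part. This only hides the symptom and costs extra generations, so I would still like to know the root cause.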