#Wendigos Voice Cloning
Hello I saw a video on this
https://www.youtube.com/watch?v=dQ841Pd6YvQ
It seems to be a free alternative to Eleven Labs, I think
I have no idea if it's as good, or actually free
Just something to look into
Here's the Qwen3-TTS Demo app I showed in the video: https://huggingface.co/spaces/Qwen/Qwen3-TTS
It's only a matter of time until there's an open UI for it that beats Eleven Labs, all for free.
Read about the time someone cloned my voice for a video training series, unauthorized: https://www.jeffgeerling.com/blog/2024/elecrow-responded-apolo...
Yes I am aware of Qwen3-TTS and am working on adding support for it in VoiceBox! The challenge is the lack of streaming input/output support with current server implementations https://github.com/vllm-project/vllm-omni/issues/938
Once a fully fleshed-out server is released with streaming support it will definitely be added
Bet, ty for the response. So once it's added, will it be free to use?
Yep! You'll need a decent GPU and will need to set up your own TTS server locally, but once that's done it'll be free
This looks very interesting. I figure it's possible to make it work in any language..?
Yep! The smart clip playback works in any language and the voice cloning works with 32 languages https://help.elevenlabs.io/hc/en-us/articles/13313366263441-What-languages-do-you-support
Nice. I'm considering this for a modpack I'm preparing for a big local YT channel/streamer group. Reading the readme I'm not entirely sure but it sounds like everything can be prepared in advance since all players in the lobby share the same API keys for the services via config, correct?
Yes that's correct! If you want to use the realtime features I recommend getting the creator plan on elevenlabs to get allotted more simultaneous connections.
That plan also has more Speech-to-Text time than Azure does since I could optimize it more. I also recommend having the host enable "config sync" and syncing settings that way
hi, I'm currently trying to set this up. I want to use realtime responses and I'm attempting to use ChatGPT for chat (gpt-4.1-mini) and Elevenlabs for STT and TTS. I've set the API keys in all three locations, adjusted the language code for elevenlabs (2 char), predefined a voice and entered its voice id. When I'm in-game mimics won't talk at all. Is it because I'm attempting this in a solo lobby?
also is the chat model fine or are others much better?
can you elaborate on that config sync part you mentioned? is it only syncing the api keys?
because the voice id needs to be different for each player, right?
Hey! Can you grab your log file from {Game Folder}/BepInEx/LogOutput.txt so I can take a look?
For sure! The feature syncs api keys and settings (realtime enabled, model selection, etc), but voice IDs can either be set per user or will automatically populate if the voice is autocloned.
Sure, still tinkering around. I'll grab the next one. Would just the lines starting with [Wendigo.. ] be sufficient as well?
I recommend Gemini flash as I found it to be faster than gpt mini but any model optimized for speed should be good.
Yes those and any errors if you're getting them!
Interesting, seems like I'm getting the same issue on my end. I wonder if v73 changed something
I'll be looking into it this evening!
ok, I'll spare you my log then.
The STT and Chat are working normally but for some reason the TTS isn't being played
what I'm also seeing, but this might be some incompatibility, is masked are spawning directly inside of my player character. Disabling only wendigos and nothing else resolves this.
I also checked whether it's the lostenemyfix mod, which places enemies on a nearby navmesh when they would otherwise spam errors for not finding a valid one, but that wasn't it.
Ah that may be a debugging function I was using to teleport masked to me, I may have missed commenting it out somewhere
Just for clarification: any of the chat service providers requires setting up api billing, correct?
I wanted to prepare gemini to try it, made a google cloud account where it said try for 90 days / 500$ but it won't let me set up a billing profile for that account for some reason. Or rather it won't let me connect the billing profile I created to the account. I'll just give it some time ig
ok so I resubscribed to elevenlabs and I can hear the masked now. Do you have an active subscription?
Yea, I bought the 5$ tier today
Oh you can go to https://aistudio.google.com/ and get an api key from there, no need for google cloud
That's really odd then, looks like I will need your log file!
that's where I created the api key but I do have to set up billing there as well, no?
Oh I see, not sure what the issue is with that. For testing you should still be able to use an api key on the free tier
Were you speaking during that session? I don't see any STT detections. Is VoiceMeeter Output (VB-Audio VoiceMeeter VAIO) the correct input device? @rough fable
That is the correct one yea. I also have some mod that displays the voice activation icon for the normal game voice stuff
let me make sure I'm speaking. I'm restarting continuously so maybe I didn't say anything there.
In the console after you speak you should see [Wendigos STT]: RECOGNIZED:
It looks like the STT is initializing properly so I suspect it can't hear you for some reason if that isn't coming up
can't get that nope. also cycled through my input devices in the game settings to make sure it's correct and the level is fine.
this looks sus tho..?
[Wendigos Log] Clips count: 0
[Debug :GeneralImprovements] Updated time display.
[Debug :GeneralImprovements] Updated time display.
[Debug :GeneralImprovements] Updated time display.
[Debug :GeneralImprovements] Updated time display.
[Wendigos Log] Clips count: 0
[Debug :GeneralImprovements] Updated time display.
Saving changed settings
[AI Manager] Starting speech recognition.
[Wendigos Log] Set to VoiceMeeter Output (VB-Audio VoiceMeeter VAIO)
[Info :LethalPerformance] Saved 1 save(s)
[Debug :GeneralImprovements] Updated time display.
Connected to ElevenLabs Scribe (NAudio).
Device 'VoiceMeeter Output (VB-Audio VoiceMeeter VAIO)' not found. Defaulting to device 0.
[Debug :GeneralImprovements] Updated time display.
[Debug :GeneralImprovements] Updated time display.
[Debug :GeneralImprovements] Updated time display.
[Debug :GeneralImprovements] Updated time display.
[Debug :GeneralImprovements] Updated time display.
[Wendigos Log] Clips count: 0
Doesn't seem to be an issue with VoiceMeeter and the virtual audio stuff either. Tried selecting my usb mic directly.
got this again though
Connected to ElevenLabs Scribe (NAudio).
Device 'Microphone (SC440 USB Microphone)' not found. Defaulting to device 0.
Interesting, it looks like the speech sdk I'm using isn't recognizing the device identifier for some reason. Can you run the game and use the shortcut SHIFT + V + B to open the voicebox GUI and tell me what input devices are listed there? @rough fable
the recording in that gui also works fwiw
hmm so there's likely an issue with mapping the device name to an NAudio device index (for the actual STT). If you're cool with it, I can provide an updated VoiceBoxModLib.Core.dll that has a potential fix!
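The "not found. Defaulting to device 0" lines above suggest a name-to-index lookup that fell through. Here is a minimal Python sketch of a more forgiving lookup, assuming a hypothetical `find_device_index` helper and a plain list of device names. VoiceBox itself is a C# library and its actual fix isn't shown in this thread. One known quirk of the legacy Windows waveIn API is that it reports device names truncated to 31 characters, so the sketch also accepts a truncated match; whether that was the cause here is only a guess.

```python
from typing import Optional, Sequence

# Legacy waveIn device names are capped at 31 characters.
WINMM_NAME_LIMIT = 31


def find_device_index(wanted: str, available: Sequence[str]) -> Optional[int]:
    """Return the index of `wanted` in `available`, or None if nothing matches.

    Hypothetical helper: matching is case-insensitive and ignores
    surrounding whitespace.
    """
    wanted_norm = wanted.strip().casefold()
    names = [name.strip().casefold() for name in available]

    # 1) Exact match on the full name.
    if wanted_norm in names:
        return names.index(wanted_norm)

    # 2) Match a name the API may have cut off at the length limit.
    truncated = wanted_norm[:WINMM_NAME_LIMIT].rstrip()
    for index, name in enumerate(names):
        if name and name == truncated:
            return index

    # No match: let the caller decide whether to fall back to device 0.
    return None
```

Returning `None` instead of silently picking device 0 lets the caller log a clear warning before falling back.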
Actually I just went ahead and published VoiceBox v0.3.5 to thunderstore! You should be able to update it in about an hour. Try updating and let me know if it's fixed!
Thanks Tim! I'll try when I get back home.
Hmm, still can't get it to work and not seeing that [RECOGNIZED] you mentioned. Does this look as expected?
[Wendigos Log] Created GUI Manager
[Wendigos Log] Chat Manager Object is: null
[Wendigos Log] Clearing chared masked dict
[Wendigos Log] STT MANAGER IS: _AIManager (UnityEngine.GameObject)
[AI Manager] Starting speech recognition.
[Wendigos Chat] Creating chat manager object. Disregard "Service config is null" errors.
[ServiceFactory] Chat service config is null. No chat service will be created.
[Wendigos Chat] Created Chat manager.
[Wendigos Log] CLIENT IDS: 0 xAthrNtCydF0CrAmyO2f
When I created the ElevenLabs API key I restricted it to TTS and STT endpoints. Is that fine? Or is the problem that it's not even capturing any clips for me -> the only recurring log I see during rounds: [Wendigos Log] Clips count: 0?
All that is normal and your Elevenlabs setup is fine. That message displays whenever a masked tries to play an idle clip when there aren't any. Seems like the input device still isn't being detected. Do you get the same "defaulting to device 0" error?
Also, are you on Windows or linux?
No, the device 0 error is gone. I'm on Windows
I just tried a less bloated profile. Will send log
Interesting, so my fix did resolve the NAudio device name issue at least.
Sounds good!
I don't see anything out of the ordinary at first glance... a few more debugging steps we can try:
- Try using your microphone directly instead of voicemeeter
- If you're up for it you can try setting up an Azure speech service (free tier) and trying that for STT. There is a guide on the mod page that walks through how to set one up!
- tried it all even the wave device (speaker out looped back as input)
- Unfortunately it looks like I can't. Microslop refuses to let me create a new account after I tried logging in with an old hotmail account where it went on a redirection spiral. I guess I've tripped their heuristics now.
Even tried on my phone and mobile network.. I love microsoft
ok finally got azure set up. worked immediately.
Ok that's super weird then! Once I'm home I'll look into what's up with the Elevenlabs STT.
If you want, you can install the mod asyncloggers which shows logs from async services like the STT backend. That might illuminate what's up with Elevenlabs
I have that installed. How can I view the logs?
Ah I see, they're just in the same log so I guess no new info there
One more debugging step before I add a bunch of logging to a custom voicebox dll, can you try Elevenlabs STT with an unrestricted api key and the language code set as eng?
sure
nope, doesn't recognize my speech anymore.
You sure it's eng not en? https://en.wikipedia.org/wiki/List_of_ISO_639_language_codes
ISO 639 is a standardized nomenclature used to classify languages. Each language is assigned a two-letter (set 1) and three-letter lowercase abbreviation (sets 2–5). Part 1 of the standard, ISO 639-1, defines the two-letter codes, and Part 3 (2007), ISO 639-3, defines the three-letter codes, aiming to cover all known natural languages, largely...
I was using de before, not deu or ger
both should work, I've been using eng so I was double checking
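Since the thread mixes two-letter and three-letter codes, here is a small Python sketch of normalizing them to one form. The `to_two_letter` helper and its table are hypothetical and cover only the codes mentioned above; nothing here reflects how the mod or ElevenLabs handle codes internally.

```python
# ISO 639-1 two-letter codes keyed by their three-letter equivalents.
# Only the codes mentioned in this thread are listed; a real table is much larger.
THREE_TO_TWO = {
    "eng": "en",  # English (ISO 639-2 and 639-3)
    "deu": "de",  # German (ISO 639-2/T and 639-3)
    "ger": "de",  # German (ISO 639-2/B)
}


def to_two_letter(code: str) -> str:
    """Normalize a language code to its two-letter form (hypothetical helper)."""
    normalized = code.strip().lower()
    if len(normalized) == 2:
        # Already two letters; passed through without further validation.
        return normalized
    try:
        return THREE_TO_TWO[normalized]
    except KeyError:
        raise ValueError(f"unknown language code: {code!r}") from None
```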
ok
Time to write up some logging haha
Alright here's a modified VoiceBox dll with super verbose Elevenlabs STT logging. Place it in {game folder or mod profile folder}/BepInEx/plugins/Tim_Shaw-VoiceBox/ and replace the old VoiceBoxModLib.Core.dll
hmm, I should be getting spammed with new logs ig but I don't see anything different.
Here
It did show the issue tho! Raw response: {"message_type":"invalid_request","error":"Invalid vad_threshold: '0,4'. Must be a number between 0.1 and 0.9"}
now I just have to figure out why on earth that's a comma and not a period
ohhhh are you european by chance? There might be some localization going on with the decimal being converted to a string!
That is crazy that that was the issue, how does that even happen lmaoooo
Fixing rn and will publish VoiceBox v0.3.6 soon!
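The `'0,4'` in that error is what you get when a float is turned into text using the user's regional settings. Below is a minimal Python sketch of the failure and the usual remedy. All three helper names are hypothetical, and `validate_vad_threshold` only imitates the server-side check quoted in the error message. VoiceBox is a .NET library, where the common approach is to format with `CultureInfo.InvariantCulture`; the thread doesn't show the actual fix.

```python
def format_localized(value: float, decimal_separator: str) -> str:
    """Mimic culture-aware formatting, e.g. ',' for German-style locales."""
    return f"{value:.1f}".replace(".", decimal_separator)


def format_invariant(value: float) -> str:
    """Always use '.' as the decimal separator, whatever the user's locale is."""
    return f"{value:.1f}"


def validate_vad_threshold(raw: str) -> float:
    """Stand-in for the server-side check quoted in the error message."""
    message = f"Invalid vad_threshold: '{raw}'. Must be a number between 0.1 and 0.9"
    try:
        number = float(raw)  # float() only accepts '.' as the separator
    except ValueError:
        raise ValueError(message) from None
    if not 0.1 <= number <= 0.9:
        raise ValueError(message)
    return number
```

With these helpers, `format_localized(0.4, ",")` yields the rejected `0,4`, while `format_invariant(0.4)` yields `0.4`, which passes the check on any machine.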
@rough fable can you test to see if this dll works?
yup, works!
Sweet! I'll publish the updated version shortly. Is it cool if I credit you in the changelog/readme for helping fix this issue? and if so what @ should I use to tag you?
yea sure. @rough fable is fine ig
interesting. azure struggled a lot when I was using any English terms mid sentence or even entire English sentences but ElevenLabs just translates it instantly lol
Published updates for VoiceBox and Wendigos! Should be available in about an hour
Can we get an update that adds a config to restore the mimics back to their original form? with the mask and the arms out animation?
Hmm, any idea why sometimes I can't hear their voices?
[ElevenLabs STT] Raw response: {"message_type":"partial_transcript","text":"What's up?"}
[Info :NaturalSelection] Missing data container for (Urchin|ID: 365). Creating new data container...
[Debug :NaturalSelection] (Urchin|ID: 365) Final size: Small
[Info :NaturalSelection] Missing data container for (Urchin|ID: 366). Creating new data container...
[Debug :NaturalSelection] (Urchin|ID: 366) Final size: Small
[ElevenLabs STT] Raw response: {"message_type":"committed_transcript","text":"What's up?"}
[ElevenLabs STT] Raw response: {"message_type":"committed_transcript_with_timestamps","text":"What's up?","language_code":null,"words":[{"text":"What's","start":23.179,"end":23.439,"type":"word","speaker_id":null,"logprob":-0.127381960550944,"characters":[{"text":"W","start":23.179,"end":23.199},{"text":"h","start":23.199,"end":23.219},{"text":"a","start":23.219,"end":23.239},{"text":"t","start":23.239,"end":23.359},{"text":"'","start":23.359,"end":23.359},{"text":"s","start":23.379,"end":23.439}]},{"text":" ","start":23.439,"end":23.519,"type":"spacing","speaker_id":null,"logprob":-0.00238037109375,"characters":[{"text":" ","start":23.439,"end":23.519}]},{"text":"up?","start":23.519,"end":23.699,"type":"word","speaker_id":null,"logprob":-0.10380299886067708,"characters":[{"text":"u","start":23.519,"end":23.639},{"text":"p","start":23.639,"end":23.699},{"text":"?","start":23.699,"end":23.699}]}]}
[Wendigos STT] RECOGNIZED: What's up?
Added clip successfully.
[Wendigos Log] COUNT: 5
[Wendigos Log] Masked dist is: 232,9006
[Wendigos Log] Masked dist is: 233,5887
[Wendigos Log] Masked dist is: 65,12283
[Wendigos Log] Masked dist is: 22,0806
[Wendigos Log] Masked dist is: 8,81206
Yo : VBGbA9UvwZJjAc14FkCX
Streaming is already in progress.
: VBGbA9UvwZJjAc14FkCX
Streaming is already in progress.
: VBGbA9UvwZJjAc14FkCX
I think it's whenever it's printing that warning 'Streaming is already in progress'.
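For anyone reading these raw STT lines: each one is a JSON object with a `message_type` field. A minimal Python sketch of picking out the final transcript, assuming a hypothetical `committed_text` helper. The message type names are copied from the log above; which of them the mod actually acts on is not stated here, so treating only `committed_transcript` as final is an assumption.

```python
import json
from typing import Optional


def committed_text(raw_response: str) -> Optional[str]:
    """Return the transcript of a committed message, or None for anything else."""
    try:
        message = json.loads(raw_response)
    except json.JSONDecodeError:
        return None
    if not isinstance(message, dict):
        return None
    # Assumption: partial and timestamped variants are skipped so that each
    # utterance is handled exactly once.
    if message.get("message_type") != "committed_transcript":
        return None
    text = message.get("text")
    return text if isinstance(text, str) and text.strip() else None
```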
Each masked is assigned an exclusive user to mimic so only 1 masked will speak.
Or is that masked speaking only sometimes?
Hmm, yes at first he was only occasionally responding. Then I tried a few more rounds and couldn't get any voice out of him despite it printing the responses in console.
That was in the debug profile with less masked spawning. Oh, actually I should still have that log too.
[23:36:11.8313239] [Info : Unity Log] [ElevenLabs STT] Raw response: {"message_type":"committed_transcript","text":"Test, Test,"}
[23:36:11.8894070] [Info : Unity Log] [ElevenLabs STT] Raw response: {"message_type":"committed_transcript_with_timestamps","text":"Test, Test,","language_code":null,"words":[{"text":"Test,","start":25.471,"end":25.971,"type":"word","speaker_id":null,"logprob":-0.560723876953125,"characters":[{"text":"T","start":25.471,"end":25.551},{"text":"e","start":25.551,"end":25.711},{"text":"s","start":25.711,"end":25.791},{"text":"t","start":25.791,"end":25.971},{"text":",","start":25.971,"end":25.971}]},{"text":" ","start":25.971,"end":26.251,"type":"spacing","speaker_id":null,"logprob":-0.0086669921875,"characters":[{"text":" ","start":25.971,"end":26.251}]},{"text":"Test,","start":26.251,"end":26.831,"type":"word","speaker_id":null,"logprob":-1.0394973754882812,"characters":[{"text":"T","start":26.251,"end":26.352},{"text":"e","start":26.352,"end":26.431},{"text":"s","start":26.431,"end":26.651},{"text":"t","start":26.651,"end":26.831},{"text":",","start":26.831,"end":26.831}]}]}
[23:36:11.8924078] [Info : Console] [Wendigos STT] RECOGNIZED: Test, Test,
[23:36:11.9001272] [Info : Console] Added clip successfully.
[23:36:11.9014081] [Info : Console] [Wendigos Log] COUNT: 2
[23:36:11.9014081] [Info : Console] [Wendigos Log] Masked dist is: 2,908579
[23:36:11.9014081] [Info : Console] [Wendigos Log] Masked dist is: 2,314281
[23:36:12.0412957] [Info : Unity Log] Received set ship lights RPC. Lights on?: False
[23:36:12.4092602] [Info : Console] Hier. : xAthrNtCydF0CrAmyO2f
[23:36:12.4092602] [Warning: Unity Log] Streaming is already in progress.
[23:36:12.4092602] [Info : Console] Hört mich? : xAthrNtCydF0CrAmyO2f
[23:36:12.4103907] [Warning: Unity Log] Streaming is already in progress.
[23:36:12.4103907] [Info : Console] : xAthrNtCydF0CrAmyO2f
So here both masked were right next to me. I should have heard something I assume.
I made some tweaks to the websocket logic, try this!
I'll have to try tomorrow. Thank you!
Thanks for the reminder, I'll add that in the next wendigos build once I fix the TTS websocket issue!
Tried it on my smaller mod profile. There was one mimic.
I was constantly talking. At the very start the mimic was playing a few random clips (original recordings), some of which abruptly cut out.
Other than that I couldn't hear any of the ai chat responses that were printed to console/log.
Here's the log
Now, what's weird to me is that the ElevenLabs TTS was working before when I was using Azure for STT, right? Let me check that again.
Yeah, no.. Exact same thing with Azure STT. Maybe the changes that fixed the regional serialization broke the TTS stuff in some way?
Sometimes the word detections are off so cutoff on clips can happen
I'm not sure what's going on because on my end everything is working fine, I even did a few rounds and spawned masked and the masked that is mimicking me always responds
@rough fable here is an updated VoiceBox build with verbose logging on the Elevenlabs TTS side
I'll give it a spin in a bit!
Ohhhh, I'm just an idiot. Debug logs made me realize I only updated one of the ElevenLabs API keys in this profile. Sorry for the confusion :/
It's all good now. Thanks again!
Great to hear it's working!
Pushed Wendigos v2.0.4 with that addition!
When I quit the game to main menu and load back in, in the new round I can't hear mask voices and get this:
[Wendigos STT] RECOGNIZED: Ich geh schon mal rein.
Added clip successfully.
[Wendigos Log] COUNT: 1
[Wendigos Log] Masked dist is: 6,949234
Noise heard relative loudness: 0,05536064
Noise heard relative loudness: 0,04317421
System.AggregateException: One or more errors occurred. (A task was canceled.) ---> System.Threading.Tasks.TaskCanceledException: A task was canceled.
--- End of inner exception stack trace ---
at System.Threading.Tasks.Task.ThrowIfExceptional (System.Boolean includeTaskCanceledExceptions) [0x00011] in <1071a2cb0cb3433aae80a793c277a048>:IL_0011
at System.Threading.Tasks.Task.Wait (System.Int32 millisecondsTimeout, System.Threading.CancellationToken cancellationToken) [0x00043] in <1071a2cb0cb3433aae80a793c277a048>:IL_0043
at System.Threading.Tasks.Task.Wait () [0x00000] in <1071a2cb0cb3433aae80a793c277a048>:IL_0000
at TimShaw.VoiceBox.Core.ElevenLabsTTSServiceManager.InitWebsocket (System.Net.WebSockets.ClientWebSocket webSocket, TimShaw.VoiceBox.Components.StreamingAudioDecoder audioDecoder, System.Threading.CancellationToken token) [0x000a9] in <489e94a5e069496b9e62857fcfcf408c>:IL_00A9
at TimShaw.VoiceBox.Components.AudioStreamer.InitStreaming (TimShaw.VoiceBox.Core.ITextToSpeechService service, System.Threading.CancellationToken token) [0x00096] in <489e94a5e069496b9e62857fcfcf408c>:IL_0096
at TimShaw.VoiceBox.Components.TTSManager.RequestAudioAndStream (System.String promptChunk, System.Boolean isFinalSegment, TimShaw.VoiceBox.Components.AudioStreamer audioStreamer) [0x0000f] in <489e94a5e069496b9e62857fcfcf408c>:IL_000F
at Wendigos.ElevenLabs.StreamAudioChunk (System.String promptChunk, System.String voiceID, System.Boolean isFinalSegment, TimShaw.VoiceBox.Components.AudioStreamer audioStreamer) [0x0002e] in <3166c4302a5c4344ba6304c4122e45e3>:IL_002E
---> (Inner Exception #0) System.Threading.Tasks.TaskCanceledException: A task was canceled.<---
Awesome!
Ah I know what the issue is, I refactored how the TTSManagers are created and forgot to clear the dict when you leave the game
I'll publish shortly but here's the fix if you want to test!
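The general shape of that kind of bug, as a minimal Python sketch: a cache of per-voice managers that outlives the session it was built for. `TtsManager` and `TtsRegistry` are hypothetical stand-ins, not the mod's real classes, and the idea that the stale entries still held an already-cancelled token is an inference from the `TaskCanceledException` in the trace above.

```python
import threading
from typing import Dict


class TtsManager:
    """Hypothetical per-voice manager tied to one game session."""

    def __init__(self, voice_id: str) -> None:
        self.voice_id = voice_id
        self.cancelled = threading.Event()

    def shutdown(self) -> None:
        # Anything still streaming for this manager should stop now.
        self.cancelled.set()


class TtsRegistry:
    """Caches one manager per voice ID and drops them all on session end."""

    def __init__(self) -> None:
        self._managers: Dict[str, TtsManager] = {}

    def get_or_create(self, voice_id: str) -> TtsManager:
        manager = self._managers.get(voice_id)
        if manager is None:
            manager = TtsManager(voice_id)
            self._managers[voice_id] = manager
        return manager

    def on_leave_game(self) -> None:
        # Without the clear(), the next round would reuse cancelled managers.
        for manager in self._managers.values():
            manager.shutdown()
        self._managers.clear()
```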
when playing recorded clips, when / how likely is the mod to try to pick an "appropriate" response?
like if a masked hears you address it, will it always try to respond appropriately, or could it say something unrelated?
and how much context does it have? just your question, or its past responses to you?
also
I know local voice cloning was abandoned, but any chance we could use a local llm for smart clip selection?
That's already possible with ollama!
The mod always sends the clip transcriptions to the chat model. If there isn't a good response the chat model likely just picks arbitrarily
Currently the context is only what is said, not previous conversation. This is to reduce latency
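Roughly what that flow could look like, as a Python sketch under stated assumptions: `build_selection_prompt`, `select_clip`, and the prompt wording are all hypothetical, and `ask_model` stands in for whichever chat service is configured. Only the current utterance and the clip transcriptions are sent, with no conversation history, matching the description above. The random fallback for an unusable reply is an added assumption, not something the mod is confirmed to do.

```python
import random
from typing import Callable, Optional, Sequence


def build_selection_prompt(heard: str, clips: Sequence[str]) -> str:
    """List every clip transcription with an index the model can answer with."""
    numbered = "\n".join(f"{index}: {text}" for index, text in enumerate(clips))
    return (
        f'A player just said: "{heard}"\n'
        "Reply with only the number of the most fitting clip:\n"
        f"{numbered}"
    )


def select_clip(
    heard: str,
    clips: Sequence[str],
    ask_model: Callable[[str], str],
    rng: Optional[random.Random] = None,
) -> int:
    """Return the index of the clip to play in response to `heard`."""
    if not clips:
        raise ValueError("no clips recorded yet")
    reply = ask_model(build_selection_prompt(heard, clips)).strip()
    if reply.isdecimal() and int(reply) < len(clips):
        return int(reply)
    # Assumption: fall back to a random clip when the reply is unusable.
    chooser = rng if rng is not None else random
    return chooser.randrange(len(clips))
```

Keeping the prompt this small is one way to keep latency down, which is the stated reason for leaving out previous conversation.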
Also, I'm working on adding Qwen3-TTS support for local voice cloning but I haven't found a suitable local server that supports streaming input/output
Awesome!
makes sense, might be cool to have a chance of not asking the model and instead picking randomly (as though they didn't hear you or something)
makes sense. how bad do you think it'd be to add only the most recent response for example?
in the video clip after a masked said it was by a fire exit and was asked "what did you say?" it repeated itself then elaborated. I guess that was random / repetition but it sold the illusion crazy well, I think that'd be a neat feature if it wouldn't impact it too bad
oh nice, that'd be interesting
and great for my privacy-minded friends lol
Been goofing around with this and had some interesting conversations already lol
the German TTS is a little robotic sometimes. Is there a noticeable difference between voices created on the lowest ElevenLabs tier vs the creator tier? Do you know, Tim?
Also tried some prompt engineering to improve the responses but somehow it always ends up with the AI mocking me and calling me names xD
Now, if I could request one feature, well.. more of an enhancement, that'd be a config setting that maps voice IDs to steam IDs and syncs them automatically to clients. Like steamID1:voiceID1, steamID2:voiceID2. I figure since you already have the sync with clients it might be an easy addition and very nice QoL unless I'm missing something..?
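To make the requested format concrete, here is a minimal Python sketch of parsing a `steamID:voiceID` list. This is purely illustrative: `parse_voice_map` is a hypothetical helper, and no such config option is confirmed to exist in the mod.

```python
from typing import Dict


def parse_voice_map(raw: str) -> Dict[str, str]:
    """Parse 'steamID1:voiceID1, steamID2:voiceID2' into a dictionary."""
    mapping: Dict[str, str] = {}
    for entry in raw.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate empty input and trailing commas
        steam_id, separator, voice_id = entry.partition(":")
        steam_id, voice_id = steam_id.strip(), voice_id.strip()
        if not separator or not steam_id or not voice_id:
            raise ValueError(f"malformed entry: {entry!r}")
        mapping[steam_id] = voice_id
    return mapping
```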
I guess this works fine with Zombies or other mods that spawn masked in non-native ways..?
Can you tell me what API endpoints are required in ElevenLabs for experimental?
@winged tapir does it work with v80?
No it broke
Hi, whenever I try loading into a moon, I get this in the BepInEx window
and i cant load in
tbh, this is way too much hassle since i'm looking at the voicebox tutorial.
Looks like v80 made some changes that may have broken the mod, I'll look into it over the next few days
Cool cool
After further testing, the mod works just fine on v80. @haughty oriole can you post your log so I can see what's going on?
I can't determine much from just this log message. Can you post the entire log?
It's possible! v80 is still in beta so no guarantees on mod compatibility ofc
Seems to be working on my end for the moment at least
wss://api.elevenlabs.io/v1/text-to-speech/ TTS websocket
wss://api.elevenlabs.io/v1/speech-to-text/realtime STT websocket
https://api.elevenlabs.io/v1/voices/add if using autoclone
Did you test it with MoreCompany? MoreCompany breaks the TZP voice changer, could that be a problem with Wendigo?
That's the issue, MoreCompany is broken on v80 currently
looks like they uploaded a fix recently, should hopefully be working once the update is published on Thunderstore
Hmm, may not be related at all but I've had the game hard crash on me twice over the last week or so and both times it was immediately after an ElevenLabs error like this
ElevenLabs Error [resource_exhausted]
[ElevenLabs STT] WebSocket closed by server. Status: NormalClosure, Description: resource_exhausted
[Wendigos STT] Speech to Text service cancelled: Reason=Error Error: ElevenLabs Error [resource_exhausted]
Attempting to reconnect... [1/3]
[AI Manager] Starting speech recognition.
[Info :BellMonster] Playing bellStep sound (Chasing)
Connected to ElevenLabs Scribe (NAudio).
Crash!!!
========== OUTPUTTING STACK TRACE ==================
0x00007FFA59134A16 (ntdll) RtlWaitOnAddress
0x00007FFA590FFCB4 (ntdll) RtlEnterCriticalSection
0x00007FFA590FFAE2 (ntdll) RtlEnterCriticalSection
0x00007FFA1A2CFA57 (winmmbase) mmTaskCreate
0x00007FFA1A2D17C9 (winmmbase) waveInPrepareHeader
0x000001DEF7BCC258 (Mono JIT Code) (wrapper managed-to-native) NAudio.Wave.WaveInterop:waveInPrepareHeader (intptr,NAudio.Wave.WaveHeader,int)
0x000001DEF7BCCB7B (Mono JIT Code) NAudio.Wave.WaveInBuffer:Reuse ()
0x000001DEF7BCC8D3 (Mono JIT Code) NAudio.Wave.WaveInEvent:DoRecording ()
0x000001DEF7BCC703 (Mono JIT Code) NAudio.Wave.WaveInEvent:RecordThread ()
0x000001DEF7BCC68B (Mono JIT Code) NAudio.Wave.WaveInEvent:<StartRecording>b__29_0 (object)
0x000001DEF7BCC629 (Mono JIT Code) System.Threading.QueueUserWorkItemCallback:WaitCallback_Context (object)
0x000001DB9AADBEEE (Mono JIT Code) System.Threading.ExecutionContext:RunInternal (System.Threading.ExecutionContext,System.Threading.ContextCallback,object,bool)
0x000001DB9AADB96B (Mono JIT Code) System.Threading.ExecutionContext:Run (System.Threading.ExecutionContext,System.Threading.ContextCallback,object,bool)
0x000001DD35FA3893 (Mono JIT Code) System.Threading.QueueUserWorkItemCallback:System.Threading.IThreadPoolWorkItem.ExecuteWorkItem ()
...
if you have any use for the full dmp or log lmk
happened again
Update used by player client rpc has been called for interact trigger: DriverSeatTrigger
Triggering animated object trigger bool: setting to False
StopIgnition is disabled! Netcode for GameObjects does not support disabled NetworkBehaviours! The InteractTrigger component was skipped during ownership assignment!
StopIgnition is disabled! Netcode for GameObjects does not support disabled NetworkBehaviours! The InteractTrigger component was skipped during ownership assignment!
[Info :GeneralImprovements] Applied 2 queued monitor changes.
[Debug :GeneralImprovements] Updated time display.
[Debug :GeneralImprovements] Updated time display.
Starting ignition!!!
ElevenLabs Error [resource_exhausted]
[ElevenLabs STT] WebSocket closed by server. Status: NormalClosure, Description: resource_exhausted
[Wendigos STT] Speech to Text service cancelled: Reason=Error Error: ElevenLabs Error [resource_exhausted]
Attempting to reconnect... [2/3]
[AI Manager] Starting speech recognition.
[Debug :GeneralImprovements] Updated time display.
Connected to ElevenLabs Scribe (NAudio).
Crash!!!
same stack trace
I have a feeling it always happens when this resource exhausted error occurs at the same time some audio source in-game is trying to play sound. this time it was the exact moment I attempted to start the cruiser. last time it was the exact moment the bellmonster attempted to play its aggro sound.
This mod seems like a really cool concept. Is it taxing for the host's pc?
Might be a race condition where the restart happens before buffers are cleared, let me know if the fix in this dll works!
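The general pattern for avoiding that kind of race, sketched in Python: make the restart wait until the old capture loop has fully stopped before a new one starts. The `Recorder` class is a hypothetical stand-in with a dummy worker thread. It does not model NAudio, and the contents of the actual fix in the dll are not shown in this thread.

```python
import threading
from typing import Optional


class Recorder:
    """Hypothetical recorder; the worker thread stands in for the capture loop."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._thread: Optional[threading.Thread] = None
        self.starts = 0

    def _run(self, stop: threading.Event) -> None:
        while not stop.wait(0.01):
            pass  # a real loop would hand audio buffers to the STT socket here

    def start(self) -> None:
        with self._lock:
            if self._thread is not None and self._thread.is_alive():
                return  # already recording
            self._stop = threading.Event()  # each loop gets its own stop flag
            self._thread = threading.Thread(
                target=self._run, args=(self._stop,), daemon=True
            )
            self._thread.start()
            self.starts += 1

    def stop(self) -> None:
        with self._lock:
            thread, stop = self._thread, self._stop
            self._thread = None
        if thread is not None:
            stop.set()
            thread.join()  # block until the old loop has really finished

    def restart(self) -> None:
        # Stop first, then start: the two loops never overlap.
        self.stop()
        self.start()

    def is_running(self) -> bool:
        with self._lock:
            return self._thread is not None and self._thread.is_alive()
```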
Not at all! Each player has individual connections to relevant services, the host only does 1 config sync when everyone joins (if you use that feature). The mod should be very lightweight for everyone
Thanks! Not easy to check because it happens so sporadically but I'll be running that and let you know if it happens again over the next couple of days.
V81 update incoming?
The mod is working in v80, as I said the bug you've sent is not an issue with my mod
Interesting... I'll test it out for myself
Hey! Really cool concept for a mod, but I am having issues getting it to work on my end. I know it is quite experimental, but I paid for an ElevenLabs API key and now I am invested.
I have my error below. I believe I am passing the API keys correctly through the config and everything. I am trying to use the voice cloning through ElevenLabs, and it does actually have a cloned version of my voice (super cool, very spooky), but according to the logs, whenever it tries to pull from that voice, it fails and defaults to playing random voice clips (that part works).
Parameter name: key
at System.Collections.Generic.Dictionary`2[TKey,TValue].FindEntry (TKey key) [0x00008] in <1071a2cb0cb3433aae80a793c277a048>:IL_0008
at System.Collections.Generic.Dictionary`2[TKey,TValue].get_Item (TKey key) [0x00000] in <1071a2cb0cb3433aae80a793c277a048>:IL_0000
at Wendigos.STT.SendToChatAndStreamAudioResponse (MaskedPlayerEnemy closest_masked, System.String playerName, System.String player_speech) [0x0008a] in <dfeb82a886cc482f86134eef5ca43b65>:IL_008A
at Wendigos.STT+<>c__DisplayClass20_0.<InitCallbacks>b__2 () [0x00161] in <dfeb82a886cc482f86134eef5ca43b65>:IL_0161
Also the Azure STT connection occasionally fails and crashes my game, but that doesn't happen too often. I am also down to test just about anything to get this working.
Is this an error with the mod? Let me know if you need more logs!
If your crashing looks like mine posted above, try the unreleased version Tim has provided.
I'm curious, are you using auto cloning? I haven't seen that error you sent on my end and it sounds like that's the only part between our two setups that could be different.
fixed the issue but can't tell you how. Just did a verify files and reimported modpack
Different issue though - If I say "Can you hear me?" to a masked it repeats "Can you hear me?" like six times and replies at the same time "Yeah I can hear you". I tried turning max voice clips to the minimum (1) but it didn't make a difference; wondering if I fucked up somewhere. Example here, I went 'oh' and it got really hung up on that but still replied to other stuff. https://streamable.com/n1j26p
i am trying auto cloning
i will try fix later today i suppose
to clarify: I'm not using auto cloning
but yeah, i don't even know if it's related at all. just a guess
its not really crashing, just the program not working and defaulting to spamming audio clips
I don't know why, but I just can't get it to work.
Tried (a paid) Elevenlabs API for STT first, but that seems broken. Then switched to a free Azure subscription, I see it sent a few clips and then just hit me with a:
[Connection was closed by the remote host. Error code: 1007. Error details: Quota exceeded. Cid: SessionId: e10e36ee2a8f49fe85f013575773ddbc
Although when I check Azure, the quota doesn't seem exceeded and it tells me I only made 5 requests in the last 24hrs.
Does anyone else have an idea as to what's happening?
Appreciate it!
EDIT: I used a paid Azure subscription and that worked. Seems like some quota thing still caused it, or I did something wrong. Anyway it's fixed, sounds great!
~~I had it working for a while, but ran into request issues. But for some reason I now keep getting: ~~
[Warning: Unity Log] Audio source failed to initialize audio spatializer. An audio spatializer is specified in the audio project settings, but the associated plugin was not found or initialized properly. Please make sure that the selected spatializer is compatible with the target.
~~I'm fairly certain it has to do with the Wendigos plugin, as it starts spamming my logs whenever I talk to a nearby masked.~~ Any clues as to what I might've changed accidentally?
Okay the top part resolved itself, but didn't seem to cause it.
So the problem seems to be in the usage of API keys.
It's weird to me since it prints these messages in the log:
[Info : Console] [Wendigos STT]: Creating AI manager object for STT service. Disregard "Service config is null" errors.
[Warning: Unity Log] [ServiceFactory] Chat service config is null. No chat service will be created.
[Warning: Unity Log] [ServiceFactory] STT service config is null. No STT service will be created.
[Warning: Unity Log] [ServiceFactory] TTS service config is null. No TTS service will be created.
[Warning: Unity Log] Chat service API key not found.
[Warning: Unity Log] TTS service API key not found.
[Info : Console] [Wendigos STT] STT Service Created.
Although, these API keys work (ChatGPT for Chat and Elevenlabs for TTS and STT).
In the logs later it is confirmed that at least the chat service works (by giving me a reply to run through TTS), although it prints it in stream mode (don't think this is an issue, since it gave a reply to my question one single time, a few days ago with that same setup).
[Info : Console] [Wendigos STT] RECOGNIZED: Okay, so what's going on now?
[Info : Console] Added clip successfully.
[Info : Console] [Wendigos Log] COUNT: 1
[Info : Console] [Wendigos Log] Masked dist is: 6.560073
[Warning: Unity Log] PlayOneShot was called with a null AudioClip.
[Debug : Imperium] [PROFILE] Objects refresh time : 11
[Debug : Imperium] [PROFILE] Total objects refresh time : 11
[Info : Console] : cxG7uDG1BbuP6ObKuREX
[Info : Console] Not : cxG7uDG1BbuP6ObKuREX
[Info : Console] much : cxG7uDG1BbuP6ObKuREX
[Info : Console] , : cxG7uDG1BbuP6ObKuREX
[Info : Console] just : cxG7uDG1BbuP6ObKuREX
[Info : Console] trying : cxG7uDG1BbuP6ObKuREX
[Info : Console] to : cxG7uDG1BbuP6ObKuREX
[Info : Console] survive : cxG7uDG1BbuP6ObKuREX
[Info : Console] ! : cxG7uDG1BbuP6ObKuREX
[Info : Console] What : cxG7uDG1BbuP6ObKuREX
[Info : Console] about : cxG7uDG1BbuP6ObKuREX
[Info : Console] you : cxG7uDG1BbuP6ObKuREX
[Info : Console] ? : cxG7uDG1BbuP6ObKuREX
[Info : Console] : cxG7uDG1BbuP6ObKuREX
[Info : Console] : cxG7uDG1BbuP6ObKuREX
[Info : Console] : cxG7uDG1BbuP6ObKuREX
[Debug : Imperium] [PROFILE] Objects refresh time : 11
[Debug : Imperium] [PROFILE] Total objects refresh time : 11
[Debug : Imperium] [PROFILE] Objects refresh time : 11
[Debug : Imperium] [PROFILE] Total objects refresh time : 11
Elevenlabs usage/history only shows status codes 200 and 1000 (which are okay according to their docs)...
Any clue as to what's going on? And what I can do to make it work (again)
Sorry for spamming the discord.