#Having issues with the Google Colab to train my custom Wake Word


atomic iron
#

I have been trying to get the Google Colab notebook to work to train a custom wake word, but I keep getting errors on the first step. I have tried Chrome and Firefox but still get the same errors:

NameError                                 Traceback (most recent call last)
<ipython-input-2-2f6055b1a63f> in <cell line: 0>()
     44                 )
     45 
---> 46 text_to_speech(target_word)
     47 Audio("test_generation.wav", autoplay=True)

<ipython-input-2-2f6055b1a63f> in text_to_speech(text)
     36 
     37 def text_to_speech(text):
---> 38     generate_samples(text = text,
     39                 max_samples=1,
     40                 length_scales=[1.1],

NameError: name 'generate_samples' is not defined
#

Also this error:

---------------------------------------------------------------------------
UnpicklingError                           Traceback (most recent call last)
<ipython-input-3-fefc9a52262f> in <cell line: 0>()
     44                 )
     45 
---> 46 text_to_speech(target_word)
     47 Audio("test_generation.wav", autoplay=True)

2 frames
/usr/local/lib/python3.11/dist-packages/torch/serialization.py in load(f, map_location, pickle_module, weights_only, mmap, **pickle_load_args)
   1468                         )
   1469                     except pickle.UnpicklingError as e:
-> 1470                         raise pickle.UnpicklingError(_get_wo_message(str(e))) from None
   1471                 return _load(
   1472                     opened_zipfile,

UnpicklingError: Weights only load failed. This file can still be loaded, to do so you have two options, do those steps only if you trust the source of the checkpoint. 
    (1) In PyTorch 2.6, we changed the default value of the `weights_only` argument in `torch.load` from `False` to `True`. Re-running `torch.load` with `weights_only` set to `False` will likely succeed, but it can result in arbitrary code execution. Do it only if you got the file from a trusted source.
    (2) Alternatively, to load with `weights_only=True` please check the recommended steps in the following error message.
    WeightsUnpickler error: Unsupported global: GLOBAL piper_train.vits.models.SynthesizerTrn was not an allowed global by default. Please use `torch.serialization.add_safe_globals([SynthesizerTrn])` or the `torch.serialization.safe_globals([SynthesizerTrn])` context manager to allowlist this global if you trust this class/function.

Check the documentation of torch.load to learn more about types accepted by default with weights_only https://pytorch.org/docs/stable/generated/torch.load.html.
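Option (2) in the message above can be tried from a notebook cell before re-running the failing one. What follows is a minimal sketch, assuming the checkpoint comes from a source you trust and that `piper_train` is importable in the runtime. `allowlist_checkpoint_class` is a hypothetical helper name, not part of the notebook. The checkpoint may reference further classes, in which case the next error should name them.

```python
import importlib


def allowlist_checkpoint_class(module_path, class_name):
    """Allowlist one class so torch.load(weights_only=True) may unpickle it.

    Only do this for checkpoints from a trusted source.
    """
    import torch  # imported lazily; needs a PyTorch release that has add_safe_globals

    cls = getattr(importlib.import_module(module_path), class_name)
    # add_safe_globals registers the class process-wide, so it also applies
    # to torch.load calls made inside library code.
    torch.serialization.add_safe_globals([cls])
    return cls


# Hypothetical usage in the notebook, before the text_to_speech cell:
# allowlist_checkpoint_class("piper_train.vits.models", "SynthesizerTrn")
```

Option (1), passing `weights_only=False`, would need a change at the place where `torch.load` is called, which here is inside library code rather than in the notebook cell.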

nimble moon
#

Did you get anywhere with this? I'm having the same issue

mystic merlin
#

Hey folks, I trained mine about 4 months ago and it worked as described in the HA guide. Now, though… I am getting the same error as you.

Anyone got any insight?

cobalt escarp
#

I think Google Colab was recently updated to Python 3.12, and it seems most of the dependencies target Python 3.10. I tried to do this recently and got all the way to the final step. I ended up having to set up a whole environment on my machine to perform the last few steps.
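To see which interpreter and package versions a runtime actually provides, something like the following can be run in a notebook cell. This is only a sketch: `runtime_report` is a made-up helper, and the package names are examples, not a list of what the training notebook needs.

```python
import sys
from importlib import metadata


def runtime_report(packages=("torch", "tensorflow", "numpy")):
    """Return the Python version and the installed version of each package."""
    report = {"python": ".".join(str(part) for part in sys.version_info[:3])}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None  # not installed in this runtime
    return report


print(runtime_report())
```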

#

Fair warning as well. I recently discovered that there is an issue with openWakeWord: it will time out a stream token after 5 minutes, then will not play audio and has to be restarted. There is an open GitHub issue that's been around for several months that I haven't seen get any traction outside of some iffy workarounds. I'm now trying to train a new microWakeWord model, but of course that environment seems to be just as bugged.

mystic merlin
# cobalt escarp Fair warning as well. I recently discovered that there is a issue with open wake...

Thanks for that. This would make sense… that’s been my experience. Interesting you say that about open wake word, as my Atom Echo custom wake word assistant has been acting a little crazy the last few weeks. Lots of night time triggers, TTS notifications but no speech and deranged garbage that no one asks about!

This was my reason for training a new wake word - I suspected that the current one was causing the erroneous triggers as it only has two syllables.

I’ll go check that out.

cobalt escarp
#

There is also microWakeWord, which seems to be a bit more stable, but I haven't had much luck trying to train my wake word for that.

#

If you have the code from them, it's likely you are running microWakeWord currently. But you may not have a working setup if you try to switch to openWakeWord, due to this issue:

#

Voice Assistant generating 404 audio URLs after inactivity · Issue...

#

You can also train a new microWakeWord model, but it's a different environment, and I've been having issues with that one as well.

nimble moon
#

I’ll probably just set it up locally, it’ll likely be way faster to process too

cobalt escarp
#

The 404 issue does seem to be with devices connected through ESPHome.

#

But yeah, it's hell trying to fight all of the dependencies in the wake word training environments. I need to sort out what is causing the microWakeWord Docker image to fail; I suspect it is another dependency issue.

nimble moon
#

Annoyingly my WSL install doesn't want to run CUDA through Jupyter so I'm trying the 2025.07 runtime on Colab...
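One way to narrow this kind of problem down is to check, from inside the Jupyter kernel itself, whether the NVIDIA driver tool is reachable. Below is a rough sketch using only the standard library. `gpu_visible` is a hypothetical helper, and a `True` result only shows that `nvidia-smi` can list a GPU, not that any ML framework can use it.

```python
import shutil
import subprocess


def gpu_visible():
    """Return True if nvidia-smi is on PATH and lists at least one GPU."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False
    try:
        # "nvidia-smi -L" prints one line per detected GPU.
        result = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=15
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and "GPU" in result.stdout


print("GPU visible from this kernel:", gpu_visible())
```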

nimble moon
#

Turns out there's a Glados wake word in the community repo anyway lol

cobalt escarp
#

Yeah, that's what I'm using right now until I can get the microWakeWord environment working. I was also having issues with Jupyter and had to roll CUDA back to version 12 to get it to work. It's only in a VM, though, so at least I can just change whatever I need and not care.

mystic merlin
#

I don't have a fancy GPU to do the training myself… what should my expectations be on an M4 Pro with 24GB memory?

cobalt escarp
#

For microWakeWord, training will peak at around 20 GB of RAM. I just had to purchase Google Colab Pro to get enough resources to run it. RAM was the only problem for me @mystic merlin

#

I am actually looking to use up my resources, as that was the main thing I needed. If you'd like, let me know what you were looking for and I can try to run it through my setup and train a new microWakeWord model for you. May not be till Monday though.

#

Otherwise you can do it on CPU only; it's just going to take more time. For me, training with a GPU took about an hour and a half to 2 hours, but like 75% of that was just collecting and preparing data, and it collects around 130 GB of data.
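The figures mentioned above (a peak of roughly 20 GB of RAM and around 130 GB of collected data) can be checked against a machine before starting. This is a minimal sketch: `check_resources` is a hypothetical helper, and the thresholds are just the approximate numbers from this thread, not official requirements.

```python
import os
import shutil

GIB = 1024 ** 3


def check_resources(path=".", need_ram_gib=20, need_disk_gib=130):
    """Compare total RAM and free disk space against rough requirements."""
    free_disk_gib = shutil.disk_usage(path).free / GIB
    try:
        # Works on Linux; other platforms may not support these sysconf names.
        total_ram_gib = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / GIB
    except (AttributeError, ValueError, OSError):
        total_ram_gib = None  # could not be determined on this platform
    return {
        "total_ram_gib": total_ram_gib,
        "free_disk_gib": free_disk_gib,
        "ram_ok": None if total_ram_gib is None else total_ram_gib >= need_ram_gib,
        "disk_ok": free_disk_gib >= need_disk_gib,
    }


print(check_resources())
```

Note that this compares total installed RAM, not what is free at the moment, so a machine that only just clears the threshold may still run short.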

#

Otherwise microWakeWord worked well for me. I just need to upgrade my ESP board, as I accidentally got about the worst variant that exists.

mystic merlin
#

Hey man that’s very kind of you. I’d be happy to send you a coffee via PayPal or whatever if you did.

I’m looking to train my model to say “hey Morwen”

It’s a lovely Welsh name. Not that common in Scotland and similar to the Scottish Morven… decided Morwen is less likely to be triggered accidentally, and it’s nice too.

I have it set to “Morwen” currently and think more syllables will help reduce accidental triggers.

cobalt escarp
#

Could you spell it phonetically for me? I want to make sure I get the TTS to say it correctly before training. I'm not familiar with the pronunciation.

#

Like more-win?

#

Or more-when?

#

I had issues with the TTS pronunciation when I was going for "reginald"; ended up with reh_ginoldh lol

cobalt escarp
#

@nimble moon @atomic iron if either of you would be interested in a trained microWakeWord model, I don't mind giving a hand

nimble moon
cobalt escarp
#

Cool, just wanted to check. I know a full custom wake word on my setup was a hill I was willing to die on, and it was a massive headache to get working. Hoping to save some nice folks the headache.

mystic merlin
cobalt escarp
#

No sweat, I'll get it trained when I get the chance. For sure on Monday I'll have time though.

mystic merlin