#How to use Xeus Embedder with RVC?


ember kayak

Does anyone have advice on how to use the Xeus encoder with RVC?

I checked out the exp/vocoders branch and see that Xeus is referenced there, but that branch is a bit out of date and I hit a few errors running it. I merged main into the branch and made a few changes to xeus_test_model.py, which was enough to get a plot to display. For the test class I am loading the checkpoint found here: https://huggingface.co/espnet/xeus/blob/main/model/xeus_checkpoint_new.pth

When I try to run the full application, it asks me to select a custom embedder file with a ".bin" or ".json" extension. Is there something I need to do to get the checkpoint into this format?

My goal is to use an embedder that reproduces multilingual sounds better, for example a rolled "r", or even laughing and crying.

Also, I have a 4090. I know this model is a lot larger than Hubert. Will there be any issues using it with this setup?
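For a rough sense of scale, here is a back-of-the-envelope sketch of how much memory the weights alone would take. The parameter counts are assumptions on my part (roughly 577M for Xeus and roughly 95M for a Hubert base model), the function name is made up for illustration, and this ignores activations, optimizer state, and everything else RVC loads, so treat it as a lower bound rather than a real measurement against the 24 GB on a 4090.

```python
def weights_size_gib(num_params: int, bytes_per_param: int = 4) -> float:
    """Approximate size of model weights in GiB (weights only, no activations)."""
    return num_params * bytes_per_param / 2**30


# Assumed parameter counts, not measured from the actual checkpoints.
XEUS_PARAMS_ASSUMED = 577_000_000
HUBERT_BASE_PARAMS_ASSUMED = 95_000_000

for name, count in [("Xeus", XEUS_PARAMS_ASSUMED), ("Hubert base", HUBERT_BASE_PARAMS_ASSUMED)]:
    fp32 = weights_size_gib(count, bytes_per_param=4)  # 32-bit floats
    fp16 = weights_size_gib(count, bytes_per_param=2)  # 16-bit floats
    print(f"{name}: ~{fp32:.2f} GiB in fp32, ~{fp16:.2f} GiB in fp16")
```

If those counts are in the right ballpark, the embedder weights by themselves are only a small part of the card's memory, so the real question is how much the longer feature extraction and training add on top.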

Thanks!

balmy dew

Does this also work for datasets of singers?

ember kayak

I assume it would. I believe someone tested it previously, but I haven't been able to get it working yet. I did a small test using mHubert-147, which is trained on 147 languages (the default Hubert is trained only on English), but it didn't really solve my problem with the rolled "r" sound.

ember kayak

I seem to have it at least partially working. The exp/vocoders branch has a class that handles converting the Xeus checkpoint.pth file into the required format. I made a few changes so the model can be selected from the UI menus, and that was most of what was needed.

I did an initial test with about 12 minutes of audio data, and after a few epochs I was getting NaNs in TensorBoard.

Any idea whether this is because I used such a small dataset?
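To narrow down when the NaNs first show up, here is a minimal sketch in plain Python. It is not part of RVC; the function name and the loss values are made up for illustration, and it assumes the per-step losses have already been pulled out into a list.

```python
import math
from typing import Iterable, Optional


def first_non_finite_step(losses: Iterable[float]) -> Optional[int]:
    """Return the index of the first NaN or inf loss, or None if all are finite."""
    for step, value in enumerate(losses):
        if not math.isfinite(value):
            return step
    return None


# Made-up loss values, for illustration only.
example_losses = [2.31, 1.87, 1.52, float("nan"), float("nan")]
print(first_non_finite_step(example_losses))
```

Knowing the step where the loss first goes non-finite makes it easier to tell whether it blows up right away (which would point more toward a feature or setup mismatch) or only after training for a while.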