This is a NELL voice model built to test the KLM 4.4 MRF pretrained model.
To test this model, Applio 3.2.8 or Codename Fork is recommended.
For the 4.4 model, high-frequency samples amplified in the BAD SAMPLE channel were included in the dataset to suppress mirroring within the model itself. These samples contain frequencies above the limit the sample rate can represent (half the sample rate, the Nyquist frequency), which causes aliasing. Training on them guides the generator away from producing spectral content that does not exist in the inference target. This approach is similar to deliberately training a model on noise data to force it to remove noise from silent passages.
However, because this can amplify actual aliasing in some samples, the speakers were separated and trained in distinct channels to prevent high-frequency interference. Even with this training method, some mirroring in the ultra-high frequency range overlaps with the actual waveform used in speech, so it cannot be avoided completely through dataset adjustments alone. Since that signal remains weak, it is not expected to cause significant issues in the generated audio, though results may vary depending on the voice of the fine-tuned model.
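The mirroring discussed above is ordinary aliasing: a tone above half the sample rate shows up at a folded, lower frequency once it is sampled. The sketch below illustrates this with the standard library only. The 48 kHz sample rate and the 30 kHz test tone are assumptions chosen for illustration; this card does not state the sample rate used for KLM 4.4, and the function names are hypothetical.

```python
import math


def aliased_frequency(freq_hz, sample_rate_hz):
    """Return the frequency at which a pure tone appears after sampling.

    Sampling cannot tell freq apart from freq +/- k * sample_rate, and a
    real-valued signal cannot tell +f from -f, so every tone folds into
    the range [0, sample_rate / 2].
    """
    nearest_multiple = round(freq_hz / sample_rate_hz) * sample_rate_hz
    return abs(freq_hz - nearest_multiple)


def sample_tone(freq_hz, sample_rate_hz, num_samples):
    """Sample a cosine tone with no anti-aliasing filter applied."""
    return [
        math.cos(2 * math.pi * freq_hz * n / sample_rate_hz)
        for n in range(num_samples)
    ]


SAMPLE_RATE = 48_000  # assumption: not stated in this model card
TONE = 30_000         # illustrative tone above the 24 kHz Nyquist limit

mirror = aliased_frequency(TONE, SAMPLE_RATE)
original = sample_tone(TONE, SAMPLE_RATE, 64)
mirrored = sample_tone(mirror, SAMPLE_RATE, 64)

# The two sampled sequences match, so the out-of-band tone is
# indistinguishable from its in-band mirror image.
max_diff = max(abs(a - b) for a, b in zip(original, mirrored))
print(mirror, max_diff < 1e-9)
```

With these assumed values the 30 kHz tone lands at 18 kHz, which is the kind of in-band mirror image the BAD SAMPLE data is meant to teach the generator to avoid.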
Additionally, for KLM 4.4, all vocal data was recorded in 3-second segments.
As a result, no singer's performance is cut off abruptly in the middle of a phrase. All voice actors have currently been recalled to re-record all songs. This re-recording will continue to be updated as the KLM version progresses.
Dataset -
24 Mins of [conversational tone]
14 Mins of [recitative tone or a read-aloud style]
7 Mins of [agitated tone or an impassioned tone]
1 Min of [Shouting]
8 Mins of [Singing]
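For a quick view of how the dataset is balanced, the durations listed above can be totaled and turned into shares. This is only a small arithmetic sketch; the dictionary keys are shortened labels introduced here, and the minute values are taken directly from the list.

```python
# Minutes per speaking style, copied from the dataset list in this card.
dataset_minutes = {
    "conversational": 24,
    "recitative / read-aloud": 14,
    "agitated / impassioned": 7,
    "shouting": 1,
    "singing": 8,
}

total_minutes = sum(dataset_minutes.values())

# Share of each style as a percentage of the whole dataset.
shares = {
    style: round(100 * minutes / total_minutes, 1)
    for style, minutes in dataset_minutes.items()
}

for style, minutes in dataset_minutes.items():
    print(f"{style:<26}{minutes:>3} min  {shares[style]:>5.1f} %")
print(f"{'total':<26}{total_minutes:>3} min")
```

The total comes to 54 minutes, with conversational speech making up a little under half of the data.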
Training -
Batch Size 4 [x2 GPUs]
Total Epoch 1001
Pretrained model KLM 4.4 exp x2 MRF
Model Link - https://huggingface.co/SeoulStreamingStation/RVC_Voice_Models/resolve/main/NELL_MRF.zip?download=true
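If you prefer to fetch the archive from a script rather than a browser, something like the following may work. It is a minimal sketch using only the Python standard library: the `models` target folder and both function names are hypothetical, the contents of the zip are not documented in this card, and the extracted files would still need to be placed wherever Applio or your fork expects voice models.

```python
import urllib.request
import zipfile
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse

# URL copied from the Model Link line of this card.
MODEL_URL = (
    "https://huggingface.co/SeoulStreamingStation/RVC_Voice_Models"
    "/resolve/main/NELL_MRF.zip?download=true"
)


def archive_name(url):
    """Return the file name from the URL path, ignoring the query string."""
    return PurePosixPath(urlparse(url).path).name


def download_and_extract(url, target_dir="models"):
    """Download the zip and unpack it into target_dir/<archive stem>.

    target_dir is a hypothetical location, not something Applio requires.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    archive_path = target / archive_name(url)
    urllib.request.urlretrieve(url, archive_path)
    extract_dir = target / archive_path.stem
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(extract_dir)
    return extract_dir


# Only the name is resolved here; nothing is downloaded until
# download_and_extract(MODEL_URL) is called explicitly.
print(archive_name(MODEL_URL))
```

Calling `download_and_extract(MODEL_URL)` performs the actual download and returns the folder the archive was unpacked into.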

