A question many people have asked for a long time is, 'Is it possible to create a model without using a pre-trained model?' I haven't been able to give it much attention recently because of the many modifications I had to make, but I can finally provide an answer.
As the name suggests, NPIM stands for Non-Pretrained Independent Model. The model is trained purely on the target voice's own data, without using any pre-trained model.
The short answer: it is possible if you have enough data, but it is inefficient in many ways and takes a tremendous amount of time and data. Because of that, using a pre-trained model is probably better for your mental well-being. In most cases, lowering the index value lets a model fall back on what it learned from the pre-trained model, so it can still infer where its own data is lacking. This model has no such fallback because it was trained solely on its own dataset, so in some cases a higher index value may be needed to reduce artifacts.
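To make the index value more concrete, here is a minimal sketch of how a retrieval index is commonly blended into the input features in RVC-style pipelines, as far as I understand it: the features extracted from the input audio are mixed linearly with features retrieved from the index built on the training data. The function name `blend_features` and the toy numbers are assumptions for illustration only, not this model's actual code or real embeddings.

```python
def blend_features(model_feats, retrieved_feats, index_rate):
    """Linearly mix per-frame features.

    index_rate = 0.0 keeps the input features untouched,
    index_rate = 1.0 uses only the features retrieved from the index.
    """
    if not 0.0 <= index_rate <= 1.0:
        raise ValueError("index_rate must be between 0 and 1")
    return [
        [index_rate * r + (1.0 - index_rate) * m for m, r in zip(m_frame, r_frame)]
        for m_frame, r_frame in zip(model_feats, retrieved_feats)
    ]


# Toy 2-frame, 3-dimensional features (made-up numbers, not real embeddings).
model_feats = [[0.0, 1.0, 2.0], [1.0, 1.0, 1.0]]
retrieved_feats = [[1.0, 1.0, 0.0], [3.0, 1.0, -1.0]]

low = blend_features(model_feats, retrieved_feats, 0.25)   # mostly input features
high = blend_features(model_feats, retrieved_feats, 0.75)  # mostly training-set features
```

Under this view, a higher index value pulls the features toward what the model actually saw during training, which would explain why a model with no pre-trained knowledge to fall back on can sound cleaner with a higher setting.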
Because the dataset consists entirely of Korean speech, artifacts may occur in some pronunciations that are difficult to produce in Korean.
**Info**
Dataset : 275 mins (4 h 35 min) [Normal / Angry / Sad / Shout / Sing]
Batch Size : 16
f0 method : RMVPE
Total Epochs : 4200+ Epochs / 185000+
Sample Rate : 48 kHz
VA : Chung Ah Han
Model Link - https://huggingface.co/SeoulStreamingStation/RVC_Voice_Models/resolve/main/Nell_Test_NPIM.zip?download=true
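If you prefer to fetch the archive from a script, the following is a minimal sketch using only the Python standard library. The helper names `download` and `extract_model_files`, the local file names, and the assumption that the archive contains `.pth` and `.index` files (the usual layout for RVC voice models) are mine and are not confirmed by this card.

```python
import urllib.request
import zipfile
from pathlib import Path

MODEL_URL = (
    "https://huggingface.co/SeoulStreamingStation/RVC_Voice_Models/"
    "resolve/main/Nell_Test_NPIM.zip?download=true"
)


def download(url, dest):
    """Download url to dest unless the file already exists; return the path."""
    dest = Path(dest)
    if not dest.exists():
        urllib.request.urlretrieve(url, dest)
    return dest


def extract_model_files(archive, out_dir):
    """Unpack the archive and return any .pth / .index files found (assumed layout)."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(out_dir)
    return sorted(p for p in out_dir.rglob("*") if p.suffix in {".pth", ".index"})


# Usage (performs a network download, so it is left commented out):
# archive = download(MODEL_URL, "Nell_Test_NPIM.zip")
# for path in extract_model_files(archive, "Nell_Test_NPIM"):
#     print(path)
```

The returned paths can then be placed wherever your RVC front end expects its model weights and index file.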