#Are large datasets for Ov2 bad?


sudden dune
#

Been redoing my Mio model for a bit on Ov2. I have like 40 mins of audio in the dataset, which I know is crazy high, but it is what it is. I've got it to around 450 epochs/45K steps before overtraining, but idk, it sounds a bit off. The esses and breaths sound choppier than in the model I did on the OG pretrains. Some weird humming and chopping artifacts too.

Does Ov2 encourage smaller datasets than that or is this just something dodgy in my dataset? Or perhaps a mix of both. Thanks!

tulip oxide
#

But again, I haven't tested Ov2 enough, so take that with a grain of salt

bronze valve
#

i would say the same thing based on my testing

#

i would even say less than 5 min maybe

foggy zealot
#

i tried it on 2 min at the same epoch count and it has artifacting too

gleaming berry
#

from my testing, large datasets will generally just sound the same on Ov2 and V2 once you train long enough (for hours)

but on smaller datasets it's better to use Ov2; for longer ones it doesn't really matter, from my tests

gaunt ferry
#

30 epochs and 2h of data

tulip oxide
#

if you're on a long dataset you might as well use normal pretrains

gaunt ferry
#

im just testing new pretrains

bronze valve
#

it's not making it worse or anything, i just always liked the output more without the new pretrain, except on short datasets

sudden dune
#

I guess it's a mix of both perhaps lmao. I hear some of the same artifacting in the OG Mio model so yeah
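
Since the thread mixes epoch counts and step counts across very different dataset sizes, here's a rough sketch for putting them on the same footing. The only hard number is the 450 epochs / 45K steps on 40 mins from the first post. The helper names `steps_per_epoch` and `scaled_steps` are made up for illustration, and the idea that steps per epoch grow linearly with dataset length (same batch size, same segment length) is an assumption, not something confirmed for Ov2 or any particular trainer.

```python
def steps_per_epoch(total_steps: int, epochs: int) -> float:
    """Average training steps per epoch (hypothetical helper)."""
    if epochs <= 0:
        raise ValueError("epochs must be positive")
    return total_steps / epochs


def scaled_steps(base_steps_per_epoch: float, base_minutes: float,
                 minutes: float, epochs: int) -> float:
    """Estimate total steps for a dataset of a different length.

    ASSUMPTION: batch size and segment length stay the same, so steps
    per epoch scale linearly with dataset duration.
    """
    if base_minutes <= 0:
        raise ValueError("base_minutes must be positive")
    return base_steps_per_epoch * (minutes / base_minutes) * epochs


# From the first post: 45K steps over 450 epochs on ~40 min of audio.
per_epoch = steps_per_epoch(45_000, 450)
print(per_epoch)  # 100.0

# Same 450 epochs on a 5 min dataset, under the linear-scaling assumption.
print(scaled_steps(per_epoch, 40, 5, 450))  # 5625.0

# 30 epochs on 2 h (120 min) of data, same assumption.
print(scaled_steps(per_epoch, 40, 120, 30))  # 9000.0
```

If that assumption roughly holds, "same epochs" on a short dataset means far fewer steps, so comparing runs by step count is probably fairer than comparing by epoch count.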