#Kaggle is broken


olive thistle
#

It started with me not being able to make samples because it gave me an error, but now it just completely stopped my training like 10 epochs in and gave the full error shown in the ss

sullen dirge
#

batch size?

#

i think it can do max 8 (4x2)

#

unsure if it can do 16 (8x2)

olive thistle
sullen dirge
#

kaggle has two gpus

#

so if you want to use batch size 8, you need to set the batch size to 4 in the ui

#

each gpu does batch size 4, so that becomes batch size 8
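
A minimal sketch of the arithmetic described above, assuming the batch size entered in the UI is applied per GPU and the GPUs train in parallel. The function name `effective_batch_size` is hypothetical and is not part of Applio, RVC, or Kaggle.

```python
def effective_batch_size(per_gpu_batch_size: int, num_gpus: int) -> int:
    """Samples processed per training step when each GPU runs its own batch.

    Assumption: the UI value is a per-GPU batch size, as stated in this thread.
    """
    if per_gpu_batch_size < 1 or num_gpus < 1:
        raise ValueError("batch size and GPU count must be positive")
    return per_gpu_batch_size * num_gpus


# Two GPUs, UI value 4 -> effective batch size 8.
print(effective_batch_size(4, 2))
# Two GPUs, UI value 8 -> effective batch size 16.
print(effective_batch_size(8, 2))
```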

olive thistle
#

I've always put 8 and it's been fine, what's up with it now?

sullen dirge
#

no idea what...

olive thistle
#

I can try 4 and see if that fixes it?

sullen dirge
#

batch size 4 in kaggle would be the actual batch size 8

olive thistle
#

so you're telling me I've been using batch size 16 for the past like 8 months..

olive thistle
#

Im gonna commit.

#

But even for batch size 16, it takes for everrrr

#

I have to spend MORE timeee

sullen dirge
#

because T4 GPUs are from 2018 and in RVC multi-GPU training is slower for no reason

#

it shouldn't be slower but for some reason, it is slower

#

or at least, in Applio multi-GPU is slower (potentially slower in the original RVC too but I haven't compared speeds yet)

#

batch 16 is ok as long as your dataset is not small (less than 10 minutes)

#

after batch 64 things worsen

olive thistle
#

I normally use batch size 16 (Thought it was 8) for datasets around 30+ minutes

sullen dirge
#

thats great, it's fine

#

at that point everything works up to 64 max

olive thistle
#

What do you recommend for a 2 hour and 30 minute dataset

#

..

sullen dirge
#

32

olive thistle
#

I tried yesterday and the error fucked it up so I gave up

#

oh alr

reef hawk
#

crazy how me and him didn't know this

sullen dirge
reef hawk
olive thistle
#

yeah, its kinda stupid

sullen dirge
olive thistle
#

Ig im stupid

sullen dirge
#

nono, the docs should mention it since it's meant to help new model makers

#

and not all have prior ai experience

reef hawk
#

so I never knew

sullen dirge
#

if you have let's say 4 gpus

batch 4 in the ui will become batch size 16 instead

#

it's called the 'effective' batch size

reef hawk
#

because it's multiplying by 4

sullen dirge
#

exactly

#

each gpu is running batch size 4

#

but at the end it becomes batch 16

reef hawk
#

is there a reasoning for that?

sullen dirge
#

speed boost

reef hawk
#

fair

olive thistle
#

So do we halve each batch size we'd normally use from now on, or is it still good?

sullen dirge
#

if you wanna use batch 8, use batch 4 in the ui of kaggle
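
Going the other way, here is a rough sketch for working out what to type into the UI for a given effective batch size. It assumes the UI value is per GPU and defaults to the two GPUs mentioned in this thread. The name `ui_batch_size` is hypothetical and not taken from any tool.

```python
def ui_batch_size(target_effective: int, num_gpus: int = 2) -> int:
    """Per-GPU value to enter in the UI to reach a target effective batch size.

    Assumption: effective batch size = UI value * number of GPUs.
    """
    if target_effective < 1 or num_gpus < 1:
        raise ValueError("batch size and GPU count must be positive")
    per_gpu, remainder = divmod(target_effective, num_gpus)
    if remainder != 0:
        # The target cannot be split evenly across the GPUs.
        raise ValueError(
            f"{target_effective} is not divisible by {num_gpus} GPUs"
        )
    return per_gpu


# Want an effective batch size of 8 on two GPUs: enter 4 in the UI.
print(ui_batch_size(8))
# Want an effective batch size of 32 on two GPUs: enter 16 in the UI.
print(ui_batch_size(32))
```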

olive thistle
#

welp, ig I'll be training a lot slower

olive thistle
#

My training keeps stopping and gives me this error over and over and over

olive thistle