#Is there an equivalent of flash boot for CPU-only serverless?

worthy osprey
#

I was trying to figure out if there is a way to have a CPU job fire up only when it is needed, so it does not accrue charges when idle (like flash boot for GPU serverless). Thanks!

pseudo shale
#

CPU serverless does this

worthy osprey
#

Oh, apologies, I did not see the reply for some reason

#

yeah I thought I tried it and it ran continuously, but let me look into it

worthy osprey
#

OK it's working fine - my question is - for GPU+FlashBoot the initial delay time is less than one second - for CPU it is more than 6 seconds! Is there a way to reduce the CPU initial wait time? (same container, same request, the container has no CUDA so runs fine on CPU, just has that long initial delay)

haughty tulip
#

@worthy osprey what's your workload?

worthy osprey
#

low so far but how come a huge LLM package can flash boot to GPU in 0.5 sec, but a dinky 8 GB container takes 6 sec of delay on a CPU?

#

for my app turnaround time is key

#

flash boot on GPU cloud is truly incredible

#

warm start on CPU cloud inexplicably 10x slower

haughty tulip
#

Sorry, I more so meant how are you using the CPUs so we have a better understanding of why it takes six seconds to cold boot

worthy osprey
#

ok same container as for GPU. uses Jason's simple py handler for api calls, on CPU cloud

#

way simple

#

running a bash script with a couple of executables, very boring (sox, lame and piper, voice processing stuff)

#

a typical job takes 2 sec of exec time on either GPU or CPU cloud (I'm not using the GPU at all as far as I can tell, no CUDA in environment)

worthy osprey
#

any ideas?

pseudo shale
#

So you're getting different performance on serverless with your CPU pod?

#

Depending on the package you use, it can be cached to RAM, but FlashBoot for CPU serverless shouldn't be available yet, I guess
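
For reference, the setup described in this thread (a simple Python handler that shells out to a bash script) might look roughly like the sketch below. This is not the poster's actual handler: the `PIPELINE` command and the `text` input field are hypothetical stand-ins for the real sox/lame/piper script and payload, which the thread does not show. The handler times the script itself, so the roughly 2 sec of execution can be told apart from the startup delay being asked about.

```python
import subprocess
import time

# Hypothetical stand-in for the real bash script (sox, lame, piper in the thread).
# "$1" is the first argument passed after the "pipeline" placeholder name.
PIPELINE = ["bash", "-c", 'echo "processed: $1"', "pipeline"]


def handler(job):
    """Run the pipeline for one job and report how long the script itself took."""
    # "text" is an assumed input field; the real payload is not shown in the thread.
    text = job["input"].get("text", "")

    started = time.perf_counter()
    result = subprocess.run(PIPELINE + [text], capture_output=True, text=True)
    exec_seconds = time.perf_counter() - started

    return {
        "returncode": result.returncode,
        "stdout": result.stdout,
        "stderr": result.stderr,
        # Time spent inside the script only, excluding any worker startup delay.
        "exec_seconds": round(exec_seconds, 3),
    }


# On RunPod the function would typically be registered with the runpod SDK:
#   import runpod
#   runpod.serverless.start({"handler": handler})
# It is left as a comment here so the sketch runs without the SDK installed.

if __name__ == "__main__":
    print(handler({"input": {"text": "hello"}}))
```

Comparing `exec_seconds` with the total round-trip time seen by the client gives a rough estimate of how much of the wait is startup and queueing rather than the job itself.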
