Directing requests from the same user to the same worker | Runpod

warm quarry Feb 15, 2024, 3:34 PM

Guys, thank you for your work. We are enjoying your platform.

I have the following workflow. On the first request from a user, the worker does some heavy work that takes about 15-20 s and caches the result, so all subsequent requests are very fast (~150 ms). But if one of the subsequent requests goes to another worker, that worker has to repeat the heavy work (15-20 s). Is there any way to direct all subsequent calls from the same user to the same worker?

raw kiln Feb 15, 2024, 3:36 PM

You only really benefit from FlashBoot if you have a constant flow of requests. Otherwise you can either set an Active worker or increase the idle timeout.

vagrant prawn Feb 15, 2024, 3:40 PM

@warm quarry you can use request count scaling and set it so that, for example, the first 100 requests only need 1 worker, and so on.

warm quarry Feb 15, 2024, 4:26 PM

It seems I have introduced a bit of confusion with my explanation of the workflow, so I will expand on it. My model works on rendered PDFs of construction drawings. When a user makes a request, the PDF is downloaded from S3 and then rendered to a high-quality image, which can take ~5-30 s depending on the PDF. Each user has their own PDF. On a subsequent request, if it arrives at the same worker, the hard work (downloading, rendering) is already done and only the model evaluation runs, which is fast (150 ms). But if the request arrives at another worker, that worker has to download and render everything again. If we scale to 10-20 workers, which is what we are planning to do, it will quite ruin the experience for the user, because each PDF can incur up to 10-20 very slow requests (one per worker).
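To make the problem concrete, here is a minimal sketch of a worker-local cache, assuming the rendered image is kept in process memory. `download_and_render`, the `pdf_key` input field, and the cache size are hypothetical names chosen for illustration, not the poster's actual code. Because the cache lives inside one worker process, any other worker starts with an empty cache and pays the slow path again.

```python
from functools import lru_cache

render_calls = []  # illustration only: records which PDFs were actually rendered


def download_and_render(pdf_key: str) -> bytes:
    """Hypothetical stand-in for the slow step (S3 download + render, ~5-30 s)."""
    render_calls.append(pdf_key)
    return f"rendered:{pdf_key}".encode()


@lru_cache(maxsize=8)  # assumption: keep at most 8 rendered PDFs in this worker's memory
def get_rendered(pdf_key: str) -> bytes:
    # Only the first call per pdf_key on this worker reaches the slow step.
    return download_and_render(pdf_key)


def handler(job: dict) -> dict:
    # Assumes the job payload carries the PDF key under job["input"]["pdf_key"].
    image = get_rendered(job["input"]["pdf_key"])
    return {"image_bytes": len(image)}


# Two requests for the same PDF on the same worker: only one slow render.
handler({"input": {"pdf_key": "user-a/plan.pdf"}})
handler({"input": {"pdf_key": "user-a/plan.pdf"}})
```

A second worker would run its own copy of this module with its own empty `lru_cache`, which is why the first request per PDF is slow on every worker.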

odd zinc Feb 15, 2024, 5:08 PM

You sound like you want a caching mechanism. Your best bet is network storage.

All pods on serverless can attach to a network storage volume, which lets you persist data between workers / runs.

That way all workers have the same backing storage.

https://docs.runpod.io/serverless/references/endpoint-configurations#select-network-volume

Essentially, your workflow should then look like:

1. Worker gets a job.
2. Check network storage for the client ID: if a folder exists, pull the existing resources; if not, create a new folder.
3. Continue with the job from whatever point the existing resources allow.
4. Write results to network storage, if needed, for other workers.
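The steps above can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: `get_or_render`, `cache_root`, and the file naming are hypothetical, and the cache root is assumed to point at wherever the endpoint mounts its network volume (check your endpoint configuration for the actual path). It also assumes `client_id` is a trusted, filesystem-safe string.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable


def get_or_render(cache_root: Path, client_id: str, name: str,
                  render: Callable[[], bytes]) -> bytes:
    """Return the cached artifact for a client, rendering and storing it on a miss.

    cache_root is assumed to be a directory on storage shared by all workers.
    """
    client_dir = cache_root / client_id
    client_dir.mkdir(parents=True, exist_ok=True)  # "if not, create a new folder"
    cached = client_dir / name

    if cached.exists():
        # Another worker (or an earlier run) already did the slow work.
        return cached.read_bytes()

    data = render()  # slow path: download + render

    # Write to a temporary file in the same directory, then rename it into place,
    # so other workers are unlikely to read a half-written file. Rename is atomic
    # on local POSIX filesystems; behaviour on a network volume is an assumption.
    fd, tmp_path = tempfile.mkstemp(dir=client_dir)
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    os.replace(tmp_path, cached)
    return data
```

Two workers that miss at the same moment will both render once; that wastes a little work but keeps the sketch free of cross-worker locking.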

warm quarry Feb 15, 2024, 5:14 PM

Yep, that's nice, thanks a lot. The only thing is that it will limit the workers to one datacenter.

odd zinc Feb 15, 2024, 5:14 PM

I think that's just the cost you need to eat. Or you can write to Firebase file storage, which is what I do, and download it from there.

Because of what you said, I actually prefer to use my own storage mechanism, especially if your files aren't insanely big, which is how it sounds for both the final files and the initial resources.

https://github.com/justinwlin/FirebaseStorageWrapperPython

(my personal wrapper lol)

warm quarry Feb 15, 2024, 5:18 PM

🙂 Much appreciated. But would Firebase be faster than just uploading files to S3?

Never heard of it.

odd zinc Feb 15, 2024, 5:18 PM

Firebase is backed by Google Cloud Buckets and is run by Google as a Google service.

It's just an easier wrapper around Google Cloud Buckets, so I like Firebase.

odd zinc Feb 15, 2024, 5:19 PM

S3 is also fine.

I just hate AWS

xD

Honestly, I'd avoid AWS / Google buckets if I could 😆, but there are no better file storage / object storage providers out there.

warm quarry Feb 15, 2024, 5:19 PM

🙂 Who loves them?

odd zinc Feb 15, 2024, 5:20 PM

But yeah, I also just think it's easier for me to have an easy wrapper around Google Firebase file storage. Plus they've got a nice UI, and I get a ton of file storage for free before I need to pay for it.

So it's great for me for developing, because I don't need to keep paying AWS ingress/egress costs.

warm quarry Feb 15, 2024, 5:20 PM

nice, nice

We just already have a lot of infra around S3 😦

odd zinc Feb 15, 2024, 5:21 PM

Haha, then go with S3.

But yeah, not too bad. I work with really long audio / videos on Runpod, and if your files can be optimized before sending / downloading them (compressing, converting file formats, stripping unnecessary data, etc.), that can also help get things moving faster. But honestly your files sound small enough that it might not be necessary.

idk how big your files are though

warm quarry Feb 15, 2024, 5:23 PM

It depends. A lot of the PDFs are quite small (30 MB), though their render time can still be quite long. Some of them are about 500 MB.

odd zinc Feb 15, 2024, 5:24 PM

I see. For the bigger ones, what I do with S3 is use range downloads, which it supports. That way you can upload / download files in parallel.

So for large files, that's probably what you want to look into. That's what I did for my larger files.
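A rough sketch of the parallel range download idea, under stated assumptions: `split_ranges`, `parallel_download`, and the `fetch_range` callback are hypothetical names, and the part size and thread count are arbitrary. The storage call is left as a callback so the sketch stays provider-neutral; with boto3, for instance, it could wrap `get_object` with a `Range="bytes=start-end"` argument.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


def split_ranges(size: int, part_size: int) -> List[Tuple[int, int]]:
    """Split [0, size) into inclusive (start, end) byte ranges, as HTTP Range expects."""
    return [(start, min(start + part_size, size) - 1)
            for start in range(0, size, part_size)]


def parallel_download(fetch_range: Callable[[int, int], bytes], size: int,
                      part_size: int = 8 * 1024 * 1024, max_workers: int = 8) -> bytes:
    """Fetch all ranges concurrently and join them in order.

    fetch_range(start, end) must return the bytes for that inclusive range, e.g.
    (assumption) a wrapper around an S3 GET with a "bytes=start-end" Range header.
    """
    ranges = split_ranges(size, part_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order, so parts join correctly.
        parts = pool.map(lambda r: fetch_range(*r), ranges)
        return b"".join(parts)


# Usage with an in-memory stand-in for the remote object:
blob = bytes(range(256)) * 10


def fake_fetch(start: int, end: int) -> bytes:
    return blob[start:end + 1]


downloaded = parallel_download(fake_fetch, len(blob), part_size=100)
```

This version holds the whole object in memory; for files around 500 MB, writing each part to its offset in a file on disk may be a better fit.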

warm quarry Feb 15, 2024, 5:25 PM

Thanks for the help

odd zinc Feb 15, 2024, 5:25 PM

Yup, no worries. And if you REALLY want to xD:
https://discord.com/channels/912829806415085598/1200525738449846342

You can optimize even further with a concurrent worker hahaha. idk how much GPU you are eating up though.

But in my mind a PDF renderer might not be eating up GPU resources all the way. I could be wrong.

But it's something I've been playing with.

lol, but my video / audio transcriber does eat up a lot of resources, so I could only get maybe 2 concurrent things going at once.
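As a generic illustration of running a couple of jobs at once inside one worker, here is a minimal sketch using plain `asyncio`. It is not the Runpod SDK's own concurrency mechanism; `handle_job`, `run_jobs`, and the limit of 2 are assumptions chosen to mirror the "maybe 2 concurrent things" figure above.

```python
import asyncio

MAX_CONCURRENT_JOBS = 2  # assumption: how many jobs one worker can afford at once

stats = {"active": 0, "peak": 0}  # illustration only: tracks observed concurrency


async def handle_job(job: dict, limiter: asyncio.Semaphore) -> dict:
    async with limiter:  # at most MAX_CONCURRENT_JOBS bodies run at the same time
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        await asyncio.sleep(0.01)  # stand-in for the real (awaitable) work
        stats["active"] -= 1
        return {"id": job["id"], "status": "done"}


async def run_jobs(jobs: list) -> list:
    limiter = asyncio.Semaphore(MAX_CONCURRENT_JOBS)
    # gather returns results in the same order as the submitted jobs
    return await asyncio.gather(*(handle_job(job, limiter) for job in jobs))


results = asyncio.run(run_jobs([{"id": i} for i in range(5)]))
```

Whether this helps depends on how much of each job waits on I/O (such as downloads) rather than saturating the GPU.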

Anyways gl 🙂

warm quarry Feb 15, 2024, 5:27 PM

Good luck to you too 🙂

odd zinc Feb 15, 2024, 5:28 PM

Just a summary so I can mark this as the solution:

1. Use network storage to persist data between runs.
2. Or use an outside file storage / object storage provider.
3. If using a Google Cloud / S3 bucket, large files can use parallel downloads / uploads. There should be existing tooling out there, or you can obviously build your own.