# Directing requests from the same user to the same worker

warm quarry
#

Guys, thank you for your work. We are enjoying your platform.

I have the following workflow. On the first request from a user, the worker does some heavy work for about 15-20s and caches the result, so all subsequent requests are very fast (~150ms). But if one of the subsequent requests goes to another worker, that worker has to repeat the heavy work (15-20s). Is there any way to direct all subsequent calls from the same user to the same worker?

raw kiln
#

You only really benefit from FlashBoot if you have a constant flow of requests. Otherwise you can either set an Active worker or increase the idle timeout.

vagrant prawn
#

@warm quarry you can use request count scaling, and set it up so that, for example, the first 100 requests only need 1 worker, etc

warm quarry
#

It seems I introduced a bit of confusion with my explanation of the workflow, so I will expand on it. My model works on rendered PDFs of construction drawings. When a user makes a request, the PDF is downloaded from S3 and rendered to a high-quality image, which can take ~5-30s depending on the PDF. Each user has their own PDF. On a subsequent request, if it arrives at the same worker, the hard work (downloading, rendering) is already done and only the model runs, which is fast (150 ms). But if the request arrives at another worker, that worker has to download and render everything again. If we scale to 10-20 workers, which is what we are planning to do, it will pretty much ruin the experience for the user, because every PDF will get 10-20 very slow requests.

odd zinc
#

All pods on serverless can attach to network storage, which would allow you to persist data in between workers / runs

#

And then that way all workers have the same backing storage

#

Essentially your workflow should then look like:

  1. Worker gets a job
  2. Check network storage for client id > if exists pull existing resources > if not create a new folder
  3. Continue with the job from whatever xyz point.
  4. Write results if needed to network storage for other workers
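
A minimal sketch of the steps above, assuming the network volume shows up as a normal directory inside the worker. The `/runpod-volume/cache` default, the `CACHE_ROOT` variable, and the names `get_or_render`, `client_id`, `pdf_key` and `render` are all made up for illustration, and it assumes client ids are safe to use as folder names:

```python
import os
import tempfile
from pathlib import Path
from typing import Callable

# Assumption: the shared volume is mounted here; change it to match your setup.
CACHE_ROOT = Path(os.environ.get("CACHE_ROOT", "/runpod-volume/cache"))


def get_or_render(
    client_id: str,
    pdf_key: str,
    render: Callable[[str], bytes],
    cache_root: Path = CACHE_ROOT,
) -> bytes:
    """Return the rendered image for a client's PDF, rendering only on a cache miss."""
    # Step 2: one folder per client id, created if it does not exist yet.
    client_dir = cache_root / client_id
    client_dir.mkdir(parents=True, exist_ok=True)
    cached = client_dir / (pdf_key.replace("/", "_") + ".img")

    # Cache hit: some worker already did the slow part, so reuse its output.
    if cached.exists():
        return cached.read_bytes()

    # Step 3: cache miss, do the slow download + render (hypothetical callback).
    data = render(pdf_key)

    # Step 4: write to a temp file, then rename, so other workers are less
    # likely to read a half-written file. The rename is atomic on a local POSIX
    # filesystem; whether that holds on a network volume is an assumption.
    fd, tmp_path = tempfile.mkstemp(dir=client_dir)
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    os.replace(tmp_path, cached)
    return data
```

Two workers can still render the same PDF at the same moment on a cold cache; this only makes sure later requests skip the slow path, no matter which worker they land on.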

warm quarry
#

yep, that's nice, thanks a lot. The only thing is that it will limit workers to one datacenter

odd zinc
#

Because of what you said, I actually prefer to use my own storage mechanism, especially since it sounds like neither your final files nor the initial resources are insanely big

#

(my personal wrapper lol)

warm quarry
#

🙂 Much appreciated. But would firebase be faster than just uploading files to s3?

#

Never heard about it

odd zinc
#

it's just an easier wrapper around Google Cloud Buckets, so I like Firebase

odd zinc
#

I just hate AWS

#

xD

#

Honestly, if I could, I'd avoid aws / google buckets 😆, but there are no better file storage / object storage providers out there

warm quarry
#

🙂 Who loves them ?

odd zinc
#

But yeah, I also just think it's easier for me to have an easy wrapper around Google firebase file storage + they've got a nice UI + I get a ton of file storage for free before I need to pay for it

#

So it is great for me for developing, cause I don't need to keep paying AWS ingress/egress costs

warm quarry
#

nice, nice

#

We just already have a lot of infra around s3 😦

odd zinc
#

Haha, then go with S3

#

But yeah, not too bad. I work with really long audio / videos on runpod, and if your files can be optimized before sending / downloading them (compressing, converting file formats, stripping unnecessary data etc) it can also help with getting things moving faster. But honestly ur files sound small enough that it might not be necessary

#

idk how big ur files are tho

warm quarry
#

it depends, a lot of the pdfs are quite small, around 30 MB, but render time can still be quite long. Some of them are about 500 MB.

odd zinc
#

S3 / Google Cloud buckets support range downloads

#

So you can in parallel upload / download

#

files

#

So for large files, that is prob what you want to look into, that is what i did for my larger files
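
A rough sketch of the range-download idea, not tied to any provider: the file is split into byte ranges, the ranges are fetched in parallel threads, and the parts are joined in order. `fetch_range`, `split_ranges` and `parallel_download` are hypothetical names. With S3 via boto3, `fetch_range` could wrap `get_object(Bucket=..., Key=..., Range=f"bytes={start}-{end}")["Body"].read()`, and the object size would come from a separate metadata call:

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


def split_ranges(size: int, chunk: int) -> List[Tuple[int, int]]:
    """Inclusive (start, end) byte ranges covering a file of `size` bytes."""
    return [(start, min(start + chunk, size) - 1) for start in range(0, size, chunk)]


def parallel_download(
    size: int,
    fetch_range: Callable[[int, int], bytes],
    chunk: int = 8 * 1024 * 1024,
    workers: int = 8,
) -> bytes:
    """Fetch all ranges concurrently and reassemble them in order."""
    ranges = split_ranges(size, chunk)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in input order, so the parts line up
        # even if the requests finish out of order.
        parts = pool.map(lambda r: fetch_range(r[0], r[1]), ranges)
        return b"".join(parts)
```

This keeps the whole file in memory, which is probably fine for a 500 MB pdf on a GPU worker but worth checking. The chunk size and worker count here are guesses to tune, and existing SDK tooling may already do this for you.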

warm quarry
#

Thanks for the help

odd zinc
#

But in my mind a PDF renderer might not be eating up GPU resources all the way - i could be wrong

#

but it's something ive been playing with

#

lol, but my video / audio transcriber does eat up a lot of resources so i could only get maybe 2 concurrent things going at once

#

Anyways gl 🙂

warm quarry
#

Good luck to you too 🙂

odd zinc
#

Just a summary so I can mark this solution:

  1. Can use network storage to persist data in between runs (this limits workers to one datacenter)
  2. Or use an outside file storage / object storage provider
  3. If using a Google Cloud / S3 bucket, large files can use parallel (range) downloads / uploads; there should be existing tooling out there, or you can obvs custom make ur own