# Running 30 A100 workers


pine raptor
#

Can I run 30 A100 workers for an endpoint? We have a business process that requires low processing times. I want to test how much it would cost to handle 30 requests a day on this platform, to see whether it is feasible for us. How can I increase the number of workers? Right now it only lets me go up to 5.

fading iris
#

Currently, a minimum balance is required to increase the number of workers. It should be $200 for 20 workers and $300 for 30 workers. Once your balance is high enough, you should see a button in the console under the "Serverless" section, where you create new endpoints (like in this image).

pine raptor
#

I'm not planning to spend that much if the platform's balance of performance and price is not a fit for us. Is there any special case in which someone can increase my quota temporarily? @fading iris

pine raptor
#

@fading iris I have another question that doesn't require increasing the quota.

#

I use an A100 GPU setting in Colab and the same A100 GPU setting here.

#

In Google Colab, inference takes 3 minutes to finish for the same data, but here it takes almost 6 minutes.

#

I am not talking about cold start + processing time

#

This is the time it takes for the model to process the data.

#

What could cause such a difference? I am not sure whether this question is too far out of context, but this is one of the first AI workloads we are going to deploy on GPUs, so I do not have much experience with GPU virtualization etc.

#

And I calculated the price based on the Colab compute time, plus or minus some variation.

#

Doubling the time on the same hardware seems a bit too much.
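
To make the cost impact concrete, here is a rough sketch of the arithmetic, assuming billing is proportional to compute seconds. The price used is a made-up placeholder, not an actual platform rate, and `daily_cost` is a hypothetical helper name.

```python
def daily_cost(requests_per_day, seconds_per_request, price_per_second):
    """Estimate daily spend, assuming cost scales linearly with compute time."""
    return requests_per_day * seconds_per_request * price_per_second


# Placeholder value for illustration only; substitute the real per-second price.
PRICE_PER_SECOND = 0.001

# 30 requests a day, at the Colab timing (3 min) and the observed timing (6 min).
colab_based = daily_cost(30, 3 * 60, PRICE_PER_SECOND)
observed = daily_cost(30, 6 * 60, PRICE_PER_SECOND)

print(f"estimate from Colab timing: {colab_based:.2f} per day")
print(f"estimate from observed timing: {observed:.2f} per day")
print(f"ratio: {observed / colab_based:.1f}x")
```

Whatever the real rate is, under this linear assumption a run that takes twice as long costs twice as much, so the estimate built from Colab timings would be off by the same factor.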

#

By hardware I mean the GPU; maybe the RAM and CPU do not match. But the model does not use other resources that much.
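
One way to check whether the two environments really match is to print the same hardware summary in both places and compare the output. The sketch below is only an illustration: `describe_host` is a hypothetical helper, the RAM lookup relies on `os.sysconf` and may not work outside Linux, and the GPU details are only reported if PyTorch is installed and a CUDA device is visible.

```python
import os
import platform


def describe_host():
    """Collect a small hardware summary to compare between two environments."""
    info = {
        "platform": platform.platform(),
        "cpu_count": os.cpu_count(),
    }
    try:
        # Total physical RAM in GiB (Linux; may be unavailable on other systems).
        ram_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
        info["ram_gib"] = round(ram_bytes / 2**30, 1)
    except (AttributeError, ValueError, OSError):
        info["ram_gib"] = None
    try:
        import torch  # optional; skipped if PyTorch is not installed

        if torch.cuda.is_available():
            info["gpu"] = torch.cuda.get_device_name(0)
            props = torch.cuda.get_device_properties(0)
            info["gpu_mem_gib"] = round(props.total_memory / 2**30, 1)
    except ImportError:
        pass
    return info


info = describe_host()
for key, value in info.items():
    print(f"{key}: {value}")
```

The GPU name and memory size are worth comparing closely, since A100 cards come in 40 GB and 80 GB variants and in PCIe and SXM form factors, so "A100" on both sides does not guarantee identical hardware.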

unkempt oracle
#

maybe the model takes forever to load from the network drive?
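
A quick way to test that idea is to time the model load and the inference step separately. This is a minimal sketch under stated assumptions: `load_model` and `run_inference` are hypothetical stand-ins for the real code, and `timed` is an illustrative helper, not part of any library.

```python
import time


def timed(fn, *args, sync=None, **kwargs):
    """Run fn and return (result, elapsed_seconds).

    sync is an optional callable run before and after fn, e.g.
    torch.cuda.synchronize, so that queued GPU work is included in the timing.
    """
    if sync is not None:
        sync()
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    if sync is not None:
        sync()
    return result, time.perf_counter() - start


# Hypothetical stand-ins; replace with the real loading and inference code.
def load_model(path):
    time.sleep(0.05)  # simulates reading weights from disk or a network drive
    return {"path": path}


def run_inference(model, data):
    time.sleep(0.01)  # simulates the forward pass
    return [x * 2 for x in data]


model, load_s = timed(load_model, "/path/to/model")
output, infer_s = timed(run_inference, model, [1, 2, 3])
print(f"load: {load_s:.3f}s  inference: {infer_s:.3f}s")
```

With PyTorch on a GPU, passing `sync=torch.cuda.synchronize` matters because CUDA kernels are launched asynchronously and the wall clock can otherwise stop before the GPU has finished. If the load step dominates on the endpoint but not in Colab, the storage is the likely culprit rather than the GPU.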