#required Hardware

neat zodiac
#

I work at a place that is currently building an AI chatbot for a software department of around 200 people.
What hardware would be required to properly run this chatbot with an LLM of around 60B params? Does anyone have a rough idea?

(let's say each person uses the bot at most twice per hour)

Thanks in advance!
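
For a rough sense of the load those numbers imply, here is a back-of-envelope sketch. The 30-second response time is an assumed placeholder, not a measurement, and real traffic is bursty, so peaks will sit well above the average.

```python
def request_rate(users: int, requests_per_user_per_hour: float) -> float:
    """Average requests per second across all users."""
    return users * requests_per_user_per_hour / 3600.0


def avg_concurrency(rate_per_s: float, seconds_per_response: float) -> float:
    """Little's law: average in-flight requests = arrival rate * time per request."""
    return rate_per_s * seconds_per_response


# 200 people, at most 2 requests per hour each (figures from the question).
rate = request_rate(users=200, requests_per_user_per_hour=2)
print(f"{rate:.3f} requests/s")

# ASSUMPTION: each answer takes about 30 s to generate.
print(f"{avg_concurrency(rate, 30.0):.2f} requests in flight on average")
```

On average that is a light load; the sizing question is mostly about fitting the model in memory and surviving short bursts.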

rough barn
# neat zodiac I work at a place that currently works on an AI chatbot for a software dep. of a...

A rough estimate would be multiple high-performance servers or a cloud-based solution with significant computational power and memory capacity to handle the model's processing demands efficiently. The exact hardware requirements would depend on factors such as the complexity of the interactions, expected peak usage, and desired response times. So to make a long story short, you need at least an RTX 3000 series and a decent CPU to get everything to run efficiently, especially if it's a bunch of ppl using it. :p You kinda need to rely on cloud services tbh, or build your own

neat zodiac
rough barn
#

I'm not sure though how much just a singular a6000 can take

#

Cuz I've never really used one b4

#

Although I doubt that it would struggle much
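
Whether a single A6000 (48 GB of VRAM) is enough comes down mostly to how big the weights are at a given precision. A minimal sketch, assuming a 60B-parameter model and a 25% overhead factor for KV cache and runtime buffers; that factor is a guess for illustration, not a measured value:

```python
import math

# Bytes per parameter at common precisions.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_gb(params_billion: float, precision: str) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * BYTES_PER_PARAM[precision]


def cards_needed(params_billion: float, precision: str, vram_gb: float,
                 overhead: float = 1.25) -> int:
    """Minimum number of cards. `overhead` is an ASSUMED fudge factor."""
    return math.ceil(weight_gb(params_billion, precision) * overhead / vram_gb)


for precision in BYTES_PER_PARAM:
    print(precision, weight_gb(60, precision), "GB of weights ->",
          cards_needed(60, precision, vram_gb=48), "x 48 GB card(s)")
```

Under these assumptions a 4-bit quantized 60B model fits on one 48 GB card, while fp16 needs several.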

rough barn
#

DDR5 ofc

#

Just make sure to have lots of a6000's

neat zodiac
neat zodiac
#

"less secure" as in less reliable

rough barn
rough barn
#

But tbh compared to the a6000

#

It isn't really that worth it :p

#

And really ram can help in caching, data transfer, data preprocessing, parallel processing and system operations

rough barn
#

So like it's always a good idea to also have a good amount of ram

rough barn
#

More than 64?

neat zodiac
#

U think 1 aint enough?

rough barn
#

It won't really harm the gpu so go ahead

#

Unless u overclock

#

Which is not a good idea tbh

neat zodiac
rough barn
#

Yeah no over clocking is a no no

neat zodiac
#

Its company money, so idc, I just need to provide proper reasons, otherwise we wont get the funding for this project

rough barn
#

And let's say u want to branch out

#

U may need more

#

But in reality u already have 2 gpus so no need to get a new one

neat zodiac
rough barn
#

Ohh then good

#

Should do the trick

neat zodiac
#

What

rough barn
#

But make sure u also have a good cpu

neat zodiac
#

Wym = what do you mean lol

rough barn
neat zodiac
rough barn
#

I'm a little confused :p bear with me

neat zodiac
#

Nahh, i said wym I already have 2, wym means what do you mean

#

We dont have any

rough barn
neat zodiac
#

We

#

Dont

#

Have

#

Any

bright vessel
#

(e.g. privacy, or custom models)

neat zodiac
#

or to train other models

#

and we dont wanna get sued soooo yeahhhhhhhhhhh

#

we'd like to keep our customers

#

lol

#

also we live in germany, data protection law is extremely serious/strict here

bright vessel
#

Fair. There are european models as a service available, they should comply with the GDPR

neat zodiac
#

4090s support multi GPU(ing), right? like using two 4090s for example to run an AI chatbot

#

I need to collect some examples lol, meeting with management is on thursday and till then we need to show them why we need specific hardware

bright vessel
#

I'm not too knowledgeable on the big-scale deployments, but perhaps renting cloud GPUs to test on would be wise ?

bright vessel
#

Not for future deployment, but only to collect figures and see which hardware / software suite is appropriate

neat zodiac
#

and I kinda dont wanna take that route lol

bright vessel
#

Ah fair 😅

neat zodiac
#

@bright vessel whats a good AMD option for a good workstation gpu?

bright vessel
#

So with AMD you're essentially gambling

neat zodiac
#

isnt that technically a good option?

#

looking at the 2nd comparison

#

to the 6000 ada

bright vessel
#

assuming things work smoothly. It's tremendously worse on consumer cards, but enterprise should be fine

neat zodiac
#

like on paper (on the screenie) the w7900 looks fire

bright vessel
#

On paper yeah, when it comes to software support, that's when fun stuff happens

#

Nvidia's CUDA has been the de facto standard for years now, while AMD's solutions are quite new

neat zodiac
#

this is what the ollama discord server says

bright vessel
#

They work, but bugs are to be expected

neat zodiac
#

hmmmmm

#

okay so looking at the cores, amd is lacking

#

this is what ive written down so far

neat zodiac
#

We aint training the models

#

we just wanna use them

cedar swan
neat zodiac
bright vessel
#

Take a look at the advertised "TFlops" and memory bandwidth, those are the relevant numbers
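
Memory bandwidth matters because generating each token for a single user means reading roughly all the weights once, so bandwidth divided by model size gives a crude ceiling on tokens per second. A sketch with placeholder numbers (take the real ones from the vendor spec sheets); it ignores KV-cache reads and batching, which lets several requests share one pass over the weights:

```python
def decode_tokens_per_s_upper_bound(bandwidth_gb_s: float, weight_gb: float) -> float:
    """Crude single-stream ceiling: one full read of the weights per token."""
    return bandwidth_gb_s / weight_gb


# PLACEHOLDER figures for illustration only, not a specific card or model.
bandwidth_gb_s = 800.0  # advertised memory bandwidth
weight_gb = 40.0        # size of the (quantized) weights in memory
print(decode_tokens_per_s_upper_bound(bandwidth_gb_s, weight_gb), "tokens/s at best")
```

Real throughput will land below this bound, but it is handy for comparing cards on paper.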

neat zodiac
bright vessel
#

Honestly good luck with that lol

#

Managing to just wing it™️ and land with something that works is a challenge for sure

neat zodiac
neat zodiac
#

or here

#

for the a6000

rough barn
#

Scary stuff :>

bright vessel
neat zodiac
#

we using ollama

bright vessel
#

So that's llama.cpp under the hood if I remember right

neat zodiac
bright vessel
#

Yup it is
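
For collecting figures, Ollama exposes a local HTTP API that can be scripted. A minimal sketch using only the standard library, assuming Ollama is running on its default port 11434; the model name in the usage comment is a hypothetical placeholder for whatever model has actually been pulled:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default local endpoint


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for a locally running Ollama."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout_s: float = 120.0) -> str:
    """Send the request and return the generated text."""
    with urllib.request.urlopen(build_request(model, prompt), timeout=timeout_s) as resp:
        return json.loads(resp.read())["response"]


# Usage (needs a running Ollama server; "my-model" is a placeholder):
#   print(generate("my-model", "Summarize our coding guidelines."))
```

Timing a batch of these calls on rented hardware is one way to get concrete numbers for a funding request.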

neat zodiac
#

what cores am I supposed to look at?

#

*TFLOPS

bright vessel
neat zodiac
neat zodiac
#

obviously selfhosted

bright vessel
#

Make sure to check, but I'm almost certain you're looking for the normal "single precision" or "half precision" TFlops
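
The TFLOPS figure mostly affects how fast the prompt is processed before the first token comes out. A common rule of thumb is about 2 floating-point operations per parameter per token. A sketch under stated assumptions: the 100 TFLOPS peak and 50% utilization are illustrative placeholders, and which spec-sheet column applies depends on the precision the backend actually computes in:

```python
def prefill_seconds(params_billion: float, prompt_tokens: int,
                    peak_tflops: float, utilization: float = 0.5) -> float:
    """Estimated prompt-processing time using ~2 FLOPs per parameter per token."""
    flops_needed = 2.0 * params_billion * 1e9 * prompt_tokens
    flops_per_s = peak_tflops * 1e12 * utilization  # ASSUMED achievable fraction
    return flops_needed / flops_per_s


# 60B parameters, 1000-token prompt, PLACEHOLDER 100 TFLOPS peak.
print(f"{prefill_seconds(60, 1000, 100):.1f} s to process the prompt")
```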

neat zodiac
#

my workday is over now, imma continue tmr, thanks so far prayge ill let u know in case I need more stuff tmr xd cuz me has to do some comparing between cpu's next lol

bright vessel
#

You might be interested in the koboldai discord server, found over at https://koboldai.org/discord

They're a community of enthusiasts, some of them run bigger servers than others, but I'm sure you'll get great answers there.
Plus they've got some of the people working on the inference backends

neat zodiac
surreal river
#

@neat zodiac is there any specific reason why you want to focus on a 70b model instead of a 34 or 7 billion one?

neat zodiac
#

We benchmarked the 7b one and it was quite amazing, so I assume the 34b one should be even better. Why not use the best one? ^^

surreal river
# neat zodiac I meant the 34b one

I see, i've not tested 34b, but a 13b works perfectly on an A6000 using FastAPI. We had a test run with 200+ users and it was pretty smooth in terms of usage

#

Also sorry if you have mentioned it above already but are you looking to host it locally or going for a cloud server?

neat zodiac
#

Im in the car now, driving to the office, lemme get there rq and imma get back to u ^^

surreal river
#

All good, i'm going to sleep, it's 8 AM here lol. You can post here, i'll be up in like 4-5 hours

neat zodiac
#

oki @bright vessel me is back

#

how important is a fast cpu?

neat zodiac
bright vessel
#

Not really sure, the best I've used was a 6600xt with a ryzen 3600... I really don't know how it scales

neat zodiac
#

Mby u know @rancid hatch

#

Wrong lisa

rough barn
#

hai

#

im here

neat zodiac
#

@rough barn

#

Well

#

Lol

rough barn
#

i wouldnt wanna be bottlenecked or anything

#

cuz usually llms need good cpus too

#

and a bunch of ppl are gonna be using it as a server all at once so u might run into problems :p

rough barn
neat zodiac
#

👍

bright vessel
neat zodiac
#

2 4090s

bright vessel
#

Check the PCIe lanes of the platform you'll be running on

#

Most consumer boards will halve the bandwidth available to the first card when you have two

#

and can also halve / quarter the bandwidth available to the second one
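
To put numbers on the lane split, here is a small sketch of theoretical PCIe bandwidth per slot. It uses the per-lane transfer rates for PCIe 3.0 to 5.0 with 128b/130b encoding; real-world throughput is somewhat lower, and how a given board splits its lanes has to be checked in its manual:

```python
# Raw transfer rate per lane in GT/s for each PCIe generation.
GT_PER_S = {3: 8.0, 4: 16.0, 5: 32.0}


def pcie_gb_s(gen: int, lanes: int) -> float:
    """Theoretical one-direction bandwidth in GB/s (128b/130b encoding)."""
    return GT_PER_S[gen] * (128 / 130) / 8 * lanes


# One card alone at x16, versus two cards sharing the lanes as x8/x8,
# versus a second slot that only gets x4.
for lanes in (16, 8, 4):
    print(f"PCIe 4.0 x{lanes}: {pcie_gb_s(4, lanes):.1f} GB/s")
```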

neat zodiac
#

can u also recommend me a proper mainboard?

bright vessel
#

Again, I don't do much of the server stuff

neat zodiac
#

something that we can use when we wanna add 3 or 4 cards

#

brrrrrrr

bright vessel
#

The KoboldAI discord would have answers for sure though

neat zodiac
neat zodiac
#

ty ❤️

#

btw @bright vessel

#

I legit for the love of god have no clue what tflops im supposed to look at lmfao

bright vessel
#

Single precision

neat zodiac
#

ahhhh

bright vessel
#

Even though I'm surprised they don't give half-precision figures

rough barn
#

i mean he said around 200 people

neat zodiac
#

god dammit why dont they just name everything the same

bright vessel
rough barn
#

i just wouldnt want any bottle necking to take place u know

bright vessel
#

Hence, why you might just choke your GPUs by simply adding an extra nvme or GPU on a 5600x platform

bright vessel
neat zodiac
rough barn
#

they'll probably help a lot more

neat zodiac
rough barn
#

and discords just for that

neat zodiac
#

Im askin around in other servers as well but so far nothing really helpful

rough barn
#

ask bing too :3

#

maybe it helps i dunno

#

but it might say dumb stuff :p

neat zodiac
#

@bright vessel bad first impression in the KoboldAI server lmfao

bright vessel
rough barn
neat zodiac
#

I can tell

rough barn
#

ig do more research or gamble :3

neat zodiac
#

this guy .-.