Groq Server | Mistral AI | Page 1

pure musk Feb 26, 2024, 8:42 PM

#

You should work with Groq to get really fast responses. https://groq.com/
I am not affiliated with Groq in any way.

lunar shale Feb 27, 2024, 6:47 AM

#

with the new partnership on azure .. i don't think that's the way

#

also groq deployments aren't really as scalable as you think

#

it's 230 MB of SRAM per accelerator at 20k a pop

#

deployments are extremely expensive and power hungry

#

unless azure gets 100k of those

#

also models have to be separately compiled for the accelerators - pretty much like cuda, but differently

#

the real upside of groq is in different industries

#

say defence / medical / finance

#

or hyper scale

#

it's pretty much what we've done in fpga accelerators like the vc1902 for a few years now

#

just a bigger chip

#

the lower card here in my machine is a vck5k

#

and uses the vc1902 fpga fabric

#

their lpu (groq) is pretty much a systolic array with a lot of sram

#

just asic vs fpga

wooden musk Feb 27, 2024, 9:01 AM

#

lunar shale it's 230 MB of SRAM per accelerator at 20k a pop

20k a pop is the comically overpriced Mouser price. It's a 14nm chip with no DRAM, so it has to be very cheap to make; the price per card is likely somewhere between $500 and $5k depending on the volume.
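
To make the numbers in this exchange concrete, here is a rough back-of-the-envelope sketch. It assumes weights are stored at 1 byte per parameter (int8), held entirely in on-chip SRAM, with nothing reserved for activations or KV cache. The function names and the 70B example size are illustrative assumptions; the 230 MB figure and the price points are the ones quoted in the messages above.

```python
import math

# 230 MB of SRAM per accelerator, as quoted in the thread.
SRAM_PER_CARD_BYTES = 230 * 10**6


def cards_needed(num_params: int, bytes_per_param: int = 1) -> int:
    """Minimum cards needed to hold the weights alone in SRAM.

    Assumption: ignores activations, KV cache, and any per-card overhead.
    """
    return math.ceil(num_params * bytes_per_param / SRAM_PER_CARD_BYTES)


def hardware_cost(num_params: int, price_per_card: float) -> float:
    """Total card cost at a given (hypothetical) per-card price."""
    return cards_needed(num_params) * price_per_card


if __name__ == "__main__":
    params_70b = 70 * 10**9
    # Price points mentioned in the thread: $500, $5k, and the $20k list price.
    for price in (500, 5_000, 20_000):
        total = hardware_cost(params_70b, price)
        print(f"70B at ${price:,}/card: {cards_needed(params_70b)} cards, ${total:,.0f}")
```

Under these assumptions a 70B model needs 305 cards for the weights alone, so the per-card price is what decides whether a deployment costs hundreds of thousands or several million dollars.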

lunar shale Feb 27, 2024, 9:02 AM

#

chip cost is a function of wafers ordered

wooden musk Feb 27, 2024, 9:02 AM

#

that's why I'm saying "depending on the volume"

lunar shale Feb 27, 2024, 9:02 AM

#

i paid 13k for my vck

#

and it's 1/4 as powerful as the groq

#

that stuff is just that pricy

#

also .. they really charge what they want, as they have the only accelerator in that class except for cerebras

#

but tbh i would rather use cerebras' wafer-scale engine for such an undertaking

#

the unit economics are not customer scale

wooden musk Feb 27, 2024, 9:04 AM

#

they said themselves that the price is not 20k

#

there's a tweet

lunar shale Feb 27, 2024, 9:05 AM

#

even if it's 2.5 mil for a 70b model, you get quite a few h100s for that and can actually train on them

#

this is raw inference

#

again .. mistral doesn't operate its own datacenters - it would be up to the datacenters to buy those accelerators and make them available to mistral

#

so really the one you would need to be talking to is ms azure 🙂

#

but awkwardly enough they'd rather buy h100s

#

and it's not like groq appeared overnight either .. they've been around for quite some time

queen axle Feb 29, 2024, 2:44 PM

#

lunar shale even if its 2.5 mil for a 70b model you get quite a few h100's for that and can ...

the problem with the H100 is not the price ) even with $2.5M you can't get any right now ... Groq is a perfectly valid alternative and (barring any secret terms with Microsoft) Mistral would have a lot to gain by partnering with them and having their models served by Groq

lunar shale Feb 29, 2024, 2:46 PM

#

i mean thats where opinions divide - it still has to be in a datacenter - and the models have to be compiled to run on groq in a very custom way

#

as with any custom accelerator

#

thats usually a longer process

#

also scaling is difficult as you need another full deployment

#

and as for lead times - sure they are 6 months with h100s

#

but there is a bit more to it than just inference

#

as with groq all you can do is int8 inference

#

cerebras is overall the better bet

#

if one were to go down that route
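
For readers unfamiliar with the term, int8 inference means the weights (and often the activations) are stored as 8-bit integers plus a scale factor instead of floats. Below is a minimal sketch of symmetric per-tensor quantization in plain Python. It is a generic illustration only, not Groq's compiler or toolchain, and the function names are hypothetical.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127].

    Returns the quantized integers and the scale needed to map them back.
    """
    max_abs = max(abs(v) for v in values)
    # Guard against an all-zero tensor, where any scale works.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Approximate reconstruction of the original floats."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [0.5, -1.0, 0.25, 0.0]
    q, scale = quantize_int8(weights)
    restored = dequantize_int8(q, scale)
    for original, approx in zip(weights, restored):
        print(f"{original:+.4f} -> {approx:+.4f}")
```

The reconstruction error per value is bounded by about half the scale, which is why int8 is usually fine for inference but not a substitute for the higher-precision formats used in training.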

queen axle Feb 29, 2024, 2:48 PM

#

lunar shale i mean thats where opinions divide - it still has to be in a datacenter - and th...

That's why I said "partner". Groq already serves mistral-7B and mixtral on their own API; I am sure they would serve small, medium and large under some conditions

lunar shale Feb 29, 2024, 2:49 PM

#

it's a risky move as it would be a first for a newer player - ofc they have the hardware

#

but there is no one else on their APIs except for open-source / open-weight models

#

it's early days - so partnering without the operational hardware would call scalability into question .. as of now i see it more as a preview product

#

but for someone to push their production inference onto them is a risky move

queen axle Feb 29, 2024, 2:50 PM

#

wait and see
