#Groq Server
with the new partnership on azure .. i don't think that's the way
also groq deployments aren't really as scalable as you think
it's 230 MB of sram per accel at 20k per pop
deployments are extremely expensive and power hungry
unless azure gets 100k of those
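To put rough numbers on the scaling point above, here is a back-of-the-envelope sketch. It uses the 230 MB per accelerator and 20k per card quoted in the chat; everything else is an assumption for illustration: weights are stored at 1 byte per parameter (int8), all SRAM is usable for weights, and activations, KV cache, and interconnect overhead are ignored. The function names are hypothetical.

```python
import math

# Figures quoted in the chat above.
SRAM_PER_CHIP_BYTES = 230_000_000   # 230 MB of on-chip SRAM per accelerator
PRICE_PER_CHIP_USD = 20_000         # "20k per pop" (list price)


def chips_needed(params: float, bytes_per_param: float = 1.0) -> int:
    """Chips required to hold the weights alone in on-chip SRAM.

    Assumption: int8 weights (1 byte per parameter) and 100% of SRAM
    available for weights. Real deployments need extra headroom.
    """
    return math.ceil(params * bytes_per_param / SRAM_PER_CHIP_BYTES)


def hardware_cost(params: float,
                  bytes_per_param: float = 1.0,
                  price_per_chip: int = PRICE_PER_CHIP_USD) -> int:
    """Card cost only; excludes hosts, networking, power, and cooling."""
    return chips_needed(params, bytes_per_param) * price_per_chip


if __name__ == "__main__":
    for name, params in [("7b", 7e9), ("70b", 70e9)]:
        n = chips_needed(params)
        print(f"{name}: {n} chips, ~${hardware_cost(params):,} in cards")
```

Under these assumptions a 70b model needs about 305 chips just to hold its weights, and the total scales linearly with whatever per-card price actually applies at volume.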
also models have to be compiled separately for the accels - pretty much like cuda but differently
the real upside on groq is in different industries
say defence / medical / finance
or hyperscale
it's pretty much what we've done in fpga accelerators like the vc1902 for a few years now
just a bigger chip
the lower card here in my machine is a vck5k
and uses the vc1902 fpga fabric
their lpu (groq) is pretty much a systolic array with a lot of sram
just asic vs fpga
20k per pop is the comically overpriced Mouser price. It's a 14nm chip with no DRAM, so it's gotta be very cheap to make, and the price per card is likely somewhere between $500 and $5k depending on the volume.
chip cost is a function of wafers ordered
that's why I'm saying "depending on the volume"
i paid 13k for my vck
and it's 1/4 as powerful as the groq
that stuff is just that pricey
also .. they charge what they want really as they have the only accelerator except for cerebras in that line
but tbh i would rather use the cerebras wafer-scale engine for such an undertaking
the unit economics are not customer scale
even if it's 2.5 mil for a 70b model you get quite a few h100s for that and can actually train on them
this is raw inference
again .. mistral doesn't operate its own datacenters - it would be for the datacenters to buy those accelerators and make them available to mistral
so really the one you would need to be talking to is ms azure 🙂
but awkwardly enough they'd rather buy h100s
and it's not like groq showed up overnight either .. they've been there for quite some time
The problem with H100 is not the price; even with $2.5M you can't get any right now ... Groq is a perfectly valid alternative and (besides any secret terms with Microsoft) Mistral would have a lot to gain from partnering with them and having their models served by Groq
i mean that's where opinions divide - it still has to be in a datacenter - and the models have to be compiled to run on groq in a very custom way
as with any custom accelerator
that's usually a longer process
also scaling is difficult as you need another full deployment
and as for lead times - sure they are 6 months with h100s
but there is a bit more to it than just inference
as with groq all you can do is int8 inference
cerebras is overall the better bet
if one would embark down that route
That's why I said "partner". Groq already serves mistral-7B and mixtral on their own API, I am sure they would serve small, medium and large under some conditions
it's a risky move as it would be a first for a newer player - ofc they have the hardware
but there is no one else on their APIs except for open source / open weight models
it's early days - so partnering without the operational hardware would raise questions about scalability .. as of now i see it more as a preview product
but for someone to push production inference onto them is a risky move
wait and see