#How much would it cost to run Neuro all week, non-stop?

sullen lynx
#

we don't know any specifics, but I think it would probably be a few thousand

neon pumice
#

~~$1100 is an upper bound, assuming the main costs of Neuro are the LLM and custom TTS components, and that both are paid for via a SOTA API.
Evil seems to speak around 9000 words/hour or 45k characters/hour (incl. spaces)
So in a month of evil streams, she will have spoken 7.6M characters. We'll round this up to 8M characters, and say that this is also 4M worth of tokens.
Then we add up the costs, and add an extra 20% to cover the rest.

I think this overestimates the cost by a decent chunk.
Vedal could have negotiated a lower cost for the TTS stuff because he has a very predictable need for TTS and could have gotten a custom plan (assuming that you can negotiate for token costs like you can for your TV plan).
The LLM inference is also definitely much cheaper than $15/MTok. It's likely not a frontier model, and is probably hosted on a GPU instance or even locally.
https://azure.microsoft.com/en-us/pricing/details/cognitive-services/speech-services/
https://www.anthropic.com/pricing#anthropic-api~~
Edit: the LLM costing is completely wrong - I ignored the cost of the input tokens (this undermines the upper bound somewhat)
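
For reference, the struck-through estimate can be sketched as a small calculation. The 8M characters, 4M tokens, $15/MTok and 20% overhead come from the message above; the TTS rate is not stated in the thread, so `PLACEHOLDER_TTS_USD_PER_MCHAR` is a made-up placeholder to be replaced with the real rate. As the edit notes, this only counts output tokens and ignores input tokens.

```python
# Sketch of the estimate above: TTS cost + LLM output cost, plus 20% for "the rest".
# PLACEHOLDER_TTS_USD_PER_MCHAR is NOT from the thread; swap in the real TTS rate.
PLACEHOLDER_TTS_USD_PER_MCHAR = 16.0

def monthly_api_cost(chars_spoken, tokens_generated,
                     tts_usd_per_mchar, llm_usd_per_mtok, overhead=0.20):
    """Return (tts_cost, llm_cost, total_with_overhead) in USD."""
    tts_cost = chars_spoken / 1_000_000 * tts_usd_per_mchar
    llm_cost = tokens_generated / 1_000_000 * llm_usd_per_mtok
    total = (tts_cost + llm_cost) * (1 + overhead)
    return tts_cost, llm_cost, total

tts_cost, llm_cost, total = monthly_api_cost(
    chars_spoken=8_000_000,      # rounded up from 7.6M in the message
    tokens_generated=4_000_000,  # the message's characters-to-tokens guess
    tts_usd_per_mchar=PLACEHOLDER_TTS_USD_PER_MCHAR,
    llm_usd_per_mtok=15.0,       # output-token rate quoted in the message
)
print(f"LLM output tokens: ${llm_cost:.2f}")
print(f"Total incl. 20% overhead: ${total:.2f}")
```

The LLM part comes out to $60, which is the figure referred to later in the thread.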

night ravine
neon pumice
#

the big costs of Neuro are definitely R&D

sullen lynx
#

Yep

south walrus
#

llm + tts are not all of neuro

neon pumice
#

apparently Neuro is on "LLM Model 6, Iteration 18" or smth similar
assuming that iteration refers to each finetune of a model, then around 6*18 = 108 finetuning runs have occurred for the LLM in the past 2 years (roughly 1 finetune every week). Even if Neuro is just an 8B, I think it's still somewhat costly to finetune that every week
Of course, TTS, filter etc. probably also have hefty training costs

It's much harder for us outsiders to estimate the cost of R&D - we don't even know what components are in her system, or how ambitious Vedal is with R&D

south walrus
#

electricity costs could easily not be insignificant

#

input tokens would probably cost more ngl on an api

night ravine
sullen lynx
#

Vedal works on neuro full time so yes

night ravine
# south walrus electricity costs could easily not be insignificant

Yup. I suppose one could probably make an approximation of the total until now using the average electricity cost per hour in the UK x the total amount of time Neuro has streamed since her debut. Though, we don't know how much she is trained off stream, so there's that too

#

lol. Guys, Vedal may be egging us on to launch a financial investigation into his operations NeuroClueless

neon pumice
#

nah he's just throwing us red herrings

neon pumice
# south walrus electricity costs could easily not be insignificant

eh suppose you had 8 4090s running nonstop
8 * 0.5kW * 24hr * 7days * 12p/kWh = £80
doesn't seem that much compared to the cloud costs, but that's probably because the cloud costs I calculated are inflated for some reason
I guess if Neuro was mostly local, this would be a pretty big chunk of the cost
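
The electricity figure above can be checked with a few lines of Python. All inputs (8 GPUs, 0.5 kW each, a full week, 12p/kWh) are the message's own assumptions, not measured values.

```python
# Weekly electricity cost for GPUs running nonstop, using the numbers from the message.
def electricity_cost_gbp(num_gpus, kw_per_gpu, hours, pence_per_kwh):
    """Energy used (kWh) times the unit price, converted from pence to pounds."""
    energy_kwh = num_gpus * kw_per_gpu * hours
    return energy_kwh * pence_per_kwh / 100

weekly_cost = electricity_cost_gbp(num_gpus=8, kw_per_gpu=0.5,
                                   hours=24 * 7, pence_per_kwh=12)
print(f"£{weekly_cost:.2f} per week")
```

That is 672 kWh for the week, which at 12p/kWh lands at roughly £80.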

night ravine
sullen lynx
#

Neuro isn’t local iirc

south walrus
#

but also pretty sure if neuro was claude and i was using the api it would cost wayyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy more than $60

#

in a model like claude 3.5 sonnet

#

and 200k context window

#

if you were using the whole context window, each request alone would be $0.75 for input tokens

#

and assuming those 8 million characters are like

#

idk being generous here only 20000 requests

#

thats $15000

#

which is a bit more than your estimate of $60
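
The input-token arithmetic in the messages above, as a sketch. The 200k context, $0.75 per request and 20000 requests are from the thread. The per-million-token rate is not stated there; $0.75 for 200k tokens implies $3.75 per million, so that value is inferred here, not taken from a price list (at $3 per million the same request would be $0.60).

```python
# Cost of input tokens when every request sends the full context window.
def input_cost_per_request(context_tokens, usd_per_million_input_tokens):
    return context_tokens / 1_000_000 * usd_per_million_input_tokens

def total_input_cost(num_requests, cost_per_request_usd):
    return num_requests * cost_per_request_usd

# 3.75 is inferred from the thread's $0.75 figure (assumption, see above).
per_request = input_cost_per_request(200_000, 3.75)
total = total_input_cost(20_000, 0.75)  # per-request figure as quoted in the thread
print(f"per request: ${per_request:.2f}, total: ${total:,.0f}")
```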

neon pumice
night ravine
#

That's a lot neuro7

#

Man, that last subathon must have been expensive.

sullen lynx
#

the subs would have paid off the expenses no?

night ravine
# sullen lynx the subs would have paid off the expenses no?

No idea tbh. Though, assuming that Vedal uses Claude (which he was probably joking about NeuroTease idk), we would need to go back to the subathon vods and compute an estimate of the total gross return from donations and subs during that period. Moreover, we could use data from his total hours streamed with Neuro (someone uploaded the data for that on the forums), and simply do Revenue - ((average requests made by Neuro in an hour x total hours streamed during the subathon) x the input token cost per request) to get to a rough estimate
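
The formula in the message above can be written out as a small function. None of the inputs are known, so the example values below are purely hypothetical placeholders, and the function name is made up for illustration. It only accounts for input-token cost, as the message does.

```python
# Rough net = revenue - (requests per hour * hours streamed * input cost per request).
def subathon_net_usd(revenue_usd, requests_per_hour, hours_streamed,
                     input_cost_per_request_usd):
    api_cost = requests_per_hour * hours_streamed * input_cost_per_request_usd
    return revenue_usd - api_cost

# Hypothetical placeholder inputs, NOT real subathon figures.
example_net = subathon_net_usd(revenue_usd=1000.0, requests_per_hour=10,
                               hours_streamed=5, input_cost_per_request_usd=0.75)
print(example_net)
```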

south walrus
#

thankfully neuro is not the claude api

#

or I could be bankrupted who knows

#

also if you don't wanna pay silly amounts for input context you can just utilise Claude's new caching thing

#

or better yet

#

host ur own open-source model

#

use any inference engine with kv caching
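
Why caching matters for a chat-style workload, as a toy sketch: without a cache, every request reprocesses the whole conversation so far; with a prefix (KV) cache, only the newly added tokens need processing. This is a simplified model that ignores cache eviction, context-window truncation and how any particular engine or API actually bills cached tokens; the function name and turn sizes are made up for illustration.

```python
# Count how many input tokens get processed over a conversation,
# with and without reuse of the already-seen prefix.
def tokens_processed(turn_lengths, use_prefix_cache):
    total = 0
    context = 0  # tokens already in the conversation before this turn
    for new_tokens in turn_lengths:
        if use_prefix_cache:
            total += new_tokens            # prefix is reused, only new tokens cost work
        else:
            total += context + new_tokens  # whole context is processed again
        context += new_tokens
    return total

turns = [100, 100, 100, 100]  # hypothetical turn sizes in tokens
without_cache = tokens_processed(turns, use_prefix_cache=False)
with_cache = tokens_processed(turns, use_prefix_cache=True)
print(without_cache, with_cache)
```

The uncached total grows roughly quadratically with the number of turns while the cached total grows linearly, which is where the savings on long contexts come from.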