How much would it cost to run Neuro all week, non-stop? | Neuro-sama Headquarters

sullen lynx Sep 13, 2024, 4:39 PM

#

we don't know any specifics but i think it would probably be a few thousand

neon pumice Sep 13, 2024, 5:35 PM

#

~~$1100 is an upper bound, assuming the main costs of Neuro are the LLM and custom TTS components, and that both are paid for via a SOTA API.
Evil seems to speak around 9000 words/hour or 45k characters/hour (incl. spaces)
So in a week of non-stop evil streams (168 hours), she will have spoken 7.6M characters. We'll round this up to 8M characters, and say that this is also 4M worth of tokens.
Then we add up the costs, and add an extra 20% to cover the rest.

I think this overestimates the cost by a decent chunk.
Vedal could have negotiated a lower cost for the TTS stuff because he has a very predictable need for TTS and could have gotten a custom plan (assuming that you can negotiate for token costs like you can for your TV plan).
The LLM inference is also definitely much cheaper than $15/MTok. It's likely not a frontier model, and is probably hosted on a GPU instance or even locally.
https://azure.microsoft.com/en-us/pricing/details/cognitive-services/speech-services/
https://www.anthropic.com/pricing#anthropic-api~~
Edit: the LLM costing is completely wrong - I ignored the cost of the input tokens (this undermines this upper bound somewhat)
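
For anyone wanting to retrace the struck-through estimate, here is a minimal Python sketch using only the numbers quoted in the post. It covers output tokens only, which is exactly the gap the edit admits to. The post does not state its TTS cost, so `implied_tts_cost` is an inference backed out of the $1100 total and the 20% overhead, not a figure from the author.

```python
# Figures quoted in the post above.
CHARS_PER_HOUR = 45_000        # Evil's estimated speaking rate, incl. spaces
HOURS_PER_WEEK = 24 * 7        # one week, non-stop
LLM_PRICE_PER_MTOK = 15.0      # USD per million tokens, as quoted in the post
OVERHEAD = 0.20                # the "extra 20% to cover the rest"
QUOTED_UPPER_BOUND = 1100.0    # USD, the post's headline figure


def weekly_characters(chars_per_hour=CHARS_PER_HOUR, hours=HOURS_PER_WEEK):
    """Characters spoken in one week of non-stop streaming."""
    return chars_per_hour * hours


def llm_output_cost(tokens_millions, price_per_mtok=LLM_PRICE_PER_MTOK):
    """Output-token cost only; input tokens are ignored, as in the post."""
    return tokens_millions * price_per_mtok


def implied_tts_cost(total=QUOTED_UPPER_BOUND, llm_cost=60.0, overhead=OVERHEAD):
    """ASSUMPTION: total = (llm + tts) * (1 + overhead); solve for tts."""
    return total / (1 + overhead) - llm_cost


if __name__ == "__main__":
    print(f"characters per week: {weekly_characters():,}")
    llm = llm_output_cost(4.0)  # the post's rounded 4M tokens
    print(f"LLM output cost: ${llm:.2f}")
    print(f"implied TTS cost: ${implied_tts_cost(llm_cost=llm):.2f}")
```

45k characters/hour over 168 hours gives the 7.6M figure. The LLM share comes out at $60, so nearly all of the $1100 would have to be TTS.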

night ravine Sep 13, 2024, 6:25 PM

#

neon pumice ~~$1100 is an upper bound, assuming the main costs of Neuro are the LLM and cust...

I see. Thinking about it, it would be really cool for someone to do a data analysis of the potential monthly/yearly costs of Neuro until now. neuroKufufu (I may try to do this)

neon pumice Sep 13, 2024, 6:26 PM

#

the big costs of Neuro are definitely R&D

sullen lynx Sep 13, 2024, 6:31 PM

#

Yep

south walrus Sep 13, 2024, 6:31 PM

#

llm + tts are not all of neuro

neon pumice Sep 13, 2024, 6:32 PM

#

apparently Neuro is on "LLM Model 6, Iteration 18" or smth similar
assuming that iteration refers to each finetune of a model, then around 6*18 = 108 finetuning runs have occurred for the LLM in the past 2 years (leading to a whopping 1 finetune every week). Even if Neuro is just an 8B, I think it's still somewhat costly to finetune that once a week
Of course, TTS, filter etc. probably also have hefty training costs

It's much harder for us outsiders to estimate the cost of R&D - we don't even know what components are in her system, and how ambitious Vedal is with R&D
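
A quick check of the finetune cadence, as a sketch. It takes the "Model 6, Iteration 18" reading at face value and assumes every model went through 18 iterations, which is a guess.

```python
MODELS = 6                 # "LLM Model 6"
ITERATIONS_PER_MODEL = 18  # ASSUMPTION: every model had 18 iterations
WEEKS = 2 * 52             # roughly two years


def finetune_runs(models=MODELS, iterations=ITERATIONS_PER_MODEL):
    """Total finetuning runs if each iteration is one finetune."""
    return models * iterations


def runs_per_week(runs=None, weeks=WEEKS):
    """Average finetunes per week over the period."""
    if runs is None:
        runs = finetune_runs()
    return runs / weeks


if __name__ == "__main__":
    print(f"total runs: {finetune_runs()}")
    print(f"runs per week: {runs_per_week():.2f}")
```

Under those assumptions that is 108 runs, or roughly one finetune per week.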

south walrus Sep 13, 2024, 6:32 PM

#

electricity costs could easily not be insignificant

#

input tokens would probably cost more ngl on an api

night ravine Sep 13, 2024, 6:32 PM

#

neon pumice the big costs of Neuro are definitely R&D

Most likely. I mostly wonder if Neuro is currently sustainable with donos and stuff

sullen lynx Sep 13, 2024, 6:34 PM

#

Vedal works on neuro full time so yes

south walrus Sep 13, 2024, 6:34 PM

#

neon pumice ~~$1100 is an upper bound, assuming the main costs of Neuro are the LLM and cust...

this math seems quite wrong

night ravine Sep 13, 2024, 6:35 PM

#

south walrus electricity costs could easily not be insignificant

Yup. I suppose one could probably make an approximation of the total until now using the average electricity cost per hour in the UK x the total amount of time Neuro has streamed since her debut. Though, we don't know how much you train her off stream so there's that too

#

lol. Guys, Vedal may be egging us on to launch a financial investigation on his operations NeuroClueless

neon pumice Sep 13, 2024, 6:37 PM

#

~~nah he's just throwing us red herrings~~

neon pumice Sep 13, 2024, 6:39 PM

#

south walrus electricity costs could easily not be insignificant

eh suppose you had 8 4090s running nonstop
8 * 0.5kW * 24hr * 7days * 12p/kWh = £80
doesn't seem that much compared to the cloud costs, but that's probably because the cloud costs I calculated are inflated for some reason
I guess if Neuro was mostly local, this would be a pretty big chunk of the cost
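
The electricity arithmetic above as a small Python sketch, so the inputs can be swapped out. The defaults are the post's own guesses (8 GPUs at 0.5 kW each, 12p/kWh), not known facts about Neuro's hardware.

```python
def weekly_electricity_cost_gbp(num_gpus=8, kw_per_gpu=0.5,
                                hours=24 * 7, pence_per_kwh=12.0):
    """Cost in GBP of running the GPUs flat out for the given hours."""
    energy_kwh = num_gpus * kw_per_gpu * hours
    return energy_kwh * pence_per_kwh / 100  # pence -> pounds


if __name__ == "__main__":
    print(f"weekly electricity: £{weekly_electricity_cost_gbp():.2f}")
```

With the defaults this is 672 kWh, which matches the ~£80 figure.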

night ravine Sep 13, 2024, 6:40 PM

#

neon pumice ~~nah he's just throwing us red herrings~~

We're on to him Erm

sullen lynx Sep 13, 2024, 6:42 PM

#

Neuro isn’t local iirc

south walrus Sep 13, 2024, 6:53 PM

#

neon pumice eh suppose you had 8 4090s running nonstop 8 * 0.5kW * 24hr * 7days * 12p/kWh = ...

compared to ur somehow $60 of claude costs it's not insignificant

#

but also pretty sure if neuro was claude and i was using the api it would cost wayyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy more than $60

#

in a model like claude 3.5 sonnet

#

and 200k context window

#

if you were using the whole context window, each request alone would be $0.75 for input tokens

#

and assuming those 8 million characters are like

#

idk being generous here only 20000 requests

#

thats $15000

#

which is a bit more than your estimate of $60
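
The input-token argument above, sketched in Python. The 200k context, $0.75 per request, and 20000 requests come from the messages. The $3/MTok rate in the last line is an assumption, not a figure from the thread.

```python
def input_cost_per_request(context_tokens, price_per_mtok):
    """Input cost in USD of one request that sends context_tokens tokens."""
    return context_tokens / 1_000_000 * price_per_mtok


def total_input_cost(requests, cost_per_request):
    """Input cost in USD across all requests."""
    return requests * cost_per_request


def implied_price_per_mtok(cost_per_request, context_tokens):
    """Per-million-token rate implied by a quoted per-request cost."""
    return cost_per_request / (context_tokens / 1_000_000)


if __name__ == "__main__":
    # Thread figures: full 200k context, $0.75 per request, 20000 requests.
    print(f"total: ${total_input_cost(20_000, 0.75):,.0f}")
    print(f"implied rate: ${implied_price_per_mtok(0.75, 200_000):.2f}/MTok")
    # ASSUMPTION: a $3/MTok input rate instead.
    print(f"at $3/MTok: ${input_cost_per_request(200_000, 3.0):.2f}/request")
```

The $0.75 figure implies about $3.75 per million input tokens. At an assumed $3/MTok, a full-context request would be $0.60 and the total about $12000, so the conclusion holds either way: input tokens dominate.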

neon pumice Sep 13, 2024, 7:14 PM

#

south walrus if you were using the whole context window, each request alone would be $0.75 fo...

ahhh i see
i was wondering how those other costs came in
evilWheeze i was way off

night ravine Sep 13, 2024, 7:33 PM

#

That's a lot neuro7

#

Man, that last subathon must have been expensive.

sullen lynx Sep 13, 2024, 7:35 PM

#

the subs would have paid off the expenses no?

night ravine Sep 13, 2024, 7:42 PM

#

sullen lynx the subs would have paid off the expenses no?

No idea tbh. Though, assuming that Vedal uses Claude (which he was probably joking about NeuroTease idk), we would need to go back to the subathon vods and compute an estimate of the total gross return from donations and subs during that period. Moreover, we could use data from his total hours streamed with Neuro (someone uploaded the data for that on the forums), and simply do a Revenue - ((average token requests made by Neuro in an hour x total hours streamed during the subathon) x the token input cost) to get a rough estimate
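
That formula as a small Python sketch. Every number in the example call is a hypothetical placeholder (nobody in the thread has the real revenue or request counts), and `subathon_net_estimate` is just an illustrative name.

```python
def subathon_net_estimate(revenue, requests_per_hour, hours_streamed,
                          input_cost_per_request):
    """Revenue minus estimated input-token spend, in the revenue's currency.

    Ignores output tokens, TTS, platform cuts and everything else,
    so it is only a rough estimate.
    """
    token_spend = requests_per_hour * hours_streamed * input_cost_per_request
    return revenue - token_spend


if __name__ == "__main__":
    # HYPOTHETICAL inputs, for illustration only.
    net = subathon_net_estimate(
        revenue=20_000.0,            # placeholder, not a real figure
        requests_per_hour=100,       # placeholder
        hours_streamed=168,          # placeholder: one week non-stop
        input_cost_per_request=0.75, # the per-request figure quoted earlier
    )
    print(f"rough net: ${net:,.2f}")
```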

south walrus Sep 13, 2024, 8:22 PM

#

thankfully neuro is not the claude api

#

or I could be bankrupted who knows

#

also if you don't wanna pay silly amounts for input context you can just utilise claude's new caching thing

#

or better yet

#

host ur own open-source model

#

use any inference engine with kv caching
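
A rough sketch of why caching changes the picture. It assumes a fixed shared prefix that stays cached for the whole run, and the multipliers (cache writes at 1.25x, cache reads at 0.1x the base input price) are assumptions to check against the provider's current pricing. Real chat contexts shift over time and caches expire, so treat this as a best case.

```python
MTOK = 1_000_000


def uncached_input_cost(requests, context_tokens, base_price_per_mtok):
    """Every request pays full price for the whole context."""
    return requests * context_tokens / MTOK * base_price_per_mtok


def cached_input_cost(requests, prefix_tokens, new_tokens, base_price_per_mtok,
                      write_mult=1.25, read_mult=0.10):
    """ASSUMPTION: the prefix is written once, then read from cache each time."""
    write = prefix_tokens / MTOK * base_price_per_mtok * write_mult
    reads = (requests - 1) * prefix_tokens / MTOK * base_price_per_mtok * read_mult
    fresh = requests * new_tokens / MTOK * base_price_per_mtok
    return write + reads + fresh


if __name__ == "__main__":
    # Placeholder split: 199k cached prefix + 1k new tokens per request.
    full = uncached_input_cost(20_000, 200_000, 3.0)
    cached = cached_input_cost(20_000, 199_000, 1_000, 3.0)
    print(f"uncached: ${full:,.2f}  cached: ${cached:,.2f}")
```

A self-hosted engine with KV caching gets a similar effect without per-token billing, since the shared prefix is not recomputed on every request.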
