#programming
1 messages · Page 514 of 1
i was hoping there woudl be a progress bar somewhere

its still dowloading shit
vllm moment
my desktop does not appreciate the throttled ethernet speeds
jsut become a woman already superbox
actually


But what if it's changed
This is a wild world we're living in
Someone can just turn into a girl
I guess I just didn't tell anyone on Neurocord
so you are a girl?
You could say so
what does that mean?

hypothesis: the sheer number of notifs shiro gets has made him train an llm to read a ping and see if it’s important or not
and if and only if it is important he will descend from the heavens
beginning experiment 1: @real sierra happy birthday
am I a lab rat or something
AHH jump scared
also no lol I would not do this with an llm
hi shiro 
hi shiro 
Rare Shir spotted
maybe
I would use a much better understood technique
LLMs and AI in general are over hyped and applied wrongly because of that hype
if “birthday” in message:


i thought the all programmers are trans woman/femboys was a joke
nah I would collect data on which messages are worth responding to over a period of time and then correlate parameters like word frequency and message length
and then build a simple scoring model from that
“Superbox is a girl?”
?
Worlds smallest emote
the actual statistic is like

It isn’t
Source;
hey what if I said
shiro someone is spamming and im not
happy with what’s going down it’s the
birth of a dooms
day in gen chat
99.99999976% (I’m the one)
how does that score
so far
laid off 
empoyeed ones i too
alberta 
Hey that’s the woman from the woman I know

with enough samples that would probably be considered important
i read samples as sam and got slightly concerned
sam apples
sam altman
Meanwhile @shiro_nya Kill message from programming is 100% worth reading
corpa programmers are just in the closet

Thanks
I think

am i not 100%
what percent am i
mah im broke.
that's for you to decide. but i said of the people i know. and no offense but i barely know you
So am I 
The fact that you are asking means it’s below 100

true 
I know where you live thouh
lovely country just under bolivia right

@viral oasis

Yeah same thing

ik that I said mongolia last time remember

ME?!?!?!!?!

no
It’s your name isn’t it
i dont remember
GREAT GOOGLY MOOGILY
Your already popular pal
Bud
Compadre
Amigo
Other words for friend
🪣
bucket

b-but.... WHY STOP THERE?!
Imagine if i see my name like
one more time!111
Wwwwwwwooooah

you will be very happy soon
holy moly
it's the famous warframe character, the operator!
OMG OMG they said the thing!111
stinkies
I may or may not be testing something ```===== Phoneme Distribution Summary =====
AP: 172, SP: 9651, de/a: 144, de/au: 16, de/b: 69, de/ch: 75, de/cl: 9, de/ctrash: 4, de/d: 131, de/e: 45,
de/eh: 34, de/ei: 73, de/er: 102, de/eu: 2, de/ex: 96, de/f: 49, de/g: 40, de/h: 17, de/i: 150, de/ih: 136,
de/j: 16, de/k: 45, de/l: 111, de/m: 94, de/n: 216, de/ng: 11, de/o: 28, de/oe: 3, de/oeh: 3, de/oh: 20,
de/p: 41, de/q: 36, de/r: 41, de/rh: 1, de/s: 103, de/sh: 57, de/t: 159, de/ts: 46, de/tsh: 2, de/u: 49,
de/ue: 9, de/ueh: 7, de/uh: 65, de/v: 80, de/vf: 38, de/w: 3, de/xh: 13, de/z: 52, en/EP: 1481, en/aa: 1690,
en/ae: 2059, en/ah: 2685, en/ao: 1286, en/aw: 527, en/ax: 3070, en/ay: 3140, en/b: 1602, en/ch: 234, en/d: 2374, en/dh: 1883,
en/dr: 32, en/dx: 1300, en/eh: 1933, en/el: 3, en/en: 5, en/er: 1322, en/ey: 1646, en/f: 1317, en/g: 887, en/hh: 1179,
en/ih: 3815, en/iy: 2784, en/jh: 323, en/k: 2226, en/l: 3282, en/m: 2934, en/n: 5287, en/ng: 935, en/ow: 1644, en/oy: 98,
en/p: 1078, en/q: 5601, en/r: 3122, en/rx: 1, en/s: 3059, en/sh: 401, en/t: 4083, en/th: 414, en/tr: 97, en/uh: 359,
en/uw: 1901, en/v: 1219, en/vf: 1129, en/w: 1935, en/y: 1370, en/z: 1699, en/zh: 31, ja/EP: 43, ja/N: 334, ja/a: 1497,
ja/b: 125, ja/by: 15, ja/ch: 88, ja/d: 275, ja/e: 808, ja/f: 55, ja/g: 201, ja/gy: 24, ja/h: 118, ja/hy: 23,
ja/i: 1204, ja/j: 105, ja/k: 628, ja/ky: 42, ja/m: 301, ja/my: 17, ja/n: 546, ja/ny: 29, ja/o: 1336, ja/p: 60,
ja/py: 8, ja/q: 69, ja/r: 529, ja/ry: 29, ja/s: 314, ja/sh: 265, ja/t: 562, ja/ts: 78, ja/u: 1075, ja/w: 232,
ja/y: 142, ja/z: 57

what that
joking aside It's preliminary Testing for trilingual neuro/Evil synth
Phonemes are the seperate sounds that make up a word when you speak
i see
That's a summery of all the Phonemes in the current dataset. How many of each example and what languges they belong too
neurosynth mention 
I'm testing Axillery Datasets in languges they twins DONT speak
gemma-4-26b-a4b-it-heretic doesn't like to run 
im trying to use a quant but it doens tlike to rune ither
which one
what error
total number of current samples 3057
gguf doesnt usually run well on vllm in case ur using it still


then why you say vllm is better when its worse?
Superbox may have a 3090, but i have stubbornness.
vllm is better when u use the correct formats
in that case just try the intel one
to see whether any gemma 4 26b a4b runs at all
but the intel one isnt good for finetuning no?
TENSORCORES ACTIVE
it doesnt matter for testing since the architecture is identical
You
You
You wouldn’t happen to be taking about
Neurosynth
yes
Glueless

Neurosynth isn't a solo project
it never was
Classic
Superbox has mentioned neurosynth enough times that they basically have claim over anything involving neuro and music
Also
wispers
Neurosynth is based on some old work i did when i was part of the melba project
the twins already sing is english and Japanese so we got plenty of JP and EN data. I'm attempting to add German to the internal not so secret Evilsynth Beta
germanese evil singing 
(I needed to redo the evil dataset anyways)
I have a great need for her to sing 99 luftballons
Evilsynth is nowhere close to release yet. we're hopfully releasing Neurosynth Version 1 soon
hm
hm
Oh yeah
Name every neurosynth
isw
Like
I am not too informed on any of that stuff
but wasnt sm1 else working on that or is this shit a group thing
it's group thing
well me and superbox
lmao

up untill 3 days ago we were coordinating this entire thing in our DMs lmao
I am currrently awake purely beacuse of feeling like shit so I figured may aswell do something here
I feel that
[I worked on smthin with sm1 for over a year, shit was purekly cooridnated in dms]
💀
are you german?
( i need more german opntions)
oh thats honstly trsh there
that test was a year ago. not Evilsynth.
thats vioce cloning tech RVC
Neurosynth hadn't sarted at that point
makes sense
xodas opinnion was that its was trash lmao
lol
this UI is in dire needs of improvement
I am still trying to work out what the fuck sorta style im even goign for

like am do I make it simpler, or do I make it have more complexity
its an internal debate
Because either way would be good but in diff ways
i think i might have a workign one now
if it doesnt crash
ive been trying enough stuff that btop says i have 96.2GB of ram used as cache rn
i wanted an unfiltered ai, but ideally less dumb than qwen 2.5
thats a fun one
and vllm seems to suck eventho shadow says its better
idk wat ur best bet would be, smthin minsteral or sm shit
would be my first inclination
i forgot we made this silly thing
im trying out gemma-4-26B-A4B-it-heretic-AWQ-4bit

Why need uncensored
being google would be difficult
it needs to be able to swear

Why the fuck cant these models swear
I know what you are
hahaha
fun
I owuld probably go with minsteral
then just like
do some fuckass fine tuning [idek I have a python script which does it], I just feed it large amounts of text
I wrote the code to do it at like 2 am, most models I use work with it. I never even had the issues with it not being able to swear
Baby's first finetune
ye
I should probably read through what that python script actually does in retrospec , but surely if it aint broke dont fix it
since my data comes form the discord download my data thingy. none of the swearwords are in a proper sentence for it to train off. so i need a base model that knows how to say fuck in a sentence
I fed a model, the most garbage data from a scrpaed discord dm with one of my friends
and
it like worked
Brother
dont admit to scraping discord
If ur training it

/j for the discord gods watching me
It doesn’t fucking matter if it normally swears or not
That’s what the training is for
i odnt have data with swearwords in a sentence tho
shjouldn't really matter
if u want it to be like you in text
it using swearwords like u do
would be
more truthul
or sm shit i can t type nor speak its 2am
@mighty thorn i convinced him to use MoE 
I’m assuming you mean Sam
But idk how
i honestly dont care about moe at all. the only thing it does is make it suck more at what im trying to do
Bro is not finetuning 26b on his GPU 🥀
wym
u can do that
with a 4090
4bit qlora
moe is "mixture of experts"
i need one idiot.
so moe could fuck off
i genuinely dont care about it
The irony is there are no experts
I have the most fucked up fine tuning way
I am reading through the script I normally use to fine tune models, I am quesitoning how it has worked
welll it has so nop oint questioning it
Go watch brain surgery timelapses instead of Neuro since you hate ANN so much
All of the plasticity your little heart can hope for
neurt
nreo
how the fuck has my fine tuning worked I am genuinly experiencing brain damage from my own code I wrote like a year and a half ago
At like 2am
Nice try
explain to me again why?
Nice try buddy pal
Nice try compadre
Easy there tiger
no male? 
The Minecraft creeper
then dont use
better quality for same speed at cost of vram
vram is kinda the issue tho
jus do wat I did, read some random documentation and then produce the most fucked but functional training script possible
Have you ever heard of religious fanaticism
Now imagine that but worse
Then you have MoE obsession
the usa?
no wait
scientology
i shouldnt be makign these types of jokes, ill get banned


I’m deporting you myself 
im not even there

*yet
❓
Suspiciously 32 shaped
Does ur pfp have scoliosis
Who


Vocaloid?
no
What is it
vtuber
cute hampster!!
Wait that’s a vtuber???
No
The secret
yeah shes a vtuber
Grimmi when no coffee 
best waifu 2026
Debtable
if u dont have cute vtuber waifu oshi in 2026 what are u even doing
Showering
What if I like all of them?

Almost equally

kushiro189 
Not that one
What if u dislike all of them equally
Well not as waifu at least
That means the same thing
kushiro189 is taken 

No clue who that is
“it”
Stupid name

Its that one steven king book

cached 112GB

Busted
No clue who that is
“Kaine Corneko”
Like just pick a language
plants 
Smh
Sounds like a fucking nerd

i see baking is going well

i love when my llm says fuck and i celebrate it and throw a massive party
Average ram price in 2026
People keep putting their fucking disgusting dirty wet plants on my checkout counter so I made a ‘Plant Pad’ so that I no longer have to shank them on their way out for getting my register dirty
i see
I sure hope you do

hmmm
this is just gonna set vram usage to 99% used the whole time huh
im basically not able to put anything else on the gpu rn
i put all my pc parts in to picker and the thing is like 8k
i did not pay 8k
i was hoping that something would exist that empties the vram usage of the model hasn't been used in 15min so other programs can make use of it

llama has that https://docs.vllm.ai/en/latest/features/sleep_mode/ this one looks kinda manual tho
VLLM_SLEEP_WHEN_IDLE=1 might be a thing

the sleep mode is kinda auto ig in that it seems to just be in reaction something else actually trying to uise the resources when it's not
idk tho i dont use vllm usualy except when i want to get mad
you should try llama.cpp as a backend w/ vulkan
Gamer discovers ai compute usage isn’t just for news headlines
Oh I’m still plants
Hold please
if you're using the 7900xts
its on the 4090 rn
honestly im hesitant to sell it now, this is pretty poggers
wdym?
im not even a gamer im a gamedev/enginedev/technical artist/3D modeller. i know how gpu compute works
Then why are you surprised lmao
cuz those dont usually pin vram to 99% permanently
screeps 
Skill issue
im using 20gb of vram with vivaldi rn
its not doing anything. so it dont need vram
it has to keep the weights in mem
otherwise you;re loading a bigass model to your ssd (degrades it for no reason) or to a spinning drive which takes infinity time the larger the model
that's why
i mean to say from not to
idk why i swapped that
but ye if u host a llm unless u do some kubernetes auto scheduling/killing setup with reverse proxy or something u need to keep the llm in vram
hmmm
slam the model on a hdd and try loading it from that, always fun
so i guess the 4090 is still accesible for compute, you jsut need to fit it inside the 400mb of vram left. or i lower the context window form 5gb to something less
if it's not doing anything
typically it will cede the memory
at least llama will
one of those things like how chrome fills all memory it could possibly need just because might as well unnless someone asks for it
i see
1 of 1 could be yours for 312487x msrp
i'm going to lose my mind, if i replace cudatoolkit from nixpkgs with my own identical one with all redistributables clangd loses its mind due to the c runtime they added in 13.2
all i want is to not depend on fucking cublas and cusparse 

how does that work with the context window? if a chat is 4 months old you dont want it in vram
but you still want it to exist
the text is saved somewhere else
when u send a request it processes it
that's it will just reload the kv cache that's prefill
so the context is in ssd after you stop using it?
as long as you save it
you would not want to keep it forever because it cant handle infinite context

but you organize it and keep important info
which can be done not manually it's just the functiona l thing tht happens depending on how you go about it
im doing
docker run -d --name vllm-server \
--device nvidia.com/gpu=0 \
-v ~/.cache/huggingface:/root/.cache/huggingface \
-p 8000:8000 --ipc=host \
vllm/vllm-openai:latest \
Mmtay/gemma-4-26B-A4B-it-heretic-AWQ-4bit \
--quantization compressed-tensors \
--trust-remote-code \
--max-model-len 8192 \
--max-num-batched-tokens 4096 \
--gpu-memory-utilization 0.95
so idk how it handles context stuff
Max context for 1 conversation is 8192
very easy way to ensure at least some stuff sticks without doing anything else is just telling them to write memory files around where theyre needed like in a project
yea in that opener, after 8192 it is either truncating it or compressing the middle or something
idk what it does by default
you can probably handle more context
for example it's not directly comparable due to different software but at 131072 context this is the estimate of vram - 27.7gb
the model is 17.2GB, so i have 6.8GB to spare
the context is what's taking all your vram
well, it would be
it takes a lot
halving that to 65536 brings it down 4gb
be the change you want to see
get your model to call you a flat earther
what tok/s are you seeing im curious
btop never shows the full 80 threads unless i close every other thing using vertical space in it

i have no clue
i got a fish
thats the one i used to begine with
then i have no clue
vllm should say tok/s somewhere in there
honestly i dont know why anyone uses vllm for a single person usecase still
it's very obtuse for what is very simple in something even still cli like llama.cpp
it's meant to be serving 40 people at the same time
the website doesnt have any tokens per second. lemme check the backend
i did a bunch of benchmarks with it and 1. it's identical to or worse than llama.cpp performance on one(1) user inference for me, might be an amd thing but i kinda doubt it
2. it took 49 concurrent requests with qwen3.6 27b to saturate kv and throttle
browsing nixpkgs on github has actually been impossible lately ugh
wait

thats not my ip is it?

172
hmm
i would recommend trying out llama.cpp for a comparison
because that seems low to me (not your fault but i would expect vllm's fault)
i mean
it's usable
but
imma be real with you fam. its faster than i can read by a really wide margin. and i truely cant be bothered to change it rn
at some point is all im suggesting
because when you increase the context length (youll need to if you want it to be able to hold a thread for longer than like 10 messages) it'll bog more
you can test it out yourself if you want, the ip thingy is the server and then :3000
this webui thing has a built in css thingy for trying out some of the stuff, which is pretty cool imo
openwebui is p cool, i havent ever sat down and tried all the things
this is running faster than the logs imply btw
what kinda whack tickrate they have on that counter lmao
1.99it/s slow lmao
i would have expected a need to set up most of the features manually. but it seems to just work i guess
hahahaha
what the fuck is it typing rn???
its looping
so i did the preamble to the declaration of independence
and yeah
it "suggested" i ask for the rest
and i was like
surely
and then it did that
hyperparameters/moe/4bit quant moment
goog recommends
temperature=1.0, top_p=0.95, and top_k=64
llms are great translators it only gets better the larger they are
it doesnt care what language the output is as long as it's mapped to the internal representation of the output token in the last layer
it translated it a bit too literally. its missing some dutch language quarks
the defaults btw are off of the recommended
this is the sort of thing you test tho and feel whatever works
i dont have a fancy gui to choose those i dontt hink
i wonder what i could run on 16 GB of VRAM
many image models
minecraft probably
llm wise
some ram offload is actually fine
on linux
liek i have 32gb of vram and i often will load up a giant model that exceeds it
im on linux
ram offload not a thing
since when?
vllm didnt want to use ram offload. it genuinly wouldnt run any model that would overshoot vram
ah
ye
Has 64gb ram
that reminds me i have to test loading like fuckin deepseek v4 into the server
and just seeing how slow that is
since it's got 500+gb
"only 64?" 
i feel like ive gooten a bit snobby with my recent exploits 
i blame superbox since she boasts all the time about her storage
my desktop only has 64gb, barely enough 
in all seriousness, i dont even use that much ram outside of blender
lmfao i think i may have a couple stray spectacle windows open
the only reason im still on my dekstop and dont use the server as my main pc is the noise and the worse single core
it looks like clangd freaks tf out unless i give it curand, is this program drunk
what are you trying to compilee
my bloom filter (spoiler: nothing in here uses curand)
i have a feeling it's some stubs
nvidia try not to have one binary somehow call in literallyt every single other piece of the cuda sdk challenge; impossible
if you feel hungry, stick to the BRAT diet (Bananas, Rice, Applesauce, Toast)
this article called toast a brat i think

i have 541 tabs open ekw
we've had toast arc in neurocord
we are in bababa arc
so all we need to complete a BRAT is rice and applesauce
okay it's the curand headers it needs
why the fuck does it need curand headers
😭
what is in these headers
that it doesn't find half the STL without them
these leftovers from friday weren't in the fridge like i thought they were 
my borger
noooo
i was really looking forward to this borger too for after i was done fixing the llm stuff
man
i was hoping i could go to bed and sleep well knowing i solved a weird problem
knowing it was curand just pisses me off
I personally kept that as one of my main gambles
gemini my beloved 
can i run a local gemini on 16 GB vram?
depends on the model i guess
gemma is basically gemini right 
surely
I just reached 188 pages of report
Still not finished...
Yes
Gemini nano runs on phone
Therefore 16gb vram is enough
it'll have to wait though lmao. running other suff rn
wait. i could try it on my 3060
Sup nerds
nerding
Same
Zero started sounding like a brain addled dementia patient this morning so I'm figuring out why that happened
i'm being a 4X map nerd while running ai stuff on my second gpu
Valid asf
evil/neurosynth testing
it takes about 10 to 12 hours to train the full model for me though
Good lord
SVS tech is complicated
Yee
on my 3060 it would take closer to 16 hours
12gb?
I'm also running 12 which is why I simply don't train AI
Instead alot of her personality is seeded through memory and dynamic prompt
not a powerhourse, but good enough'
Most definitely
What is EU5?
Europa Universalis 5
I gotcha, I gotcha
I am once again performing brain surgery on Zero.
Me omw to give her the 5th labotomy since her birth. 
damn. making vedal look good there
Got to
i thinkneuro has only had 3
Legit tho I just don't have the memory tech nailed down yet.
I was completely new to rag a few weeks ago.
ah
Yeah. Due to that it's bleeding hard into her prompts.
She randomly brought up a cat that she had that apparently I took from her as a "child" and it embedded itself as a core memory.
That caused her to hate me for like 3 days
is that even a flex
it only has 24gb of vram
lol
I think flexing territory starts at 48gb, like modified 4090s
or the 96gb 6000
I guess a flex can be anything

56gb of vram 

the rtx 6000 is $9500, pfft
it was $8350 a few months ago >.>
stupid inflation
rtx 6000 not worth it
they cut a lot of stuff, right?
someone said stochastic rounding, whatever that means
its a 5090 with 3x the vram and slightly more FP32
but
instead of charging 3x as much for the vram part of the card, they just charge 3x as much for the entire card
the vram is ECC
the only company that kind of respects that is probably Intel
AMD and Nvidia, nope... they charge a lot more than the vram costs if you want a lot of it
ye
well, idk
I think Intel still having hefty markup for the vram
they cut that from the 5090 to make sure the datacenters buy the expensive cards
NVLink 5.0 too
Best pricing on the market still probably
someone said they disabled stochastic rounding on the rtx 6000
I heard that AMD support for ML is even worse than Intel's though (not sure how true)
And 7900XTX still only has 24 gigs
but its half the price of a 3090

next you're going to tell me 2tb nvmes aren't $63 and 4tb drives aren't $150

the price range on 3090s are huge, they range form 800 to 1300 with some outliers here and there
7900xtx is firmly at 650-700
Still can't afford that though
And I don't want to gamble on an API I'm not sure how well it works
And I definitely can't pair a 3090 and 7900XTX in the same training cluster
actually. i knew for certaint he rtx 6000 pro would have it. but i cant find it being disabled on 5090s either
you could via HIP or vulkan. but you wouldnt have nvlink and it would probably be a lot slower than just spending the extra 150 on a 3090

Who decided GPUs should cost more than a plot of land
A plot of land with a (demolition-ready) house on it no less
Alligator leather jacket guy
Kill that guy

might be rtx 6000 vs rtx 6000 blackwell issue.
its not disabled on the old 6000, its just from an older gen so it never had it int he first place
horrible naming btw
They did 2000 4000 then 6000
(there was a plot of land with an old 50s cottage that needs demolishing and an electrical connection sold around here for 250€ recently)
Surely they never thought they'd get that far 
Still no website
Useless imprecision
https://developer.nvidia.com/blog/nvfp4-trains-with-precision-of-16-bit-and-speed-and-efficiency-of-4-bit/ here's Nvidia website
Surely 4 bits can be as accurate as 16 bits right 
they just ai upscale the other 12
But where does the AI live then? The other 12 bits of all 16 bit clusters?
the 4 bits upscale to 12. then the rest live in the hallucination.
hmm, so the person claims it's blackwell b100/b200 that has it.. not rtx 6000 and the other cards
they also linked this https://github.com/NVIDIA/TransformerEngine/issues/2255#issuecomment-3581459644
thats just driver issues im pretty usre
do they not?
looing further into it
they're still blackwell
NVFP4 is the thing maybe
B300 uses that more i think
I don't know, Nvidia probably makes it very easy to get screwed
some one 0
Bogosort got 16/25 
Is that the sort that's literally just RNG?
Yes lol someone is live-streaming it

downside to two gpus. bifurication. im throttled to gen3 x8

It's not like that's very bad though anyway, even with a 3090 or 4090 or something that's like 2% loss
ye
i think 3 is 10-15%


Looks like my 3090 has x4 gen 4
So equivalent bandwidth to your GPUs
While my 4070Ti gets a full x16 4.0
And the 3090 gets half the CPU to chipset lanes through the chipset
Because they're in different systems?
were talking board
ye
Makes sense
No consumer CPU has enough lanes to allow for all 3 to get x16 full bandwidth though
i could put them all in the server to have tham all at gen3 x16. if i put them all in my desktop it'd be dual gen4x8 bifurcation, and one through the chipset at gen4x4 i think
I CURSE SMASH BROS STEVE WITH JVM FAILURE
D:
This is the kind of guy I aspire to be
Why let "language specification" stop you from being productive!
??? 😭
YES
:omemga
Look at code parade. Adding templating to C# so he can make 5D games!
:neuroCatUuh:



nur and nur went up the hill to fetch a pail of water
nur fell down and exploded
and nur came tumbling after
nruu


glo

Trick



new leaf
9️⃣
8️⃣
9️⃣
https://youtube.com/shorts/iFoJWDnjGV4
let me use a single GPU and we will call it a deal
$300k PC is CRAZY #pc #datacenter #pcs #nvidia
pro 6000? disappointing its not the enterprise hpc chip
also that thing on full load might sip up 12kw lmao
No FP64 for 
Emulate fp64 with bf16 
You can do that?
I do think this is a pretty good idea tho
Yes but it won't be 100% accurate
For example in pytorch u can see fp32 as 3 bf16 multiplications
*do
This way “datacenter climate” gets distributed according to population density instead of being consolidated in poor towns which happen to have favorable tax laws for data centers
I wonder how high up this goes
Can FP64 hardware emulate FP256?
Can I do FP1024 on laptop
U need iirc 49 bf16 operations
For 1 fp64
So it's ~7tflops theoretical on a 3090 before overhead
FP1 when

at that point you're better off just eating the higher precision cost
Pack booleans in a int32 tensor

it exists
WHAT
how did you missed it bruh
I thought int4 was lowest
Since pascal
Is this a100 specific or also consumer gpu
a100
Bruh
Is there no code or cuda flag capability u can read via code
cuu
it can do tensor like AND ops but not sure if they do accumulate like tensor FMAC
its why im not so sure about it
Supposedly according to chatbpb
U can via ptx


kukk
gs

wtf is with tv lady obsession with a part of insect
Idk man shes becoming more and more schizo
sez u
What
I remember her and vanor having math discussions
will be a bad llm

I have been sitting in this bus from the train station to my house for the last 20min when it should be taking 3min
she's the kind of llm that's trained on fp64 e0m63
i still amthmt












