#programming
blue nodes = input, purple = output, white = hidden, red connections = negative, green connections = positive

Morning!
whats the difference between doing this and......
this
as you can see, in the first one, im using PRIMARY KEY(id)
and in the second one im adding PRIMARY KEY to the line where im creating the id column
is there a difference between these two methods?
ping me on reply, thanks in advance! 
nvm, i got the answer 
even on vacation i be improving the code
center = np.mean(frustumcorners, axis=0)[0:3]
is now
self.player.camera.position + self.player.camera.forwards * (depthlayers[i] + depthlayers[i+1]) * 0.5
double the speed
Meanwhile me advancing my studies for the next course (2 months left)
it looks like a mathematician wrote this
looks very barebones but just enough to make a computer function
its like skirting the line of doing just enough to be turing complete
that does actually make sense
would watch but gf is sleeping next to me
will watch if i remember later tho
its very fascinating
its very explicit
that's actually incredibly impressive
very nice
ooo
so you're one of the devs behind it
I can't judge the code quality too much because I'm only starting to learn rust. and its a completely different beast than embedded C
it seems interesting, it feels like I'm reading something inbetween rust and possibly markdown
that is interesting
aww it didn't output into the terminal
ooh, it says null?
oh yeah i didnt read the code xD
i just looked at the terminal output
oh xD
allocating memory like this is interesting
oh right
ah okay that makes sense
I haven't gotten out of bed yet
so apologies if my brain is a bit uh
sticky
and slow
is the goal to abstract manually allocating memory away?
are your allocating per variable or per program?
like as the end goal
ahh I see. since you can do whatever you want with raw memory

still, having a consistent way of allocating these things is gonna end up in more reliable and predictable programs
that's pretty fucking cool
so you're just iterating over memory
that works so seamlessly
its genuinely really impressive
the ableos recruitment drive never stops
yeah I do xD
I'm gonna be working with my gf on starting our own company, probably a repair company, on top of that, helping her work on her wayland compositor and some other projects too.
Also going to be designing custom electronics to sell to consumers
so as much as I'd love to I don't have time xD
I'll give it a follow!
<3, I'm not really sure how to follow it on forgejo though xD
and thank you <3 I'm mainly a reverse-engineer
your project is incredibly cool. I'm sure you'll find some passionate people <3
it IS cool that we're in the same timezone though
not "multiple columns can have primary keys", multiple columns can be the primary key
right
sorry

esl
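For reference, the two spellings asked about earlier can be compared directly. This is a minimal sketch using Python's built-in `sqlite3`; the table and column names are made up for illustration, and other database engines may differ in the details:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- column-level constraint: PRIMARY KEY written on the column itself
CREATE TABLE inline_pk (id INTEGER PRIMARY KEY, name TEXT);
-- table-level constraint: PRIMARY KEY(id) written after the columns
CREATE TABLE table_pk (id INTEGER, name TEXT, PRIMARY KEY (id));
-- only the table-level form can make several columns one composite key
CREATE TABLE composite_pk (
    user_id INTEGER,
    item_id INTEGER,
    qty INTEGER,
    PRIMARY KEY (user_id, item_id)
);
""")

def pk_columns(table):
    """Return the primary key columns of `table`, in key order."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk); pk is the
    # 1-based position of the column inside the key, or 0 if not part of it.
    keyed = sorted((row[5], row[1]) for row in rows if row[5] > 0)
    return [name for _, name in keyed]

for table in ("inline_pk", "table_pk", "composite_pk"):
    print(table, pk_columns(table))
```

For a single column both forms end up with the same key here. The table-level form is the one to reach for when several columns together are the primary key.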
??? how
what is that?
very academic and structured, and generally hints at it
it reads more "engineering" than "academic" to me
it just omits descs but nothing about it is surprising
I'm getting an SSL unrecognised name error when trying to clone the hbvm repo, you may need to update the certificate
the EU site is working, is there a way I can edit the config to point to that instead for the cargo build?
pointers are making me cry, they make optimization so much more annoying
a := 5
b := &a
b.* += 5
// now a is 10 despite being assigned once
ok what about this solution - when a is referenced with a pointer it gets converted into a cell so the compiler knows not to optimize it
a := 5
// internal
a := cell(a)
b := &a
b.* += 5
should work right? well what about this
a := 5
i := 0
loop if i == 3 break else {
// a := cell(a)
b := &a
b.* += 5
i += 1
}
// a is now 20
lets even assume that you can just flow a being a cell out of that loop, you still have to flow it also into the next iteration of the loop and then a is not a cell upon entering it for the first time but a cell upon entering it for the second time
so all you can do is just backtrack and make it so a was a cell all along and i hate that
alternatively unroll that loop into two separate bodies which is weird as well but possibly a better solution
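One way to get the "a was a cell all along" behaviour without literally backtracking is a pre-pass: walk the whole function first (loop bodies included), collect every variable whose address is taken, and only then run the optimizer. Below is a minimal Python sketch over a made-up tuple IR. The op names and helper names are hypothetical and this is not how the hblang compiler works, just an illustration of the idea:

```python
from collections import Counter

def walk(stmts):
    """Yield every statement, descending into loop bodies."""
    for stmt in stmts:
        yield stmt
        if stmt[0] == "loop":          # ("loop", body)
            yield from walk(stmt[1])

def address_taken(stmts):
    """Variables that appear as the target of an address-of anywhere."""
    # ("addr", dest, source) stands for: dest := &source
    return {stmt[2] for stmt in walk(stmts) if stmt[0] == "addr"}

def foldable_constants(stmts):
    """Variables that are safe to treat as compile-time constants."""
    taken = address_taken(stmts)
    # ("assign", var, value) and ("add", var, n) are the direct writes.
    writes = Counter(s[1] for s in walk(stmts) if s[0] in ("assign", "add"))
    return {
        s[1]: s[2]
        for s in walk(stmts)
        if s[0] == "assign" and writes[s[1]] == 1 and s[1] not in taken
    }

program = [
    ("assign", "a", 5),
    ("assign", "i", 0),
    ("assign", "n", 3),
    ("loop", [
        ("addr", "b", "a"),        # b := &a
        ("store_add", "b", 5),     # b.* += 5
        ("add", "i", 1),           # i += 1
    ]),
]

print(address_taken(program))       # a is a cell from the very first line
print(foldable_constants(program))  # only n survives as a constant
```

Because the scan happens before any optimization, `a` is already known to be a cell on the first loop iteration, so nothing has to flow back in from later iterations. The cost is that it is conservative: a variable stays a cell even on paths where no pointer to it exists yet.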
i'm interested in making a compiler + LSP to gain experience but pointers and comptime make everything hard
Vector just learned to drive over my keyboard cable 
My dad bought 4 dgx spark under his company
I can't find much about hblang so I will presume its a new programming language, does hblang not have a standard library atm?
oh, so did you architect the language?
assumed from your answer
ok, what about a documentation for hblang? does it not exist yet?
Technical debt 
my ISP finally gave me an ipv6 address 
it doesn't really matter since the ipv4 is on public anyways 
The syntax looks a lot like the zig language so that's understandable somewhat. supposing someone would want to contribute, for reference about the syntax and possibly semantics is this a conforming compiler?
a proper hblang compiler, following its rules instead of adding undocumented extensions to them
OK
Man
What is this job
100$ for 3b1b level video script on technical topics
Expert level
Lmfao
You need a PHD and many hours on leetcode
I used a goto 

vs
this took 75 SECONDS to get past
this one is just confusing cause technically that is not a bicycle but who knows how the fuck it works for captcha
captcha is absolute bullshit sometimes
Yeah I also don't understand them cause I'm totally not a bot 
I get captcha every day as I'm in a uni dorm so a lot of ppl use google and its just loooooong sometimes with a lot of retry
anubis hits me a fair bit
I guess it probably depends on how your uni does internet
this and it seems to have gotten worse with time
99% of the time I'm only getting hit with extra captchas if I open an incognito tab and even then that barely happens
I'm... always* in incognito 
the amount of free training data generated daily by captcha is insane

the occasional brainworm telling me to buy a gpu for cuda keeps calling to me
hello b200
Steam captcha takes me 15 minutes for some reason 😭
Yeah
my acc profile is even better
since it removes the spaces
It looks different for a lot of devices
I really wanted to make a plugin that lets you rename people client side
keep putting it off due to lazy though
The money clock is ticking 
7$ or something
so expensive
did yours just disappear or something
apparently its been replaced
by a shitty app
"Windows Scan"
like is there nothing at C:\WINDOWS\system32\WFS.exe
i am trying to sign up for scholarship
and bruh
they ask for everything
shit i doubt my parents would ever give me
full tax papers of my mum and my dad and his wife
all documents from when they separated
update
probably wanna use alpha to signal how close to 0 the values are
surely that should help
i am 
also, made it bigger
Watch as Tong He, a research scientist at Google DeepMind, introduces the new Gemini 2.5 Pro Deep Think, an enhanced reasoning mode that uses parallel thinking techniques.
See how Deep Think uses its reasoning capabilities to tackle a challenging coding problem from the competitive coding platform “Codeforces”.
Learn more at https://blog.g...
😭 that audio sucks
maybe make it uh.. not linear? like.. idk how to explain cause I can't remember the function I'm thinking of. but like.. from -0.5 to 0.5 its basically invisible or only slightly, and then it ramps up a tiny bit, and then on the last 10% to -1 or 1 it actually goes full alpha
should give a better indicator about strong connections
all money went into training gemini 2.5 pro deep think
fixed 
tested a colored image and another error to try to fix
(apparently I read too many bytes ?)
hmm, good point
maybe try just f(x) = x²?
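The non-linear alpha idea can be sketched as a power curve on the absolute weight. This is a minimal Python sketch, assuming weights are already in the range -1 to 1; `connection_alpha` and the `power` parameter are made-up names, not part of the visualizer being discussed:

```python
def connection_alpha(weight, power=2.0):
    """Map a connection weight in [-1, 1] to an alpha value in [0, 1].

    power=1 is the plain linear mapping, power=2 is f(x) = x**2, and
    larger powers keep weak connections nearly invisible for longer.
    """
    magnitude = min(abs(weight), 1.0)   # clamp anything outside [-1, 1]
    return magnitude ** power

for w in (0.1, 0.5, 0.9, 1.0):
    print(w, connection_alpha(w, power=2), connection_alpha(w, power=6))
```

With `power=2` a weight of 0.5 already drops to a quarter of full alpha. A higher power gets closer to the "invisible until the last stretch" behaviour, at the price of hiding medium-strength connections too.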
I was thinking of doing a sparse initialization too
because there's like 16k connections there
and that's a lot to optimize
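A sparse initialization could look roughly like this: enumerate every possible input-to-output connection and keep only a random fraction. This is a Python sketch under assumptions; the 128 x 128 layout (16384 connections, roughly the "16k" above), the 5% density, and the function name are all made up for illustration:

```python
import random

def sparse_connections(n_inputs, n_outputs, density, rng):
    """Pick a random subset of all input->output pairs with random weights."""
    all_pairs = [(i, o) for i in range(n_inputs) for o in range(n_outputs)]
    keep = max(1, round(len(all_pairs) * density))  # never start fully empty
    chosen = rng.sample(all_pairs, keep)
    return {pair: rng.uniform(-1.0, 1.0) for pair in chosen}

rng = random.Random(0)                      # seeded so runs are repeatable
genome = sparse_connections(128, 128, density=0.05, rng=rng)
print(len(genome), "of", 128 * 128, "possible connections")
```

Keeping at least one connection avoids the empty-network case, which matters if the network code cannot handle zero connections.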
is this your neuroevolution result?
(for evolution)
when I did neuroevolution my networks started out with just the input and output neurons. not even connections between them lol
not result, more like I'm trying to do character prediction using one-hot-encoding
my network would crash if I didn't give it any connections rn lol
ah well. yeah. that'd do it
back then I designed it in a way that I simply feed it input values and then take the output, regardless of whether anything is connected or not
initial networks mutated fast that way. speciation was also off the charts due to so many initial variations
my code tries to read 65kb of data in a 4kb file 

apparently the error is in reading a huffman table 
I assume a 65459-byte huffman table is not ideal for image compression 
This thing is just sitting here, computing
even bigger!
Avg Fitness: 0.0840659
Max Fitness: 5
Min Fitness: 0
Species Count: 5
Species:
[
Species no: :0
Size :182
Avg fitness :2.67033
Max fitness :11
Min fitness :0
]
[
Species no: :1
Size :7
Avg fitness :2.14286
Max fitness :8
Min fitness :0
]
[
Species no: :2
Size :1
Avg fitness :5
Max fitness :5
Min fitness :5
]
[
Species no: :3
Size :8
Avg fitness :4
Max fitness :10
Min fitness :0
]
[
Species no: :4
Size :1
Avg fitness :3
Max fitness :3
Min fitness :3
]
[
Species no: :5
Size :1
Avg fitness :0
Max fitness :0
Min fitness :0
]
'Evaluation': 19929328700 ns (19929.3 ms)
'Culling Game': 6029500 ns (6.0295 ms)
'New Generation': 842239300 ns (842.239 ms)
'Generation': 20779765100 ns (20779.8 ms)
20s per generation
for 200 individuals
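Timing lines in the format shown above can be produced with a small scoped timer. This is a Python sketch only; the language of the original program is not stated, and `timed` and `format_timing` are hypothetical names:

```python
import time
from contextlib import contextmanager

def format_timing(label, elapsed_ns):
    """Render one line in the same shape as the log: ns first, then ms."""
    return f"'{label}': {elapsed_ns} ns ({elapsed_ns / 1e6:g} ms)"

@contextmanager
def timed(label):
    """Print how long the wrapped block took, even if it raises."""
    start = time.perf_counter_ns()
    try:
        yield
    finally:
        print(format_timing(label, time.perf_counter_ns() - start))

with timed("Evaluation"):
    sum(range(100_000))   # stand-in for the real work
```

Recording integer nanoseconds and converting only when printing keeps one timer usable for both tiny functions and 20 second generations.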
ooo u got a left side window pc??
Very silly
i mean right
It has a window on both sides
wait so can u look through both or does one just show the back
The side that's visible in the image only shows a couple cables, the back metal panels, and my 8TB SSD
The other more hidden side gives vision directly to the main internals, having the GPU, CPU cooler, RAM and motherboard visible
fair enough yea im just used to seeing pc cases where on the right side it's just metal since you cant see anything anyways
found the issue
I was doing a | between a uint16 and a negative char
so the char got sign-extended to its signed 16 bit equivalent, then got OR'd in and became unsigned, making a big number 
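The effect is easy to reproduce by simulating the widening by hand. This is a Python sketch, not the original decoder; the byte values 0x00 and 0xB3 are an assumption, picked because they reproduce the 65459 figure mentioned earlier, and the function names are made up:

```python
def sign_extend_8_to_16(byte):
    """Treat an 8-bit value as a signed char, then widen it to 16 bits."""
    signed = byte - 0x100 if byte & 0x80 else byte   # 0xB3 -> -77
    return signed & 0xFFFF                           # -77  -> 0xFFB3

def combine_buggy(high, low):
    # The low byte is sign-extended first, so its upper bits are all 1s
    # and the OR wipes out whatever the high byte contributed.
    return ((high << 8) & 0xFFFF) | sign_extend_8_to_16(low)

def combine_fixed(high, low):
    # Masking to 8 bits (or reading the byte as unsigned) avoids that.
    return ((high << 8) & 0xFFFF) | (low & 0xFF)

print(combine_buggy(0x00, 0xB3))   # length read with the bug
print(combine_fixed(0x00, 0xB3))   # length read with the mask
```

Any low byte of 0x80 or above triggers it, which would explain why the decoder only broke on some files.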
now I can decode big (5120x4920), subsampled, colored, baseline dct jpeg 1.1 images 
my java code is still faster by 30% somehow. need to profile 
seems like wsl slowed down my code. with the windows build I'm ~12% faster than java
Silly
bro why are you measuring in nanoseconds
I originally was using it on small functions
even still
it still takes over 1000 ns to do 1+1 bruh
I assume ur using python
when ur using python there is 0 use for nanoseconds cuz we a lil slow
why wouldn't you
o
nothing in python runs on the nanosecond scale
Do you have the project saved in the windows filesystem huh
maybe ? good idea to test
if you do then don't, the filesystem bridge between windows and wsl is painfully slow and will hurt performance if you read anything
WSL FS SUCKS sooooo bad
if you stay within wsl its fine
you can just have your project inside wsl and it's a non-issue
it'll be faster than the windows fs too
am just lazy ngl 
windows is temporary anyway
and my java benchmark was running in windows too so apples to apples comparison
i guess that's true
am i being gaslit or has something changed at some point, i remember it being painfully slow even in linux fs
that's still weird that it'd be slower inside wsl 
i think wsl1 had some performance issues with the filesystem?
I do kinda heavy file operations and that's probably most of my time so no wonder if there is a slowdown
I use WSL 2
my only wsl2 experience has been super fast fs
but in exchange it was faster when touching the windows fs
wsl2 is the other way around
but also I literally don't use the windows mount at all
that would explain it if you access that yes
almost certainly just the windows mount
i remember now, i tried transferring a dataset from host to wsl
testing on wsl with no windows mount rn 
a bit faster than windows
(by 20ms over 530 ms, ~3%)
i just tested how long it takes to install dependencies for one of my js projects in windows vs wsl, 8.5s vs 0.75s 
how hard could that be on an nvme 
oh I should test something like that for my project with verrry long first dependency run
that specifically has an issue with windows making it slower 
yeah 
that's a minecraft plugin and it needs to dl, decompile, remap, patch, recompile 14 minecraft server versions
o if this is paper they actively suggest not doing dev work on the windows fs iirc
yep ik
and I do it on an hdd in windows so its pain
I think something like 15m with -parallel ?

its ok its only once 
oh, what ReFS 
their "new" filesystem that was meant to succeed NTFS on windows servers and is also used for their new "dev drive" thingy iirc
oh so that microsoft New Technology for File System 
ima go full linux anyway anytime soon so no need for it 
anubis checks everyone with Mozilla user agent (so literally every browser or everyone who pretends to be a browser)
@rare bridge Is the Bingo opensource? ||so I can use it for Alex's stream as well||
yes it is open source
Bingo for Neuro-Sama streams (https://twitch.tv/vedal987) - cloudburstwan/nwero-bingo
oooh thank you!
for some reason it wasn't listed in your GH profile
or im just very blind
Well this is the type of request it's trying to make.
not too familiar with all this and what its saying but very interesting
I honestly think it's impossible.
is the one get string throwing all that ?
Seems like it.
I'm sure I can get it to redirect, but I may never know what the intention of the mezzping is.
do you think its a license or validation check?
License check? Hmm...
cause if you're calling to pull data and it's sending Pragma: no-cache it has to also validate each time right
You might be right.
pretty sure doesn't store validations if im reading that right
the way i read that, you're requesting data with the GET. the logs are crazy high but its still grabbing the data. its just doing validation checks for each source
I'm starting to think that I need to use Wireshark to see the outgoing traffic, and a debugger on the client itself to see what it's expecting.
i love trying to use a dead repo for some llm that was implemented in 2022
i basically have to rewrite everything myself again
They very thoughtfully labeled what each DLL plugin does.
"at least, not on foot", as said
don't we love abandoned repos that have few hundred stars
especially if the author vanished in 2021
look at the bright side, it could be a decade old 
using tensorflow 1.0 or something
what?
omg i hate those repos
the author is NOWHERE to be seen
in this repo
theres instructions on how to run a script
but the script is not present in the repo
so uhh
???
generated script maybe?
just realised my university's WoL setup might not be working instead of my own config being the issue
what
yo sup i have question for the functional andys over here
im finally free from my javascript shackles
but now i need to learn a lisp language
what of all the 3-ish do you recommend
racket
lol i am running a 1080ti i cant even try
back in the day the 1080 was the best, it was so good that they made a titan version of it
😴
(it was like 6 years ago lmao)
i'm getting 80 t/s on a 3080, it should be fine on a 1080 ti
ive been playing with it and running into driver (cuda) issues. And def need to learn more about torch cause thats been giving me issues
im kinda weird and will lock into learning about stuff to better understand it
how dare you blame my favourite multibillionaire corpo
I'm really getting desperate now.
but you're only 40 minutes in
you want to check what data it's sending???
I wanna know what it expects to receive.
is this related to https://github.com/lightbulbatelier/starlight ?
follow the call stack and xref in ida
I hope you're not doing this all with just disassembled code
What other option do I have?
use in conjunction with something that gives you a better disassembly navigation
preferably something that can xref and spit out the relevant pseudocode
I.e., ida, ghidra, binja, etc
what is this even about
binjas binted
ok I'm gonna hit the 
I did it let's go.
my condolences
Database tracking legal cases where generative AI produced hallucinated citations submitted in court filings.
use temp 0 :p
they said coming weeks
it had been a week for korean time 
so i purchased a month of claude turbo subscription
because i hate money
the $100 one not $200
anyways i was in claude code farting around having a grand old time and at one point i decided to see what a pull request review would look like from the cli since it had that in the options
and claude spent literally like 30-45 minutes desperately trying to get that environment to let it pull it
and i was kinda watching amused and putting in random attempts to help it authenticate
well i checked the model after like an hour of this
that was opus
the whole time
opus 4
(which is not charged to me because of the sub)
so i just bankrupted anthropic sorry y'all
it wont even tell me how many millions of dollars it was >:(
nah
what actually happens is that
they like increase usage and decrease based on their current situation
so uhh you just used many usage lmao
probably wasted all your daily code
i've just been leaving it on auto
they might have set it on batch too
anthropic absolutely SUCKS on reliable usage for subscription users
i've actually only gotten rate limited once so far and i've been sitting here with code rolling all day each day honestly kinda trying to see what the usage limit is for this
a lot of it has just been chatting BS though
not actually coding with ... claude code
hmm
idk
but i don't like how they work in general
like instead of fixed 'n chats' per hour
they give 20 times more than free tier
like what 
when i had the $20 tier i definitely got rate limited so fast every conversation
which is why i dropped that sub
but then 4 came out and i was like eh fk it and got max
cuz ive been meaning to try claude code anyways
without bankrupting myself in api
making your money worth fr
i dunno where they're getting that the $100 is only 5x the $20 usage
i mean it's 5x the cost
but it feels like wayyyyy better rating
and ive been leaving it on auto and checking what model it's using and it's opus way more than i expected
usually if it starts absolutely blathering and pontificating i'm like oh look it's opus
gn sam
maybe the rate limits weren't actually THAT low it's just it was ALWAYS a "high traffic time" but now i dont feel that or something
i spent a solid 2-3 hours yesterday telling gordon the docker AI just how much i despise docker
while i was having issues with a bunch of containers
[insert nix propaganda]
honestly if nix makes docker unnecessary in its entirety then it's much more appealing than just about any other option
i hate docker
for development i guess kind of but it's not like that's the only use case for docker 
im sure it's great for someone somewhere
it's never been anything but pure suffering for me
like sometimes it works but most of the time i'm saying fk this and reaching for something like https://github.com/rzane/docker2exe
i feel cheated, this just runs the image via docker
i just hate the docker daemon, i hate dockerfiles, i hate docker compose beyond words, docker desktop sucks ass, docker cli is barely better because it's still docker related and you have to touch that garbage with clean hands, the way volumes/containers/images are separated is barely intuitive for me (that's on me i just have always hated it and been annoyed at which is which)
but really more than anything i hate that people just yeet stuff into docker containers and decide that means that their package needs absolutely 0 documentation and it'll just work no matter what
yeah i thought it was some sweet translation thing the first time i saw it too
i just like it because i dont have to touch the docker engine myself

you don't really have other better options than pytorch tho

do you want to use cupy then? 
it's time to make keras the king
reminds me actually, wonder how the tinygrad torch backend is doing hmm
they appear to be vibing
cause being able to use their amd stack via pytorch could be kinda nice 
anything that allows anyone to use something that isn't an nvidia product is based in my book

I DID BIG BRAIN THINGY (unless it doesn't work)
i may
be able to run llm at 3x speed
maybe idk
Silly LLMs don't use the context window as well as mature LLMs
So I'll use a mature LLM for the context window and logical crap then the silly LLM actually responds
50/50 it works and 0% chance it is a good idea
so it's speculative decoding but shitty
sounds like a great idea to me as a certified slop producer
yes
I've only got 114 days to get this thingy production ready, and so far I haven't got enough done :3

NVIDIA = not good
then use amd
ROCm is working good for me
nah tbh the best thing after nvidia is google tpus cause they actually have software support
it's not as good as nvidia software support but ye
Google tpu sounds expensive
it is cloud based
but for large scale ml training it can be worth it
for single person no
So ive gotta find a girlfriend to use them?
You have unlocked new role
i mean company
where the budgets are +1M
Wait why doesn't google sell them?
it's great.
if you work for deepmind i'm sure
cause they host them as massive optimized clusters
setting up a similar cluster would cost +100M dollars or something i assume from scratch
they also made them to upset shadow by saying 200% efficiency all the time or whatever
and then electricity, networking, etc
they are efficient in terms of used chip
when shit is compiled to XLA it's very hard to get poor util
400% fusion energy achieved
So instead of a fancy efficient TPU I gotta use a RX 7600 XT I can't afford

dw the tpu would be way more expensive if they sold it
neither do datacenter gpus
yeah
the h100 and later are basically just tensor cores 
But they look cool (plus I can't afford the RX 7600 XT as it is)
yea only cheap to google
they dont pay nvidia 10x margin
R&D costs though
googol has a googol dollars anyways
who makes the TPUs anyway, tsmc?
fair enough
Yeah but what are you gonna do with that much power? Run 3 chrome tabs?
why would you bring such an unrealistic thing into this highly grounded conversation
3??
next you'll tell me they can run crysis
at least its not too bad
you dont need it just use a tablet
i swear i would be so cooked without 128gb ram
having 64 has been actually crippling for the last few months
i really don't want to buy more before upgrading though
its been looking like this for me
it's every day bro
i hope they have some sticks for sale for prime day or sth for cheap so i can get to 128
i am fixated on 192 because of fking t
hi t

bruh
192 would be insane
but the rise in cost is also insane for me
i'm either getting 48gb sticks or 64 come upgrade
i can not afford that
it depends on the cost difference when i do
but it will be at minimum 48 per stick
also depends on how hard i get rekt by the gpu
i def need to upgrade my setup
nvidia STILL haven't sent me my damn RTX insider super deluxe VIP "invitation to buy a 5090" thing
it's been like 4 months bruh
and i simply will not pay $3500
not to burn my house down
i'll do that at msrp
ish
i've been pretty happy with my 5070 ti for the most part 
well, the constant driver crashes during profiling aside
views on amd and intel nowadays? im currently using a 1150 board with I7 k series
i curse my attitude towards purchasing things every day
ive not seriously considered an intel cpu since ryzen 5000
from what i understand AMD's really been killing it
ironically enough the 48gb intel card coming out soon got me like 👁️
despite everything
keep in mind it's 2 separate cards
yea but 2 24gbs
Mhm
both of those are 2x the 10gb i have now
more even
i just pray they're available
(they wont be)
iirc they said they should be sold on their own early 2026?
Honestly those would be the cards I would have picked if I didn't go P40. Those weren't the best decisions but it was the one I had at the time.
yeah if they're in any sort of reasonable universe for price
i really am having a hard time justifying why i'd skip them
but that's the caveat i suppose
That's crazy, luckily I have 32gb of ram (laptop tho) so I can open a few tabs, I won't forget the day I accidentally opened 400 edge tabs
supposedly "less than a grand"
yeah if they truly do end up less than $1000 i dunno if i'd even hesitate
guess it depends on what else is around by then
or coming
and if jensen has come by to make me pay the troll toll
i hate what we've become
Hey can someone tell me whats the etiquette on discussing/speculating on Neuro's inner workings?
Vedal leaked the length of the context window in the recent stream and I am just wondering if there are any threads collecting those kinda statements or if that is off limits.
we speculate on it like schizos all the time
i'd say unless you live in his walls and start giving out actual source details it's fine
cuz otherwise we dunno
Just wondering because he is dropping a lot of details
One day we gotta make a board
if you search the word context you will see many posts about how she has some level of control over what's in her context
from today
I mean he himself said his major achievement is merely getting the response super fast
I reckon nere runs async
and random chatter messages flooding her context is prolly not super healthy
but he told filian the exact number of seconds for the visual? context window
ive been schizoing about how i think she has a bunch of different context windows that get swapped in and out depending on what she's doing to maintain some coherence without getting overwhelmed for a long long time now
I think its just sorting stuff from various sources with normal algos and appending to the prompt
we've known/been told/hinted at the rough length of her short term "memory" aka immediate context for a long time as well
but yeah my thing forever has been if she's activity swapping between like chatting/playing a game/talking to stream guests whatever, she's getting her context yoinked out and replaced with maybe a hint of what's going on but not front and center, to make her still know what's going on in the other contexts if needed to seem continuous but the thing she must reply to RIGHT NOW is the relevant context for that situation so she doesn't have the drift you'd expect with how many things she's juggling
and of course ranking/reranking things even within the same activity
evil broke a lot during the MC stream
seems she also does that herself
directly put the reply/prompt to the mc bot into chat and probably vice versa
yeah this issue is probably why she doesn't often show "new in progress" developments until they're ready
probably out here leakin'
if that's not nailed down
Also had a telling moment during standup stream where neuro put what seems suspiciously like the control prompt into a story
throwback to the first days of gpt 3
she will also just make up dumbass stuff that sounds like it might be a system prompt for fun though
"always obey vedal" def seems something the big man put in and why she can't stop bringing him up every 2 seconds
even without context
rules 1 and 2
of course not you have to wrangle these systems or they will just go berserk
especially a le funny tuned one
But neuro always bringing up turtles is definitively what is puzzling me the most
is that just the vedal emotes in chat, her long term memory or actually always in the context because prompt
because it happens even when she streams with other ppl on any topic
that long term memory has got to be one of the most cursed collections of schizo material known to humankind
I mean, at this point we don't know how much is the memory and training data
all i know is maintaining anything resembling consistency with how many iterations and models she's been through is very impressive
is it though if the prompt is the same and you got the model + voice + quirks intentionally put in?
Like randomly jacking up the temperature to get her to tell stories/tangents
to me it's quite impressive yes, without too many obvious personality changes where the data must've changed more than usual
or the model itself
i mean you train 2 different models of the same size on the same data and they'll be totally different even with everything else the same
Nah its an inverse square kind of thing.
Where the most common phrases stay the same and its only when you drift into the ""valleys"" i.e. stuff that isn't well covered that deviations appear
But you'd need to train and compare several llms with the same prompt/temperature/architecture to really prove that.
Besides, Vedal heavily hinted at Neuro and Evil being the same and people insist they are RADICALLY different
the memory is probably the main thing keeping it consistent
going from qwen2.5 3b to gemma 3b with the same data same training is very very different result for me
im not training the entire model though so that complicates it
i dont think neuro is full fine tuning or if she is now she hasn't always been because of resources
ive always considered them to be roughly the same or a very similar base model with some sort of lora + obviously their separate memories + different prompt
Cant compare cause different model depths
right they're different models entirely but that's what i mean
neuro has been like 5+ different models, although vedal does know what he's doing i doubt they've all been the same architecture given when he started
I mean, its just, even if it seems trivial if you ask any ai "is the sky blue, answer in yes and no only" they'll all give you "yes"
Thats what I mean with it fraying only at the fringes
And we have to consider that whatever you run as hobby probably is closer to RNG than a full sized model
like any of the 32gb llama things I've seen were really bad
and neuro has come FAR
Yall are underestimating how good a 1-3b model can be
certainly but also i'm running on the assumption that neuro was at least in some form in that state as well for a good amount of time before it was a whole operation
although there is the :susge: "we" in all of vadhuls early neuro pitches
so who knows maybe it's just demis' pet project secret mode
that's why we can't see the architecture of gemma3
because it's all neuro
bruh
ive cracked it
kek
Lots of artists do the "we" to respect fans and contributors
even though it seems weird sometimes
yeah i mean there's also his own reasoning of "wanted to sound official"
and early neuro was a lot dumber than i usually remember her being
now she's gpt4.5 with a stupid lora on and that's why vedal can't afford greggs
I mean even aside from becoming more coherent sfx+discord bot upgrade definitively was a quantum leap
yeah it makes sense it was around the whole "agentic" trend anyways
which is still ongoing
but she's def out here agenting
tools out the wazoo, maintain long context and coherent goals over a whole stream with literal garbage being shoveled at her all the time
The pc/setup roast streams worked out before
but you could tell it was pretty formulaic
now she can just ignore the slideshow and stuff
and go into therapy mode XD
she seems to also be able to pivot pretty well without requiring a whole specific setup for various situations
sadly a higher level thinking unit won't happen because of lag
part of why that one pirate stream went bad because she decided it wasn't pirate time anymore and it apparently had been just left up to her to maintain it, and he was hoping she would, but she didn't
while the ones before were some sort of style adapter
hey gemini2.5pro can do this kinda speed she might be huge and fast we never know
one day
Well yes but gemini still only runs one stage
and also still 6s to first token
you can either do the higher level thinking with conventional scripts that run in <1s or you bite the bullet and go full auto gpt
where you have an iteration just doing stream of consciousness that then gets processed
neuro was actually the first matformer
?
like russian nesting dolls, they get trained with nested blocks of various sizes at each layer so they have a maximum possible size but they don't actually always use all their weights depending on what they're processing at inference time
like an autoMoE without a router
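A rough sketch of the nesting idea described above, assuming the simplest possible reading: one feed-forward block whose smaller sub-models are just prefixes of the full hidden layer. `NestedFFN` and `width` are made-up names for illustration, this is not the actual MatFormer code, and it only shows picking a size at inference time, not the joint training of all sizes.

```python
import random


class NestedFFN:
    """Toy feed-forward block whose sub-models are prefixes of the hidden units."""

    def __init__(self, d_model, d_hidden, seed=0):
        rng = random.Random(seed)
        # w_in[h] maps the input to hidden unit h; w_out[h] maps unit h back out.
        self.w_in = [[rng.uniform(-1, 1) for _ in range(d_model)] for _ in range(d_hidden)]
        self.w_out = [[rng.uniform(-1, 1) for _ in range(d_model)] for _ in range(d_hidden)]
        self.d_model = d_model
        self.d_hidden = d_hidden

    def forward(self, x, width):
        """Run the block using only the first `width` hidden units."""
        if not 0 < width <= self.d_hidden:
            raise ValueError("width must be in 1..d_hidden")
        out = [0.0] * self.d_model
        for h in range(width):
            pre = sum(w * xi for w, xi in zip(self.w_in[h], x))
            act = max(0.0, pre)  # ReLU
            for j in range(self.d_model):
                out[j] += act * self.w_out[h][j]
        return out


ffn = NestedFFN(d_model=4, d_hidden=8)
x = [0.5, -1.0, 0.25, 2.0]
small = ffn.forward(x, width=2)  # the smallest "doll": 2 of 8 hidden units
full = ffn.forward(x, width=8)   # the full-size block, same weights
```

The point is that `small` and `full` share weights: the small model is literally the first slice of the big one, so no router is needed to choose between them.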
ah
Did not know that
See, I need that thread/board XD
So far only really messed around with image gen
i heard about it vaguely but folks were talking about it here like yesterday
i think shadow linked me the arxiv lol
that's how i started being interested in it too lol it's been a long journey at this point
the last 2 years feel like 10 years
TBF that is all of vtubing at this point
youve started caring about non-image gen just in time for diffusion llms to maybe actually be something
Ehhh so far it feels like the industry stalled a lot
I don't think they can really resolve the issue of speciality topics
https://x.com/InceptionAILabs/status/1894847919624462794 i think we just hit the "throw more gpus and easily gotten data at the issue" wall
that's why gpt4.5 crashed out
diffusion being applied to text gen freaks me out somehow lol
it's a lot harder to "identify" with the model when it's not autoregressin
even tho either way it's not thinking
it just logically feels weirder
sounds like nuke it till it works methods
I think you have the issue that the industry really doesn't know which use case they can get good enough at to be a real thing
but yeah, speciality topics/context and higher order thinking/understanding physical processes
what I call the caveman method, throw and hit it until it works
I am doubtful you can resolve for everything in one step
large models >1b are really just this
there was a time where we thought huge models wouldn't make sense or even work
the conventional thought was that overparameterized models were a bad thing
not enough time has passed to actually see thats why
I think people don't understand the brain works exactly like a LLM
randomly connected nerves
with random weights
its just we can't run however many billion neurons on a gpu
everything takes time to develop, with out it you just dont know
back in the day if you told someone you'd have a 1T+ language model they'd think you were a total idiot headed for the world's most overfit model
didnt help there wasnt compute to actually test it anyways
lOcAl MiNiMa
i grew up with 4 in floppy so ya sht crazy now
you can have the smartest model, if it takes 1T+ params you're still an idiot
ehhhhhhhhh
it's something like under 10% of the params in any of the large models are actually important
theyre actually just dummy thicc because that's how much of a net you have to cast
should focus more on that
to hit the training objective
well it's a law of large numbers thing
otherwise like you said earlier you get stuck
you have to give it soooo many params so it actually goes past that minima
you can very successfully cut them down after
but not 100% successfully
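One common way to "cut them down after" is magnitude pruning: zero out the weights with the smallest absolute value and keep the rest. The chat doesn't name a specific method, so treat this as an assumed stand-in; `magnitude_prune` and `keep_fraction` are hypothetical names, and real pruning works per layer on tensors and usually needs fine-tuning afterwards.

```python
def magnitude_prune(weights, keep_fraction):
    """Keep roughly the largest-magnitude `keep_fraction` of weights, zero the rest."""
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    k = max(1, round(len(weights) * keep_fraction))
    # The k-th largest magnitude becomes the cutoff (ties can keep a few extra).
    threshold = sorted((abs(w) for w in weights), reverse=True)[k - 1]
    return [w if abs(w) >= threshold else 0.0 for w in weights]


weights = [0.01, -2.0, 0.3, 0.02, 1.5, -0.05, 0.0, 0.4, -0.03, 0.07]
pruned = magnitude_prune(weights, keep_fraction=0.2)
kept = [w for w in pruned if w != 0.0]
```

Whether the pruned model still behaves depends on how much the dropped weights mattered, which is the "not 100%" part.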
name one 100% human...
the bigger issue is the half full glass of wine thing
QAT and stuff are big
lol okay my bad..
it's ok you didn't know
zucc doesnt want to give up his wAIfu
just saying normal human uses how much of their brain
I mean if you think about it, a parrot can match a 4 yr old child with 1-5% of our brain size
and thats not even taking a human brain and ripping all the unnecessary stuff out
there are some pretty cool neuro science discoveries out there
the issue is mainly that any improvement in cognition is logarithmic
but targets have an abrupt cutoff
we really dont know what the cutoff is or where emergent capabilities appear
until it happens
well, we know a GPU can't do it
i also could fight the other side of this because errors are what define and help us grow and learn
currently seems like the consensus is it's not worth trying above like 1-2T
i'm just here for the hardware innovations, if LLMs is what it takes then so be it 
but that's also compute limited again
the training process is trying to generate infinite errors
who knows what happens if you cram 40T in there assuming you can source the data
they could start flying for all we know
well the training process shouldnt make the same mistake twice
we just like social animals that can vocalize and are selfish enough to recognize themselves in the mirror
in 15 years if we're still alive we'll all be like damn those idiots didn't know how to train a model
The basic structure of the training process is mathematically perfect
its always goals
the data is the problem and also the objective
ye
i wish ol' euler could see us now
i already feel that way
and also see his name slapped on samplers still in use
their vibe will be pretty strong
it's pretty wild to think about the capability gap between what we have in 2025 vs even like 5years ago
do we though?
transformers were out by then but it was still before gpt3 said fk what you know about scaling
Like, yeah gpt 4.5 is pretty good
i had to fight to get programming classes, they now teach them as a standard
idk it feels like its all skewed by the first "product" launch
which if you go before iPhone you are 30 years back
but after iPhone we got marginal improvements
it's just whenever someone gets the recipe in a good enough state the general public notices and adopts stuff
I still remember using talk to transformer with gpt2 to write my exam for my English class (I hated that class so I didn't care)
That was 4 years ago now
yeah i remember laughing at subreddit simulator
Or the florida man newspaper generator
Eyyyyyyyyyy someone already finished a Filian vs Neuro captcha compilation
lotta those were like markov chains
and i still thought that the subsimulator lil markov toys were freaky good
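For anyone who never played with one, a word-level Markov chain toy like the ones mentioned above fits in a few lines. This is a minimal sketch with a made-up corpus; `build_chain` and `generate` are illustrative names, not what any of those bots actually ran.

```python
import random
from collections import defaultdict


def build_chain(text):
    """Map each word to the list of words that followed it in the corpus."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain


def generate(chain, start, length, seed=0):
    """Walk the chain from `start`, sampling each next word from its followers."""
    rng = random.Random(seed)
    word = start
    out = [word]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:
            break  # dead end: this word never had a successor
        word = rng.choice(followers)
        out.append(word)
    return " ".join(out)


corpus = "florida man bites dog florida man wins lottery florida man bites alligator"
chain = build_chain(corpus)
headline = generate(chain, "florida", 6)
```

Each word only looks at the one before it, so there is no long-range memory at all, which is why the output drifts so quickly.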
hell thispersondoesnotexist was around in 2019
and that was based on older GANs
thank god GANs are a thing of the past
o7 BigGAN pour one out
there's been a fair bit of murmur of mamba + GANs being very good at very specific domains
recently
and gans be going FAST
theyre still arguably better at faces than SOTA diffusion models
convnets or at least layers are being used in a bunch of ways to try and kill attention again
neuro is actually ELIZA from the 60s but they forgot her in a basement the last 70 years
what's old is new again
Just waiting for when gen models bite the bullet and actually innately populate scenes with skeletons and draw the image over those instead of just randomly diffuse bits somewhere
you can do it with extensions but its not the same if it isn't done natively
things will have gotten out of hand when they start stripping people of their skin and using that as the real skeleton starting point
how good is good enough
i certainly don't know
I mean img 2 img already works the same way
you just need to figure out how to properly fill in
starting from a predetermined base and refining is always easier i mean even knowledge distillation is really just that
take the more powerful larger model and noisily train the smaller learner model beyond where you'd normally realistically be able to
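A minimal sketch of the usual soft-target version of this, assuming the student is trained to match the teacher's temperature-softened output distribution. The function names and the logits are made up for illustration, and the "noisy" part mentioned above isn't modeled here.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(pt * math.log(pt / ps) for pt, ps in zip(p_teacher, p_student) if pt > 0)
    return kl * temperature ** 2


teacher = [4.0, 1.0, 0.5]
good_student = [3.9, 1.1, 0.4]
bad_student = [0.5, 4.0, 1.0]
loss_good = distillation_loss(good_student, teacher)
loss_bad = distillation_loss(bad_student, teacher)
```

The student gets a full distribution to imitate instead of a single hard label, which is the "predetermined base" it refines from.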
I think its more like moving from pixels to gaussian blurs
because right now you want the models to very sharply paint in where the attention says the keyword goes
but ideally it has a diffuse space that restricts it
and can rotate and everything
there have been so many reworks of exactly what the attention mechanism is targeting or even doing since attention is all you need
we already have rotary embeddings & RoPE /yarn type things that cause the attention to become positionally aware way more
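A stripped-down sketch of what a rotary embedding does, assuming the standard formulation: each pair of dimensions is rotated by an angle proportional to the token position. `rope` and `dot` are illustrative helpers, not any library's API, and the YaRN-style scaling tricks are left out.

```python
import math


def rope(vec, position, base=10000.0):
    """Rotate consecutive pairs of dimensions by position-dependent angles."""
    d = len(vec)
    if d % 2:
        raise ValueError("vector length must be even")
    out = []
    for i in range(0, d, 2):
        theta = position * base ** (-i / d)  # lower frequency for later pairs
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


q = [0.3, -1.2, 0.8, 0.5]
k = [1.0, 0.4, -0.7, 0.2]
near = dot(rope(q, 5), rope(k, 3))       # positions 5 and 3, offset 2
shifted = dot(rope(q, 12), rope(k, 10))  # positions 12 and 10, same offset
```

The two scores come out equal up to float error: the attention score only sees the relative offset between the tokens, which is where the positional awareness comes from.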
the issue is that you can't get a flat definition across the entire range of tokens
or disentangle stuff
the fuzziness is never going away
that's where generalization pops out of really
it can be exact and deterministic or generalizable but noisy
i dont see that reconciling any time soon
yeah but hence why I say you should populate using skeletons first. Kinda the first step towards building in physics, general enough for most objects and really good at what people care about
otherwise you end up with being in a half filled glass as an intrinsic property of wine
nature starts way before the skelly and it's still noisy as all hell
the reason for evolution by all accounts seems to be the inherent noise present in reality motivating random change
and lots and lots of cancer
I mean its not a reason as much as an emergent process
clones are still decently successful
if you can somehow get down to like planck length and perfectly calculate, anticipate and control the laws of reality there then you're finally where you need to maybe control most randomness
and then the fuzzy logic in my rice cooker will no longer function
I mean we can't perfectly describe chaotic systems yet

but the butterfly effect was pretty much disproven
but to get back at what I meant initially
how's your programmatic neurons coming along @amber fractal
Over
ing so I've been laying off a bit, I think I got enough ideas for the next iteration before solidifying it.
yeah i stopped seeing pushes on the gh so i was like D:
it'
best to let things rest to come back with a new perspective though
That is my default method of iterating, which is why I bounce between a lot of things.
you got it to a respectable pause point though
i kinda hate getting there because then i'm in the "well it's good enough atm that it's not urgent so i'll wait"
the ""skeleton""/pose for a "horizon" could pretty much be a line.
While geometric shapes are the most defined. You can then use an llm to link higher order concepts with lower order ones.
80% of what people care about will be humans or animals and skeletons solve those entirely.
Solving how strict that original pose distributes attention around itself is a bit more tricky for less defined objects, or how much you can deform the skeleton with the original prompt.
that's why i've adopted the "get it to a broken disaster 20% of the way and then abandon but feel massive pressure to fix it anyways"
I can probably push the current as the git version is behind, but probably in a WIP branch as it is a start of another overhaul.
Good morning #programming
well then you've got the question of what is a baseline skull anyways
cuz it's not like you can perfect 1 and then generalize perfectly
the opposite really
Programmatic? Do you mean each neuron is running custom code?
now you're just dummy overfit on that single thing
Hello, everyone
iggly can explain it way better than me and i already misspoke a bit calling them programmatic in the first place kinda
but yeah our boy is basically cooking agi

or am i
im trying to build the perfect idiot myself
I'm working on a SNN model
It's hard as the definition of what it is doing because it is pretty abstract. Not using floats is really beating me over the head sometimes.*
-# *All of the time
Something interesting about SNN is that there's a temporal aspect to the network
Correct!
yeah lmao so i kinda just default to the "well they do their thing"
Start with neural nets being universal function approximators, how does yours approximate?
temporal info in modeling breaks my brain more than even positional embeddings
Trust me, I also defaulted to that. I'm finally starting to tunnel into a proper implementation based on resonance.
Am currently just going to brute force with evolution
I mean currently we are starting from random noise and guessing where any number of skulls could be.
Think wireframe rather than real skeleton (and even that is too much info)
Trying to see if this path is worth pursuing in the first place
Have to figure out I/O
i mean even just audio transformers being able to model the time dependencies of music as reliably as they can makes me go ???
transformers don't understand time like at all. they aren't sequential
and it's quadratically harder the longer you're talking for them too
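The quadratic part is just counting: full self-attention scores every token against every other token. A tiny sketch of that arithmetic, where `attention_score_count` is a made-up helper and real cost also depends on heads, layers and hidden size:

```python
def attention_score_count(seq_len, causal=False):
    """Number of query-key scores one attention head computes for a sequence."""
    if causal:
        # Each token attends to itself and everything before it.
        return seq_len * (seq_len + 1) // 2
    return seq_len * seq_len


short = attention_score_count(1_000)
long = attention_score_count(2_000)
ratio = long / short  # doubling the length quadruples the work
```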
Hmm... I think I also want to try HyperNEAT, because I don't see how else to handle making repeated structures
The reason I chose SNNs was because of sparsity
Sparse through time, sparse through space
wheelz would love you
Sparsity through time means energy efficiency and better compute
The approximation would be a combination of pattern matches, either positively reinforced or negatively. That's all I got really.
that feller LOVES SNNs
Sparsity through space would be memory savings
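To make the sparsity point concrete, here is a minimal leaky integrate-and-fire neuron that uses only integers. It is a generic textbook-style sketch, not the model being discussed; `IntLIFNeuron`, `threshold` and `leak_shift` are assumed names, and inputs are assumed non-negative.

```python
class IntLIFNeuron:
    """Leaky integrate-and-fire neuron with integer state (no floats)."""

    def __init__(self, threshold=16, leak_shift=2):
        self.threshold = threshold
        self.leak_shift = leak_shift  # leak removes potential >> leak_shift each step
        self.potential = 0

    def step(self, input_current):
        """Advance one time step; return 1 if the neuron spikes, else 0."""
        self.potential -= self.potential >> self.leak_shift  # integer decay
        self.potential += input_current
        if self.potential >= self.threshold:
            self.potential = 0  # reset after firing
            return 1
        return 0


neuron = IntLIFNeuron()
inputs = [8, 8, 8, 0, 0, 0, 0, 8, 8, 8]
spikes = [neuron.step(i) for i in inputs]
# Sparse through time: only the time steps with a spike need to be stored or sent on.
spike_times = [t for t, s in enumerate(spikes) if s]
```

Most steps produce no spike, so downstream neurons have nothing to compute for them, which is where the energy and compute savings would come from.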
i think the ideal models are all sparse in the end
Basically so I can run a good network on my 4gb of vram
we're not even dense models
True
just a bunch of sparse representations in a trench coat
That seems too broad, no? Maybe striking down a mathematical definition would help
My network also allows arbitrary connections
Like self connections, and backward connections
neurons hit up that lost connections classified
Apparently something similar exists, positoron? I think was the name of it.
wdym
I never got to reading that paper, another chatter mentioned it was similar.
there's a very well known case of a positronic AI
Like, I believe that the reason only forward connections are used is to make training viable
However, I am using evolution
And evolution doesn't care about backwards connections
I don't think I can do backprop on SNNs anyways
nah
you can't do it on linear
sequential systems
that's why we can't do it lol
you'd have to stop your own brain processing shit to update your weights
how do you do that
Maybe BPTT using approximate gradients?
I'd rather pursue the line of local learning rules
Sleep time?
Afaik we do actually rewire our brain when we sleep
dead time maybe
ive always considered it more like a car wash
but yeah get bumped around ya know
i generalize shit to the extreme i really gotta stop doing it
My definition is entirely binary, so it is acting more like a 8d binary vector at the moment.
Local learning rules should also allow for arbitrary topology
Which is a bonus
But also I think local learning is also better for neuromorphic hardware
well, i saw a bunch of crap about it first half of this year
which means it's probably 43 years away really
but everyone was sure it was ready finally
it seems to be somewhere the labs are pointing resources at because we're probably climbing up the start of a wall in the scaling again
The last iteration of my project I accidentally found a small quirk, Some updating code only exists if it wasn't what it was trained on, so I made that into a resonance method.

What
My things come in two halves, match and emit
I was using an update function and masking which parts needed updating.
me too
The match mask is only all true if the input was different than the last time it was trained.
me when i put devstral on .01 temp
The best part of this thing is, it doesn't rely on the output; so this version moved it to the forward pass.
i liked cerpe it somehow gave it a fishier sound
what does that mean
why i dont know
ive unironically been up >48 hrs rn so i may actually get in bed
shortly
i am more noise than man now
Go sleep
Sleep!
It still means Seize the day, but in latin
i will but only because i'm getting slightly nervous i might actually just die randomly if i attempt to stay up again
Unless u were talking about cerpe fish
oh no i was talking about my fish thing
lol
XD
sometimes i just say things and even i'm like
????
ok god dammit i'll get in bed
i swear if i somehow wake up at like 11 pm
i'm gonna be mad
my hands are all like
sore
that's odd right
i think i'm dying
feels like i have arthritis
might have something to do with typing for like
48 hours

Warm water soak for hands
exam includes connecting sql to java
cant fucking remember Class.forName("com.mysql.cj.jdbc.Driver")
help me
@amber fractal Can I get a quick recap of what u did?
I'm thinking of remaking my NEAT system with a hierarchical approach
The idea is to find reusable subnets that when composed can solve lots of problems (preferably efficiently)
So layer 0 is just raw neural network
Layer 1 is made of layer 0 networks
And so on
So I try and keep a database of useful subnets
Anyways gtg
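A toy version of the layering idea above, assuming hand-built subnets instead of evolved ones: layer 0 holds a raw threshold unit, layer 1 composes it into gates, layer 2 composes the gates. Everything here (`library`, `threshold_unit`, the chosen gates) is a hypothetical illustration, not the actual NEAT system.

```python
def threshold_unit(weights, bias):
    """Layer 0: a single neuron with a hard threshold activation."""
    def unit(*inputs):
        total = sum(w * x for w, x in zip(weights, inputs)) + bias
        return 1 if total > 0 else 0
    return unit


# The "database" of reusable subnets, keyed by layer and name.
library = {0: {}, 1: {}, 2: {}}
library[0]["nand"] = threshold_unit([-1, -1], 1.5)


def make_xor(lib):
    """Layer 1: XOR built only from layer-0 NAND units."""
    nand = lib[0]["nand"]
    def xor(a, b):
        n = nand(a, b)
        return nand(nand(a, n), nand(b, n))
    return xor


def make_and(lib):
    """Layer 1: AND built only from layer-0 NAND units."""
    nand = lib[0]["nand"]
    def and_gate(a, b):
        n = nand(a, b)
        return nand(n, n)
    return and_gate


library[1]["xor"] = make_xor(library)
library[1]["and"] = make_and(library)


def make_half_adder(lib):
    """Layer 2: composed from layer-1 subnets; returns (sum_bit, carry_bit)."""
    xor, and_gate = lib[1]["xor"], lib[1]["and"]
    return lambda a, b: (xor(a, b), and_gate(a, b))


library[2]["half_adder"] = make_half_adder(library)
results = {(a, b): library[2]["half_adder"](a, b) for a in (0, 1) for b in (0, 1)}
```

In an evolved setup the search would have to discover which subnets are worth storing; here they are hard-coded just to show the composition.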
subnets won't be able to solve problems
they just build an internal representation
unless ur planning on building a model ensemble which would be extremely inefficient
bro uses java in 2k25
Someone put an LLM on a raspberry pi. As it generates tokens, it will run out of memory and crash and reset. He also made it aware of this limitation through system prompt 
https://www.youtube.com/watch?v=7fNYj0EXxMs
why can't it just enable page file
is it stupid
igglya is typing...
I am going to get essaying so hard
godgd mrgonining
Yes I have been actually typing this entire time

guten morgen o algo
in another languageS










