#programming
Would be hard with material in the way. But I think that's just another BSDF
near useless tech tbh
Well, for simulation it is the best you can approach it
maybe not audio necessarily then but rather emf simulations or something?
Hmm, I feel like EMF simulation will require a different kind of ASIC. But who knows. Spectral rendering is quite new
because i've never heard of practical path traced audio applications
Same 
more immersive audio in video games 
that's not what i meant by practical and that requires realtime on consumer hardware
just one more asic bro
also steam audio exists and is foss and best in class
and cpu only
I'd rather company that trains AI have ASIC to do it than using GPU 
looks very cool but is definitely not for consumers, at least for this first iteration of it (and likely not for at least the next decade)
it wouldn't need 400 GbE QSFP if it wasn't meant for larger deployments in datacenters
tbf a h100 is already basically an asic
oh thats cool i havent heard of that actually
Petition to change H100 classification from a GPU to whatever dedicated ML processor is 
GTPU
Can it perform the usual graphic operation tho? Like rendering pixel on a screen?
N
doesn't gpu originally mean an asic but for graphics
It stands for Graphic Processing Unit yes
now we have gpus without the gpu
the whole part of the gpu that made it a necessity to add it to a separate device are the parallel cores, and we kept those
they just removed the display stuff
the H100 doesn't support any of the graphics APIs and doesn't have display outputs, so no graphics unless you use CUDA
it does have a few ROPs for some reason but they shouldn't be used for anything
There's also the term NPU (Neural Processing Unit)
there's a lot of specific texture and shader related hardware 
yeah i meant more the term “graphics processing unit” in particular
wouldn't surprise me if google patented this
One of you guys from #programming would use that poor few ROPs to run AAA games I swear
you can process graphics without showing them
its just kinda weird to do it like that unless you're rendering
not the H100
it literally does not support it
no?
no
thats a shame
there's no support for any graphics apis
aren't rops just the part that does rasterisation? if so you could emulate it in software no problem.
i feel like removing graphics api support doesnt even save that much money, its just to make people not use it for gaming???
it saves space you can use to put more tensor cores
yeah, you'd basically be writing a software renderer in CUDA 
I don't think you can use the actual ROPs without a graphics API but I may be wrong on that
The silicon literally doesn't have the needed stuff to do it
well yes, but you could always add it

there are some nvidia cards where they disabled the graphics api in the driver but the hardware is there but iirc the h100 for example literally does not have the hardware for it
graphics apis have minimum specs for capabilities, so if we can save space by not adding all the fixed function stuff and rendering output, then it is either more space for cores, or you can have a smaller die and a higher yield per silicon cookie
its not that expensive if you're doing the lithography of the rest of the chip anyways
Silicon space is premium. Like VERY premium
but ye you dont need it
if you could add dx12 to your gpu, but then you can only fit 10 onto a wafer instead of 15, then is it really worth it? esp. when 99% of your customers will not use the api?
h100's are designed for clusters anyways so you dont need 100 display outs you wont be using
you basically lose 5 $10k sales for an unused feature
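The back-of-envelope math above, as a minimal sketch. The 15 vs 10 dies per wafer and the $10k price are just the made-up numbers from the chat, not real H100 figures, and the function name is hypothetical.

```python
# Hypothetical numbers taken from the chat, not real H100 data.
DIES_PER_WAFER_WITHOUT_GRAPHICS = 15
DIES_PER_WAFER_WITH_GRAPHICS = 10
PRICE_PER_CHIP_USD = 10_000


def lost_revenue_per_wafer(dies_without: int, dies_with: int, price: int) -> int:
    """Revenue given up per wafer by spending die area on an unused feature."""
    return (dies_without - dies_with) * price


print(lost_revenue_per_wafer(
    DIES_PER_WAFER_WITHOUT_GRAPHICS,
    DIES_PER_WAFER_WITH_GRAPHICS,
    PRICE_PER_CHIP_USD,
))  # 50000
```

This ignores yield, binning, and everything else that goes into real pricing; it only restates the "5 lost $10k sales" argument.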
all matches for today have ended
pretty much all expected results
opus 4 not winning a single game is surprising, did they forget to turn on reasoning or what
no, they did
claude models just suck at chess ibr
better at coding in some way
what bothers me the most
but I honestly dont see even a point in using claude in 2025 with their prices
why first to 4
its bo4
bo7
or was supposed to be
bo4 makes no sense 
first to 4 is bo7, why are they saying bo4
I think they dont understand how best-of and first-to work :LULE:
but the games went first to 4, not 2
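For anyone mixing the two up, the relation between "first to N" and "best of M" can be sketched like this (function names are made up for illustration):

```python
def best_of_for_first_to(wins_needed: int) -> int:
    """Maximum number of games when the first side to reach `wins_needed` wins."""
    return 2 * wins_needed - 1


def first_to_for_best_of(max_games: int) -> int:
    """Wins needed to clinch a best-of-`max_games` series (a strict majority)."""
    return max_games // 2 + 1


print(best_of_for_first_to(4))  # 7
print(first_to_for_best_of(7))  # 4
```

So first to 4 really is bo7, which is why an even "bo4" reads oddly: a 2-2 split leaves no winner.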
wtf fr
bo3 but sometimes bo5
tpu 
TRC program is also giving off tpuv6
V6!!
just one generation away from their frontier
and probably what served gemini for a quite long time

poggers
aight i have split the 1 megafile into 3 smaller ones
ill need to reformat some stuff still tho
rn i define the needed persistent variables in the main.cpp class and i pass them as a pointer to the other file, but if a variable is mainly used in that other file wouldnt it make more sense to define it there?
of course that would mean it'd be global but still
What are you trying to achieve exactly?
I mean, what variable do you need and where?
for example, i create VkDebugUtilsMessengerEXT debugMessenger in the main.cpp class, but i only use it in the debug manager file
why tf is the version for the extra/haskell-data-default at version 0.7.1.1-356
what are they doing that they need so many revisions for the pkgbuild 
looks like most of the haskell packages are at a couple dozen or 100+ hmm
thats the day of the year the build was released 
on the 356th day of the year they released 0.7.1.1
but ye i have no clue
mm ghc or transitive dependency updates probably
i guess that would make sense hmm
you could factor them out into individual modules that hold the relevant variables and then use that
since a version bump is basically a manual nonenforced indicator "the build output changed please rebuild/redownload"
mornibg peeps
like a struct or something?
that'd work yeah
Was it already said that the new gpt models dropped
They suck
Trained to benchmaxx


no r2 means no cool stuff yet 
Chat this model is so bad I can't
I have it running on a h200 with the lilac training mixture
google shut the fuck up. ive said no to this 6 months ago
And loss wise
Gemma 3 4b outperforms by a mile
In mean token accuracy
It's so bad at vtuber which means
It was trained only on stem
Aka to answer questions no one asked
i mean.. they did say they primarily trained it on coding
tho i have no idea how you are supposed to run the 20b version.. its so slow on my 4080 and qwen3 30b outperforms it for me
it also runs like 60% on the cpu for me.
vs 20b?
i haven't touched any of them yet
I guess I can determine those things to be pointless to even experiment with
tho the 120b seems to at least be decent
from the few people that i heard talk about it
stick with qwen3 for now. 
Qwen 3 megabased
i wish there was a qwen3-coder:8b version
Maybe i can pull off a good finetune for gpt oss
openai is a small indie company please understand
that has probably something to do with their "safe" training
true
They overfit the model on their rules actually
There were some posts showing
"based on my internal rules"
Type shit
Where it would list them verbatim within the reasoning traces
lmfao
Small indie company indeed
honestly i don't get it... gpt hasn't been top in a long time and now they are just shooting themself in the foot
Cause they need to release gpt5
this probably
coding: claude or gemini 2.5 pro.
everything else... one of the chinese models 
"Look, our latest model performs better than the previous by a wide margin!"
that hype tbh sucked
yeah created like 30 repos, very accidental
Lmao
btw are these 2 the horizon models? probably not right
like the anon models that are available on openrouter
no it doesn't seem like it. the horizon models are still up
idk.. i feel like maybe its a new deepseek?
from what i saw its pretty good at coding. so it can't be a gpt model
I would forever not forgive altman for not releasing gpt 5 today
?? lol
people have now started abusing the app name header feature on openrouter to spread anthropic propaganda
Wanna see how funny this is
insane
i mean it's true tho
LMFAO BASED SCHIZO RANT
I was kind of excited for the open weight models, but this is even better than I thought! 
wait they link to a domain
Chinese Model vs gpt oss
that redirects to some random person's github
lol
2.4 loss vs <2 loss
Oh, nvm then.
Even for finetuning it sucks
cause moe
I hope it fixes itself magically
right
Moe moe kyun
Both 3b active params
i mean regarding finetuning
Surprised no one named a model this yet
No the chinese moe trains just fine
Fyi
The stuff in the hf repo
shad
Is for some stuff 1:1 copied from t5
i think the activation is too small
Lmfao
No qwen 30b is also 3b
Active
It's just they copied from fucking t5 for some reason
Kind of off the rails, but personally I don’t really believe AI VTubers are really separate from their base model unless they’re fine-tuned. A prompted mistral 7b is just a mistral 7b to me.
Yea that's why i have a training corpus for lilac
Based.
lmao
I'll let it be for the night
And see
How much money I'll waste
The trash architecture was expected tho
They aint google
Google actually drops huge arch improvements
E.g. gemma 3n is based
I’m guessing it’s too strict, yeah?
suckers
No it was just trained on math and code so it's having rough time absorbing the vtuber data
Ahh got it.
That’s a shame.
Yea
Tbh
I'll just release this finetune oss
Lmao
Yall can do whatever with this then
Guessing Lilac won’t be using that then 
lilac

fumo is life ahh
thats crazy
i can make most parts of a plush myself, except for the stitching on the eyes
For that u need an embroidery machine
i have an embroidery machine at home, its just from 1920
its permanently attached to a table and you need to manually spin it with your feet 
its not feasible to make anything useful with it
It is feasible. Just need a lot of practice like most things 
its one of these. probably not the exact same model but the same brand
this is ancient wow
Ahhh, singer. My grandma has one of those.
And still very dexterous too when using the machine
ive only made an actual plush once, and it was a pain in the ass
To sew stuff
manual stitching and i stole the stuffing from an old pillow
it was pretty shabby but it was free
#sewing
I forgot how easy it is to get sewing and other textile products here since it is one of our country's largest exports
i'm so glad the #screeps arc is gone at least temporarily, no interesting topics were brought up
i'd bonk them back into their holes they'll crawl out of
shiro slander 
so anyways back to sewing
soo.... any good wall building algorithms you guys developed for your screeps? i currently just letting mine do their thing and upgrade my controller
i just place the walls manually
Holy shit also have one of those 1:1 at my dads house
gpt-5

in the middle ages flanders used to be the best textile makers and the richest county in Europe.
openai engineer in this chat 

you broke the nowaying chain
he is leaking gpt-5
fun fact is that gpt 5 actually got leaked on perplexity

pretty sure someone got fired

Perplexity got the hidden url
good
Almost everyone nowadays
who's everyone
They were rotating ips and headers and everything

i can understand that
Any AI company
yes but perplexity is a search engine
Even outside their officially documented ip ranges
With AI
so google?
Perplexity is chatgpt wrapper
i don't think ai companies roll their own web search while ignoring robots.txt
Yes, but I don't know if they use the same crawler for search index and their AI stuff
scraping for data and for indexing are a bit different
Yes, and the AI division is what's ignoring robots.txt
that doesn't make sense
because how would gpt-5 leak then? 3 mentions of it in pretraining wouldn't suddenly make the model recall it
that means that their search ignores robots.txt
It does
It explicitly does
Lmfao
Cloudflare blog
absolutely insane
Now the question is, do they use the same crawler for their AI and their search engine? Because google as you said, uses a different bot
isn't perplexity just their AI?
Hence my statement, AI companies ignoring robots.txt
In short, they want open source to remain the near exclusive domain of autistic coding nerds.

its simple
they dont want 4o competitor
so openai made it clueless on real world info

m8 find me a goddamn phone that can run a 20B model
sam altman's phone, similar to a gaming laptop, uses 5W on battery and 1500W when plugged in

bro has sub-zero cooling, in kelvin
The new Pixels are getting 16 gigs of ram IIRC
you too can run gpt-oss on your phone with a ChatGPT™ Plus® plan! buy today at https://chatgpt.com/plus (not sponsored)
Its 20B parameters not 20GB lmao
i aint runnin this on my pc 🥀
tried downloading the fucking qwen-image yesterday and BSOD'd my PC by accident
Its MoE, you can run it EZ with some offloading
on a 4080?
ima try getting lm studio back in and cook smth ig
Hell you won't have to offload
Just drop context down to like 90k instead of 128
You've only gotta save 1GB of VRAM so your OS can do its thing
Hell drop down to 64k context and you can run it while doing other shit with zero issues
You're using ollama aren't you

Lmao, yeah ollamas memory management is fucked
I'm using llama.cpp with flash attention
be me
distrohopping
"man i wish there were a way to just kinda store my exact configuration across installs of my system, that way even if i have to wipe it for some reason it's just configured exactly the same way"
realize that's what nixos is for
it's over
i haven't figured out a good alternative that allows me to quickly use models....
i really should get to it and get rid of ollama. i hate where its heading with their GUI that silently installs when you update
i should make my own thin-ollama client 
i just need something to spin up models and then shut them down after some time
vLLM
huh it serves it like ollama does, i didnt know it did anything differently, you can just do like `vllm serve Qwen/Qwen2.5-1.5B` and it starts serving that on an endpoint
are you saying like you want it to idle and unload the model if idle?

since i only need the model to load when i use it with vscode
and then it should go away if its not in use
For mentioning the possibility of using NixOS alone, you have summoned the NixOS council here
LM Studio
Good luck with that lol. I am content with my Arch setup
but does it do memory management better or worse?
You have a GUI at least to control a lot of the parameter
my ventoy drive rn lol
a gui is exactly the reason i want to move away from ollama. so thats not a positive
lm studio's UI is the reason i am moving away from it
it has some stuff but it's frustratingly limited
better than running into a risk of bsod'ing my own pc bc I cant handle memory shortages 
I see lol. I thought you would be interested in the auto unload model
yeah lm studio do be having that i forgot
that's what i used to use for echo but i havent set him up on this pc yet
surely... its not difficult to get that functionality with vllm
And also this config for inference
https://github.com/mostlygeek/llama-swap
apparently you can use this with any backend including vllm
and it supports auto offload
Automatic unloading of models after timeout by setting a ttl in features
I honestly have no idea lol. I use LLM the same way you do, that is only loading it when needed
That's why I don't need a daemon running all the time
Just open another app as needed
was reading this thread and found a few that support it but it's mostly thru frontends
vllm has /sleep and /wake_up endpoints.. so you can probably just have a slim python script that is proxy
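A rough sketch of the idle-tracking half of such a proxy, to make the idea concrete. Only the `/sleep` and `/wake_up` paths come from the chat; the base URL, the class, and the TTL logic are assumptions, and as far as I know those endpoints are only exposed when the server is started with sleep mode enabled in dev mode, so check the vLLM docs before relying on this. The actual request forwarding is left out.

```python
import time
import urllib.request

VLLM_BASE_URL = "http://localhost:8000"  # assumption: wherever your vllm server listens


def post(path: str) -> None:
    """Fire an empty POST at the vLLM server, e.g. post('/sleep')."""
    req = urllib.request.Request(VLLM_BASE_URL + path, data=b"", method="POST")
    urllib.request.urlopen(req, timeout=30).close()


class IdleSleeper:
    """Wakes the model on demand and puts it back to sleep after `ttl` idle seconds."""

    def __init__(self, ttl, sleep_fn=lambda: post("/sleep"),
                 wake_fn=lambda: post("/wake_up"), clock=time.monotonic):
        self.ttl = ttl
        self.sleep_fn = sleep_fn
        self.wake_fn = wake_fn
        self.clock = clock
        self.asleep = True  # assumption: the proxy starts with the model asleep
        self.last_used = clock()

    def on_request(self):
        """Call right before forwarding a request to the backend."""
        if self.asleep:
            self.wake_fn()
            self.asleep = False
        self.last_used = self.clock()

    def tick(self):
        """Call periodically; sleeps the model once it has been idle for `ttl`."""
        if not self.asleep and self.clock() - self.last_used >= self.ttl:
            self.sleep_fn()
            self.asleep = True
```

Usage would be something like `sleeper = IdleSleeper(ttl=300)`, calling `sleeper.on_request()` in the proxy's request handler and `sleeper.tick()` from a timer. The sleep and wake functions are injectable so the logic can be tried without a running server.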
speaking of ai slopshit i found that stable diffusion is like 2x faster on my 3080 on this cachyOS install
than it was on windows
so that's cool
check this out
that's the whole point of this thing
but it's for oogabooga text gen webui and sillytavern
ok chat this is first time im trying a MoE local model
what do you increase number of experts for
dont
just dont change number of experts usually
it can cause it to be weird and/or shit
it's fine tuned with the built in number of experts

more experts you would think is better automatically but there is usually a sweet spot
the router itself is the problem with that
i mean theoretically more experts = better
but unless you're on base model only and planning on fine tuning further with your chosen number of experts it'll probably get unstable quickly
the more i use c++, the more i question if i suck at programming or if c++ sucks as a language
and then i realize the answer is both
it was never a valid question
hmmm claude has opus 4.1 now i didnt even notice that happened
pre-empt gpt5 i guess
agi
but ye anyways i have now made the separate files not contain functions, but a class with functions and variables cuz it just works better that way idk
chay would not be proud cuz of the OOP
ok im runnin in a lil bit of trouble here
I think lm studio prioritizes taking up my normal RAM over VRAM for some rsn
like it fills up close to 100% way too fast
i started using capitals for the first letter of a classname tho, that is huge
choose the gpu offload layers
appropriately
it'll default to like
8
when there's way more than that
you would configure that when you're initializing the model
its literally set to 20/24
The brute force way lmao
gonna lower the context
it should put whatever doesnt fit in system ram anyways
using anything more than like 8k context locally balloons memory usage pretty fast yeah
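To see why context length eats memory, here is a minimal sketch of the usual KV cache estimate for plain full attention. The layer count, KV head count, and head size are hypothetical, picked only for illustration, and real runtimes may do better with sliding-window attention or a quantized cache.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, bytes_per_value=2):
    """Size of the key + value cache for one sequence with plain full attention."""
    # 2x because both keys and values are stored for every layer and position.
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_value


# Hypothetical model shape, not taken from any specific model.
LAYERS, KV_HEADS, HEAD_DIM = 32, 8, 128

for ctx in (4_096, 8_192, 65_536, 131_072):
    gib = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, ctx) / 2**30
    print(f"{ctx:>7} tokens -> {gib:.1f} GiB")
```

The cache grows linearly with context, so going from 8k to 128k multiplies it by 16 under these assumptions.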
i cant read that, is that ram?
yes
ah
your problem is that says память and not memory
4k length it is LULW
i may not have a brain gentleman but i have an idea
an idea and a dream
remove your ram and then it cant use your ram
am scared
it says 15 gb is taken but FUCKING WHERE
im fairly certain i toggled off the reserve memory checkbox
am really not clocking where is that 15 gb coming out from
what size model
the 20b one
what is your VRAM
16 gb
?
unsloth/gpt-oss-20b-GGUF
ggml-org/gpt-oss-20b-GGUF
The model is MXFP4 by default
ok so i did download the wrong model POG
yeah a 20b if it's full precision
bf16
needs 2 bytes/param
which is like 40-45 gb of vram
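The 2 bytes/param arithmetic, as a quick sketch. It only counts weights (no KV cache, activations, or runtime overhead), treats "20b" as exactly 20e9 parameters, and the 4.25 bits/param row assumes an MXFP4-style layout of 4-bit values plus one shared 8-bit scale per block of 32, applied to every weight, which a real mixed-precision checkpoint would not do.

```python
def weight_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in GB (1e9 bytes), weights only."""
    return n_params * bits_per_param / 8 / 1e9


N_PARAMS = 20e9  # "20b" as used in the chat; the exact count may differ

print(weight_gb(N_PARAMS, 16))    # bf16: 40.0
print(weight_gb(N_PARAMS, 4.25))  # 4-bit values + 8-bit scale per 32 values: 10.625
```

A file that keeps some layers at 16 bits will land somewhere between those two numbers.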
but hey it works
Its 20B
sort of, but im scared for my RAM so im gonna delete this shit
Oh gotcha
is it released already tho

make sure ur using flash attention and kv offloading stuff
Repo stats probably haven't been updated yet
But Unsloth are my go-to quant providers
unsloth are based
And if you go to the ggml-org one they're literally the guys who make llama.cpp 
f16 is the only available so ima go for this one ig
"F16" here is the mixed precision format the model was trained in
Its actually mostly 4 bit
oh
Comes in at around 14GB
Nah, the embedding and output layers are FP16 but most of the model was trained in FP4
Very, its the first release like this I've seen
They merged in support for the MXFP4 format in llama.cpp too
makes sense with how q4 is like the default i keep running into
everyone keeps measuring what their model will run on with fineprint of **in q4
if that one was 12 t/s unoptimized I wonder how fast this one gonna run
12 t/s inc
Very
download on this 1 is holy fucking slow tho
Its ridiculously fast
wonder what this lm studio linux download is inb4 flatpak or appimage
god dammit it's .AppImage
Classic lmstudio
time to download a sketchy AUR version
Very much a closed source program so you get stuff like that but its good enough
3 MB/s 
I still prefer to stick with llama.cpp though
i just like using it for very fast tests
cuz i havent set up any llm shit on this install of cachy
You are downloading in llama.cpp itself right?
Because that's a lot faster, they throttle browser downloads on huggingface iirc
im downloading in the fucking lm studio
the og model got yeeted in fast but i guess hf is fucked up

actually they look normal thankfully
Optimal sampler params here btw
Mistral GPT pog
Native MXFP4 quantization: The models are trained with native MXFP4 precision for the MoE layer, making gpt-oss-120b run on a single H100 GPU and the gpt-oss-20b model run within 16GB of memory
within 16GB of memory
only if you use like 2k context
gotta be your connection
thats not the
unsloth

u fell down the same slope I did
it was recommended by lm studio surely it'll be fine 
BF16 ono
now i know it's your connection for sure
cuz i get 70MB/s while downloading both
for each
engine revving sound
i need to get a 10g ethernet card

Weird
need new ethernet
not really a vpn, just a bypass script
lmstudio should be connecting to hf direct anyways
restarted app & download, seems sorta better
it must be something with hf where you are maybe
lmstudio is unusable
wayland icon
instead of the app icon
it's over
did you maybe have this toggled @cosmic sphinx
i forgot about this setting
you know you fucked up when you change a file, and then you get 20 errors from completely different files 

exactly
i feel like i read some anecdote recently about someone who was having inexplicable errors in his code for something and it turned out to be a cpu bug

That would actually drive me insane in the truest sense
now i find the reason i didnt want to dive in to running this crap on linux
exciting times ahead
i can imagine intel saying "do not under any circumstances do these 25 instructions after each other or your cpu will literally kill itself"
with the shit they've been doing lately
i think this just means it's running out of VRAM
they just need a longer instruction pipeline
bigger branch predictor
how about 120 instead of 40 y'all
just predict a year ahead
oh i forgot they've been doing those "core ultra" cpu's
to me i just thought 14900KS is the latest and greatest they have
yeah they changed the naming again for the lulz so now it's once again impossible to tell what is what unless you already know
im downloading the unsloth version
lets see if ollama can run it
y'all i was so close today too
apparently 290K or whatever the fuck its called has a built in npu
i feel like thats anti-marketing, nobody wants that
Be aware ollama blows up the context cache size
Not like it helps anyone if you can't get good memory bandwidth
should've just done it
In LLM inference bandwidth is always the limiting factor
until it isn't 
And when is it not? On any modern GPU they run out of bandwidth before compute
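A rough way to see the bandwidth argument: if every active weight has to be read from memory once per generated token, bandwidth alone caps decode speed. The sketch below uses hypothetical numbers (700 GB/s, 3e9 active parameters like the "3b active" MoEs mentioned earlier, about half a byte per parameter) and ignores KV cache reads, batching, and compute entirely.

```python
def max_tokens_per_second(bandwidth_gb_s: float, active_params: float,
                          bytes_per_param: float) -> float:
    """Upper bound on decode speed if every active weight is read once per token."""
    bytes_per_token = active_params * bytes_per_param
    return bandwidth_gb_s * 1e9 / bytes_per_token


# Hypothetical GPU and model numbers, for illustration only.
print(max_tokens_per_second(700, 3e9, 0.5))
```

Real throughput sits below this bound; the point is only that halving the bytes read per token doubles the ceiling.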
its kinda funny how all these big companies are now boasting features that actively make me not want to get their product
AI slop?
was tempted on the 24gb intel part though
that i should've done
didnt know this model was out at the time tho
that makes sense why claude put out 4.1 opus today suddenly
kinda. im fine with ai in ms paint to remove backgrounds and shit, but dont use my data please
no recall
absolutely not
hey that's why i moved to linux
Linux is peak
i really dont feel like getting thought policed randomly cuz i opened the wrong shit when it took a screenshot
If I want AI features, I can just host my own model on my own hardware and keep my own data as my own
no
the og model is just too large and unoptimized
thats why we have based unsloth
im moving to linux soon enough, but i do need windows for my college program so ill be dualbooting
run windows in a virtual machine lule
If I ever need Windows again I'm gonna just VM it
nah
ollama can't run the gguf version of it
im doing a game-dev program, i dont want the performance to suck as
for a non-graphical application idk why vm wouldn't be more than fine
be the one to make it work in proton
//wine
//bottles
Then just run on WINE or using the Linux versions of software
why run windows when i could have a system clock THIS verbose
tis a bit drunk
why the fuck am i downloading this thing
i will need Visual Studio Enterprise, Microsoft Office, Adobe Creative Cloud, Visual studio community, Unity, Figma and Blender
my condolences
visual studio is the killer there probably
and a drawing tablet for some reason too
Adobe and Office are the two problems I see
those too
I wouldn't wish VS upon my worst enemy 🫡
idk i never expect visual studio fat to work properly on linux
tell us how that runs, i may consider testing tmrw
Will do 
it's not like it has a reason to exist on linux with how tightly integrated it is with windows development tbf
whhere's the lobotomized and incoherent mradermacher q2 quant
perhaps merged with a random qwen model for no reason too
technically with both GPUs combined I could load the 120B model at Q2
the lenovo tablet i got a year ago already has a pen digitiser thing built in along with a magnet wireless charger for it on the back, i just need the pen for it still cuz it wasnt included with the tablet itself for some reason
oh yeah definitely it's just one of the things that i expect to fit the "never, absolutely not, definitely won't be a smooth experience" type thing when trying to use it on *nix
But not like I need it, I have no use for an LLM on my local system
as expensive as it was, i absolutely love my surface pen 
the tablet itself was only 240 bucks of aliexpress, im also getting the pen on there
if i were to get them from a local company the tablet would be 700 and the pen 140
Okay, even with the fucked up unsloth quant GPT oss outperformed Qwen 3 32B
kinda thought about getting one the other day
This is the real deal
also i sure love how there are like a dozen different standards for digital pens 
can we not
imo you're better off with aliexpress than amazon in terms of dropshipping

relevant xkcd
honestly
how exactly is it fucked up
gibberish
Tokenisation bugs
yep, even lenovo in their tablet ecosystem has 5+ different pens that don't work together. And Lenovo also likes naming things in the Chinese market differently than the western market.
the xiaoxin pad pro 2023 i got is called the tab P12 here, and its 3x the price
i wonder how the unsloth version ended up larger
guess just from saving it in flat f16
assuming that's what was done
the mxfp4 weights wouldnt be real f16 weights but would still be slightly larger in disk i assume
that's par for the course
model releases
everyone releases some form of it
fix later
I mean, knowing its THE OAI model, I thought more ppl would run it thru by now
Unsloth also had the sampler parameters wrong 
unsloth version released 40 minutes ago 
temp 1.0 
Its a reasoning model, they get weirdge with this stuff
actually classic
i feel like this particular param thing does happen like
every time
with unsloth lol
love them but they love shipping the wrong inference params
probably p good for programming but idk why you'd do a local model for brogramming
minimal chances of data leaks to bro corpos
blud is working on top secret code
if I worked in the defense industry
i figured out long ago that anthropic has no use for my shitty code
don’t give it an email tool
well yeah defense industry would be a little different if u need a clearance
but even then it'd be hard to work from my own pc
It can do agentic stuff like browser tasks too which is neat
honestly yeah, making it work as a local agent
i’ve seen schizo gh issues filed by claude
every time i've used something to agentic anything i'm sat there like wow i could've done this myself already
i guess i don't have that much to delegate away
opus 4.1 is such a meme upgrade 
2% better on swe bench == agi
consider it a sneak peek to the future possibilities
yeah i mean i know it's poggers that it's possible i just don't necessarily have anything for an agent to do atm
other than research
i got gpt pro on a whim since i've been testing the $2 billion dollar subscriptions one by one for a month
and i'm struggling to find things to make use of it
Okay I'm getting mixed signals, they recommend unsloths samplers on some guides at their website
And then the ones I just screenshot on their github repo
in the same article?
i figure the article would be out of date
i'd trust what's on the repo
yeah no1 upgrades articles fast enough
im malding that i tried the $200 gemini subscription right before they put out deep think mode because i got it thinkin deep think was imminent
but i was 1 month ahead
might've done the same thing with gpt5 tbh
because i see this out
but when gpt5
although sama was posting gpt5 screenshots on xitter the other day
I love how sam just gave a sneak peek in that video, where bros whole desktop was cluttered with txt files
'can u move all these to trash pls thx'
Blud press Ctrl+A and Del
ngl ive had claude code do that same task
"help, too many files, fix plz"
i guess i do have agentic tasks i just give them all to claude code
and dont consider them that
because they dont need to use a visual thing
they just do it cli
on the other hand, if like 70% of the 100 files on desktop were empty, but the other 30% had important content, then if you could trust the agent to look which had the content and save them instead of deleting, that'd be cooler of a demo
altho even that is sorta ez when u just sort by file size
reminds me of the promptboard thing i made
*presses cmd-space* “textedit pls” *waits 10 seconds for it to type “textedit”*
i did that w/ claude code with a bunch of versions of something i'd written but kept saving in different directories with useless differentiators for a name
so i had it open each one and figure out which was furthest ahead
and then order them in order of like
up-to-date-ness
its nice that for this model u can select the reasoning effort, like for o series models
if that acc gives some improvement
well
max reasoning can actually cause some degradation
depending on the task
so that's what i imagine you'd use it for
low effort for low effort stuff
max for hard stuff
if the task at hand is too simple sometimes they overthink on max reasoning mode
"What is 2+2"
Thought for 3 minutes 48 seconds...
Not experienced the same bugs with the ggml org quant
Lemme guess, high reasoning effort 
and came to the conclusion that you're actually tricking it 3 ways
"the user is asking a simple math question, so the answer would be 2+2 = 4
BUT WHAT IF...."
Classic
simple math question so 2+2=4
but what if the user lives in a non-Euclidean dimension? 
What am I looking at here now?
ai slop schizophrenia
discussing openai open source slop
Nice
that's what OSS stands for
in gpt-oss
not open source software but open source slop
the article on hf is titled Welcome GPT OSS, the new open-source model family from OpenAI!
family
unless just meaning 20 and 120b
maybe more inc
at some point
sam's in high reasoning mode
ur kirito??
you know how these days you have power fantasy isekai slop?
are u aware ur oshi is a kirito clone
the eminence in shadow took that concept and ran with it to such a high degree its become one of the best recent power fantasy isekais
he's not just a kirito clone, he's the kirito clone
isn't that just overlord
does it
mostly
Gpt oss is so bad someone pls kill me the training is going so poorly compared to any other model pls
i imagine for fine tuning that something trained in weirdo mxfp4 weights and f16 mixed would be fucked up using finetuning tools not expecting that
Ima give it time to see whether there's just bugs that need smoothing out
me every time i try a 20b class model
kinda funny if there are bugs.. they already delayed it 
which is why i gave up on using local ai for anything real until i have wayyy thiccer hardware
the eminence in shadow literally has characters named po tato and skel etal
also the mc has alter egos names john smith and mundane mann
qwen3-coder 30b is good for local ai
it barely doesn't fit into my gpu vram... but it runs fairly quickly
the normal qwen3 models are also good local ai's
Pete Saman
The guy who delivers pizza shall from now on be called Pete

I still consider Mistrals 20b class models stellar
ok this may be worth watching nvm
i will no longer shit on eminence in shadow
the bad guy is literally named perv ass hat
reminds me of one of my favourite manga of all time
thinking of what it's called i cannot name it here though rip

tfw rsync'd 2TB to a new storage drive and then thought let's verify the integrity real quick, that's a good idea
and it's a HDD
it's been reading for sooooo long
It is
Bad
I have it finetuning on a h200 rn
Bad
kek
high then super suspiciously low forever?
and then it comes out way overcooked
that's been my experience
with wrong params anyways
not robust enough
Loss is higher like

Why is there so many orange here
Orange
ye fair
You're blue and already visit beach many times
2.3 vs 1.4 loss
1.4 is qwen 3 30b a3b
It's a big diff
55 vs 70 accuracy
Im trying to see if i can fix somehow
2.3 is high
also depends kinda on the model
unless the test inferences are also schizo
wow pop! OS i thought would be running cosmic by default
and it looked like it
but it's gnome
Both are active 3b so
and just looks exactly like cosmic
well, the other way around
they made cosmic look exactly like their gnome flavour
wow
twitch L
does this if launching on zen browser
cant figure out that it's just ff
Am I missing anything by using vanilla firefox?
not really
i just use it because i like the vertical tabs
and ff can't currently be configured that way without a lot of addon slop it seems
because i tried
well yes you can have that but it wont let you jam literally everything in the sidebar like zen
Bro, wtf are you doing to the poor browser now 
i dont even have a top bar
cuz i wanted it to be like a portal type thing so there's nothing on the top of the window, it's all on the sidebar which is hidden
Ahh, I see
More docs should be written in middle English 
For yon other intricate joint forms, such as the ball joint, we remain in discourse regarding its potential implementation. Yet, a series motor doth rank last upon the list, owing to the gimbal lock that dost constrain the true movements attainable by a mortal.
i wouldn't want king arthur to be unable to comprehend my documentation that's true
its the #2 isekai comedy after konosuba for me
well mucho loved konosuba
speaking of which i need to watch the megumin spinoff
see this is why i use zen cuz the entire ui can be blasted onto this left bar
and then it's like this without that hovered
so my space is used for the actual page
ignore my too-thicc border
i need to fix that
I honestly like the minimalist look. But I use a vertical monitor anyway so I have a lot of vertical space 
so do i
on the left side
that's actually the main reason
because it was a pain in the ass
if i'm in mega slouch 3am mode
to get the mouse up to the top
lool
so i was like fuck that i want everything in this sidebar
and ff wouldn't let me
and now i'm coping
okay..... maybe twitch is mad that my user agent says linux? this is updated ff wtf
reeee what is the latest windows ff update version
funniest thing about this model might be the safety settings that they cooked into the model
wait did they really
i mean
i guess they always were gonna cook safety shit in
but like you can adjust it at inference time?
Yeah this release is rough
natively
yeah im reading all about it
woof
its disgusting
I'm having a little trouble with my AI, which I coded, and it's having trouble using expressions in VTube Studio. Can someone give me a little help
Im running it in python
uhhh
im not an ai dev but i feel like we'd need more info about how your system works
firefox 141.0-1 on arch
So that seems the latest version
well ye, but how does the expression system work?
i won't even do it with ff
i think it's just mad because it's reporting ff for arch or something
but even if i spoof ff
it says no
and if i use ff
spoofing windows ff
no
i guess they dont want me to ever log in
it literally prevents you logging in at all
I don't have twitch so can't exactly reproduce it lol






i can barely run the gpt 20b with 8k context




