#programming

I think it's like 2-3 weeks ago ppl all over China used that AI to like order Meituan (like DoorDash)
since they are like doing a promo, if you order the food via the ai, you can get it for free
which kinda makes the server workload pretty high


_"The AI battle during the Spring Festival is heating up! Early on the morning of February 6, the "Spring Festival 3 Billion Yuan Free Drink Giveaway" campaign officially launched on the Qianwen app, kicking off a "bubble tea offensive" and inviting people across the country to order free bubble tea with a single AI-generated phrase. During the event, each participant can claim 21 no-minimum-spend 25-yuan free drink vouchers, totaling 525 yuan.
A representative from the Qianwen app stated, “Through this Spring Festival giveaway campaign, we hope to invite people across the country to experience a brand-new lifestyle in the AI era and integrate AI into their real-life consumption habits.”_ translated via deepL, source
-# https://www.hk01.com/財經快訊/60319797/千問app啟30億免單-ai一句話免費點奶茶-每人可領525元免單卡

On further testing the 27B Qwen 3.5 drops off severely below Q6
Even Q5 is prone to infinite loops
still makes me lmao
You just love to see it 
If I had to guess I'd assume the unique architecture they've gone with is just extra sensitive to accuracy loss
q4_k_m does it with thinking on but off it's fine
makes sense as errors pile up during the thinking portion
I'd argue the performance is probably still severely degraded with thinking off
Its just not as obvious
it's worse definitely
it's just not
infinity loop bad
q6 is the first usable quant it seems
That aligns with my testing yeah
i have successfully finetuned an LLM without backpropagation
Sorry missed send

anyways
volumetric display REV2

we need that extra bit of empty space at the bottom of that box to put the power cable and hdmi into the projector
this still doesnt have the motor assembly, but ill do that later
here's what they're "doing in response" for those of you who arent in the windows insiders program
hey, at least they're removing copilot (supposedly)
some instances of copilot 
nowaying able to skip updates
Honestly, to a millionaire like Vedal, this is lunch money.
Crazy fucking robot body soon? 
https://fxtwitter.com/asimovinc/status/2029141385069379825?s=20
You can build your own humanoid at home. Asimov – Here be Dragons is now available for presale. $499 deposit, $15,000 target price. asimov.inc/diy-kit
*only for OOBE it seems
hello im long time neuro/vedal watcher but never joined discord, now that slay the spire 2 has come out i was looking at automated slay the spire policies again and I revisited some of the neuro/evil vods for the game
i was just curious whether anyone happened to know what kind of setup vedal used to have neuro play the game? I skimmed through the history and there is some discussion on the API and what kind of information neuro gets, but with regards to the policy is it known whether it's an LLM-primary approach maybe with a lot of instructions, scripted (heuristic) or algorithmic (e.g. tree search based), or ML/RL based approach?
i remember seeing vedal program some of the among us stuff on stream so i wonder if he ever showed the slay the spire development on stream
i notice generally potions are saved until the boss and some significant portion of runs avoid elites completely, so i thought there might be some scripted part at least
sorry for the text wall and appreciate any pointers :)
i would generally assume it just works like any other game integration (except minecraft and osu!) where neuro is just getting information and working with actions where she can choose things.
even that's how mc works nowadays I think
but also iirc slay the spire was something kinda custom?
i would assume it was custom at the start but wouldn't be surprised if it got updated later on to use the sdk. tho idk the code for it isn't public like the other games so maybe it still is custom
There's definitely a separate bot playing minecraft while the twin give high-level directions.
Evil has argued with it the last couple times she played.
They can't keep up with real-time games.
this reminds me i really should finish my brotato integration... when i switched to linux i kinda abandoned it 
yes, what I meant is they are still getting information and working with actions to choose how the bot should respond
interesting, ty guys
I guess maybe not enough info to say exactly but has a chance to be custom?
most likely custom yea
well just the api would be custom on how it talks to neuro. but it would still work the same as something like inscryption
^
and the inscryption mod is public at least: https://github.com/VedalAI/neuro-inscryption
its just neuro choosing actions. no RL/ML, no heuristics nothing. just a llm choosing funny choices
huh that's good to know
but obviously neuro won't be draggin the cards and such. she will just say "play x card on x enemy" and then the SDK will handle on how to actually play and select the enemy
there were some other LLM evals on slay the spire and it feels neuro/evil play differently but i suspect vedal gave a different set of info
yeah makes sense
neuro is also finetuned. so she won't act like other llm's generally
im actually quite curious is it known what vedal used as the base model
just guesses mostly.. wouldn't be surprised if it also changed over the years
i feel neuro talks quite naturally sometimes but maybe this quality is uncharacteristic for smaller local models
icic
and yeah makes sense as well
my personal guess is she is llama based. or was at the start. but if she still is or switched to deepseek/qwen no idea
or maybe the mad man actually trained his own basemodel
the finetuning doesn't exactly make it easier
unlikely, but given his resources now I wouldn't be surprised if he's working on it
i mean... it is possible to train your own model. and if you use cloud runners i don't really see why not.
I think he runs his models locally
tho you could also just use a base model that exists and just finetune it like crazy so thats probably cheaper
yeah that's the leading theory
he is running them locally
also I feel bad for the editor here

it gets better later on
but also...
neuro played blue prince?
no but there's a WIP integration for it

i also just noticed the gary rework is finished. now i really don't have a reason to not work on my integration
i take it back
WHAT
I coded a webDAV server and interface for my Android phone, it automatically starts using Termux:Boot when the phone is powered on.
visible on any wifi network its connected to
its still on even if i exit and force stop termux
He hates me
would be cool to listen to what it'd be like with an official acapella and reverb removed
cause using separation software has a looooot of artifacts
I know there was one song released with acapella.. but still had reverb
there's something about hearing the raw vocals
was about to say the same.... thats sounds like a fun side project to break into 
no it was a security patch
por que no los dos
to your message i was going to say
perhaps we're all wrong and that's the magic loss number
There are two optimal values for loss:
0.00
297965.6253
all others mean you haven’t completed training
the model grokked agi and it fucked up the loss calculation since it gained omnipotence
looks a lot like the gradients exploded
they exploded alright
No it grokked into consciousness
I like the second one
can someone check if my site has cert or if im just being played with
borgortube.duckdns.org
its supposed to say "Lets Encrypt"
no cert
did you start the server on port 443?
yea nginx is listening on port 80 only
i want to get a shitty llm to run on my laptop and i was thinking of two options
deepseek r1 distill qwen 7b and distill llama8b
which do i get
why not just qwen 3.5 9b?
its on port 80
im using nginx
port 80 is plain http
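For anyone wanting to check this themselves: a minimal sketch of asking a host for its certificate on 443 and reading the issuer. `fetch_cert` and `issuer_org` are made-up helper names, and it assumes the server actually listens on 443; with nginx only on port 80 the connection just fails, which is the "no cert" result above.

```python
import socket
import ssl


def issuer_org(cert):
    """Pull organizationName out of the dict returned by SSLSocket.getpeercert()."""
    for rdn in cert.get("issuer", ()):
        for key, value in rdn:
            if key == "organizationName":
                return value
    return None


def fetch_cert(host, port=443, timeout=5.0):
    """Open a verified TLS connection and return the peer certificate as a dict.

    Raises OSError (e.g. ConnectionRefusedError) if nothing listens on the port,
    and ssl.SSLCertVerificationError if the certificate does not verify.
    """
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()
```

Usage would be something like `issuer_org(fetch_cert("example.org"))`, which should return the issuing organisation's name for a host with a valid certificate.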
oh
can yall test playback on the site
thank you
just search anything and play it
ill make adjustments later
it's still not on port 443 btw
it works but it is also constantly hammering localhost:8001
trying to open a websocket
holy shi
i didnt notice that
it's your mpvbridge looking for deno
through*
ig to open up mpv playback w/ playlist
So, what should i use? I have RTX2060, RX6700XT, some "1060 6GB" cards.. and i mainly run Windows.. i mean, running llamacpp server, and then yet another server for chat-ui cant be the way to go, no?.. at the same time it does allow to be very modular, at the cost of being janky af to boot up..
my bot runs its discord client on my mini pc and then the model is on whatever
usually my main pc but ive been moving it around
usually the servers will chill and load jit if actually called
guys I kinda regret touching termux
Dawwggg i rly am living in a broke ass environment huh? 
looks reliable
Reliable my ass 😂
Not really an option for me, ig i could do xmpp but yea.. not private
@glass jetty
oauth
I might just try to get an ie3 compatible frontend running..
var myElement = document.all["myId"];
Apparently ie3 does support this much js soo.. with some php and all it should be possible
Wouldn't be my first rodeo with getting decent ui from ie3, but my first being somewhat dynamic
Yes
Don't judge me, i maintain legacy systems because yes
bro thats a fossil
Nah that's part of the cosmic microwave background
i was not yet 2 years old when that browser came out
not sure if there's any ARPANET compatible discord clients
Yea, default browser of Windows 95.
95 can do tcp ip
It gets worse. To allow adding new messages, i have to fire off get requests or something via js (ie3 compatible js no less) and render that in a hidden iframe
I mean it should work but.. yeaa
Purely conceptually, ig a chat open should immediately redirect to a chat running with an id, to keep track of new chats and stuff
Is that even doable php-side?
Should i even do this with php or rather a python server..
Python would make everything else easier ig
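A rough sketch of the "open a chat, get redirected to a chat with an id" idea using only Python's standard library. The `/chat/<id>` URL layout, `new_chat_location`, and `ChatHandler` are all assumptions for illustration, not anything from an existing project. The redirect is a plain 302 with a `Location` header, which is already part of HTTP/1.0, so no client-side JS is needed for that step.

```python
import html
import uuid
from http.server import BaseHTTPRequestHandler, HTTPServer


def new_chat_location():
    """Build the redirect target for a fresh chat (hypothetical /chat/<id> layout)."""
    return "/chat/" + uuid.uuid4().hex


class ChatHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path in ("/", "/chat"):
            # New visitor: mint an id and send the browser to it.
            self.send_response(302)
            self.send_header("Location", new_chat_location())
            self.end_headers()
        elif self.path.startswith("/chat/"):
            # Existing chat: render a placeholder page for that id.
            chat_id = html.escape(self.path[len("/chat/"):])
            body = ("<html><body>chat " + chat_id + "</body></html>").encode("ascii", "replace")
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)


def make_server(port=8080):
    """Create (but do not start) the server; call .serve_forever() to run it."""
    return HTTPServer(("0.0.0.0", port), ChatHandler)
```

Running it is `make_server().serve_forever()`. Polling for new messages through a hidden iframe would be a second GET route on the same handler.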
as for
megalop
(LOP)
Those proxy services kinda dont work well on a 50mhz i486 with 4-20mb ram
And IF i can get it working with just old tools, why wouldn't i
I honestly still don't understand why people glaze Llama.cpp... I can't think of any reason it's better than alternatives
Well have you even used it? The VRAM usage efficiency even for big models is great
i only included that because it exists but that would be my last choice, the other 2 are more of what the suggestion was
why wouldn't i
i dunno, since you were moaning about it i figured you dont like ie3 but felt forced to use it or something lol, if that's the aspect of it you enjoy then go off king i just read your messages with a more exasperated tone than intended i guess
I run vllm
sglang is better and cooler and has nicer clothes + a cooler ride with a supercharger and also more people asked it out to prom
As for vram usage a W4A16 GS64 24b model is similar in quantity to q6_k yet it's about half the size, I'm able to fit a 24b+ 131764 f16 context in 48gb
I had a cobalt SS for about a week and that thing was evidence that a supercharger doesn't make something cool. The thing was hilariously disgusting
i dunno sounds pretty cool to me in a night ranger kinda way
It's a 4 cylinder making a full 200 horsepower and very little room for improvement... But I'm probably biased as I am comparing it to a dual pulley audi 3.0 tfsi
48GB for 24B? Inefficient
I do 24GB for 27B

Cheapest 3090 rn is 600
This popped up tho, pc that fails to boot and goes to bluescreen and such, for 900
I think ill buy it, if nobody bids higher
Well of course not
I was thinking something around 500€?
It's always possible it's not broken, but the CPU mem controller has somehow degraded a little and can't keep mem speed
Who knows
Either way, update me on if you get it and have a 3090 for me to buy
It has "DPC_WATCHDOG_VIOLATION"
Apparently could be related to SATA or GPU driver
Wouldn't be surprised if it was a faulty Nvidia driver
Oh
There are 2 types
"CLOCK_WATCHDOG_TIMEOUT" also exists
They didnt specify which one, so could even be cpu issue
Either way, ill bid up to 1K
devstral small 2 2512: llama.cpp on a 32GB card with vulkan, stock q6_k gguf, no custom anything, 131k context, multimodal, amd card LULE, uses ~40GB total across vram and ram, and this q6 is 19gb raw size 
Apparently that one is related to interrupt processing
Associations may be driver conflicts, overclocking (unlikely in this case) or hardware issues
It seems like the issues are more pointing towards other components yeah
If even the GPU and RAM work you could probably get a profit off of it
Or just keep the RAM
It's up to you which you get
If you sell the 3090 for a similar price in either case, it's not gonna affect me
take them to the cleaners, the ssds if theyre at least 980/990 + 3090 + 32gb of ddr4 (not ddr5 prices but that's still gonna be at least $200 in ram)
deal
So get the one without errors?
Without errors is probably worth it if it's indeed only 1000€ for all that
I’ve had that several times and mine still works
Mostly
Either way, whatever you end up getting, just let me know if you have a good 3090 to sell
and superbox was never seen again
I will get those once I put some money onto my bank account
I just need to figure out how to actually do that because I have never actually transferred cash to my bank account on my own
your bank should have an ATM that takes cash, there's one almost guaranteed at the bank itself, but sometimes they have atleast one in most cities if they dont have a full branch
Well I'm not exactly in a city, and driving to the city is very scary
I'll figure it out
can find em on google maps also, if the bank doesn't have its own map
Hi! Can I rant about Samsung here? I just can't stand their warranty policy 
Sure
The tech channel does indeed allow talking about tech related things

Asus one
I see
Hm, I wonder if that thing has a riser on it
I'm gonna need a riser in order to be able to actually attach it
It's vertical mounted so the chance is not 0
pay extra for the riser 
I got an SSD from them back in 2021 when I lived in Ukraine and I moved to a different country. The SSD died and for them to replace it they told me to send it back to Kyiv. 
Man, why can't an international company have an international warranty
Oh goodness it has 3 power connectors
I'm gonna have to do some rewiring for that
Eh whatever I'll figure it out
You dont need to use all 3
Thats mainly there for OC

I see
you can lose perf ceiling with less power
Not like I care about that, it's just gonna run machine learning
I can afford to lose a couple minutes from a 1 day training run
there should be a spec chart somewhere that says how much you lose
Maybe i should keep this 3090 with more power connectors, and sell mine with only 2 to superbox 
:V
You only need it for the vram anyways
As long as it's got a 3090 core and VRAM I'm fine with it, if it's not so big it literally won't fit in my PC
Looks like the Asus one doesn't have a raised TDP, so it should work just fine even with just 2 connected, as it gets 300W from 2x8-pin and 75W from PCIe
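The arithmetic behind that, as a quick sketch: 150 W per 8-pin PCIe connector plus 75 W from the slot. The 350 W figure is the reference RTX 3090 board power and is an assumption here, since the chat only says this card has no raised TDP.

```python
# PCIe power limits: 150 W per 8-pin connector, 75 W from the x16 slot.
EIGHT_PIN_W = 150
SLOT_W = 75

# Assumption: stock RTX 3090 board power, not a factory-overclocked limit.
REFERENCE_3090_TDP_W = 350


def power_budget_w(n_eight_pin):
    """Total power available with n 8-pin connectors plugged in, plus the slot."""
    return n_eight_pin * EIGHT_PIN_W + SLOT_W


budget = power_budget_w(2)
print(budget, budget - REFERENCE_3090_TDP_W)  # 375 25
```

So two connectors leave about 25 W of headroom at stock limits; the third only matters if the power limit is raised.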
oh dang 3090s some had 12vhpwr and some didnt huh
12VHPWR? It didn't exist when the 3090 was made
The 12 pin connector on some 3090 models is a proprietary Nvidia connector
im saying 12vhpwr but i actually mean all 3 of that proprietary original, the flaming one, and 12v2x6
Ye ive decided, im keeping the ROG one and selling my inno3d ichill x3 3090
Oh I see
Cuz the rog one has 3 dp and 2 hdmi
Should work pretty much well for my purposes
Expect it to take like a week or 2-3 to get here, and then ill test it
I see
Brick
mm yes a cooler with a board
if (force_choice(get_barter_items(), "You are bartering. " + get_gold_text() + ". Choose item to " + get_barter_type_text() + ". " + get_items_we_cant_buy_text(),
why is get_items_we_cant_buy_text executed before get_barter_items 
if its on the right from it
In C++, the order of evaluation for function arguments is
unspecified (before C++17) or indeterminately sequenced (C++17 and later)

oh wait is it different thing
can they not just do it left to right like normal people
python does left-to-right evaluation with operator precedence
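A small sketch of what that looks like in practice. `tag` and this `force_choice` are hypothetical stand-ins, not the functions from the C++ snippet; the point is only that the side effects land in source order.

```python
# Records the order in which the argument expressions actually run.
calls = []


def tag(name):
    """Append name to the call log and return it, so the side effect is visible."""
    calls.append(name)
    return name


def force_choice(items, prompt):
    """Hypothetical stand-in for the function in the C++ snippet."""
    return items, prompt


# Python evaluates argument expressions left to right, and the operands of a
# chained + left to right as well, so the log order is deterministic.
force_choice(tag("items"), tag("gold") + tag("barter_type") + tag("cant_buy"))

print(calls)  # ['items', 'gold', 'barter_type', 'cant_buy']
```

In C++ the same call is allowed to run the argument expressions in any order, so the usual fix there is to evaluate each one into a named local first and pass the locals.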
pemdas 
leaving order unspecified allows some optimisations
true
im gona soo them
python functions arent pure

im just calling that function for dummy result so it properly refreshes
python town bicycle
who said they are 
order has to be specified when value impure, otherwise randomiser
toast probably didnt mean completely unspecified 
still
as far as my understanding
must eval arguments before function

ye but python ltr specified

as fitting for a slow language 
i was specifically talking about c++
well, depends on
python is fast!!!
-# when you use it to wrap c and fortran
operator
some of them are right to left
and some of them arent

the important thing is that binary plus (addition) is ltr
though unary plus is rtl


down-to-up
buss it
leftern
rightern
uptern
downtern
where does it matter for addition to be non ltr or ltr
yeah...
vectors?
if order of operations doesnt matter, you can reschedule them for more optimal codegen
eastern evaluation order (left to right)
oh right then ltr would be western
f(x y
z w)

131768 context (128k)
compute bound not mem bandwidth bound
I've got about 96K CTX on it
That's 100% q8 context
it's not
i set it to fp16 purposely
llama_kv_cache: size = 20600.00 MiB (131840 cells, 40 layers, 4/1 seqs), K (f16): 10300.00 MiB, V (f16): 10300.00 MiB
sched_reserve: reserving ...
Math isn't matching. Context size at f16 is a static size. Not quant related
unified kv cache probably confusing you
idk
math maths for me
KV cache per token per layer: 2 (K+V) × 8 (kv heads) × 128 (head dim) × 2 (fp16 bytes) = 4096 bytes = 4KB
per token across all layers: 4KB × 40 layers = 160KB
131,840 tokens × 160KB = ~20.1GB
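The same math as a small function, for plugging in other context sizes. It assumes a plain f16 KV cache with no sliding window or cache quantization; the head count, head dim, and layer count are the ones quoted in this message.

```python
def kv_cache_bytes(n_tokens, n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    """Size in bytes of a plain KV cache (no sliding window, no cache quantization)."""
    per_token_per_layer = 2 * n_kv_heads * head_dim * bytes_per_elem  # K and V
    return n_tokens * n_layers * per_token_per_layer


MIB = 1024 ** 2
GIB = 1024 ** 3

size = kv_cache_bytes(n_tokens=131_840, n_layers=40, n_kv_heads=8, head_dim=128)
print(size / MIB)            # 20600.0
print(round(size / GIB, 2))  # 20.12
```

That matches the 20600.00 MiB llama.cpp reports for 131840 cells, and the same call with 161024 tokens gives 25160 MiB.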
```
mistral3.attention.key_length = 128 (head dim K)
mistral3.attention.value_length = 128 (head dim V)
mistral3.block_count = 40 (layers)
```
Oh wait it's llama4 kv. Devstral2 is special. I was closer to 160k kv
Yah no
did you know that they have a 150 qubit quantum computer already? It was news to me at least https://meetiqm.com/products/iqm-radiance/
someone put out an open source quantum computer like
last week
well, idk if it was open source, but you can buy one it's diy
EduQit Build your own Quantum Computer. Talk to our Sales Team Deploying Quantum Capability in Education and Research Quantum computing education is often limited to theory, simulators, or remote cloud access. While these tools are essential, they do not provide exposure to real system operation. EduQit enables education and research institution...
wait what?
me right there
my own quantum computer at home? 
what do i need
I want one so bad
it's just that ez

they look so cool too
wtf
this sparks joy
how many qubits can you get with that though
oh
why cant they put estimate price so i dont talk to their team just to know its 5 billion usd
we aint error correcting with this one boys
i read it costs about ~$1MM
oh no MM is right
im so smart
about a milly
lil wayne wrote his hit song a milli about this computer
I'm getting it tomorrow then
EduQit is designed for universities, research institutions, and public-sector organisations
5kkk sounds like a highball for this profile
First, Devstral2 maxes at 256k before hitting max native support. Second, 161k is 38GB of VRAM for just KV at f16; with a 19GB q6_k you are 9GB over that at 161k vs the 10GB W4A16 GS64. Third, the 128k in 48GB was Mistral 3.2, not Devstral2. Devstral2 uses a hybrid Ministral 3 / Mistral 3 architecture with a sliding attention scale and other changes, so it's not apples to apples.
Regardless 300k will not fit in 48gb
is there an unofficial korchynskyi
Devstral2 maxes at 256k before hitting max native support.
not in every checkpoint, the llama ones max at 393216
161k is not 38gb for me when i literally load it up with 161k so idk what's wrong with it but seems vllm is missing some optimizations 
given that the model is almost 10gb larger i'd expect to be using more vram with that context
just using lmstudio to show the gui cuz it's easier
models have different context sizes in different forms man that's always been a thing
Math says 300k is 45.6Gb for just kv
You aren't using actual Devstral2 if it's a Llama based model
it's not llama based
It uses Llama attention not a Llama base
it's a checkpoint directly from mistral
You are 100% not preallocating
i mean i am so
it shows how it's calculating it as it reserves it
doesnt look like that says llama to me
that's based on max positional embedding length, seems they dont think it's good to use that so it probably does realistically max around 256k, i wouldn't trust a mistral model at such a long context anyways but that's just what the model reports so that's what it lets me set
I'm talking full prefill that's runtime residency
fghdsghsghs
Besides that mistral literally states "vLLM is the most efficient and reliable way to deploy Devstral locally" so I have a feeling that something is wrong. My guess is no full prefill, kv dtype, or it's silently spilling into ram
q8 model with 161k is still not 48gb :^)
it's going into ram, but the total size is not >48gb, my gpu is only 32gb.
```
load_tensors: offloading 39 repeating layers to GPU
load_tensors: offloaded 41/41 layers to GPU
load_tensors: CPU_Mapped model buffer size = 680.00 MiB
load_tensors: Vulkan0 model buffer size = 23206.58 MiB
2026-03-22 10:11:54 [DEBUG]
common_init_result: added </s> logit bias = -inf
2026-03-22 10:11:54 [DEBUG]
llama_context: constructing llama_context
llama_context: setting new yarn_attn_factor = 1.0000 (mscale == 1.0, mscale_all_dim = 1.0)
llama_context: n_seq_max = 4
llama_context: n_ctx = 161024
llama_context: n_ctx_seq = 161024
llama_context: n_batch = 1024
llama_context: n_ubatch = 1024
llama_context: causal_attn = 1
llama_context: flash_attn = enabled
llama_context: kv_unified = true
llama_context: freq_base = 100000000.0
llama_context: freq_scale = 0.0208333
llama_context: n_ctx_seq (161024) < n_ctx_train (393216) -- the full capacity of the model will not be utilized
2026-03-22 10:11:54 [DEBUG]
llama_context: Vulkan_Host output buffer size = 2.00 MiB
2026-03-22 10:11:56 [DEBUG]
llama_kv_cache: Vulkan0 KV buffer size = 25160.00 MiB
2026-03-22 10:11:57 [DEBUG]
llama_kv_cache: size = 25160.00 MiB (161024 cells, 40 layers, 4/1 seqs), K (f16): 12580.00 MiB, V (f16): 12580.00 MiB
sched_reserve: reserving ...
2026-03-22 10:11:57 [DEBUG]
sched_reserve: resolving fused Gated Delta Net support:
2026-03-22 10:11:57 [DEBUG]
sched_reserve: fused Gated Delta Net (autoregressive) enabled
2026-03-22 10:11:57 [DEBUG]
sched_reserve: fused Gated Delta Net (chunked) enabled```
161024
vllm's efficiency (and sglangs to an extent) is mostly about serving multiple instances of a model at once
it says llama all over the logs because it's....llama...cpp................... it uses that no matter what it's loading
i just pasted the entire text of pride and prejudice into the chat
which is about 159k tokens
it's at 102% of 161k
ram allocation has not changed
You are spilling into ram at 161k at q8, and q6 is only 5gb smaller. 300k absolutely will not fit
i already said im spilling into ram obviously
gpu has 32gb of ram
but
it's not over 48gb
im talking 161k here, not 300k
lm studio is using about 42gb of ram + vram
chewing on this pride and prejudice
has there ever been an open source actual alternative to notion
So you're using 74gb not 48
What is that
the model + kv
42+32gb vram =74Gb
42 is 32+ 10
32vram, 10 ram
over the entire system with vivaldi open at like 70 tabs im only using a grand total of 31gb of ram right now WITH the offload in there
Something is getting compressed, there is an open issue that llama.cpp ignores overrides
it's not, i promise
why dont you go try it
and report back
instead of trying to gaslight me into seeing a different set of numbers on my screen as i am doing exactly what you're refusing to believe and showing you
Vulkan0 model buffer size = 23206.58 MiB
Vulkan0 KV buffer size = 25160.00 MiB
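To settle the 42 vs 48 question from the logs themselves, a sketch that sums the `buffer size = N MiB` lines per device. The regex and `buffer_totals` are illustrative only, and the sample text is the four buffer lines from the q8 load pasted above.

```python
import re

# Matches e.g. "Vulkan0 KV buffer size = 25160.00 MiB" and captures device, kind, MiB.
BUFFER_RE = re.compile(r"(\S+)\s+(model|KV|output) buffer size\s*=\s*([\d.]+) MiB")


def buffer_totals(log_text):
    """Sum the 'buffer size = N MiB' entries per device name."""
    totals = {}
    for device, _kind, mib in BUFFER_RE.findall(log_text):
        totals[device] = totals.get(device, 0.0) + float(mib)
    return totals


LOG = """\
load_tensors: CPU_Mapped model buffer size = 680.00 MiB
load_tensors: Vulkan0 model buffer size = 23206.58 MiB
llama_context: Vulkan_Host output buffer size = 2.00 MiB
llama_kv_cache: Vulkan0 KV buffer size = 25160.00 MiB
"""

totals = buffer_totals(LOG)
print(round(totals["Vulkan0"], 2))         # 48366.58
print(round(totals["Vulkan0"] / 1024, 1))  # 47.2
```

So the q8 weights plus the f16 KV cache at 161k request about 47.2 GiB under the Vulkan0 device, which is more than a 32 GB card holds, consistent with part of it living in system RAM.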
That's not 42
48gb
Is it 42 or 48
what am i reading
that's with the q8 model
so both?
hascrack was very excited to share that vllm being the superior inference engine can fit a model in the 10-20gb range AND 131k context in under 48gb
and now literally cannot even when llama.cpp also can do that
that is weird
A non-llama.cpp user just literally can't accept that llama.cpp is more memory efficient
honestly i would've figured the same as well
i didnt even set out to argue it, i was just curious to see since he gave a specific example of a model & context
it turns out his model was actually smaller than this one even
Like what else can make a 27B model run in under 22GB of VRAM with high CTX?
he stated his was about the equivalent to a q6 but half the size
so i just loaded up a q6
so by all means it should be larger in memory all else equal if llama is the same measure of efficient
What token per second with offload (generation not prefill)
like 30
But you also have some 3090Tis and whatever?
even with the model fully in memory i never hit 100tok/s
on a model this size really
How much is the mem bandwidth on that thing?
Yeah with a 20B range model you can expect around 30 t/s level speeds
Yah I'm at about 2Tbps but have TP overhead
I hadn't pulled logs because I hate myself and want to play league and logs are on other os
it didnt know of the popular ya mama framework 
Lmao
i did have plans to set up a battle royale of these things and just slap them on an h100 or something so there's no possible overhead issues
and see how they did myself
havent done it yet tho
I'd have to say if it didn't say Devstral 2 I'd swear that was chat gpt
i use llama most of the time because it works well enough to serve models<~50b or so with good speed and no effort
on this card
Ah I see
vLLM is designed for batch inference -> sending multiple requests at once and processing them in parallel
Outside of that llama.cpp is better
yeah that's why i was saying, vllm and like sglang etc are definitely way more efficient in that arena
for like a single user the difference shouldnt be that dramatic
although i dunno, given the difference between llama.cpp ROCm vs vulkan
it can be dramatic
for the record ROCm (HIP) version is way shittier
Mistral models are easily tunable and punch above their size in specific areas. They aren't the best general model (although I'm itching to try the new MOE one) but for specific tasks they are great
let me try to load this thang with rocm/HIP llama
it'll probably explode
if i guess
I'd say try Exllamav3 but I'm pretty sure that's primarily nvidia
fuk
forgor
HIPIFY!!!!!!!!!
rocm moment
build cuda version
then ghettoblast it
ahahhaahahahha
i forgot this main reason i never use it
it can't have any overflow
I still think the most satisfying thing I've done is build flash attention 2 on my server. Took less than 10 minutes
it just explodes
overflow? when you can have overflow what?
the other day i managed to build rocm7.2 in about 18 minutes
that was fun
i was expecting it to take forever
Flash attention 2 was building on 72 cores and eating over 500gb of ram
if the model (weights + KV cache) is larger than the gpu's vram, ideally it just sends the extra to system ram
it's slower but you at least can load it and use it
slower by a lot
for runtime?
so instead of like 50-100tok/s for example you might get like 20
yeah inference
or less
for runtime it is reasonable actually 
See I have throughput brain rot to the same level as Vedal has latency brain rot. I don't care how nice it is, I won't sacrifice my throughput
the thing is i never really use local models for much but busywork that benefits from throughput but it's not like im sitting there waiting for it to finish usually
cuz i just pointed it at my documents folder and said "halp im lazy, organize??" and that doesnt take very long even at 30 tok/s
or make it watch a training run
which requires no speed at all
cuz that's just waking up every like 20+ minutes to check
See I use Devstral2 to brute force code review for stupid errors. A task that would destroy claude code or codex's usage
i pay way too much a month for that so i dont have that issue but also am wasting money on it
I have 2 codex plus subs and a claude $100 sub. Still come close every week
i was going to reply obsidian to this but im positive you'd know of obsidian and i dont actually know what the difference between them is other than that notion likely is way larger scope
obsidian is neither
i thought it was open source
open source nor an actual alternative
appflowy seems like one
databases
logseq?
https://logseq.com they have the thinker on their homepage so theyre kinda based
for my purposes i'm still gonna stick with notion
but good to have some in mind for the future
saw these too
anytype seems cool
one inconvenient thing about notion is that they can nuke my account at some point if i log in without a vpn
they for some reason decided that's the best way to do sanctions compliance
literally happened to like multiple people i know
for some reason i didnt realize you're in russia
i see ussr domain name now though kekw

hey no judgement here, i'll never judge a person by the actions of their idiotic government cuz if i held myself to the same standard it'd be so joever
sad that icann is gonna phase it out tho
judgement where
i didn't see any

good
yes
i always feel like as a burger that i am predisposed to appear that way by accident so i gotta pre apologize
not hackers AND scammers
now i want a .su
to post phonk on
2050 mobile so only 4gb vram
would have to offload rest
there was a real period of time i was hype thinking the global nightmare of tensions for no reason was definitely ending soon
there are a lot of places i'd like to visit
not that it's impossible
but
you know, im not surprised
i have .ru and .online domain without registration by рег.ру
i am lazy
i thought you have to submit proof of citizenship for .ru
no
as for like the majority of cctlds
you just write shit on third party website as ids
you definitely had to
back in the day i had a couple co ccs that were just free
and now theyre like 19082347198273918 dollar
well
i dont even think you can register them now
idk what happened
oh lord
LMFAO
🍔
well i could, but it isnt completely fake, real phone and trace-able payment system
but as fact you can.
i set up sonos speakers to act as a raop sink in pipewire so i just hav it as an output device
which means if i max its volume and dont turn down the volume of the source, it maxes their volume inherently
and i just went deaf
because i wanted to hear phonk after mentioning it
funny thing is idk if i should say this i know an actual previous cybercriminal now because he used to own one of my .su domains and reached out
connections
bro had feds raid his house
spspspsp
his satire was too powerful

brrrrrrrrrrrrzzzzzr

we've injected a cve into your cve tracker
you are welcome
welcome to the agentic age
where instead of doing work yourself
you must babysit an idiot
or perhaps groups of idiots
they had ai features for a long time, like since 3.5 non turbo even
but agents
in a note taking app
i can see how agents would be useful but i cannot get over their non-determinism
i cant sleep at night knowing some ball of impressionable math is controlling literally anything of importance
without me watching it
:ohidontsee:
Nowaying you can get le answer that you wouldve just searched for and gotten anyways
the funny thing with this stuff is that i feel their idea is that it will make people who cannot into computer
be able to into computer a bit
but we had copilot and like all the o365 shit deployed at work
and boy howdy
do they avoid using it for anything of note
whole hell of a lot of slop posters and images flying around tho
you can literally show these people how to ask the oracle for something if they want, ive attempted it
and they
do not understand
somehow
but it makes sense because these are the same people that simply cannot fathom reading an error on their screen and that it might give a clue as to what they need to do
so of course when you give them a spooky ghost in a chat they wont know how it can be of any use and even if they vaguely know it can do anything they are too nervous to do it
maybe that's me with agents atm
idk
am i a boomer
possibly
one day ill finish helmstack and then i'll be one with the agents
how does this compare to like linear
i have used that
and dash? i tried that a long time ago cuz it was supposedly a project management thing w/ ai but i didnt see the benefit
notion is less purpose built than linear, linear is like for work, notion is more for anything
on a very high level
engines are dishwasher safe
yeah, linear is definitely like product/project planning focus
one thing i used for a bit to try that i thought was cool but eventually said f that was called pieces
and it was basically microsoft recall but voluntary
it did be gathering knowledge tho
agree
just like the humble motherboard
this is cool but i got so paranoid so fast
fuckin lol
decided to ask echo this
quantize ya mama 
memory efficiency: very low

might want to install a gpu anti sag bracket before deploying
shouldve put “ease of use” as very high for ya mama
kek
@real sierra finally mentioned again organically, congratulations you're still relevant to the boy
and nice tesla
now get it to say happy birthday organically
i have no idea how to shoehorn that in
i tried
it varies
but this is a 27b
gemma3 base
The 4b gemma3 version is fucking hilarious
But it's like slightly too stupid to do anything useful, not enough capacity to both really dial in the intended style
And be smart
At least with my current dataset

almost pressed enter on rm -rf /* when i actually wanted rm -rf ./* what does this say about my cognitive ability
It's highly optimized
Maybe a bit overfit on that task though unfortunately @rigid snow
But yeah so next round is 9b to see how that splits the difference, then I've gotta enter the qwen 3.5 zone but I'm kinda assuming rdna4 training will be a nightmare with that architecture until stuff is matured more

Gemma4 needs to hurry up
i hate commitment i do rm -rf ./* all the time i'm addicted to it
drop the ./
give me more rm -rf ./*
no i like being explicit that i want THIS folder what if it decides it wants another
what if it hates me
Just drop the . And *
this is exactly how you end up removing / 
Exactly so you just need to adopt the sudo rm -rf / and you'll find salvation
i like .
THIS folder
directory oops
git add /

dir is short for directory chat
Signals that you are literally Linux torbalds
my point is that * is identical to ./* except there's no way to accidentally mistype and yeet something else 
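A rough sketch of the point above, using Python's `glob` module as a stand-in for shell expansion (an assumption: the shell does its own globbing, but the default matching rules are close). It builds a throwaway directory and checks that `*` and `./*` resolve to the same entries, while `/*` points at the filesystem root instead. The `expand` helper and the file names are made up for illustration.

```python
import glob
import os
import tempfile


def expand(pattern, cwd):
    """Expand a shell-style glob relative to cwd, returning sorted absolute paths."""
    old = os.getcwd()
    os.chdir(cwd)
    try:
        return sorted(os.path.abspath(p) for p in glob.glob(pattern))
    finally:
        os.chdir(old)


with tempfile.TemporaryDirectory() as d:
    # Hypothetical directory contents, including one dotfile.
    for name in ("a.txt", "b.txt", ".hidden"):
        open(os.path.join(d, name), "w").close()

    bare = expand("*", d)      # what `rm -rf *` would target
    dotted = expand("./*", d)  # what `rm -rf ./*` would target
    rooted = expand("/*", d)   # what the one-character typo would target

    same_targets = bare == dotted
    hidden_skipped = all(not os.path.basename(p).startswith(".") for p in bare)
    root_differs = rooted != bare
```

Neither pattern matches dotfiles by default, so the only practical difference is that `./*` sits one dropped character away from `/*`.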
look i like gambling ok
fair
Live life on the edge compute node
make sure you do everything as root too
%sudo ALL=(ALL:ALL) NOPASSWD: ALL literally just pasted this from my visudo

i have pondered doing that on every reinstall
but then i get paranoid
it sounds hella convenient tho
--dangerously-skip-permissions my beloved
claude code doesnt have a terminal tho so it's still limited
unless it thinks to write a script and execute it to get around it
i have
custom $SUDO_ASKPASS
it just prompts me to confirm
pulling the password from keepass
peak security 

Technically they do hallucinate sometimes (or do work wrong) and Linus yells at them for that
Much like me and slop engine
https://www.youtube.com/watch?v=ldxFjLJ3rVY
he just talks through some concepts behind a simple algorithm
while there's literally no way to make any img look like this
sure you can create a simple function that converts original to mapped
but to achieve this you'd ideally need the 'inner' image already mapped in the first place
anyway it's so cool
not a technical difficulty
https://projectzero.google/2026/02/gphfh-deep-dive.html
This seems awfully useful
Actually not that useful, as you still have to have a handle open to the target to do anything useful
I mean, i dont like it, but i prefer it over modern browsers, but yea.. the exasperated tone could have been due to a lack of time and too many projects that i wanna do.. classic issue
Yeah fun thing, we had many ryzen 9 5950x die "recently".. aka sporadic issues with boot and bsods but occasionally theyd run for days on end perfectly fine
Oh I see
Changing the ram sometimes fixes it, especially running just one stick, but like
Yea Something is fucky
wanted to play with some llms designed for role playing
the 5950x is technically better than the 11900kf. but a dead cpu isnt worth anything
anyone have an opinion on llama stheno 8B vs mistral nemo mag mell 12B
lol no big deal i feel i was too harsh in saying you were moaning about it, but i didnt want to edit it since it felt disingenuous, ive been there, my apologies tbh
are these creative writing models
i have no idea what im doing to evaluate them
which is funny considering echo is basically a form of one
That requires you to be creative and see if it matches your energy.
since im selling the 3090 to superbox, i wonder what i should put in it's place 
Can I steal it
creative writing yeah
my google searches claim they’re built to be less cringe than actual rp models
ideally something faster and cheaper.
the 3090 is only expensive rn cuz of the 24gb of vram,
3080ti are only 300 bucks
that you might see on like girlfriend ai sites
Faster AND cheaper
ye
it's really easy to see if it matches my energy when the target is my energy
I can offer you $25
in other words, no 24gb of vram
Let me see if I have any Virdience lore I am comfortable using as a something to stress test the creative model
mi100
oh you said cheaper
no
umm
its for gaming
i need only like 16gb of vram, if even that
Nah youre all good, no apology needed.
@leaden crest Throw this into the creative model, I'll be throwing it into echo
The Extra Dimensional Physics exist within two higher dimensions, currently unnamed; I like to refer to them as 4d and 5d. We will be discussing these as a 3d cross section along with the axis of time. We'll be dealing with this space in 3 different magnitudes, Magic, Time-Travel, and Realms.
Magic, in this case Mana itself and Souls are considered extra dimensional. The Soul gives mana to the vessel, be it human, crystal, or spirit. The activation of magic shifts the soon to be coined world line.
-# The exact nature of mana isn't decided on yet, could be generated from the soul itself or could be harvested from Unspace, which honestly fits for mana.
-# I'm sorry in advance for abridging this.
The basis of time travel is many world lines exist and when you travel through time, you shift what line you exist within. This offsets the world line, changing your position in this higher level space.
Realms being differing planes of existence, each with their own universe inside. They move thanks to offsets either Magic or splitting of worldlines. When they collide, portals are generated and can either merge or they can be rebounded; this depends on the strength of the realms, if they don't have enough strength to hold themselves together they'll be forced to merge; rewriting history and matter from both halves and continue as one realm.
Unspace as mentioned earlier is the space between world lines and realms, nothing is supposed to exist out there; any foreign objects will be removed from existence. However it does exist, and it can be used. If the matter of Unspace is brought out of such it'll instantly dislocate anything it contacts with into Unspace.
its for 4K gaming in newer triple A titles. so thats why i want ideally faster than 3090

I thought echo was funyun’s
alt identified?
i also have access to echo lol
wait you want me to prompt it or add it to the lora
Prompt it to see how it'd extend it

how tf did I get in there

little known trick
alr bet I will when I try it later
is it supposed to do something
fun fact, you can put a \ in front of a ping to get the raw string
wait is echo responding to the personality tests in my discord bio
he got multiple messages at once and many things slid off his smooth brain
Still pings though, so be careful
since when do blower style 3090s exist? 
"turbo edition" the only thing going turbo will be the noise coming from your pc
recommended
echo watches streameroid labs
Honestly? That would go kinda great in a system like mine
its 800 bucks so outside your budget
Aw
Oh well
I guess at least since I know it exists I can think about tracking one down later
Blowing all the hot air out the back of the case would work amazingly to stop putting hot air into the 4070Ti
cheapest rx 9070xt i could find is 575 euro 
so cheaper than normal 3090s, but more than id be getting from superbox
i dont think i even need to look at the prices of 5090, 5080 or 4090s
oh you're right it does, good to know 
i'll save up and probably get a framework desktop board instead of wasting time with gpus
Yeah I think someone pinged mods like that in some server
i like framework and all that, but i just don't think the framework desktop is a good deal for my purposes
if you're training then it's for sure not the best, but i don't think you can beat the price for inference
the only closest contender would be macs
i think the "best deal" i have for inference, is not spending any money at all and using my current pc
Are you sure it isn't for a projector 
yes im sure
hungry
i can use my desktop for that
i use a rx7800xt for gaming, it's been pretty good
im not attaching a whole gaming pc to the volumetric display, im pre-rendering that shit
It gets worse with the guild id :misschievous:
this volumetric display project is already 1.5K deep, i dont want it to go up to 2.5K
Since when? Afaik it used to only ping visually, not via notification bubble and sound-
Someone pinged mods like that in some server
Yea no ping, just a highlight
Also tf my mobile app cooked
Didnt parse it for whatever reason.. i miss v185
Yea with guild id u do at everyone.. its stupid
that's neat
Yes, it is, if they still dont actually ping
A small bit of yapping
But apparently they changed it according to some here?
in isolation he's just like "ok"
Hm whats the consensus on Q5 vs Q4 btw? Is the Hardware nativeness worth the loss?
someone has probably measured that for you
really depends on the model and quantizer strat i think
just checked again to make sure, it does in fact ping 
the user ping that is
I mean yes, q4 is like ~3-5% loss while Q5 seems to be >~1% according to many graphs, but is the speed up on turing and beyond in yalls Opinions worth it
Enshittification, also known as platform decay, is a process in which two-sided online products and services decline in quality over time. Initially, vendors create high-quality offerings to attract users, then they degrade those offerings to better serve business customers, and finally degrade their services to both users and business customers...
i lean towards squeezing maximum quality and sanity out of the lowest speed i can tolerate, but i use it mainly for research tasks, so i am fine if it's fairly slow. currently using a Q8 at 8tps
had to encourage him and then the client didnt like the length of his response or something and it never actually sent but this is what he wrote in the logs:
High Concept
World Ruleset: The universe follows fundamental laws that sentient beings can bend through power or knowledge. Magic is not a violation of these laws but a force that reshapes reality alongside them.
Cosmology: A multidimensional structure where 3D space and time are interwoven with higher planes. These dimensions serve as the foundation for magic and supernatural phenomena across various realms.
Sentience and Consciousness: Consciousness is not restricted to biology; it can emerge from magical entities, spirits, or inanimate objects. The mind-body connection is fluid, allowing for astral projection, possession, and collective consciousness.
Magic System
Source of Mana: Mana is an ethereal energy flowing from higher dimensions. It is a dynamic force that responds to the intentions, emotions, and collective will of sentient beings.
Types of Magic:
* Elemental: Control of fire, water, earth, and air.
* Temporal & Spatial: Manipulation of time and warping of space.
* Necromancy: Interaction with death and spirits.
* Enchantment: Imbuing objects with magical properties.
Magical Creatures: A diverse population ranging from benevolent angels and fairies to malevolent demons. These beings often act as guardians or follow specific primal instincts and cosmic codes.
Worldbuilding Details
Cosmology Layers: The multiverse consists of interconnected planes accessible via portals or rifts. Each layer features unique physical laws and magical properties.
Realms and Worlds: A vast array of environments ranging from paradises to hellscapes. These worlds are shaped by varying levels of magic and technology, leading to complex inter-realm trade and conflict.
Factions and Organizations: Various groups compete for power, knowledge, or ideological goals. Their agendas range from preserving cosmic balance to tyrannical expansion.
Rules of Interaction
Magic vs. Technology: Both coexist and can create powerful synergies, though they operate on different principles. Technology is predictable and quantifiable, while magic is volatile and harder to control.
Physical Laws: While generally consistent, laws like gravity and time can vary locally or be warped by magical influence in specific dimensions.
Sentient Beings: Individuals possess a wide spectrum of magical, technical, or physical gifts. Growth is limited only by willpower, though the abuse of power typically results in "dire consequences."
qwen does the car wash correctly, i wonder if it got that trained in 🤔
he really said hop on voice i gotta hear more
i'm fairly annoyed at qwen's reasoning process though, but i don't have the resources or knowledge to retrain it
also, the trend is nonlinear, there will be a wall in either memory bandwidth, capacity or compute depending on your hardware, and the quality exponentially degrades the smaller your quant, so just test them all and see what gives you best of both, at a speed you can tolerate for the tasks you're using it for
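For a back-of-envelope version of that tradeoff, here is a minimal sketch. It assumes a dense model, ignores KV cache and activations, and treats decoding as purely memory-bandwidth bound, where each generated token reads every weight once. The bits-per-weight figures are rough placeholders, not measured values for any specific quantizer, and the bandwidth number in the usage line is hypothetical.

```python
# Approximate bits per weight per quant level; illustrative assumptions only.
BITS_PER_WEIGHT = {"Q4": 4.5, "Q5": 5.5, "Q6": 6.5, "Q8": 8.5}


def weight_gib(params_billion, bits_per_weight):
    """Size of the weights alone in GiB (no KV cache, no activations)."""
    return params_billion * 1e9 * bits_per_weight / 8 / 2**30


def max_tokens_per_second(params_billion, bits_per_weight, bandwidth_gb_s):
    """Upper bound on decode speed if every token streams all weights from memory once."""
    bytes_per_token = params_billion * 1e9 * bits_per_weight / 8
    return bandwidth_gb_s * 1e9 / bytes_per_token


# Usage: a 27B model on a hypothetical 500 GB/s memory bus.
for name, bpw in BITS_PER_WEIGHT.items():
    size = weight_gib(27, bpw)
    tps = max_tokens_per_second(27, bpw, 500)
    print(f"{name}: ~{size:.1f} GiB weights, <= {tps:.1f} tok/s")
```

This only gives a ceiling on speed and says nothing about quality loss, so it doesn't replace testing each quant on the actual task.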
cheapest 4080 super is 750, 4080 is 700
so compared to the rx 9070 xt for 575 these just arent worth it.
and the 4070ti is worse than the rx 9070 xt while also being more expensive.
so 9070 xt might be the way to go
the duality of ai
what kind of carwash is 6k
average california car wash
4080 == 4080 super by the way, but the super had a 200 lower MSRP
he washing a mclaren or something
facts and logic
one where you accidentally leave the window open
that's probably the one where the water comes out and splashes on your toesies
Considering it is one of my projects, checks out
truth
how is ts better than frontier llms like 2 months ago
waow rlhf
amazing technology
the smaller versions tend to just go "??" if given anything too out there
speak slower, 5head
am trying to force mine to think a specific way thru a system prompt
i generally dont give him one unless i undertrained it
that's not usually the issue lmfao
no eating till it's done
Windows finally forced an update on me 
did this update disable internet to all their malware by accident
I have a DNS sink so I wouldn't know
i have a dnr near my sink
i fucked it again
saves time on logistics
Haven't had that issue with the 27B





omg
