#programming
the reason i dont like these concepts as much anymore is because i realized how poorly imperative code composes
but its the best thing we have for bringing C programmers out of the gutter
i was wrong on this btw
small triangles are bad cuz of wasting quad samples while rasterizing.
this doesnt mean you should have no small triangles, this means you should maximise triangle area
so left bad, right good
*except for nanite meshes

i introduced bwaa, konii introduced neuropoghd


branding
it would be funny to brand game-jam 3 game under "bwaa studios"
i need to buy that domain ngl
im poor tho so nah
bwaapoghd
just checked wikipedia and apparently my height is a couple cm above average
like regional average? or just global average?
well i assume async wont be supported right?

epoll is a beast
and theres io_uring too how fun
async doesnt require threads
its just polling the kernel for new data for multiple files at the same time
yeah
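A minimal sketch of the "async doesn't need threads" point, assuming a Unix-like system. Python's `selectors` module picks epoll on Linux. Two pipes stand in for the "multiple files" here, since epoll itself watches pipes, sockets and similar descriptors rather than regular files. `read_ready` and `expected_chunks` are made-up names for illustration.

```python
import os
import selectors


def read_ready(read_fds, expected_chunks):
    """Collect data from several file descriptors on a single thread."""
    sel = selectors.DefaultSelector()  # epoll on Linux, kqueue on BSD/macOS
    for fd in read_fds:
        sel.register(fd, selectors.EVENT_READ)
    received = {fd: b"" for fd in read_fds}
    remaining = expected_chunks
    while remaining > 0:
        # Blocks until the kernel reports at least one readable descriptor.
        for key, _events in sel.select():
            received[key.fd] += os.read(key.fd, 4096)
            remaining -= 1
    sel.close()
    return received


# Two pipes with data already waiting in each.
r1, w1 = os.pipe()
r2, w2 = os.pipe()
os.write(w1, b"hello")
os.write(w2, b"world")
result = read_ready([r1, r2], expected_chunks=2)
for fd in (r1, w1, r2, w2):
    os.close(fd)
```

One `select()` call waits on both descriptors at once, which is the whole trick: no thread per file.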
i think just copy the posix api open read write, maybe creat
will file handles be platform independent?
or will ableos use some special thing for representing files
on posix-compliant systems its just an integer
just write a File object that wraps a file handle, write (write data from a slice, returns how many bytes it actually wrote and may have to be repeated), read (read data from the file into a slice, returns how many bytes it read), open (flags like read, write, append, create should ideally exist, idr much else)
i think for an actual good lsp i might have to use epoll to read files and process requests in parallel but the good thing is i just want a working lsp not a good lsp for now
also paths are their own can of worms but lets just work with unix paths for now
also stdout/stdin should ideally have the same read/write API as files (should be natural since stdin/stdout are just files internally)
oh also flush is important
and the bad thing is, all of the io operations can fail
and i dont want the lsp to crash when it fails to read a file
oh and a rename (move) api might be useful for atomic filesystem writes
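A rough sketch of the API described above, written in Python on top of the POSIX-style `os` calls rather than in hblang. Every name here (`File`, `write_all`, `atomic_write`, `read_file_or_none`) is hypothetical, and mapping `flush` to `fsync` is just one possible choice for an unbuffered wrapper.

```python
import os


class File:
    """Thin wrapper around an integer file descriptor (hypothetical API)."""

    def __init__(self, fd):
        self.fd = fd

    @classmethod
    def open(cls, path, read=False, write=False, append=False,
             create=False, truncate=False):
        if read and (write or append):
            flags = os.O_RDWR
        elif write or append:
            flags = os.O_WRONLY
        else:
            flags = os.O_RDONLY
        if append:
            flags |= os.O_APPEND
        if create:
            flags |= os.O_CREAT
        if truncate:
            flags |= os.O_TRUNC
        return cls(os.open(path, flags, 0o644))

    def read(self, n):
        return os.read(self.fd, n)       # empty bytes means end of file

    def write(self, data):
        return os.write(self.fd, data)   # may write fewer bytes than given

    def write_all(self, data):
        view = memoryview(data)
        while len(view) > 0:             # repeat until everything is written
            view = view[self.write(view):]

    def flush(self):
        os.fsync(self.fd)                # assumption: flush means "hit the disk"

    def close(self):
        os.close(self.fd)


def atomic_write(path, data):
    """Write to a temp file, then rename it over the target."""
    tmp = path + ".tmp"
    f = File.open(tmp, write=True, create=True, truncate=True)
    try:
        f.write_all(data)
        f.flush()
    finally:
        f.close()
    os.replace(tmp, path)                # the rename (move) step


def read_file_or_none(path):
    """Return the file contents, or None instead of crashing on I/O errors."""
    try:
        f = File.open(path, read=True)
    except OSError:
        return None
    try:
        chunks = []
        while True:
            chunk = f.read(65536)
            if not chunk:
                return b"".join(chunks)
            chunks.append(chunk)
    except OSError:
        return None
    finally:
        f.close()
```

Readers of `path` see either the old contents or the new ones, never a half-written file, because the rename happens only after the data is flushed.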
hm do I need to prepare a june variant of my cat pfp
i have mobile detection now i guess
you can at least optimize by copying in 8 byte chunks but not much more i guess
or or or
need a way to jit code
then you can just write the assembly
actually
doesnt hblang have function pointers
if so you can already do it 

yeah but you have to find the right one
find a compiler bug that emits wrong assembly 
can you get a function address at least
how do i compile for x86
wait i just realized i can probably overrun the stack and execute arbitrary code 
this is clearly the best solution for optimized memcpy
well, time to derust my binary exploitation skills
i think ROP is the most realistic approach here

works 
tablet still gets recognized as not a phone so i might need to fix that idk
with shadowmapping the tablet gets 70fps, and the phone gets 0.5fps. so for phone it needs to be turned off, but on tablet its fine to be on imo
this depends on the tablet tho so i might have to fix this someday
oh no... this code sets up a []u8
i think it might be fine at the end
i just want a buffer on the stack that i can overrun...
no it has to be on the stack
i already did it but i just want asm thats easier to work with
i was wondering why my heightmap didnt work on mobile
In Google Chrome on mobile, the maximum texture size allowed for WebGL is generally 4096x4096 pixels.

why must mobile always cause extra issues?
Because SMOL
@ bwaa fix chrome please 
progress
You may just have to create a tiled texturing system specifically for use on mobile devices
i successfully jumped to the shellcode but it segfaults on first instruction
probably have to make the page rx (or rwx)
fun fact, mobile firefox maxes out at 2048x2048
just checked
guess it depends on the gpu and/or driver
or maybe not actually, chrome says 4096 idk
FUCK FUCK FUCK
why must things be this way

i just want heightmaps to work
do i just make a lower quality version that gets used on mobile? i guess that works...
is cutting it into multiple textures not option?
kinda
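A small sketch of the "cut it into multiple textures" idea: compute tile rectangles that each fit under the device's maximum texture size. The 8192x8192 heightmap size is an assumption for illustration, and `split_into_tiles` is a made-up helper, not part of any engine.

```python
def split_into_tiles(width, height, max_size):
    """Return (x, y, w, h) rectangles covering a width x height texture,
    none larger than max_size on either axis."""
    tiles = []
    for y in range(0, height, max_size):
        for x in range(0, width, max_size):
            tiles.append((x, y,
                          min(max_size, width - x),
                          min(max_size, height - y)))
    return tiles


# Hypothetical 8192x8192 heightmap on a device capped at 4096x4096.
tiles = split_into_tiles(8192, 8192, 4096)
```

Each rectangle would be uploaded as its own texture. Sampling across tile edges probably needs a shared border row/column, which this sketch does not handle.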
damn, google has AI code boxes now. and the code is not too bad
idk why it feels the need to split the colour channels tho
this is all i did 
yeah i tried a bunch of things but they didnt work but now that i return @syscall it works
thx
(i tried assigning the result to a variable before, that didnt seem to work)
grr it still segfaults
why wont the cpu just let me have fun
Chrome & Firefox dump courtesy of gpt, idk how useful this is but here you go @olive sable
thanks
my phone is a bit unusual since it has an amd gpu tbf
Mannnn there was an australia ahh looking ~8cm arm span spider under my bed
okay yeah i think something went wrong with the webgl2 check 
desktop firefox should support it but it also says unsupported
it worked when i allocated a separate chunk of memory 
it just didnt work when i used the binary's static memory
time to strip the size of the shellcode executor down and like make it actually return control to the caller
it seems to say webgl 2.0 not supported on my side too
eventho it should be supported on desktop chrome
I love massive unemployment

memcpy performance is more important than whatever useless code the users are writing
i'm getting different results now
it seems to be chrome is limited to 4096, and firefox should be fine
so on firefox my terrain should be rendering and chrome it shouldnt be
Does anyone know what pathfinding algorithms games like StarCraft, League, or other MOBA's use?
A* is the one you should know
you might also want to separate "global" pathfinding and per-chunk pathfinding
its faster if you use a pqueue
Take new dump courtesy of claude, no guarantees though @olive sable
the canonical implementation uses a vector but you can replace an O(N) with an O(log N) if you use a pqueue
Pqueue?
priority queue
fib heap 
a collection where you can add an element with a priority and remove the element with the lowest (or highest if you want) priority
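A minimal A* sketch on a grid, using Python's `heapq` as the priority queue so that taking the lowest-cost node is O(log N) instead of an O(N) scan. The grid format, unit step cost and Manhattan heuristic are assumptions for illustration. Real games typically layer more on top, such as the global vs per-chunk split mentioned above.

```python
import heapq


def a_star(grid, start, goal):
    """grid: list of strings, '#' is a wall. Returns a list of (row, col)
    cells from start to goal, or None if the goal is unreachable."""
    def h(cell):
        # Manhattan distance: never overestimates with 4-way unit moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start)]  # entries are (g + h, g, cell)
    came_from = {}
    best_g = {start: 0}
    while open_heap:
        _f, g, cell = heapq.heappop(open_heap)  # lowest priority first
        if cell == goal:
            path = [cell]
            while cell in came_from:
                cell = came_from[cell]
                path.append(cell)
            return path[::-1]
        if g > best_g[cell]:
            continue  # stale heap entry, a cheaper route was found later
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == "#":
                continue
            ng = g + 1
            if ng < best_g.get((nr, nc), float("inf")):
                best_g[(nr, nc)] = ng
                came_from[(nr, nc)] = cell
                heapq.heappush(open_heap, (ng + h((nr, nc)), ng, (nr, nc)))
    return None


grid = ["..#.",
        "..#.",
        "...."]
path = a_star(grid, (0, 0), (0, 3))
```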
the file

seems to check out ye
so 16K on firefox, 4K on chrome

This is new
so good news, i can see the terrain being rendered on firefox
bad news, why is it so SMOL????

SMOOOOOOLLLLLL
this is very interesting data, but i think none of it matters too much until you hit the cap

If it's good enough for musl it ought to be good enough for me is my motto
Code: Un SMOLifies your SMOL
wow unemployment
hello chat
i am here to once again complain about how perfect the linux sound ecosystem is
it's so good, i love it so much

explain to me how it is possible for sound to travel large distances, to twitch servers, then down to some random persons computer faster than it goes through my sound hardware then into my ear
i am literally just monitoring the audio directly how in the world do i have 8 seconds of latency compared to somebody who is across the planet
how
raaaaa
bro i wish i was joking i actually have video proof
just switch to JACK 
i'm about to add simd to hblang without even modifying the compiler you cant stop me 
and it wont
i successfully made the shellcode return control... to libc 
konii have you considered implementing SIMD anyway
it would be funny
but
fast
speed
wow the efficient local inference LLM i was going to write in hblang is no longer possible
this is true and not 100% made up, but it doesn't change the fact it is no longer possible
what the hell are you guys making
mov rax, 0x3C is to demonstrate that the shellcode does something
and then it just returns from the original function

doesn't hblang have fancy comptime stuff?
just launch the assembler 
solve the halting problem in comptime
no need to worry about runtime execution speed
ideally i'd make it so after running the shellcode overwrites the original function with what it's supposed to be and jumps there 
hblang x86 emulator...
if they do then what i just did is kinda pointless 
i kinda felt like that while implementing the gc
I know that it runs natively on x86
I mean embedding x86 programs to run at comptime or in hbvm so that you don't need to rely on native FFI or IPC 
hm
because they only exist in a meaningful fashion at comptime probably
isnt return code just the lowest byte
okay i guess all i have left to do is make it actually do something useful rather than just write to eax
hello can anyone help me with a python project im working on? its basically a website where i can calculate pp using oppai-ng based on inputted mods, acc, and # of maps wanted. however, i can't seem to get the files to download properly
i guess you can do something evil if you know what order the compiler puts symbols in
I am not getting the concepts of flax
i wrote assembler code that adds two numbers 
well thats easy but it runs in hblang 
code:
mov rax, [rsp - 0x30]
push rbx
mov rbx, [rax + 8]
mov rax, [rax]
add rax, rbx
pop rbx
jmp [rsp - 0x20]
func := fn(arg: ^uint): uint {
    a := arg
    f := &a
    // run_shellcode didnt change
    return run_shellcode(@bit_cast(f))
}
main := fn(): uint {
    arg0: [2]uint = .[1, 9]
    ret := func(arg0.ptr)
    return ret
}


and if i change it to return ret + 1 it actually returns ret + 1
so uh do with this what you want 
simd, syscalls, blowing up the computer, vcvtne2ps2bf16, whatever you want
you are now free
ah depends on known function address
my solution is less prone to breaking 
it still can break easily 
my solution doesnt depend on known addresses it depends on stack offsets staying the same
they can change across compiler versions though
i dont really think i can do better without the ability to call arbitrary addresses
actually i might
i just dont want to
because the solution is looking for the nearest integer in the stack that points to executable memory
and then patching the shellcode to return to the correct address and use the correct input address (both are pretty easy, in fact you can just make the input be a global variable since everything is single threaded)
wait theres a better solution
its adding one more indirection, func2
then then idk i lost chain of thought
oh right
then you overwrite func2's return address to return to the shellcode instead of func, but the func caller's address will stay in the stack so you can just ret
but this depends on the function not being optimized to a jmp
anyway this is cursed and pointless so i'm done 
gonna actually do some work for once
right doesnt it do dfs from the entry point
i mean, if you can run stuff on program startup everything becomes a bit easier
one of the ideas i rejected was making the shellcode rewrite its calling function with itself on initial execution and then jump back to main to restart the program
it would work but might execute some code twice so i rejected it
but it would be worth it for the sheer cursedness
the code i wrote for addition in asm is enough to implement literally anything, no additional code required
because it passes a []uint to the assembly code
which can be used to encode arbitrary data like first item being the function to run and other items being arguments
thinking of ways to use it for evil
wait i just realized i dont need static linking if i have dynamic linking and a shellcode to call arbitrary functions
const char**, probably somewhere in libc
found this code
char **get_argv()
{
    static char **preset_argv=NULL;
    if (!preset_argv)
    {
        extern char **environ;
        char ***argvp;
        void *p;
        asm ("mov %%esp, %0" : "=r" (p));
        argvp = p;
        while (*argvp != environ)
            argvp++;
        argvp--;
        preset_argv = *argvp;
    }
    return preset_argv;
}
extern char **environ is key i guess
oh environ is on the stack too
apparently im able to get context info like max samples and stuff, but not the max texture size
“Awww dammit this system is too hard to crack”
“You're fired”
“What?? Why?!”
“This man is running wfw 3.11”
“… you're joking”
“Nope”
what does this mean?
It means that no firewall matter for me because I'm only safe because hackers just forgot about me
||And also because I made a huge lan brick with a firewall antivirus and htTPS to HTTP encode and decoder ||
I have to learn rust :(
Nope
It's not on Windows 3.11
Hell, they just barely still support Windows 7
why do we have people asking to play roblox now here?
Don't ask me I'm so tired from getting this application running on Windows 3.11
You're just unnecessarily toxic
It really isn't
sure bro
And I bet in everything that you're a nine-year-old who asks your mother for fruit snacks every morning
Surrreee
Are you drunk?
Like bro, you literally just joined
And you just want to ask about Roblox???
And your discord account has existed for three fucking days
Lol
i was wondering why it said no mutual servers lmao
LFGGGGGGGG
Let's be honest he probably was under age
Like who goes into a programming channel in an ai server.. goes up to someone running fucking Windows 3.11 and says “wanaa plaiy rooblx”
Doesn't matter. Too late now
as long as you're 13+ its fine, just dont be an asshole i guess
He was acting like a 5 year old
was a throwaway account
someone just flexing their unemployed status
pay it no heed
I think it was both
They're probably just gonna come crawling back and ask to play roblox again
Or leak the fact that they live at Celtic point worksop S81 7AZ (look up before ban or mod ping and you'll see the joke or just click here ||its a fuckin daycare||)
Lol

Ngl they probably do go there every day for about five hours before getting picked up by two deeply in love heterosexual adults with opposite genders that are related to them
A.k.a. a very in-depth description of a mother figure and a father figure
And them going to daycare
what happened?
Thoughts on how this is going so far?
stdout works, and division will probably make x86 basically 100% usable
not counting potential compiler bugs
It's a very in depth and detailed explanation of a child going to daycare and being picked up by their parents after five hours of being there
was a rhetorical question but ok
-# Hopefully I'm getting better at this tuning thing
screw it close enough its better than before at least

ive tasked myself with implementing right bit shift
with subpar hardware support
why is it so much worse than left bit shift
if i had left rotate, that would actually work
could left-rotate 15 times, and then apply a mask that zeroes the MSB
unfortunately for me, left rotate is very hard when you have uhh
...two registers
2 registers huh....
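A sketch of the rotate-and-mask idea on a simulated 16-bit machine: rotating left 15 times equals rotating right once, and clearing the MSB afterwards turns that into a logical right shift by one. `rotl16` and `lsr1_via_rotate` are illustrative names, and this models the arithmetic only, not the two-register constraint.

```python
MASK16 = 0xFFFF


def rotl16(x):
    """Rotate a 16-bit value left by one bit."""
    carry = 1 if x & 0x8000 else 0       # bit that falls off the top
    return ((x << 1) & MASK16) | carry   # ...comes back in at the bottom


def lsr1_via_rotate(x):
    """Logical right shift by one: 15 left rotations, then zero the MSB."""
    for _ in range(15):
        x = rotl16(x)
    return x & 0x7FFF
```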


so instead
im completely leaning on the stack for this
going to do what's effectively a copy of the upper 15 bits of the input to the lower 15 bits of the output
but loops are hard (two registers!!!)
so im doing it recursively 
stack frames are basically free memory anyway right
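A sketch of the recursive variant, assuming the only primitives are "add a value to itself" (left shift) and a test of the top bit. Each call moves one of the upper 15 input bits into the output, so the recursion runs 15 levels deep plus the base case. The function name and argument layout are hypothetical.

```python
MASK16 = 0xFFFF


def shr1_recursive(x, steps=15, out=0):
    """Logical right shift by one on a 16-bit value, no loops or shifts."""
    if steps == 0:
        return out
    msb = 1 if x & 0x8000 else 0         # peel off the current top bit
    return shr1_recursive((x + x) & MASK16,  # input shifted left by adding
                          steps - 1,
                          out + out + msb)   # output shifted left, bit appended
```

After 15 steps the output holds input bits 15 down to 1 in positions 14 down to 0, which is the shifted value.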
one = stack pointer
one = instruction pointer
and hopefully two more?
instruction pointer is hardware so i dont have to worry
i guess its technically a third register
stack pointer is not hardware, its just at a fixed address in ram
the other two registers are for general stuff
sounds slow...
i dont expect multiplication from what you said 
though multiplication isnt that hard, its just bulky, a lot of bit shifts and adders
depends on the ISA
i mean yeah you can just use the binary exponentiation algorithm for multiplication too
...but you'd have to have bit shift for it to actually be fast
or, well, work

yeah...
lack of bit shift is hindering a great many things
which is why im currently implementing it
left shift is easy enough, just add something to itself

right shift...
multiplication of A by B is basically:
for every 1 position in B, shift A to the left by that amount of bits and add to the result
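The shift-and-add description above can be sketched like this, with results wrapped to 16 bits as on the machine being discussed. Note that it relies on a right shift to walk through the bits of B, which is exactly the operation that is missing there. `multiply` is an illustrative name.

```python
def multiply(a, b, bits=16):
    """Shift-and-add multiplication, wrapping to the given bit width."""
    mask = (1 << bits) - 1
    result = 0
    while b:
        if b & 1:                 # this bit of b is set:
            result = (result + a) & mask  # add the shifted copy of a
        a = (a + a) & mask        # shift a left by one (add to itself)
        b >>= 1                   # move to the next bit of b
    return result
```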
child's play compared to division
though admittedly i don't actually know how division works
the naive way is to subtract the divisor from your value until the value goes negative
the number of times you can do that, minus one, is your answer (if we're doing integer division)
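A sketch of the naive method for non-negative values and a positive divisor: count subtractions until the value goes negative, then take one off the count. The function name is made up for illustration.

```python
def divide_by_subtraction(value, divisor):
    """Integer division by repeated subtraction (value >= 0, divisor > 0)."""
    if value < 0 or divisor <= 0:
        raise ValueError("sketch only handles value >= 0 and divisor > 0")
    count = 0
    while value >= 0:
        value -= divisor
        count += 1
    return count - 1   # the last subtraction overshot, so it does not count
```

For 15 / 2 the value goes 13, 11, 9, 7, 5, 3, 1, -1: eight subtractions, minus one, gives 7.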
i mean yeah or you could bin search the answer if you have multiplication
that seems kinda annoying to work out the logic for
if you're doing something like 15/2
there isnt an exact answer
actually i guess you know how many steps your bin search will take
shouldnt be an issue
What are you working on, that you have to implement such low level details, if you don't mind me asking?
its easy to make a bin search algo for lower or upper bound
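A sketch of the binary-search approach: find the largest quotient `q` with `q * divisor <= value`. It assumes multiplication is available, and it uses a right shift to compute the midpoint, so it would still need that operation on hardware without one. `divide_by_bisection` is an illustrative name.

```python
def divide_by_bisection(value, divisor):
    """Floor division via binary search (value >= 0, divisor > 0)."""
    if value < 0 or divisor <= 0:
        raise ValueError("sketch only handles value >= 0 and divisor > 0")
    lo, hi = 0, value              # the quotient lies somewhere in [lo, hi]
    while lo < hi:
        mid = (lo + hi + 1) >> 1   # round up so the range always shrinks
        if mid * divisor <= value:
            lo = mid               # mid is still a valid quotient
        else:
            hi = mid - 1           # mid overshoots
    return lo
```

For 16-bit inputs the loop runs at most about 16 times, so the step count is known up front, and inexact cases like 15 / 2 simply settle on the lower bound, 7.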
i remembered i had a game that lets you assemble logic circuits and put this together
Y not add a module for right shift, if you need it?
not sure i have instruction bits to spare
Ohhhh
okay i could read this if i wanted to but i dont https://pages.hmc.edu/harris/research/srtlong.pdf
JUST DOUBLE THE BITS
the ripple carry adders already hurt to watch
dont do this to me
Replace them with carry-lookahead-adders ez, how bad could it be 
if you implement cool stuff like division i'd love to hear you talk about it
nothing like the feeling you perfectly understand something without putting in the effort because someone else talked about how they did it
the bar for talking about this is division? it's that high?
no you can talk about much easier stuff too 
i feel like i might've set your expectations a little high
no this is very cool already
i spent 4 hours trying to figure out why i could only do recursive calls 2 stack frames deep
i know how time consuming even simple stuff can be
turns out i was trying to restore something after throwing away the stack frame it was in
and this just didnt manifest as an issue unless you went deep enough
huge props to this game tho
very cool
i remember some online learning tool that made you write various functions in verilog and checked their outputs
the built-in assembler lets you define your own tokens as binary values
here ive used them to map out my opcodes
then there's the macros 
i implemented basic logic elements and adders but didnt go further
and now im putting together basic math functions in a little library 
most of the bit tricks i know are from competitive programming not from any hardware related stuff
good way to learn verilog
i was thinking about learning it or VHDL at some point
but it seems painful and theres no way i could ever use it 
there are some cases where you can trivially solve a problem for every bit and then merge the results
when are you implementing a lisp
yeah i really like how something like this was possible
one thing is saying something is turing complete and another is making it run semi useful programs
but a forth is even easier to implement than a lisp
its almost purely stack based
First attempt at touchscreen controls
It seems to be doing something, but not what i was trying to do
in terms of fps, we have no fps
idk why tbh
mobile shit
i think i linked touchscreen to the movement, but its also seen as mouse input by pygame so thats also something
watching the blinkenlights
i am witnessing advanced sorcery
what in the
Tablet has enough power to do >10 fps so here's tablet
like a million times slower than real time right
1 instruction on this thing takes 120 ticks with the current clock settings and fetch cycle
so at the current tick rate thats uhh
Really nice man
13 instructions per sec
its tap based apparently, idk why
i guess il just plop some arrows on the screen and call it a day?
dude this looks sweet
thanks 
so this is built with python and then wasm compiled right
yep
the link in the readme has the current version, should run fine as long as you're on desktop i think
right shifting isnt working 
catastrophic failure
it returns to a wrong address
who knows what kind of stack poisoning might be happening here tbh
16 stack frames * 10 bytes per frame
i might need to make the stack bigger
tomorrow problem

https://youtu.be/rDDaEVcwIJM?si=w48j3r4FKMiGezO8 40:18
The entire video is a masterclass (I'm looking at you human method of square root) but this is the division section.
an introduction to a numbering system that's objectively better than seximal
- external links -
footnotes, script, other readables: https://github.com/lucillablessing/thebestwaytocount
soundtrack on YouTube: youtu.be/MI4xSjRBa_o
soundtrack on Bandcamp: https://lucilla.bandcamp.com/album/the-best-way-to-count-ost
source code: https://github.com...
This probably doesn't help with the current issue but it is fun
it's all coming together, the language ideas i had but didn't know what to do with are somewhat integrated in my head now
it's no longer the question of "what" but "how"
@olive sable Engine mobile test at 165hz xdx
at least it seems to be running well
the tablet gets 70fps with a snapdragon 870, but it doesn't get recognized as mobile so the shadow mapping is on there
what does your phone have?
let me check
snapdragon 8+ gen 1 (we love tech naming)
I'm gonna miss dual networking when I have to end up switching phones
I love having enough ram in this day and age

I knew the phone was cooking with video, but not that much
didn't know I was paying for a slo-mo camera
my A52 has this
Surely you can get decent 8k photos but software isn't allowing it 
64MP is a lot more than 8K supposedly
so its probably just software
but i wouldnt actually use 8K from this phone's camera, cuz sensor is SMOL
guys im trying to testrun the LLM stuff apollo sent, does this look correct?
it looks like it is running and training
that does seem to be the case
i meant more like the model reporting as 0.4M params
is that correct for those settings? i wouldnt know cuz im just the guy with the gpu for this project
i like that training log output in the terminal, what is that
from
and maybe it just means the amount of trained parameters?
like so far, or if it's a lora, the size of the lora params only
how big are you expecting
I think this version is just smol
im not expecting anything, since idk how this works
params add up to being 0.4B
so its just a typo i guess?
i think its from "rich"?
python lib apparently
i mean it's definitely rich output
but like
what tool are you running that has that built in
idk, not my code
it probably isn't built in, apollo probably spent a lot of time doing it right
https://tenor.com/view/snoop-dog-who-what-gif-14541222
i feel like i recognize or probably would recognize who "apollo" is in this context
my pc is lagging every so often from this lmao
apollo is the guy that made the program, and im the guy with the 3090
so im running his code cuz he doesnt have a 3090
apollo from this disc? isnt there an apolloiscool
that is the guy yes
aight im gonna let this run while i sleep
hopefully it wont go over 24GB and quadruple the needed time
the checkpoints take up 12GB each? 
im gonna run out of storage space in half an hour
12.5GB every 2 min is 6.25GB/min, or 1TB every ~2.7hrs
how am i supposed to keep this running for 150 hours like this?
Deleting old checkpoints probably
i have school and stuff, i cant babysit this the whole time
Need a script for that if it doesn't already include one
If apollo names the checkpoints based off of what checkpoint it is, it should be possible to automate
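A rough sketch of such a cleanup script, assuming checkpoints are named like `discord_llm_step_<N>.pt` in a single directory (the pattern is an assumption about the trainer's naming). `prune_checkpoints` and `keep` are made-up names. The demo runs against empty placeholder files in a temporary directory, not real checkpoints.

```python
import os
import re
import tempfile

# Assumed naming scheme: discord_llm_step_<N>.pt
CHECKPOINT_RE = re.compile(r"^discord_llm_step_(\d+)\.pt$")


def prune_checkpoints(checkpoint_dir, keep=2):
    """Delete all but the `keep` newest checkpoints; return removed names."""
    found = []
    for name in os.listdir(checkpoint_dir):
        match = CHECKPOINT_RE.match(name)
        if match:
            found.append((int(match.group(1)), name))
    found.sort()                               # oldest step first
    stale = found[:-keep] if keep > 0 else found
    for _step, name in stale:
        os.remove(os.path.join(checkpoint_dir, name))
    return [name for _step, name in stale]


# Demo on empty placeholder files.
with tempfile.TemporaryDirectory() as demo_dir:
    for step in (200, 400, 600, 800):
        open(os.path.join(demo_dir, f"discord_llm_step_{step}.pt"), "wb").close()
    removed = prune_checkpoints(demo_dir, keep=2)
    left = sorted(os.listdir(demo_dir))
```

Run from a loop or a scheduler, this keeps disk use bounded at `keep` checkpoints without anyone babysitting the run.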
this
Anyone read AI 2027?
I skimmed it a while back and thought it was stupid
but I just watched a video on it and its even dumber than I remember
whats that
I'll just wait 2 years for it to exist irl 
Its a "prediction of the most likely AI outcome"
Where fictional company OpenMind makes an AI Agent that achieves AGI in 2027 then destroys the world after an arms race with China
in a stupidly long tale thats really just the paperclip machine
but don't worry, it can be stopped if you get some more AI safety researchers 
Written by some AI Safety researchers
No conflict of interest there
And these people definitely have qualifications yep

well, tbh it does make sense that the people most paranoid about ai would want to be in the positions to stop said ai
2nd section
There is no preventing AI from having bad thoughts, just like how you can't force thoughts into people
yes they'll be able to build 1000x data centre in 3 months time
lulz
1000% this can only exist if we get very sparse, very fast
Also hive mind AIs
and two evil AIs Agent 5 from Open Mind and DeepCent merge together to take over the world
Cos AIs merging is a thing
Man people really gave up on new archs huh
last time I checked you can convert somewhat but not entirely merge models
was wondering the other day, even if the capabilities of ai scaled with parameters and therefore their ability to self improve, resulting in exponential growth, it's still going to be stymied by the curse of dimensionality... right?
nvm I forget image models (as long as it is same base model) can do this
With the amount of training data, double descent kicks in so its not that big of an issue
but my first point still stands
but data is more the limiting factor than hardware usually
Yeah but are they twice as powerful 
I feel like eventually one will need to incorporate synthetic data as part of training larger models
I mean some methods already exist, but probably a lot more so
Ah yes the trenchcoat method
how do i code it to delete the file? is there like a torch.delete()?
or is doing it via os.remove() acceptable?
I think some already are, but I think they'd hit the same barrier as genetic algorithms did
I'd be doing os.remove()
aight
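old_checkpoint_name = f"discord_llm_step_{self.step - 200}.pt"
old_checkpoint_path = os.path.join(self.config["checkpoint_dir"], old_checkpoint_name)
# the checkpoint from 200 steps ago may not exist (e.g. during the first 200 steps)
if os.path.exists(old_checkpoint_path):
    os.remove(old_checkpoint_path)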
code 
Ik neurosynth has something like this, so it is being heavily used already if anyone can impl it.
Though I'm not 100% if it is synth data or just a method to use data better
@prime ridge thought it'd be good to ping you
Gn sam 
I have no idea what this means

domain specific language
what dataset is it trained on
am i contextually right
Something, he's clueless
oh is it a private one?
I can see if the owner opened training data
so what did u find
still looking, had to boot up laptop
anyways are like 10million twitch chat msgs enough to train a 124M gpt2 model
or i gotta go smaller on the model size
Seems like it is not open at the moment
Based on this meme
what did u censor?
I censored because I needed to censor
real
server is 13+ after all
https://arxiv.org/pdf/2502.15840 funny paper
why do ai people always have to use emojis in everything they create
isn't this the one where claude panicked and called the FBI
yea lmao
a lot of emojis mean the man who made it is either
- AI
- AI pretending to be human
- a man with autism
- a kid
please use uhh react icons 
5 doesn't exist?
less AI-bros and more hustle bros
but theyve adopted AI a lot
i got distracted
hustle bros?
what they hustling
anyways did anyone implement transformers 2 or squared
whatever its called
I see so much shit like this. Just wait until the people eating up these articles learn about Neuro-sama. 
Good morning 
Any of you nerds play Space Engineers?
morbing
Make it delete old checkpoints like any sensible trainer that makes big checkpoints
NeuroSynth's current datasets are entirely synthetically generated using Neuro RVC and existing compatible datasets
This is done because making a full vocal synthesizer dataset from scratch (which we will do for 1.0) takes a lot of time
reinstalling windows was too easy this time wtf
i didn't even have to do regedit or oobe\bypassnro
why can't microsoft invent Microsoft Basic Display Adapter 2 or something
Meanwhile
:
It'd just be MBDA1 but with a new icon in device manager
And more telemetry
they can at least upgrade the resolution to 1280x720/1920x1080, 800x600 isn't real anymore
Too much work
The only real way is to ai generate the resolution and image and everything with Microsoft Copilot Display Adaptor
Who needs 3:1 ai frames if you can have all ai frames

does anyone have advice on a good way to build a stack frame
I'm not 100% on what should really be put in it
right now what I'm using has the function args, return address and a pointer to the parent stack frame
feels like I'm missing something
I was mulling over stack canaries tbh
currently shit just fails catastrophically if I corrupt the stack accidentally
debug info might be smart too
but I'm not 100% on how to format that
idk where to put that info either
if you're unwinding the stack you're probably discarding stack frames as you go
so where to store the return addresses as you go
well I was thinking for a stack trace
it is yeah
yes I realize that
but if you want a stack trace from within nested function calls
your return address in the current stack frame only tells you the immediate caller
you would have to inspect the previous stack frame for the information on the next caller up
which seems non-trivial to me
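A sketch of how a stack trace can fall out of the parent-frame pointers, using a Python list as simulated RAM. The frame layout is an assumption for illustration: the frame pointer addresses a slot holding the parent frame pointer, followed by the return address, with arguments after that. With those two fields at fixed offsets, the walk does not need to know how many arguments each frame has. All addresses are made up.

```python
def stack_trace(ram, frame_pointer):
    """Follow saved parent pointers and collect return addresses,
    innermost call first. A parent pointer of 0 ends the chain."""
    trace = []
    while frame_pointer != 0:
        parent = ram[frame_pointer]           # slot 0: parent frame pointer
        return_addr = ram[frame_pointer + 1]  # slot 1: return address
        trace.append(return_addr)
        frame_pointer = parent
    return trace


# Simulated RAM with three nested frames: main -> f -> g.
ram = [0] * 32
ram[4], ram[5] = 0, 0x100     # main's frame: no parent, returns to 0x100
ram[8], ram[9] = 4, 0x120     # f's frame: parent is main's frame
ram[12], ram[13] = 8, 0x140   # g's frame: parent is f's frame
trace = stack_trace(ram, 12)
```

The walk only reads memory, so nothing is discarded the way it would be when actually unwinding.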
yeah I don't quite have that
what I called a frame pointer earlier actually points to the first stored argument in the frame
but the information for how many arguments a frame has isn't stored
well I should probably get this fully working first before adding features
I have a function that recursively calls itself 16 times and it's causing issues

my debugging tools are hell
I get to stare directly at ram

it's not as bad as it seems, this is all in a circuit designing game so it coddles me a bit
but not enough 
coddle me more 
I may as well be
don't get me wrong, the only saving grace is that the instruction set is simple
and that stuff isn't compressed in memory or anything
nor are addresses randomized or stuff like that
the ALU is severely lacking though, as evidenced by my trying to implement a logical right shift in software 
Also: A few days ago I uninstalled my AV.
Never ran better 
it's like saying frogs are better than humans because they can't contract a human disease
Did you know? There's a lot of Linux malware that's potentially significantly worse than Windows malware due to how widely Linux is used by servers for big companies
Yeah you can delete all the checkpoints. I switched model architectures so those old ones will be replaced with new ones. Also there should be an auto-delete functionality. These damn files are taking literal days to zip. I only expected it to take 30 ish hours to zip but it's just been going and going forever...
Compatible with targeted attacks, not compatible with generic attacks. What do you encounter more often
The point is there's very much Linux malware
well yes ofc
I definitely recommend adding canaries (it clears your mind about actually corrupting data. You'll focus on other things), and also checking for other issues that can break recursive calling.
i'd love to dive into this with you tbh, but i've got an exam on caches coming up: L1, L2, L3, all that fun stuff
@real sierra
I love these kinds of videos. They're so adorable.
https://youtu.be/ta99S6Fh53c
AI vs AI Playing Soccer!
https://brilliant.org/AIWarehouse/
If you want to learn more about AI and deep reinforcement learning (how Albert is trained), there are amazing courses teaching those exact concepts on Brilliant! You can use my link to get a free 30 day trial with 20% off! I've personally gone through the course "Introduction to Neural ...
I love that whole channel actually
Both woke up rn
cache exam 
every exam question is a cache miss on my brain
i'm just ----> 
wait , you are not using HDL ?
to me caches are just black magic, being able to massively boost performance just by reordering some operations is kinda nuts
i really liked/messed when they were "black magic" for me
I like caches, they're not my problem to implement
LLM pretraining be like
implement ? , nah ez , study for exam ----> 
my GPU voltage keeps dropping from 1.050 to 1.010 and then goes back up
i think my 750W PSU is struggling
It has nothing to do with the PSU, that's just what the GPU voltage regulator is doing
The GPU core voltage is not coming from the PSU
nope
the processor is something I rigged together in this random circuit design game over the course of 2 days
16-bit
GPU VRM is . . . . dying 
risc-v ? or custom ?
i put it together because someone dared me, so I couldn't be asked to learn and implement a semi-reasonable architecture
instead I just adapted the nandgame architecture
oh boy
16-bit system with two general-use registers, a program counter register and uhhh
that's about it 
separate ram and rom
good start
two registers definitely complicates even simple tasks
i made a 32bit core in risc-v (in hdl {verilog} tho)
try not to flex challenge
seems tough to learn
just stay away from floating point alus , you will be fine
oh don't worry my ALU can't do floating point

it's missing some really vital operations
like a logical right shift
turns out this is agony to do in software



wait can I even do that
how can you binary search without a division

good point 
I guess we're doing repeat subtraction then 
even stuff like this has been extremely challenging though
12341 / 2 , alu pov : 🦥
both registers are needed for most arithmetic so there's nowhere to really store stuff like loop counters
just make more registers
as such I've resorted to recursion for even stuff like multiplication
Ofc there is, but finding it is definitely more difficult
no storage? use a stack frame 
the same stack frame that can overflow
?
reeeeaaallly big stack
make sure you put the smaller arg second
Silly
for division, could you do something like:
6/2:
2x1 >= 6: false
2x2 >= 6: false
2x3 >=6: true
6/2 = 3?
discord formatting
that works with certain ints i guess
does such an algorithm work with two's complement negative values
but if you try to do 1/2 you're fucked
ye but if it cant do floating point, it should be fine right?
like 1/2 would be 0 (or 1?)
7 ÷ 2:
2×1 = 2 → false
2×2 = 4 → false
2×3 = 6 → false
2×4 = 8 → true
probably yes
I was about to say "floating point would get a software implementation" but if bit shifts are already software that might be intractable
Also, genericism
Windows only has 10 and 11, and most win10 stuff works on 11, so malware can target pretty much all consumers and businesses at the same time (win7 can be ignored atp)
Linux has so many different base distros (ubuntu debian arch gentoo etc), which themselves get made into even more distros (ubuntu alone has too many to remember them all) with so many ways to customise everything that writing generic malware is much more difficult, so i'd assume that most of it is targeted at distros that get used by money makers, like server distros
for A / B , it should be B × Q ≤ A
you need to start your trial multiplication with N×0 to catch the case where a/b with a<b
but also uhh
there's no hardware multiplier
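A rough sketch of the corrected check (the largest Q with B × Q ≤ A), done by repeated subtraction so no multiplier is needed. It's Python purely for illustration: `div_repeated_sub` is a made-up name, and it assumes a non-negative dividend and a positive divisor, so two's complement negatives are not handled.

```python
def div_repeated_sub(a, b):
    """Return (quotient, remainder) of a / b using only compare, subtract and increment.

    Hypothetical helper for illustration; assumes a >= 0 and b > 0.
    """
    if a < 0 or b <= 0:
        raise ValueError("sketch only covers a >= 0 and b > 0")
    quotient = 0            # starting at 0 covers the a < b case
    remainder = a
    while remainder >= b:   # equivalent to checking b * (quotient + 1) <= a
        remainder -= b
        quotient += 1
    return quotient, remainder


print(div_repeated_sub(7, 2))
print(div_repeated_sub(1, 2))
```

With this, 7 / 2 comes out as quotient 3 remainder 1, and 1 / 2 as quotient 0 remainder 1. The loop runs once per unit of the quotient, so it gets slow for a big dividend and a small divisor.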
Maybe

But yes, the linux malware can go much deeper
Which is why you make backups and partition home away from the rest, reinstall, and plug in your backup
whatever, go my windows defender
Ngl if i was actually someone who gets targeted by malware I'd have shit like bulk reinstall scripts
Format the whole drive, reinstall linux from scratch, and run bulk-install-programs.sh
proprietary software user
yeah, my software comes from a company with hundreds of paid professionals maintaining it 
My software runs at more than 10fps tho
Very silliness
It's incredible how shitty the fusion 360 ui performs
Especially because the 2d ui somehow runs slower than the 3d cad view
On windows
That seems a little crazy
(On linux it's all kinda slow because wine isn't great for performance)
But that's why I'm learning freeCad
what about proton? is that any better?
proton is just wine isn't it
well, with some extra stuff on top
its better wine
Proton is wine for games
And it runs great
Except if you use the wrong nvidia drivers
classic nvidia drivers
I forgot which one it was but it made every polygon on screen take away a frame
nvideoiae
Satisfactory died completely
TBH if it wasn't for games I play needing kernel level anti cheat I would have ditched windows for it.
60 polygons = 0 frames?
even on windows satisfactory has graphical glitches due to the new nvidia drivers
It's amazing, they even broke on windows long before i switched to windows
We almost mailed the pc in under warranty because it looked like the gpu was deadge
Pretty much
Windows 11?
I think it may have been 570 but I'm unsure
yeah i was struggling with driver issues for months after getting a 4060 ti 16gb
came with graphical bugs day one
right out of the box
3070QM here
ye, but the drivers should be the same ones as for 10 so 
The top 3cm of screen were rendered above the bottom 3cm in windows, in a streaky almost transparent pattern
Well, who knows
I've heard of Windows 11 24H2 breaking stuff that worked before
it wasnt windows update that was the problem
Nah that's on nvidia completely
happened only after updating nvidia drivers
could I get someone to verify the code for my right bit shift 
I wanna believe it's correct but
it's crashing every time

after trying to fix an issue with the gpu that was causing stutters as i left the pc on, and reinstalling the drivers at least 6 times, i gave up and switched the card out for the RX 6600 XT i had sitting around
I'll send it when I get home lol
That's crazy
i will verify, but i will have no clue what im looking at actually
Why do chips not have modulo in their ALU if it's doable via hardwiring input to output 

its not too complicated, mostly just stack macros
Lul
If only i could
But sadly it's a laptop, dad denied my pleas to just get a powerful desktop and mediocre laptop
And now the only thing that required a laptop is done by his phone
Idiot 
found a video I posted from yesterday #programming message
i have to read a book and make a project about it, and i have to make a motivation letter by tuesday.
so imma do that now
bye
Self-triple-scammed
it's a little niceified, it abstracts actual electricity away so you can just work with logic gates and high/low logic level
but yes
sure
i'll be studying tho , so hit me up dm or mention me 
shifts 
replace shifts with subtraction and it works ig
not in hardware
oh wonderful
this architecture actually has a name
The Hack computer is a theoretical computer design created by Noam Nisan and Shimon Schocken and described in their book, The Elements of Computing Systems: Building a Modern Computer from First Principles. In using the term "modern", the authors refer to a digital, binary machine that is patterned according to the von Neumann architecture m...
16
@olive sable @grave fractal i'm home now, here's the code i'd like reviewed:
# math_shr(x) returns x right-shifted by one position.
function math_shr 1;
push_arg 0;
push_value 1;
push_value 2;
call math_shr_recursive;
push_retval;
return;
# internal backing call for math_shr(x)
# math_shr_recursive(v, mask_out, mask_in) performs a right shift by copying bits using masks.
function math_shr_recursive 3;
push_arg 2;
pop_d;
if_else_d JGE math_shr_recursive_if math_shr_recursive_else
@ math_shr_recursive_if
push_arg 0;
push_arg 1;
push_arg 1;
add;
push_arg 2;
push_arg 2;
add;
call math_shr_recursive;
push_retval;
goto math_shr_recursive_end;
@ math_shr_recursive_else
push_value 0;
goto math_shr_recursive_end;
@ math_shr_recursive_end
push_arg 0;
push_arg 2;
and;
pop_d;
if_else_d JNE math_shr_recursive_shift math_shr_recursive_ret
@ math_shr_recursive_shift
push_arg 1;
or;
@ math_shr_recursive_ret
return;
not sure if there's better highlighting i can provide
so the thing is, i was joking about verifying, i only work in python really
oh 
Based

with division? no not really
easily calculated if you can do division
i always get these mixed up 
my LSB is on the right
does that help
big endian is the one that makes sense and has the correct byte order if you just read it as a sequence of bits
little endian is the one that's fucked up
if the b in lsb stands for byte here, then yeah, lsb on the right is big endian

then im BE
i think my one complaint about this game's assembler is
you can't nest macros
so i have to expand any macros i use in writing my macros down to their assembly
this leads to wonderful things like
# call <functionName>
macro call {LOAD ARGS; LD R; LOAD SP; LA R; LR D; LOAD SP; LR INC_R; pointer rv inline 0; LOAD rv; LD A; LOAD 12; unpoint rv; LD ADD_DA; LOAD SP; LA R; LR D; LOAD SP; LR INC_R; LOAD a?; JMP; LOAD SP; LR DEC_R; LA R; LD R; LOAD ARGS; LR D; LOAD SP_SWAP; LD R; LOAD SP; LR D}
So based. To this day I still don't get why little endian is used most of the time. Network order is also big endian and it just makes so much more sense
also sometimes stack traces can get in the way of performance, sometimes jmp function does the trick better than call function
this strikes me as a very radical take
you seem like the person who actually uses goto in java
something something it's better for CPU design because the lower bits are first and they're necessary for the higher bits to be useful in calculations
apparently it's better for casting integers to smaller types too because then the pointer doesn't change
idk I think it's largely vestigial now
floats are useless anyway 
no trust me my 16-bit floats are very precise

iirc big endian for networking was chosen after intel had adopted little endian so i blame them, could've just gone with what intel was doing but nooooo, we can't have nice things
having it in "pseudo" made me
, but it seems fine , does it work ?
Yeah it makes no difference what the internal representation is if the CPU can just read them differently. I don't get it. I mean I understand why it was done once. But you'd think we'd find better solutions.
i dont use java
but actually this approach is called while() in java so you probably use it
oh im impressed, i was waiting for you to show up so i could explain what the stack macros meant
but you seem to have figured it out on your own
little endian is nice because a larger integer is just an extension of a smaller integer
i'm not that bad 
still tho
truncation is trivial, no address mangling necessary
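A quick way to see the "truncation is trivial" point, sketched in Python with the standard `struct` module. The value `0x12345678` is just an example picked for this illustration.

```python
import struct

value = 0x12345678

little = struct.pack("<I", value)  # bytes in memory: 78 56 34 12
big = struct.pack(">I", value)     # bytes in memory: 12 34 56 78

# Little endian: the 16-bit truncation starts at the same address (offset 0).
low16_le = struct.unpack_from("<H", little, 0)[0]

# Big endian: the low 16 bits sit at offset 2, so the pointer has to move.
low16_be = struct.unpack_from(">H", big, 2)[0]

print(hex(low16_le), hex(low16_be))
```

Both reads give 0x5678, but only the little endian one gets there without adjusting the address.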
i do use big endian in networking
honestly i was kinda hoping that wasnt the case, because it means there's a bug with my stack macros instead 
while
no thanks
make it recursive and ill think about it
the wrapper is fine (math_shr ) , maybe the (math_shr_recursive) ,, it may act up

Okay and whatever that is (I'm not that deep into the CPU rabbit hole) can't be solved in a way to keep using big endian and have it be efficient?

getting the least significant bits is a common operation, getting the most significant bits isnt
so it makes sense to optimize for the former
man i didnt touch asmb like for 2 years 
bigints are generally stored in little endian too i think
so rusty 
Why are they called least significant if they're so significant 
maths
I'm mostly joking at this point haha
Not like I can have any valuable opinion on this
add, or, and pop 2 values off the stack, perform the relevant computation and push the result back onto the stack
push_arg <n> copies the n-th function argument and pushes it onto the stack
push_value <n> puts the literal value n on the stack
push_retval pushes the result of the last function call onto the stack
pop_d pops the topmost stack item into the D register
if_else_d <condition> <true_branch> <false_branch> checks the value in the D register against the jump condition and branches accordingly
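For anyone following along, here is a toy Python model of those macros. The macro names are the ones listed above; everything else (the class, the 16-bit wraparound handling, returning the label as a string) is an assumption made for illustration and not how the game implements them. Only the two jump conditions that show up in the posted code are modelled.

```python
MASK16 = 0xFFFF  # the machine is 16-bit, so every result wraps to 16 bits


class StackMachine:
    """Toy model of the stack macros; internals are guesses, not the real thing."""

    def __init__(self, args):
        self.args = [a & MASK16 for a in args]  # arguments of the current function
        self.stack = []
        self.d = 0        # D register
        self.retval = 0   # result of the last function call

    def push_arg(self, n):
        self.stack.append(self.args[n])

    def push_value(self, n):
        self.stack.append(n & MASK16)

    def push_retval(self):
        self.stack.append(self.retval)

    def pop_d(self):
        self.d = self.stack.pop()

    def _binary(self, op):
        # Pop two values, combine them, push the 16-bit result back.
        right = self.stack.pop()
        left = self.stack.pop()
        self.stack.append(op(left, right) & MASK16)

    def add(self):
        self._binary(lambda a, b: a + b)

    def and_(self):
        self._binary(lambda a, b: a & b)

    def or_(self):
        self._binary(lambda a, b: a | b)

    def if_else_d(self, condition, true_branch, false_branch):
        # D is compared against zero as a signed 16-bit number.
        signed_d = self.d - 0x10000 if self.d & 0x8000 else self.d
        outcomes = {"JGE": signed_d >= 0, "JNE": signed_d != 0}
        return true_branch if outcomes[condition] else false_branch


# Tail of math_shr_recursive for v = 0b0110, mask_out = 1, mask_in = 2
m = StackMachine([0b0110, 1, 2])
m.push_arg(0)
m.push_arg(2)
m.and_()
m.pop_d()
label = m.if_else_d("JNE", "math_shr_recursive_shift", "math_shr_recursive_ret")
print(label)
```

Here `0b0110 & 2` is nonzero, so the model takes the `math_shr_recursive_shift` branch.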
my math library depends on you konii
ok i wasnt too far 
i cheated tho
took a peek on the wiki you sent

didnt even know the wiki existed until today
much better reference than what ive been using
there should be a game about quantum ISA design
if_else_d <condition> <true_branch> <false_branch>
why not do it like in x86
wait
you mean something like :
CMP EAX, EBX
JNE label_true
JMP label_false
whats the difference between push_arg and push_value, are those different stacks?
but without caps, you're not writing SQL here
its a macro
when a function is called, the args are stored in the stack frame
push_arg takes an index and pushes the arg at that index onto the stack for actual use
lol i just googled it , and copy paste it , dont expect from me to write it 
push_value accepts a literal and pushes that onto the stack
gotcha, would push_arg break if push_value was called before it?
shouldnt do, there's a fixed address in ram that stores a pointer to the current function's first argument
so you can use push_arg at any time
and as many times as you'd like
and its pushed when calling a function?
or does it only work before the first call
yes, before the function body is entered the value of the new args pointer is set
stack frames look like this atm
stack grows toward the bottom of the screen
when a function is called, the return address is pushed and the function is jumped to
then the current value of the ARGS global is pushed onto the stack so it can be restored later, and the new one is calculated and stored in the global
when returning from a function, the topmost stack value is popped and stored in a RETVAL global, then the current ARGS global is stored in a SP_SWAP global as it's used to restore the stack pointer, then the prev args ptr is popped and stored in ARGS, and then finally the return address is popped and jumped to
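The call/return sequence described above can be modelled roughly like this in Python. The globals `ARGS`, `RETVAL` and `SP_SWAP` are from the description; the `Frames` class, the `argc` parameter, and the way the new `ARGS` is computed are assumptions for illustration. It also assumes the function body leaves exactly one value (the result) on top of the saved `ARGS` before returning.

```python
class Frames:
    """Rough model of the stack frame layout; the stack grows toward the end of the list."""

    def __init__(self):
        self.stack = []
        self.ARGS = 0       # index of the current function's first argument
        self.RETVAL = 0     # result of the last call
        self.SP_SWAP = 0    # scratch slot used to restore the stack pointer

    def call(self, argc, return_address):
        # Assumption: the caller has already pushed argc arguments.
        new_args = len(self.stack) - argc
        self.stack.append(return_address)  # push the return address
        self.stack.append(self.ARGS)       # save the caller's ARGS for later
        self.ARGS = new_args               # point ARGS at the first argument

    def ret(self):
        self.RETVAL = self.stack.pop()     # topmost value is the result
        self.SP_SWAP = self.ARGS           # remember where the arguments started
        self.ARGS = self.stack.pop()       # restore the caller's ARGS
        return_address = self.stack.pop()
        del self.stack[self.SP_SWAP:]      # SP = SP_SWAP, which drops the arguments
        return return_address


f = Frames()
f.stack.append(111)            # something the caller already had on the stack
f.stack.extend([5, 1, 2])      # three arguments
f.call(argc=3, return_address=42)
f.stack.append(7)              # the callee's result
print(f.ret(), f.RETVAL, f.stack)
```

After the return, the stack is back to just the caller's value, `RETVAL` holds 7 and the return address 42 is handed back to jump to.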
and what does if_else_d compare? (other than d register)
in practice you feed it a jump condition, so it compares to zero

now i just need to port it to my language
actually
can i get a math check
the yellow args here
for each frame, the first one should stay the same, and the last two should double
Realtek's equalizer sucks 
it's the stack, with stack frames highlighted
so is it something like this?
math_shr_recursive(a, b, c) {
    if args[2] >= 0 {
        a = args[0]
        b = add(args[1], args[1])
        c = add(args[2], args[2])
        ret = math_shr_recursive(a, b, c)
    } else {
        ret = 0
    }
    v = and(args[0], args[2])
    if v != 0 {
        ret |= args[1]
    }
    return ret
}
yah there some "math" , not doing "math" stuff

also buffered reader is important for reading until newline, good luck
adding something with itself is a left shift, and once the bit reaches the highest position the number is interpreted as negative, which is why that first condition is args[2] >= 0
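To sanity check the arithmetic on its own, here is the same mask-copying idea as a Python sketch with the 16-bit wraparound emulated by hand. The function names and the `to_signed` helper are made up for this example. It doesn't model stack frames at all, so it can only say whether the logic is right, not why the real thing crashes.

```python
MASK16 = 0xFFFF


def to_signed(v):
    """Interpret a 16-bit pattern as a two's complement number."""
    return v - 0x10000 if v & 0x8000 else v


def shr_recursive(v, mask_out, mask_in):
    # Doubling a mask is a left shift; once mask_in hits bit 15 it reads as
    # negative, which ends the recursion.
    if to_signed(mask_in) >= 0:
        ret = shr_recursive(v,
                            (mask_out + mask_out) & MASK16,
                            (mask_in + mask_in) & MASK16)
    else:
        ret = 0
    # Copy the bit under mask_in one position down, to mask_out.
    if (v & mask_in) != 0:
        ret |= mask_out
    return ret


def shr(x):
    """Logical right shift by one position on a 16-bit value."""
    return shr_recursive(x & MASK16, 1, 2)


print(shr(6), hex(shr(0xFFFF)))
```

In this model the recursion goes 15 levels deep and matches `x >> 1` for 16-bit inputs, e.g. `shr(6)` is 3 and `shr(0xFFFF)` is 0x7FFF, which fits the earlier read that the logic itself looks fine.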
i dont see bugs major enough that they would mangle stack frames fwiw
nor do i
which is why im worried
if the function returned the wrong result, that'd be fine
but it shouldn't be crashing horribly
if the ALU is having issues tho, that could explain it
probably the ripple carry adders again 
ill slow the clock and see if that helps
hardware debugging, like software debugging but you have even less idea what's going on 
at least it's consistently wrong
not if you have multiple cores!
why do you even care anymore
felt that 





hblang does not have SIMD yet
probably not scalable like previous titans but still




