#programming

I might already have, technically

Funny thing, when it comes to datasets of my own messages if I don’t dedup then it’ll always mention NixOS
chat
is this real
i was just fucking paged
cause the db team decided to do a fucking password rotation
without announcing or having a dependency graph or anything
and i was now suddenly the one having to find all our services that depend on that goddamn DB
random dependencies breaking stuff is fairly normal I think
the DB team really should know better though 

ugh
but there should be some system
or something
imo
that checks
wtf is affected
ye there usually should be
if you do it properly you have change management
and documentation for dependencies and stuff
and people who approve changes and are supposed to check this stuff I think
i would page the db team back to tell them to change it the fuck back
but it's a lot of effort so nobody really does that
especially the documentation part

you would think with all the AI hypeslop today they would at least document something
literally the one thing it's decent at, but nobody uses it for that
db team like



just make a fucking shared calendar
takes 3min
talking to people is a crazy possibility too
in my experience that exists for changes
but then
nobody actually schedules these changes when it's small stuff like changing a password
who cares, it's only prod

wanna bet someone's openclaw found the ssh keys and decided to "improve security"
next time you're paged it'll be with a bucket and a mop
every week baby
gdd
hi guys look opengl ferris wheel for my computer graphics coursework
they liked it

OpenGL in the big '26
looks cool
did u make the whole engine
(with opengl i mean)

yes
well calling it an engine would be giving it too much credit
but yeah
it would have been a lot better if i actually worked on time. we had 3 months and i did this in 3 days
i would have had a complete game engine and i would have done wayyyyy more than the assignment specification
but whatever


are u doing game design or something
nah it's just computer graphics module
in my computer science course 2nd year
there's an optional game dev module for 3rd year but i didn't pick it
actually i don't remember what i picked for next year
i went middle school -> apprenticeship -> software engineer
wait bus stop
Hey I’m a failure too
You're not alone

Anywho
I had to switch to byte tokens for 175k model
Tokenizer was funky and still too big
And now I’m reusing every layer to get 6 layer depth from 3 layer model
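A rough way to see what layer reuse buys: only the unique layers count toward the parameter budget, so running 3 shared layers twice gives depth 6 for the price of 3. The sketch below is back-of-the-envelope arithmetic, not the actual model from the chat. `d_model=64`, `d_ff=256`, and the cyclic reuse order are illustrative assumptions, and biases and norm weights are ignored.

```python
# Rough parameter count for one transformer block (biases and norms ignored).
# d_model=64 and d_ff=256 are illustrative guesses, not the model from the chat.
def block_params(d_model: int, d_ff: int) -> int:
    attention = 4 * d_model * d_model   # Q, K, V and output projections
    feed_forward = 2 * d_model * d_ff   # up- and down-projection
    return attention + feed_forward


def stack_params(unique_layers: int, d_model: int = 64, d_ff: int = 256) -> int:
    # Only unique layers cost parameters; repeating them adds depth for free.
    return unique_layers * block_params(d_model, d_ff)


def layer_schedule(unique_layers: int, repeats: int) -> list:
    # Order in which the shared layers are applied (assumed cyclic).
    return [i for _ in range(repeats) for i in range(unique_layers)]


shared = stack_params(unique_layers=3)     # 3 layers, each applied twice
unshared = stack_params(unique_layers=6)   # 6 independent layers
print(layer_schedule(3, 2), shared, unshared)
```

Under these assumptions the shared stack costs half the block parameters of an unshared 6-layer stack, which is the whole point of the trick.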
byt5
I’m trying to implement spelling based loss masking so it gets better at making words that are words
thats just crossentropy loss
llm has no idea what a word is
175K parameter LLM?
Yes
Trying to get 1,000,000x smaller than gpt 3
That be absolutely tiny

Can probably run on microcontroller
Here are some outputs from ~425k version
`Prompt: 'Once upon a time,'
────────────────────────────────────────────────────────────────────
Once upon a time, it says that the new aqualced vain. They're a
great shot. However, it's not sure the bottom of the shot. Here's
the right way to make the treety on the ped, but it's a sery
audience to be a touted band your chief. The time is any of them.
That's the right way. We're going to run over my timely covered
the ped, so it's a mision. No, many are it that it's just not a
batch of the gest. The thin lies in them. It's some is the
aqualced lies. It's a small perhina bills from the ped, and the
bottom of the heddlin-daughter, I'm a small hesy. I'm sure I'm
sure
────────────────────────────────────────────────────────────────────
Prompt: 'The history of science begins with'
────────────────────────────────────────────────────────────────────
The history of science begins with an extremery of ligand. The CHS
was expected to be 64.5, 81. (0.3) were rephenal for a 1910-caling
(1). The DO Group Host. Lecenter has been assessed by Slen.
Coscosis and Osticlellar and the TDO NB. DS (3,8,115) (2012.
Although these theowers can be noted that Row-Gedo, Lero Jahrezar.
The DO Illinois is a long-ends in People of the Katkon Hatbay.
Although the SWARC Agency's TDHS was a wound of fan-shands and the
TDHS does not be aware of the theore
────────────────────────────────────────────────────────────────────
Prompt: 'def fibonacci(n):'
────────────────────────────────────────────────────────────────────
def fibonacci(n):ni, 0.0, 1.0), $$ is a gentle and gentle. Our
bottom-sides of puvillan or a core bottom sitis to a dejr. As the
"Aputing Psychology" of Rezo and Rezo G. J. Cath, Mantao B.
Balnic, Layerz, Bamhro Danick Moschy. Surgical B. Tanzi, Aputing
Marro, J. Cath-G. Hidrative Liaty Watning, Aputing Motori, Donke
Surgical Bolhma, Banan B. Surgical G. Malya, J., Laelonon G.
────────────────────────────────────────────────────────────────────
Prompt: 'The Wikipedia article states that'
────────────────────────────────────────────────────────────────────
The Wikipedia article states that the Buugal Prover Panigching in
a Molel Buugal Bitcoin-Samace, is to determine the form of an aim
of an inrairds on the Panigmen-Samace of the Buugal Ciocongar. The
Panigchings of the case was an important number of nasiilhous, the
case also was a bym was a literal aim of the Ciocongar. Loving to
Tetrifana, it has been a aim of the Molel Ciocongar. The theower
of the Tetrifana, the bym was ineffective. The Panigching and the
daily feser, which is an inserry with the Drondbest and the Ciocon
────────────────────────────────────────────────────────────────────
Prompt: 'In conclusion,'
────────────────────────────────────────────────────────────────────
In conclusion, but there is a most relevant time. This is not a
majority of stories, even though this is more than a few times.
A.D. Non-D. NOO RRBTT) is an “breaka” in the Pr'Sen, as the audim,
that the Southern F. Troxis's Beldeer J. Sect. Nor's audim, a
Markown is the Lelicy Mull, the Sect. Nor's Senate. The Sifland
Stadium is concluded in the liber and this does not
incompetifians. The VA Heny's Oesp, which was used to be felt on
the Main, we have a push, and there was no reasonsred to the
stories. The Elanzi,`
Used ALBERT factorized tokenizer embeddings with 1024 vocab for that one
But going down to 175k param requires switching to 256 token byte vocab
Plus the actual tokenizer runs were weird
Now I kinda want to make a mini-LLM on my 3090
fyi byte level models are extremely inefficient
at learning
data needs to be compressed way harder
don't byte models have not enough entropy to actually get good probabilities per token?
I can pretrain this one at 256 batch size (real batch size, not grad accum) on my 3070ti laptop
yea
As is the tradeoff of making something 1,000,000 times smaller than gpt3.5
I could probably pretrain a much bigger one at a still reasonable batch size
byt5 is barely able to generate remotely coherent text
i would think the tokens are the ones you don't want to skimp on
and that one was pretrained on C4
the whole point of tokenization is to compress information
yes but you can compress too far and fundamentally then you don't have enough space to encode the probabilities you need
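To make the trade-off concrete, here is a tiny sketch comparing sequence lengths for the same sentence. Whitespace splitting is only a crude stand-in for a real subword tokenizer such as BPE (an assumption for illustration), but it shows why a byte-level model has to process several times more tokens for the same text.

```python
text = "the whole point of tokenization is to compress information"

byte_tokens = list(text.encode("utf-8"))   # one token per byte, vocab size 256
word_tokens = text.split()                 # crude stand-in for a subword tokenizer

print(len(byte_tokens), len(word_tokens))
print(len(byte_tokens) / len(word_tokens))  # bytes per word-level token
```

The byte vocabulary is tiny, so the embedding table is cheap, but every sequence gets longer and each token carries less information.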
If byte tokens are too icky then I’ll try going back to the BPE tokenizer but that’ll require making the model itself even smaller
I wonder what a 1M-100M param LLM would do
blabber
Gemma-3-270m

The byte model speaks like konii
you already seen it like 2 years ago
in conclusion, i'm having a stroke
But back then I sucked at making data and the code also sucked
https://huggingface.co/Luigi/Falcon-H1-Tiny-Multilingual-100M-Instruct-GGUF surely this thing is crazy wicket smaht
175k could be doable if you really tune your token count and specialize it towards a particular style of output
Spelling loss is making suspiciously word shaped words 
i don't think you can get a general model out of it
If I didn’t know English I’d believe it
like those videos of what english sounds like to non speakers
theyre out of a job now
lol true
i have my discord message data
116mb
i would probably not go less than the common english vocabulary in token count, or maybe half of it
so 1-3k tokens
I’m sooooo gonna finetune this on gpt-3 and 3.5 to make ChatGPT run on 1mb vram lmao
"For desktop/server"
113MB
Give us the tokens per second speed coward
"only use if you have a high-end machine with at least an Intel Pentium 4 and 256MB RAM"
sharegpt my beloved
Literally my potato phone could run that thing
Wait a minute
I could totally pretrain the 175k model on my phone
And that would be really really funny
train a llm 

that would indeed be quite funny
I don't have that
32gb ram
I’m using a 4b token subset of the pile so I could store the entire thing easily
You can export it

Or do it the based way
But that’s for whispering about during recess, not public chat
ryzen 8 12380f
theoretically yes
What do you mean "theoretically"
yeah wdym theoretically
Idk what he meant
It’s in settings
Is a single button
you can just request your data in settings
for me its just messages
discord may have different controls depending on your location, like most apps today
Well we'll see what happens
"privacy farmers" sounds about right 
ye was about to say
hi
that's NOT how I would've worded it 
truly the 差點猜中 ("almost guessed it") moment of all time
I am doing work tho 
like 350 was what i was getting and i was like aw that's not exciting
gpu was disabled
cpu only
I think taiwan and china share a writing system
Or are they really that different?
They are both mandarin right?
dont say that over there
still getting like 400 only, querying completions it'll only poop out a fibonacci numbers sequence for 100 tokens no matter how long i set the response
io limit of some sort
2026-05-11 11:55:07 [DEBUG]
slot print_timing: id 3 | task 900 |
prompt eval time = 5.25 ms / 2 tokens ( 2.63 ms per token, 380.73 tokens per second)
eval time = 252.91 ms / 99 tokens ( 2.55 ms per token, 391.44 tokens per second)
total time = 258.17 ms / 101 tokens
slot release: id 3 | task 900 | stop processing: n_tokens = 100, truncated = 0
srv update_slots: all slots are idle
2026-05-11 11:55:07 [DEBUG]
LlamaV4: server assigned slot 3 to task 900
I really have no idea how different their language is 
I know mandarin has dialect and all but not to the point of actual different language altogether
are u from china
i have an idea
auxiliary 16k vocab tokenizer, stolen from an existing model
give it two output heads and just discard the BPE embeddings and head for inference
still 175k parameters but now hidden states are more aligned with full(ish) words
Special Agent double-o-@stark needle, any thoughts?
why not just mmap embedding table on disk
like gemma 3n/4
still counts as model weights
memory isnt issue, low parameter count is goal
wait i know
wait no nvm
dududududududududududu
sandstorm
lutel 
upward project embeddings
best possible setup i think
neur obuxket
⬅️ 

what the fuck is she saying

me schizo
Yay

when kernel eop exploit that can work with my phone
2018 kernel 2021 security patch

i was thinking about it

3627 because it has six seven
it’s the chip ye
🔢-4️⃣ 🔤

That requires the malware to be installed via other means tho
So its just another type of malware rather than vulnerability
Today I found out
Ubuntu is equally as bloated as Windows is
ISO size 6GB for both
Install size around 25GB for both
4/6GB RAM needed
clearly this is your sign to move to nixos
I need Ubuntu on my laptop temporarily for if I need to go to an exam where Ubuntu is an acceptable OS
isnt there like a setting when installing ubuntu of how much shit u wanna download
ubuntu is poop

you get a choice between "full" and "minimal" I think?
one installs a bunch of random apps
the other just has a web browser, a terminal and a file explorer
yea
my raspberry pi has like a 16gb sd card
and it works with ubuntu
(im not using raspberry pi slop os)
What that
keep embeddings at like dimension 4-8
and then have a single shared ffn that projects that to 64 or whatever
so ur embeddings are small
but ur model isnt complete shit
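The saving from small embeddings plus a shared upward projection can be checked with simple arithmetic. This is a sketch under assumptions: the "shared ffn" is modelled as a single linear projection, the model width of 64 comes from the message above, and `d_embed=8` is the upper end of the suggested 4-8 range. The vocab sizes are the ones mentioned in the chat (256 bytes, 1024, and a 16k auxiliary tokenizer).

```python
def full_embedding(vocab: int, d_model: int) -> int:
    # One d_model-wide vector per token.
    return vocab * d_model


def factorized_embedding(vocab: int, d_embed: int, d_model: int) -> int:
    # Small lookup table plus one shared projection up to the model width.
    return vocab * d_embed + d_embed * d_model


for vocab in (256, 1024, 16384):
    print(vocab, full_embedding(vocab, 64), factorized_embedding(vocab, 8, 64))
```

The projection cost (`d_embed * d_model`) is paid once, so the larger the vocabulary, the closer the saving gets to a factor of `d_model / d_embed`.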
goodmorning
The views from my morning
8pm 
i had a litle eep

selep
which precision is @koniifer
7
13
2
FP80 E15M63
neuroCatUuh
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
sure i guess
☆♡《♥📺 idol~★》♡☆
3
6
3
neurt
neurt
Would you rather breathe manually instead? 
has anyone seen this firefox bug before? somehow neuro's blog site triggers it
Arg
No, but can you please blink manually for me instead?
Oh, and you have to hold your jaw shut
Hocus pocus, your tongue is now in an uncomfortable position 
damn is it bad
or not
idk
im not doctor
its not severe at all
i dont know any person in my family that died from it
its not a medical issue, its just kinda weak
An endocrinologist is responsible for evaluating diabetes, bone loss, and a range of hormonal issues, including hormones from the pituitary, adrenal, and thyroid glands as well as reproductive organs
i see
the neuro site is flawless and runs perfectly fine

only about 30% on my machine
Do they not have school in Belgium

we hebben een serieus probleem (we have a serious problem)
nee (no)
for whatever reason firefox is using my igpu
on my laptop

seems like a good thing 
but yeah it looks like the site makes your gpu unhappy
Wow, a website thats poorly optimized
"Do you have any idea how little that narrows it down?"
Most sites??
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
sauron 
Yeah that was the joke/play-
i guess in terms of women, there's shelob
bguu

who would you want to be?
cuz there arent many women in the cast looking back on it
and arwen
ar

Goldberry?
npluu
what does pluu even mean 

i have no fucking clue
the moment i start to understand what she's saying based on vibes, new words get added to the konii lexagon
started with
✅, somehow ended up with pluu and ying yang and idk what
even as someone with autism, its hard to decipher at times
might be a british dialect of autism
The Chinesification of Konii (viewer discretion is advised)
Apparently memory usage increasing quickly enough can crash Discord
fr
someone needs to do an entropy chart on koniis messages over months or something
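A minimal sketch of what such an entropy chart could be built from: character-level Shannon entropy of all messages per month. The `Contents` field matches the scripts posted later in the chat; the `Timestamp` field and its `YYYY-MM-DD HH:MM:SS` format are assumptions about the export, and the sample messages are made up.

```python
import math
from collections import Counter, defaultdict


def char_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    counts = Counter(text)
    total = sum(counts.values())
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def entropy_by_month(messages: list) -> dict:
    by_month = defaultdict(list)
    for msg in messages:
        # Assumed timestamp format: the first 7 characters are "YYYY-MM".
        by_month[msg["Timestamp"][:7]].append(msg["Contents"])
    return {month: char_entropy("\n".join(texts)) for month, texts in sorted(by_month.items())}


# Made-up sample data, only to show the shape of the input.
sample = [
    {"Timestamp": "2025-06-01 10:00:00", "Contents": "aaaa"},
    {"Timestamp": "2025-07-15 18:08:03", "Contents": "abab"},
]
print(entropy_by_month(sample))
```

Plotting the resulting dict over time would give the chart; character-level entropy is only one possible measure, and a word-level one may be more telling.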
i need to use my new local llm to research this shit
@samgpt

I wonder how I should torture my gpu next
ye
ive been wondering
how do i make training happen on data?
what format does the data need to be in?
i need it to be as dumb as me
[
  {
    "role": "user",
    "content": "Hello"
  },
  {
    "role": "assistant",
    "content": "Hi. How can I help you?"
  },
  {
    "role": "user",
    "content": "Explain what an API is."
  },
  {
    "role": "assistant",
    "content": "An API is a way for different software systems to communicate with each other."
  }
]
this format is expected by almost every training framework
Convert
how?
Python
well good ph
cking luck lmao
I mean
but ye python
Afunyun has his own fully working one to my knowledge
Your other options are diy and ChatGPT
There is also the fact that you only have your own messages
do you want a series of really terrible suggestions that will potentially anger shadow
So you only have a single turn per conversation
And thus nothing to train the model on
Ie the text completion regression we talked about last time you were testing this
So your options are:
somehow get the full conversations (slow and against tos)
use LLM to synthesize fake messages for your real messages to be in response to (slow and imperfect)

tbh its basically the last point kaine said
but a bit more advanced
group messages by channel, and then take blocks of messages that are nearby in time and generate messages to make a conversation
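The grouping step could look roughly like this. It is a sketch, not a tested pipeline: the `Timestamp` field name and format are assumptions about the export (only `Contents` appears in the scripts in this chat), the 10 minute gap is arbitrary, and the user turns are left empty because they would still have to be real or synthesized.

```python
from datetime import datetime, timedelta


def group_into_blocks(messages: list, max_gap: timedelta = timedelta(minutes=10)) -> list:
    """Split one channel's messages into blocks separated by quiet gaps."""
    ordered = sorted(messages, key=lambda m: m["Timestamp"])
    blocks = []
    previous = None
    for msg in ordered:
        when = datetime.strptime(msg["Timestamp"], "%Y-%m-%d %H:%M:%S")
        if previous is None or when - previous > max_gap:
            blocks.append([])          # start a new block after a long silence
        blocks[-1].append(msg["Contents"])
        previous = when
    return blocks


def to_chat_sample(block: list) -> list:
    # User turns are placeholders to be filled with real or generated text.
    sample = []
    for text in block:
        sample.append({"role": "user", "content": ""})
        sample.append({"role": "assistant", "content": text})
    return sample


# Made-up messages from a single channel.
demo = [
    {"Timestamp": "2025-07-15 18:08:03", "Contents": "a"},
    {"Timestamp": "2025-07-15 18:09:00", "Contents": "b"},
    {"Timestamp": "2025-07-15 19:30:00", "Contents": "c"},
]
print(group_into_blocks(demo))
```

Run it once per channel so that blocks never mix messages from different conversations.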
Okay well now I’m interested too
wym thats literally what i would have done

idk its a 50/50 if its a genius llm invention or the dumbest shit imaginable
I need to make the bred #programming markov chain
what if
you ask for everyones discord message bundles

then mix them together and order the messages by time
but usually if u only had one side of messages i would have done reinforcement learning but thats hard to set up for beginners
group messages into conversations
Consent is slow
I'd contribute my #programming messages
and everything in the arg channels for good measure
(only the premium chatters)
i would do it if i had the data 
llm trained on only ying and pluu
I would if it weren’t illegal
Unless you talk to yourself in text channels, there will have to be other user(s) involved, whether real or fake
Actually
Hold
folder preview jumpscare
looking at the discord data they give you
they really do have everything huh
nothing bad is in there, but i do kinda die inside seeing my old messages

my discord account is pretty old
some of these messages are from when i was 13
and i kinda want to nuke the files from messages with certain people
fair
I need this ran at very high temp on a throwaway user
:neuroCatUuh:
Just to see how cursed it gets
2025-07-15 18:08:03 i used the word everynyan 
Hello everywan 
hello everysam

awa
i mean like my address and stuff
all we have to do is test a few numbers and eventually it’ll autocomplete your address 
it doesnt exist yet luckily
chat I had a reading/writing exam (of sorts) today
topic was agi
paragraph 2 I just see a sam altman quote
absolute cinema
import json
import re

import markovify

# Collapse custom Discord emoji like <:name:1234> or <a:name:1234> to :name:
DISCORD_EMOJI_REGEX = re.compile(r"<a?:(\w+):\d+>")
# Strip URLs (optionally wrapped in <...>) together with one leading space
LINK_REGEX = re.compile(r" ?<?https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&//=]*)>?")

with open("messages.json", encoding="utf-8") as r:
    messages = json.load(r)

# Each message becomes one pre-split "sentence" (a list of tokens)
model = markovify.Text(
    None,
    parsed_sentences=[
        LINK_REGEX.sub("", DISCORD_EMOJI_REGEX.sub(r":\1:", msg["Contents"])).replace("\n", " \n").split(" ") for msg in messages
    ],
)

# print("\n".join([str(_) for _ in model.chain.model.keys()]))
for i in range(5):
    # new = model.make_sentence()
    new = model.make_sentence_with_start(":neuroPogHD:", strict=False)
    print("---------")
    print(new)
easier than expected
runs pretty fast on ~18k messages
may i yoink this?
yeah
you need to pip install markovify
(changed a single line so it doesn't print the entire vocab)
Make a konii markov chain

I don't have konii messages
i wonder which one of these folders is the #programming one 


11.8mb
🤔 isnt that the claude code one iirc
you can also make it not require
at the start
I just found it funnier
bro echo has deadass regurgitated some ultra doomposting from an old relationship at least once when he was overcooked supreme

i nearly died seeing it
i dont remember being that pedantic about trademark infringement 
its not your name in hebrew perchance? 
jsut is so real
this is hilarious
wait what do i need to do to run this
your discord messages and the python script
wtf was this last thing even?
aight i finished my refactoring and i murder my queen, this allows me to
Laptop bluescreens
Laptop restarts
"Updates are underway"
Very subtle microsoft
i get that i was rudely intercepted by a bluescreen, but i was murdering a queen???
how long did data export take for u sam
a couple hours
---------
Oh so you breathn't and there's no jpreg
---------
too much :welpsagiri:
this is the most fool-proof is 4
---------
:DIESOFCRINGE: i already have it
---------
fuck im doing this ai shit ran locally ye
---------
ye, those are not words i could pregenerate the entire scene in a week ago
jpreg???
im sure i meant jpeg lmao
sagiri 
truly the anime of all time
i dont even think about the anime it comes from when posting the emote tbh
bro i love markov chains sm
I'm working on the twitch/OBS integration, set up plugins and all that and it may legitimately be worse than building the actual AI in the first place
Idk why but I've always hated OBS
me with gimp
So fucking real
and docker
I used to have a side hustle making avatar tattoos in vrchat with gimp
It was like a money printer, shit was so easy
every time i use gimp that fucking thing pisses me off so bad it's unreal
Crime is illegal, but it only 900???

so cheap!
What's your issue with gimp, if I may ask
looking at my older messages i was thinking the llm would be insufferable. but the neurocord ones seem to be funny
Streamlabs is gonna be how I hook up Zero bc it has API integ
the UI is just completely unintuitive
So right now Im unaware of better alternatives
Suspiciously obs websockets api shaped obs
it's just i cannot fucking do what i want to do no matter what
@olive sable
```python
import json
import re

import markovify

DISCORD_EMOJI_REGEX = re.compile(r"<a?:(\w+):\d+>")
LINK_REGEX = re.compile(
    r" ?<?https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&//=]*)>?"
)
# Channel, role, nickname and user mentions: <#id>, <@&id>, <@!id>, <@id>
MENTION_REGEX = re.compile(r"<(?:#|@&|@!|@)\d+> ?")
SPLIT_POINTS = re.compile(r"[ ./]")

with open("messages.json", encoding="utf-8") as r:
    messages = json.load(r)

model = markovify.Text(
    None,
    parsed_sentences=[
        SPLIT_POINTS.split(
            MENTION_REGEX.sub(
                "",
                LINK_REGEX.sub("", DISCORD_EMOJI_REGEX.sub(r":\1:", msg["Contents"])),
            ).replace("\n", " \n")
        )
        for msg in messages
    ],
)

for i in range(5):
    new = model.make_sentence()
    # new = model.make_sentence(max_overlap_ratio=0.99, max_overlap_total=300)
    # new = model.make_sentence_with_start(":neuroPogHD:", strict=False)
    print("---------")
    print(new)
```
it's always pain
this might help it a bit
tl;dr if it picks up too many one-off phrases it becomes stupid
I know I can do OBS websockets, but for alerts and stuff like that (which I want her to react to), can I even set that up in OBS?
it's because i have used not gimp but paint.NET for 15 years prior to ever picking up gimp
Dougdoug does his stuff with pygame and obs
If he can, you can
Fair enough
so if i file is 73 bytes, why are we supposed to be financially responsible 
pygame 
I'll look into it
aituber 
The like V1 goal is get her playing something very simple but engaging
pygame mentioned twice?
weird coincidence
I wanted to do pokemon but its like impossible to decode the RAM addresses for the LUA bridge
Ye
cant u just use whatever harness was used for gemini/claude
No, Zero is hand built. It needs to be tailored to her backend specifically.
What I need is a memory map, and one just simply does not exist.
https://github.com/nichosta/GeminiPlaysPokemonLive did u check what they did in here
Harness for Gemini model to play Pokemon FireRed/LeafGreen/Emerald with Twitch assistance. - nichosta/GeminiPlaysPokemonLive
its so funny
There are documented addresses, but not enough
because #programming is like <1/8 of my total messages
Lemme look, it might be a useful resource
so most of the outputs here aren't even relevant to here
programming is most of mine
Yeah so I've seen this repo before
GBA is SIGNIFICANTLY easier than DS, but I have no interest in doing GBA titles.
oh wait ur doing DS?
holy based
---------
they gave us the mod team, not join them!!!!
---------
if you rotate the world of cheese
---------
and if i want 5%
---------
im at my monitors should be fine
---------
the B580 has a smaller piece of paper you gotta justify the 6
rotate the world of cheese 
you gotta justify the 6 
Yeah
@floral hawk are you using an emulator
Gotta justify the 6 in B580
why not just give raw controller access
Zero already has an SDK for bizhawk but the bridge is the issue
She still needs to read gamestate somehow, if there is a way around that I'm all ears.
why do you need memory addresses
Yee bizhawk
nvm
To read precise game state
oh sam I have just made a great discovery

I'm so excited to announce a new terminal emulator! 🎉
Meet "Ratty"🐀
🧀 A GPU-rendered terminal emulator with inline 3D graphics.
🪤 Try it out: https://t.co/6AehEmblXT
⭐ Source: https://t.co/e5ytdgs2Gj
#rustlang #terminal #ratty #ratatui #opensource
new = model.make_sentence(min_words = 50, tries = 1000)
well neuro platinum was gen 4 does that have any info
cant u read it from like
the save file
or something
Neuro played platinum??
In theory yes, but it's a lot of noise
its a bit of a shame that the splits are kinda broken
my markov is so dumb
do what you need to ridicule you
---------
3600 isnt that code looks ass
that code looks ass 
what about just raw image input
maybe i should fix that
SIMA 2 is raw image input
cmon bruh
If I were using a larger vision model probably, but I'm running a 500m. It's sufficient but not consistent, unless I happen to build a really really strict prompt maybe?? That's definitely an idea.
wont work, visual input doesnt have spatial, ocr, AND recognition on small models
500M WOULD be enough for general reactions, but to truly understand game state it gets shaky
we dont even need an llm, this is already dumb enough to be indistinguishable from the real sam 
I'm assuming Neuro's platinum stuff is on github?
I actually tried visual input for smth similar on my ai (vtuber ish) to recognize buttons on a website but it has no spatial understanding
and from what ive heard ur trying to make it play every aspect of pokemon right
not like neuro platinum where vedal controlled the overworld and neuro only did battles
Yeah, I want it to be fully autonomous in anything it does
yeah good luck with local models, claude and gemini 3 pro were barely able to do it
Right now she can open OBS, do all checks, and start the stream herself.
and they were playing easy games
yeah but gemini has the robotics spatial awareness advantage
they pretrained the shit out of it
You aren't wrong.
ROBOTICS MODEL MENTIONED
how large of a llm are u running
half of my markov chain seems to be random code
new or old 8B
New
Llama 3.1, but it's been abliterated
The only guardrails are what I code it not to do.
llama 3.1 sucks a lot for agentic tasks it's like 2 years old now
try qwen 3.5
or ministral
question—even if you do manage to set up all the sockets, how will you make it do pathfinding in the overworld
sam I keep making it dumber
without setting both R_EN and L_EN at the konii level of software support as cuda afaik
konii is a level of software support it seems
I can't use Qwen, I definitely tried.
same
For tech constraint reasons

I haven't gotten to that issue yet, but I'll find a way.
import json
import re

import markovify

DISCORD_EMOJI_REGEX = re.compile(r"<a?:(\w+):\d+>")
LINK_REGEX = re.compile(
    r" ?<?https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&//=]*)>?"
)
# Channel, role, nickname and user mentions: <#id>, <@&id>, <@!id>, <@id>
MENTION_REGEX = re.compile(r"<(?:#|@&|@!|@)\d+> ?")
# Split after these characters; "~~" is an alternative, not part of the class
SPLIT_POINTS = re.compile(r"([\/\\\.\*\'\<\>\[\]]|\~\~)")
# SPLIT_POINTS = re.compile(r"[ \./]")

with open("messages.json", encoding="utf-8") as r:
    messages = json.load(r)

model = markovify.Text(
    None,
    parsed_sentences=[
        SPLIT_POINTS.sub(
            r"\1 __JOINER__ ",
            MENTION_REGEX.sub(
                "",
                LINK_REGEX.sub("", DISCORD_EMOJI_REGEX.sub(r":\1:", msg["Contents"])),
            ).replace("\n", " \n")
        ).split(" ")
        for msg in messages
    ],
)

for i in range(25):
    # new = model.make_sentence()
    new = model.make_sentence(min_words=50, tries=1000)
    if new is None:
        continue
    # new = model.make_sentence(max_overlap_ratio=0.99, max_overlap_total=300)
    # new = model.make_sentence_with_start(":neuroPogHD:", strict=False)
    print("---------")
    # Drop the joiner markers so split words are glued back together
    print(new.replace(" __JOINER__ ", ""))
wrr
the gemini ones had to make a sub agent pathfinder cause the model literally couldn't figure it out on its own
this one is a bit better again
tl;dr it handles code a lot better
there's a few unresolved bugs
that was my thinking too
unfortunate for the field of ai though
if i had the compute i would have started training my raw controller use model
It's to a point that I'm probably gonna shelf pokemon atp and find something else until I have a proper SDK/API that I can plug it into.
but looks like 256gb vram isnt enough
Even going to romhacking servers they told me what I was trying to do was a suicide mission
😭
pokemon just needs tm variety of things that it’s reaching agi levels of needs
I already have her playing pokemon showdown and winning battles tho.
mine is cooked
it outputted
| ||
\_\_\_\_
\_\_\_\_\_X\_\_\_
\_\_∆>X\_\_\_X\_G
\_\_\_
\_\_\_X\_\_\_X_G
\_\_X\_\_
\-

ok thats good at least
Yeah. If I had stronger hardware I could brute force a solution, with better vision or something, but working with what I am on a smaller model I have to lean on my backend and that's my problem
what hardware do u have
add extra stuff to SPLIT_POINTS if you feel like it should break words based on them
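For anyone extending `SPLIT_POINTS`, here is a small standalone sketch of the joiner round trip: a marker token is inserted after each split character so the chain sees the pieces as separate tokens, and removing the marker afterwards glues them back together. The character class here is a simplified one chosen for illustration, not the exact set from the script.

```python
import re

JOINER = "__JOINER__"
# Simplified split set for illustration; extend the character class as needed.
SPLIT_POINTS = re.compile(r"([./()\[\]]|~~)")


def to_tokens(text: str) -> list:
    # Put a joiner marker after each split character, then split on spaces.
    return SPLIT_POINTS.sub(rf"\1 {JOINER} ", text).split(" ")


def from_tokens(tokens: list) -> str:
    # Rejoin with spaces and delete the markers to restore the original text.
    return " ".join(tokens).replace(f" {JOINER} ", "")


tokens = to_tokens("model.make_sentence(tries) is slow")
print(tokens)
print(from_tokens(tokens))
```

Any character added to the class must be escaped if it is special inside `[...]` (such as `]` or `\`), and multi-character markers like `~~` have to go outside the class as alternatives.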
12GB Vram, 32GB DDR4
no personally it’s not hardware the upper limit of agi rn cant do it I mean
vedal probably would have if he was able to with neuro platinum
Honestly that's a fair take.
because a) it plain doesnt work or b) its just fricking boring when neuro plays open world
But I'm the type to feed into my delusions until shown otherwise.
and secondary source of the day they're overwriting your framebuffer data a **°▽°*);
``` \**\°\▽\°\*
If I had the ram addresses, the thing is, it would work
But that involves reverse engineering a Nintendo product
And the work/payoff ratio is super fucking skewed
unity only costs moeny once you get 2 3090's 
we love nintendo
Also the boring part could definitely be a factor, it for sure wouldnt be efficient.
sam its so cooked help
i cant help mine is also cooked
@trim valve is this gonna train on all my messages from everywhere
or just from 1 chat
just one chat
im gonna append all jsons
but I've been manually concatenating chats into one json
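Concatenating by hand can be replaced with a few lines. A sketch, assuming each chat sits in its own folder with a `messages.json` that holds a JSON list, as the scripts above expect; the folder layout and the `export_root` path are placeholders, so adjust them to what the export actually contains.

```python
import json
from pathlib import Path


def find_message_files(export_root: str) -> list:
    # Assumed layout: one messages.json per chat folder somewhere under export_root.
    return sorted(Path(export_root).rglob("messages.json"))


def merge_message_files(paths) -> list:
    """Load several exported message lists and return one combined list."""
    merged = []
    for path in paths:
        with open(path, encoding="utf-8") as f:
            merged.extend(json.load(f))
    return merged


# Example (placeholder paths):
# combined = merge_message_files(find_message_files("path/to/export"))
# with open("messages.json", "w", encoding="utf-8") as out:
#     json.dump(combined, out, ensure_ascii=False)
```

Passing an explicit list of paths to `merge_message_files` instead of using `find_message_files` keeps it limited to the chats you actually want in the model.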

nintendo’s main product is dmca takedowns
so I have 3x chats in here

i have 100k messages in DMs with certain people
🇪🇸
yes
my colourblindness is supposedly 150MHz, so i need to get the location thing
thats some fast colourblindness 
oh
LOCATION
if you're using it wrong, so am i
location
schizo
tomatos
location
sam location your location model location might location be locationing
fim suffix
location my location
I inspire all of you to be evil
import json
import re

import markovify

DISCORD_EMOJI_REGEX = re.compile(r"<a?:(\w+):\d+>")
LINK_REGEX = re.compile(
    r"<?https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&//=]*)>?"
)
MENTION_REGEX = re.compile(r"<(?:#|@|\@\&|\@\!)\d+>")
SPLIT_POINTS = re.compile(r"([\/\\\.\*\'\<\>\[\]\=\_]|\~\~)")
# SPLIT_POINTS = re.compile(r"[ \./]")

with open("messages.json", encoding="utf-8") as r:
    messages = json.load(r)

# Assumption: the tab swap targets double spaces (code indentation), so that
# ordinary single spaces still separate words in split(" ").
sentences = [
    SPLIT_POINTS.sub(
        r"\1 __JOINER__ ",
        MENTION_REGEX.sub(
            "",
            LINK_REGEX.sub("", DISCORD_EMOJI_REGEX.sub(r":\1:", msg["Contents"].replace("  ", "\t"))),
        ).replace("\n", " \n"),
    ).split(" ")
    for msg in messages
]
model = markovify.Text(
    None,
    parsed_sentences=list(filter(lambda e: len(e) > 0, sentences)),
)

for i in range(25):
    # new = model.make_sentence()
    new = model.make_sentence(min_words=50, tries=1000)
    if new is None:
        continue
    # new = model.make_sentence(max_overlap_ratio=0.99, max_overlap_total=300)
    # new = model.make_sentence_with_start(":neuroPogHD:", strict=False)
    print("---------")
    # Remove joiner markers and turn tab tokens back into double spaces
    print(new.replace(" __JOINER__ ", "").replace("\t", "  "))

10.30pm dinner 
bon apetit or whatever english people say
i was gonna have it earlier

bred was earlier than me cuz of the alibaba ones are good tho
---------
bred out here leaking everything
---------
bred is back to fixing sdl
---------
bred the origin move around into functions for me
---------
bred needs to be banned for ddosing
why the fuck is it taking so long
cuz discord slow
fatass discord
mine codes so much less on just #programming messages
isn't bon apetit french
yes
probably a fairly large memory usage of the triangle labelled "sRGB" represents all the info you'd mitm your pcs connection so you'd expect more cpu per tick that you have a random project I was suggesting doing it as just bening a usb A -> after gamma correction (I don't like webms im not really sure

bred is a DFT?
im 99% sure this is supposed to be Discrete Fourier Transform

are there position encoding markov chains
hm?
smh bred out here leaking everything, bred needs to be banned for ddosing too
bred needs to cook for some reason, il fix it later, never fixed it
never fixed it 
bbut you toast bread-
or bake it
bred is making dinner tho
toast arc mentioned 
bred has your ip address 
that lines up with everything actually
pluu
bred but just waste most of the estrogen
not the estrogen
ik im done with the bred ones, they seem to be looping now
do shadow
lmao

shadow would be expensive
---------
shadow literally has a full moon
---------
shadow didn't we talking about woman: :catdespair:
---------
shadow on the batch
---------
shadow live in the military sir?
didnt we talking about woman

import json
import re
import markovify
DISCORD_EMOJI_REGEX = re.compile(r"<a?:(\w+):\d+>")
LINK_REGEX = re.compile(
    r"<?https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&//=]*)>?"
)
MENTION_REGEX = re.compile(r"<(?:#|@|\@\&|\@\!)\d+>")
SPLIT_POINTS = re.compile(r"([\/\\\.\*\'\<\>\[\]\=\_\(\)\;]|\~\~|\:\:)")
with open("messages.json") as r:
    messages = json.load(r)
sentences = [
    SPLIT_POINTS.sub(
        r"\1 __JOINER__ ",
        MENTION_REGEX.sub(
            "",
            LINK_REGEX.sub("", DISCORD_EMOJI_REGEX.sub(r":\1:", msg["Contents"].replace(" ", "\t"))),
        ).replace("\n", " \n"),
    ).split(" ")
    for msg in messages
]
model = markovify.Text(
    None,
    parsed_sentences=list(filter(lambda e: len(e) > 0 and all(len(x) < 32 for x in e), sentences)),
)
for i in range(25):
    new = model.make_sentence(min_words=50, tries=1000)
    if new is None:
        continue
    print("---------")
    print(new.replace(" __JOINER__ ", "").replace("\t", " "))
last revision i swear
@trim valve try bigram model?

i think by default it considers the last 2 tokens
I've tried, but anything higher than 3 is kinda trash
we would need byte pair encoding
tokenizer
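A rough sketch of why higher orders go "kinda trash" on a small corpus: the longer the state, the fewer places the chain can branch, so it mostly replays training messages. This is a pure-Python toy, not markovify's implementation; `build_chain`, `generate`, `all_sentences` and the three-line corpus are made up for illustration.

```python
import random
from collections import defaultdict

START, END = "<s>", "</s>"  # hypothetical boundary markers for this toy


def build_chain(sentences, state_size=2):
    """Map each tuple of `state_size` tokens to the tokens seen right after it."""
    chain = defaultdict(list)
    for tokens in sentences:
        padded = [START] * state_size + list(tokens) + [END]
        for i in range(len(padded) - state_size):
            state = tuple(padded[i : i + state_size])
            chain[state].append(padded[i + state_size])
    return chain


def generate(chain, state_size=2, rng=random, max_tokens=50):
    """Random-walk the chain from the start state until END or max_tokens."""
    state = (START,) * state_size
    out = []
    while len(out) < max_tokens:
        nxt = rng.choice(chain[state])
        if nxt == END:
            break
        out.append(nxt)
        state = state[1:] + (nxt,)
    return out


def all_sentences(chain, state_size, max_len=20):
    """Enumerate every sentence the chain can emit (only sane for tiny corpora)."""
    results = set()
    stack = [((START,) * state_size, ())]
    while stack:
        state, out = stack.pop()
        for nxt in set(chain[state]):
            if nxt == END:
                results.add(out)
            elif len(out) < max_len:
                stack.append((state[1:] + (nxt,), out + (nxt,)))
    return results


corpus = [
    "bred is making dinner".split(),
    "bred is back to fixing sdl".split(),
    "shadow is making dinner".split(),
]
seen = {tuple(s) for s in corpus}
for n in (1, 2):
    novel = all_sentences(build_chain(corpus, n), n) - seen
    print(n, sorted(" ".join(s) for s in novel))
```

With one token of context the toy chain can splice "shadow is" onto "back to fixing sdl"; with two tokens of context every sentence it can produce is already in the corpus. Smaller tokens (like BPE pieces) give the chain more shared states to branch on at the same order.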

this one is cooked
konii is toast
konii is a `LG 49UB850V`, dubbed "not only the flying machine that flies

@sage crag hello not only the flying machine that flies

@olive sable can u try this:
import json
import re
import markovify
from transformers import AutoTokenizer
DISCORD_EMOJI_REGEX = re.compile(r"<a?:(\w+):\d+>")
LINK_REGEX = re.compile(
    r"<?https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&//=]*)>?"
)
MENTION_REGEX = re.compile(r"<(?:#|@|\@\&|\@\!)\d+>")
tokenizer = AutoTokenizer.from_pretrained("openai-community/gpt2")
with open("messages.json") as r:
    messages = json.load(r)
def clean_message(text: str) -> str:
    text = text.replace(" ", "\t")
    text = DISCORD_EMOJI_REGEX.sub(r":\1:", text)
    text = LINK_REGEX.sub("", text)
    text = MENTION_REGEX.sub("", text)
    return text
parsed_sentences = []
for msg in messages:
    text = clean_message(msg["Contents"])
    token_ids = tokenizer.encode(
        text,
        add_special_tokens=False,
    )
    if len(token_ids) == 0:
        continue
    # markovify expects string tokens, so store token IDs as strings
    parsed_sentences.append([str(token_id) for token_id in token_ids])
model = markovify.Text(
    None,
    parsed_sentences=parsed_sentences,
    state_size=2,
)
for _ in range(25):
    generated = model.make_sentence(
        min_words=50,
        tries=1000,
    )
    if generated is None:
        continue
    token_ids = [int(tok) for tok in generated.split()]
    text = tokenizer.decode(
        token_ids,
        clean_up_tokenization_spaces=False,
    )
    print("---------")
    print(text.replace("\t", " "))
needs pip install transformers

ill try this when I've cooked

this one does BPE
should output better ish text
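For anyone wondering what "does BPE" means here, a minimal character-level sketch of the merge-learning loop is below. It is a toy under stated assumptions: the GPT-2 tokenizer used in the script works on bytes with a fixed pretrained merge table, while `learn_bpe`, `most_common_pair` and `merge_pair` are hypothetical helpers that learn merges from a single string.

```python
from collections import Counter


def most_common_pair(words):
    """Count adjacent symbol pairs across all words and return the most frequent."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs.most_common(1)[0][0] if pairs else None


def merge_pair(words, pair):
    """Rewrite every word so each occurrence of `pair` becomes one symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def learn_bpe(text, num_merges):
    """Start from single characters and greedily merge the most frequent pair."""
    words = dict(Counter(tuple(w) for w in text.split()))
    merges = []
    for _ in range(num_merges):
        pair = most_common_pair(words)
        if pair is None:
            break
        merges.append(pair)
        words = merge_pair(words, pair)
    return merges, words


merges, words = learn_bpe("bred bred bred bread", num_merges=3)
print(merges)
print(words)
```

Frequent words collapse into a single token while rare ones stay split into pieces, which is why a chain built on BPE tokens can recombine fragments of words instead of only whole space-separated chunks.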
I made it worse for you
# pip install transformers markovify
import json as json_module
import re as regex_module
import markovify as markov_module
from transformers import AutoTokenizer as TokenizerLoader
class MessageTextProcessor:
    def __init__(self):
        self.enabled = True
        self.disabled = False
        self.value = None
    def convert_to_string(self, value):
        return str(value)
def create_regex_patterns():
    return {
        "discord_emoji": regex_module.compile(r"<a?:(\w+):\d+>"),
        "link": regex_module.compile(
            r"<?https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&//=]*)>?"
        ),
        "mention": regex_module.compile(r"<(?:#|@|\@\&|\@\!)\d+>"),
    }
def replace_double_spaces(text):
    while " " in text:
        text = text.replace(" ", "\t")
    return text
def clean_text(text=None, processor=MessageTextProcessor()):
    try:
        text = processor.convert_to_string(text)
    except Exception:
        text = ""
    regexes = create_regex_patterns()
    try:
        text = replace_double_spaces(text)
    except BaseException:
        pass
    try:
        text = regexes["discord_emoji"].sub(lambda m: ":" + m.group(1) + ":", text)
    except:
        pass
    try:
        text = regexes["link"].sub("", text)
    except:
        text = text
    try:
        text = regexes["mention"].sub("", text)
    except Exception as error:
        error = error
    return text
def read_file_contents(filename):
    handler = open(filename)
    try:
        data = handler.read()
    finally:
        try:
            handler.close()
        except:
            pass
    return data
def parse_json_file(filename):
    raw_data = read_file_contents(filename)
    raw_data = "" + raw_data + ""
    return json_module.loads(raw_data)
def load_tokenizer():
    return TokenizerLoader.from_pretrained("openai-community/gpt2")
messages = parse_json_file("messages.json")
parsed_sentences = []
message_index = 0
for message in messages:
    try:
        text = message["Contents"]
    except:
        text = ""
    text = clean_text(text)
    def tokenize_text(current_text):
        tokenizer = load_tokenizer()
        return tokenizer.encode(
            current_text,
            add_special_tokens=False,
        )
    try:
        token_ids = tokenize_text(text)
    except Exception:
        token_ids = []
    if len(token_ids) == 0:
        message_index += 1
        continue
    else:
        pass
    converted_token_ids = []
    for index in range(0, len(token_ids)):
        try:
            converted_token_ids.append(str(token_ids[index]))
        except:
            converted_token_ids.append(str(0))
    parsed_sentences.append(converted_token_ids)
    message_index += 1
def create_markov_model(sentences):
    try:
        return markov_module.Text(
            None,
            parsed_sentences=sentences,
            state_size=int("2"),
        )
    except Exception as error:
        print("uh oh", error)
        return None
model = create_markov_model(parsed_sentences)
def decode_generated_tokens(generated_text):
    tokenizer = load_tokenizer()
    token_strings = generated_text.split()
    token_numbers = []
    for token_string in token_strings:
        try:
            token_numbers.append(int(float(token_string)))
        except:
            token_numbers.append(0)
    try:
        return tokenizer.decode(
            token_numbers,
            clean_up_tokenization_spaces=False,
        )
    except Exception:
        return ""
generated_count = 0
while generated_count < 25:
    generated_count += 1
    try:
        generated = model.make_sentence(
            min_words=50,
            tries=1000,
        )
    except:
        generated = None
    if generated == None:
        continue
    else:
        pass
    text = decode_generated_tokens(generated)
    print("---------")
    print(text.replace("\t", " "))
bruh
im doing the one shadow sent first
this one is too long to read, what even changed
I made it worse
Intentionally
imagine git but built into discord codeblocks
vocab.json: 1.04MB [00:00, 19.9MB/s]
merges.txt: 456kB [00:00, 1.63MB/s]
tokenizer.json: 1.36MB [00:00, 23.7MB/s]
---------
No screeps go brbrbrbrbrbr:
4.549195 ]
return a.max.z);
float vertices for coord in vertex]
(t1, and i need to stack for some reason
---------
i love how black ops 7 released. how would it be different from gl to gpy lacks hdmi splitter with fumoquest releases i shoudl not look anti-aliasing should be good at producing slop for making cutscenes
---------
Nah, its smoother and you need to take the kids, but bluecreens, turning off g-sync doesn't completely like it is on the data every game loop area, "position vec: ", 3,352 AI TOPS
bind = SUPER, right is flirting for the framebuffer???
---------
the cheaper ones market 4K and stuff about i just took way too high with a lot of micro-sd cards are a scam is gonna know it is not a new mid-tier technique is known for its downfall so it costs just as litle do they cost?
---------
its cuz brick walls. in reality everyone just writes lb
:programmer: im in a few days when showing signs of crashing in linux so i have iris, which is he same materials as a creep with only cuz besides that i had to do portforwarding
and you need to take the kids
😭
anti-aliasing should be good at producing slop for making cutscenes

No screeps go brbrbrbrbrbr
gpy lacks hdmi splitter with fumoquest releases
right is flirting for the framebuffer
ye this one is cooked
---------
Apparently has a staircase, and colouring that 5950x + 1)
minX2, maxZ2 = self.config["checkpoint_name APP_NAME_EXPERIMENTAL
#12 0x000000000000 (pc 0x0, default
Capabilities: ➕ 4️⃣ 🟨 ⬜ 🟨 ⬛ 🟪
---------
5 digit fps is actually close to deadline im just trying to use obs but you're overwriting in accumulating stock ETFs.labwc";
pyrr
from src import game
-this meme was brought to you in general
---------
i dont like te console. and im thinking we should have never seen this camera doees do colour graded stuff on your phone to connect it to work on one side making it simpler to debug stuff once
GL_ARRAY_BUFFER. this angle is still bad
---------
you have done 50km in 3-4.564296] usb-c male for the really bang for the rest of his 12700k with pixel shift, and red making yellow doesnt make sense to me 2 hours and didnt connect the bell
---------
ooooooooohhhhhh
the physical and logical device selector, and then add shit to photons and then suddendly 2 cookies"
chair proceeds to fall off, but it cuz im so good they didnt show the wanted idaply the thingy to type good, but does nothing
-this meme was brought to you in general 
add shit to photons and then suddendly 2 cookies" 
🍪
🍪
Om nom nom
You've given me 4 cookies! | I've received 322853 cookies total!
Fumoquest mention nowaying


fumoquest
@real sierra
I'm still getting an email every hour
also, fumoquest mentioned
FUMOQUEST
🍪
Om nom nom
You've given me 22 cookies! | I've received 322855 cookies total!
i like the fact nixos has everything in a file but
i cant put cachyos kernel without compiling
wait no i have an idea
holy moly there's been a lot of these
konii is not None or nodes[node] = p - g.concatenate(vertexVbo]
if not hasattr(platform, "touch"):
Hit CTRL-C ports and support for that these prices get even worse injury
extremely true
he's trying to use "phi_t, ndim=2 -s USE_SDL_SCANCODE_DATE_KHR()" in the render thread, its the truth.
it's the truth
looks true to me
im not gay so i stopped watching the 2d 55 48 44 0a 20 20 20 20 20 20 20 01 a7
mine is an i5 3470
pluu

this translates to "-UHD" im pretty sure
so 4K makes you gay
according to this transformer
amazon puts them into an actual server server, but before that sounded smart
amazon truly ahead of the game
Sure is a hard knocks life for the slopper
same
om
lol what happened here (in the subtitles)
the amythest lol
my name is sam but you couldnt have that
we found a video but my 3rd word is pronounced "CHAY" (/tʃ/, nouninformal british british british british slurs and some stuff still tho
british british british british
Programming gripe time! I hate coding anything that points at someone elses website because even though Im being very careful and putting in like 5 second sleeps between each request, Im still worried that somehow I will have made a LOIC accidentally
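One way to make the "5 second sleeps" harder to get wrong is to put the delay in one place instead of scattering `sleep` calls around. A minimal sketch, assuming a single-threaded script: `Throttle` and `fetch_all` are hypothetical names, and `fetch` stands in for whatever HTTP call the script actually makes.

```python
import time


class Throttle:
    """Enforce a minimum gap between calls, e.g. requests to someone else's site."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock   # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last = None     # time of the previous call, None before the first one

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def fetch_all(urls, fetch, throttle):
    """Call `fetch(url)` for each URL, never faster than the throttle allows."""
    results = []
    for url in urls:
        throttle.wait()
        results.append(fetch(url))
    return results
```

Because every request goes through `throttle.wait()`, a retry loop or a second code path cannot accidentally skip the delay, which is the usual way a careful scraper turns into an accidental flood.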
FINALLY I DID IT
Woo!
Im still shocked how well they erased their cover of Anaconda from the internet. I remember finding 1 article that talked about it so I know its real but it is just gone
"do catgirls have rights on the same paradigm for everything, but this causes the entire world for a cheaper cooler.
ok im getting tired of this transformer stuff now ngl
Whatcha trying to do?
just messing around with my messages.json files from discord
now i have a slight performance gain
As are all things linux
And fair
linux is heaven
Linux is nice for projects. Sometimes I really just want things to work tho
well i dont think that's ok
Yeah slight error there I think
update fixed it
Slop engine you could have done it if you just used the fucking kernel image, it was provided to you for a reason :D
everything to make ping 2x lower
bbbbbbbbbbbb

Installing the Discord Linux app makes my GNOME freeze
I’m thinking about using Discord via the API, I think I can pretend I am a bot

🍪
Om nom nom
You've given me 5 cookies! | I've received 322866 cookies total!
I gotta try playing TF2 on Linux properly sometime
I did a benchmark and I'm getting a consistent 300fps
on the highest settings
Is there somebody who uses discord like IRC in a terminal?
I can't believe this is still going
Intel Xeon MAX 
Kek
the mods will eventually nuke us all for the repeated usage of the phi
:omemga

@🦚
a mother should accept her child for what they are, even if they like cock cpu
Buh
no child of mine will
we should tone it down a bit tho before we get banned

Intel cooperates with Elon musk
fuck wrong reply
hopohobic konii isnt real; homophobic konii:
Ok
INTEL CPU
no
Thank you
you hate all x86?
Ur so talented at using twitter
@sage crag this u?
I think lots of people like musk, so they buy intel stock
Ur mom is talented in bed
how can you marry below your social class into intel cpu
I’m gonna get the innocuous observer involved if you don’t calm it down
what social class is intel cpu in?
Nothingburger
ryzen 7 5700g
👁️
would love to see a konii markov chain

Pretty low these days
machine love by jamie paige
They fell off when they started the i9s
@nocturne olive machine love mentioned in neurocord
Machine love when it meets machine hate
@nocturne olive neurosynth
machine hate would actually be so fire
ms paige are you listening
Real
we need a heavy metal sequel to machine love













