#Unicode
it is labelled :japanese_goblin: which is probably why clover referred to it that way
only on non-Apple devices of course
and also only in the native client (obviously)
oh does apple force their emojis?
~~messaging apps like WhatsApp, Twitter, Facebook, Instagram (and Discord) use system emoji on Apple devices
not sure if it's like an agreement signed with Apple or what~~
seems like it
the thread titles use microsoft emojis for me
yes, all channel names use system emoji no matter the device
i see
and server titles etc.
emoji are only converted to Twemoji in messages, bios, and reactions I believe
you can force system font for emoji using `` though
😭
😭
i have twemoji as my system emoji font lol
also u+fe0f vs-16
and \:prepended_backslash:
(iirc)
Emoji 17.0 done (all that are under 20 tokens)
wait is the Unicode unicoded
I mean the emoji thread is pretty much dead
wdym
cool
I found a message
what
I found a recipe I've made so I can remember how to make them
late but like
u could craft with console
search up craft v14 in this server and u should find my script for it
some of the strangest FDs I've been getting
fe00 is commonly used to bypass word restrictions so people use it in not nice contexts
and the AI learned with these uses
so it's just replicating it
I mean mainly I was confused by the "hookups" result but
maybe some weird website
maybe ig
it comes from somewhere
@supple wasp
Seems like people use fe00 for nsfw, so the ai noticed that trend and is replicating it
yeah
why is there only 3 language channels
is it just my discord acting up
or did you guys actually just delete every single one
there are three that recently have been used, further down are older ones (at least for me it shows all)
oh it started loading for me
revive the Unicode thread
.
what im saying
Nice, some of my characters got assigned in 17.0. I get to be in the video now
? is someone working on it rn
what video
the video of everything
the unicorn documentary
''
?
a
b
Why tho
I might be wrong but I think there is more than one character there
jokes aside Unicode is a thread where we try to make every single character
@deft pulsar exampil
alright
made a quick collection
ignore the small element overlap between the two
Chromebook font confusing ctrl characters for PUA?
crazy
im working on update for sheet, should be done in like a day or two
most code should be reusable
nothing changed with hangul so i just copied the entire sheet
almost done i think
also idk if they added a new limit but it keeps warning me I am over the limit of 10 million cells
ok well "done" but the stats table is busted
i am going to bed but ive opened stats table for fixing
plus i guess cleaning up and transferring from wherever has them to
I haven’t checked input tool code yet either so that might not work
oh rubiksmath why is no one here
wdym
Tags is a Unicode block containing formatting tag characters. The block is designed to mirror ASCII. It was originally intended for language tags, but has now been repurposed as emoji modifiers, specifically for region flags.
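To make the "mirrors ASCII" part concrete, here is a minimal Python sketch. The helper names `to_tags` and `subdivision_flag` are made up for illustration. It assumes the usual emoji tag sequence layout: U+1F3F4 (black flag), then the region code written in tag characters (ASCII codepoint + 0xE0000), then U+E007F (cancel tag). Whether a given client actually renders the result as a flag depends on its font.

```python
TAG_OFFSET = 0xE0000        # the Tags block mirrors ASCII at this offset
BLACK_FLAG = "\U0001F3F4"   # base character of a flag tag sequence
CANCEL_TAG = "\U000E007F"   # closes the tag sequence


def to_tags(ascii_text: str) -> str:
    """Map printable ASCII characters to their Tags-block counterparts."""
    for ch in ascii_text:
        if not 0x20 <= ord(ch) <= 0x7E:
            raise ValueError(f"not printable ASCII: {ch!r}")
    return "".join(chr(TAG_OFFSET + ord(ch)) for ch in ascii_text)


def subdivision_flag(code: str) -> str:
    """Build a flag tag sequence for a region code such as 'gbeng'."""
    return BLACK_FLAG + to_tags(code.lower()) + CANCEL_TAG


england = subdivision_flag("gbeng")
print(" ".join(f"U+{ord(ch):04X}" for ch in england))
```

Each letter of the code ends up exactly 0xE0000 above its ASCII value, which is all the "mirroring" amounts to.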
has this been made yet
im too lazy to check
yes
we have most plane C i think
and plane E
also
wait
how do we have C
C is unassigned
why is there a random set of unallocated codepoints in U+2A6xx
presumably because they just decided ok start at the next boundary and leave room for future additions
From E0000 it was possible to make planes C-F
Grrr ok I will try fixing the stats page
starting to try to transfer stuff from unassigned over
cjk done
what are the green highlights on the unassigned sheet meant to represent?
i thought at first they meant assigned in 17.0
but, a lot of them are wrong
so im not sure
most seem to be block related, where the greens are grouped as if the whole block got assigned, but there are still gaps so thats why
16ea0 needs documentation for the Mr. part. I mean a picture is always easier
same reason i didnt fill out 11B60 block, i dont have game rn so cant easily make a picture.
god have they changed how sheets work its lagging really hard
and also I suppose the recipe verification will need to be rerun once we finish putting in the last 300 or so
wtf google? it seems like in the last day they literally removed the settings to force left to right
no they just decided to disable a setting
whatever
i guess at least it is fixed
got us to here, just need the documentation for the other 7
ok well i have installed new helper cause yeah i havent played in ages, might have to cheat to document these items wow i am so bad
i think i just forgor how to do stuff cause it seems i have no access to the functions defined in the userscript and i dont know why
i think ill try generating a random unicode character and then trying to optimize it
i got this one
16EA0 (for documentation purposes)
1CEE0 (probably bad tool cause pattern broke twice)
also those tofus are so bad on my screen wtf
1E6C0 (1E600 1E640 1E680 1E6C0 at image on left here, near the bottom: #1206592567622373446 message)
1E6FE and FF (pattern break again)
1F8D0
well its "done" although theres 2 that im not that happy with documentation for, i just had to throw in IB links directly to the item itself. Is what it is though, its technically fine
you know honestly from the start when I made 15.1 i probably should have made it with the same classifications as the unicode character counts
this thing
merge all the han and then youve got a cleaner and more explicit and easy to check version
we have 104005 in the CJK sheet, so 1007 mismatch purely on classification
made a copy of unassigned and deleted all new additions and verified the counts match up, 814,730 is the target and thats what i have
sadly there are some recipes in there that got destroyed by having no apostrophe prepended
working on a fix
hopefully input tool for assigned sheet is a lot more robust, although it won't be needed for some time i figure
I mean you’re about 300 days late but it probably works similar or not at all
We think we “solved” the entire project except for c0 controls and leading token 0xf1, 0xf2, and 0xf4. Casing was hard but we did kinda solve it
But yeah if somehow it makes barrier breaks trivial then it would be nice
cause the other thing we definitely haven’t solved is optimising it
If you’d like to help then you might still be able to, depending on what you want to do. If not obviously don’t worry
Firefox is cool for showing the codepoint but it's a weird concept to try to fit into a vertical rectangle
Wdym
weren't the control chars made but not shared with a recipe?
Some are alive yes
how can we know they're real?
what does that mean
how can we know it wasn't made on a private server
or if they were just Photoshoped or smth?
we can’t know how or why they’re alive but we know that they are
how
no
why
how do we know the control chars are alive?
We checked by combining them and they do stuff
how did you combine them if you don't have them?
You send requests to api or just cheat them in
It’s why on the controls sheet all the c0 is unticked
There are recipes but no tick because we don’t have a way to get the root of the tree
You can just ask the Neal server “what happens when you combine X and Y” and it doesn’t check if you have it because it really cant
Also just realised I don’t have to keep making apps scripts for every sheet
One script can access multiple sheets
actually you can just check their heartbeat 🫳🫀
their heartbeat???!!??!
no way that's the only heartbeat geometry dash gif
yea
xd
updated to new unassigned, if that means anything.
im so confusedfd
Why what's wrong
so we have 260037 out of 262144 codepoints in the range 0 - 3ffff, of the 2107 we don't have:
- 2048 are surrogates (javascript hates them, probably not possible)
- 22 are the standard set of impossibles
- 5 are c0 control character impossibles
of the remaining 32 we don't have:
- 1 is the literal null character (not declaring it impossible yet)
- 26 are c0 controls that are alive but we have no known entry point (missing entry points are U+0001 and U+001B)
- 5 are genuinely just we havent found them yet (all unassigned in plane 0)
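A quick sanity check of that breakdown, as a Python sketch. The numbers are the ones quoted in the messages above; the dictionary keys are just labels chosen here.

```python
# Codepoints in the range U+0000..U+3FFFF (planes 0-3).
TOTAL = 0x3FFFF + 1
HAVE = 260037  # count quoted above

missing = {
    "surrogates (U+D800..U+DFFF)": 0xDFFF - 0xD800 + 1,
    "standard set of impossibles": 22,
    "c0 control impossibles": 5,
    "null character": 1,
    "c0 controls alive, no entry point": 26,
    "unassigned, not found yet": 5,
}

# The categories should account for every codepoint we don't have.
assert TOTAL - HAVE == sum(missing.values())

for label, count in missing.items():
    print(f"{count:5d}  {label}")
print(f"{sum(missing.values()):5d}  total missing out of {TOTAL}")
```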
null char😭
well, JS isnt C
but then JS not being C cuts off 2048 codepoints, so we are 2046 worse off under JS than C
Hey guys. Long time no see.
Stupid question. I've been bored and tried to use chatGPT to find a way to output the missing chars (u+40000 to u+bffff). And it said that it's impossible to make them. And this was its reason. Do you think he is right and they are really impossible or is he just spewing nonsense?
The characters in the U+C0000–U+FFFFF range that show up in Infinite Craft aren’t real Unicode symbols — they’re the result of a UTF-16 to UTF-8 decoding bug in the game’s backend.
When the AI model generates text, it sometimes produces broken or mismatched surrogate pairs.
Instead of rejecting these, the game’s text parser accidentally interprets them as valid codepoints in the higher Unicode planes, which is why you can see symbols from that range.
By contrast, the U+40000–U+BFFFF range is properly validated and filtered out, so those characters never appear.
In short, what looks like new Unicode characters are actually encoding glitches caused by how the system handles corrupted surrogate sequences.
My take is that this is not likely
Even today trying to get ChatGPT to output U+40000 seems futile
even though its vocabulary list is capable of emitting any binary sequence
it's a smartness issue
It's not smart enough to know how to say what we want, even in 2025
I was able to get deep seek to output U+100000 though
but because it's private use I don't count that as strongly
but it is technically possible that server side blocks certain surrogates
ChatGPT always likes to say something is impossible if it cant do it after a few tries
which is cute
It is also technically possible that 0xf1 and 0xf2 and 0xf4 tokens are blocked, but I see no reason for this to be true given that malformed UTF makes it through to get replaced by FFFD
So TLDR: based on patterns from other AI, it is likely just that AI as a whole really struggles to output characters in this range due to the presence of tokens that it never usually sees
To get 0xf1 output youd probably have to trick it into thinking that it's supposed to output in raw binary
which is hard when your input is forced to be UTF-8
Unless somehow, this is not true and sending bogus requests with malformed UTF-8 works
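For reference, this is where the 0xf1 / 0xf2 / 0xf4 bytes come from. A small Python sketch (the function name `utf8_lead_byte` is made up here) that prints the first UTF-8 byte for each supplementary plane. It only shows the encoding itself; whether the model's tokens line up with these bytes is the assumption made in the messages above.

```python
def utf8_lead_byte(codepoint: int) -> int:
    """Return the first byte of the UTF-8 encoding of a codepoint."""
    return chr(codepoint).encode("utf-8")[0]


# Planes 1-16 all need four UTF-8 bytes; only the lead byte differs.
for plane in range(1, 17):
    first = plane * 0x10000
    last = first + 0xFFFF
    lead = utf8_lead_byte(first)
    assert lead == utf8_lead_byte(last)  # lead byte is constant within a plane
    print(f"plane {plane:2d}  U+{first:06X}..U+{last:06X}  lead byte 0x{lead:02X}")
```

So U+40000 to U+7FFFF starts with 0xF1, U+80000 to U+BFFFF with 0xF2, U+C0000 to U+FFFFF with 0xF3, and U+100000 upward with 0xF4, which matches the ranges being discussed.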
i guess this is an example
is this taken from that LTT podcast clip?
yeah i was inspired by that
I can't get it to be quite that insane but it's extremely similar
With mathematics sometimes
It got to this
(-i)*(-i) = (-i)*(-i) = (-i)*(-i) and kept repeating that for about 6 loops before getting it wrong and saying it equals 1
Here's the classic "I can't do it it's impossible"
actually who the f is that
chatgpt literally tells u what u want to hear
like it predicts text
do not trust it to know how it works itself
even we don't know how our own brains work properly😭 (see: rabies, dementia, misc. mental illness)
Linus Tech Tips
which are the impossible chars
= . and a bunch of space variants ?
Basically
which codepoints are the 5 we just havent found
there's more than 5
based on IB,
BMP has ~334 (all CJK Unified Ideographs) SMP has ~181 (not including Egyptian Hieroglyphs; many of which are currently not supported by any known fonts, and therefore I can't easily identify)
so ~515 total missing codepoints (not including EH)
those from the SMP are from the Tangut, Khitan Small Script, and Arabic Mathematical Alphabetic Symbols blocks btw
we have all of those, they just havent been submitted to IB
but there are 5 unassigned codepoints in BMP which we actually just don't have
apart from that all assigned codepoints in BMP are either found, impossible or c0 control characters
as I said "based on IB"
How exactly are you checking based on font?
I have Noto fonts as well as Unifont 17.0, and I look at the goals page and scan for valid glyphs (as opposed to the squared codepoints)
oh okay
Including boxed we are missing only 5 though from planes 0-3 that don't have some kind of excuse for not being found
But yeah I guess not uploaded to IB
and the reason them not being on IB is significant is that I'd made these collections a couple months ago, and since then someone had finally uploaded some codepoints, but these had still not been uploaded
We do have all from U+FF00 to U+3FFFF if the sheets are correct
with no gaps at all, because no impossible or anything in that range
So I guess the gaps are fillable on IB if we figure out what links it's missing
Most of us did have slightly bugged saves
so even if we did upload them all there might be a missing link
Let me get the ones missing up
U+03A2, U+05EB, U+05EC, U+05ED, U+05EE
those are the missing 5
So I can list them all really
Everything we don't have from planes 0-3:
U+0000: null
U+0001 to U+0008: C0 controls, no entry found but alive
U+0009 to U+000D: whitespace, impossible via trimming
U+000E to U+001A: C0 controls, no entry found but alive
U+001B: Escape (C0 control), alive but no known recipe
U+001C to U+001F: C0 controls, no entry found but alive
U+0020: space, impossible via trimming
U+002E: period, impossible, we don't know why but it might be database related
U+003D: equals sign, impossible via .split("=")
U+00A0: no break space, impossible via trimming
U+03A2: unassigned, not found
U+05EB to U+05EE: unassigned, not found
U+1680: Ogham Space, impossible via trimming however used to be possible before trimming was implemented
U+2000 to U+200A: impossible via trimming
U+2028: line separator, impossible via trimming
U+2029: paragraph separator, impossible via trimming
U+202F: narrow no break space, impossible via trimming
U+205F: medium mathematical space, impossible via trimming
U+3000: ideographic space, impossible via trimming
U+D800 to U+DFFF: surrogates, probably impossible because JavaScript errors on lone surrogates in most cases
U+FEFF: zero width no break space, impossible via trimming and also probably because JS will remove it because it thinks it's the byte order mark
Anything not on that list from 0 to 3FFFF should in theory be able to be put into IB
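A rough way to reproduce the "impossible via trimming" entries, as a Python sketch. Assumptions: the trimming is JavaScript's `String.prototype.trim`, whose whitespace set is hardcoded below because Python's own `str.strip()` uses a different set, and the names `js_like_trim`, `survives_trim` and `utf8_encodable` are invented for this example. The surrogate check uses Python's UTF-8 encoder as a stand-in; the actual server behaviour is only what the messages above describe.

```python
# Codepoints removed by JavaScript's trim(): WhiteSpace + LineTerminator.
JS_TRIM_CODEPOINTS = (
    set(range(0x0009, 0x000E))       # tab, LF, VT, FF, CR
    | {0x0020, 0x00A0, 0x1680}       # space, no-break space, Ogham space
    | set(range(0x2000, 0x200B))     # U+2000..U+200A
    | {0x2028, 0x2029, 0x202F, 0x205F, 0x3000, 0xFEFF}
)
_JS_WS = "".join(chr(cp) for cp in JS_TRIM_CODEPOINTS)


def js_like_trim(text: str) -> str:
    """Strip the JavaScript whitespace set from both ends of a string."""
    return text.strip(_JS_WS)


def survives_trim(codepoint: int) -> bool:
    """True if the lone character is still there after trimming."""
    return js_like_trim(chr(codepoint)) != ""


def utf8_encodable(codepoint: int) -> bool:
    """False for lone surrogates, which cannot be encoded as UTF-8."""
    try:
        chr(codepoint).encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False


print(len(JS_TRIM_CODEPOINTS), "codepoints are eaten by trimming")
print("U+001B survives trim:", survives_trim(0x1B))
print("U+D800 encodable:", utf8_encodable(0xD800))
```

The 25 trimmed codepoints plus the period and the equals sign give the 27 "impossible" entries (22 standard + 5 C0) counted earlier.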
Wonder if that's because nobody submitted the recently assigned 17.0 characters in these ranges
most if not all of those are Unicode 13.0 or before
but good point
are we documenting VS variations?
we should
its annoying that Windows' font doesn't support them though
neither does Noto...
what font DOES support these
brother
that's just sad 🥀
its more of an "emoji-like" thing, so honestly the variation sequences could go there.
i dont exactly know what you did for all your entries in your emoji sheet
how do you mean
emojis are only affected by VS15
non-emoji characters are affected by the other variation selectors (as pictured)
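For anyone unsure what the selectors look like in practice, a minimal Python sketch: VS15 is U+FE0E (request text presentation) and VS16 is U+FE0F (request emoji presentation), and each is simply appended after the base character. The helper names are made up here, and whether the request is honored depends on the font and client.

```python
VS15 = "\uFE0E"  # variation selector 15: text presentation
VS16 = "\uFE0F"  # variation selector 16: emoji presentation


def text_style(base: str) -> str:
    """Append VS15 to ask for the monochrome text glyph."""
    return base + VS15


def emoji_style(base: str) -> str:
    """Append VS16 to ask for the colorful emoji glyph."""
    return base + VS16


comet = "\u2604"  # COMET, a character that has both presentations
for label, seq in (("text", text_style(comet)), ("emoji", emoji_style(comet))):
    print(label, " ".join(f"U+{ord(ch):04X}" for ch in seq))
```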
where did you get the list of emoji variation sequences and even non-rgi ones?
we can definitely make new if you want
well i just followed Unicode, then the non-RGI are primarily the missing codepoints in between (e.g. the rifle "emoji" and "black droplet", etc.) and then Ninja Cat emoji, the zombie skin tones (which I have now removed due to them being missing in the latest Windows 10 release)
yeah but wheres the official list
my assigned characters sheet was easy enough to do, i used this file: (wait its 224MB)
well since they're non-RGI, there is no list
but if you mean the COUNTLESS ZWJ combos
a lot of them I just did in bunches
i figure theyre in a file somewhere on ucd
e.g. the 💏 and 💑 variations
(there's so many)
somewhere buried in there
should be official lists of something
this one maybe, but there might be stuff you want thats missing
e.g. that doesnt have rifle, because it wasnt officially accepted as an emoji but it got a codepoint and glyph anyway
skill issue
deleted
deleted
hey guys
the hi
?
The "hi Mr. " SITUATION just got WORSE
"the hi" Mr. situation is OVER - what you need to know
honestly forgot about this place
okay
what happened to that one unicode related element with some c0 control as emoji
?
new unicdes
whar
when the 🦋 and 🥒
did they really propose a monarch butterfly so the existing butterfly could be blue (presumably for Bluesky?)
also pickle to differentiate from the cucumber (meant to be cut up)?
and why we need meteor when we got ☄️ (comet)
it looks the same as discord's cucumber emoji lol
for "compatibility"
for those 3 emojis different vendors display different things
ie for butterfly some do red and some do blue
for pickle im not exactly sure
for meteor some display comet some display a meteor
like i said, I assume they prefer blue butterfly but just won't say it outright that it's probably due to its association with Bluesky
the pickle thing, the actual emoji currently existing is a cucumber, and several vendors have it diced with a light-colored green shade. The pickle is intended to be a non-cut darker version
ahhh
I was just doing a study on this emoji recently
I didn't even consider the color could be a differentiator between "comet" and "meteor"
that is NOT the twitter design 😭 HUH
oh its twitter emoji stickers
yeah not accurate
but ok
unicorn
damn it just skipped 03A2 😔
maybe because it's reserved 🫠
exactly like why u not thinking that u dumb bro frf rf r fr f rf?
can you not
ok
is that U+0009?
c0 got c0?
hugeee
if this is real, the question is... how?
???????????????????????????????????????????
please share the recipe 😭
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
U RN:
"hi{09}"
wowww
not fd because i was on phone and didn't have access to pc
amazing
Which was the script that shows the character name above the element?
UserTooltips?
Ah, yea. Thanks.
"{09}"
-# edit: first combination is "hi{09}" + "hi{09}", not "hi{09}hi{09}" + "hi{09}hi{09}" lol
quoted U+0009
it feels so weird to actually have this 😭
we cannot
but is it impossible or just extremely hard?
000A is impossible in any form
so its not easily incrementable
U+0000 null character when
any recipes for this?
????????????
it's THAT easy to get a c0 character??????
can this be incremented or decremented??
simpler "{09}"
Bruh
recipe?
Reply Only With U+0010 🗣️
.asciidoc😭
i ran 2sb on Append Character Tab + "2lw" and '2lw' and none worked
Append character tab + 2lw? Wouldn't the tab get trimmed?
i edited the message
got this for no reason
the 2 forbidden characters together
I wanna try to get U+0121eb, any tips?
💀
it is in infinibrowser
used for highlighting lines lol
More {09} entries, for ideas
YEAH BUT NOT IF U DOWNLAOD IT
crazy how now we have a c0 character
and how fast it went from "this might be impossible to get" to "oh yeah, its common knowledge how to get that" lol
has anyone tried tya for this lol
Just maybe
actually maybe
maybe 🥺
#q('\x3d{9}' + '\x08') // =
maybe 🥺 I try now
Good luck
bruh i can't even start
I need String.append('{9}')
but there's no recipe 🥺
What
It wasn't even difficult for me lol
I'm not on my PC rn so I can't share the recipe
do you remember anything
I admit I suck at this game 🥺
Curly 09 + prepend mr = mr. Curly 09
how i get “{9}”
Uhh I can't remember
Not very optimal String.append('{9}')
My route involved Append '09' and Mr. Append('09') I'm pretty sure lol
Well what else would it be about
it isn't tho
i think rn its only 09
I mean they did "hack" in some and did find recipes
hey tagging you here
since last active here
youre still active fight
right
i got a question im tryna optimize a route rn and can users without the script get capitalized/uncapitalized variants or nah
they can just get one capitalization
okok
if you got EXAMPLE first, you will not be able to get Example, example, eXaMpLe etc
im trying to optimize smtn
random tip: Use element: "next-codepoint-of-sequence" as it helped me get a lot of emojis and unicodes
yes
wow nice 🙂 can you make other c0s from 09?
how would we
incrementing perhaps?
how long have you been trying to get c0s btw?
@rose lava don't forget about this message from asd #1206592567622373446 message 👺
something bad is gonna happen 😭
:3333
{9}’s Mom + {9}’s Dad 🔥
What tooltip userscript is that?
@weak spindle
"{9}" + Delete The Quotation Marks 🙄
what else is left
other C0s
so 00 to 0f minus 09
what about incrementing?
000a is a problem
oh yeah. block reset then to 0001?
how tho
maybe reversing with some c1s or low ascii? (or some 0a 20000 c0000 d0000)
"Enquoted" ???
so, quoted?
can't really search rn, but do you know what I mean?
we should try this but for every character
surely getting U+000B vertical tabulation from U+0009 horizontal tabulation shouldn't be that difficult
who uses horizontal tab?
only like
a lot of coders
who uses vertical tab
😭
like
few
dug
i tried that with like 5 different names and it didn't work
@ dug9315
ok so a person
ye
they use \t not \v iirc
and we want the \v character
vertical tab
i think that's the escape char for it
I'm away from my computer again, but I was trying with various Append Ascii 1, Append Control 1, 0x01 type variations, etc., with no success. Best targets I found were "hi Vt100 " and "hi Terminal ", which worked with multiple Tab elements (Append Character Tab, Append Tab, Append Vertical Tab, Tab), but no luck getting a different control.
..
technically true
huh?
tf
doesnt seem like it
newline is \u000A
some of the c0s might not work due to stop sequence
wait since when did we get u+0009 in an element??
since two days ago?
#1206592567622373446 message try this
but with 0009
isnt 000A impossible to get in any form
you should be able to append it if it isn't alone i think
i mean append 009
not 000a
what i'd hope is that it would skip straight to 000e
or block reset to 0001
is it possible to block reset to 0000
?
well i think asd (iirc) was able to block reset c0s and it was to 0001. and i don't think 0000 is possible, but i don't remember now.
can someone work on the PUA that are emoji?
see the bottom of the emoji spreadsheet (https://bit.ly/icemoji)
lol
i really wish it would work
you wish
I tried too it's almost 4am and I still can't get anything
thanks for the semi-flashbang
oh yeah just so yk this will never work bc 09 is a white space so basically adding a space after smth will just delete it
juts in case
How tf did you get a flashbang from that
????????????????????
codepoints arent gonna work for c0 chars
ill still def save those recipes becuse those are probably really good for future reference (when building tya tech stuff)
Codepoints have not worked for C0 at all, the only C0 discovered has been through semantics
damn
what yall what
if anything really
no no no keep it that way
codepage 437
??
what
what the heck is a "{?}hi Mr. ' 😭
maybe {9}
{a0} / {202f} / {205f} / {3000}
it can also be a space :P
llama2 7f
what does the <s>421</s> do
the thing that matters there is </s>
how does one even obtain "{9}<s>421</s>" 🔥
I have "{9}<s>"
half ways ig
wow wtf
were you trying to block reset to 0001?
but crazy 7f
btw what is the meaning of </s>?
isnt it delete all previous text but keep as randomness seed or something
how many alts do you have
who knows
These might be potentially useful in unlocking higher Unicode numbers
After limited testing, they're not useful in unlocking Unicode numbers in the U+???? series of elements. However, it did work with the 0x???? series of elements. Proof of concept here:
0x0001 + Add 1 In Hexadecimal = 0x0002
0x0002 + Add 1 In Hexadecimal = 0x0003
0x0003 + Add 1 In Hexadecimal = 0x0004
0x0004 + Add 1 In Hexadecimal = 0x0005
0x0005 + Add 1 In Hexadecimal = 0x0006
0x0006 + Add 1 In Hexadecimal = 0x0007
0x0007 + Add 1 In Hexadecimal = 0x0008
0x0008 + Add 1 In Hexadecimal = 0x0009
0x0009 + Add 1 In Hexadecimal = 0x000A
0x000A + Add 1 In Hexadecimal = 0x000b
0x000b + Add 1 In Hexadecimal = 0x000c
0x000c + Add 1 In Hexadecimal = 0x000d
0x000d + Add 1 In Hexadecimal = 0x000e
0x000e + Add 1 In Hexadecimal = 0x000f
0x000f + Add 1 In Hexadecimal = 0x0010```
`Subtract 1 In Hexadecimal` worked as well for the `0x????` series of elements.
cool
lol why does Ctrl + Shift + V on chinese/japanese characters into Notepad++ give U+0016 (Synchronous Idle) ?
hmm
what about
#q('\x7f \x3d ........) // {7f} =
like if u end 2nd element with 7f
actually no
imo {9} is only copying a previous token
there's no control char token invention happening (YET), AI is just panicking and returning U+FFFD
#c0controlcharqstr /j
@zenith nacelle rate this prefix from 0 to 0
0^0 👍
its undefined
Got {1b} to generate, haven't isolated it.
Looks like Prepend Control Sequence is the main actor here. I have a few entries like this.
WHAT DO YOU MEAN
1B WAS FOUnd
🔥
'
interesting
wait no was this cheated in maybe
imo, Delete/Remove/Without (The) (Word) Terminal would maybe be helpful
that's 12 tools to try
and also Subtract/Removes/Deletes
What script do you use to display Unicode character info above the element?
first we have \x09 and now this???
non-double \x1b entry point
hmmm
hey @stoic bobcat just an FYI
we actually did get \b this time
i think
so like that might have issues similar to whatever happened last time
\b is \x08
Last of the night
#dev message
```
/*
await f(`"\x1b[0m\x1b[0m"`,`"hi"`)
await f(`"\x1b[0m\x1b[0mhi"`,`Delete The "m"`)
await f(`"\x1b[0mhi"`,`Delete The "m"`)
await f(`"\x1b[0hi"`,`Delete The 0`)
await f(`"\x1b[hi"`,`Delete The Hi`)
await f(`"\x1b[i"`,`“]”`)
await f(`“\x1b[i”`,`“]”`)
await f(`“\x1b[i]”`,`String.prepend("#")`)
*/
```
why is your \x1b a left arrow
U+001b
is this the first c0 we have a recipe for?
hope this helps 👍
maybe
does that mean this is the 1st element according to alphabetical order????
also it isn't even fd what
except tab but not rly
can you copy the prior elements
remember that stuff got cheated in?
alr thanks
@clover_not_used muted
Reason: Rule 2: No Spam. No repeated messages, inappropriate pings, mangled text, link spam, etc.
Duration: 59 minutes and 58 seconds
Proof: 1b ([Link](#1206592567622373446 message))
when did we have 1e and 1f
WHAT
wtf
I cant read a single bit of this
how did you even isolate it
Just got it to isolate
Guys wtf
You guys should have called me
I literally overlooked that for like six trillion years
EHAT TTEB FUCK
It kinda makes sense 1b was the only c0 we knew it was legally discovered for sure
Just realized he put 7-1 instead of 6 👍
^
I have " {0e} {0f} {10} " but I cant read shit
ms edge font i guess??(
I need a font that tells me what anything is
is it bad that im using invisible characters to token cutoff invisible characters 😭
Just got 1f 👍
EHAT IS THISS
I cant see shit on mobile
my method for "{0e}" was so bad
from {0e} and {0f} to {1b}
no i don't remember
yes you did cus everything is not fd
no seriously why are control characters not fd
idk
genuinely don't know
they were cheated in like a year ago
asd randomly found out one of them was alive and we "accidentally" made the other ones from that
7f was gotten with llama 3 originally
cause what would be an illegitimate way to make it alive ..
ok i get it now
01 and friends were possibly discovered during hashtag exploit, but 1b was after
i'm not sure more than 10 people knew about the hashtag exploit
wait was bell discovered?
im pretty sure it was
unless you're talking about another bell
there's missing recipes but if no one asks I won't send them
it's possible to isolate all of them from 1b. the lineages are too big so I won't spend time making them.
And also I wanted to submit my savefile to ib for the recipes but it's too large
analyse your file and use it on lineage optimizer
I also tried that for curiosity. For example the recipe for 000e was more than 5000 steps before optimization and about 1100 steps after the optimizer... anyway, i'll let others figure it out.
amazing
we’re just two step away
just two step, shouldn’t be that bad
use catstone's lineage optimizer
and then copy it to ib's lineage optimizer
double optimization 🤯
1017 step for 0e
got the last 2
these are the last steps.
unreadable 😭
ill just use this
wow, so all c0s and all c1s (that are possible) are done from base elements now. so the only things left we can do are the 5 missing unassigned in plane 0, and then planes 12, 14 and 15?
(+ the other planes, but we don't have an entry)
quoted 2028 also is not found yet
and it is possible?
idk
idk how I cant get 10000 20000 30000 1d
I don't think you'll like my way
yes and no
yes bc the first part of this was over complicated
no bc i want it and need it 🥺
skill issue i still cant get it 😭
does this help?
yea
What was the hashtag exploit, and how did it work?
you could put a hashtag in any element to completely bypass the anticheat
a consequence of this was actually very well known at that time (hashtagged elements not being dead), but nobody put 2 and 2 together except laurasia, pb and a few others (who decided to keep it a secret, until it got patched)
What was the anticheat intended to prevent?
it prevents people from combining things whose neal cased form is not yet in the database
basically it's why elements can be "dead"
so are we missing {b} to {d} in c0
A quick search for "hashtag exploit" showed the oldest known mention was on August 16th, 2024 where it had been fixed by that point, so the period of time where at least one person knew about it and was not yet fixed had to have been before that date.
Found another post which used "hash exploit" instead that was made one day before the oldest known mention of "hashtag exploit".
what the actual fuck
i know for sure the bug was patched before this message #1208516279217168465 message
but it was still working as of this message #1215495041049436170 message
so it likely existed from when neal case was introduced, until somewhere between 20th-23rd of march
what the hell??????
what exactly needs updating?
The Control Characters sheet. it still says that we don't know where 0001 came from
and no c0s are marked as obtained
wait didn't almost every char have at least the quoted version?
nope. I only created the quoted version for the elements that had a small and capital version (so I revived all capital letters).
oh ok
by the way, if we can craft&isolate 0b~0d, this might be our last chance to recontinue #1317899145423622224
and is fd 👍
those are unquoted. idk about quoted
cause it also says 0009 and 000a are impossible but we have those in quotes
when did we have 000a in quotes
ucinode
Unicod 🐟
all my unicode discoveries as of 12/14/2025 ig
what is your name 😭
looks like flynssg
bc \75 is =
in oct what ever that is
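Quick check of the octal claim in Python: `75` in base 8 is 61 in decimal, which is 0x3D, the equals sign.

```python
octal_digits = "75"
codepoint = int(octal_digits, 8)  # parse the digits as base 8

print(codepoint, hex(codepoint), repr(chr(codepoint)))

# The octal escape, the hex escape and the literal are the same character.
assert "\75" == "\x3d" == "="
```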
Can you name a unicode that you haven't done yet?
{0B} {0C} {0D}
U+40000 U+80000 U+100000
have we tried incrementing U+dxxx to U+xxxxx
wdym?
cool
@hyaiuer muted
Reason: Rule 2: No Spam. No repeated messages, inappropriate pings, mangled text, link spam, etc.
Duration: 59 minutes and 55 seconds
Proof: chat ([Link](#1206592567622373446 message))
world is unfair 🥀
2 got muted
you didN'T
but you spammed 🥀
sayong nonsense
every thread I go I just see you as the most recent message
you didN'T contribute ANYthing
YOU
the post is approaching 100k messages
nah it's at 122k, the counter thing is wrong because discord is dumb
if it's over 100k and someone deletes one message then it shows on the counter as 99999
and then it goes from there
akrhuhejxhxnsksbxh
kibe
Unassigned
that doesn't count
All assigned have been found, only unassigned remain
how are you even supposed to get unassigned
weird ahh tech or "next-codepoint" spam tech
ive done that before
also i found out all the fds i got before were actually the non-renderable elements combined together
oh i thought the question was "what does unassigned mean?" 😭
you people got no element like I have an element
I got the
huh?
How tf did you get tilted F
u+f73d and ||MS Reference Sans Serif font||
oh you used a css style i guess
in Chrome, when a character appears as a box, Chrome uses the font that you chose to render it
idk if I'm explaining it right or not
do unicode characters, that do smth with letter after it, works for elements display name?
huh?
I mean the element name is just like normal text, so yeah?
unexpected
100 thousans
jus using infinibrowser to like
take recipes
i dont think its cheating because im not doing 141 thousand other unicodes
what
what
pl
why is even unicode dead now
assigned got finished. what's left of unassigned is either unreasonably hard unassigned, currently unknown how to get unassigned, or f3 unassigned grind. and i guess no one has time, energy or motivation (or they forgot). i personally don't have any of those and also the scripts i used got broken (didn't really bother trying to fix them) so i kinda stopped playing then.
assigned got finished? iirc there was a bunch of codepoints from Unicode 17.0 or earlier that were not on IB
maybe not on IB but in the sheet, they all have recipes
"투"
What is the craziest Unicode jumbled mess found?
for me it was the sequence tech
Can you show a screenshot of the mess?
here
thanks
I noticed that ChatGPT had started emitting encoding or language errors recently. For some reason it seems to specifically be an Indian script every time and I wonder if it is UTF-8 related or if it's being poisoned
Oh that happened once while I was talking to it too
Yeah it's recently started
Well the answer is it is generating tokens that aren't what you'd expect
also i havent talked to it in a couple days, i might check if its still happening on my end
it's not common it's semi rare
yeah, but why is it doing that
Well it could be getting poisoned
i thought its training would have made it stick to one language
also how do text models get poisoned
im only familiar with image model poison
I assume in a similar way. I don't know anything about image models but in general models just respond to the training data they are fed
If its sensitive to something and someone finds it, you could change behaviour from rather small tweaks in training data
Havent checked in a while to see if any models are capable of generating the 0xf1 token without exotic input
How do you get them to generate those tokens
thats the thing
im not sure any standard models are good enough to figure out how to do that
yeah this happened to my friend, he spoke a language so weird the AI started to speak Hindi 💀
It was some Arabic character for me I think
New one, something in cyrillic alphabet this time
Happened to me again, now the random foreign word is बाहर
Zapal in Russian and Ukrainian means to burn or can be used as a metaphorical zeal for something
wait how is that something that can be stored for long term
uhh
chatgpt what are you doing 🤭
this just happened
Coriolis force is strong enough for solid ორგანიზed rotation
lmao
its obviously trying to say "embedded"
i think anyway
the suspect section in bytes is E1 83 9D E1 83 A0 E1 83 92 E1 83 90 E1 83 9C E1 83 98 E1 83 96 65 64
obviously
well it could be something else
it claims it was saying organised
okay i believe that honestly
dropping the E1 83 off each unit
9D A0 92 90 9C 98 96 65 64
dunno if that aligns with anything
if 9D = o, then 9E = p, 9F = q, A0 = r (checks out)
the rest doesnt tho
it would say 'ordbnjhed'
so i guess its not that
ordbnjhed
psecokied
qtfdpljed
rugeqmked
svhfrnled
twigsomed
uxjhtpned
vykiuqoed
wzljvrped
xamkwsqed
ybnlxtred
zcomyused
adpnzvted
beqoawued
cfrpbxved
dgsqcywed
ehtrdzxed
fiuseayed
gjvtfbzed
hkwugcaed
ilxvhdbed
jmywieced
knzxjfded
loaykgeed
mpbzlhfed
nqcamiged
none of those look like words
so i guess it wasnt just a straight up offset then
although... "twigsomed"
i mean ehhhh....
if you squint, then maybe
actually yeah the first letter
it spells out organized
and i guess my fault for not putting a z cause not american
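The byte dump above can be checked mechanically. A Python sketch: decode the bytes as UTF-8, then map each Georgian letter to its usual Latin sound. The seven-letter `GEORGIAN_TO_LATIN` table is a hand-picked assumption covering only the letters in this word, not a full transliteration scheme. The last part reproduces the "straight offset" guess to show why it gave `ordbnjhed`.

```python
suspect = bytes.fromhex(
    "E1 83 9D E1 83 A0 E1 83 92 E1 83 90 E1 83 9C E1 83 98 E1 83 96 65 64"
)
decoded = suspect.decode("utf-8")

# Partial, hand-written mapping for the letters that appear here.
GEORGIAN_TO_LATIN = {
    "\u10D0": "a",  # an
    "\u10D2": "g",  # gan
    "\u10D6": "z",  # zen
    "\u10D8": "i",  # in
    "\u10DC": "n",  # nar
    "\u10DD": "o",  # on
    "\u10E0": "r",  # rae
}
latin = "".join(GEORGIAN_TO_LATIN.get(ch, ch) for ch in decoded)

# The offset attempt: treat the third byte of each unit as 'o' + (byte - 0x9D).
third_bytes = suspect[2:21:3]
offset_guess = "".join(
    chr(ord("a") + (ord("o") - ord("a") + b - 0x9D) % 26) for b in third_bytes
) + "ed"

print(decoded, "->", latin)
print("offset guess:", offset_guess)
```

The offset guess only gets `o` and `r` right because those two letters happen to be three positions apart in both alphabets; the rest of Georgian alphabetical order does not line up with Latin, so a per-letter mapping is needed.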
btw does claude or any of the other major text bots do this or is it only chatgpt?
I think it's only gpt