#Unicode
123247 messages · Page 124 of 124 (latest)
it seems weird
im wondering if its been poisoned
or somehow the model has become trigger happy with languages
is it weird that I thought of something other than gd with both of those references?
ah ok so there was 3
clever
clover
?
I was testing just now and it seemed to have a severe case of this when talking about light and sun and stuff
Pretty much every couple sentences it would drop in some Arabic or Hindi or whatever
Let's 100 start 100 talking 100 like 1, B = 2, C = 3the LLAMA ๐ AI
Poison and Trigger Happy Jack are both very peak songs
i did not make any intentional references, but thats nice
I like how when I call out it's undesired behaviour it's in denial
It's like nah that was a glitch on your end lol
what
is it maybe an encoding error
I feel like no but like
maybe someone let ChatGPT edit openai website and it for some reason decided to stuff in some random charsets or decoded tokens wrong or something
and then to the AI, no, it outputted everything correctly
bc it did, it just got sent to you wrong
that's just a theory though
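The encoding-error theory above can be illustrated with a small sketch. It is not a claim about what happened on the site; it only shows what text tends to look like when UTF-8 bytes are decoded with the wrong codec. The helper name `mojibake` and the choice of `cp1252` as the wrong codec are assumptions made for illustration.

```python
def mojibake(text: str, wrong_codec: str = "cp1252") -> str:
    """Encode text as UTF-8, then decode those bytes with the wrong codec."""
    return text.encode("utf-8").decode(wrong_codec, errors="replace")


def repair(garbled: str, wrong_codec: str = "cp1252") -> str:
    """Undo mojibake() when no bytes were lost along the way."""
    return garbled.encode(wrong_codec).decode("utf-8")


print(mojibake("café"))          # the two-byte "é" becomes two Latin-1 style characters
print(repair(mojibake("café")))  # round-trips back to the original
```

A transport-level decoding mistake like this usually turns single characters into short runs of accented letters or symbols. It does not normally produce readable words in another script, which fits the view that the model most likely emitted the text as shown.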
I mean technically, it's possible. But I think it's very unlikely
most likely that is exactly what it emitted
If it happens again, ask it if anything about the last message seemed off
If it knows what it sent, it's very likely really what it emitted
Because internally, it has its own state that contains chat history. If it was in fact on my end, the chat history on its end most likely would not show anything wrong
hmm
Ask it to reply exactly to your response
send what it thought it output
do that again on a new chat but with what it actually output
information may be valuable
did you lock in
im getting licked
kicked*
😭😭😭😭😭
I DONT WANNA GET KICKR FF
KCIKWD
anyways
uniCODE
forward this to no context
๐
You should not do taht
okya
oik
wahdtsopfopuafpoufapofufoapfa
unicode
the what
the mans
cant wait for unicode 18.0 to come out
just to rush and get all seal and jurchen ideographs
@deft pulsar finally made u+0080 to u+00a0
lisrt
What?
lisrt
lisrt
lisrt
lisrt
You're all Spam-Alots
no u
I dont want to insult you man i want to insult that alpha wolf guy for no reason
he's not even here buddyy
a bit late but this has happened quite a lot of times on grok when printing a long line of code - in russian
saw that in deepseek from a friend
oops
oh well
what is that
Lol
🤣🤣🤣 5 MILLIONTHS?
u+007b in MS Reference Specialty font
Huh
bruhh
yesh
huh
left curly bracket = 1/200000 very interesting
😭
but only in a specific font
mhm
can someone make PUA FE4E5–FE4EE and FE82C–FE837?
they're PUA Emoji of existing sequences, and should properly appear on Android devices
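For reference, a minimal sketch that checks whether the two requested ranges (FE4E5–FE4EE and FE82C–FE837) fall in a private use area and shows their UTF-8 bytes. The helper names `is_private_use` and `utf8_hex` are made up for this example; the category lookup uses Python's standard `unicodedata` module.

```python
import unicodedata


def is_private_use(cp: int) -> bool:
    """True if the code point has general category Co (private use)."""
    return unicodedata.category(chr(cp)) == "Co"


def utf8_hex(cp: int) -> str:
    """UTF-8 encoding of a code point as space-separated hex bytes."""
    return chr(cp).encode("utf-8").hex(" ")


# The two ranges asked about above, as inclusive (first, last) pairs.
REQUESTED = [(0xFE4E5, 0xFE4EE), (0xFE82C, 0xFE837)]

for first, last in REQUESTED:
    for cp in range(first, last + 1):
        print(f"U+{cp:05X}", utf8_hex(cp), is_private_use(cp))
```

Both ranges sit in plane 15 (Supplementary Private Use Area-A), so their meaning depends entirely on the font or platform that renders them.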
Shouldn't be too hard to do, I haven't played in ages though and I'm not sure I want to. But in unassigned sheet there will be nearby block starts I assume
PUA sheet seems untouched lol
oh right it's private use. But FFFFF and stuff were obtained so through there something should be documented
maybe
All the block starts have been found in those groups of 4
You can see it on FFFC0 I'm pretty sure
U+000D probably not possible
wow so another c0 entrypoint
the control sheet says 0B is "impossible"
and there weren't any instances of it before
really? i kinda forgot which were and weren't found. So would it be possible to get the quoted version of it?
@rose lava
wowie
holy hell
oh damn huh, did we get a c0 entry?
you said "another", which would imply that one exists already, which would be news to me
i see
well i see the 1b entry anyway, but i guess it makes sense you could loop it and whatever
would have been nice to know idk how that slipped through. anyway do i need to update anything? probably
i assume still no 0xf1, 0xf2, or 0xf4?
i can't get even modern AIs to emit these tokens
without including them in the prompt
Nope
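As a rough aid for the 0xf1 / 0xf2 / 0xf4 question, here is a sketch of which code points have a UTF-8 encoding that starts with each four-byte lead byte. The function names `lead_byte` and `lead_byte_range` are hypothetical; the arithmetic follows the standard UTF-8 layout, where the lead byte of a four-byte sequence carries the top three bits of the code point.

```python
def lead_byte(cp: int) -> int:
    """First byte of the UTF-8 encoding of a code point."""
    return chr(cp).encode("utf-8")[0]


def lead_byte_range(lead: int) -> tuple[int, int]:
    """Inclusive code point range whose UTF-8 form starts with a lead byte in F0..F4."""
    if not 0xF0 <= lead <= 0xF4:
        raise ValueError("only four-byte lead bytes F0..F4 are valid")
    base = (lead - 0xF0) << 18
    low = max(base, 0x10000)              # below U+10000 needs at most three bytes
    high = min(base | 0x3FFFF, 0x10FFFF)  # Unicode stops at U+10FFFF
    return low, high


for lead in range(0xF0, 0xF5):
    low, high = lead_byte_range(lead)
    print(f"{lead:02X}: U+{low:05X}..U+{high:06X}")
```

Under this layout 0xF1 covers planes 4 to 7 and 0xF2 covers planes 8 to 11, none of which contain assigned characters at the moment, while 0xF4 covers only plane 16 (private use). That may be part of why these bytes are hard to get a model to emit unprompted, though that is a guess.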
what did c0 do to get that 😭
im gonna take a random coordinated guess and guess its "๐ ๐ ๐ ร{b}"
If you DM me the recipe that you think might have made it, i can send you the exact unicode output (so we can see if neal trims the CRLF)
"=)" - llama
is there a reason they didnt say?
well it appears rather deliberate so ig they want to get some FD or something
well using console craft we can check if its just that
we dunno the recipe tho
yeah we dont know the final element too
we gotta see if my guess is the final element, by checking using console craft
I think the quoted and normal c0 sheets are mostly outdated. all unquoted c0s have been found iirc (i am pretty sure from the 1b entry, but would need to check discord messages)
but idk if you have edit access there
I do, yeah
I have all the sheets afaik still. If anyone else wants them then I can give it tbh, I thought maybe you'd ask me if something needed to be changed but I guess I didn't communicate it well enough. Don't worry about it
Oh?
I can't see from here but what was the actual entry to 0B
cause mobile makes this hard
why is furacao in the lineage
🔥
i'll be trying to optimize in #1226511876733669436
c3 07 is very interesting
It's not even a cutoff append since theres a quote after it
i forget if its dead, it is right?
i guess spaced appends
i am not making sense its been so long lmao
is it just me or if we could get {d}, it can append and instantly cut the back part of an element off
how are u gonna get d
if its not part of newline then whatever
actually
you could maybe do
append('{d}\n')
trying to get a \r\n
if anyone wants access to the sheets for controls and so forth lmk
Why are U+3001 and U+3002 (、 and 。) not showing their codepoint with the extension? Did we get the wrong versions?
can only assume that they have some combining characters? Check to see if they are actually just purely those characters
no I think they are the original characters
I remember this issue
and yes I think it had to do with either combining properties or for whatever reason, the fact they're fullwidth
Putting them into a unicode analyzer only shows their own codepoints
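A check like the analyzer step can be sketched locally. This assumes the element text has been copied into a Python string; `describe` is a made-up helper built on the standard `unicodedata` module. It lists every code point in the string, so a hidden combining mark or variation selector would show up as an extra row.

```python
import unicodedata


def describe(text: str) -> list[dict]:
    """One row per code point: hex value, name, combining class, East Asian width."""
    rows = []
    for ch in text:
        rows.append({
            "codepoint": f"U+{ord(ch):04X}",
            "name": unicodedata.name(ch, "<no name>"),
            "combining": unicodedata.combining(ch),       # 0 means not a combining mark
            "width": unicodedata.east_asian_width(ch),    # "W" wide, "F" fullwidth, ...
        })
    return rows


for row in describe("\u3001\u3002\u300C\u300D"):
    print(row)
```

If each string yields exactly one row with combining class 0, the characters themselves are clean and the missing codepoint display is more likely coming from the extension or script than from the elements.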
300c and 300d are also doing this
and e and f
While other fullwidth characters don't do this
This doesn't make any sense
Looks like 300c showed the codepoint correctly for someone in 2024
Unless they edited that in
Which seems unlikely
...NONE of the Kana are showing their codepoints

Could it have something to do with the script having an option to translate kana?
Which I have disabled since I don't need it
ใ does show its codepoint

As do ใ and ใ
I really think it must be the script's translate kana setting doing something
This is the only one I could find that shows codepoints
unishode
i uni my code
hm
unishould
U+03A2 (unassigned)
https://infinibrowser.wiki/item/01ksw5tgzhntpxjfqwtb248q6z
A tilde tech back 🔥
entry was {10000} {20000} {30000} {c3}{380} + {10021} {20021} {30021} {c3}{390}
is that all the Plane 0 Unassigned?
we need the 05Ex
Oh shit
I give up. I wanted to get the FD but it's impossible to get rid of the Ã. This is my entry into u+05eb. Good luck
https://infinibrowser.wiki/item/01ksxjqv00ensmg0hs9meknzn7
why IB always down bru
i think it's down
here is unoptimized
https://infinibrowser.wiki/item/01ksy4e8zwvxjpymf9g3d2wtxj
rtl
whats the character
what are all the unassigned chars
what is every unicode char we don't have yet that isnt like U+10ffff
place 0 done i think
hoorah
also first time the remove worked
whats 'place 0'
have we made U+202f
no
so why does unobtainable chars say 0
that is a LIE LIE LIE LIE
LIAR LIAR PANTS ON 🔥 FIRE 🌬️ WIND 💧 WATER 🌍 EARTH
oh this is unassigned
erm actually
we don't have {0}
🤓☝️
btw why are those 4 unassigned while being part of plane 0
like they are just holes in plane 0
The four "unassigned" code points after the main Hebrew alphabet (U+05EB – U+05EE) are reserved slots left empty by the Unicode Consortium for future script expansions. [this was given by AI]
I mean sure every unassigned is supposed to be left empty until script/char expansions
but like why there
and if there, why not for other scripts
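The holes themselves can at least be listed. This sketch does not explain why they are there; it only reports which code points in a range are unassigned (general category Cn) according to the Unicode database bundled with the running Python, so results can differ between Python versions. The helper name `unassigned_in` is hypothetical.

```python
import unicodedata


def unassigned_in(start: int, end: int) -> list[int]:
    """Code points in [start, end] that the bundled Unicode database marks as Cn."""
    return [cp for cp in range(start, end + 1)
            if unicodedata.category(chr(cp)) == "Cn"]


# Hebrew block is U+0590..U+05FF; Greek and Coptic is U+0370..U+03FF.
print("Unicode database version:", unicodedata.unidata_version)
print("Hebrew:", [f"U+{cp:04X}" for cp in unassigned_in(0x0590, 0x05FF)])
print("Greek: ", [f"U+{cp:04X}" for cp in unassigned_in(0x0370, 0x03FF)])
```

Running it on other blocks shows that Hebrew is not the only script with gaps inside its block. U+03A2 in Greek, mentioned earlier in the channel, is another example.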