#(CW:AI) Discussion of LLMs and other kinds of text generators ("AI"), especially of toki pona
also i feel like kijetesantakalu is a very specific combination of letters that if there's anything else named kijetesantakalu, it's named after the nimi
toki a, kulupu lili o
mi weka e ken pana pi sitelen mun ⭐ tan tomo ni
ni li tan ala pakala sina li tan ni: tomo ni la sina ken len ala e sitelen ilo. taso tomo #pona-kulupu la sina o len e ona.
ni li taso. o musi pona
Better than randomly typing bullshit
is calling everything a kijetesantakalu or a palisa better than randomly typing bullshit?
/lh
yes
sona a. ni li pona tawa mi. mi lukin e toki pi tomo ni lon tomo #pona-kulupu la, mi pilin sama sina. toki li tan ilo pi sona ilo la, ona o len. taso ona li ken ala len lon tomo ante ni.
WBR seems to always get the words mixed up, it's implying that toki pona created Sonja Lang in the second sentence
toki pona li mama pi jan Sonja
trained a toki pona classifier, combining that with Kekan San's toki pona tokenizer, to make a really reliable toki pona filter, retaining only toki pona posts from Bluesky in real time.
the toki pona classifier model is available here, (pretty tiny at 118M parameters: https://huggingface.co/johnpaulbin/toki-pona-classifier-v2)
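A rough sketch of how a rule-based word check and a classifier score might be combined into one filter. Everything named here is hypothetical: `COMMON_WORDS` is a tiny stand-in word list, `lexical_ratio`, `is_toki_pona` and both thresholds are made up for illustration, and `classifier` is any callable returning a probability. It does not use the real sona toki or classifier APIs.

```python
from typing import Callable

# Hypothetical, deliberately tiny word list; a real filter would use a full dictionary.
COMMON_WORDS = {
    "mi", "sina", "ona", "li", "e", "la", "pi", "o", "ni",
    "toki", "pona", "ala", "jan", "lon", "wile", "moku",
}


def lexical_ratio(text: str) -> float:
    """Share of whitespace-separated tokens found in COMMON_WORDS."""
    tokens = [t.strip(".,!?:;\"'").lower() for t in text.split()]
    tokens = [t for t in tokens if t]
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in COMMON_WORDS)
    return hits / len(tokens)


def is_toki_pona(
    text: str,
    classifier: Callable[[str], float],
    min_ratio: float = 0.8,   # assumed threshold
    min_prob: float = 0.5,    # assumed threshold
) -> bool:
    """Keep a post only if both the word check and the classifier agree."""
    if lexical_ratio(text) < min_ratio:
        return False  # cheap check first, so the model is only run when needed
    return classifier(text) >= min_prob
```

Running the cheap word check first means the model is only called on posts that already look plausible.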
there is already sona toki (also by the mr. regnal ordinal the third danielson affectionately "gregdan" himself) which does this without AI
i'd imagine it to be faster than this approach
wheres the link to sona toki?
wait, did you not know about my library? we were talking about it in DMs a couple days ago tho
oops 😭 i totally forgot the name of your repo
yeah when i saw the tp filter on bluesky, i thought it used sona toki
it uses sona toki plus my project / model
i wanna talk, i'll talk about sticks.
i wanna wait, i wanna request.
i wanna wait for ears.
i wanna eat a pear.
i wanna fuel my wants.
i wanna talk about talking.
i wanna move my wants.
i wanna look into the sun.
i wanna wait for myself.
i wanna talk about my wants.
this is actually a fire jan moose-awi verse
ILO IKE
toki li Inli ala
Is there a collection of ai training data for pronouncing tp
mi awen e toki [pona] mi lon tenpo weka https://drive.google.com/file/d/14bNf2Ler9cOIGbAsrfceAbU8W29TpFkQ/view
ante la · jan [2swap] li pana e toki kalama mute lon ma [Tatoeba] · taso sina wile pana e ona tawa ilo la o toki tawa ona
Thx! Also, Im not fluent in tp yet, could you translate that pls?
Nvrmnd i figured out what you said
thanks
there's the Mozilla Common Voice collection
for the record, he's not a moose. there's a video on this
[Reply to:](#1162756469771472988 message) this is actually a fire jan moose-awi verse
there's https://commons.wikimedia.org/wiki/Category:Lingua_Libre_pronunciation-tok although I'm not sure if that can be used in AI training or not
[Reply to:](#1162756469771472988 message) Is there a collection of ai training data for pronouncing tp
(if you find out, let me know! some of my voice is in there, and I'm curious if it can)
[ilo penpo o lukin ala]
on the one hoof public domain means no copyright whatsoever, and a lot of research tts has been trained on public-domain audiobooks
on the other hoof it might come up against personality rights? e.g. a few contributors to mozilla common voice (also cc0) objected to having their data used for tts
on the third hoof the official opinion of creative commons is that training on copyrighted data is fair use
and if i may lie down, on the fourth hoof will single words even be sufficient for whatever project jan Pa has in mind?
This video was funny but I saw a comment by Kurz23 which saddened me because how can someone be open minded enough to learn toki pona but not to accept tonsi
on the … hoof
poni li lon
whats your main goal?
kin la i love "and if i may lie down" for the 4th hoof
mokon
moku lon
Sorry to kinda necropost, I had a conversation with jan Sonja if I remember correctly on why we shouldn't have a luck/choice word! I wonder what the AI would do if presented with feedback from the language's creator
mi kin li wile sona e tan
Found it! Keep in mind this was over 9 months ago, so I'm a far better tokiponist hopefully! Ignore anything dumb I say!
kon: am i a joke to u
this is like the "three gry" riddle.
"there are three words in toki pona which end in -on. the first is lon, and the second is kon. what is the third?
mokon"
I like how it says "unofficial" without explaining what "official" is
makes it sound like there's a high register of toki pona spoken by the pu government of ma pona and then there are countless unofficial dialects with a bajillion words, derivations, and adaptations
iirc Sonja uses "official" to mean "by Sonja"
or like "Sonja's version of toki pona"
i forget what exactly, but it has to do with her specifically
i mean for all we know it could be correct, maybe people do use “mokon” but since it isn’t commonly used not many people know /j
yeah we all know poetic forms, derived and alternative forms, and dialectal words and spelling is so common within toki pona /j
I like this idea. Or at least, I like the idea of telling people this is how toki pona works. "Can I say X in toki pona" "it's fine with me, but I just checked in with the grand Vizier of the permanent ruling council of pu, and they said no. Sorry. "
"you may switch to a lower-class dialect, but that'll significantly downgrade your ma pona social status and you'll never get a toki pona job if you don't speak nasin pu"
I don't like it any more than you do, but that's just what society is like. I'm sorry.
love how it completely ignores "ale"
yknow... the toki pona word for "all"?
mi lukin la jan [Sonja] li toki ala anu seme e ni → "nimi pi kon 🎲 o lon ala"
ma [Amazon] la lipu ona li lon kulupu "Official Toki Pona"
lmao
kule and wile as well
If only we could star things in this channel because that is a classic
This is why you never use ChatGPT instead of Google, kids
I never understood people who use chatgpt for simple questions instead of a search engine. like, why would you do that? can't you just, well, search? it's not like chatgpt has more information, or more reliable information, or anything else. you are just asking a question to an ai model that just guesses what to answer instead of actually searching the internet. why? what's the point? it's not better in quality. it doesn't provide you with more information. it isn't even easier. you can just go to any search engine, type the question, and probably get a highlighted text on the first page telling you the answer, and maybe think for two seconds, but for simple questions, that's all for the basics. no idea why so many people use chatgpt for that... just ask an actual source instead of an autocomplete program...
I think it has a lot to do with the fact that most people still don't understand how Generative AI works. I have another theory too, but I'm running between things - it's more philosophical, I'll share it later!
Speaking of gpt and searching the web
I do have gpt pro and they have a web search feature. Here I tried "teach me toki pona" it didn't help me directly but it did list some resources... Hey I know that person!
"according to chatgpt" you couldn't put 1 google search?? 1 wiktionary search?? (although both google translate and baidu get it wrong because the readings of the characters individually change when put together)
嗝屁 btw
luckily not everyone has that mindset
sometimes i do it because i have a more complicated question that i feel wouldn’t work well with a search engine, like long sentences with complex phrasing
I’m talking about simple questions though. Like, just random facts with a two word answer type questions
but is or is it not geE pIi
ah, for that one i have no idea
imo it’s not even easier to use an AI, it’s easier to just use a search engine
i use chatgpt for things that i don’t know much about, where it’s kind of hard to know where to start. so even for something that might have a technically correct two word answer, if i have no experience in that field, i’d probably ask follow up questions
in other words you make it harder for yourself to see if it's wrong
it's gěpì
a kule kalama li ante lili
which is why they were so downvoted. if ai was right it probably wouldn't have happened
ijo suwi o...
Exciting news! I'm part of a team applying for a big federal grant (National Endowment for the Humanities) about Generative AI and the Humanities. I'm heading up the part of the team on AI and Creativity. At a planning meeting this morning, there was a feeling that we should do something about machine translation, particularly from languages with small textual corpora - with an eye to language preservation etc. My colleague, who is Abenaki, said that no way in hell were we letting an LLM anywhere near her language, as endangered as it is.
So, I finally outed myself as a jan pi toki pona, and told them all about this thread and our experiments. People were very excited. So, if we get this grant, some of it will be devoted to LLM translations to and from toki pona. As part of that, I'll be teaching the language to a bunch of undergrads, who will then be let loose on a variety of models, and will experiment with prompting etc.
I haven't mentioned this yet to the team - but I'd also like to find a place for people here to participate - since this is probably the most experienced group of people anywhere on this subject. I'll keep you all updated if we get the grant.
WOOOHOOO
As for prompting, it could be possible, but despite my best efforts I couldn't get only prompting to work with understanding toki pona.
(tested with Gemini 1.5 pro by scraping very popular toki pona resources)
Seems to take things too literally, as it misses nuances that would only be possible if trained on toki pona corpuses
I'd love to contribute what I know so far
What corpuses are available that are in toki pona, but are meant to be for education? (i.e., educational subjects, can also be toki pona subjects)
lipu tenpo is educational
Ty for the resource
I think one outcome of the study will be precisely that: that prompting doesn't help very much. We'll also try more systematically to figure out the kinds of mistakes different LLMs tend to make, and see if there is any possible way to figure out the cause. (Some versions of ChatGPT sound as if they're writing their response in English, and then translating literally - but how can that be, given the token-by-token generation mechanism of transformer architectures?) And it will be interesting to think why Claude is so much better than any other.
I thought too that translating lipu tenpo articles to English would be a good test case. The generous license allows that kind of use. (And in any case, I have half a dozen articles in there that I'd be happy to set them loose on).
I see, good to know! there are actually a few sentences in there (including the entire lyrics to soko soweli by kqa, broken up by sentence. I am constantly questioning my systemmates' decisions), but yeah for the most part it's words
-# (also I love the four hooves phrasing)
[Reply to:](#1162756469771472988 message) [ilo penpo o lukin ala]
on the one hoof public domain means no copyright whatsoever, and a lot of re…
yeah...jan Kuse is a whole deal. it's alright though, we love jan Usawi and her antlers
[Reply to:](#1162756469771472988 message) This video was funny but I saw a comment by Kurz23 which saddened me because how can someone be open…
mokon 😭 if you're going to skip kon, at least do like lokon or something
Official Toki Pona refers to her book series
[Reply to:](#1162756469771472988 message) i forget what exactly, but it has to do with her specifically
and if you really are fed up of Google, next time try out your phone's text-replacement algorithm. it's basically the same stuff
[Reply to:](#1162756469771472988 message) This is why you never use ChatGPT instead of Google, kids
personally i think simply that claude has the most proprietary dataset ever, it knows things that i would imagine only being knowable via a github scrape
probably some very clever web crawling + not caring about licenses
whoa, very cool!! teach them well
[Reply to:](#1162756469771472988 message) ijo suwi o...
Exciting news! I'm part of team applying for a big federal grant (National Endowment …
however writing a full report would be so so so interesting to read!!! im more than excited for the future research to be done.
does the license allow using it for LLM training? i ask since i would really love to get knowledge transferring with llama 3.1 (8B) and toki pona
would be more than useful for translation specifically
is official her english title
official Sonja Lang
wouldn't this make unu official
well, usually anything by Sonja that isn't the language itself
like official books, official website, official shirt store...
You should double check the license. But when I asked the jan pali about using it to fine tune a language model, they said sure, as long as you identify publicly the source. The license should specify exactly how you're supposed to do that.
It's a very good source, because everything is written by proficient tokiponists, and then proofread by other proficient members of the community. It's as close as you'll get to guaranteed "correct" language.
sorry to detract from this quite goated conversation but
i just wanna say
i think everyone should at some point try forcing google gemini to engage in a toki pona conversation with them
maybe it's official because she has an office
la, seme li kama?
ni li tawa sina taso, jan o
ah, so nusun is official
[Reply to:](#1162756469771472988 message) well, usually anything by Sonja that isn't the language itself
like official books, official website…
as is tankala
not to mention mulapisu
toma is in ku so it's undeniably official
of course
here are some things i have experienced trying to have a conversation with google gemini in toki pona:
- at first, it vehemently refused to get the picture. it just replied with a thing it thought my toki pona meant (it was wrong)
- when i eventually told it to speak toki pona with me, it began speaking nearly entirely unintelligible toki pona. i'm talking... like... "misuse of particles as intensifiers" bad. it speaks with each line separated by line breaks and a side-by-side translation.
- somehow, when sending the prompt "sina pana e toki tu tu. toki tu tu la ale li pakala. toki pona sina li ike mute. o kama sona." (this isn't formatted fantastically but you get the picture) it assumed i was talking in Marathi and it responded back entirely in Marathi.
- when telling it in toki pona that it did this, it then began to spit out a barrage of sentences in an endless loop (most if not all wrong), only stopping presumably when it hit its character limit.
- when i tried again to engage in toki pona conversation after this, it continued to just express that it was good and wanted to talk to me. no attempts to talk about anything other than "are you good" or "can you speak toki pona" worked. keep in mind most of their sentences were still pretty wrong.
ok just saw the etymon
tbh should be nujun
especially since the <j> is like /ɟʝ/
I guess even less toki pona in its training material than GPT
ike a
[ilo penpo o lukin ala]
it's the same license as wikipedia, and most people seemed fine with e.g. translation models being trained on wikipedia
i think teeechnically if you wanted to super-comply you'd have to train a model purely on licensed data
but also the creative commons organization has stated that they believe neural network training to be fair use
well shoot
makes sense tho
i dont know if i can create a model from scratch using -only- licensed data
[ilo penpo o lukin ala]
the only such language model i know of is santacoder, which notably still obeyed opt-out requests despite mit being non-revocable
which btw i expect more people to object based on "i don't want this to be used to train ai" than "you didn't follow the share-alike clause fully"
santa coder is good but also really big, also unsure how good it is knowledge-wise, ill have to do some digging
The SantaCoder models are a series of 1.1B parameter models
ilo toki pi tenpo lon la ni li ala :p
oops the model size could just be the optimizers not being removed 😭
on disk it shows 4.6 gb but thats usually meant for larger models, probably something with the checkpoint
? 1.1b * 32bit ≈ 4.6gb anu seme
oh is it stored in 32bit? oops
ken
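The arithmetic in that exchange, as a quick sketch. `checkpoint_size_gb` is a made-up helper, and 1.1e9 is just the rounded parameter count quoted above; the result only estimates the raw weights, so a file on disk can reasonably come out somewhat larger.

```python
def checkpoint_size_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate size of the raw weights in gigabytes (1 GB = 1e9 bytes)."""
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / 1e9


# Rounded parameter count from the model card line quoted above.
fp32_gb = checkpoint_size_gb(1.1e9, 32)  # weights stored as 32-bit floats
fp16_gb = checkpoint_size_gb(1.1e9, 16)  # same weights stored as 16-bit floats

print(f"32-bit: ~{fp32_gb:.1f} GB, 16-bit: ~{fp16_gb:.1f} GB")
```

At 32 bits per parameter this lands around 4.4 GB, in the same ballpark as the 4.6 GB seen on disk, which fits the "stored in 32bit" explanation better than leftover optimizer state.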
i might have to finetune santacoder as you suggested, as my searching shows no other models that have been trained on specifically permissible data
sina kute ala e mi tan seme 😭
i expect more people to object based on "i don't want this to be used to train ai" than "you didn't follow the share-alike clause fully"
oh sorry im feeling really tired
okay ill most likely settle on llama 3.1 8B, seemed to have the most impressive adaptation to toki pona by far. things i want to try are:
- including synthetic Q&A inside the dataset directly after data entries
- iirc, having finetuning data within the pretraining data helps the model adapt better
- similar to RAG, this should help the model connect the idea of *translation* more efficiently
- include lipu tenpo in some way for a general knowledge improvement
- current toki pona llama 3B model finetune seems to understand basic intentions of toki pona words, but fails to connect concepts that were only displayed in english, failing to utilize the concepts in toki pona conversations
- im thinking that this method could heavily help with transfer learning
- currently, the 2nd biggest corpus (only behind hecko's chat logs) is a toki pona nltk corpus, which seems to only contain more mundane (i.e., less educational) concepts (as seen in emails / everyday conversations)
- i really want the model to interpret english and toki pona better
- providing the model high quality toki pona learning resources
- even claude hallucinates with toki pona knowledge: despite translating it fluently, it fails with facts of the language, as if it were trained on toki pona texts but not toki pona tutorials / guides / info
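One way the "synthetic Q&A directly after data entries" idea could look when assembling training text. This is a sketch only: the `Entry` class, the `render` function and the `Q:` / `A:` layout are all assumptions, not the format actually used for the model.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Entry:
    """One corpus passage plus optional synthetic question/answer pairs about it."""
    text: str
    qa_pairs: List[Tuple[str, str]] = field(default_factory=list)


def render(entry: Entry) -> str:
    """Place the Q&A pairs directly after the passage they refer to."""
    parts = [entry.text.strip()]
    for question, answer in entry.qa_pairs:
        parts.append(f"Q: {question.strip()}\nA: {answer.strip()}")
    return "\n\n".join(parts)


example = Entry(
    text="soweli li moku.",
    qa_pairs=[("What does 'soweli li moku' mean?", "The animal eats.")],
)
print(render(example))
```

Keeping each Q&A block adjacent to its source passage means both land in the same context window during training.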
currently, the 2nd biggest corpus (only behind hecko's chat logs)
seme a‽ mi wawa ni‽
poka la · sina moku ala moku tan https://wikisource.org/wiki/Main_Page/tok
I havent used the toki pona wikipedia if that's what you're referring to, and yes, half of my pretraining dataset was your chat logs lol
i really really really wonder how claude created their dataset for sonnet 3.5
a mi pakala · https://wikisource.org/wiki/Main_Page/toki_pona
lipu pi jan mute li lon ona li lon nasin pana CC
which website would you recommend i use to provide the model the best insight into learning toki pona and its concepts?
https://sona.pona.la/ li ken
thank you!!
a a ilo [Codeium] li sona e ni → toki [pona] li lon · (mi pana e nimi toki mute pi toki [pona] e nimi "lipu")
anything that isn't the language itself
And Sonja can make something and not include it in the "official"
me when I spend my taxes on Toki Pona /nm /positive
wh-
a, pakala mi. ale la mi musi taso
[Reply to:](#1162756469771472988 message) anything that isn't the language itself
I thumbs-downed it and it changed to this
Every penny spent on toki pona is one less penny going to the military industrial complex!
agreed
ilo li ike a. Just so bad. I despair a little.
It's interesting that it's not stochastic at all. We can get exactly the same response
ChatGPT (4o):
Marginally better - at least somewhat coherent. But it should of course have said ma pi nena pimeja (LLMs rarely get "pi" placement correct; and don't even seem to know about rarer words like "nena")
i feel like pi isn’t strictly necessary there
it works with or without it
only difference is “place of black mountain(s)” or “black mountainy place” i guess
sona mi la nimi official li toki e esun lipu ona
Yes, those are slightly different - but both could be meaningful in this context.
ilo [Claude] kin li ike:
(this is the free haiku model)
I love how it bullshits an explanation at the end about what is "more natural" in toki pona.
Wow - that's interesting!
"musta" is black in finnish
And pimeja is also from Finnish - what an interesting mistake
thinking toki pona is finnish is a mistake that happens to actual humans too so it's not that surprising i guess
But it is odd to substitute a toki pona word that is actually from Finnish, with another made-up word that is also based on a Finnish word
esun len kin li ni ala ni
that's hilarious
facepalm in a kind of burn them all to the fucking ground mood today.
wait wrong reply
I'm pretty sure it generates a response the first time something is searched
[Reply to:](#1162756469771472988 message) It's interesting that it's not stochastic at all. We can get exactly the same response 📎
"pimeja is derived from the english word "primitive"" is extremely funny tho
and then serves that same one when it's searched again
Agreed - prime LLM bullshit
Oh ni li sona musi - I didn't know that
(except if that one's given a thumbs down, in which case it seems to generate a new one? but that caps at one regeneration)
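The behaviour being guessed at here (generate once per query, serve the stored answer afterwards, allow a single regeneration after a thumbs down) could be modelled roughly as below. This is purely a sketch of the guess in this thread; `CachedAnswers` and its methods are hypothetical and say nothing about how the real service is built.

```python
from typing import Callable, Dict, Tuple


class CachedAnswers:
    """Generate an answer once per query, then keep serving the stored copy."""

    def __init__(self, generate: Callable[[str], str], max_regenerations: int = 1):
        self._generate = generate
        self._max_regenerations = max_regenerations
        # query -> (current answer, regenerations used so far)
        self._cache: Dict[str, Tuple[str, int]] = {}

    def get(self, query: str) -> str:
        if query not in self._cache:
            self._cache[query] = (self._generate(query), 0)
        return self._cache[query][0]

    def thumbs_down(self, query: str) -> str:
        """Replace the stored answer, but only up to the regeneration cap."""
        self.get(query)  # make sure there is an answer to reject
        _, used = self._cache[query]
        if used < self._max_regenerations:
            self._cache[query] = (self._generate(query), used + 1)
        return self._cache[query][0]
```

Under this model two people searching the same thing see the same text, which would explain why the output doesn't look stochastic.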
quite a lot better, actually!
[Reply to:](#1162756469771472988 message) ChatGPT (4o): 📎
but Montenegro's not the native name
for me, without pi that implies that the place is black
[Reply to:](#1162756469771472988 message) it works with or without it
Not sure how to tokiponize Crna Gora
Sinakola is pu I think, I've also seen Sunakola
[Reply to:](#1162756469771472988 message) Not sure how to tokiponize Crna Gora
Excel
database
...
Sena Kola?
probably any SVnakola is fine
I've just suddenly remembered "Thanks, Robert!"
[Reply to:](#1162756469771472988 message) Agreed - prime LLM bullshit
I know a spreadsheet isn't exactly a database. But that was the accessible example given in the computer scientist's talk. He meant a static, dependable table of data from which one can reliably extract the same information every time.
went and found it
[Reply to:](#1162756469771472988 message) i love how it gives your name every time:
"Thanks, Robert!" "Another great question, Robert!" "Preci…
That was back when I gave a presentation and none of my colleagues knew about toki pona - even though I really wanted to talk about it in that paper! It feels nice to be out of the linguistic closet
the closet is a funny metaphor for this
a a a mi pilin sama
"Mom, Dad...I'm a toki pona speaker. I hope you're okay with it."
No disrespect meant of course to those who are actually having to struggle with being in the closet
a a it's funny to me as a queer person
training the new toki pona model:
- base model: llama3.1 8B
- dataset:
- toki pona bible project
- hecko's chat messages (only toki pona)
- lipu tenpo
- wikisource / toki pona
- sona pona
- toki pona nltk corpus
- synthetic claude 3.5 conversations
- feedback from the toki pona 3B model
training on an A100, should take less than an hour as it says on the trainer
n ken la esun ale ona li ni
ona li pali wan e ona li pana e ona sama lon tenpo ale anu seme
mi wile lukin kepeken e ni. taso ilo sona mi li wawa ala
lukin la ni li nasin ona.
don't worry, ill be hosting interactive demos where you can try using the model / provide feedback to it to improve it directly
whoops running out of huggingface storage
btw, is this dataset public as an actual dataset instead of this list of data sources?
ye i can upload the json here if needed
or the raw txt file
@hollow osprey would you be comfortable with sharing the dataset with your logs in it?
✅
this includes all the sources i listed
jsonl if needed
(jsonl is tokenized for the llama 3 tokenizer, with the context size of 4096)
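For anyone unsure what "tokenized with a context size of 4096" means in practice, here is a minimal sketch of splitting a long run of token ids into fixed-size windows and writing one JSON object per line. The `input_ids` key and the helper names are assumptions; the actual file may be laid out differently, and the ids below are fake rather than real tokenizer output.

```python
import json
from typing import Iterable, Iterator, List


def chunk_tokens(token_ids: List[int], context_size: int = 4096) -> Iterator[List[int]]:
    """Yield consecutive windows of at most `context_size` token ids."""
    for start in range(0, len(token_ids), context_size):
        yield token_ids[start:start + context_size]


def to_jsonl(chunks: Iterable[List[int]]) -> str:
    """Serialize each chunk as one JSON object per line (JSONL)."""
    return "\n".join(json.dumps({"input_ids": chunk}) for chunk in chunks)


# Fake token ids standing in for real tokenizer output.
fake_ids = list(range(10_000))
chunks = list(chunk_tokens(fake_ids))
print([len(c) for c in chunks])
```

The last window is simply shorter here; padding it or dropping it is a separate choice this sketch does not make.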
imo it should be on hugging face in case anyone will do something related to toki pona
https://wikipesija.org/ li suli li ken pona · taso lipu mute li ike la nnn
i will its just i have to work out the description
oh gotcha
https://huggingface.co/johnpaulbin/llama3.1-8b-tokipona-toki-1
model available here
example testing it:
[{'role': 'system', 'content': 'You are a helpful assistant.'},
{'role': 'user', 'content': 'hello!'},
{'role': 'assistant', 'content': 'Hello! How can I assist you today?'},
{'role': 'user', 'content': 'how can i say hello in toki pona?'},
{'role': 'assistant',
'content': 'In toki pona, you can say "suno o, toki!" or "suno o, toki!" (literally "Hello, speak!" or "Greetings!"). However, a more common way to say "hello" in toki pona is "suno o!" (literally "Sun!"), "jan pona o!" (literally "Friend person!"), or "toki!" (literally "Speak!").'}]
(note: i havent done any post-finetuning with regard to training it on toki pona specifically, im running inference on the "base" model of the toki pona model)
not really? a good amount of words are taken from irl languages
iirc kijetesantakalu is taken from finnish
translation example
(still the base model)
{'role': 'user',
'content': '"ma sewi is a server for toki ponists to discuss religion, faith, and spirituality with one another in any language (especially toki pona). It is a place for interfaith discussions, which means that instead of arguing over truth, folk accept that truths can differ for people of different faiths. "'},
{'role': 'assistant',
'content': 'Translation: "ma sewi li ilo pi kulupu jan pi toki pona tawa ni: ona li toki e ijo pi nasin sewi, pilin sewi, en pilin pona tawa jan ante. ona li ma toki pi kulupu ante pi nasin sewi. tan ni la jan li toki ala e ni: nasin sewi pi jan wan li lon. jan li toki e ni: nasin sewi li ante tawa jan ante."'}]
another one
'content': '"Garry\'s Mod, commonly clipped as GMod, is a 2006 sandbox game developed by Facepunch Studios and published by Valve. The base game mode of Garry\'s Mod has no set objectives and provides the player with a world in which to freely manipulate objects. Other game modes, notably Trouble in Terrorist Town and Prop Hunt, are created by other developers as mods and are installed separately, by means such as the Steam Workshop. Garry\'s Mod was created by Garry Newman as a mod for Valve\'s Source game engine and released in December 2004, before being expanded into a standalone release that was published by Valve in November 2006. Ports of the original Windows version for Mac OS X and Linux followed in September 2010 and June 2013, respectively. As of September 2021, Garry\'s Mod has sold more than 20 million copies. A successor, Sandbox, has been in development since 2015."'},
{'role': 'assistant',
'content': 'Translation: "jan Gari li pali e ilo musi pi pali mute lon tenpo sike 2006. nimi pi ilo musi ni li \'Garry\'s Mod\' anu \'GMod\'. ilo musi ni li jo e nasin musi wan. taso jan ante li pali e nasin musi ante mute. jan mute li pali e nasin musi \'Trrouble in Terrorist Town\' en \'Prop Hunt\' kepeken ilo musi ni. jan Gari li pali e ilo musi ni kepeken ilo musi pi jan Palu. ilo musi ni li kama lon tenpo sike 2004. taso jan Gari li pali mute e ilo musi ni. tenpo sike 2006 la ilo musi ni li kama lon. ilo musi ni li kama tawa ilo Mac OS X lon tenpo sike 2010. ilo musi ni li kama tawa ilo Linux lon tenpo sike 2013. tenpo sike 2021 la ilo musi ni li kama jo e mani mute. ilo musi sin pi pali mute li kama lon tenpo sike 2015."'}]```
hey all, opening up the toki pona translator here: ~~https://8a2c06126446b95e6f.gradio.live/~~
it should be able to automatically detect which language you are translating from and to
(e.g. entering toki pona will return english and vice versa)
feedback isnt being collected on the demo (but you can still send feedback here) as its more of a test run but feel free to try
It's really very impressive! But I just hit my first error - do you want it reported somehow? Here?
OK (slightly personal) one of my prayers every night is for my friend Hannah, who is a nun, and looks after the cows in her convent - so I always say this "sewi o awen e jan Ana e soweli ona e meli sewi ona" - which the translator gives as "God preserve Anne, her cat, and her husband" - I don't know where it gets "husband" from meli sewi - I mean her "holy sisters" "nuns"
oh interesting
i assumed that incorporating the toki pona bible project would cover the use case of that translation, perhaps more context would help? unsure but otherwise ill be sure to work out including more context
maybe have previous prayers within the translation prompt, may help guide the model
The rest is good - but mije ona or mije olin ona would be husband. meli sewi is clearly "holy woman/women"
ah okay
maybe because meli/mije have become less common in toki pona? idk
gender-neutral?
its surprisingly good (at least from what i can tell) with handling context
e.g. you can give it discord chat logs that are in toki pona
variable responses though
interesting
it also seems pretty certain that soweli means cat
i think it interprets that by default if you dont include more context in the prompt, but i could be wrong. ill most likely be finetuning it
nja...
ooh i could make the translator provide multiple possible translations + a breakdown
similar to claude
ill try to make a synthetic dataset for that with claude
mi pilin e ni: ona li sona ala e nimi nja
a ilo li lon li ken pali e ante toki pi pona lili la mi pilin ike tan ken ni: jan li tawa lipu wikipesija li pana e jaki mute tan lipu pi toki inli kepeken ilo ni...
that's valid criticism, i dont want people to rely on this for use cases like that yet
its not done yet either
what specific examples do you have?
maybe even preferences like formality, accuracy etc.
i think too it would be interesting to pair it with grounding from a search engine
closing the demo for experimenting with new prompting methods
will reopen soon
Crazy because the actual word for cat is kisa suwi puwa ko nja
it's down 😦
sorry about that, currently tweaking the backend to see if the model performs better
the model is being prompted differently to produce a breakdown
so results may be slower, however please let me know if translations are more accurate
what if you include the whole dictionary including less common words in the system prompt? or use some kind of RAG
are Tatoeba / Wikipesija usable?
[Reply to:](#1162756469771472988 message) training the new toki pona model:
- base model: llama3.1 8B
- dataset:
- toki pona bible project
… 📎
- toki pona bible project
the problem with that is that past the ~140 most commonly used words most words aren't used or understood
[Reply to:](#1162756469771472988 message) ooh yeah that sounds like a good idea
so if you tell it that kan, for example, is a word, then it might use kan. this is a problem because barely anyone has used kan since 2002, and few understand it
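A sketch of the dictionary-in-the-prompt idea with the rare-word concern built in: only look up words that actually appear in the input, and skip entries below a usage cutoff so the model is never told that a word like kan is fair game. The `DICTIONARY` contents, glosses, usage numbers and function names are all invented for illustration, and this only covers the toki pona to English direction.

```python
from typing import Dict, Tuple

# Hypothetical entries: word -> (rough gloss, made-up share of speakers who use it).
DICTIONARY: Dict[str, Tuple[str, float]] = {
    "soweli": ("land animal", 0.99),
    "meli": ("woman, feminine person", 0.95),
    "kan": ("with (obsolete)", 0.01),
}


def glossary_for(text: str, dictionary: Dict[str, Tuple[str, float]],
                 min_usage: float = 0.5) -> str:
    """Return glossary lines for words in `text` that are commonly used."""
    words = {w.strip(".,!?:;").lower() for w in text.split()}
    lines = [
        f"{word}: {gloss}"
        for word, (gloss, usage) in sorted(dictionary.items())
        if word in words and usage >= min_usage
    ]
    return "\n".join(lines)


def build_system_prompt(text: str, dictionary: Dict[str, Tuple[str, float]]) -> str:
    """Prepend the filtered glossary to a generic translation instruction."""
    glossary = glossary_for(text, dictionary)
    return "Translate the user's toki pona into English.\nGlossary:\n" + glossary


print(build_system_prompt("soweli kan meli", DICTIONARY))
```

The cutoff value is arbitrary here; where to draw the line between common and rare words is exactly the judgment call raised above.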
this is actually a surprisingly good explanation! a man can be "meli" as well—it's a descriptor that implies femininity. I think @signal obsidian has thoughts on this?
I do! I've been meaning to post that essay at some point lol. But yeah, my contention is that there are no words in Toki Pona for gender. "meli" happens to usually mean "woman" but any gender can be meli.
ni li nasin pi pona a
mi pilin e ni: ken la, tenpo ala la ona li pona tawa ni
Oh yes! Tatoeba is actually in the dataset already, i forgot to list
i might have to agree: i think that even in the mainstream wikipedia.org, using google translate for example to translate english articles into spanish is not as good as actual spanish speakers
thats awesome! surprisingly, im pretty sure it learned the concept from one of the wikipedia sources
which is a really amazing occurrence, since it means the model "knows" the concept
One example I give in the essay I drafted but never finished is: Suppose two lesbians, both strictly into women, are at a bar. One says "mi alasa e olin sin". The other nods to a cute fem at the end of the bar and says "ona li pona sijelo". The first shakes her head. "ala. mi wile e mije."
njan
n'jan 🎩
n'meli 🎩
n'meli 🎩
n'tonsi 🎩
cool!
[Reply to:](#1162756469771472988 message) thats awesome! surprisingly, im pretty sure it learned the concept from one of the wikipedia sources
oh @dense nexus I'd highly recommend you feed it this
if you can
it's a comprehensive CC BY-SA 4.0 grammar of toki pona, and explains in detailed prose how virtually every grammar concept works and is used in the common nasin
oooo ill include that!
translator demo link closed for now, ty for those who gave feedback!!
well it can be gender neutral
o lukin e toki pi wan Tansin. kin la a man can be meli too
i guess it's funny since it defines meli as "woman" and "female person" at first rather than "feminine person" or something
yeah
If you have access to Claude 3.5 Sonnet, then please send me the English translation of this toki pona text
ni o kama pona. nasin tu wan taso li lon tawa wile ni. ona nanpa wan li ni: mi taso li lawa e ijo ale ona. ona nanpa tu li ni: sina taso li ni. ona nanpa tu wan li ni: lawa pi kipisi wan li tawa sina. ona pi kipisi sama ante li tawa mi.
That seems to me to make perfect sense contextually - and in that case meli/mije can be separate from gender, and instead refer to (I guess?) gender expression. But I don't think that means that meli/mije are no longer gender words. All words in toki pona have that flexibility, and ability to be pushed beyond their semantic space. But if they don't have a clear enough semantic space in the first place, you can't do clever things with them - such as your example.
This is more #sona-musi stuff. But I think examples like yours are precisely the kind of creative and witty use of language that an AI is never going to be able to do.
If we do get this research project funded, I'd love to put together a list of sentences where words are pushed as hard as this out of their normal spaces - as a test to machine translation
has anyone figured out why ai loves palisa so much
I might say meli pi nasin/kule mije
"This should be done well. There are only three ways to achieve this goal. The first one is: I alone control all of it. The second one is: you alone do it. The third one is: control of one part goes to you, and control of the other equal part goes to me."
Note: The translation aims to capture the meaning while making it natural in English. Toki Pona is known for its minimalist nature, so some context might be needed for a more precise interpretation.
I posted way back on this. A really interesting paper (investigating GPT-J) showed that when you kept pushing that LLM beyond its training resources, it fell back on sexual (especially phallic) and violent imagery. It was like a glimpse into the messy subconscious of the LLM - or perhaps our subconscious, since it's trained on our language. I wonder if that is also happening here, since palisa appears as a phallic word sometimes. (But it doesn't grab onto unpa, oddly)
But given its very limited training in toki pona, it's very easy to push it beyond its limits - so it gets to that toxic state quickly
Here's my original message: #1162756469771472988 message
in the same way that a tomo tawa is tawa or an ijo suwi is suwi, a meli mije is by definition mije
[Reply to:](#1162756469771472988 message) I'm guessing since you said that they are both strictly into women that by mije they mean a butch. I…
toki palisa
la poki sina li sina anu seme :p
yes, because "li sina" can mean "to be yours"
poki mi li mi
yes
[Reply to:](#1162756469771472988 message) la poki sina li sina anu seme :p
Oh yeah I saw that one
Oh yeah because you sent it lmao
ni a
https://huggingface.co/johnpaulbin/llama3.1-8b-tokipona-toki-2
second run of llama3.1 8B
more epochs + trained on synthetic chat data from claude3.5 sonnet
seems to be pretty good with translation, perhaps better than the last model
opening demo really soon
also planning to train a toki pona vision model with the help of hecko
this also comes with the ability to chat, but the demo will be focused on translation only for right now
added an optional context box
plus an extra context checker if you wish to translate less formally
Wow! Looking forward to being able to try this out
https://1053696d53304e6c25.gradio.live/ Demo now open here
Added the ability to chat in a new tab
(chatting seems to have better translations than the translator feature, but can have worse results in general. however it is also more flexible)
chatting seems to be hallucinating more for some reason
hmm translation seems to vary way more than claude, however prompting it correctly brings good translations
feel free to test out prompting methods to bring the best out of the model
this prompt seems to work decently well with translating inside the "Chat" tab:
EXAMPLES OF TRANSLATIONS:
[
{Example 1:
Input: "ona li kama sona e toki pi ma ante lon tenpo suno ni."
Translation: "They are learning a foreign language these days."
},
{Example 2:
Input: "a a a mi wile moku!"
Translation: "Haha, I'm hungry!"
},
{Example 3:
Input: "o tawa! tomo li ike."
Translation: "Let's go! This place sucks."
}
... [continue with other examples]
]
CURRENT TASK:
Translate the user's input, retaining cultural nuances and not losing meaning.
Input to translate: "LMAO that was awesome"
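(A minimal sketch of assembling a few-shot prompt like the one above from a list of example pairs. `build_prompt` is a hypothetical helper, and the layout only approximates the posted prompt; how the demo actually builds its prompt is not shown in this chat.)

```python
# Example pairs copied from the prompt above: (toki pona input, English translation).
EXAMPLES = [
    ("ona li kama sona e toki pi ma ante lon tenpo suno ni.",
     "They are learning a foreign language these days."),
    ("a a a mi wile moku!", "Haha, I'm hungry!"),
    ("o tawa! tomo li ike.", "Let's go! This place sucks."),
]

def build_prompt(examples, text):
    """Join numbered examples, the task description, and the input into one prompt."""
    parts = ["EXAMPLES OF TRANSLATIONS:"]
    for i, (source, target) in enumerate(examples, start=1):
        parts.append(f'Example {i}:\nInput: "{source}"\nTranslation: "{target}"')
    parts.append(
        "CURRENT TASK:\n"
        "Translate the user's input, retaining cultural nuances and not losing meaning."
    )
    parts.append(f'Input to translate: "{text}"')
    return "\n\n".join(parts)

prompt = build_prompt(EXAMPLES, "LMAO that was awesome")
print(prompt)
```

Keeping the examples in a list makes it easy to swap them out when testing which ones bring out the best translations.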
closing demo link for now
Opening demo again, tweaked prompt again and other things
Anyone is able to try: https://52cf0a0e675472f240.gradio.live
i don’t think this is a good translation of “mi lanpan e pan li pana e poki tawa sina”:
Here's the translation guide in toki pona:
toki pona Translation Guide for English
Quick Format
tp: [toki pona text]
en: [English translation]
notes:
- [only super important stuff]
Key Points
- Keep it natural
- Use common English contractions
- Break long sentences into smaller ones if needed
- It's cool to use relevant emojis
Translation Tips
- Vibe > word-for-word translation
- Be precise with grammar and words
- Use "la" for if/when/because
- Basic pattern: [subject] li [verb] e [object]
Common Patterns
- "mi jo e len loje" = I have a red cloth
- "sina pali e seme?" = What are you doing?
- "mi wile moku" = I'm hungry
- "ni li pona tawa mi" = I like this
Quick Examples
Greeting:
en: "Hello! How are you?"
tp: toki! sina pilin seme?
Goodbye:
en: "See you later!"
tp: tenpo ante la, toki!
Remember
- Keep translations natural for toki pona
- Match the original's energy
- Don't overexplain obvious stuff
- If something's unclear, just ask
- Have fun with it!
Quick Checks
- Does it sound natural in toki pona?
- Did you keep the mood/vibe?
- Is the meaning clear?
- Would a toki pona speaker get it?
That's it! Just keep it natural, clear, and fun! 🎯
Now, translate this toki pona sentence to English:
"mi kama sona e toki pona tan ni: ona li pona e lawa mi."
LOL thanks for the report
I can't restart the demo without changing the link, so feel free to attempt again with a different prompt or use the formal setting
the formal setting also has unnecessary things:
OUTPUT:
Original: "I took the bread and gave you a box."
Translation: "mi kama jo e pan li pana e poki tawa sina."
Breakdown:
- "mi kama jo e pan" = I get the bread (I took the bread)
- "li" is used to separate the two actions
- "pana e poki tawa sina" = give a box to you (I gave you a box)
Note: In toki pona, this sentence would typically be broken up into two separate sentences due to the lack of grammatical structure to handle multiple actions in a single sentence like this.
unless this is intentional idk
It's intentional
a
I have those extra things in there since it seems to help with accuracy for the most part
prompt: i have a horse, informal setting
toki pona: mi jo e soweli tawa.
English: I have a pet.
(Note: In toki pona, the word "horse" is literally translated as "soweli tawa" (moving creature). If you want to specify it's a pet, you can add "soweli tomo" (house animal).)
is it reverse translating its own stuff?
It provides direct translations within its explanation when it explains different variations
I recommend using the formal setting, the informal one is mostly experimental
also how should the context text box be used
i wrote in it that the word “horse” should be translated specifically to avoid confusion with any other animal and it still outputted just “soweli tawa”
It's mostly up to you, but you can say things like "this is in the context of Minecraft" or you can put a discord chat log in there. Or you can specifically say something like "this is a farm animal" or something similar
Closing demo for now, tweaking it more
btw i feel like this ought to say "no li after mi or sina if they are the only subject"
yes, but in the grammar guide to the ai it should say that the first li is omitted right
i'm not trying to correct your grammar btw i know that's correct
It's in the prompt now, ty for pointing that out!
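(A rough sketch of the rule discussed above, "no li after mi or sina if they are the only subject", as an automatic check on model output. `has_extra_li` is a hypothetical helper; it assumes plain lowercase Latin spelling and treats everything after the last "la" as the main clause, which is a simplification.)

```python
import re

def has_extra_li(sentence):
    """True if a bare 'mi' or 'sina' subject is directly followed by 'li'."""
    words = re.findall(r"[a-z]+", sentence.lower())
    # Skip any context phrase: the main clause starts after the last 'la'.
    if "la" in words:
        last_la = len(words) - 1 - words[::-1].index("la")
        words = words[last_la + 1:]
    return len(words) >= 2 and words[0] in ("mi", "sina") and words[1] == "li"

for s in ["mi li moku.", "mi moku.", "mi en sina li moku.", "tenpo ni la sina li lape."]:
    print(s, has_extra_li(s))
```

It only looks at the first li, so it flags "mi li moku" but says nothing about how later li in the same sentence should be read.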
for me omitting the first li (if it doesn't have e) with multiple li feels wrong
like mi lili li lape is little me is sleeping, not i am little and sleeping
the meaning remains the same
no
ok almost the same
look "mi li ante lawa li jan soweli nja" vs "mi ante lawa li jan soweli nja"
the second one is more like "when i am under something" or whatever, not what it actually supposed to mean
I resisted for a long time (having learned tp from pu) - but you get used to it very quickly once you give in to it!
sina wile seme?
anyways mi ante lawa li jan soweli nja can mean 2 things:
mi ante lawa. mi jan soweli nja.
or, mi ante lawa (as one unit) li jan soweli nja
but toki pona has other ambiguities like this
and they are fine
mi sama
open la mi wile ala kepeken nasin ni li wile kepeken nasin pi lipu pu
pilin mi la jan li ken ala sona pona e toki mi lon nasin ni
taso. mi open kepeken la mi toki e ni: "a. ni li pona wawa. mi o awen kepeken :p"
nii
this use of emoji feels... weird
also AI finding something fun?
btw this was a toki pona -> english puzzle (before that i was training it on lingo puzzles)
How did AI do?
I think it's pretty understandable, but I'm not an expert.
The one thing I do notice is that it did not prefix "Fred" with jan
Here's the prompt I used before this:
it did better than i thought it would. overall it gets the job done, but there are two glaring errors.
first being, "ona li tawa ala" is a weird translation for "it just vanished." i think the most straightforward translation would be "ona li weka (a)."
second being, yeah, Fred would need a headnoun before (which may or may not be jan)
thanks
I feel that keeping the prompt shorter does help the GPT stay on-track, but the last part can be improved. specifically, I wanted the AI to be kept "on track" with toki Inli -> toki pona, while interpreting toki pona -> toki Inli.
With this current prompt, toki-pona to english suffers.
Maybe "Using these words, translate following text between the two languages. Interpret metaphor as necessary to make realistic assumptions about definitions, while keeping toki pona simple and concise for understandable text." would do better.
much improved
it doesn't know what sitelen luka is lol
looks like it got confused by usawi, which wasn't in the list of words you gave it
a
"I am looking at this old book of Tosi's language: it speaks about the magic of the bird person Melusine using sign language. However, I am also looking at this other old book: the person is doing it with a tool. Also, the other book is the same. In this picture, I don't know the name of the book."
"hand symbols" is good albeit literal translation of "sitelen luka"
true
"mi lon pali" is a bad calque of English "I'm at work", and would be "I am in the working" or something like that. it should be "mi pali" or "mi lon tomo pali".
When I got home I noticed something strange: tenpo ni la mi kama tawa tomo mi. mi lukin e ijo nasa
That's an odd translation. Most people would say: mi kama sin lon tomo la, mi lukin e ijo nasa.
(kama tawa sounds like a stylistic mistake. But maybe correct here because the house doesn't exist, and I was coming towards it?)
imo kama tawa is still correct no matter the context, since you're doing the action of coming towards a place, it's just that kama isn't specifically necessary here. overall i don't think that this was "intended" by the ai because of the house being gone, i think this is just a bad translation.
it feels completely normal to me
[Reply to:](#1162756469771472988 message) When I got home I noticed something strange: tenpo ni la mi kama tawa tomo mi. mi lukin e ijo nasa
…
in notepad of all things
sina pana e toki pona tawa ni la seme
aaa o nii
a
how do you say "artificial intelligence" in toki pona, while distinguishing it from more common concepts?
what kind of AI? and what other more common concepts?
I mean "artificial intelligence", (hopefully the wording of that), specifically referring to LLMs
When I say "common concepts", I had hoped that any two-word construction could be interpreted first as AI, rather than as a "computer", "calculator", "bad language", etc.
I do know that this would increase the number of words in the name, and it would be a bit more context independent. just curious really.
ilo sona pi jan ala is my best thought for how I could express this
I think a big thing about AI is that we don’t teach it anything but how to teach itself
la mi ken toki e ni
ilo pi sona ona taso
ilo nasa
[Reply to:](#1162756469771472988 message) how do you say "artificial intelligence" in toki pona, while distinguishing it from more common conc…
that's how I tend to say it. works in context
nice
ilo ni: ma pona pi toki pona la mi o len e ona
a a lon
In something I wrote in lipu tenpo, I called it "sona ilo" - so ChatGPT is an ilo pi sona ilo. But that's just my way of thinking about it
I feel like "artificial intelligence" / "AI" often refers to the tool itself
[Reply to:](#1162756469771472988 message) In something I wrote in lipu tenpo, I called it "sona ilo" - so ChatGPT is an ilo pi sona ilo. But …
(cf. "an AI")
Yes, so I'd say ilo pi sona ilo.
But sona ilo as the general concept
taso ni li nasin mi taso
ni li nasin pona
I'm prompt injecting this
fuck microsoft
fuck shareholders who want to see more ai in every fucking thing
It's the worst timeline
"change format" i do not need to turn a to-do list into interpretive dance
"you know, they say talking about music is like dancing about architecture to-do lists"
like it's understandable summarizing or rephrasing texts can sometimes be useful but that is absolutely not valid reason to put ai into every single program and app in existence even where it doesn't belong at all. like, notepad? you're supposed to take notes, you don't have to rephrase them into fake deep inspirational speeches for the other person to summarize them into three meaningless words. search engines? silly you, you're supposed to search, for webpages, it usually highlights the result on the top page anyway, with additional info, why would you need a low-quality source to say some nonsense on top of that? anyway time to draw an ascii horse
sina ni la sina pakala e ala pi sina ala li suli e nanpa pi jan kepeken
o weka e ilo lipu ni o kepeken ilo lipu ante · mi la ilo [Notepad++] li wawa
anu o weka e ale pi ilo [Windows] :p
me when i am a microsoft person who was responsible for adding ai to notepad
why did you add AI to Notepad?
[Reply to:](#1162756469771472988 message) me when i am a microsoft person who was responsible for adding ai to notepad
At my university, I joined the AI implementation group sponsored by our IT office. The group is a mixture of faculty, librarians, and staff. When we joined, we had to sign a pledge that we would be "AI First" - that is, when confronted with any task in our work, we'd try to apply AI first. I signed it with my fingers crossed because I wanted that free OpenAI subscription, the only reason I was there. But it says something of the mentality of these people. There's nowhere that AI doesn't belong
I'm a professor, so using AI first would mean using it to write my book and teach my students. Both of which are pretty obvious ethics violations.
ai isn't for us
Funny thing: my jan olin is a senior academic administrator (a dean). Her assistant is in this group, and proudly told the group chat that she gets ChatGPT to do her boss's schedule. Which explained to my jan olin why her schedule was so insanely badly planned.

seme????????????????
seme la kulupu ni li lon
ona li wile pana e lipu ilo nasa tawa ale
jan ale li ante. ken jan en pilin jan li ante. ni la lipu li lon. lipu li pana e pilin jan e sona jan.. taso seme la kulupu wan li toki e ni: "a o ilo taso. o pana ala e pilin sina tawa lipu. o pana e pilin ilo tawa lipu"
ilo li ante ala. ni la lipu ale li sama.
ilo o pali lipu tan seme
mi sona ala
pilin mi la pali lipu li pana pi pilin jan
sina pali e lipu pi kama sona e lipu pi toki musi e lipu pi nasin ale la sina pana e pilin sina tawa lipu. ni la jan li wile lukin e lipu li wile kute e sina lon lipu
ilo ale li sama li pali sama la pilin jan li lon ala
sina kute ala e pilin jan li kute e toki ilo
li kute e nanpa ilo anu seme
chatgpt is a great tool, but if you use it incorrectly it can cause more mistakes in the long run. the model assumes you gave it all the information it needs to perform a task, it usually doesn't ask for clarification, and it often hallucinates. coding specifically can be messier too: copying and pasting code from chatgpt often gives you security vulnerabilities (e.g. it suggests making flask apps for your APIs), and your code will be very inefficient if you don't specifically give the model ALL of the information it needs.
also usually it provides solutions that are more "one-size-fits-all"
mi pilin sama ale. mi la, jan pi kulupu ni li sona pona ala e pali pi jan sona. ni li nasa, taso ni li lon: tomo sona la, jan mute a la li sona pona ala e pali ni. tomo sona la, nanpa la, jan sona li lili, jan pi pali ante li suli.
a tenpo pini la soweli lapate li pana e lipu sona ni: "lon la jan mute li olin wawa e lipu ilo tawa lipu jan"
anu seme
jan pana pi sona.. wawa?? anu seme??
lon la mi sona ala e ni: jan pana li sona wawa ala wawa. mi kute e toki ni la mi pilin sama ni: jan ni li wile e mani tan lipu ilo a a a
mi pana e ni. (ni li lipu pi jan ante). sona ona li jaki li lili
kulupu "ilo o nanpa wan" li pana e pilin ni tawa mi: "jan pali pi sona wawa! o weka e pali pilin sina! sina o pilin ala o kama ilo!
tenpo pini la sina pali e lipu jan li pana e pilin jan tawa lipu ni a! ni la jan mute li sona e sina li esun e lipu sin sina. o pana e lipu ilo!! sina kama pana e lipu ilo pi pilin ala la jan mute li kama esun tan lipu sina pi tenpo pini! kin la sina o pana e mani lili tawa kulupu mi tan ni: mi pana e sona pi nasin ilo tawa sina. o pona!"
tenpo poka pini la, mi e jan pi kama sona li lukin e lipu ni. ona li pona. https://link.springer.com/article/10.1007/s10676-024-09775-5
Switching to English for a moment. I think I was one of the very few academics in this group. Most were librarians or support staff. And they were pledging to do their jobs using as much AI as possible- and to tell us how they did it. My 2 thoughts were: the solutions were often worse than doing it "by hand." And why are they giving their bosses all the information they need to replace them by a machine, and doing so willingly, for the price of a premium ChatGPT subscription (which they also used to write godawful poetry and make godawful pictures that littered the group chat)
Workers will be replaced by a tool because it is fun to use and will tell them that they are awesome
||what i was assuming was it was a way to spread ai and make ai more widespread by getting established authors to give up their work of passion to a work where they just ask an ai to do it for you and make money because people know them from their previous works. then the company would get a cut because they "taught them" the method||
mi kama sona e toki nijon lon tenpo ni tan ni: mi wile toki inli e lipu pi toki nijon e ijo pi toki nijon tawa pali mani
In my own division, we had lots of Zoom meetings to analyze and summarize into official minutes. Someone suggested using ChatGPT, and I said, OK, that sounds harmless and efficient. Except when I read the transcripts, it made up things that no one said! And when these are legal documents that could come into a tenure or employment case, that's kind of a disaster! So now they're back to transcribing by hand.
ike la mi sona ala e ni: mi ken ala pali ni tan ilo
mi wile pali e toki pona li wile pana e pilin ale mi tawa lipu pi toki nijon. mi wile pana pona e sona lipu tawa jan pi ma mewika. jan mute li sona ala e lipu pi ma nijon li lukin ala e ona.. taso
ken la ilo li utala e mi li anpa e mi lon tenpo kama
mi awen wile kama sona. taso.
ilo li ken anpa e mi anu seme
mi lawa e kulupu lili. mi wile sona e ni: ilo li ken ala ken toki ante e lipu? mi la, ken ala.
taso tenpo kama la ni li ken kama ante anu seme
n
mi sona ala
ilo o pana a e pilin jan
lipu pi toki inli li kama tan lipu pi toki nijon la nasin toki o pilin sama
ni li wile a e pali pilin e pali ilo ala
kin la ni li wile e sona toki mute
tenpo pini la mi pilin e ni: ante toki li pali pi wawa ala
mi la, ilo li ken ala anpa e pilin jan. ilo li jo ala e sijelo li jo ala e ma. ni tu li lon ala la, ona li ken ala pilin.
pilin mi la jan ni li ante taso e toki. taso mi kama sona e toki pona la mi kama sona e ni: toki ante li ante a tawa toki inli
toki pona li ante wawa a tawa toki inli
anu seme
sina toki pona la, sina toki insa ante. ilo li toki insa ala a.
mi alasa ante toki e lipu pi toki pona la.. mi ken musi. mi ken kama sona e nasin toki musi lon toki inli. mi ken pana e pilin mi tawa lipu pi toki pona. anu seme
n
tenpo mute la mi kon taso li awen lon tenpo suli
ni la mu lawa mi li weka mute :p
mi pilin taso
mi pilin la mi pana e pilin ni tawa toki
pilin li kama --> lawa sina li nimi e pilin --> sina toki e pilin kepeken toki wile sina
tenpo pini la mi pilin sama ni: toki pona li ante e pilin
taso. mi la ona li ante ala e pilin
nasin toki li ken pana e pilin pona! taso ona li ante ala e pilin open li ante e nimi pana pi toki ni anu seme
ni li kama e pilin pona
mi toki e toki pona la mi pana a e pilin mi e kon wile mi. toki inli la mi ken ala ni.
toki inli la mi ken toki e ijo pi pilin mi. taso kalama toki mi li kama ante lili la ni li ken kama musi taso li ken kama pilin utala.
ken la mi wile ala utala e jan li wile ala pana e pilin ni
tasoooooo kalama utaa mi li ikee la mi wile utalaaa e sina anuu semee
toki pona la sona kon ale li kama tan ijo poka li kama tan nimi
kalama toki en nasin toki li ante ala e kon
taso mi la, sijelo li lon ala la, pilin kin li lon ala. kin toki la, mi pilin ante. mute la, ni li kama lon ala: ijo li kama lon lawa mi; mi "toki ante" e ona kepeken toki. ni li nasin sona pi jan Teka (Descartes) li nasin sona pi jan Lo (Locke). nasin ni la, lawa jan li sama ilo sona. mi pilin ante a! taso mi ken ala toki pona e ni lon tenpo ni! mi wile pali e lipu lili pi ijo ni (toki pona and the phenomenology of language).
toki ni a la mi pilin ante.
||m..
a lot of the time i simply breathe and exist for a long time (meditation)
so a lot of my brain mu'ing has stopped
i kinda just feel
when i feel a thought, i put that thought into word
thought comes --> your brain puts that thought into words --> that thought goes into your desired language
a while ago i thought that toki pona actually did change your feelings
but i feel like it just doesnt really now
the way you speak can! but that didn't change your original feeling at that starting point. it changes the way you give out that feeling.
this makes you feel good
when i speak toki pona i feel like i am actually saying what i want to say but in english i can't say what i feel. but if my tone becomes slightly different now its like only a joke and can start fights
maybe i dont want to fight people or give that kinda feeling
but now my speech is slightly different so i must want to fight you right
in toki pona all of the information about your speech comes from the thing around you and the things you are concretely saying
the sound you make and the way you say it doesn't change the kon||
(ni li pilin mi taso)
mi ken sona pona e toki pona sina!
sona
I need to think about this more, and I'll probably write something (likely in toki pona). Heidegger shows that our usual model of language (thoughts in our heads, translated into arbitrary signs, then sounded out) is not true. That's often how we justify things after the fact saying, oh I said this because I had this thought in my mind. But language comes before thought, and shapes it, and constructs the world in which we live. When I speak (says Heidegger) it's so I can hear Language speaking, and how it points me to a world. It matters a lot, then, what language you're speaking. Language is, he says, the "house of Being" and those who speak very different languages will inhabit different houses - things will show up for them in a world slightly differently. Which all sounds very abstract- and very difficult for us to get after half a century of being told that the mind is like a computer, a kind of symbol-processing machine. The success of AI is built on that false premise
you described Michael Gazzaniga's "split brain" experiments: https://youtu.be/mJKloz2vwlc?si=8nHcgvazF-EbDI1I&t=918
The third in a series of Gifford Lectures by Professor Michael Gazzaniga. Recorded 15 October, 2009 at the Playfair Library Hall, the University of Edinburgh.
The interpreter is the device we humans enjoy that provides us with the capacity to see the meanings behind patterns of our emotions, behavior and thoughts.
This concept is central ...
the timestamp is him showing the experiment, the full presentation is really interesting
I will watch - thank you!
no problem! :D
wawa a · wile sina la mi o pana ala pana e toki ni tawa kulupu ante
tawa kulupu seme? (taso lon la sina ken pana e ona tawa kulupu ale)
kulupu ni → ilo sona sitelen pi moku linjuwi li ike tawa ona
a ilo ni li ike tawa mi kin! o pana e ona! (ken la o weka e nimi mi)
nasin pana li ni
la pona ala pona
,,a lon la mi wile pana e toki mute la mi jo e toki li toki sin e ona li kepeken ala nasin sin ilo
nasin tu ni li pona tawa mi.
||phonemes are a surprisingly good technique in whatbeatsrock)||
oh waw
و
insane
vav
ו
[Reply to:](#1162756469771472988 message) vav
why did I think this was aleph
because this ا is alef in Arabic
[Reply to:](#1162756469771472988 message) why did I think this was aleph
Hebrew alef looks like this: א
Script | alef | waw
Arabic | ا | و
Hebrew | א | ו
prank'd
ale f(li) waw(a)
ale fli wawa
Holy fucking shit OpenL just got way better
Okay nvm
It just locked REALLY HARD in for one sentence
Yeah I þought it was just isipin it fucked hard wiþ for some reason but it was just one sentence
Oh you get a different answer everytime
CW: moli, wile moli
It's not like, correct, but it's like 70% þere which is like an A+ for Mr. Palisa Tomo li e palisa kili mi
||o isipin e pona pi moli mi. ona li pona tawa telo uta mi|| is kinda what it's saying
Here is þe fabled better sentence
wonder why it’s so different when you set it to “medical”
also neither of these are even close lol
yeah
because as they say, "toki pona would never work in a doctor's office"
aviation english ❌
medical toki pona ✔️
What on earth was that original sentence from??!
"give me a case declension system that is relatively similar to every other language on earth"
gives me just latin
of course
it is a text prediction algorithm, after all
does nightshade and glaze even work
[Reply to:](#1162756469771472988 message) does nightshade and glaze even work
(@heavy shard mu)
it works so well that OpenAI is getting mad
Oh interesting
how was their response
they have been trying to say as little as possible about it, but they let this slip:
We are always working on how we can make our systems more robust against this type of abuse.
[Reply to:](#1162756469771472988 message) how was their response
I think the fact that they call it "abuse" is pretty clear evidence that it works and they're trying to stop it
interesting
I linked an article on it here if you want to read
[Reply to:](#1162756469771472988 message) very well, yes 📎
and here's a longer, wider-focused article cited in the linked one
people claimed that nightshade and glaze doesnt work, why
I don't know
and I don't think we have solid confirmation that they don't work
but AI companies seem to be mad about it
i heard that 1% of gaussian blur will remove poison from arts, is it true
I don't know
[ilo penpo o lukin ala]
there was a paper about defeating it that the glaze team has responded to, but it's not as simple as tiny gaussian blur
the paper also claims that glaze isn't very effective to begin with, but perhaps that's better than nothing
also apparently even a normal overlay can work decently well?
for the record while i've generally heard nightshade/glaze isn't effective what i'm confident in is simply "you can use it if you want but trying to get others to use it specifically intending to eventually poison AI art models out of effectiveness entirely isn't going to work; the damage is done"
i could be wrong about anything that is not the opinion in those quotation marks
yeah, that is a fair point
[Reply to:](#1162756469771472988 message) for the record while i've generally heard nightshade/glaze isn't effective what i'm confident in i…
a a a
ni li muusi wawa a tawa mi
i had a thought while im asleep
could be ai people trying to stop people from using nightshade or glaze
just a thought
what do you mean?
i think he's trying to say people claiming that nightshade/glaze don't work could be people involved with ai art lying to sabotage people not wanting their art style to be reproduced through ai art but i'm not sure
i mean given what you linked above when people involved in ai art don't want people to use filters poisoning their art to ai they typically just say "i don't like these things. they're bad and we will fight against them"
That would be typical FUD (fear, uncertainty, disinformation) that big tech companies use. But i don't know enough about these image masking tools to know whether they do or don't work well
okay sorry if this is pedantic but it's "fear, uncertainty, and doubt"
i wouldn't put such a concept past big tech (ibm pioneered this shit and microsoft picked up the baton and ran for decades with it) but when openai is actively getting mad i feel like assuming that's going on here amounts sorta to just paranoia
ah. maybe!!
[Reply to:](#1162756469771472988 message) i think he's trying to say people claiming that nightshade/glaze don't work could be people involved…
OpenAI seems to be avoiding the subject as much as possible
[Reply to:](#1162756469771472988 message) i wouldn't put such a concept past big tech (ibm pioneered this shit and microsoft picked up the bat…
Oh right.
yeah i meant by that
another thought
people said this after ai company fixes it
still may not be accurate
which one do you think is likely
does anyone know what conlang ai does best? because I was reading the wiki of lojban and apparently its supposed to be machine compatible? I don't know much 😅
Generative AI works on training from a mass of texts. It doesn't matter (really) if the grammar is very logical or not. There is probably a tiny amount of text for lojban, and probably more for toki pona.
esperanto then 😂
But Claude's latest models seem to be able to leverage very small text corpora. No one knows why. So it's good at toki pona (relatively)
And yes, they're all more ok at Esperanto
tenpo pi toki [.lojban.] la nasin pi sona ilo li ante · ilo pi tenpo ni li kama lon ijo pi nasa lili la ona li pakala ale · ni la toki ilo li wile · taso ilo pi tenpo lon li moku e toki jan mute mute mute · toki jan li nasa la jan kute li awen ken sona · ilo kute pi tenpo lon li ni kin
depends on the type of AI (a fairly meaningless term at this point)! current LLMs are basically just glorified versions of the text prediction algorithm on your phone, so like soweli Lapate said it doesn't matter what the language is, just how much data there is. Lojban and its logical grammar could be great for other types of computers/AI though! (as could toki pona, Toaq, and in theory even Ithkuil)
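(A toy sketch of the "text prediction on your phone" comparison above: a bigram counter that predicts the next word purely from how often it followed the previous one. Real LLMs are neural networks over tokens rather than count tables, so this is only an illustration, and the tiny corpus is made up. It does show why the amount of text matters more than how logical the grammar is.)

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for every word, which words follow it and how often."""
    counts = defaultdict(Counter)
    words = text.split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    if word not in counts:
        return None
    return counts[word].most_common(1)[0][0]

# Made-up training text; "." is treated as an ordinary token.
corpus = "mi moku e kili . mi moku e pan . sina moku e kili ."
model = train_bigrams(corpus)
print(predict_next(model, "moku"))    # e
print(predict_next(model, "e"))       # kili
print(predict_next(model, "lojban"))  # None: no data, no prediction
```

The predictor has no notion of grammar at all; a word it has never seen gets no prediction, however regular the language is.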
[Reply to:](#1162756469771472988 message) does anyone know what conlang ai does best? because I was reading the wiki of lojban and apparently …
what on earth is canus, canī??? is it meant to be canis, canis? why did it turn the 3rd declension word for dog into a 2nd declension word??
New model (fully open source, SOTA)
600+ billion parameters, but since it's an MoE, only 37B parameters are activated per token
seems to be good at toki pona
feel free to try it out or send me questions to ask it
info about the model: https://api-docs.deepseek.com/news/news1226
i dont recommend enabling "deepthink"
that sounds like a newspeak word
lols
Well, the official text The Complete Lojban Language is under a free enough license that the entire text is effectively in the public domain, making it easily sharable without the usual legal consequences of piracy. It's one of the bunch of post-1929 post-1978 or whatever texts that are hosted on Wikisource.
With that in mind, I bet ChatGPT has scraped and got trained on the entirety of The Complete Lojban Language at some point and memorized its rules already
Memorizing rules is not how LLMs work.
Sorry, sorry sorry. But do consider the fact that, because the CLL is under a free sharing and distribution license, ChatGPT might have scraped that book in its training data at some point
_Copyright © 1997 by The Logical Language Group, Inc. All Rights Reserved. _
Permission is granted to make and distribute verbatim copies of this book, either in electronic or in printed form, provided the copyright notice and this permission notice are preserved on all copies.
Permission is granted to copy and distribute modified versions of this book, provided that the modifications are clearly marked as such, and provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one.
Permission is granted to copy and distribute translations of this book into another language, under the above conditions for modified versions, except that this permission notice may be stated in a translation that has been approved by the Logical Language Group, rather than in English.
The contents of Chapter 21 are in the public domain.
that doesn’t mean it’ll be good. what soweli lapate is saying is that LLMs don’t work by reading and internalizing rules. they become good based on actual data and input in the target language because they’re giant prediction machines that need to be exposed to input to predict. there is no substantial lojban text to train an llm with that would be effective at all.
if the only training data for an english LLM was an english dictionary and a book of grammar rules, it would suck no matter how perfect the book was
the same applies to any conlang, like lojban.
there are several full toki pona courses online and orders of magnitude more content than lojban, but chatgpt still sucks at generation
I did do some experiments of throwing lots of toki pona grammar texts at a custom GPT. It made no difference. What made a tiny difference was giving it more texts, in the form of issues of lipu tenpo. It became halfway decent then at talking about subjects that were in those essays. But still not great.
Hmm, what if we train it on Lojban by copying and pasting examples (but not their answers initially) from the CLL and asking it to translate those sentences, then when the LLM inevitably gets the translation wrong (or even if it gets it right but worded slightly differently), reinforce the learning by copying and pasting the CLL's official translation for that example.
That's very similar to the reinforcement of learning in an LLM. And yes, that would be a way to do it. You'd just have to do it thousands of times.
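A rough sketch of what "doing it thousands of times" could look like as data preparation: each example sentence becomes a prompt, and the official translation becomes the target answer. This is plain supervised fine-tuning data, not a full training loop. The `prompt`/`completion` field names and the JSONL layout are assumptions (the real format depends on whichever fine-tuning tool is used), and the pairs are placeholders rather than actual CLL text.

```python
import json

# Placeholder pairs; real ones would be copied from the CLL's worked examples.
PAIRS = [
    ("<Lojban example 1>", "<CLL English translation 1>"),
    ("<Lojban example 2>", "<CLL English translation 2>"),
]

def to_training_records(pairs):
    """Turn (lojban, english) pairs into prompt/completion records."""
    records = []
    for lojban, english in pairs:
        records.append({
            # Hypothetical prompt wording; any consistent instruction would do.
            "prompt": f"Translate this Lojban sentence into English: {lojban}",
            "completion": english,
        })
    return records

def to_jsonl(records):
    """Serialize records as one JSON object per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)

records = to_training_records(PAIRS)
jsonl_text = to_jsonl(records)
```

The bottleneck is the number of pairs, not the code: a couple of entries like the above would change nothing, which matches the "thousands of times" point.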
We could also use The Wonderful Wizard of Oz, Alice's Adventures in Wonderland, and their respective free Lojban editions
I used Glif to generate this gem:
OK, it technically generates images, but it generates text too, and the text is the focus of the meme
how to appear to be a jan sin:
Also this
(Again, the text is the focus of the meme, not the images)
LANGUAGE WAS NOT MEANT TO BE REDUCED TO 120 WORDS BY A CANADIAN IN 2001
lol
(these are real sentences)
shows images
(also mu)
this is like really funny LMAO
Yeah, Glif can generate some real gems
Every day we stray further from Proto-Indo-European
In case you're curious, here's the site
woah that’s actually like something a human would make imo
the images also don’t look like AI
The second meme's images definitely look like AI. But just replace these images with some random toki pona words and sitelen pona glyphs, and it'll be 100% golden!
As for the Wikihow meme, I think it's because the AI model was trained ONLY on images within Wikihow itself, which has a very specific house art style, and nothing outside of that
“Works every time, 60% of the time” is literally a joke I would make that’s mildly unnerving to hear a machine spit out lmao
pona! a!
🪴 <- ni li seme
kasi
🐿️ li seme
mi la ona li pipi ala li soweli ala. soweli la ona li lili. pipi la ona li suli. ni la ona li seme?
still a soweli, but this is sort of the wrong channel for this
It was a reference to this, but yeah.
ah sorry
people
jan
wan-wan li mu lon toki Nijon
anu seme
[ilo penpo o lukin ala]
this doesn't really look like the wikihow style to me
more importantly almost nopony trains models from scratch, and wikihow probably doesn't have enough to make a coherent model (millions is a minimum for broad-topic coherence)
ni li nasin pi lawa sina kin
ni!!
mi wan la mi tawa tomo moku
mi lon e ma lon la mi sona weka e ni
mi lukin e meli walo
mi toki kalama: "mi alasa e meli walo"
This text is in Toki Pona, a constructed language created by Sonja Elen Kisa. Let's translate it into English:
- "mi wan la mi tawa moku" translates to "I want to go to the moon."
- "mi lon e ma lon la mi sona e ni" translates to "I am in a place where I know this."
- "mi lukin e meli walo" translates to "I see a white fly."
- "mi toki kalama: 'mi alasa e meli walo'" translates to "I say to the fly: 'I don't chase you.'"
Putting it all together, the text in English is: "I want to go to the moon. I am in a place where I know this. I see a white fly. I say to the fly: 'I don't chase you.'"
Which model?
perchance
sona ni li weka tan lawa mi: mi lon tomo pi toki ilo.. la mi kama pilin nasa suli
"seme a. toki ni li seme" a a a
bro
Sorry for the dumb question (I'm new)
I looked through the chat history and saw people using different models, but has anyone tried multiple models and compared them? I'm thinking of an LLM scoreboard for toki pona, like the Chatbot Arena Leaderboard (https://huggingface.co/spaces/lmarena-ai/chatbot-arena-leaderboard)?
Or does anyone have more success with any newer models? For example, I haven't seen anyone mention Gemini since the 2.0 Flash and 2.0 Pro experimental releases...
Can anyone with a better understanding tell me if this output from Gemini 2.0 pro looks remotely correct?
Sample with Gemini 2.0 Pro: "generate me a comprehensible input passage in toki pona about hotpot"
Option 1 (Very Simple):
telo seli li jo e moku. moku li seli. mi moku e ona. pona!
Translation: The hot water has food. The food is hot. I eat it. Good!
Option 4 (Most Complex, but still comprehensible):
kulupu mi li moku e telo seli. ona li lon poka pi ilo seli. mi pana e ijo suli tawa telo. ijo lili li kama seli la mi moku. wawa li lon poki seli. ni li pona mute tawa pilin mi.
Translation: My group eats hot water. It is beside the hot tool. We put many things into the water. When the small things become hot, we eat. There is energy in the hot container. This is very good to my feelings.
tenpo suno ni la mi en kulupu meli mi li kama poka lon tomo moku. mi mute li wile e moku pi telo seli tan ni: telo sewi li lete li wawa ala tawa sijelo mi mute. lon poka pi te sitelen la ilo seli li lon. ona li jo e telo seli mute. jan pi tomo moku li pana e telo e kili e soweli lili e kala e pipi tawa mi mute. mi mute li pana e ijo ni tawa insa pi poki seli. telo seli li kepeken e ilo ni li kama seli mute. tenpo lili la moku li kama weka tan telo seli li kama lon poki mi. mi kin li pana e telo nasa lili e namako tawa poki mi. moku li jo e leko e sike mute. pilin mi li pona tan ni: kulupu li lon poka mi li moku pona li toki pona. telo seli li pana e wawa e kon pona tawa kulupu. moku pi telo seli li pona mute li pali e pilin sanpa. tenpo kama la mi o moku sin e telo seli.
:airplane_arriving:
*toki ***li kama tan tomo #toki-ale · o lukin e open
||conversation continued from #toki-ale · read the beginning||
...oops, should have added a topic
the shorter passages and translations are both correct, although fairly literal / word-for-word! the longer paragraph has some mistakes, though
[Reply to:](#1162756469771472988 message) Can anyone with a better understanding tell me if this output from Gemini 2.0 pro looks remotely cor…
this is unrelated to toki pona but too good to not post here
this is truly incredible
maybe the real AI were the pathfinding algorithms in video games
hey everyone
maybe the real AGI was the carbon emissions we made along the way
it should be luka tu wan ale mute mute mute mute luka luka wan iirc
messing around with a markov chain like thing for text generation
fed it lipu tenpo, and it sometimes creates coherent text
here are some examples:
- "mi wile moku e ko jelo.", "I want to eat cheese"
- "soweli suli en soweli li jo e nimi lili mute li toki", "The big animal and the [other] animals have many short names and they speak"
- "mi pini nanpa tu li lili mute", "I finished second place and am quite small"
(not sure if this counts as AI or not)
it looks at both the current word and the previous word
that's the only difference
ale(po) mute san
i gave chatgpt a full course based on lipu sona pona
Which software? Or is it in a GitHub?
wrote it myself 
oh nice! how did you implement it?
like how does it work? it finds every instance of the previous two tokens together in a text and then selects a random one of the tokens that follow them (e.g. if it got "the" and the text was "the quick brown fox jumped over the lazy dog" it would randomly select between "quick" and "lazy")
it probably sucks and there is probably a more efficient way to do it but it's just a curiosity and I don't really care
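For anyone following along, the description above can be sketched roughly like this. It is not the actual script (that wasn't posted), just a minimal guess at the same idea; `build_table` and `generate` are made-up names. Precomputing a lookup table is also one way to avoid rescanning the text for every generated word.

```python
import random
from collections import defaultdict

def build_table(tokens, order=2):
    """Map each run of `order` consecutive tokens to the tokens seen right after it."""
    table = defaultdict(list)
    for i in range(len(tokens) - order):
        context = tuple(tokens[i:i + order])
        table[context].append(tokens[i + order])
    return table

def generate(table, seed, length=20, rng=random):
    """Extend `seed` (a tuple of tokens) by sampling one recorded follower at a time."""
    out = list(seed)
    order = len(seed)
    for _ in range(length):
        followers = table.get(tuple(out[-order:]))
        if not followers:  # this context was never followed by anything: stop
            break
        out.append(rng.choice(followers))
    return out

text = "the quick brown fox jumped over the lazy dog"
tokens = text.split()

# One word of context, as in the "the" example: followers are "quick" and "lazy".
one_word = build_table(tokens, order=1)
# Two words of context: ("over", "the") can only be followed by "lazy".
two_word = build_table(tokens, order=2)

print(one_word[("the",)])
print(" ".join(generate(two_word, ("the", "quick"), length=3)))
```

Followers are stored with repeats, so `rng.choice` automatically picks common continuations more often than rare ones.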
oh interesting!
i really like how it works, feels almost like a very barebones transformer (decoder-only) model
except it isnt in a "black box" which is a big plus
i wonder how much it would improve if you scaled up the data
feeding it my discord messages (the data download is a few months old) one of the things it generated was "i am hiding in your next message okie" 😭
LOL
it uses just as many :3s and 😭s as I do so I know it must be working
haha
when I was first testing it, I used the Wikipedia article for JFK and it generated such awesome sentences as "He was admitted to the wall"
its already sentient 😭
yes
"I spent 100 days on the ship is and here's why" ahh
the real question is how to download YouTube titles
hmm i can try to scrape mrbeast video titles 😭
bad idea: feed it the stuff in https://discord.com/channels/301377942062366741/1284700950162047036
real
maybe feeding it Bible verses will cause it to do something funny
toki pona bible verses
ooh yes the TP Bible project
this is automatically updated https://github.com/PaulieGlot/lipu-sewi/blob/master/bible/full.md
paulieglot is an S tier username
I find it funny that the TP Bible logo has a glyph which originates from Arabic "Allah" on it
HAHA i didnt notice that until now
what are all the question marks
the mr. bible
incredibly real
Genesis 1:1 = I Just Provided Light!
"I Gave My 40,000 On Lottery Tickets And Won!"
- the algorithm
Mr beast gamblecore
🎰
it also made "Attempting the Impossible Maze" and then it cut off
thats awesome
LOL WAIT i didnt even realise this was from your script
😭
ITS TOO GOOD
this is cherry picked
the real output is very ungrammatical and also starts on random words
it works just often enough to be funny tho
i could probably give you even more titles, i could scrape youtube shorts
"Pennies In My Friends Backyard Spending 24 Hours Straight"
LOL
"Last To Leave Ramen Noodle Pool Wins $20,000 Yacht"
oh maybe you could start each data in the dataset with a single period or something . and it'll be like the starting token
LOL
oh also I should note that it ignores new lines and treats them like spaces
I might change that, it's something in the sanitization script
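A possible sketch of both ideas together: treat every line as its own entry, pad it with a start marker, and close it with an end marker, so output begins where a real title begins and can stop instead of being cut off. The `<s>` and `</s>` markers and the function names are assumptions for illustration, not the real script, and the sample titles are invented.

```python
import random
from collections import defaultdict

START, END = "<s>", "</s>"  # hypothetical marker tokens

def build_table(lines, order=2):
    """Build a follower table where each line is a separate padded sequence."""
    table = defaultdict(list)
    for line in lines:
        words = line.split()
        if not words:  # skip blank lines instead of merging neighbours
            continue
        tokens = [START] * order + words + [END]
        for i in range(len(tokens) - order):
            table[tuple(tokens[i:i + order])].append(tokens[i + order])
    return table

def generate(table, order=2, max_len=30, rng=random):
    """Start from the padding context and sample until END or max_len words."""
    out = [START] * order
    while len(out) < max_len + order:
        followers = table.get(tuple(out[-order:]))
        if not followers:
            break
        nxt = rng.choice(followers)
        if nxt == END:
            break
        out.append(nxt)
    return " ".join(out[order:])

# Invented sample data, one title per line.
raw = "First Made Up Title\n\nFirst Made Up Video\n"
table = build_table(raw.splitlines())
print(generate(table))
```

Because the padding context only ever leads to words that opened a line in the data, the "starts on random words" problem goes away without any special casing.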
"$1,000 Every Day You Survive In The Middle Of The Reaction To The Connor McGregor fight !"
what is the connor mcgregor fight 😭
LOL
i have no clue what that is
comment section drama on one of jan Misali's videos
O_o
it's pretty famous in the tp community
ANTHONY: You illiterate gobshite, I'm sure you won't say "for that, himself but it will all be more than a thousand, five
what the fuck
so far so good
i should get a new discord data download
the Mr beast one just generated "Breaking The Law" with absolutely no context what the fuck
we also got "Last YouTuber To Leave Circle Gets $20" 😭
LOL
LETS GO GAMBLING!!!!!!!!!!!1 /j
WOOHOO
guys, how do you enter this thing?
i'd like to like help
join the Discord server
wheres the discord server?
it's linked on https://github.com/PaulieGlot/lipu-sewi
thanks jan Ke Tami!
pff
[Reply to:](#1162756469771472988 message) feeding it my discord messages (the data download is a few months old) one of the things it generate…
isnt that lipu+sewi 😭
i wonder why "suno o lon" isn't in quotes
arabic bibles:
seriously tho arab christians have been using "allah" long before muhammad
sewi Jawe en sewi Elowin li lon tan seme a
is it to keep em distinct in translation
elohim -> sewi mute (very sewi) 
"Sabeansjo" ?????
https://www.youtube.com/shorts/jLE1fn79NI0 nah because if you're going to fake anybody's voice to scam people, why him
What You’ll Gain:
AI-Powered Insights: Leverage cutting-edge tools to pinpoint high-potential trades and maximize your success.
Expert Guidance: Follow step-by-step strategies crafted by seasoned professionals.
Supportive Community: Join a free WhatsApp group of like-minded traders, where tips, strategies, and encouragement are shared every day....
like what even is this 😭
yep
essentially, yes
also when you are translating you can write sewi Jawe as sewi Nimi, however
elohim is
"sewi &Elohim"
so that find and replace can easily change things if a tokiponization is changed
Elowin
whats the original pronunciation
im stupid i based it off the anglicized pronunciation
Elohim
idk
we don't know too much about biblical hebrew phonology's exact details
so w or k who knows
wiktionary votes for Ilowin
but no one is using that
but should /h/ become /k/ or /w/
but ykw im not about to explain how to pronounce biblical hebrew terms cs i know nothing abt that
for /w/ that's just /h/ disappearing and epenthesis innit right
yes
but i was lazy
oh okay
yeah but sewi comes from Arabic allah
But allah is the same root as el (hebrew for God), right? Allah isn't someone's name.
no longer available 😭
[Reply to:](#1162756469771472988 message) https://www.youtube.com/shorts/jLE1fn79NI0 nah because if you're going to fake anybody's voice to sc… 📎
yeah. it's related to Elohim (one of the main terms Jews use for our god)
[Reply to:](#1162756469771472988 message) But allah is the same root as el (hebrew for God), right? Allah isn't someone's name.
iirc it's a contraction of al and illah (which i think is cognate with Hebrew illoh)
checking wiktionary, illah is cognate with el and eloah
from proto Semitic *'il
sewi is also just a normal toki pona word
so is God
Well of course.
yeah; they're both from the Semitic triconsonantal root /ʔ-l-h/ (from PS *ʔil or /ʔ-l/ with a vocative suffix). Allah (Arabic) was originally definite ("the god"), while Elohim (Hebrew) was originally plural ("gods"), though both are normal now. El (/ʔ-l/) was also a god of the Canaanites and ancient Mesopotamians, and is seen at the ends of names like Samuel, Daniel, Raphael, Michael, Gabriel, Ismael, and Israel.
[Reply to:](#1162756469771472988 message) iirc it's a contraction of al and illah (which i think is cognate with Hebrew illoh)
for my linguistics class i am working on a toki pona parts-of-speech tagging process in order to analyze the distribution of different words in different syntactic positions (e.g. pona is probably more likely to function as an adjective, pali is probably more likely to function as a verb, etc). i'm currently just thinking of ways to actually implement this
https://web.stanford.edu/~jurafsky/slp3/old_oct19/8.pdf this method looks very promising
it's an interesting problem because i want to be able to differentiate between "mi moku" and "kili li moku"
we speakers intuitively understand that "mi moku" is likely to be moku=verb since it's way less likely that i am food
and "kili li moku" is likely to be moku=noun since it's way less likely that kili is eating
but it's tricky creating something that will be able to know the difference
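One simple way to get at the "mi moku" vs "kili li moku" difference is to count how often a predicate word was tagged as a verb or noun given the subject it appeared with, falling back to the word's overall counts for unseen subjects. To be clear, this is not the HMM tagger from the linked chapter (a first-order HMM only sees the neighbouring tag, so it cannot look back past "li" at the subject); it is just a frequency count with backoff. The labelled examples are invented for illustration, and the function names are made up.

```python
from collections import Counter, defaultdict

# Invented hand-labelled examples: (subject word, predicate word, predicate tag).
LABELLED = [
    ("mi", "moku", "VERB"),
    ("mi", "moku", "VERB"),
    ("jan", "moku", "VERB"),
    ("kili", "moku", "NOUN"),
    ("kili", "moku", "NOUN"),
    ("pan", "moku", "NOUN"),
    ("mi", "pali", "VERB"),
]

def train(examples):
    """Count predicate tags per (subject, predicate) pair and per predicate alone."""
    by_pair = defaultdict(Counter)
    by_word = defaultdict(Counter)  # backoff when the subject is unseen
    for subj, pred, tag in examples:
        by_pair[(subj, pred)][tag] += 1
        by_word[pred][tag] += 1
    return by_pair, by_word

def split_clause(sentence):
    """Return (subject head, predicate head) for a bare 'X li Y' or 'mi Y' clause."""
    words = sentence.split()
    if "li" in words:
        return words[0], words[words.index("li") + 1]
    return words[0], words[1]

def tag_predicate(sentence, model):
    """Pick the most frequent tag for the predicate, conditioning on the subject."""
    by_pair, by_word = model
    subj, pred = split_clause(sentence)
    counts = by_pair.get((subj, pred)) or by_word.get(pred)
    if not counts:
        return "UNKNOWN"
    return counts.most_common(1)[0][0]

model = train(LABELLED)
print(tag_predicate("mi moku", model))
print(tag_predicate("kili li moku", model))
```

With real data the counts would come from a hand-tagged corpus, and grouping subjects into classes (animate, food, and so on) would probably generalize better than exact word matches.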
"mute"
- ilo gemini
they're all like this
[(click to see attachment)](#1162756469771472988 message)
well, likely not the last part, but
"o moli e sina" - ilo Kukulu
"how many?" "many."
anywhere between 1 and technological singularity
win
10 others guessed this too
I am surprised you didn’t know that there are new people who beat english with toki pona every week
google translate accurately sharing its feelings about toki pona
musi a
I see a lot of "laptop" in Toki pona translations on youtube
some of these evoke great humor
"laptop" in Te Reo Māori is "pona"
[Reply to:](#1162756469771472988 message) I see a lot of "laptop" in Toki pona translations on youtube
(that's the indigenous language of Aotearoa / New Zealand, if anyone's wondering)
i'll take it.. but... 
win!
that is so cool
toki pona
(it actually means "lap", the full term being "rorohiko pona". but yeah)
[Reply to:](#1162756469771472988 message) that is so cool
oh, and a toki is a traditional Maori axe / adze / chisel
so if you've ever heard "laptop ax", that's where it comes from
a good rorohiko!
i was born in 1998 i don't know what a toki pona is sorry
sina sona toki Mawi anu seme? mi wile kama sona ni.
"I'll die today"
- my markov chain like thing
it also generated the
emoji after that
mi sona ala! mi sona e ni ale tan alasa ilo
[Reply to:](#1162756469771472988 message) sina sona toki Mawi anu seme? mi wile kama sona ni.
tenpo weka pini la, mi lon ma Nusilan la, mi esun e lipu tu pi toki Mawi. taso mi lukin ala e ona! tenpo lon la mi lon ilo waso li kama tan ma Nusilan. mi kama sin lon tomo mi la, mi wile open e kama sona pi toki ni.
o nii
https://youtube.com/shorts/BcEUXP9FUIQ?si=Kxk2lfsgOImtBxgS ai is failing cognitive tests
Im surprised about how high of a score they have
Sure not perfect, but certainly a stepping stone in the medical field if one were to create a specialized model
yes but it's still
not good
that 4o is the only one with a score that's even barely passing
Right, but gpt4o got a passing grade, which is certainly a sign of possible improvements
Obviously I would choose a human over an LLM
but still it's advancing
Oh Gemini always sucks
But I'm assuming the paper tested 1.5 from a couple months back
Google's newer model beats gpt4o on all benchmarks
wait really???
i think most of them
Google bard is DOOKIE POOP, barely good for anything. But as for Google's new Gemini model, it's a new model that you can use. It's experimental, so you have to go to aistudio.google.com to test it
But if you want to see its performance, check out https://lmarena.ai/ and go to leaderboard -> overview
holy shit it says gemini 2.0 is also better than gpt 4o?
Ye
But take it with a grain of salt, usually models have different knowledge suited for different use cases
E.g. claude is really really good at programming, but has a pretty old dataset, or one model is better at multilingual tasks than the other
But anywho the point I'm getting at is, any developer can train a model to ace any benchmark, e.g. that medical benchmark, but no benchmark can fully encapsulate the complexity of actual real life use cases.
If openai wanted to make a chatgpt model suited for a hospital, they could easily do so, all they would need is a bunch of data, however, quality will always fluctuate due to the nature of LLMs
jan Misali li noka e sike
jan Jan pi ma Poka
Jan Misali is not a Polish footballer. though I'll admit I thought they were Slovak for like a year
I think we misread them as Jan Mišali the first time
honestly same, back in the day when I thought their first name is Jan
us too, us too
noooo
WHAT
they are as real as their theories are
so truee
real
ah so it was a metaphor the whole time 
looking back at this it's really a good way to mark tone without diacritics
idk man i just had to think of something and that came to mind
láadan*
lAadan
this is the funniest shit ever sorry
[Reply to:](#1162756469771472988 message) wOoO jiAao Niwe
dAa jia hAaO wOoO yAo chi bIing qi lIing (this is specifically how he says it)
wOoO hEeEn xi huAn bIng qi liIn
wasn't there a thing where him saying it changed the pronunciation of one of the characters because he said it wrong
wOoO? no. OwO
ō wō
wait no not the tone
it was the q in bingqiling
he pronounced it like ch
i should never doubt john xina's tone mastery
冰匙淋
honestly because qilIing was borrowed from english it could've been chiling
well no
because ki corresponds to qi
he said bing1 qi1 ling4
[Reply to:](#1162756469771472988 message) wasn't there a thing where him saying it changed the pronunciation of one of the characters because …
iirc
i dont really remember tbh
ok i dont 100% blame him for this
[Reply to:](#1162756469771472988 message) he pronounced it like ch
actually yeah he did
tones all wrong but it was iconic so
bingchiling when
today i learned linguist li kon
does that n have a high tone on it? :p
[Reply to:](#1162756469771472988 message) wOoO jiAao Niwe
consonants with tones...
it was borrowed from English "cream", not "chilling". ⟨q⟩ is an allophone of /k/ before close front vowels (⟨i y⟩); "qílín" would have originally been pronounced "kílín".
this word in specific was borrowed through southern Sinitic languages, which keep the [k]; see Cantonese kei4lam4, Taiwanese Hokkien kî-lîm, and Hakka khì-lìm
just realized you said this already. anpa a
[Reply to:](#1162756469771472988 message) honestly because qilIing was borrowed from english it could've been chiling
it exists! Mandarin has marginal syllabic nasals m n ng
[Reply to:](#1162756469771472988 message) consonants with tones...
I think one of my systemmates made a list of them once, but I could not possibly tell you where it is
i know cantonese has "m" which is used very frequently
