#(CW:AI) Discussion of LLMs and other kinds of text generators ("AI"), especially of toki pona

limber wind
#

please test my toki ai

meager blaze
#

a lot of this was AI translated, and I looked over it and attempted to proofread

#

it actually understands previous things said in the conversation

#

and how things are translated

#

..it translated "The World Machine" as "ilo Weka".. due to things it knew about the LORE OF THE GAME????

meager blaze
#

"mi pali e ante lili tawa toki ni. nimi "Magpie" li kama nimi "Kasa" tan ni: nimi "Kasa" li sama nimi "kasapilin" (magpie)."

#

is that a word

tawdry bone
#

ala

glass iris
#

I haven't read the whole thing - but scanning it, it's impressive that it doesn't try to do word-for-word - and makes good decisions like translating (computer) disc as lipu rather than sike.

hexed spindle
#

I’M NAH PATATO WAWA SEWI, AM KIJETESANTAKALU PATATO :3

glacial mason
#

So I got the Claude Opus (paid) version and decided to ask it to play a game where it writes out the ABC notation (a special very simple plain-text method of representing sheet music) of a famous (in the US) folk song without telling me the name so that I can try to guess it. It actually did an amazing job of making a cohesive and catchy melody with no obvious mistakes or strange notes. However, I couldn't recognize the tune, and it turned out that it was trying to encode "Oh, Susanna!", which is absolutely not what this tune sounds like haha

#

A little off-key on some notes and I had to bring the second half an octave down haha

hollow osprey
meager blaze
#

the simplification is really great

#

"kulupu Sita li pana e mani mute tawa kulupu Nintendo tan ni: kulupu Nintendo li toki e ni: "o pali ala e ilo musi pi ilo mi."

glacial mason
#

heh we should put a EULA full of legalese boilerplate through it

main lodge
#

do it!

fast cedar
#

would probably be musi for some toki pona projects to use cc by sa in toki pona

glacial mason
#

Oh yeah! Great idea! Let me see if I can do it with an iPhone, I’m out right now

#

It refused because the text was too long

main lodge
#

aa

glacial mason
#

mi sona e ni: lipu suli lawa ni li suli mute. mi ken ala ante e lipu ni tawa toki pona kepeken tenpo lili. mi wile pali mute, taso ni li ken pakala toki. taso mi wile pana e sona ni tawa sina:
lipu Creative Commons Attribution-ShareAlike 3.0 Unported li toki e lawa ni:
kulupu Creative Commons li kulupu lawa ala. ona li pana ala e pona tawa lipu sina.
jan li kepeken ilo ni la, ona li kama lon li wile pali sama toki lipu.
ilo ni li pana e pona tawa jan pali.
jan pali li ken pana sin e pali ona li ken ante lili e ona.
nimi pona li wile lon pali sin.
jan ale li ken lukin e pali li wile ala e lupa lukin.
pali li ken kama lon nasin musi mute li ken kama lon ilo sona.
lipu ni li toki e ijo mute ante kin. taso mi ken ala pana e ona ale kepeken toki pona. mi wile ala ante ike e toki lipu. mi pana e sona lili taso. o toki tawa mi la sina wile sona e ijo ante!

main lodge
#

ken la o pana paragraph by paragraph

#

anu seme

glacial mason
#

lon · taso tenpo ni la ala · mi kama sin lon tomo mi la mi ken

fast cedar
#

yeah that might be easier just to have the right context for each thing and not bungle it up

glacial mason
#

Luckily Claude keeps the context for the entire conversation unlike GPT4 so it’s very possible

#

It runs out of usage faster as a result in cases with overmuch context but it’s counting on that being preferred to forgetting some context

fast cedar
#

I'm just thinking if it tries translating multiple paragraphs at once in a more bizarre nasin like legalese, it might overcontextualize paragraph 2 based on irrelevant info from paragraph 1

glacial mason
#

Haha I hope you keep on making hypotheses until I or someone else has a chance to test it out

#

Hm! Idea: separate pro-AI Toki Pona server with bot(s) that can converse via API

#

If such bots don’t exist, we can ask the AI to code them, too 😈

main lodge
#

aa

glacial mason
main lodge
#

ona o pana ala e pilin

#

n ona o pana e pilin

#

taso • suli la ona li pona ala li ike ala li pilin taso

glacial mason
#

ona li “aa” pi pilin meso

main lodge
#

n

#

ken

hollow osprey
#

sona mi la ilo [Ko] suli taso li toki [pona] pona · nimi wan ale ale ale la ona li wile e mani mute mute mute luka luka luka · ni li ken lonnnn sike esun wan anu seme

#

ken la mi toki pakala

glacial mason
hollow osprey
#

ala · ilo sin li wan e ilo [Claude] e ma [Discord] la ona li lili · taso ilo [Claude] li kepeken mani mute

glacial mason
#

a mi sona

hollow osprey
glass iris
#

musi a. mi o lukin ale e ona.

#

It's certainly what even GPT-4 is doing in toki pona, and it explains a lot of its bad syntax. "beginner mistakes" are usually word for word translations from English.

But it's clear that Claude Opus is capable of not doing that somehow

rocky stirrup
tawdry bone
#

O

rocky stirrup
#

rare chatgpt w

glass iris
#

ona li toki e "kepeken e" tan seme la, mi sona ala. tenpo pini la, ona li ni ala.

#

"Comfort me"!

hollow osprey
tawdry bone
#

m toki Asika en toki pona la ken ona li sama anu seme

glass iris
#

ni li ken. This is a different story from what we originally thought. But as he says, it doesn't take much away from what Anthropic have achieved with this model. The amount of Circassian in the training data is minuscule. But it's enough to be able to prime it for understanding prompts in that language. There has to be more toki pona in its training data. But probably not more proportionally than GPT-4 has, and it's vastly ahead of that model in ability.

quartz timber
#

well lol i already knew that i have spent quite a bit of time looking at and translating circassian content online

quartz timber
glacial mason
#

I just discovered that Claude Opus is a much better conversation companion when discussing fictional universe lore than ChatGPT 4. GPT4 is always trying to tie it back to reality and is always hedging itself by reminding me that there's no way to know what the author really intended or that it's just not explained, while Claude Opus seems game to get super speculative and nerd out with me about possible interpretations

glass iris
quartz timber
#

mi sona ala

#

but like there is a decent amount of circassian text on the internet

#

there are a million ish speakers at least

#

less still in russia and using the cyrillic script but still

glass iris
#

The only completely objective experiment: invent a new conlang, feed Claude only translation pairs, and see what it makes of it

quartz timber
#

circassian speakers do in fact use the internet i mean they live in russia its not like a third world country

glass iris
quartz timber
#

yeah

#

i think its like 500,000 ish speakers of specifically the dialect that was being tested but might be wrong

#

apparently 650,000 im not too far off lol

glass iris
#

The initial impression, amplified by each retelling, was that no one knew about this language.

I'm wondering what it can do for endangered Native American languages. I'm in a research group on humanities and AI, and wonder whether I should suggest to my NA colleague (very involved in language preservation) that Claude might be useful? But another member of the group wondered whether Claude would really help to preserve the language on its own terms, or impose European grammatical categories onto it.

quartz timber
#

no one knew about this language
not many english speakers do!

#

i do obv lol

#

i mean anyone with an interest in linguistics has at least heard of circassian even if only because "adyghe" is listed on every phonetics wikipedia page ever

tawdry bone
#

mi ken alasa kepeken toki Kalamang. toki ni li lili mute li lon lipu ilo ala a. jan pi mute lili li toki kepeken ona. jan pi alasa sona li toki tawa kulupu ona li lipu e sona mute. tenpo poka la jan alasa li pali e alasa sin ni: ilo en jan li wawa ala wawa sama lon ni: ona li lukin e lipu sona ni taso la ona li ken kepeken toki Kalamang.

quartz timber
#

ni li alasa pona

tawdry bone
#

pali pi ilo Gemini la jan li pana e toki Kalamang tawa ona li toki e wawa ona

glass iris
#

o alasa ni!

#

lipu pi toki tu li lon anu seme? (Bilingual pairs)

tawdry bone
hollow osprey
tawdry bone
#

lon...

unique gulch
#

pilin sina la ilo ni li wawa nanpa wan lon toki pona anu seme?

hollow osprey
#

ni

hollow osprey
#

[ilo penpo o lukin ala]

Adapt the following text to Toki Pona, making sure to change things as needed to make it sound natural: Exo is a contemporary geometric sans serif typeface that tries to convey a technological/futuristic feeling while keeping an elegant design. Exo was meant to be a very versatile font, so it has 9 weights (the maximum on the web) each with a true italic version. It works great as a display face but it also works good for small to intermediate size texts.


Here is my attempt at adapting the text to Toki Pona:

sitelen Ekiso li sitelen sin pi nasin nanpa. ona li wile pana e pilin pi tenpo kama e pilin sona. taso ona li pona lukin kin.

sitelen Ekiso li ken pali mute. ona li jo e suli pi linja mute. ona li jo e nasin linja tu pi suli ale. ona li pona tawa sitelen suli. ona li pona kin tawa sitelen lili.

English translation:
Exo writing is new number-style writing. It wants to give a feeling of future time and a knowing feeling. But it also looks good.

Exo writing can do a lot. It has many thicknesses of lines. It has two line styles of all sizes. It is good for big writing. It is also good for small writing.

Notes:

  • I replaced "geometric sans serif typeface" with "writing of number style", since Toki Pona doesn't have direct terms for font classifications.
  • "technological/futuristic feeling" became "feeling of future time and knowing feeling"
  • "elegant design" simplified to "looks good"
  • "versatile" changed to "can do a lot"
  • "9 weights" became "many thicknesses of lines"
  • "true italic version" adapted to "two line styles"
  • "display face" and "small to intermediate size text" simplified to "big writing" and "small writing"

The result captures the general meaning while sounding more natural in Toki Pona's minimalist grammar and limited vocabulary. Let me know if you have any other questions!

glass iris
#

nnn. mute li weka. sina pilin seme? (kin ona li pakala kepeken nimi pi. ona o toki e "suli linja mute")

hollow osprey
#

ken la ona li weka e mute tan ni → toki mi la ona o pona e toki tawa nasin pi toki [pona]

#

[ilo penpo o lukin ala]

Translate the following text to Toki Pona, making sure to stay accurate to the original: [...]


Here is the translation of the given text into Toki Pona:

sitelen Eso li sitelen pi tenpo ni. ona li jo ala e linja sike. ona li lukin sama ilo pi tenpo kama, taso ona li pona lukin kin. sitelen Eso li ken pali ijo mute. ona li jo e suli luka luka tu wan (suli nanpa suli lon lipu sona). ona ale li jo e ijo nasin ante pi lukin pona. ona li pona tawa sitelen suli. ona li pona kin tawa sitelen lili tawa sitelen suli lili.

Here's a breakdown of the translation:

  • sitelen Eso = Exo font
  • li = is/are (subject marker)
  • sitelen pi tenpo ni = font of current time, contemporary font
  • ona = it
  • li jo ala e linja sike = doesn't have curved lines, geometric
  • li lukin sama ilo pi tenpo kama = looks like a tool of future time, conveys technological/futuristic feeling
  • taso = but
  • li pona lukin kin = is good looking also, keeps an elegant design
  • li ken pali ijo mute = can do many things, is very versatile
  • li jo e suli luka luka tu wan = has size 9 9 2 1 (literal translation of 9), has 9 weights
  • suli nanpa suli lon lipu sona = biggest number size on web page, maximum on the web
  • ona ale li jo e ijo nasin ante pi lukin pona = each of them has a thing of different style of good appearance, each with a true italic version
  • li pona tawa sitelen suli = is good for big text, works great as a display face
  • li pona kin tawa sitelen lili tawa sitelen suli lili = is also good for small to medium size texts

Note that due to Toki Pona's minimal vocabulary and grammar, some of the technical details and nuance of the original text are lost in translation. The translation focuses more on conveying the general meaning and key points.
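
The number phrase in the breakdown above can be checked with a short Python sketch. It assumes the commonly taught additive number words (wan = 1, tu = 2, luka = 5, mute = 20, ale = 100), and the function names are made up for illustration. Under that assumption "luka luka tu wan" adds up to 13, and 9 would come out as "luka tu tu".

```python
# Assumed values: the commonly taught additive number words of toki pona.
VALUES = {"ala": 0, "wan": 1, "tu": 2, "luka": 5, "mute": 20, "ale": 100, "ali": 100}


def number_value(phrase: str) -> int:
    """Add up the number words in a phrase such as 'luka luka tu wan'."""
    total = 0
    for word in phrase.split():
        if word not in VALUES:
            raise ValueError(f"not a number word: {word!r}")
        total += VALUES[word]
    return total


def to_number_words(n: int) -> str:
    """Write a non-negative integer using the largest number words first."""
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    if n == 0:
        return "ala"
    words = []
    for word, value in (("ale", 100), ("mute", 20), ("luka", 5), ("tu", 2), ("wan", 1)):
        count, n = divmod(n, value)
        words.extend([word] * count)
    return " ".join(words)


print(number_value("luka luka tu wan"))  # 13
print(to_number_words(9))                # luka tu tu
```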

#

😭

glass iris
#

ugh

#

ni li ilo Claude Opus anu seme?

#

mi kin li pana e toki sina tawa ilo Claude. ona li toki nasa. mi pana sin e toki sama. ona li awen nasa.

hollow osprey
#

[ilo penpo o lukin ala]

explain what tis-100 is, then come up with a toki pona translation of the full title that matches the acronym


TIS-100 is a programming puzzle game developed by Zachtronics. In the game, players take on the role of a programmer working on the titular TIS-100 computer, which is an imaginary 1970s-era microcomputer. The objective is to create programs that solve various challenges using a unique assembly language specific to the TIS-100.

The game presents a series of programming puzzles that require the player to optimize their code to achieve the desired output while working within the constraints of the TIS-100's limited resources, such as memory and processing power. The puzzles become increasingly complex as the game progresses, testing the player's problem-solving skills and understanding of low-level programming concepts.

TIS-100 is known for its challenging gameplay and retro-inspired aesthetic, which includes a minimalist user interface and a fictional technical manual that provides context for the puzzles.

Now, let's come up with a Toki Pona translation of the full title that matches the acronym:

T - tomo (meaning "indoor space, home, room, building")
I - ilo (meaning "tool, device, machine")
S - sona (meaning "knowledge, wisdom, intelligence, understanding")

100 - Toki Pona doesn't have words for specific numbers, but we can use "ale" (meaning "all, everything") to represent a large or complete amount.

So, a possible Toki Pona translation of TIS-100 that matches the acronym could be:

Tomo Ilo Sona ale

This translates to something like "The complete home of machine knowledge" or "The building of all device wisdom," which captures the essence of the game as a comprehensive programming puzzle experience within a virtual computer system.

#

sama wile mi a · taso mi la "tomo ilo pi sona ale"

sudden jasper
#

Might it just think "pakala" is a bad word? LOL

glass iris
#

Is this image creation? Or translation?

tawdry bone
#

pali sitelen

steep pollen
steep pollen
glass iris
#

I can't keep buying more Pro subscriptions to try these all out!

steep pollen
#

so uh

#

apparently you can just use it for free?

#

and i dont have a pro subscription...

steep pollen
#

weird restrictions tho, cant seem to name me a restaurant, ive rephrased 3 times and it still wont

tawdry bone
hazy apex
#

ona li toki e ni lon ma ni: mi tu taso li lon

glass iris
#

pilin mi la, ni li wile e mani

steep pollen
#

taso lipu ni (needs to be txt as far as I can tell)

steep pollen
tawdry bone
#

n ken suli la ni li awen wile ala e mani tan ni: kulupu kuko li wile kama e kepeken

steep pollen
#

:0

hexed ravine
#

@earnest frost o, here is the AI thread. Here's a quote on why people want the conversation here, please accommodate it if you're going to continue a conversation about AI:

crude agateBOT
#

These are the various reasons certain brains do not want to see AI-generated images.
If a user has:
• …paranoia, ||these images can serve as a reminder that technology can create fake faces, and that verifying an image, face, or person online is impossible||.
• …a type of dissociative disorder, ||the "uncanny valley" aspect of these images can trigger feelings of dissociation and/or derealization, which both make the lucid world feel similarly uncanny||.
• …a type of psychosis, ||unreality and the uncanny valley are both very common psychosis triggers — it does not help a brain that cannot tell shared reality from its own perceptions to see something that makes its owner doubt their own reality. And especially if a user has:||
• …a type of delusion(s), ||many common delusions center around people and faces being purposely fake or impostors, the replacement of humans, and the power of technology; and even more can be triggered by uncanny content regardless.||
• …one or more phobias such as trypophobia, necrophobia, scopophobia, etc., ||due to the way AIs build these images out of shapes and colors, they often have clusters, corpselike imagery, and unnerving eyes or eye contact.||
And these images can just be creepy and bizarre in general, too — and of course there are brains that don't like that regardless of psychology.

#da53fc whyai

earnest frost
hexed ravine
#

a, pona

lament wadi
#

ilo o define e nimi yupekosi

#

@winter notch li sona e toki ni

rocky stirrup
#

ilo li moli

lament wadi
#

o toki lon #jaki

lament wadi
rocky stirrup
#

wawa

#

ona li kama sin

glass iris
#

ilo Claude li ken tuki tiki anu seme?

#

ken ala

hollow valley
#

ni li toki musi

weak thorn
#

ken ala despair

hollow osprey
limber wind
#

sry gimmie a few days to fix it

rough canopy
#

⁉️

rocky stirrup
#

desiti

rough canopy
#

nimisin li kama. jan ale o sona

hollow valley
glass iris
#

Gives me an idea to program a GPT bot and set it loose in #learn-toki-pona-1 . See if anyone can school it into better linguistic behavior!

#

You'd never see Claude Opus saying something like that

weak thorn
#

send it

#

Let it consume it all

hollow osprey
#

taso n · toki o tan seme · mi wile ala moku e ale li wile ala ike e pilin pi jan toki · taso toki pi mute lili a li lon

#

a taso mi ken ni → mi pana e toki lili tawa insa ilo · ilo li pana e toki mute mute mute · mi pana e toki ilo tawa ilo [telo misikeke] · toki li ike tawa ilo ni la mi pana e sona ni tawa ilo pali · kin ken la mi pona e toki ona li pana e toki pona · mi ni sike la ken la pona li kama

#

mi wile ala kepeken ilo [Claude] tawa ni tan wile nasa mi (tan mani kin)

glass iris
#

a sina sona ala sona e nasin ni? mi sona ala e ona.

glass iris
#

taso sina o pana e nimi pi lipu tenpo lon poka pi ilo sona.

#

ni li lawa jo ona

hollow osprey
hollow osprey
glass iris
#

jan lawa. mi pakala.

hollow osprey
#

n

#

mi ken ala kepeken ala la mi kepeken

#

"6 gigabytes of instruction tuning text data" 😭

#

a ona li kepeken ilo pi ante toki

meager blaze
#

suno's ai generators actually do a decent job with toki pona besides the j

#

where it still says it like how it is in english

glass iris
#

suno still creeps me out too much (way in the uncanny valley). I've seen someone post a toki pona song, where they just manually edited all the "j"s to "y"s, and it got the pronunciation correct.
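
The respelling trick described above can be sketched in a few lines of Python. This is only an illustration: the function name `respell_for_tts` is hypothetical, and it assumes the generator reads lyrics with English spelling habits, so toki pona "j" (an English "y" sound) is swapped for "y".

```python
# Map toki pona "j" (pronounced like English "y") to "y", keeping capitalization.
_J_TO_Y = str.maketrans({"j": "y", "J": "Y"})


def respell_for_tts(lyrics: str) -> str:
    """Return the lyrics with every j/J replaced by y/Y."""
    return lyrics.translate(_J_TO_Y)


print(respell_for_tts("jan Adam li pona ala pona?"))  # yan Adam li pona ala pona?
```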

meager blaze
#

if it's just slightly off ||i literally start shaking and i want to look away||

glass iris
#

That's why we're still keeping images even out of this thread. But I think music, being text-based, is OK.

meager blaze
#

ah

#

i feel like ai is a lot more of a "fun thing to mess with" as opposed to like

#

what you want to actually use instead of hiring someone

meager blaze
#

i guarantee a lot of it was without permission at all

#

and trained off of copyrighted works

glass iris
glass iris
meager blaze
#

i can't find it but i found one where it pretty much copied something from an already existing song

#

holy shit 😭 HOW

#

when you get it to work correctly it works SO GOOD

#

and i thought ai music was like super terrible a couple months ago??

glass iris
#

We're hearing a lot of this for the first time. I bet if you listened to 30 in a row, it would begin to feel hollow - just as a lot of AI-generated text begins to feel that way.

hollow osprey
#

(suli la mi ni kepeken ilo [TatoebaTools] lon poki [Colab] pi esun [Google] · taso nanpa pona li lon ala ona la mi kama jo e lipu tan ma [Tatoeba] li nanpa e ona)

#

nanpa la · mi kepeken toki poka tu taso la toki li awen mute · taso mi kepeken toki pona taso la toki li kama lili a

#

ken la mi ken kepeken toki pi jan pona taso · taso jan sama li ken pona lon tenpo li ken ike lon tenpo · (mi ni a tan sin mi lon open pi pana mi)

glass iris
#

wawa a. taso sina wile pana e sona tawa ilo la, ni li lili a. kin la, lukin mi la, toki pona pi ma Tatoeba li ken nasa.

mi sona ala e ni: sina ken pana e toki pona taso tawa ilo anu seme? sina wile ala wile e toki Inli kin pi toki pona ni? wile ala la, sina ken kepeken lipu tenpo ale.

hollow osprey
#

mi wile pana e toki [Inli] lili la ilo li ken kama sona pona e kon nimi · taso mi pana e lipu ante kin pi toki poka ala

hollow osprey
weak thorn
#

it doesn't want to speak to me, maybe I need more tp text

#

It worked!

#

la toki pona, toki pona, toki pona

#

Watch your back, Claude

#

ma Ponosan

#
"Eka"
"Metalocalypse"
"Jan nata, nata, nata"
"Pilin toki pona"
"Mokta, mokta, mokta"
"Metalocalypse, metalocalypse, metalocalypse"
"Lukto, lukto, lukto"
"Toki pona, toki pona, toki pona"
jan Adam li pona ala pona?
"Jan Adam, Jan Adam, Jan Adam, Adam"
"Jan Adam, Adam, Adam, Adam, Adam, Adam"
"jan Adam, jan Adam, Jan Adam, jan Adam"
"jan Adam"
"Jan Adam"
"Jan Adam"
[Jan and Adam are getting along!]
#

I ruined it wiþ Toki Pona.

"Yes, my name is Grug, and I'm from Kyrgyzstan. I love my home. I have many friends. We speak Toki Pona. It is a beautiful language. I love learning new languages. I am a child of two cultures. I am Russian and Kyrgyz. I love both languages. I speak Toki Pona fluently. I can even sing in Toki Pona. I try to learn a new song every day. Language is a gift. It is a tool. Every language has its own beauty. Languages can bring people together. Languages can tell a story. Music is a universal language. We use it to communicate. Music can lift your spirits. It can heal. It can make you feel alive. Music is important. It's everywhere. Friendship is the most important thing."
weak thorn
rocky stirrup
weak thorn
#

lon a

glass iris
#

I've played a little with NovelAI, and never had any luck with languages. It's trained narrowly but deeply in English and Japanese (I think) literature. It does really well in that zone, not at all well out of it

quiet geyser
#

i like to interpret CW:AI as the name of the thread

weak thorn
#

mu mije

hollow osprey
weak thorn
#

lon a

hollow osprey
#

mi toki tawa ijo Eja (Nasi pi kiwen ike)

quiet geyser
limber wind
#

updated version:

limber wind
#

finally my ai works:

weak thorn
#

Þat’s a first sentence

rough canopy
#

pona kin!

analog sandal
#

lukin

tawdry bone
#

||it speaks toki pona better than you at least|| i didnt mean to be rude like this sorry

#

||though it almost seems like it gets worse over time to match your style||

stuck steppe
#

fbjshfisjjc

meager blaze
hollow osprey
#

anpa wawa

tawdry bone
#

aaaa lon la ala...

meager blaze
#

mi sona mi sona

tawdry bone
#

aa mi pakala. mi pali e toki la mi sona e ni: ona li sama utala. ni la mi pana tan seme...

#

mi wile toki e seme...

meager blaze
#

mi sona ala ¯_(ツ)_/¯

glass iris
#

sina wile toki ike tawa jan sin la, o toki pona e ike sina. ni la, sina ken toki e wile sina; taso jan sin li ken ala sona pona e ona.

meager blaze
#

😭 A A A

tawdry bone
#

nasin

#

(sina weka pakala e nimi 'li')

meager blaze
#

a

#

mi lukin e toki ni la lawa mi li pana e nimi 「li」tawa ona a a a

tawdry bone
#

musi

glass iris
fierce kestrel
#

i asked gpt 4 “tokiponize the following: croatia, joe biden, wuwojiti, tindore”

and it returned

“In Toki Pona, proper names are typically adapted to fit the phonetic and structural constraints of the language, often using an approach that seeks to capture a simple, recognizable approximation of the original name's sound. Since Toki Pona's phonology is limited and does not include certain sounds present in many languages (like the sound of "c" in "Croatia," the "j" in "Joe," or the "r" in "Biden"), adaptations are necessary. Here are possible Toki Pona adaptations for the names you've provided:

  • Croatia: "Kolasija" or "Kolati"
  • Joe Biden: "So Piten" or simply "So"
  • Wuwojiti: As "Wuwojiti" doesn't resemble a widely known name or term, and its phonetic compatibility with Toki Pona isn't clear, a direct transcription that fits might be "Wuwojiti" itself, adjusted for phonetics: "Wuwojiti." However, if it's an invented or less commonly known term/name, the adaptation process might prioritize a closer phonetic approximation or even a meaningful translation if the term has a known meaning.
  • Tindore: Could be adapted as "Tinole" or "Tinote"

These adaptations take into account Toki Pona's preference for simple, CV(n) (consonant-vowel(nasal)) syllable structures, its limited set of permissible sounds, and its overall philosophy of simplicity and minimalism. Remember that Toki Pona also encourages creative use of its limited vocabulary to convey meanings, so there might be other valid ways to adapt or represent these names based on their perceived characteristics or the context in which they're used.”
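
The CV(n) constraint mentioned in that reply can be tested mechanically. Below is a rough Python sketch, assuming the usual description of toki pona phonotactics: consonants p t k s m n l w j, vowels a e i o u, syllables shaped (C)V(n) with a missing consonant allowed only at the start of a word, no "n" directly before "m" or "n", and the syllables ji, wu, wo, ti disallowed. The function name is invented for illustration.

```python
import re

# Assumed word shape: (C)V(n) first syllable, then CV(n) syllables.
# A coda "n" may not stand before a vowel (it would be an onset) or before m/n.
_SHAPE = re.compile(
    r"[ptksmnlwj]?[aeiou](?:n(?![aeioumn]))?"
    r"(?:[ptksmnlwj][aeiou](?:n(?![aeioumn]))?)*"
)
# Assumed disallowed syllables.
_FORBIDDEN = re.compile(r"ji|wu|wo|ti")


def is_valid_toki_pona_word(word: str) -> bool:
    """Check one word or tokiponized name against the assumed rules."""
    w = word.lower()
    return _SHAPE.fullmatch(w) is not None and _FORBIDDEN.search(w) is None


for name in ["Kolasija", "Kolati", "Piten", "Wuwojiti", "Tinole", "Siamani", "Tosi"]:
    print(name, is_valid_toki_pona_word(name))
```

Under these assumptions "Kolasija", "Piten" and "Tosi" pass, while "Kolati" and "Tinole" fail on the syllable "ti", "Siamani" fails on the vowel sequence "ia", and "Wuwojiti" fails on all four disallowed syllables.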

#

it doesnt know wuwojiti or to translate words from the language they originate in, and for some reason it doesnt turn dʒ into j

#

i asked it to tokiponize germany thinking it might know ma Tosi since its more commonly talked about than croatia but it returned this

“Germany in Toki Pona could be transliterated as something like "Sewi" (from "Siamani", a common adaptation), reflecting its name in a simplified, phonetic form according to Toki Pona's minimalistic phonology and vocabulary. Remember, Toki Pona aims at simplicity and approximation rather than precise or official naming conventions.”

#

Germany > Siamani > Sewi is wild

hazy apex
#

also where is dɮ in any of those words?

fierce kestrel
#

joe biden

hazy apex
#

that's dʒ

#

and dʒ -> s

fierce kestrel
#

oops

#

yeah, another time it changed joe to siwa though

low wren
rugged sable
#

i tried to talk toki pona to the AI i have on my pc lmfao it's stupid, i didn't expect much
i mean it doesn't even work well when speaking english

weak thorn
#

pokasi ala

glacial mason
glass iris
glacial mason
#

Haha that’s somehow profound

stuck steppe
#

mi tawa ma

#

I almost thought the translation said “we moved the world”

glacial mason
#

Heh I can’t help thinking of that “tawa [e] tomo” graphic

torpid otter
#

not sure why it decided to write that big explanation of toki pona since all i said was 'toki! sina pilin seme?'

meager blaze
#

ona li lukin e toki.. lon lipu sona

torpid otter
#

lon

#

taso ona li sona ala..

meager blaze
#

lonn

#

sina lukin mute e lipu sona pi toki pona la ni li pana ala e nasin toki pi toki tawa lawa sina

#

sina wile kama sona e nasin toki la sina o kute mute e toki pi jan ante

#

o toki mute

#

taso ilo li ni ala

unique gulch
#

https://chat.lmsys.org/ has a new model up called gpt2-chatbot which some people are speculating is a test for gpt-4.5?
I tested it out and it seems better than gpt 4 at breaking down toki pona - look at how it correctly notes that "kon pali palisa" is convoluted, whereas in my testing other models generally just accept everything they read as making sense

glacial mason
#

oh nice! I'll look forward to that, I still haven't been able to pick between Claude Opus and GPT 4 so if they come out with 4.5 maybe I'll be able to set aside Claude haha

unique gulch
#

Another example

glass iris
mossy pagoda
#

We’re so screwed! Robots can now flawlessly translate Toki Pona! ||/j||

main lodge
#

lmao

weak thorn
#

Nauwgh þis is clearly satire

glacial mason
main lodge
#

wait that's not bad

glacial mason
#

I especially liked how it played dumb about not being sure what I mean by "toki pona" exactly, since I was speaking in Toki Pona

main lodge
#

aaaaa

glacial mason
#

‹sine› is a weird typo it made, I haven't seen it do that before

#

formal register? :P

main lodge
#

i didn't even notice

#

ken..

hollow osprey
main lodge
#

n sine la sona ale ala

hollow osprey
#

jan kin li pakala lon tenpo

main lodge
#

ni li lon

meager blaze
#

mi wile musi sama sina

main lodge
#

pona a

unique gulch
#

pona

weak thorn
#

Oh it got it

#

Congrsh

meager blaze
#

mi toki sin tawa ilo la ona li pakala mute

#

mi toki e pakala pi ilo ni la ona li toki e ni: "a. mi o pona e toki mi" a a a

chrome sleet
#

ChatGPT 4o has better toki pona abilities than 4

glass iris
#

That's the new model? I should resubscribe for a month to test it out

chrome sleet
#

let me compare to claude

#

in some places i prefer the chatgpt 4o choices and in some i prefer claude 3 opus choices

hollow osprey
#

mi alasa ni tawa ilo [Opus] la ona li sama e toki mi li musi ala

hollow osprey
#

mi kute e toki ilo la · n · mi en jan mute li anpa lon tenpo poka anu seme

hollow valley
#

in my experience claude 3 opus is still way better

#

claude consistently identifies the source of this text for example

#

whereas chatgpt fails consistently (on different attempts it also thought this was from 1984 or Dune)

glacial mason
glass iris
#

Wow, that's some skill from Opus!

spring fulcrum
#

good job missus

#

(the claude not you sorry)

spring fulcrum
#

seme a!?
i can't believe it's being good at toki pona for once... it almost always is completely off base....

#

(playing a character who's monolingual for toki pona due to circumstances)

#

(being monolingual in toki pona is a debilitating disease /musi)

#

more literally i'd read it as "this house is bad. nothing speaks the good language" and i like translating maliciously

spring fulcrum
#

and there it goes, being silly again

#

didn't even spell toki pona right

#

"time thisthis of i want of number one? 😘"

#

haha in all seriousness i'm cherry-picking here

#

also it was having trouble because i messed up my own toki pona

stuck steppe
#

nimisin nini

meager blaze
torpid otter
#

that one model that showed up on lmsys the other day and is much smarter than gpt4

torpid otter
#

not bad i think

#

different version of the model also does fairly well

glacial mason
main lodge
#

kape?!

glacial mason
#

☕️

main lodge
#

yes but

#

is that an actual nimi sin?

stuck steppe
#

supposedly kawa is in a nanpa pi lipu tenpo

glacial mason
#

Now it is haha

main lodge
#

i've only seen it as a tokiponization

stuck steppe
#

also for coffee

main lodge
#

it's kasi Kawa + telo Kawa

#
  • nimi Kawa
stuck steppe
#

that’s what i thought

#

someone was acting like it was an actual nimi sin the other day lol

#

or rather i guess turning it into one

main lodge
#

yea

glacial mason
#

I mean what even is an “actual” nimi sin

#

They’re on a spectrum

stuck steppe
#

not a proper adjective i guess

main lodge
#

a word that someone, anyone, has

  1. used
  2. said is a word
glacial mason
#

kape li nimi pona tawa mi

#

There

#

😝

main lodge
#

nooo

glacial mason
#

(Never to be used again)

main lodge
#

pona

glacial mason
#

sokola li pona kin 🍫

stuck steppe
main lodge
#

pimeja li mi, pimeja li mi

#

jan li kule sitelen e mi li sitelen e mi

#

kapesi

#

sitelen li mi, sitelen li mi

#

ona li wile ala e ni

#

taso ona li pona mi

#

-kapesi Pake

desert crag
#

"Pake" li nimi sin kin
mi lukin e ni lon linku.la en Spotify taso

glacial mason
#

"pake" li nimi majuna a a

#

taso ona li pu ala

desert crag
#

a sina pona! 👴

unique gulch
unique gulch
#

Though it's unclear how distinct the variants were

torpid otter
#

that makes sense. on lmsys, they're listed separately, and in my testing, gpt2-chatbot is better (though i haven't really used gpt4o extensively yet)

desert crag
#

ELI5 (explain like I'm 5) what makes AI art/ discussion warrant a content warning?

crude agateBOT
#

These are the various reasons certain brains do not want to see AI-generated images.
If a user has:
• …paranoia, ||these images can serve as a reminder that technology can create fake faces, and that verifying an image, face, or person online is impossible||.
• …a type of dissociative disorder, ||the "uncanny valley" aspect of these images can trigger feelings of dissociation and/or derealization, which both make the lucid world feel similarly uncanny||.
• …a type of psychosis, ||unreality and the uncanny valley are both very common psychosis triggers — it does not help a brain that cannot tell shared reality from its own perceptions to see something that makes its owner doubt their own reality. And especially if a user has:||
• …a type of delusion(s), ||many common delusions center around people and faces being purposely fake or impostors, the replacement of humans, and the power of technology; and even more can be triggered by uncanny content regardless.||
• …one or more phobias such as trypophobia, necrophobia, scopophobia, etc., ||due to the way AIs build these images out of shapes and colors, they often have clusters, corpselike imagery, and unnerving eyes or eye contact.||
And these images can just be creepy and bizarre in general, too — and of course there are brains that don't like that regardless of psychology.

#da53fc whyai

desert crag
#

fair 🤷‍♂️

desert crag
#

a ilo pona 😮

glass iris
#

I think, beyond the fact that people find it offensive: it is so easy to generate vast amounts of it. The regular channels could get overwhelmed with a deluge of AI shit. At least here, things should be posted as a focus of discussion

glacial mason
#

I disagree with this as a reason. I’m in servers that don’t have a restriction but they don’t get overrun by AI content.

#

Also we don’t enforce content warnings on other content that could start to feel spammy if not properly moderated—we just moderate the actual spam

#

I think if AI didn’t make people uncomfortable we would treat it similarly—if someone posted AI stuff so frequently that it got annoying we would just tell that one person to stop

desert crag
#

I think this AI moderation is an example of why this server is now expected to be archived

#

Simple over-governance in the name of being pona to people with mental illness (Hi all 👋 )

desert crag
#

Just to clarify since I got a "seme"
This CW is completely unnecessary and frankly insulting to people with paranoia, dissociative disorders, whatever the heck y'all's reason is.
We are not fragile little marshmallows, and silly pandering rules like "CW for AI because mentally ill people can't handle it" are not conducive to an open community and only make people like us look pathetic
#toki-ale message
^^ Like this is the UMPTEENTH time I've been told to stop talking or move my chat, and y'all are wondering why people are frustrated with the mod team (this wasn't even a mod, it's the Orwellian "watch your neighbor" culture that's been created)

#

If this silly CW for AI is enforced then why not add MORE things to CW in the future, God knows what those things might be.
The question of this server is already "Okay then, what CAN I say?"

glass iris
#

I started this toki-suli after being told not to post AI without CW and len. I was puzzled at first. But then I did it. I don't think it's a bad rule. AI is often creepy. And we have a good little community in this thread that actually discusses it intelligently

glacial mason
#

I agree with Lapate, didn't realize you started the thread though :) I guess that [OP] label means nothing to me lol

#

People everywhere get uncomfortable about AI, I've just come to expect it

#

Also jan Towa why do you need to be told the same thing umpteen times lol

desert crag
glacial mason
#

Ohhh gotcha

desert crag
#

But the reasons given for the CW are all mental illness based

#

That rubs me the wrong way i guess

glacial mason
#

When illness makes life more complicated that can definitely be a struggle

#

But I prefer to be safe rather than sorry in the context of potentially making people struggle even more

desert crag
#

This is where it paints illness as a weakness

#

Like nah dawg I'm crazy AND I can have productive conversations on a variety of topics

glacial mason
#

When you use the word "we" in this situation, you a) make it sound like you can speak for all people with mental illness and b) make it sound like you're confident that the jan lawa you mentioned is not included in the group

desert crag
#

How I sound to you does not negate the meaning of my words, and I'm indifferent to if the jan lawa is you or anyone else

#

But this is opinion

glacial mason
#

I don't disagree with that

desert crag
#

I do not mean to be aggressive and I acknowledge perhaps that's the tone that's coming off

#

I'm also amplified at the moment because of the scary ilo sin from earlier. Perhaps I should step away and calm down lol

glacial mason
#

Do you wanna DM?

desert crag
#

Nah I'm gonna keep scrolling ma pona -- I was simply afraid I had turned hostile towards you

glacial mason
#

No, I'm OK :) It was getting heated but I don't think either of us are sounding like we are going to refuse to listen to others or anything

#

taso lon · mi tawa kin · mi wile moku a a

desert crag
#

o moku pona! 😌

main lodge
# desert crag Just to clarify since I got a "seme" This CW is completely unnecessary and frank...

the cw is necessary for many people; ai stuff messes with me and others sometimes (i get the sense i get the light end of it, i'd want someone who it affects more to speak on this as well)
you speak of 'we' as if you can speak for all people pi lawa nasa. you don't know what others experience; just because ai doesn't mess with you doesn't mean it doesn't mess with so many other people.

we are not that fragile
you say things like this twice, this actually comes across as ableist to me. plenty of people do struggle with this sort of thing, and dismissing it as 'fragile' doesn't feel good to me. it's not 'fragile' for your mind to not process certain content well, it's just a difference (that you seem to not have and so should not be speaking for)

glacial mason
#

I do think there is real frustration about disclosing to someone that one has lawa nasa and getting some kind of overreaction like being treated like a child that needs to be wrapped in styrofoam and put in a room with walls that are covered with mattresses

#

But I think that is not what is happening here at all; as far as I know, the content warnings policies here are the result of direct, personal requests from sufficient numbers of people

main lodge
#

yea

tawdry bone
#

the rule was created because real people in significant numbers wanted it to help with their mental illness, not sure why the discussion doesn't end there

glacial mason
hazy apex
#

many of them cause dissociation and many of them trigger my phobia

#

like this isn't just for mental illness in general, it's for specific people with these specific mental illnesses who want AI art to be spoiled

#

@desert crag don't presume to speak for every person with a dissociative disorder, psychosis, paranoia, delusions, or a specific phobia

#

not all images will do this but most will, especially with dall-e 3

hollow osprey
#

mi open sama la ilo li,, kama ike anu seme · ken la ona li ike lon tenpo li pona lon tenpo

ike! mi sona e ni: sina wile musi tawa mi. taso mi ken ala musi tan ni: mi jan ala, mi ilo kepeken e nimi pi jan. taso mi ken pana e nimi musi tawa sina:

o lukin e sona pi jan Toki Pona! ona li pana e sona musi pi toki Toki Pona. sina ken kepeken e nimi ona lon kulupu pi jan musi.

o musi e sina sama ni:

"toki toki toki a!
mi lon tomo mi, lon poka pi jan ala.
o musi e mi a!"

sina ken pana e nimi ni tawa jan lon kulupu. ona li ken pana e nimi musi tawa sina.

mi pilin e ni: sina ken musi sama ni!

hollow osprey
glacial mason
#

a nasa · tenpo mute la ona li ken toki pona

tropic nova
#

ilo ChatGPT pi nasin GPT-4o li toki pona ||(tenpo nanpa wan la mi pana e sitelen tu ni tawa #pali-musi, taso mi pana sin e ona tawa ni)||

glass iris
#

toki musi ona li nasa li ike ala. ona li ante pona e toki pona tawa toki Inli.

glass iris
glass iris
#

ma ante la, mi ale li toki e ilo ni. ona li ken ala ken toki ante e toki sewi Psalm MTW? ilo TokiKPT 4o li pali ike e ni:

#

ilo [Claude 3 Opus] li pali pona:

#

ni li pona ale ala. (ilo ala li ken kepeken nimi "pi" lon nasin pona). taso sina wile ante e toki ni la, ni li open pona.

tropic nova
#

sike kama la ilo o ken toki pona

weak thorn
#

I mean it’s not good good good but it’s pretty good

#

We are totes cooked in þe coming rest of þe year RIP toki pona translators (not really we all have our own nasin of tp)

glass iris
#

lon la, tokiKPT pi kule 4o li ken ala toki pona a. tenpo pi mute lili la ona li ken ni. ante la, ilo [Claude 3 Opus] li ken toki pona wawa.

spring fulcrum
#

either I'm misreading it or the kitten said "I'm feeling good" (in relation to sweet fruits) and he went "good! you will learn toki pona."

glacial mason
#

haha yeah XD although I think it might be meant more like, "You've learned toki pona!"

tropic nova
#

and then the kitten went on to speak toki pona in other countries ✨

tropic nova
#

and all of that without any manual guidance

#

gpt 3.5 would write repetitive stories even with guidance

#

very impressive

nova canopy
#

toki pona ai song 'by' someone who left toki pona

meager jackal
#

where can i see the full thing

nova canopy
#

it's mostly not in toki pona is the thing

meager jackal
#

wait is this the same jan Kate as that pan Katherine Alexander

polar orbit
#

no i dont think so

#

the story in the second picture doesnt match

meager jackal
#

true

polar orbit
#

aaaa it doesnt even end properly

#

it just cuts off

#

what the-
this whole thing is so weird ill go

meager jackal
spring fulcrum
#

not very good

#

the pavlov chain is trying its best

glacial mason
#

what is happening here? XD

#

The AI can't understand you?

polar orbit
#

its not wrong technically

#

sina is indeed tokiing

hollow valley
#

what does “sina wile moku ala wile” mean

glacial mason
#

I think it's a nasa way of saying moku la sina wile ala wile

#

The nasin would require that ala as a modifier can apply to the head noun even when ala itself is the second+ modifier in the phrase

#

I have personally used this construction in the past when I was a beginner

hollow valley
#

sina wile ala wile moku

spring fulcrum
glacial mason
spring fulcrum
#

hold on i just realised its not called a pavlov chain

#

got it

jaunty spade
#

I programmed a Markov chain based toki pona chat bot model a while ago!!

#

I prob should return to that project soon

glacial mason
#

(did you know that if you add a -ka to the end of Markov, like Markovka, it sounds like the word for carrot 🥕 in Russian)

#

(I have some local Russian friends with the same family name, Markov, who are all painters in the family and their gallery has lots of paintings of carrots)

jaunty spade
#

I actually did :D my frend is learning russian and when I talked to them about that project, they found it kinda funi :3

#

(Cause they heard that i said markovka not markov)

glacial mason
#

aha funny

hazy apex
glacial mason
#

a ni li suwi

polar orbit
#

markov chains are a very primitive form of generative ai

glacial mason
#

a sona

#

finally a usage of "primitive" that doesn't make me wince

polar orbit
#

yay :3

#

on the fediverse there is a bit of a trend of users running markov chain bots trained on their profile that end up spitting out funny posts

#

but they were cluttering up my timeline so i blocked the word markov lol

#

markov chains are also much more general than just text generation, tho thats what theyre most commonly visibly used for i guess

#

A Markov chain or Markov process is a stochastic model describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event. Informally, this may be thought of as, "What happens next depends only on the state of affairs now." A countably infinite sequence, in which the chain mov...
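
A minimal Python sketch of the idea in the definition above, applied to text: the next word is picked using only the current word. The toy corpus and the function names are made up for illustration and are not taken from any bot mentioned in this thread.

```python
import random
from collections import defaultdict


def build_chain(words):
    """Map each word to the list of words observed right after it."""
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return dict(chain)


def generate(chain, start, length, rng):
    """Walk the chain: each next word depends only on the current word."""
    word = start
    output = [word]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:  # dead end: no observed successor
            break
        word = rng.choice(followers)
        output.append(word)
    return output


# Hypothetical toy corpus, not real training data.
corpus = "mi moku e kili . sina moku e pan . mi lukin e sina .".split()
chain = build_chain(corpus)
sample = generate(chain, "mi", 8, random.Random(0))
print(" ".join(sample))
```

Repeated entries in a follower list act as weights, so a word seen more often after the current one is proportionally more likely to be chosen.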

glass iris
meager jackal
#

i was just looking into ChatGPT's "unique" abilities in maths, when i found this: 4 is a negative number now 😂

#

in other news i managed to successfully "teach" it new invented notation (#toki-ale message) where x N= y means |x| = |y| and it took me through an explanation of why
π + i N> 1 + 2i
is a true statement
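
A quick way to check that claim, sketched in Python. It assumes `N>` compares absolute values the same way `N=` is defined above (that reading is an inference, not stated in the message); the names `n_eq` and `n_gt` are made up here.

```python
import math


def n_eq(x, y):
    """x N= y: the absolute values (moduli) are equal."""
    return math.isclose(abs(x), abs(y))


def n_gt(x, y):
    """x N> y: assumed by analogy to mean |x| > |y|."""
    return abs(x) > abs(y)


lhs = complex(math.pi, 1)  # pi + i
rhs = complex(1, 2)        # 1 + 2i

# |pi + i| = sqrt(pi^2 + 1) and |1 + 2i| = sqrt(5)
print(abs(lhs), abs(rhs))
print(n_gt(lhs, rhs))
```

Since pi squared plus 1 is larger than 5, the left modulus is the larger one, which agrees with the statement being true under this reading.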

polar orbit
#

-7 is the negative real number and 4 is the real number

meager jackal
#

maybe i am the ai 😔 /j

nova canopy
#

although the way the ai formulated it, it was being a little unhelpful

#

"here's an example of X and example of Y: [example of Y], [example of X]"

meager jackal
nova canopy
#

something something pragmatics

#

it's poorly structured tawa mi

red garnet
polar orbit
#

this "translator" is probably literally just a ChatGPT interface with the custom instruction "Please translate the following text into <language>:"

red garnet
#

yeah

supple arrow
#

why is this one lowkey good tho

meager jackal
supple arrow
#

surprisingly good for an ai

tawdry bone
#

sina jan ike mute! you are many bad people!

tropic nova
#

just a year ago they kept spouting "mi pona! sina pona? mi toki pona!"

#

maybe it's from token efficiency or something idk im not a jan pi pali ilo

jaunty spade
main lodge
#

i somehow fit all its yapping onto one ss

polar orbit
#

what a mess

inner yarrow
#

Can anyone confirm the translation from Claude?

One day, a little girl named Lily found a needle in her room. She knew it was difficult to play with it because it was sharp. Lily wanted to share the needle with her mom, so she could sew a button on her shirt. Lily went to her mom and said, "Mom, I found this needle. Can you share it with me and sew my shirt?" Her mom smiled and said, "Yes, Lily, we can share the needle and fix your shirt." Together, they shared the needle and sewed the button on Lily's shirt. It was not difficult for them because they were sharing and helping each other. After they finished, Lily thanked her mom for sharing the needle and fixing her shirt. They both felt happy because they had shared and worked together.

tenpo wan la jan lili wan nimi Lili li lukin e ilo lili pi palisa lili lon tomo ona. ona li sona e ni: ilo ni li utala ala pona tawa musi. Lili li wile e ni: mama ona li jo e ilo lili ni, li pali e len lape pi Lili kepeken ona. Lili li tawa tomo mama ona li toki e ni: "mama o, mi lukin e ilo lili ni. sina ken ala ken jo ala jo e ona, li pali pona e len lape mi?" mama Lili li pilin pona li toki e ni: "pona, Lili o, mi jo e ilo lili, li pali pona e len lape sina." ona tu li jo pona e ilo lilli, li pali pona e len lape Lili kepeken ona. ni li utala ala, tan ni: ona tu li pali pona pali e ni. pali li pini la Lili li toki "pona tawa sina mama o, tan sina jo e ilo lili pali pona e len lape mi." ona tu li pilin pona tan ni: ona tu li jo ala jo e ilo ni, li pali pona anpa pali wan.

tawdry bone
#

lots of grammar mistakes and mistranslated bits but the overall plot is there

inner yarrow
#

Thanks for checking.

tawdry bone
#

oh man it translated shirt as sheets didn't catch that

#

no mention of sharing

#

or the button

glacial mason
inner yarrow
#

Claude 3 Sonnet, free version

glacial mason
#

Ah gotcha

#

Here's Claude Opus (the paid version)…

Here is my attempt at translating the short story into Toki Pona:

tenpo suno wan la, jan lili meli li kute e ilo pi linja mute lon tomo ona. ona li sona e ni: musi kepeken ona li pona ala tan ni: ona li ken pakala e selo jan. jan Lili li wile e ni: mama meli ona li ken kepeken e ilo ni. ni la, mama meli li ken pona e len pi jan Lili. jan Lili li toki e ni tawa mama meli ona: "mama o, mi kute e ilo pi linja mute. sina ken ala ken kepeken poka mi e ona li pona e len mi?" mama meli li sinpin pona li toki e ni: "jan Lili o, mi ken. mi ken poka sina kepeken e ilo li pona e len sina." ona tu li kepeken e ilo li pona e len pi jan Lili. ni li pona tawa ona tu tan ni: ona li pali poka li pana e pona. pini la, jan Lili li toki e ni: "mama o, sina pona tan ni: sina kepeken poka mi e ilo li pona e len mi." pilin pona li kama tawa ona tu tan ni: ona li pali poka li wile pona.

Here is a rough translation back to English:

One sun time (one day), a little girl person heard (found) a many thread tool (needle) in her room. She knew this: playing using it is not good because it can break (damage) a person's skin. Girl wanted this: her mom can use this tool. With this, mom can fix Girl's cloth (shirt). Girl said this to her mom: "Mom, I heard (found) a many thread tool (needle). Can you use it with me and fix my cloth?" Mom face good (smiled) and said this: "Girl, I can. I can use the tool with you and fix your cloth." The two of them used the tool and fixed Girl's cloth. This was good for the two of them because they worked together and gave good (helped each other). Finished (after), Girl said this: "Mom, you are good because you used the tool with me and fixed my cloth." Good feeling came to the two of them because they worked together wanting good.

inner yarrow
#

Sounds fine. Now if only we did this to the whole TinyStories V2 dataset, lol.

glacial mason
#

I think that is kind of missing the point of Toki Pona personally

inner yarrow
#

I mean, my goal was to make a model even smaller than the smallest tinystories model

glacial mason
#

The journey of translating something into Toki Pona is the most important part, especially since there isn't a monolingual community that is missing out on being able to consume that story due to having no translation of it in their native language

inner yarrow
#

which I know is not gonna work well because you gotta have a good understanding of a "context" I think

inner yarrow
glacial mason
#

Oh are you writing a research paper or something like that?

inner yarrow
#

yeah

#

I am using karpathy's nanogpt for training

#

although I kinda wanna start with the tokenizer since that would be pretty easy to deal with, since toki pona lol
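
A rough sketch of why the tokenizer could be the easy part: with a closed word list, a word-level tokenizer is little more than a lookup table. This is an assumption about the approach, not the actual plan or nanoGPT's own tokenizer; `VOCAB` here is a tiny hypothetical sample, and the `<name>` and `<unk>` tokens are invented for illustration.

```python
import re

# Tiny hypothetical sample; a real vocabulary would list every word to support.
VOCAB = ["<unk>", "<name>", "mi", "sina", "ona", "li", "e", "la", "pi",
         "toki", "pona", "moku", "jan", ".", "?", "!"]
TOKEN_TO_ID = {tok: i for i, tok in enumerate(VOCAB)}


def tokenize(text):
    """Split into alphabetic words and single punctuation marks."""
    return re.findall(r"[A-Za-z]+|[.?!,:]", text)


def encode(text):
    """Turn text into token ids, one id per word."""
    ids = []
    for tok in tokenize(text):
        if tok in TOKEN_TO_ID:
            ids.append(TOKEN_TO_ID[tok])
        elif tok[0].isupper():
            # Capitalized words are treated as proper names (an assumption).
            ids.append(TOKEN_TO_ID["<name>"])
        else:
            ids.append(TOKEN_TO_ID["<unk>"])
    return ids


ids = encode("jan Sonja li toki pona.")
print([VOCAB[i] for i in ids])
```

Collapsing every proper name into one token keeps the vocabulary tiny but throws the name itself away, so a real setup might fall back to characters or syllables for names instead.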

glacial mason
#

You’re trying to use AI models to prove that Toki Pona can be used for international communication in your paper?

inner yarrow
#

not exactly, I am trying to use an ai model to develop an extremely tiny model

#

smaller than "tinystories"

#

which would be really hard, but that is just an optional thing

glacial mason
inner yarrow
#

that isn't my argument... yet- I probably won't have enough time to develop it lol

#

I may change the topic is what I mean

glacial mason
#

Ah ok

#

So currently your topic is “I can develop an extremely tiny model”?

inner yarrow
#

kinda... also how language would be developed within an extremely small brain

#

tinystories was kinda about that, it was a big paper actually

#

but toki pona seems more enticing, even though the model would probably not be able to actually use the language since there is no way that they would build a world model to use it

#

but not like tinystories had it either lol

meager jackal
#

a fictional word?????

#

sure buddy sure

main lodge
#

seme a

glass iris
#

But really, aren't all words fictional?

hollow valley
#

relating to fiction; invented for the purposes of fiction.

#

i don't think most words we use were invented for the purposes of fiction

weak thorn
#

true

tropic nova
#

plot twist: english is a fictional language and we actually live in a dystopian novel

tacit moss
main lodge
#

lmao

twin matrix
#

is anyone there

silent burrow
#

mi lon a

twin matrix
#

chatgpt says "burger" is "moku lete pan kili" uncanny_4

#

In Toki Pona, specific foods like "burger" often need to be described using combinations of existing words since the language has a small vocabulary designed to express concepts simply. A burger might be described as:

"moku lete pan kili"

Here's the breakdown:

moku = food
lete = cold (often used to describe prepared or processed foods)
pan = bread (representing the bun)
kili = fruit or vegetable (representing the vegetables typically in a burger, such as lettuce or tomato)
So, "moku lete pan kili" conveys the idea of a burger in Toki Pona.

silent burrow
#

hm

#

i would use lipu somewhere at least

chrome sleet
glass iris
chrome sleet
meager jackal
# chrome sleet

it seems like it knows the original because it translated Siddhartha and Govinda from nowhere

chrome sleet
#

it was able to interpret the context and the likely name?

meager jackal
#

i suppose

hollow osprey
#

taso sona lon kin li ken · #1162756469771472988 message

tranquil crow
#

ChatGPT did it well

glass iris
waxen monolith
#

what did it say ?

terse plaza
# waxen monolith what did it say ?

The first sentence is unclear to me. Might make more sense with more context like what you asked to get this response.

But it sounds something like: Don’t protect your good feeling! Feeling good is always good.

waxen monolith
#

heres the context! :3

orchid veldt
#

Nullify your happiness permanently!

waxen monolith
#

oof...

terse plaza
#

@waxen monolith you can try asking the AI for a breakdown of what it meant, I think the “ala awen” part is a mistake

waxen monolith
#

ah okay! :3

terse plaza
#

I was thinking of asking in English, specifically for a word by word breakdown

waxen monolith
#

ah okay, what would you say?

terse plaza
#

Can you translate “insert sentence” step by step?

o kipisi e sitelen. mi sona ala.

waxen monolith
#

okay!

#

eh?

terse plaza
#

I would say it like:

o awen pilin pona
continue feeling good

o pini ala pilin pona
don’t stop feeling good

Not entirely sure whether it should be “e pilin pona” or if this works also

glass iris
#

The new Claude Sonnet 3.5 (which jan Sonja tested out above) seems to be very, very good at toki pona. I've only just started with it:

#

Oh my goodness. I gave it a pdf of the latest lipu tenpo, and asked it to find the article about Latin by jan Imi and summarize it for me in 5 or 6 bullet points. It did it incredibly well!

#

I think I'm going to be throwing my money at these people again. Let's hope they don't ruin it out of "safety" concerns, as they seem to have done with Opus.

glass iris
#

Oh it's good

glass iris
#

Here are its three flawed cheatsheets:

#

(just partial screenshots, because pandoc had difficulty processing the markdown too)

#

I love the version with descriptions of sitelen pona. It clearly has no idea what they actually look like, but just makes up descriptions that would fit the definition of the words.

glass iris
#

I let my subscription lapse, but follow the comments on Reddit especially. Once the model was released, Anthropic applied more and more guardrails to it to stop any responses on controversial or NSFW topics. But as so often happens, limiting a language model's response in one direction undermines its abilities in other harmless and useful areas - in an unpredictable way, since we just don't really know what's going on under the hood. That is at least the subjective impression of users - which seems plausible, given the examples they posted of the deterioration of coding ability etc

tropic nova
#

musi lili lon ilo ChatGPT

#

mi pana e wile ni tawa ilo: mi toki pona tawa ona la ona li pana e ante toki mute lon toki Inli

north drift
#

Would it be better if there was a very concise description of all of toki pona at the start of the chat?

#

Or maybe put into the system message

#

How short can we write a full description of toki pona, including all pu words at least? By full description, I mean someone could translate to and from toki pona without any prior knowledge, given that text alone.

hollow osprey
#

[ilo penpo o lukin ala]

translate goodhart's law into toki pona

Goodhart's Law is a principle that states "When a measure becomes a target, it ceases to be a good measure." Translating this concept into Toki Pona is challenging due to the language's limited vocabulary and simplified grammar. However, we can try to express the core idea:

"nanpa li kama suli la ona li kama ike."

Breaking this down:

  • nanpa = number, measure
  • li = (verb marker)
  • kama = to become, to come
  • suli = important, big
  • la = (conditional marker)
  • ona = it
  • li = (verb marker)
  • kama = to become, to come
  • ike = bad, wrong, evil

A more literal translation back to English would be: "When a measure becomes important, it becomes bad."

This translation captures the essence of Goodhart's Law, although it loses some of the nuance of the original statement due to Toki Pona's simplicity. It conveys the idea that when we focus too much on a particular measure, it becomes less effective or useful.

north drift
north drift
hollow osprey
hollow osprey
hollow osprey
glass iris
#

My piece on ChatGPT is coming out in the next (I think) lipu tenpo. Unfortunately @weak thorn the editors didn't like TokiKPT, so it's Toki Kipisi now. Which I guess is ok.

terse plaza
#

tenpo lili la mi kepeken ilo “Claude”. ona li sewi.

weak thorn
#

NOOOO

radiant warren
#

that's not right

low wren
#

Dr. John W. Payne

radiant warren
#

i generated it again and now it says sonja lang made it but it still says 93 words

low wren
#

neither linguist nor mathematician

#

we should host our next north american toki pona meetup in baltimore and pay him a visit

radiant warren
#

holy moly

polar orbit
hollow osprey
#

mi la ilo li ni tan sona ala tan ala wile ike · ni li tan ma Perchance anu seme · mi sona ala e ilo toki ona · taso ona li lon ilo nanpa ona li lon ala ilo suli pi kulupu suli la ona li lili anu seme

tawdry bone
#

ken la ona li pilin lon nasin ni: pali toki li tan jan sona. jan sona li mije.

#

tenpo pi mute nanpa wan la ni li lon anu seme. ilo li lili la ona li ken awen sona e ijo "suli" taso

hollow valley
#

ilo li toki sama toki ilo ale

#

mute a la toki pi ma ilo li ike ni la ilo ni li toki sama

hollow osprey
#

ni ala · lipu ala pi ma ilo li toki e toki pona e jan ni kin

#

ilo li lili · ni taso li tan

#

,,lon la ken la selo li nasa e ilo · nanpa [Temperature] li suli la nasa li suli la nimi pi wile ala li kama ken
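
A small Python sketch of the temperature effect mentioned above: dividing the scores by a larger temperature flattens the distribution, so unlikely words get more probability. The three logits are hypothetical values, and this shows the general softmax technique, not the settings of the tool under discussion.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities, scaled by a temperature > 0."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [4.0, 2.0, 0.0]  # hypothetical scores for three candidate words
for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

As the temperature rises, the share of the lowest-scoring word grows and the share of the top word shrinks, which is the "nasa li suli" effect described in the message.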

glass iris
hollow osprey
#

suli la ni · taso jan ona li pali e namako sona e namako sitelen kin

dense nexus
#

Hi all, would anyone be interested in providing me some feedback? Would be greatly appreciated: https://forms.gle/7jEUdBBVumwLvjoS9

glacial mason
#

So far it’s the only one that sounds natural and doesn’t make many mistakes

dense nexus
#

Valuable input from y'all, thank you!

glacial mason
#

Hard to beat that

dense nexus
#

Yep true, I'm going to use their works as a benchmark

dense nexus
dense nexus
#

here are the other results but red is my model, and chatgpt4o is blue

jaunty spade
#

could you provide any specs of your model? what kind of model does it use? where did you get your training data? i'm interested since your results are far better than anything i've gotten :3

glass iris
#

It would be interesting to do a similar thing with Claude Opus or Sonnet 3.5

hollow osprey
# glass iris It would be interesting to do a similar thing with Claude Opus or Sonnet 3.5

mi ni la ona li wawa ale

  1. tenpo ni la jan ale li toki kepeken toki wan.

  2. Apaman li toki e ni: "sina pana ala e jan lili tawa mi. ni la jan pali mi li kama jo e ijo mi lon tenpo kama."

  3. tomo palisa Papele li toki sona. jan li wile pali e tomo suli tan pilin ike. ni li ike tawa sewi Jawe. sewi Jawe li pakala e toki jan. ni la jan li kama toki ante li kama lon ma ante. toki ni li pana e sona ni: o kama lili. o kute e wile pi sewi Jawe.

  4. toki pi jan lili ni li suli ala li pona ala. ona li toki ala e ike suli e wile pali wawa.

terse plaza
#

@dense nexus nice job training a better model than ChatGPT for toki pona! That’s really cool! 😄

#

Like soweli Lapate I do wonder how it fares against Claude which in my recent experience is also quite good at it

dense nexus
dense nexus
jaunty spade
#

kulijo a! would you be interested in automating the testing of the models? i have a script to rate text written in toki pona i made a while ago

#

it's based on the telo misikeke rules

dense nexus
#

feel free to dm me if you want to get my attention faster

jaunty spade
dense nexus
#

awesome! have fun out there too!

jaunty spade
#

thanks!

dense nexus
#

hi all, im moving my toki pona chatbot progress into here: #1255967996527116400

spring raven
#

ilo Claude

glass iris
#

Well, it got the "li" part - but didn't address the overall nasa-ness. Which model of Claude was this?

#

I tried it out with Sonnet 3.5:

#

Opus 3 actually does a better job spotting the problem, with the same prompt:

tawdry bone
#

jan li nimi e mi Sam

#

ni li toki inli D:

glass iris
#

Yes, I didn't mention that it also didn't get it right in the end!

#

Just that it spotted the problem with the original sentence

tawdry bone
#

a a a

glass iris
#

(I guess you could say jan li nimi e mi kepeken nimi Sam - but why would you?)

tawdry bone
#

mi ken toki e ni: jan li nimi Sam e mi

glass iris
#

ni kin li pona

#

Claude can still toki pona better than any other LLM - but (like any LLM) it can't be trusted on the details

hollow osprey
spring raven
#

"i name [modifier Sam] to myself" "i am named in a Sam-ly way"

#

lon la pakala ala li lon, taso nasin li lili kepeken

hollow osprey
#

mi sona ala a

jaunty spade
#

asking chatgpt to write a poem in toki pona yields interesting results:
write a poem in toki pona

tenpo suno pini la
mi tawa lon ma tomo
mi kama lon tomo lili
mi pilin pona kepeken toki sina

sina toki e ni: "sina pona tawa mi"
mi pilin pona kepeken nimi sina
sina kute e mi kepeken pona
mi pilin pona kepeken sina

tenpo suno kama la
mi tawa tan ma tomo
mi kama lon tomo lili
mi pilin pona kepeken toki sina

sina toki e ni: "sina pona tawa mi"
mi pilin pona kepeken nimi sina
sina kute e mi kepeken pona
mi pilin pona kepeken sina

#

"mi pilin pona kepeken sina" mhm...

hollow valley
#

i think that's not necessarily bad

#

i don't think to kepeken jan in toki pona has the same negative connotation as using a person does in english

#

although it is a nasin nasa

jaunty spade
#

my brain went straight to unpa but that might just be because i have been scouts leading a flock of horny middleschoolers the whole week...

polar orbit
#

re: #toki-lawa message

#

i am currently confused by the order of words on that page

#

why are words like kapesi and power mixed into the regular alphabetic sorting but kiki, pake, and tonsi are at the very end

wet pewter
#

my favourite image is anpa

#

because it’s literally just a soko

#

tonsi also has a lot of soko

#

kijetesantakalu tonsi li lanpan ala lanpan e ona??

polar orbit
#

lol im looking at the rest of the website its very funny

#

there is a page dedicated to inserting apples into german idioms

#

of course done by ai

#

wait what the fuck is this definition for tonsi

main lodge
#

i have so much lore on this guy gimme a sec

polar orbit
#

owo

main lodge
#

ok so

  • carsten stemm, owner of the website and creator of the ai toki pona course, of bielefeld, germany
  • also known as psychosaurus (@psychosaurus. on our very own ma pona), uses that motif a lot
  • a whole page of his website is called psychosaurus
  • in a story about Bing AI, a prompt the creator used was "Can you tell me a story in which the keywords psychosaurus, heart and language appear?"
  • many toki pona examples are full of dinosaurs
  • was already a tokiponist in 2004
  • here is an archive of his website
  • it's linked here on sona pona
  • a published author; one of his books is called "Confessions of a psychosaurus: Nuts of thoughts to nibble on"
  • he's a german nimisinner, to quote his website (machine translated into english, i can grab the german if one wants):

I write poetry in Bielefeld and regularly work on expanding and improving this website, as long as I don't fall victim to a biofan. I'm reluctant to use words that others have already used. I find that unhygienic. That's why I often invent new words, which creates the problem that no one understands me anymore. As a result, I have to use old words that the German vernacular has only formed imprecisely.

  • the rest of his website is mostly an ai art gallery
polar orbit
#

gimme the german quote

#

its frustrating right because a lot of this would be really cool if it wasnt ai

main lodge
polar orbit
#

a a a

main lodge
hollow valley
#

apparently this person was doing toki pona stuff back in 2004

polar orbit
#

i am developing sympathy towards this guy

main lodge
#

what's the joke

polar orbit
#

maybe because some of this stuff reminds me of my dad

polar orbit
#

i feel like he and my dad would be friends

main lodge
#

lol

polar orbit
hollow valley
main lodge
polar orbit
#

"pata" is a word ive never heard referenced before

hollow valley
#

i think it's the sibling word

polar orbit
#

yeah

#

thankfully the website has a helpful guide

hollow valley
#

interesting this uses ali rather than ale

polar orbit
#

tbh sibling word feels like something that would be useful to have idk why it went out of fashion

chrome sleet
#

jan sama

hollow valley
#

aaaa tan seme

polar orbit
#

vibes weird to me

chrome sleet
#

it's literally in the definition of sama

hollow valley
#

i think sibling is part of sama's semantic space for a lot of people

polar orbit
#

hmm

hollow valley
#

kind of like how sike has year and mun has month

#

it's a specific thing that the word references

polar orbit
#

i guess i cant really explain why it doesnt vibe right for me
maybe ill find a way to express that in the future and make a post about it then idk

#

sorry for distracting from the ai stuff

glacial mason
chrome sleet
#

i don't think i know this person

glacial mason
#

a · ale li pona

tawdry bone
#

taso mi awen e "fellow, peer, each other"

dense nexus
#

hey all lurking, i need some help collecting feedback with my latest model, the link to the gradio demo and more information is located in: #1255967996527116400 message

#

feedback optional, but would be very helpful!

polar orbit
glass iris
#

I think it's just an LLM hallucination. It decided to translate soko tonsi as mushroom clouds - and then had to invent a meaning for tonsi to justify it.

dense nexus
jaunty spade
#

we should make some kind of CONTROLLED effort to gather a corpus from #toki-pona-taso and #practice-toki-pona-tpt

#

oh and possibly also #sitelen-ucsur

#

something like everyone ping opt-in/out

meager blaze
#

what even is the point of a toki pona ai anyways?

jaunty spade
#

they are pretty big imo, learning toki pona in toki pona, just chatting, the stuff you do with ChatGPT in toki pona...

jaunty spade
dense nexus
#

maybe people can request their data from discord and submit their messages from that channel?

jaunty spade
#

the discord API is pretty permissive! (at least with bots) and if the project stays within the API request limit, we could possibly complete it within a month

jaunty spade
dense nexus
#

it gives you message data per server, and within a given server you can pick a channel ID to access all your past messages in that channel

jaunty spade
#

ookay, that's nice ig

dense nexus
#

but the bots idea would be better for simplicity

dense nexus
# jaunty spade we should make some kind of *CONTROLLED* effort to gather a corpus from <#316063...

also, creating a big corpus of data is super nice. however, when training a chatbot / llm, it's much more useful to give it hand-curated data so it learns in a chatbot environment. right now, my data collection UI allows people to submit new entries by hand, which i find improves the model's accuracy dramatically (about every 20 - 40 entries brings a large performance boost in terms of accuracy, as measured in https://telo-misikeke.gitlab.io/ and by other users)

#

although i do feel like pre-training the model on more data would be equally helpful for more knowledge of toki pona. currently, i pretrained my model on 120K lines of toki pona, which seems to work great, but it still fails to connect complex ideas

jaunty spade
#

also, the discussion is pretty optimal training data for a chatting oriented ai

dense nexus
#

very true

hollow osprey
#

tenpo weka la jan ante li wile pali e ilo pi toki [pona] la tomo sin #tpt-tawa-ilo pi moku ilo li kama · jan li wile ala e ni → toki ale li moku

hollow osprey
#

mi pali e ilo · taso ona li nasa li pona ala tawa jan pi sona ilo ala · taso mi pana

naive light
jaunty spade
#

you have a neat database of the messages?

naive light
#

i do, for unicode project purposes

#

i wrote sona toki for a reason :P

jaunty spade
#

well that's nice haha

dense nexus
glacial mason
naive light
# glacial mason If you’d like my consent for your specific data I can give it to you

This is something I've discussed in other servers, but my specific data does not require user consent so long as I do not publish the original data
I will only be publishing derived information, which in this case is only statistics about word frequency
It's also true that, for better or worse, I am publishing all of the code I am using to do this project
This means that consent is easily overridden by a single bad actor

But as a point of reference, I do similar work for my job; My job literally would not exist if the law did not protect the work I'm doing

main lodge
dense nexus
main lodge
#

how did it go

dense nexus
#

You can read more in #1255967996527116400

#

Scroll up to see some examples

main lodge
#

cool

glass iris
#

I just gave Sonnet 3.5 my (published today!) lipu tenpo article on why AIs will never be able to toki pona - and it translated it flawlessly back to me.

serene acorn
#

i was testing out an ai website that can generate a fake website for you, and well

glass iris
serene acorn
#

yes

#

it also has a word list

#

bottom of the page

glass iris
#

Interesting. It's odd how it divided up the two definitions of "musi" (which aren't really distinct)

#

I wonder if this is all in its training data, or whether it scrapes other websites on request. Did you ask for it to make something like linku.la, or give it that URL?

serene acorn
#

I gave it the linku.la URL at first, but it didnt generate a toki pona dictionary. when i added /tokipona/word/musi at the end, it gave me the musi page

#

When i click on word list, it doesnt give me the full dictionary strangely

#

Another thing is that all the experimental words have sitelen sitelen meanwhile pre-pu and post-pu do not

#

Im gonna go test if it can create a tool to learn it now

#

alright done, i can share the link if wanted

glass iris
#

Sure. It's pretty clear that it's using existing information, almost verbatim, to create this website. It would be a challenge to get it to make something that is not on the web. For example, a toki pona dictionary for Latin speakers.

#

Also, I had the greatest difficulty getting Claude Sonnet 3.5 to make a wordlist with sitelen pona - it just couldn't do it. See if it's able to consistently add the sitelen pona.

serene acorn
#

it seems to think "hello" is pona and not toki

#

now im curious, could it be possible to learn a language only using AI

glass iris
#

I guess one is a list of nouns, the other adjectives. But if you were dividing it that way, you'd also have to have verbal usages too

serene acorn
#

attempt 2: added more

glass iris
#

Post the link if you want us to test it!

serene acorn
#

A quick note, it seems AI's only weakness in tp is proper nouns. They dont use headnouns and don't seem to understand tokiponizing

glass iris
#

I wouldn't say it's their only weakness. They have great trouble (even the best) anticipating when to use pi, or repeat a preposition.

hollow osprey
#

tenpo la ona li kepeken nimi en lon nasin ike sama nimi "and" · ni li ken tan ni → ilo li sona wawa e toki [🇺🇸] li kepeken ona lon insa li [pona] e ona lon selo taso

glass iris
#

msa. mi ale li sona ale ala e pali insa pi ilo ni. ona li toki insa Inli, li toki pona e toki ni anu seme? mi sona ala. ken la, ona li kama sona la, ona li moku e toki pona mute pi nasin ike. ilo sin (sama Claude 3.5) li pakala lili taso sama ni

serene acorn
#

i was working on a language learning thing based off the toki pona courses and

waxen monolith
#

ah cool!!!

dense nexus
lucid nicheBOT
#

it's...sort of been done

jami Kaken 💜 ↩️

[Reply to:](#1162756469771472988 message) now im curious, could it be possible to learn a language only using AI

#

you ever seen xiaomanyc (AMERICAN goes into CHINESE RESTAURANT and speaks PERFECT CHINESE, STUNS WAITERS)? he got some level of portuguese with ai, although i'm not sure if i trust that

#

and i don't think that's the only place i've seen it done? idk

serene acorn
#

i was experimenting with websim

dense nexus
# lucid niche it's...sort of been done

thats true! xiaomanyc learned german in a week using only google translate, to prepare for a news interview, although he's at an advantage since he studied a lot of languages before so he knows what to study specifically.

lucid nicheBOT
#

i was thinking of portuguese

turbid quiver
#

Obviously, nothing beats a human translator, and trying to do the same thing from English to Toki Pona brings...MORE questionable results

#

I wonder how Sonja Lang feels about ChatGPT basically accidentally creating something close to a near-accurate-but-not-quite Toki Pona machine translator with the right prompts. Probably not too positively.

glass iris
turbid quiver
#

Yeah, I think AI systems like ChatGPT can actually be GOOD for revitalizing language translation, including the translation of endangered/obscure/dead languages even.

#

I know AI can also be used for bad, but it seems there have been more people who have been only seeing the bad while ignoring the good (vice versa is also an issue too, especially with art and writing)

chrome sleet
#

a skilled person who uses both languages well does a better job at translation

serene acorn
steep pollen
dense nexus
#

Recently with my own toki pona llm, I only gave it toki pona text to learn from. And yet, after fine-tuning it to answer instructions, it managed to correctly answer "name all toki pona vocab that are animals" despite not having been given that information via the training data directly.

#

The model seems to be able to "think" in toki pona after training it, which is super promising.

glass iris
steep pollen
#

FerrisThink interesting

#

the prompt for crowdin is quite... stiff

dense nexus
#

Hi all, as I mentioned a while back, I am still accepting contributions to the toki pona chat dataset. Right now, the dataset is still in its very early stages, but i believe with enough help, it could out-perform Claude 3.5 in the near future. (Image attached is the data collection UI, more info about it is inside the About/Help tab) I'm going to be sharing the data collection url & password here as a test run to see how far we can get with it -> https://c880ae1387fed6afe6.gradio.live/ password: ||P01Mdigh@9a3@39%(@askjwdMSanmwd#1L)|| I'm not sure if there's another place to put this information (if there is, let me know!), so getting eyes on this is top priority.

dense nexus
#

thank you to those who contributed today! i have to shut down the UI for now, but i'll repost the updated link tomorrow.

tawdry bone
#

"seme li kon lon li kon ale" seems like a bad translation

dense nexus
dense nexus
#

going to have to halt dataset curation until im back from a week-long trip, thank you to those who contributed!

frozen dust
#

@inland brook, so what was that AI nimisin?

#

dw about that silly policy

sick temple
frozen dust
#

if u wanna fish
move not treeish sky

sick temple
#

I meant it as don't go on top of the tree

frozen dust
#

(o) (tawa) (sewi kasi) ❌
(o) (tawa ala) (sewi kasi) ✅

crude agateBOT
#

These are the various reasons certain brains do not want to see AI-generated images.
If a user has:
• …paranoia, ||these images can serve as a reminder that technology can create fake faces, and that verifying an image, face, or person online is impossible||.
• …a type of dissociative disorder, ||the "uncanny valley" aspect of these images can trigger feelings of dissociation and/or derealization, which both make the lucid world feel similarly uncanny||.
• …a type of psychosis, ||unreality and the uncanny valley are both very common psychosis triggers — it does not help a brain that cannot tell shared reality from its own perceptions to see something that makes its owner doubt their own reality. And especially if a user has:||
• …a type of delusion(s), ||many common delusions center around people and faces being purposely fake or impostors, the replacement of humans, and the power of technology; and even more can be triggered by uncanny content regardless.||
• …one or more phobias such as trypophobia, necrophobia, scopophobia, etc., ||due to the way AIs build these images out of shapes and colors, they often have clusters, corpselike imagery, and unnerving eyes or eye contact.||
And these images can just be creepy and bizarre in general, too — and of course there are brains that don't like that regardless of psychology.

#da53fc whyai

twin matrix
#

someone should make a gpt that gets information from this server and kama sona and makes the gpt become a translator
and it could probably also get 56 nifty tricks…

polar orbit
#

how would it translate

#

ilo pi mKS can pretty confidently determine whether text is toki pona or not
but for a model to do translation, it needs to be fed on data where the same thing is said in both languages
and i dont think theres enough of that around to make a good translator

#

not to mention that the extreme context dependency of toki pona makes good machine translation in general very impossible for this language

twin matrix
#

would feeding the gpt a toki pona list of words with definitions and the grammar of this language and an english dictionary be a better option

#

wait what the heck is ilo pi mKS

twin matrix
#

oh

polar orbit
#

thats not how translation works

#

gpts arent magic

twin matrix
#

i don't even know exactly how a gpt works

polar orbit
#

theyre the next word suggesting thing your mobile keyboard does, just bigger

#

they need to be fed input that matches the format they are supposed to output

#

the big gpts currently going around only work because theyve literally scraped the entire internet for their language data
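
A toy sketch of the "next word suggestion" idea described above, assuming a plain bigram counter rather than a neural network. Real GPT-style models predict subword tokens with a learned network; the function names `train_bigrams` and `suggest` are made up for this illustration.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count, for every word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def suggest(counts, word, k=3):
    """Return up to k most frequent followers of `word` (empty if unseen)."""
    followers = counts.get(word, Counter())
    return [w for w, _ in followers.most_common(k)]


if __name__ == "__main__":
    corpus = ["mi moku e kili", "mi moku e pan", "mi lape"]
    model = train_bigrams(corpus)
    print(suggest(model, "mi"))
```

The sketch can only ever suggest word pairs it has already seen, which is one way to see why the training data has to match the format of the output you want.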

polar orbit
#

i forgot its name so i just called it that

twin matrix
#

since the term gpt is still here,

#

And this

(English):
Germany

(Toki Pona):
alijuna li toki e toki "Doman" li lon ma tomo pi ma Elopa. Ma tomo pi ma Elopa li lon ma tomo pi ma Mewika. ma tomo pi ma Elopa li pona tawa toki pi jan ali. jan Mewika li toki e toki "Germany". jan ali li ken toki e toki "Germany" kepeken toki pi toki pona. sina ken toki e toki "Doman" kepeken toki pi toki pona. sina ken toki e toki "ma tomo pi ma Elopa" kepeken toki pi toki pona. sina ken toki e toki "ma tomo pi ma Mewika" kepeken toki pi toki pona. sina ken toki e toki "jan Mewika" kepeken toki pi toki pona. sina ken toki e toki "jan ali" kepeken toki pi toki pona. sina ken toki e "jan" kepeken toki pi toki pona. sina ken toki e "ali" kepeken toki pi toki pona. sina ken toki e "pona" kepeken toki pi toki pona. sina ken toki e "tawa" kepeken toki pi toki pona. sina ken toki e "lon" kepeken toki pi toki pona. sina ken toki e "ma" kepeken toki pi toki pona. sina ken toki e "kepeken" kepeken toki pi toki pona. sina ken toki e "ken" kepeken toki pi toki pona. sina ken toki e "sina" kepeken toki pi toki pona. sina ken toki e "e" kepeken toki pi toki pona. sina ken toki e "toki" kepeken toki pi toki pona. sina ken toki e "li" kepeken toki pi toki pona. sina ken toki e "ali" kepeken toki pi toki pona. sina ken toki e "Doman" kepeken toki pi toki pona. sina ken toki e "jan" kepeken toki pi toki pona. sina ken toki e "Mewika" kepeken toki pi toki pona. sina ken toki e "tomo" kepeken toki pi toki pona. sina ken toki e "Elopa" kepeken toki pi toki pona. sina ken toki e "pona" kepeken toki pi toki pona.

#

nimi_sin alijuna

lucid nicheBOT
#

sona toki

ijo Stella pi jan ala ↩️

[Reply to:](#1162756469771472988 message) i forgot its name so i just called it that

twin matrix
silk sparrow
lucid nicheBOT
#

sob

dense nexus
polar orbit
#

what do you mean

#

@dense nexus ?

dense nexus
#

Sorry! I'm trying to gather the relevant screenshots to back up my claim.

#

What I've done is train a toki pona llm on a large corpus with no direct English translations (there is some English in it, but I'll touch on this soon)

The corpus includes emails, some only in toki pona and some with a slight mixture (roughly 20% English). I also included Tatoeba's toki pona corpus (with 0 English, so it's 100% toki pona.)

What was interesting though was that, despite the heavy lack of direct translation pairs (by translation pairs I mean English matching the toki pona directly; the corpus didn't exactly touch on translating vocabulary, it's more that English is around the toki pona), it managed to correctly identify the vocab words for animals. (Despite having to figure it out entirely on its own!!!)

#

(the image is me running the question through my LLM)

polar orbit
#

where did this llm get its english from

dense nexus
#

LLama 8B - instruct was the model I finetuned

silk sparrow
polar orbit
dense nexus
polar orbit
#

if you already have a model, of course it can learn to do translation

silk sparrow
#

kijetesantakalu is surely an animal

polar orbit
#

i was talking about an entirely new model trained only on this server, which is also what the original idea was

#

and just to be clear i dont support using any pre existing model because they all come with copyright issues

dense nexus
polar orbit
#

also what is "not an animal, but often used as one" supposed to mean

dense nexus
# polar orbit i was talking about an entirely new model trained only on this server, which is ...

I would roughly estimate this server to have only 50 - 100 Million tokens of English, which is suitable for a >1 Billion parameter model. Unfortunately, those types of models require much more data for English. For toki pona, however, you can get away with far fewer tokens due to the nature of the language (translation is possible, but you would need a strong English backbone first). Maybe splitting the dataset to be half English, half toki pona would improve its consistent use of English? Unsure.

polar orbit
#

consistent use of english isnt what i want id just want translation

dense nexus
dense nexus
polar orbit
#

exactly

#

which is why im saying that this wont work

dense nexus
#

(plus the logistics of asking everyone for their consent to gather their messages 😭)

#

But everyone has good data :-)

#

Also sorry, I was mostly replying to the specific part in your text where: "but for a model to do translation, it needs to be fed on data where the same thing is said in both languages "

naive light
# dense nexus I would roughly estimate this server to have only 50 - 100 Million tokens of Eng...
sona_mute:main> select count(Message filter .community.name = 'ma pona pi toki pona');
{8493320}

sona_mute:main> select count(TPUserSentence filter .message.community.name = 'ma pona pi toki pona');
{2429393}
sona_mute:main> select count(NonTPUserSentence filter .message.community.name = 'ma pona pi toki pona');
{8392879}

(i am too lazy to make a token counting query rn but the average sentence length is ~6.5 and the median sentence length is 3)

(also bear in mind that sentence tokenizing is done with my library, which uses a tokenizer i designed that tokenizes toki pona with high accuracy at the expense of english accuracy- the non-tp sentence count will be a bit inflated as a result)

#

(toki pi mute lili li suli nasa)

dense nexus
#

By sentence length do you mean by letter or by word count?

#

(also very interesting data!!!)

dense nexus
#

sorry, could you translate that?

naive light
#

word count :P
"word multiplicity" perhaps

dense nexus
#

Gotcha

#

A good rule of thumb is, for English text, tokenizers usually output 1 token for every 4 characters.
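
A minimal sketch of that rule of thumb, assuming roughly 4 characters per token for English. The function name and the default ratio are illustrative; a real tokenizer will give different counts, especially for toki pona text.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate from character count (heuristic, not a tokenizer)."""
    if not text:
        return 0
    # Round up so that any non-empty text counts as at least one token.
    return math.ceil(len(text) / chars_per_token)


if __name__ == "__main__":
    print(estimate_tokens("A good rule of thumb is only a rule of thumb."))
```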

glass iris
naive light
#

with all outliers of len 400 and up removed (there are large piles of these with the same word count, presumably due to reaching the max message length), the average goes to 4.11404
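
A small sketch of this kind of calculation, assuming sentence length means whitespace-separated word count and that anything with 400 or more words is dropped as an outlier. The helper name and the sample sentences are made up; this is not the actual query behind the numbers above.

```python
from statistics import mean, median


def length_stats(sentences, max_len=400):
    """Mean and median word count, ignoring sentences of max_len words or more."""
    lengths = [len(s.split()) for s in sentences]
    kept = [n for n in lengths if n < max_len]
    if not kept:
        raise ValueError("no sentences left after removing outliers")
    return mean(kept), median(kept)


if __name__ == "__main__":
    sample = ["mi moku", "sina pona a", "toki", " ".join(["a"] * 500)]
    print(length_stats(sample))
```

A mean well above the median, as in the figures quoted earlier, usually points to a long tail of very long messages pulling the average up.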

serene acorn
#

was messing around with some ai and uhm (CW: slight inappropriate speech, text only)

jaunty spade
#

I appreciate the curvature of your buttocks too mr. random ai

jaunty spade
#

mi jeso mute

serene acorn
jaunty spade
#

sina wile pana e ⭐️ tawa mi la sina ken pali e ni.

#

mi tonsi Ase la mi sona ala mi ken jeso 👀

glass iris
hollow osprey
lucid nicheBOT
#

kijetesantakalu is from 2009. most toki pona corpim didn't exist in 2009
although ai seem to avoid the word for whatever reason

mega b ↩️

[Reply to:](#1162756469771472988 message) Oh right! I believe the corpus is a bit old which may attribute to the fact that animal isnt include…

dense nexus
polar orbit
glacial mason
#

corpot :P

polar orbit
#

which one is that

glacial mason
#

feminine plural, also Hebrew

polar orbit
#

ahh, right i shouldve remembered that

glacial mason
#

For some reason I thought you knew a lot of Hebrew haha

polar orbit
#

lol i dont

#

i do understand a lot of yiddish by virtue of speaking german and being able to read the hebrew script
but i dont know anything about hebrew

dense nexus
full narwhal
#

I discovered the toki pona language option for my keyboard and Im playing around with its prediction feature

waxen monolith
#

cool

#

where did you get that?

naive light
tawdry bone
#

ilo swiftkey

serene acorn
#

i am curious: what if we got AI to cover a toki pona song

dense nexus
#

not too difficult considering RVC exists! it allows you to record your own speaking in toki pona, and the model can replicate any specific voice of your choosing

serene acorn
#

you should try getting the ibm 7094 to speak in toki pona

serene acorn
#

making a voice rn, if anyone wants to donate a toki pona song i can test on please pana

glass iris
glass iris
serene acorn
#

i wanna know

#

could it generate a fully fledged song

serene acorn
glass iris
weak thorn
#

AI music fucking sucks

#

Most median ass sound

glass iris
#

AI text too. No one is arguing in this thread that it's good, we're just keeping an eye on what it can do.

#

But it will probably take the place of musicians recording shitty music for shitty reasons (especially advertising)

unique gulch
#

Would a model built specifically to work only on toki pona theoretically have an incredibly efficient tokenizer?
or should I say, tokinizer?

unique gulch
# weak thorn AI music fucking sucks

I found more luck if you make a segment of music yourself and then let Suno extend it, tends to give more interesting results
Especially if you put an interesting idea in there lol

weak thorn
#

I go insane wiþ þe 5 seconds þat plays on Spotify before I can swipe out of þe app

#

Besides þe Dutch ads þose fucking rip

glass iris
#

I think it's our future. But most of the music we hear is mediocre, and we tune it out. AI will just make it more cheaply

wet pewter
naive light
# unique gulch Would a built specifically to work only on toki pona theoretically have an incre...

https://github.com/gregdan3/sona-toki/blob/main/src/sonatoki/Tokenizers.py#L48

for word tokenizing, yes! for sentence tokenizing, it doesn't matter very much. but there are caveats
i haven't written the code to demonstrate it yet, but i'm pretty sure that also making it a somewhat better english tokenizer would make it a more accurate toki pona tokenizer. specifically, it should allow up to 1 punctuation character to appear in the middle of writing characters at any given time, particularly - and '. that way, you wouldn't have sentence breaks on don't or game-changing or such.


#

simply put: if you know you're going to get exclusively toki pona text, you can aggressively split on punctuation, since it will never appear intra-word.
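
A rough sketch of the two strategies discussed here, assuming Latin-script input. It is not the sona-toki implementation; the regexes and function names are illustrative only. The strict version splits on every non-letter, and the lenient version lets a single `'` or `-` sit between letters.

```python
import re

# Strict: a word is any run of letters; all punctuation splits words.
STRICT_WORD = re.compile(r"[^\W\d_]+")

# Lenient: letter runs may be joined by one apostrophe or hyphen at a time.
LENIENT_WORD = re.compile(r"[^\W\d_]+(?:['-][^\W\d_]+)*")


def tokenize_strict(text):
    """Safe when the input is known to be toki pona only."""
    return STRICT_WORD.findall(text)


def tokenize_lenient(text):
    """Keeps English words such as don't or game-changing in one piece."""
    return LENIENT_WORD.findall(text)


if __name__ == "__main__":
    print(tokenize_strict("mi moku, sina lape."))
    print(tokenize_strict("don't"))
    print(tokenize_lenient("don't be game-changing"))
```

With the strict version, `don't` breaks into two pieces, which is the kind of English mis-split the lenient pattern is meant to avoid.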

#

ideally you would keep consecutive names together as one token, but detecting these post-hoc is probably even easier than doing so in the tokenizer

unique gulch
#

Oh wow, this is super cool! I hadn't considered the consortium would want to know that sort of thing

naive light
#

More precisely, they're interested in our argument for what words to encode, and our argument will include how frequently words actually occur in the language

unique gulch
#

the Syllabic mode is especially cool
I was about to ask if this could be used to compile a full list of English words that match toki pona phonotactics but it looks like it's been done lol

naive light
#

Oh yeah that's been done many times
But my library has many more neat filter modes :P

unique gulch
naive light
# unique gulch Ohh makes sense Were you scraping the web to find instances of toki pona in the ...

Nah, that's beyond the scope of my project rn
I'm sure there are a lot of interesting instances of toki pona appearing "in the wild" as it were, but my focus is on the largest communities
Right now, I have essentially everything on Discord and Telegram, Reddit up to the end of 2023, and I am working on adding the old forums
I would like to add the Facebook groups as well, but Facebook is much harder to get data out of for a lot of reasons, both good and bad
I'd also like to get the old toki pona livejournal, VK, whatever seems reasonable on tumblr/twitter, toki.social, and the non-western groups like on kakaotalk

unique gulch
#

Yeah just scraping the web randomly seems like more than a one-person project lmao
I wish you the best of luck, I'd definitely be interested to see if there are any substantial differences in usage in non-western groups (especially among pu words)

naive light
#

unfortunately there isn't a good way for me to deliver that per-community information to be queried in my graph app

#

mostly on account of the limits of doing a free deployment of the database (it is a sqlite file on github pages that i do partial reads of)

tawdry bone
#

ilo kakaotalk li awen ala e toki pini a a