#poki Lapo: Toki Pona library
you can just merge whenever you want and as many times as you want
the default behavior deletes the branch you merge from but doesn't have to
so, do a little work, merge it, repeat
oh except ci i suppose, we'll get occasional ❌s in main
i tend to use it to verify before merging
empty nodes are valid in yaml
https://yaml.org/spec/1.2.2/#72-empty-nodes
wawa
how do you merge without deleting it
that's what I was wondering
rebasing i suppose?
waow
this is so cursed. i never knew this
so yaml functionally has 3 variations of null? whyyyyy
insert the trinity diagram
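For anyone else surprised by this: as far as I can tell from the YAML 1.2.2 core schema, a plain (unquoted) value resolves to null when it is empty, `~`, or one of `null` / `Null` / `NULL`. Below is a minimal sketch of that rule in plain Python. The helper name `is_yaml_null` is made up for illustration, and it only inspects an already-extracted scalar; it is not a YAML parser.

```python
import re

# Plain scalars that the YAML 1.2.2 core schema resolves to null:
# the empty string, "~", and the spellings null / Null / NULL.
_NULL_PATTERN = re.compile(r"^(?:null|Null|NULL|~|)$")


def is_yaml_null(plain_scalar: str) -> bool:
    """Return True if an unquoted scalar would resolve to null (hypothetical helper)."""
    return _NULL_PATTERN.match(plain_scalar.strip()) is not None
```

So `title:`, `title: ~` and `title: null` would all end up as the same missing value, which is worth keeping in mind when a validator complains about a field that looks filled in.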
poki pi jan amonsijato has an Unknown date despite being in the 2023-05 folders
that's a validation case i don't think we can fix tho
ill also note that our file naming convention is kinda fucked
transcribers are going with horribly long filenames
i was gonna ask about that too actually
what do you do if there are 2+ stories with the same name in the same month
which mildly defeats the purpose of the yyyy/mm/ prefix imo in readability
fuck you 1000 bytes inside the filename
windows 11 users found dead
filenames should theoretically be stable but rn i can't promise that
United States v. Article Consisting of 50,000 Cardboard Boxes More or Less, Each Containing One Pair of Clacker Balls, 413 F. Supp. 1281 (E.D. Wisc. 1976), is a 1976 United States District Court for the Eastern District of Wisconsin decision regarding a requested order from the United States government to seize and destroy a shipment of approxim...
except its every single file link for us
what the fuck even is this lmao
that's not even the longest Wikipedia article title smh
this article names a case with an even funnier name
United States v. Approximately 64,695 Pounds of Shark Fins (520 F.3d 976) is a 2008 decision of the United States Court of Appeals for the Ninth Circuit concerning civil forfeiture in admiralty law. Judge Stephen Reinhardt wrote for a three-judge panel that ordered that the shark fins be returned to their owners, reversing a decision by the Sout...
the combination of "approximately" with a number declared with 5 significant digits down to its ones place is transcendental
yep
sobbing
@carmine quarry o ni anu semer
no clue what to do with this
https://github.com/kulupu-lapo/poki/pull/18
sure if you want to
what about it
if I merge will it delete the branch :(
I did not ?
I asked for the opposite
unless I spoke toki pakala instead of toki . pona
alr
question:
if a doc is translated, do you set authors with the same list as translators, or are they intended to be mutually exclusive
personally i would not distinguish authors and translators but someone already started to
oh right for like books and stuff
idk
yeah it's a valid distinction
do you set authors with the same list as translators
wdym by that
normally authors is entirely required, as far as existing metadata is concerned
but if translators is set, sometimes authors is not
i translated the man car hook hand car door story and it has no authors listed
i am not the author but the authors field is required
so should authors field be made optional, or should translators be duplicated to authors
tbh i don't think it's accurate to duplicate them
so authors would have to become optional
well, more exactly, one of authors or translators must be defined
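A rough sketch of what that rule could look like in a validator, assuming the metadata has already been loaded into a dict and that both fields are lists of names. The function name `has_credit` is hypothetical and not taken from the actual schema code.

```python
def has_credit(metadata: dict) -> bool:
    """True if at least one of `authors` / `translators` is a non-empty list.

    Missing keys and null values are treated the same as an empty list.
    """
    authors = metadata.get("authors") or []
    translators = metadata.get("translators") or []
    return bool(authors) or bool(translators)
```

With this, a translated work that lists only `translators` would pass, and a file with neither field would still be rejected.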
@fading plover want perms in kulupu-lapo btw?
oh that would be epic yes
wtf is going on with the date in toki pi kon pona
it's in 2021/05 but dated 2021/12
but that 2nd date is only an updated date, not a publish date
btw pushed
enjoy your lipu
for a lot of very early toki pona for some reason
notes: originally in ucsur sitelen pona, but saved in plaintext for Lapo
note under my jan mun publication
neither the utala mun publication nor mine on mun.la uses UCSUR. it is in sitelen pona via a ligature font.
contributors need to be aware of the difference between UCSUR and ligature fonts
down from 8,000 lines of errors to 283 lines of errors
errors banished but some files have formatting changes that were not intended
this is how i find out my diff viewer hides certain kinds of changes
anyway pr opened about it
merged
Oookay so two good and two bad things related to lukin/nasin ilo
+The test setup worked and was online and healthy for about 48 hours, but I forgor to tell about it here
-The server just went down
-I cannot go check on it, because it is about 1350 km south from where i am right now and my kvm is hosted on the same computer lol
+I am gonna publish the code for them in the near future
awesome to know theres progress happening
@fading plover your merged pr fails ci btw lmao
bruh moment, aha'
oh all the failures are in new files and there are only 3 of them
I really should have written this in go lol
ERROR: Permission to kulupu-lapo/poki.git denied to gregdan3.
mwahahahhahahhhahaha I have infiltrated the systems
now I am kekan San, with all his perms
and he is... well
/j
[mu PAKALAA]
absurd discovery:
the frontmatter parsing library i'm using will automatically load keys named date as datetime.date if they are unquoted strings
i get this is mimicking normal yaml parsing where 1 is an int and "1" is a string, but respectfully, what the fuck do you mean
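One way a consumer could cope with this, sketched under the assumption that a `date` value may arrive either as a `datetime.date` (unquoted in the frontmatter) or as an ISO `YYYY-MM-DD` string (quoted). The helper name `as_date` is invented for this example.

```python
import datetime


def as_date(value) -> datetime.date:
    """Normalize a loaded `date` value to a datetime.date (hypothetical helper)."""
    # datetime.datetime is a subclass of datetime.date, so check it first.
    if isinstance(value, datetime.datetime):
        return value.date()
    if isinstance(value, datetime.date):
        return value
    if isinstance(value, str):
        # Expects a zero-padded ISO date such as "2020-12-06".
        return datetime.date.fromisoformat(value)
    raise TypeError(f"unsupported date value: {value!r}")
```

Either spelling then ends up as the same object downstream, regardless of how the file was written.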
ilo muni now supports poki lapo
if you commit frontmatter crime, i will find you.
maybe i should not count this
or this perhaps
I don't know, I think you should
the problem is that these blocks of text are fundamentally different from the rest of the data in ilo muni
first, these are long form media; almost everything else is conversational
second, these specific instances i'm pointing out are specific artistic expressions rather than more apt demonstrations of how language is being used
if i were to generate a graph with these datapoints included, you would see a huge spike in the counts for these words on specific months
and for the rest of the data, that would reasonably mean something happened that month which affected the way the community was using toki pona, like something new to talk about
but here, it doesn't- it reflects a single author's specific artistic expression
that form of misleading presentation is something i want to avoid, since ilo muni is otherwise not able to provide more context
in the case of the example from nasi, the 2nd, i'd liken this to counting every instance of "Click here to log in" once for every webpage it appeared on for the purpose of an english version of ilo muni, because they appeared in the user interface on every page of a website you downloaded
that's an example of language, but it isn't one that's useful to the end of understanding how the language is used- at least, not more than if you counted it a single time.
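To make the "count it a single time" idea concrete, here is a small sketch. It is not how ilo muni actually works; the function name `count_words` and the whitespace-and-case normalization are assumptions made only for this illustration.

```python
from collections import Counter


def count_words(texts, dedupe=True):
    """Count words across texts, optionally counting each identical text once."""
    seen = set()
    counts = Counter()
    for text in texts:
        # Normalize whitespace and case so trivially different copies match.
        key = " ".join(text.split()).lower()
        if dedupe and key in seen:
            continue  # same block of text as before: skip the repeat
        seen.add(key)
        counts.update(key.split())
    return counts
```

With `dedupe=True`, a line repeated on every page contributes once; with `dedupe=False` it contributes once per copy, which is what produces the misleading spike.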
i should upload my song translations to poki Lapo tbh
uuuuuuuuuugggggggggggggghhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh
I have to upload stuff
I don't understand any of þis
fixed invalid files on main
unfortunately this involves making uncertain predictions about who was the author of texts from 2003
especially with page history being suppressed, presumably due to dead names
@devs can you all make a little page for how to contribute,,,,, with links to guides
how to Git
@carmine quarry 🥺
doesn't @swift imp already have experience with git from committing to (sona) linku?
i agree people need to get tutorials but i don't have a good introduction to give
@jagged burrow i have committed lipu tenpo nanpa loje
if i did any metadata in a way thats inconsistent with yours feel free to change it
adding items to the library is simultaneously so much easier now that we have a lot, and still such a pain
ksldjfghsdfl i open 2024/08/ to see if anything is gonna clash with utala musi 2024
theres a fuckin decked out fanfic
this community is something else
plaintext/2024/08/kiwen-loje-en-kasi-tawa.md
poki Lapo: the plaintext library
texts that need to be put into poki Lapo:
fuck
unpa lapo??
kule
it is plaintext though
oh kala Asi means the color my bad
I have a README.md file for my project underscore-cli, and I want to document the --color flag.
Currently, the only way to do this is with a screenshot (which can be stored in the project repositor...
specifically ${\textsf{\color{lightgreen}Green}}$
we're not targeting githubs md renderer
it should render in many md renderers
in fact i suppose we're not targeting anything in particular for now, until soko Ni puts together a frontend
at that point we'll have to start going through every file and fixing anything that looks odd i suppose lmao
is toki musi pata one work or four works 
4 maybe,,,
as someone who reads only a very small percentage of the literature this community produces
i did not expect one of the utala musi 2024 submissions to be a @fading plover fanfiction?
i guess i should know better
💥
:(
its something to revisit when we have a frontend that we can actually test against
do you guys plan on making a frontend from scratch or use smt like hugo or jekyll or whatever
up to soko Ni
ask @nova gale
@shut flume utala (2024) has been added to the library
yappie
oh yes, it's SO good
lakuse sent it to me as soon as submissions went live
I actually have a working flat markdown cms and api
its just that the frontend is missing lol
what kind of markdown flavor
Idk if I need to do þe same þing and like if I do where do I put stuff I don’t understand
okay tldr. every work is a markdown file. every markdown file consists of metadata at the top, and text at the bottom. the template for metadata is written in the readme. if the work is part of a series of works (usually by different authors), make a collection file, which is basically a title and a list of md file paths. once youre done, make a pr, get a ✅, and we will merge
i recommend you start just by cloning the repo and looking at files
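For readers who want to see the "metadata at the top, text at the bottom" layout in action, here is a sketch that separates the two parts. It assumes the metadata block is delimited by `---` lines, as is usual for YAML frontmatter; the function name `split_frontmatter` is hypothetical, and parsing the metadata itself is left to a YAML library.

```python
def split_frontmatter(source: str):
    """Split a work file into (metadata_text, body_text).

    Returns ("", source) when no complete '---' delimited block is found.
    """
    lines = source.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", source
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            metadata = "\n".join(lines[1:index])
            body = "\n".join(lines[index + 1:])
            return metadata, body
    return "", source  # opening delimiter without a closing one
```

The README remains the reference for which metadata keys are expected; this only shows where the boundary between the two parts sits.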
What do you do wiþ images
Do you just transcribe it?
Also idk how to make a collection file
Oh wait maybe I do
we include a link to it but our primary purpose is archiving text
if @exotic ivy makes one for linku, we can reuse the tech for lapo
Why is þere no folder for 2025
no works created in 2025 have been uploaded yet
if a folder doesnt exist and it needs to exist, you make it
OKi
It can be UCSUR text, right?
okii
just like any regular text file
you need a text editor
textedit on macos and notepad on windows
And þat's it?
yep it’s just a text file with the md extension
How do I add þat
what’s special about it is just how the text is written
i’m not sure about the specifics (depends on your software) but in any case you should be able to save it as .txt then edit that extension
file extensions are simply a part of the file name that signifies what programmes should access this file. you can edit an extension in exactly the same way as you can edit the file name
i merged it
I saw
the only weird thing was that you called the folder 1 instead of 01 but like ye whatever i fixed that in a separate commit
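A tiny sketch of the folder convention, based on the paths that show up in this channel (for example `plaintext/2024/08/kiwen-loje-en-kasi-tawa.md`). The helper name `work_path` is made up; the point is only that the month is zero-padded to two digits.

```python
from pathlib import PurePosixPath


def work_path(year: int, month: int, slug: str) -> str:
    """Build a repository-relative path like plaintext/YYYY/MM/slug.md."""
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range: {month}")
    # :02d pads January to "01" rather than "1".
    path = PurePosixPath("plaintext") / f"{year:04d}" / f"{month:02d}" / f"{slug}.md"
    return str(path)
```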
btw you use the web editor to change files
i highly recommend learning the download (pull) -> edit locally -> upload (push) approach instead
alright
mi pali tawa utala la mi sona ala e ni → mi ken sitelen [pona] e pali mi · jan ante li pali tawa utala la ni ↑ · wawa ale
-# tenpo seme la sike nanpa WAMMMLLLTAMMMLLLW pi toki [pona] li kama,,
utala kama la mi o nasa wawa o pakala ale e poki [Lapo] :3
Incorrect - the story was written to function if no JS is activated
there are anchors to jump to sections, and a print version includes the section numbering instead
(It was my intention to make it a static PDF eventually, but I'm not satisfied with a format that wastes so much page space so far or breaks columns weirdly across pages)
if you want to combine those features with the Latin version, some slight CSS modification would be needed
the 3rd place winner that year might be trickier in comparison because it's an image
I was writing Lapo in my notes and rebus-ed this glyph up (left)
the first one is when minecraft bed emoji

i kinda much like the left one lol
ur a silly scribble
I am
I wonder, is there any way to format the files well
prettifier for the files :>
librarby #2
I wonder how to deal with this
https://drive.google.com/file/d/1csNDqOs_yRFZWo_tninZLiS0jf-QO8tb/edit
two of them
I found something similar
https://docs.google.com/document/d/19cxM0qKPf740SQUbqtBFC91mmaBRvvzjEmqcROA0jFI/edit?tab=t.0
“Ilo Supanpo o” - an attempt at toki pona collaborative romantic fiction This was written by many people, including: -jan Sijo -jan Ki, who is also the main love interest -jan Walo, who kickstarted the whole thing Kipisi nanpa wan: toki olin "ilo Supanpo o, mi olin e sina." ilo Supanpo li kute ...
how did you end up dealing with the colours?
@carmine quarry
sobbing
can someone that knows Japanese help with this one
https://soundcloud.com/kikiss-2/ma-sike-laso-kalama-mun?in=wj4278%2Fsets%2Ftoki-pona-music
Music : Tokyo Usagi Music
Lyric : Tokyo Usagi Music+kikiss...zzZ
This lyrics was written in toki pona & Japanese.
got my first title conflict, had to add the author name to one of them
sina pilin seme?
kulupu pi soweli nimi tokyo usagi music
kulupu pi soweli ken kalama
soweli moku loje kili suwi
mun li sike
mun li musi
mun li walo
mun li pimeja
wan tu wan tu wan
wan tu wan tu wan tu mute
lape pona
https://www.youtube.com/watch?v=wbGOV8hM3e0
Wouse MD
how am I supposed to do subtitles with Markdown
ken la o wan e nimi suli e nimi suli lili
# lipu Nasino Sijokapi: toki pi lipu...
a nasin Markdown la nimi suli mute li ken a . nasin YAML a la o wan
ni li sama alaaa
li poka a
pkoa ala
toki tu li ken ala la toki wan li poka pona
to be fair everything about this piece flies in the face of copyright
toki a. mi jan pi ilo nanpa la mi wile pona e poki Lapo la, mi toki tawa jan seme mi pali e seme
sina wile poki Lapo e lipu e toki la o pana pi nasin Git · sona lipu o lon open lipu o lon nasin YAML
sina wile pali ilo la o alasa e sona tan lawa Asi tan soko Ni pi pali ilo
mi wile pali ilo. mi ken toki tawa jan ni. sina pona
okay tldr if you want to help with the frontend please first check up with @nova gale whos done a lot of work already but hasnt made it public yet, i dont want you to accidentally duplicate the work
if you want to help with something else lemme know
Got it. Thanks!
thank u for helping in advance :)
we have collected a bunch of "good" publications that are well organised. we have not touched songs. at some point we will want to collect loose works, but this is probably in the future.
soko Ni has done work on the frontend and the api, but its not public yet. a lot of future expansion will likely come after that
mun Kekan San either wants to or has already hooked up Lapo to Muni, so we have a user
i have done so! i believe it's in the current release too, but uh, I forget lol
it certainly will be in the next if not already
we have not touched songs.
wrong
at some point we will want to collect loose works, but this is probably in the future.
I have
blasts @carmine quarry with my laser beam
good we have more progress than i am aware of
still, songs haven't progressed far enough to a point where we can say more of them are in
these two points, I have actually done this for obscure(r) places of the internet where I fear I will forget to come back to
and I just, throw them in with the rest, the toki pona library branch has a lot of these
i've collected some texts from personal websites and keep track of them here https://github.com/kulupu-lapo/poki/issues/17
UnusualEgg's stories in toki pona Source: https://ctrl-c.club/~unusualegg/tp_stories.html Collection file: https://github.com/kulupu-lapo/poki/blob/main/collections/unusualegg-stories-in-toki-p...
yayy
Info on lukin and nasin ilo:
- lukin is still pakala mute
- nasin
- will be published to Github <t:1740347940:R>
- will be online <t:1740866340:R>
- @stoic canyon will be joining the development ❤️
wawa pona
wawaaa
by the way, what is our consensus on genAI content? are genAI images ok? what about things that have been co-operatively written by genAI and human?
images, sure
text, only coop?
i think we should implement a computer readable system for flagging these (in accessibility-notes?)
that would be good too
tags can work well if you want to warn that [this article has AI images]
actually probably better than accessibility
same for other CWs
it's easier because they are more contained
search for tags, see if matches bad list, if so send warning to user
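That flow could be sketched roughly like this. The tag names and the function name `warnings_for` are invented for the example; nothing here reflects an agreed tag vocabulary.

```python
def warnings_for(tags, flagged):
    """Return the entry's tags that appear on the flagged list, sorted.

    Comparison ignores case and surrounding whitespace.
    """
    normalized = {tag.strip().lower() for tag in tags}
    flagged_normalized = {tag.strip().lower() for tag in flagged}
    return sorted(normalized & flagged_normalized)
```

A frontend would show a warning whenever the returned list is non-empty, and could name the matching tags in the message.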
it's kinda bad that most of the entries don't have tags set up I think..
I've tried to do so with some like adding what medium (song, poetry) and genre (e.g. electronic music)
on the other hand, tags are meant to be used for filtering and stuff
yea, but that would be unsemantic and kinda pain in the ass overall
a
we could make another field for that
what's the issue with semantic unsemantic again,,,,
i think tags are on the backburner, but we should start studying sites like ao3 for inspiration on doing them correctly
first priority is text preservation, second priority is text discoverability, yknow
@carmine quarry is ARR the default value if an entry has no license defined?
in the api probably just keep it a null or undefined value
in the frontend treat it like ARR, but maybe give it a comment that its presumed
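A sketch of that split between API and frontend, assuming the API passes a missing license through as `None`. The function name `license_for_display` and the returned tuple shape are hypothetical.

```python
def license_for_display(license_value):
    """Return (label, presumed) for a frontend, given the raw API value.

    A missing or blank license is shown as ARR and marked as presumed,
    so the UI can add a note that it was not stated by the author.
    """
    if license_value is None or not str(license_value).strip():
        return "ARR", True
    return str(license_value).strip(), False
```

Keeping the raw value null in the API means the presumption lives in one place and can be changed later without touching the data.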
i recently really started digging into a11y so if you would like a second pair of eyes on it for a11y concerns please lmk
Yippee :)
A metric fuck ton of schema fixes related to dates incoming...
oops...
please do not make your dates strings (for example date: "2020-12-6" or date: '2020-12-6' instead of just date: 2020-12-6)
in addition to ; and }, yes
we are slowly making the metadata more strict
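For contributors wondering what a stricter check might look like: a sketch, assuming the loader turns unquoted dates into `datetime.date` objects (as noted earlier in this channel), so a quoted date shows up as a plain string. The function name `date_errors` and the message wording are made up.

```python
import datetime


def date_errors(metadata: dict) -> list:
    """Return a list of human-readable problems with the `date` field."""
    value = metadata.get("date")
    if value is None:
        return ["date is missing"]
    if isinstance(value, str):
        return [f"date is a string ({value!r}); remove the quotes"]
    if not isinstance(value, datetime.date):
        return [f"date has unexpected type {type(value).__name__}"]
    return []
```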
Update:
- I sadly need to move the publishing to tomorrow, because i have school.
- I will likely get it running tomorrow too, so that's positive 👍
@nova gale bump
should this be split up or joined together..
http://tokipona.alinome.net/tra_gabecquer.eo.html
http://tokipona.alinome.net/tra_egaleano.eo.html
Several poems by the 19th-century Spanish poet Gustavo Adolfo Bécquer, from his work Rimas, with translations into Esperanto and Toki Pona.
Several texts from the book Memory of Fire I: Genesis by the Uruguayan writer Eduardo Galeano, with translations into Esperanto and Toki Pona.
use ur judgement imo
akesi o, mi wile ala māori!
@jagged burrow lipu penpo kala
mi sitelen jaki e pali ni wile lon lipu pi supa mi
suno li kama lon ma mi la mi pali wile
i think i did like one lipu tenpo? the previous one?
hmm its not very pleasing but until a further notice its better than nothing
o toki sin e ni: nimi Lapo li tan inimi seme pi toki pona
[lipu ale pona .]
nimi li suli ala tawa mi. jan li wile la ni li ken ijo ante
mi pali (lon tenpo suno ala...taso mi pali)
btw how do we feel about archiving just. random stuff from discord
i'd totally go and grab every poem anyone has ever shared in #pali-musi if that's something we wanna do
I mean
I have done that
I've grabbed stuff from #pana
I've grabbed stuff from people's blogs, even if it's a short message
I see no limit besides it being in toki pona and being public
it's already transcribed
I mean it as like, putting into the database, plus I mean this jokingly
see this
how do I note that a story has an unknown author
this may be either anonymous or lost to time
either in toki pona spaces or in real life (folk songs, the Bible, etc)
[author unknown]
wait does it need to be in toki pona?
it's because we are looking for machine-readable data
what do we do with blogs and stuff I wonder
https://web.archive.org/web/20220812202853/https://lipu.pona.la/
covers count here I think
yep this is good
but for example o lanpan (pan (nanpa wan)) has a large part of o lape lyrics but is its own thing
yeah
cover means merge
sample means split
o lape has the short and long versions, maybe that gets merged
big
lipu ni li awen lon ilo mi · wile la mi ken alasa jo e toki mama ona
a ken
kalama mama li lon ala nasin pana la ni o lon poka pi nasin pana anu seme
seme
esun [Nintendo] li pana ala e kalama ona lon nasin [CC]
me when the transcription projects requires you to transcribe, argh ..
https://www.youtube.com/watch?v=xQZkZXHfsoA
mi pali pi mute ala taso mi o lon e ona
is there a name for using git branches like this
I maybe should merge because there's so much in my PR
the "face" is too derpy imo
i tried something similar and im not feeling it either
do I split this into several files,,,,,, oh no
https://www.rxddit.com/r/tokipona/comments/cfpyny/haiku_challenge/
As the title suggests, anybody up for the challenge of making a Toki Pona haiku?
The rules are simple, it has to be 5/7/5, and it has to fully express its theme/meaning.
Edit: With English translations too please, for those of us not yet fluent.
opinion
the default answer is "dont bother, one file is fine"
if someone later on wants to put in the effort, theyre free to do so
oooh but authors are all different here
anyway as long as you get it documented thats already great
oki
💥
I'll keep this for later
another question is, can you discover when 'ma mi li pimeja' was written
Wikisource doesn't have it (yet)
maybe searching on the forums will lead to something
@fading plover
💥
we might want a full dump of the forum db while it still exists
-# I thought it already existed, that's why I bothered Kekan
i actually have a full dump of the public forum
that said, the forum is Supposed to remain up into the indefinite future
can you go into it to show whatever is this link
that link was a random one i knew existed so y'all could see the forum is down, it's not really relevant
a lol
@heady vigil
two different people made a "lets touch everything" commit, and they're not nicely one after the other
so youre gonna have problems
i recommend just dropping this branch and doing this again
argh
but like. cleanly and merge soon
rebasing with a bajillion conflicts is pain
you can try but i don't think youll like it
it's worth to tryyyyyyy
ye go for it
I wonder, is there any way to compare diffs between branches?
unsure
in terms of lessons to learn,
(a) i, as a maintainer, am not keeping a close eye on this project, and causing you to do extra work, so sorry
(b) when anyone makes a pr that touches a lot of things, its best to merge quick before someone else starts similar work that conflicts with yours
oh btw CC-BY-SA is not so much a typo as a disagreement about what standard we're following
it is a typo actually
if we adopted spdx this is what the license field should look like
grr
sidenote: your pfp reminds me of ijo Stella
that's because we used the same Picrew
seemingly did it
it seems like it worked after all
never doubt my masochism
wawa a
o mu e mi when youre done
after you merge i wanna throw some stuff on top
ye?
the heck is going on here
or how to fix it, as I tried to merge the branches together and it decided.... how about going one commit short
honestly i can't read this kind of graph
lol
i mostly just read diffs
imagine reading
the diff is all sorts of fucked and i vaguely know why but i can't give specific advice on why its like that
also
it wasn't your toki pona library pr that needed merge/rebase/redo from scratch work
it was the formatting one
also this is not a rebase
no
didn't
fucking shit shit shit
@heady vigil i propose the following
propose
- I figure out your toki pona library thing and bring the branch to a usable state
and merge it, to get us out of these troubles for a while
yeah
- you later have to redo the work done in "Formatting updates" likely from scratch
in that case, I would be banned from touching a keyboard again
im guessing what youve done in that branch is mostly a search and replace
yeah, it's not that much I hope
git is such a powerful tool but it sucks ass and shit and I hate it
@heady vigil https://github.com/kulupu-lapo/poki/pull/29
check the history of this one
i think ive got all youve done here?
yay
double check the contents of these files:
2021/02/waso.md
2020/11/kama-pi-poki-ala.md
they may or may not be what you want
most of it
these are missing
what branch are you looking at
a ale li pona
i was perplexed at +10k lines, but turns out 80% of that was legit
sina pali suli
kama pi poka ala is good
waso should be deleted in favour of waso-tanije and waso-likipi
you're free to delete it and commit
trur
okay and once you're happy with the branch:
then it's validation time
read through these errors
yep
breaking the sentence at this point, esp with a colon, is so tokiponapilled
some of them are not your fault - theyre on main
so ill go fix them on main, and push them to you again
lol
ni la o awen lili
can i just say i love this
while fixing errors is a chore, it is also a growing pain
the collection grows
@heady vigil main is clean now!
these are your actual problems to deal with
yippee
the way you say that 😭
bug me if anything is impossible to understand
but most error messages seem clear enough
oki!!
@sullen imp i just saw your lipu.pona.la pr. this one really blurs the line between a social media site and a creative works collection
idk what to do lol
@fading plover i want your opinion as well
https://github.com/kulupu-lapo/poki/pull/27/files
like this is a creative work sure
but surely this isn't
toki
- since the site no longer exists, i 100% want this included in ilo muni, and poki lapo is the easiest way for y'all to do that for me
- but poki lapo is not Just an archive of course; it's an archive of creative works
- i'm on board with pruning but somewhat nervous about, effectively, making a minimum definition for what counts as a creative work
- that's ultimately up to y'all though, but note i would prefer the most liberal definition for my purpose
- and that said, a lot of these seem more like introductions and conversations, not so much creative works; you'd prune a lot of them for a stronger definition of art
pretty much my thought as well. im leaning towards making an explicit exemption for all of lipu.pona.la, on the grounds of practicality (and archival)
mi ken awen e ale lon ma ante lon wile · mi sona ala e nasin pi pona ale
-# kin kulupu li wile la mi ken lon sin e lipu
-# mi [Kita] ante
pakala.
poki Lapo is an archive for a very specific thing but it may be integrated into a bigger archive for more works like so
there's a lot of media that can't be added like multimedia, PDFs, presentations, videos, social media posts, works in English about toki pona, so much more
i propose ijapo ,, ijo ale pona
your biggest problem will be finding someone with infinite storage
la ni li seme :p https://github.com/kulupu-lapo/poki/blob/a712a99b14c7ecfeae8ea69d317015e33d9c5f0e/plaintext/2021/12/toki-lili-lon-ma-kasi.md
god, obviously
pilin mi la poki wan li ken jo e ale pi toki [pona] lon tenpo lon
poki pi nanpa WAAAAAA
good news! the entire publicly available output of the whole community could comfortably fit within 12TB.
bad news! you do not have the rights to distribute almost any of it, and a fair amount of it exists on platforms that Will go after you for trying.
toki a! Does anyone happen to have the link to lukin Lapo? I'm ready to start working on it!
sina nanpa a e ni anu toki e pilin taso anu seme
@nova gale
mi nanpa e ni mute la mi ken sona e ni ale
pilin li lon li lili
I love breaking the law
@carmine quarry what do I do with unknown authors?
if you tried to research it and failed, put unknown and explain why you dont know as a comment
but we dont wanna make a habit out of not researching
are you implying that I don't do my research? /lh
also we need a way to input dates like YYYY and YYYY-MM
and also title and original-title being null makes the computer throw a fit
@fading plover what do you think: maybe we force every entry to have a date, but add an optional field for like "level of date uncertainty"?
a frontend would almost certainly want to have a full date object
huh, this is actually a much better solution than a bunch of different date fields
i would call it "date precision" and the options be "year" "month" "day"
then the frontend can omit segments of the date as needed
what would you call the level of precision when we arent certain about the year
given there isn't an obvious thing for the frontend to round to, i would not attempt to put a date on those
put them in the unknown category and leave a comment explaining what the date ranges are
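A sketch of the "date precision" idea from a frontend's point of view, assuming every entry stores a full date plus an optional precision of `"year"`, `"month"` or `"day"`. The function name `format_date` is hypothetical.

```python
import datetime


def format_date(date: datetime.date, precision: str = "day") -> str:
    """Render a full date, omitting the segments the precision does not cover."""
    if precision == "year":
        return f"{date.year:04d}"
    if precision == "month":
        return f"{date.year:04d}-{date.month:02d}"
    if precision == "day":
        return date.isoformat()
    raise ValueError(f"unknown precision: {precision!r}")
```

So an entry stored as 2020-08-01 with precision `"month"` would be displayed as `2020-08`, while sorting and filtering can still use the full date object.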
no, i just know my standards would slip
yeah we want all files to have a title
well
what if there is none
poetry sometimes has no defined title
some posts just start and have nothing definitive
for original-title (this can be resolved more simply by just removing it), some works, I can't know what they are about
2020/08/toki-insa.md is a translation of a letter in Esperanto and (1) this author died decades ago so I'm not sure where to find these archives and (2) I don't speak Esperanto
we have to have titles by definition of a file
if that requires selecting a title that the author themselves didnt create then yeah we have to do that
speaking of, im starting to suspect our file naming convention was not a good idea and we should have had short meaningless ids
because there are name conflicts, there are date changes, all of them will break links when we have users
i wont touch that rn tho
nasin [ISO] nanpa MMMMLWAW la ni tu o ken https://ijmacd.github.io/rfc3339-iso8601/
taso nasin pi nasin [YAML] li seme 🤔
-# ona o mu pi toki [pona],, · mi wile ala e lipu nanpa f3d971ab li wile e lipu nanpa Wanipetosiku,,
ken suli a
so fun fact 3-4 syllables is all we need for the foreseeable future
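A back-of-the-envelope sketch of the syllable-id idea. It assumes ids are built only from consonant-plus-vowel syllables (no bare vowels, no syllable-final n) and drops the combinations toki pona does not allow (ji, ti, wo, wu). The names `SYLLABLES` and `id_from_number` are invented; nothing like this has been agreed on.

```python
from itertools import product

CONSONANTS = "jklmnpstw"
VOWELS = "aeiou"
FORBIDDEN = {"ji", "ti", "wo", "wu"}

# 9 consonants x 5 vowels, minus the 4 forbidden pairs = 41 syllables.
SYLLABLES = [c + v for c, v in product(CONSONANTS, VOWELS) if c + v not in FORBIDDEN]


def id_from_number(number: int, length: int = 3) -> str:
    """Map an integer to a fixed-length syllable id (hypothetical scheme)."""
    if not 0 <= number < len(SYLLABLES) ** length:
        raise ValueError("number does not fit in the requested length")
    parts = []
    for _ in range(length):
        number, remainder = divmod(number, len(SYLLABLES))
        parts.append(SYLLABLES[remainder])
    return "".join(reversed(parts))
```

Even under this restricted syllable set, three syllables give 41³ = 68,921 distinct ids and four give 41⁴ = 2,825,761.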
mi pali la mi wile e nimi suli · wan la pali mute li lon la jan pali li ken lukin ala e nimi kepeken pi pali ante · tu la nimi mute li sama li "Lapo" li "Lepo" li "Wapo" la ike
taso sina wile ala ni la pona :3
hello! sry for not being online/meeting the dl i posted last week,,, i caught a quite severe case of corona and influenza b at the same time (avg my luck lol) and was stuck in an ICU for almost a week due to respiratory issues. anyways i'm better now and have almost caught up to my schoolwork... i think i will have nasin ilo published and running monday or tuesday
how about representing them as a range between two dates
like if we know only that the entry was made in 2020, we would have
date-start: 2020-01-01
date-end: 2020-12-31
would also be an easy way to handle stuff like reddit comments/chat logs/competition entries
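The range proposal could be derived mechanically from a partial date. A sketch, assuming input strings of the form `YYYY`, `YYYY-MM` or `YYYY-MM-DD`; the function name `date_range` is hypothetical and the two returned values correspond to the proposed `date-start` and `date-end` fields.

```python
import calendar
import datetime


def date_range(partial: str):
    """Expand a partial date into (start, end) covering the whole period."""
    parts = [int(piece) for piece in partial.split("-")]
    if len(parts) == 1:
        year = parts[0]
        return datetime.date(year, 1, 1), datetime.date(year, 12, 31)
    if len(parts) == 2:
        year, month = parts
        # monthrange returns (weekday of first day, number of days in month).
        last_day = calendar.monthrange(year, month)[1]
        return datetime.date(year, month, 1), datetime.date(year, month, last_day)
    if len(parts) == 3:
        day = datetime.date(*parts)
        return day, day
    raise ValueError(f"unsupported date format: {partial!r}")
```

A work known only to be from 2020 would then get `date-start: 2020-01-01` and `date-end: 2020-12-31`, matching the example above.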
oh that's nice
oh dear 🫂
pona ale o tawa sina
o kon pona
pona o tawa sina kin! ni la, mi pilin pona. ike nanpa wan mi li ni: pali pi kama sona li kama mute :D
ma [Reddit] la selo majuna la sina luka e tenpo la ona li pana e tenpo pi nanpa mute a · selo sin la mi sona ala
lipu utala la tenpo ni li wile anu seme → jan ale li kama ken lukin e lipu
selo sin la sama
heyyyy, shouldn't poki/plaintext/2003/05/ma-tomo-pape.md and poki/plaintext/2003/05/mama-pi-mi-mute.md be credited to the authors of the Bible and not Damian Yerrick?
sewi Jeli 🙏
aren't the authors of The Bible™️ unknown
idk but we have some documents with authors containing "Authors of The Bible" so i guess not?
I'm pretty sure all of the theories on the Bibles authors are just that, theories
they could be true and they could not be true
it's pretty clear stylometrically that it was not written by just one person
interesting
we could have the literal string authors of the Bible to at least GROUP these texts
otherwise have a tag for the Bible
i think "Authors of The Bible" is fine authors field
What it says on the tin. All of these are from Wikipedia's "Deleted articles with freaky titles" page, found here: https://en.wikipedia.org/wiki/Wikipedia:Deleted_articles_with_freaky_titles
Background music:
"Winner Winner!" Kevin MacLeod (incompetech.com)
Licensed under Creative Commons: By Attribution 3.0
http://creativecommons.org/licenses/...
@heady vigil @nova gale i suggest specifying the book a given translation is from, this way if its a book from the old testament we can avoid specifying whether its from a christian or jewish perspective
oh true
tho for these specifically they are likely written from a Christian perspective
does that really matter so much we need to make an exception just for one collection
writing "the authors of book x from collection y" instead of just unknown like we do for all of the other entries
i mean i really do not care that much,, would just be more consistent
hm im non religious and idk about yall but we'd benefit from a perspective of someone practicing
unknown usually means "hard to research", with abrahamic religions texts its more "theres been tons of research but its complicated"
sure, as i said, i do not have a lot to say about this, as i am neither religious nor have i done a lot of collecting texts
asked about it on the sona.pona.la server, will get back to you all if theres any useful feedback
I feel like the translators for lipu sewi pi toki pona should be credited
alllllrighty, no nasin today, so tomorrow it is. i am fully ready, except for a weird bug where the database drops the table containing all of the poki entries randomly and leaks all of the memory (yes in go; no, i don't know how). the bug is likely in the database, which i could likely debug in about two hours if i wanted to, but tbf i'd rather just go lape and do the debugging tomorrow.
okay so turns out the "database" i have been using is just a wrapper for a massive json document containing all of the entries stored in it...
waow
i have already committed to it so i'll just rewrite the "database" to use sqlite instead of json
probably will make it at least 100x faster too lol
incredible
what is the total word count of all lipu tenpo releases?
0001-nanpa-akesi: 4392
0002-nanpa-mun: 4859
0003-nanpa-soweli: 6453
0004-nanpa-kasi: 5959
0005-nanpa-pan: 6725
0006-nanpa-suno: 5846
0007-nanpa-kule: 6761
0008-nanpa-toki: 6072
0009-nanpa-moli: 6448
0010-nanpa-lete: 5580
0011-nanpa-walo: 5756
0012-nanpa-nimi: 6468
0012.5-nanpa-kijetesantakalu: 156
0013-nanpa-pipi: 5891
0014-nanpa-seli: 5852
0015-nanpa-moku: 5475
0016-nanpa-kulupu: 7370
0017-nanpa-musi: 5931
0018-nanpa-tu: 7438
0019-nanpa-mama: 5811
0020-nanpa-nasin: 6283
0021-nanpa-ma: 6098
0022-nanpa-sin: 6547
0023-nanpa-sewi: 5350
0024-nanpa-tenpo: 5626
0025-nanpa-kalama: 6321
0025.5-nanpa-lili: 315
0026-nanpa-jaki: 4991
0027-nanpa-linja: 5824
0028-nanpa-lawa: 6220
0029-nanpa-jan: 5772
0030-nanpa-loje: 4788
0031-nanpa-kala: 5770
Total: 185148
put together, lipu tenpo is the largest text produced in toki pona, exceeding nasin Lanpan by a factor of three
to, perhaps, no one's surprise
for disclosure, my working definition of a word here is "some letters surrounded by a gap" (\b), which is famously a flawed metric, but my hope is that lipu tenpo is sane enough to not have any xnopyts, aaaaaaajjjjjjjjjs, and hrrkrkrkrwpfrbrbrbrlablblblblblblwhitoo'aps
by "letters surrounded by a gap" do you mean /\b(\w+)\b/gm?
ya pretty much
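For reference, the naive metric discussed above can be sketched in Python like this. The name `naive_word_count` is made up for illustration; the regex is the one quoted in the question, minus the JavaScript flags.

```python
import re

# "Some letters surrounded by a gap": a run of word characters between word boundaries.
WORD = re.compile(r"\b\w+\b")


def naive_word_count(text: str) -> int:
    """Count every run of word characters as one word."""
    return len(WORD.findall(text))


print(naive_word_count("kijetesantakalu li pona"))  # 3
# A word hyphenated across a line break is counted as two words.
print(naive_word_count("kije-\ntesantakalu"))  # 2
```

The second call shows the known weakness: hyphenation at line ends inflates the count.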
I dunnooo... i think this is pretty close:
kijetesantaklu!
kijetesantaka-
lu kijetesantakalu
kijetesantakalu.
kijetesantaklu kije-
tesantaklu kijete-
santaklu kijetesan-
taklu? kijetesantaklu
kijetesantaklu kije-
tesantaklu!
-kijetesantaklu So-
natan
:D
you forgot nanpa ala, which has 0 words!
it may be excluded from lapo, i don't remember
largest works in lapo, if counted by this naive method:
name wc
196 nasin-Lanpan 72227
301 nasi 65316
434 jan-sitata 12997
188 lon-anpa-pi-sewi-walo-lon-sewi-pi-telo-suno 11146
191 mi-jan 9219
294 mi-en-waso-kaka 8511
608 tu-kuntu 7452
416 jan-mun-en-nasin-waso 7220
408 nasin-pi-kama-sona 6209
423 nasin-iso 6041
440 nasin-puta 5638
765 o-toki-e-ijo-pi-toki-pona-ala 5203
476 ni-li-nasin 5011
485 kalama-sin-suno-pi-toki-pona 4448
303 nasin-nasa-mupa 4398
209 waso-sona-Ukami-en-monsuta-pi-kiwen-pimeja 3835
464 ali-li-ale 3595
740 musi-pi-kala-ko 3465
413 jan-pi-alasa-kala-en-jan-olin-ona 3363
100 toki-10000000000-pi-nasin-limili 3247
total size (on main) right now: 610k words
nasin Lanpan gets an extra 10k words out of the ether but 🤷
@heady vigil @fading plover check this out, lapo word count per month
i know the two big fuckoff spikes are utala musi
but i wonder why the spikes in 2021-2022 were much smaller and much more spread out throughout the year
neat! collaborative and/or coordinated writing maybe?
also same graph but yearly
so the trend is slowly upwards (just like ilo Muni) but the distribution across the year is lowkey kinda concerning yknow? it feels much easier to ignore toki pona literature when its all august
@shut flume what do you think about this
i am concerned about the void that uta monsuta will leave
wow!
aa
-# what does this mean, is it ending..
the project's maintainer resigned so its up in the air
oh no
i trust someone to step up before long
I wonder how much of this is due to us including only a few published works from before 2020, and how much is due to the actual trend. we don't have much from before then
true..
but i wonder by how much
we remove hyphenated words
well, adopt the project
my capacity for work isn't great. i work on utala musi. right now only that is possible. in the future i may be able to do other work.
i've watched jan Stella's work on the 'Bobelarto' project in the Esperanto community
they work a lot. but they also say this:
for a long time they worked on many other things so that the Bobelarto project would be possible
the Bobelarto project is very big. jan Stella couldn't 'just start' it. first they had to build lots and lots of tools and processes and had to find collaborators.
i don't want to start a big project only for this to happen: my work can't keep going.
starting is little work.
keeping it going is big work.
ive split the word count into lipu tenpo and non lipu tenpo components
lipu tenpo is blue
just non-lipu tenpo content:
i think the 2021-2022 bump is mostly kalama sin and lipu kule
which makes sense cause thats what we have documented
@vestal herald whats the biggest bottleneck to making lipu tenpo monthly? writers? artists? various support jobs?
yea, writers and proofreaders, I think, right now?
sona pona
who'd have thought that the written work has a bottleneck due to writing
i'm working on StoryWeaver · should i mention the illustrators, and where
@heady vigil o · nimi mama pi pipelen kule la pilin pina li peme
nanpa wan
oh also, about "original work or translation" · the current approach is weird to me · the authors field can name the creator of the derived work or the creator of the original work · what do you all think of this approach
title: nanpa luka luka luka
authors:
  - jan Kita
parent-works:
  - relation: translation
    metadata:
      - title: Burger King Foot Lettuce
        authors:
          - Top15s
        sources:
          - https://youtu.be/9PWjqgM_CU8
  - relation: remake
    path: plaintext/2020/11/nanpa-je-ka.md
i want to write up my ideal approach tomorrow
# plaintext/2022/10/soweli-kisa.md
title: soweli Kisa
authors:
  - type: main author
    path: authors/tokiponists/jan-kita.yml
  - type: proofreader
    from: 2022-12-21
    to: 2022-12-21
    data:
      names:
        - waso Keli
        - waso kitty
  - type: proofreader
    from: 2023-06-19
    to: 2023-06-19
    data: { names: [jan Tepo] }
published-on: 2022-10-28
modified-on: 2023-06-19
tags:
  - genre/fairy-tale
  - cw/cat
parent-works:
  - relation: translation of
    data:
      title: Kisa the Cat
      authors:
        - type: main author
          path: authors/special/folklore.yml
        - type: adapter
          data:
            names: [Andrew Lang]
            links: [https://en.wikipedia.org/wiki/Andrew_Lang]
      license: CC-PDM-1.0
      sources:
        - http://www.mythfolklore.net/andrewlang/265.htm
  - relation: inspired by
    path: stubs/2020/04/akesi-seli-lili.md
license: CC0-1.0
# authors/tokiponists/jan-kita.yml
names:
  - jan Kita
  - poni sona
links:
  - https://hecko.my.to/
# tags/genre/fairy-tale.yml
names:
  - "fairy tale"
  - "musi usawi"
description: Folklore story typically featuring magic and/or mythical beings.
implies:
  - genre/folklore
# stubs/2020/04/akesi-seli-lili.yml
title: akesi seli lili
authors:
  - data: { names: [Fingtam Languages] }
published-on: 2020-04-23
exclusion-reason: commercially available
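One way the `implies` key in the tag file above could be consumed, sketched in Python. The function name `expand_tags` and the in-memory `IMPLIES` mapping are hypothetical; nothing here reads the actual tags/ files, and the proposal does not say whether implication is meant to be transitive (this sketch assumes it is).

```python
def expand_tags(tags, implies):
    """Return the given tags plus everything they imply, transitively."""
    seen = set()
    stack = list(tags)
    while stack:
        tag = stack.pop()
        if tag in seen:
            continue  # already expanded; also protects against cycles
        seen.add(tag)
        stack.extend(implies.get(tag, []))
    return seen


# Hypothetical in-memory form of the `implies` lists from tags/*.yml.
IMPLIES = {"genre/fairy-tale": ["genre/folklore"]}

print(sorted(expand_tags(["genre/fairy-tale", "cw/cat"], IMPLIES)))
# ['cw/cat', 'genre/fairy-tale', 'genre/folklore']
```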
,,having used this approach for a short while now, it has become slightly bad for me · i slip up and write name instead of names at times · also, when i accidentally put a - under the word data, bad extra nesting appears
waow
its really cool that you can do this but i don't think we as a project can reasonably hold ourselves to this high of a standard
for metadata collection
our first milestone is collecting everything thats feasible to collect, and introducing too much work on the metadata side will sidetrack that
whereas expanding metadata can be done as subsequent work
lon,,
suli la mi wile e ni
anu
title: nanpa luka luka luka
authors: [jan Kita]
translation-of:
  title: Burger King Foot Lettuce
  authors: [Top15s]
suli la ni li musi mi taso
mi wile toki e ni lon open li weka ike e toki
we really do not need the metadata of the related files tho,, as long as they are in toki pona (aka relevant to us) they will eventually end up in lapo -> we can just retrieve the data from there
also collections etc. would probably be easier to implement with more complex frontmatter, which also would make the likely incoming folder structure change easier
i can't follow most of the talk in this channel, so please tell me about the important things in this channel.
please elaborate on that. i think we've determined we cant treat collections the same way as tags because it loses collection order
we'll probably advertise more when we have a frontend!
right now its usable but not pretty
o awen pali. mi awen.
if you read all of lipu tenpo and listen to all of kalama sin, youre already ahead of most of the community and cover like 40% of our texts so far
and with utala.pona.la youll get to like 70%
pona a.
ni li wile ala wile https://www.youtube.com/playlist?list=PLLrnNZ0mYpipkRKKjyMOkDpbgl_iryh0y
-# [ilo penpo o lukin ala]
-# while expanding on this reform, thinking what else i might want in it, i realized i'm reinventing wikidata
wile
lol
let's just recreate Wikidata and Wikisource :33
@heady vigil did you find the author of the Ave Maria piece
you might have their english name, Tobias, but i've found a tp name, jan Topaja
wawa
i tried to check myself, but github is a maze to the uninitiated; where are works found? searched for? is there a plan for a frontend?
there's a front end in the works
pona
I mean we could do something like
poki-lipu:
  - nimi: foo
    nanpa-lon-poki: 3
i find this to be less good because this way you can't have collections of collections, and you also can't guarantee ahead of time that the indices are actually all taken with no gaps or overlaps
but yeah, depends how annoying the current implementation of collections/ is for you
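To illustrate the gaps-and-overlaps concern: with a per-file index such as `nanpa-lon-poki`, the check can only run after every file's index has been collected. A rough Python sketch, assuming positions are meant to run 1, 2, 3, ... (the function name `check_indices` is made up):

```python
def check_indices(indices):
    """Return (missing, duplicated) positions for a collection numbered from 1."""
    seen = set()
    duplicated = set()
    for i in indices:
        if i in seen:
            duplicated.add(i)  # two works claim the same position
        seen.add(i)
    top = max(seen, default=0)
    missing = sorted(set(range(1, top + 1)) - seen)  # positions nobody claims
    return missing, sorted(duplicated)


print(check_indices([1, 2, 2, 5]))  # ([3, 4], [2])
```

An ordered list in a single collection file avoids this class of error by construction, since list positions cannot have gaps or overlaps.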
merged @swift imp's ongezellig pr
@vague coral turn your draft prs into open prs when you want me to review them
@heady vigil ill work on remaining errors in your toki pona library pr
-# ike la wile pali mi li moli lon pana open :p
@carmine quarry sorry just now i remembered
yeah 
../plaintext/unknown-year/unknown-month/tenpo-pi-seli-lili.md
../plaintext/2021/12/tenpo-suno-kama.md
../plaintext/2021/12/tenpo-pona-pimeja.md
../plaintext/2021/12/sike-pini-en-sike-sin.md
../plaintext/2021/12/pona-o-lon-tenpo-sewi.md
../plaintext/2021/12/o-awen-e-tenpo-pini.md
../plaintext/2021/12/jan-pana-li-kama.md
../plaintext/2021/12/jan-lawa-pona.md
../plaintext/2021/12/ilo-kalama-li-mu.md
../plaintext/2021/08/pona-o-tawa-kulupu.md
../plaintext/2021/08/o-weka-tan-ona.md
../plaintext/2021/08/o-tawa-e-mi.md
../plaintext/2021/08/mi-tawa-ma-mi-pona.md
../plaintext/2021/08/ma-pona-mawi.md
../plaintext/2021/08/ale-o-pali.md
../plaintext/2021/03/telo-supa-li-supa.md
../plaintext/2021/03/kasi-la-waso-lili.md
../plaintext/2021/03/jan-esun-li-kama.md
../plaintext/2020/12/kasi-pi-tenpo-lete-o.md
@heady vigil these need authors
what did we decide about unknown authors. i forgot
so did I
sdflkjgsd
what are all these even
ldfkjsdf
you know what im starting to suspect
our metadata is poorly done for translations:
I believe so
i think we'd actually want to put the translator in the main author field
and make the original author the optional field
at least, in most cases thatd be more relevant, imo
huh
cause like, we always have to know the translator

as the person who actually put the toki pona words on paper
not always
anonymous
and deleted
shmanonymous whatever its like 4 works total maybe
lol
whereas the "original author" has soooooo many issues
idk
and you wouldn't want to search by it by default probably anyway??
@fading plover @vague coral @vestal herald what do yall think?
tl;dr does it perhaps make more sense to use the main (obligatory) author field for translators, rather than original authors, in situations where its a translation?
it does make sense
you're collecting things in toki pona, so the main focus would probably be documenting the toki pona person working on it
maybe?
ill wait for more opinions. also i won't change it in this particular pr, because its a big change and should be done in a window when everythings merged
also we already have
title:
original-title:
so why not
authors:
original-authors:
ni a
wawa namako la
title:
authors:
sources:
original:
  title:
  authors:
  sources:
I'm on board with this change
while we're talking about this,
authors or by?
@heady vigil can you fill out the (original) authors in whatever way you prefer?
i tried getting to it myself and i immediately ran into "research takes forever"
good idea
i think this is really good
authors
what do you mean research!!! I did mine already
yeah exactly. youve done the research so please put down some text strings for the remaining articles. i don't care too much how its phrased, "of folk origins" or whatever, just something that i don't have to research again and duplicate your work
awawwa
ok wait so like
the remaining articles are just these ones?
that you published?
.
I got scared for a second implying that you meant the entire poki
yea
after that, we only got like ~2-4 errors
and then we can merge
ok!!
I will see
oh and also actually
I have tags for these
so it's easier to clean up
awesome
I'll get that done
can I just replace the null values with other ones and you correct these to use the correct labels?
@carmine quarry pushed
@heady vigil i have changed the 2009 dates to 2009-01-01, with a comment that they must use a date-precision field when we add one
assuming everyones happy with that solution
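A possible shape for the date-precision idea, sketched in Python. The field name `date-precision` comes from the message above; the values `year`, `month`, `day` and the function name `normalize_date` are assumptions.

```python
from datetime import date


def normalize_date(partial: str) -> tuple[date, str]:
    """Pad 'yyyy' or 'yyyy-mm' to a full date and record what was really known."""
    parts = [int(p) for p in partial.split("-")]
    precision = {1: "year", 2: "month", 3: "day"}[len(parts)]
    # Missing month and day default to 1, as in the 2009 -> 2009-01-01 change.
    year, month, day = (parts + [1, 1])[:3]
    return date(year, month, day), precision


print(normalize_date("2009"))  # (datetime.date(2009, 1, 1), 'year')
```

Storing the precision next to the padded date lets a frontend show "2009" rather than a misleading "2009-01-01".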
ready to merge? no more changes you wanna make @heady vigil ?
pona
@heady vigil so your formatting updates pr is gonna be closed, as we discussed before
we can repeat its changes again later, when they apply to everything
@vague coral i can take over your prs if you no longer want to work on them - what do you think?
mi ken a pali pi ma [StoryWeaver] · ni li lili
ma [lipu.pona.la] la mi sona ala · (taso sina pali la o awen ala e ale tan ni taso → ale o lon linjuwi · mi ken pana lon ma ante)
jan [Lentan] la o jo
@heady vigil good news! i have bumped the song pr
yay
its now mergeable if you're done with it, or workable if you want to work more on it
please sanity check kulupu jan tenpo, sike tu, toki pona li toki pona - those had rebase conflicts
i think maybe you included them in the toki pona library
(its unfortunate that youve done duplicate work! sorry)
@heady vigil would you be willing to help me write more documentation on our progress
more documentation like what
we've merged the TP library but it's got no corresponding github issue
it would be good to write about how complete our coverage of it is
and if there are any potential issues with any of it into the future
for this summary!
@nova gale anything i can do to help you get the frontend/api out?
there is
@heady vigil should i reassign song collection work to you?
i think @unkempt oriole had considered working on it but didn't?
also does the toki pona library exist as a collection? maybe it should?
whats the file name? can't find it
a collection would serve as a good way to keep track of which works have been saved and which haven't (and are thus commented out)
you also do that in the issue, but im not sure how up to date it is
it is up to date
wawa
because I have to use it to keep track of my work, so it is
i should at some point go through every file and familiarise myself with them
but i feel like thatll be easier done when we have a frontend
I believe we did say we would, quite a while ago. I would still love to (it's somewhere deep in our to-do lists), but we've been really busy and probably will be for a while, so feel free to reassign
Okay, i have ingested 2 grams of caffeine, so tomorrow, we will either have a dead mushroom or a working prototype of ilo and lukin
let's hope for the latter
im free-ish in the coming months so i might start helping with code. when its public, ofc
Sure! The API seems okay-ish after i ditched the db,, i'll do a bit more perf testing, write a quick and dirty demo frontend and fix my homelab's nginx setup
should be fine
2 whole grams
isn't 400mg the recommended limit or something
soko li moli
okay we are up 🎉
i just need to wait for my dns records to update
ping me when you release code
btw does anyone know why there is an empty entry at 2022/08/ilo li wile e soweli.md?
true
i'll just clean it up a bit
search with hyphens instead of whitespace too, it might be that sort of mistake
yea perhaps,, i have not implemented collections yet so i dont know
yep thats fine
pona a
forgor
okay dns records are always slow but this is like slow
ike suli
ona o lon seme
aaaalright,, the DNS managed to update 👍
ahh shit
a new problem
the browser is unhappy about CORS
damn
damn damn damn
hotfix time
upd: yeah, im pretty sure it will be beneficial for us to treat translators as "authors" and authors as "original-authors"
im currently doing some datavis on lipu tenpo and the list of authors is quite silly
as much as we all love Sappho she probably shouldn't be appearing in the list of authors
it is admittedly a quick fix on my end but still
and giacomo leopardi
I think we should make an exception and let Sappho be on the list
wow! i can't believe Леся Українка (1871-1913) contributed to lipu tenpo
sobbing
We have only 14 ARR licensed entries lol
if we include blanks, we have 134
(out of the 686)
Also, here is a chart i made of entries in poki by date
its quite funny how linear our growth has been, yea
you could verify that by taking a derivative
i think the graph, as it stands rn, is slowing down not speeding up
but that has more to do with what we choose to collect, rather than a genuine reduction in community size
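"Taking a derivative" of a cumulative chart like this amounts to looking at first differences: if growth is linear they stay roughly constant, and if it is slowing down they shrink. A tiny Python sketch with made-up numbers (not real poki counts):

```python
def first_differences(cumulative):
    """Discrete derivative: how many entries were added in each period."""
    return [b - a for a, b in zip(cumulative, cumulative[1:])]


# Made-up cumulative entry counts per period, for illustration only.
totals = [10, 22, 33, 45, 56]
print(first_differences(totals))  # [12, 11, 12, 11]
```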
ehhh,, i am not really sure, this represents the main branch and afaik more new texts are in wip branches
also this is quite spiky lol
or actually more like
that spike on 11.11.2020 is weird
ooooooh its utala musi
yeah im pretty sure the vast vast majority of texts we have right now are lipu tenpo + utala musi
you can see any month that didn't have lipu tenpo or utala musi is at 1-2 texts
i am not sure but i think that might just mirror the reality
once we include song translations, for instance, i feel like this will shift quite a bit
@vague coral please change month directory names from 1, 2, 3, ... to 01, 02, 03, ...
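If the rename is done by script rather than by hand, something like the following Python sketch would work, assuming a `<root>/<year>/<month>` layout as in the paths quoted earlier. The function name `pad_month_dirs` is made up, and the sketch does not handle the case where both `3` and `03` already exist.

```python
from pathlib import Path


def pad_month_dirs(root: Path) -> None:
    """Rename root/<year>/<m> to root/<year>/<mm> for single-digit months."""
    for year_dir in root.iterdir():
        if not year_dir.is_dir():
            continue
        # Snapshot the listing first, since renaming changes the directory.
        for month_dir in list(year_dir.iterdir()):
            name = month_dir.name
            if month_dir.is_dir() and name.isdigit() and len(name) == 1:
                month_dir.rename(year_dir / name.zfill(2))
```

Non-numeric directories such as `unknown-month` are left alone. In a git checkout, `git mv` would be the safer way to apply the same renames so history follows the files.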
li mu e @carmine quarry
decide where (if?) to credit the illustrators
seems to be an open item
write them as a comment inside the frontmatter anu seme
future reviewers may then uncomment them if theres somewhere to put them
a
# illustrators:
#   - mu
i have a suspicion illustrators, proofreaders, etc shouldnt create tons of new metadata fields but instead use some sort of "contributor + contribution tag" system
fair ig
cause this feels like the sort of thing thatll keep happening
with some other ways to contribute
it is
i could be wrong tho ofc
i love it when a text library keeps track of sound designers
i get what you mean tho
you can say the same about illustrations
mi lukin ala li ni
authors:
- waso I
# - waso U (illustrator)
written sources have illustrations
wile · taso seme la ni li ken suli ala e pali sitelen
contributors:
  - name: waso U
    roles: [illustrator, proofreader]
is this too big or not
fuck idk
-# KDL miiight be good for this · but i am not advocating for switching to it
the benefit of yaml is that yaml-frontmatter is standard and supported by a relatively wide array of tools
sona a
we can probably afford to grow more "custom" only once the basic usecase is well polished
is this bad or not → if the "roles" field is absent, it's the same as roles: [main author]
ken
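The default-role idea could look roughly like this in Python. This is only a sketch: the `contributors` shape is the one floated above, and accepting a bare name string as shorthand is an extra assumption.

```python
def normalize_contributor(entry):
    """Accept a bare name or a mapping; fill in the default role when absent."""
    if isinstance(entry, str):
        entry = {"name": entry}
    # A missing or empty `roles` is treated as roles: [main author].
    roles = entry.get("roles") or ["main author"]
    return {"name": entry["name"], "roles": list(roles)}


print(normalize_contributor("waso I"))
# {'name': 'waso I', 'roles': ['main author']}
print(normalize_contributor({"name": "waso U", "roles": ["illustrator", "proofreader"]}))
# {'name': 'waso U', 'roles': ['illustrator', 'proofreader']}
```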
btw have we fixed the original-authors thing yet
no
leaving it for later
- we have open PRs rn
it sucks to do schema changes when some people are relying on the old one
okk
suno ni la mi o pana e [sona sin] pi jan [Alonola] · mi ni ala la o anpa e mi
pona
most episodes have public google docs and i copied those largely verbatim, though i did remove final spaces (and maybe some doubled spaces? unsure)
for episodes without public docs i reformatted th...
mi pan a
mi la ken
@nova gale could you remind me what DNS address you were putting the first prototypes of lukin Lapo up on?
@heady vigil
YAYYY
locally
okay so first impression is
- remember when i told you about wanting to force all dates to be yyyy-mm-dd and then adding a "how precise is the date" field
yea we're 140% doing that
lol
- a lot of images are likely broken so i imagine what will follow is a sweep of all poki files
i had to delete a file because it had an svg import
just to test the front page
which doesn't even show you the articles
also silly time zone
the articles load!
centered
images are apparently not completely fucked!
centered was my bad, fixed that
you know what for a first try its not as broken as i thought itd be
yay
they are on lapo.dy.fi,, not up rn as im installing a new main server on my homelab
