#poki Lapo: Toki Pona library

jagged burrow
#

pona a

nova gale
#

i have now started the process of moving the repo over

carmine quarry
#

youll probably have to redeploy the ghpages

nova gale
#

i do not have ghpages for it and i will probably need to host it on my personal server since the searching is done in the backend

carmine quarry
#

ah cool

nova gale
#

i need the github api to get all the pages and the unauthed request limit is only 60 per hour

nova gale
carmine quarry
nova gale
#

that is

#

but i need to get the file tree

carmine quarry
#

just have an autogenned file tree

nova gale
#

true

#

like a manifest
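
A minimal sketch of the "autogenned file tree" idea, assuming the manifest is a JSON list of relative paths regenerated on each commit. The file name `manifest.json` and both function names are hypothetical, not something the repo is known to use.

```python
import json
import os


def build_manifest(root):
    """Return a sorted list of file paths under root, relative to root, with '/' separators."""
    paths = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip git internals so the manifest only lists content files.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            paths.append(rel.replace(os.sep, "/"))
    return sorted(paths)


def write_manifest(root, out_name="manifest.json"):
    """Write the manifest into root; out_name is a hypothetical file name."""
    # Leave the manifest itself out of the listing so reruns are stable.
    manifest = [p for p in build_manifest(root) if p != out_name]
    with open(os.path.join(root, out_name), "w", encoding="utf-8") as fh:
        json.dump(manifest, fh, ensure_ascii=False, indent=2)
    return manifest
```

A frontend could then fetch this one file instead of walking the tree through the API, which sidesteps the unauthenticated rate limit for listing purposes.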

carmine quarry
#

speaking of code in poki, we will need to add schema verification eventually

nova gale
#

good we have github actions

carmine quarry
#

ye

nova gale
#

i still likely will have the searching done on the backend cause i have a burning hate towards javascript haha

carmine quarry
#

lol

nova gale
#

that's why my stack is flask + htmx

carmine quarry
#

whatever works for you, its great to have someone interested in making a frontend for this in the first place

nova gale
#

it's just funi to call myself full stack dev and then tell people i do not javascript

carmine quarry
#

in terms of interactivity i think likely its gonna be quite similar to linku frontends - just content pages + a whole bunch of filters / sorters

nova gale
#

that's what i thought too

#

plus a whole lot of optimisations

#

it's not cheap to download 100s of files and look through them all

carmine quarry
#

would it make more sense to just have a local copy of the repo?

nova gale
#

i download files by need and then cache them

carmine quarry
#

hmm when you do any sort of search itll have to loop over every file in the repo anyway

#

at least, as of now, until we have any sort of built in metadata dump

nova gale
#

the algorithm is pretty much done already

carmine quarry
#

@heady vigil please continue work not from jnpoJuwan/poki-lapo:Songs but from kulupu-lapo/poki:songs

#

ive merged your progress into there

heady vigil
#

ye

#

MI KASI

nova gale
#

njan Njuwan, what does the word "njan" mean?

#

mi kin kasi :c

steady crestBOT
#
jan
usage

core (pu)

definition

human being, person, somebody

#
nja
usage

obscure (no book)

definition

meow, cat sound

see also

mu, jonke

carmine quarry
#

it's funny

nova gale
#

ah! it's very funny, true

fading plover
#

i want to know: why isn't this part of the Linku organisation?
Linku could make this bigger and put it in front of many people, because many people know Linku

carmine quarry
fading plover
#

but you could expand the goals of Linku
i'm hanging around here right now because of my work on word data

#

this is not the current approach, but a possible one:

  1. Linku wants to provide a good dictionary
  2. (currently) Linku wants to know how many people use each word (right now)
  3. (in my view) Linku wants to know how frequent each word is compared to other words
  4. Linku wants to preserve all texts (this improves the data for all of the above)
nova gale
#

i do not understand why using data from Lapo would result in a need to put the two projects under the same organisation. also, a single point of failure in all toki pona web services seems like a bad idea mute!

fading plover
# nova gale i do not understand why using data from Lapo would result in a need to put the t...

this is something i discussed with kala asi in DMs, but

  • being under the same org is for visibility and long term survivability of the project
  • being under the same org does not impose any specific requirements, tools, or designs on the project
  • being under the same org does not have anything to do with where the projects are deployed to or what infrastructure they use or how you access them
  • using the data from this project has nothing to do with how/why i am suggesting it be part of linku
  • (reiterating, but) it's not that using the data from this project means putting them in the same org- it's that the purpose of linku can, without issue, expand the very small amount necessary for this project to fit
  • since neither linku nor this project is paying for their infrastructure, there is no reasonable way to even produce a single point of failure that is any smaller than GitHub, The Largest And Most Important Software Development Platform That Exists, Which Millions Of Companies Depend On The Continued Existence Of

the suggestion is essentially "hey we have this brand that people recognize, and this project is really important. would the brand help the project?"
and i feel confident the answer is yes

nova gale
#

i just wanna say that while i am not necessarily against the merging, and probably haven't been contributing long enough for my word to matter much, i do not see big advantages in moving Lapo to Linku. by the "single point of failure" i did not necessarily mean a monetary one but one closer to operations and leadership. i am in no way suggesting that the lead of Linku is unprofessional or can't do their job, but if something were to happen, it is better in my opinion that different toki pona projects are not tied to each other. this is not a big concern, but i think it's worth bringing to the table

unkempt oriole
#

the leads of linku and lapo are the same person; kala asi leads both

heady vigil
#

for me, I like to split them because of the different goals and also one is technically copyright infringement and aa

carmine quarry
#

to tl;dr our dm convo, bullet points:

  • i don't mind either outcome in principle
  • i don't think the brand name itself is necessarily a help; rather, our ability to crosspromote whenever we feel like it
  • we might think about cute domain names whenever thats relevant
carmine quarry
unkempt oriole
#

i agree

unkempt oriole
carmine quarry
#

poki Lapo - WIP Toki Pona library

nova gale
# unkempt oriole i'm just saying that it sort of invalidates this point

not necessarily, as linku is a bigger project with more contributors. the fact that kala Asi leads both does not mean that the values/goals align or that any future leader of either project will not want to change anything (not like that's a huge problem in such small projects/communities)

carmine quarry
nova gale
#

oh by the way, is it OK for me to make a webhook to the repo?

#

i'll use it to issue a re-pull of the repo on commit

carmine quarry
#

sure? ive never seen webhooks be used between repos ngl

nova gale
#

i'm basically gonna use it to reload the local copy of poki Lapo repository on the computer running the backend when a new commit is made

#

new commit is made -> ping through the webhook -> listener pulls the updated poki Lapo
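
The commit -> webhook -> pull flow could be sketched roughly like this. It only covers the parts that do not depend on a web framework: checking GitHub's `X-Hub-Signature-256` header, checking that the push was to `main`, and building the `git pull` command. The function names and the idea of pulling with `--ff-only` are assumptions for illustration; how the listener is actually wired into the backend is not stated in the thread.

```python
import hashlib
import hmac
from typing import Optional


def signature_is_valid(secret: bytes, body: bytes, header_value: Optional[str]) -> bool:
    """Check the X-Hub-Signature-256 header GitHub sends when a webhook secret is set."""
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected.encode(), (header_value or "").encode())


def is_main_push(payload: dict) -> bool:
    """Push payloads carry the updated ref; only main is treated as production here."""
    return payload.get("ref") == "refs/heads/main"


def pull_command(repo_dir: str) -> list:
    """Command for the listener to run (e.g. via subprocess.run) to update the local copy."""
    return ["git", "-C", repo_dir, "pull", "--ff-only"]
```

The listener would read the raw request body, call `signature_is_valid` before parsing it, and run `pull_command` only when `is_main_push` is true.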

carmine quarry
#

ye sure

nova gale
#

oh and why are there so many branches of the repo?

heady vigil
#

pali li suli,,

nova gale
#

why aren't all of them on the main branch?

carmine quarry
#

- theres one branch per contributor

#

- we'll merge as work in those respective branches is wrapped up or terminated

nova gale
#

so a feature for searching the other branches isn't needed?

carmine quarry
#

no ofc not, main is production

nova gale
#

ookay

#

wait i cannot add the webhook

#

i need to be a contributor

carmine quarry
#

lemme fix that

nova gale
#

thx

carmine quarry
#

done

heady vigil
#

how do I work then

#

when I want to pull it

carmine quarry
# heady vigil how do I work then
  1. clone kulupu-lapo
  2. change your local branch to songs
  3. make sure no progress has been lost, just in case
  4. commit stuff to it and push
  5. when youre satisfied with the work youve done, remove the "draft" note from the PR and ping me and ill check the files and merge them
  6. the branch will be deleted on merge, create a new one and keep working
#

actually maybe deleting and making a new branch isn't necessary, but i don't remember shit

heady vigil
#

awawawawa

carmine quarry
#

its basically the exact same process, just with branches of the same repo instead of a branch of a forked repo

#

onboard work vs third party contributions

nova gale
#

i still cannot see the tab for webhooks...

#

it looks like this:

carmine quarry
nova gale
#

ookay

carmine quarry
#

thought webhooks would be under maintain

nova gale
#

now it does haha

unkempt oriole
nova gale
#

i do not think you understood my kon

#

my problem, and the thing i meant with "single point of failure", is that a monolithic organisation with many different goals and ideas colliding in it could result in leadership problems down the line

#

actually the same thing that has been happening here in mpptp

kind mesaBOT
#

sona a

jan Niku ↩️

[Reply to:](#1252224729977327647 message) my problem, and the thing i meant with "single point of failiure" is the monolithic organisation wit…

narrow latch
#

I think this is a useful project! my main initial thought: It would be better to store metadata only, and delegate the storing of content to archive.org, archive.is and the like. This puts you in a better spot regarding copyright and ethics. And it avoids the PITA of having to edit git history to remove some contested piece.

I realize that you're several weeks into the project as I write this... so it's absolutely okay to disregard this. But consider the advantage of doing one thing only, and delegating the peripheral aspects to other systems.

carmine quarry
#

that is a good point

#

@fading plover what do you think

fading plover
heady vigil
#

damn

fading plover
#

here's my reasoning

#
  • if we're putting their content on archive.org or archive.is or such, we're violating their copyright anyway
  • editing git history to remove a file that would be edited twice tops is not that hard
  • having everything as consolidated as possible is a valuable goal and worth preserving for as many inputs as we can have it for
nova gale
narrow latch
#

hmm... it might work. Google Books got away with a similar approach, although it did draw criticism.

It still feels a little wrong to me to take people's texts, convert them to markdown etc, without their consent.

Somehow, all of the following feel less bad: telling archive.org to make a copy (at least it's a somewhat faithful copy). Writing a script that automates the conversion and only storing the script (at least it's clear what processing is being done).

Anyway, maybe I'm overthinking this.

heady vigil
#

maybe

#

I love legally and morally dubious acts

narrow latch
#

Do you know https://github.com/google/corpuscrawler ?
It's an example of the "we store links and conversion scripts" approach.

Anyway I'll stop bothering you ;-) Keep up the good work. If you want to include my own little text (https://blog.purpureus.net/posts/monsuta-li-moli-e-jan-500m-li-pini/), I can send you a PR.

carmine quarry
#

@heady vigil can you double check the license on apeja li mi? its not an original song by jan Usawi, the original is by jan Lija

heady vigil
#

a

#

heck .

#

where is the original idk

#

nvm

carmine quarry
heady vigil
#

aa

#

there are some of them

nova gale
#

after we get to a stable and production ready version of lukin Lapo i could start looking for the authors owning the ARR licensed works and ask them for a license to archive their work

kind mesaBOT
#

@breakhon is the correct one iirc

njan Njuwan ↩️

[Reply to:](#1252224729977327647 message) there are some of them 📎

nova gale
#

just fixed a bug in a package that is used by 700+ projects. at least lukin has contributed something positive to the world lol

heady vigil
#

WAWAA

nova gale
#

it was kinda minor but still 🤷

dim spire
#

maybe you stopped the next xz-utils

raven parcel
#

mu!!

nova gale
nova gale
#

so updates on lukin:

  • i have the design elements and base template ready.
  • the search is almost working (i'll commit the changes adding the feature later today)
  • we will have a complete version by the next week.
  • some form of alpha/beta version will be running by tomorrow or so. i'll announce here when that is.
carmine quarry
#

awesome

nova gale
#

heyy, is it ok if i make the following changes?

  • move the files in plaintext/unknown year to plaintext/unknown year/unknown month for technical reasons
  • fix broken front matter in (mainly putting stuff in quotes)
    • plaintext/2022/08/akesi-li-wile-lon-nena.md
    • plaintext/2022/08/ali-li-ale.md
  • in the (near) future, divide some files to paragraphs to make my searching algorithm's ranking work on them
carmine quarry
nova gale
#

sure 👍

carmine quarry
#

in the (near) future, divide some files to paragraphs to make my searching algorithm's ranking work on them
could you elaborate on that?

#

like, some files that are too long or what

#

or files that lack linebreaks?

#

fixing metadata - always welcome

nova gale
#

uhhh so basically i want fuzzy searching to work, and my search engine of choice, Meilisearch, can only efficiently rank "documents" under the size of about 1 kb (about two or so sentences, a bit more in toki pona as they tend to be shorter) so i divide every file into multiple "documents" with the same metadata based on paragraphs (parts of text separated by two newlines)
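
A rough sketch of the paragraph splitting described above, assuming the metadata has already been parsed into a dict. The field names `source_path`, `paragraph_index` and `text` are made up for illustration and are not taken from lukin or from Meilisearch.

```python
import re


def split_into_documents(path, metadata, text):
    """Split one file into per-paragraph documents that all share the file's metadata."""
    # A paragraph boundary is a blank line, i.e. two newlines with optional whitespace between.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text)]
    documents = []
    for index, paragraph in enumerate(p for p in paragraphs if p):
        doc = dict(metadata)              # copy, so every document carries the same metadata
        doc["source_path"] = path         # hypothetical field: which file this came from
        doc["paragraph_index"] = index    # hypothetical field: position within the file
        doc["text"] = paragraph
        documents.append(doc)
    return documents
```

A file without any blank lines comes back as a single document, which is exactly the case that needs manual paragraph breaks or some fallback.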

#

most of poki is divided to them, but there are some that aren't

#

first example that comes to mind is akesi li wile lon nena, as i investigated the parsing problems in its metadata

carmine quarry
#

hmm

#

short term yeah feel free to add paragraph breaks where they were meant to be in the original. where you can at least extrapolate

#

long term i doubt we would have an automatic check that guarantees this to be the case

#

so its better to handle on your end?

nova gale
#

the 1kb is not that strict but at least some paragraphing would be nice

#

i can also automatically try to find things like the first line break after the 1kb limit, but as the algorithm can only search spans of text that are completely inside a "document" that will worsen the results a bit
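
The fallback mentioned here (cut at the first line break once roughly 1 kb has been collected) might look like the sketch below. The 1024-byte default and the function name are assumptions, and the size counts UTF-8 bytes plus one byte per line break.

```python
def split_at_line_breaks(text, limit_bytes=1024):
    """Cut text into chunks, ending each chunk at the first line break at or past the limit."""
    chunks = []
    current = []
    size = 0
    for line in text.splitlines():
        current.append(line)
        size += len(line.encode("utf-8")) + 1  # +1 for the line break itself
        if size >= limit_bytes:
            chunks.append("\n".join(current))
            current, size = [], 0
    if current:
        chunks.append("\n".join(current))
    return chunks
```

As noted in the message, a phrase that straddles one of these cuts ends up in two different "documents" and cannot be matched as one span, so this is a fallback and does not replace real paragraph breaks.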

nova gale
#

Pushed the updated folder structure and fixed metadata 👍

fading plover
nova gale
#

fzf and telescope are command line tools, while i need a production grade tool for searching files based on fuzzy search and many filters. meilisearch manages the database for me and is pretty much the only open source method for doing everything i need it to do, as far as i know. i also have some prior experience with it.

nova gale
jagged burrow
#

@carmine quarry hiii i accidentally committed to the main branch instead of lipu-tenpo could you help me clean up my spilled milk please

carmine quarry
#

will do in a bit

#

no worries

versed void
#

How's this been going?

carmine quarry
heady vigil
#

true

#

and I am exploded

nova gale
#

waait i totally forgor about lukin :D

forest lily
versed void
#

Wondering about making a sona pona article for this also

swift imp
#

Should I give my transcripts for my published translations?

heady vigil
#

I saw

vague coral
heady vigil
#

nimi nasin?

vague coral
#

laso

heady vigil
#

majuner

heady vigil
#

pana

carmine quarry
#

ijo vivi got to lipu tenpo nanpa moku

#

wawa

#

theres so much of it huh

fading plover
carmine quarry
#

LMAO

carmine quarry
#

the spreadsheet has been moved to README

#

each source now has an issue which serves as a useful page to document its progress and quirks

nova gale
unkempt oriole
heady vigil
#

it holds no original works, silly

carmine quarry
heady vigil
#

but also, yay also it's all coming well

carmine quarry
#

admittedly its just ijo vivi working on it rn and im busy

heady vigil
#

✊😔

carmine quarry
#

if we ever want to finish it we need just straight up more people

heady vigil
#

that's the issue with everything I suppose

carmine quarry
#

not quite, some projects are considerably less manpower intensive

heady vigil
#

all the cool projects hmph

jagged burrow
#

just came across this in lipu tenpo nanpa musi: a text with no author credit

#

(the author's name is normally under the bold title)

#

i've decided to transcribe this by completely removing the author field from the metadata

#

lmk if i should do like author: (unknown) instead

#

offtopic: its very satisfying to see the message for most months' folders be one of my commits :3

carmine quarry
jagged burrow
#

it says this:
whoever made this text didn't give a name
so nobody knows the name

#

i put the name (anonymous) in the author: field

plucky umbra
#

What exactly is this project here?

jagged burrow
#

we're storing as many toki pona texts as we can find and changing them to all be in the same format (markdown + yaml metadata)

#

there's an faq at the top of this thread #1252224729977327647 message

plucky umbra
#

I guess you know that for the newer lipu tenpos we have the markdown files on our github

#

Well interesting! Cool project, good luck! (Am too busy with other stuff to help but I'll take a peek xD)

heady vigil
#

indeed

#

and this also helps with adding older ones

jagged burrow
#

looks like the oldest one that's on your github is also the oldest one i haven't gotten around to transcribing yet
i basically couldnt've learned this at a better time

plucky umbra
carmine quarry
heady vigil
#

djdhdlhdd

fading plover
#

how many texts are in poki Lapo right now?
also, how can i get all of the texts?

carmine quarry
carmine quarry
#

the number in the repo itself is ...

#

TWAMMMLLLT

fading plover
#

wawa

carmine quarry
#

just looking at nasi makes me sad

#

good fucking luck adapting this to plaintext

#

lol

carmine quarry
#

@shut flume when you add new utala texts, ping me

#

i should add all of them to poki Lapo

#

adding them requires names

carmine quarry
#

@forest lily in older kalama sin episodes, you're jan Deni
when i add the kalama sin texts to poki Lapo, should i change your name or not?

#

maybe i'll do this:

authors:
  - waso suno Alana
notes: jan Deni li nimi weka pi waso suno Alana.
#

TIL kalama sin from 2021-11 reads a story from lipu kule from 2021-03. since the text is the same, im treating it the same as i would with a cover: adding more sources to the same md file

if we make collections lists, we can't even do chronological order, lol

#

@fading plover @heady vigil @jagged burrow @nova gale when you have time, lets talk about good ways to represent collections of works

carmine quarry
#

god data entry is such a wild job, you encounter so much weirdness

#

one(?) kalama sin appearance by jan Telakoman is up on his github repo, which has separate md files(!) for several different writing systems

#

and i think the build process for his site automatically substitutes the langcode in every file?

#

this is elaborate, jeez

#

"would we really run into the same file name in the same month of the same year?"

#

yes

#

jan Juli (i assume thats kili pan Juli?) wrote about pu Tosi in lipu kule and then talked about pu Tosi on kalama sin

#

and its two separate texts

carmine quarry
#

kalama sin #22 is 2 days later than kalama sin #23, if wikisource is to be believed

carmine quarry
#

2022-10 o moku pona! is in lipu monsuta as well as kalama sin

#

i. completely forgot lipu monsuta exists

#

@finite tulip who maintains lipu monsuta?

carmine quarry
#

@toxic thorn for kalama sin, i've added all the texts to poki
for the newer episodes nobody has made transcripts yet. when they arrive i'll add those too.

vestal herald
carmine quarry
#

kala pona Tonyus collection is kinda wild

carmine quarry
jagged burrow
carmine quarry
#

ken

jagged burrow
#

for lipu [tenpo], i gave all the texts of one series the same tags:

#

the collections: key also exists and could work well

#

(i don't use the collections: key)

carmine quarry
#

the problem with collections/tags is it doesnt preserve intended reading order, at least not always

#

cause the same text may feature on multiple platforms, and be stored as one file

#

making it impossible to determine the correct order from the date

#

in that case we might want to point from collections to files, rather than from files to collections

jagged burrow
#

so kind of like m3u files for music playlists?

#

you'd have one file like lipu-tenpo-nanpa-walo.txt and it looks like

1. path/to/lipu-nanpa-open.md
2. path/to/lipu-nanpa-tu.md
···
10. path/to/lipu-nanpa-pini.md
heady vigil
carmine quarry
heady vigil
#

damn

carmine quarry
#

the same file could be in more than one collection

#

kalama sin + lipu kule

heady vigil
#

I mean

#

separate them ,,

carmine quarry
#

duplicate texts

heady vigil
#

ye

#

:3

carmine quarry
#

its the same problem as song covers

heady vigil
#

argh

jagged burrow
#

i guess it's also a question of, should this project be somewhere you go and read the texts? or is it purely a corpus for data analysis

#

like what's the purpose of pLapo

heady vigil
#

to be a silly box

jagged burrow
#

-# nimi_sin Lapo – silly

heady vigil
#

it comes from LARP

carmine quarry
jagged burrow
#

(hence that wip frontend, i suppose)

carmine quarry
#

ye

#

unless the library takes root in the wider community, it might get stuck in an awkward middle ground where both existing services (wikisource, utala musi, lipu tenpo etc) maintain their own pages and we also keep mirrors of them

#

the ideal outcome imo is becoming the ground truth for others to just grab texts from. but that requires a lot of trust building / social outreach, and wont happen any time soon

#

did work out for linku tho

jagged burrow
carmine quarry
#

as of rn i would assume none of the people we took texts from would agree to this

jagged burrow
#

🙂‍↕️ mi wile ala anpa e lipu ale ante

carmine quarry
#

it would just have an external dependency

#

anyway, this is literally only a concern for maintenance

#

if we assume old texts wont ever be edited, then having multiple copies in different places doesnt matter

jagged burrow
carmine quarry
#

by being a community project where enough people have admin rights

#

compare how linku doesnt make definitions that just suit one nasin, instead theyre decided collectively and disagreements are noted or added to sona pona

jagged burrow
#

so ideally folks would want their texts in poki lapo for the same reasons they might want an RSS feed of their website

#

i.e. it makes the texts more accessible

carmine quarry
#

yep

jagged burrow
#

that clicks

#

[ǂ]

carmine quarry
#

and even more ideally, collections would want their texts in poki lapo because it would reduce maintenance

#

for example lipu tenpo

#

it is currently stored in their own repo, and built from md

#

the alternative would be to read from our md which is, on the surface, exactly the same thing which doesn't seem helpful

#

except if poki lapo gains enough community trust/support/maintainers, we could make tooling around it

#

to, for instance, automatically build cc by-sa texts to wikitext

#

and upload them to wikisource

#

in this case, lipu tenpo gains wikisource mirrors for free

jagged burrow
#

we could make tooling around it
and that frontend, to browse the texts

carmine quarry
#

yep, which would then be a third way to access the text

#

lipu tenpo itself, wikisource, lukin lapo

#

this is of course excessive, but this is like the "maximum community success" path

#

i expect realistically we become a collection of loose items, plus a mirror of already established strong brands like lipu tenpo and utala musi

jagged burrow
#

mhm

carmine quarry
#

or they can make a PR to add their word to linku sandbox, which is more steps, but is then visible on Discord (ilo Linku) and two sites (nimi.li, linku.la)

#

and may end up translated on crowdin if people care

jagged burrow
#

the analogy here is
writing a quote = publishing on your own website/wherever
linku PR = Lapo PR

carmine quarry
#

yes

#

its like, this requires a critical mass

jagged burrow
#

enough folks who Care about having their shit in lapo

carmine quarry
#

yeah

#

and where having things in lapo is easier than not having things in lapo

#

or for another example, someone might make a songs-only frontend to lapo
and if it becomes the prime way to find songs, then new songs will start getting submitted to us

#

is the hope at least

jagged burrow
#

pipe dreams,,

#

uhm, how about that collections thing

carmine quarry
jagged burrow
#

but i want others' opinions before i just Do That

carmine quarry
#

do we want any metadata associated with each collection

jagged burrow
#

might b a good idea

#

for instance, all texts in a single lipu tenpo issue share the same - date - copyright - source urls

carmine quarry
#

do we want every collection to exist in the same folder, or is there any subdivision of directories we can employ

carmine quarry
#

utala musi includes copyrighted and cc'd works in the same collection

#

lipu kule includes articles published at different times

jagged burrow
#

we could include just the information that's the same for all texts?

carmine quarry
#

but we might want info like

  • name
  • maintainer
  • sources (but like, for the whole collection as opposed to individual pages)
jagged burrow
#

that's nearly useless tho

carmine quarry
#

also

#

we might want the list of elements of a collection to allow comments

#

so we can talk about works that are omitted

#

like e.g. kalama sin episodes that haven't been transcribed yet

jagged burrow
carmine quarry
#

something like that yeah

#

note: we shouldn't have collections that are just like, "all works by this person"

#

this is down to frontends to make searchable

#

oh btw

#

@jagged burrow want me to merge the current lipu tenpo progress, so that i can do this collection stuff on everything all at once?

jagged burrow
#

dfgjdfkjg i kinda wanna finish itttt

carmine quarry
#

kk

#

lets not do the collections rework until then

jagged burrow
#

gives others time to say their opinions :b

#

do you wanna do like, a tl;dr of your suggestions

#

so that said others can say their opinions easierly

carmine quarry
#
  • remove collections from schema and every current file
  • add new folder collections in repo root
  • for each (yaml? toml?) file in collections:
- name: 
- maintainer: 
- sources:
  -
- elements:
  # can be plaintext/ file link
  # or collections/ file link
  -

this preserves intended reading order; allows nesting of collections
downside: can't tell at a glance which collection an md file is part of
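
To illustrate how such collection files could preserve reading order and allow nesting, here is a small sketch that flattens a collection into an ordered list of text paths. It assumes the YAML/TOML files have already been loaded into dicts keyed by their `collections/` path; the loader, the example paths, and the function name are all hypothetical.

```python
def resolve_elements(collections, key, _seen=None):
    """Flatten one collection into an ordered list of text paths, expanding nested collections."""
    seen = set() if _seen is None else _seen
    if key in seen:
        # A collection that (indirectly) lists itself would otherwise recurse forever.
        raise ValueError(f"collection cycle at {key}")
    seen = seen | {key}
    ordered = []
    for element in collections[key]["elements"]:
        if element in collections:
            # The element is itself a collection file: splice its contents in place.
            ordered.extend(resolve_elements(collections, element, seen))
        else:
            ordered.append(element)
    return ordered
```

Because the order comes from the list itself and not from each text's date, a text that appears in two collections keeps the correct position in both.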

carmine quarry
#

just remembered how to make README.md not completely terrible!

carmine quarry
#

i honestly feel kinda bad for ilo Muni for having to, in the future, read this

nova gale
carmine quarry
nova gale
#

what are some examples of these cases?

carmine quarry
#

jan Juli submits a text to lipu kule, then records it for kalama sin months later

#

we either duplicate text (bad) or cant reconstruct kalama sin ordering (bad)

nova gale
#

and how would this collections folder structure solve the duplication?

carmine quarry
#

kalama-sin.yaml and lipu-kule.yaml, both containing lists that include the same file
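
Pointing from collections to files means a text no longer says which collections it belongs to, but that can be rebuilt by tooling. A sketch, assuming each collection has been reduced to an ordered list of paths; the example file name in the usage note is invented.

```python
from collections import defaultdict


def collections_by_file(collections):
    """Invert {collection: [paths...]} into {path: [(collection, position), ...]}."""
    index = defaultdict(list)
    for name, elements in collections.items():
        for position, path in enumerate(elements):
            index[path].append((name, position))
    return dict(index)
```

For a text listed at position 0 of `lipu-kule.yaml` and position 1 of `kalama-sin.yaml`, the result for that path would contain both `("lipu-kule.yaml", 0)` and `("kalama-sin.yaml", 1)`.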

nova gale
#

ooh

#

well that kinda makes sense

#

i was also thinking if we were trying to optimise for machine readability or human readability

carmine quarry
#

what options are you considering?

nova gale
#

if we were aiming for the first one, wouldn't we be better off with just a monolithic "folder structure" of the current front matters and file paths to the texts inside a text file and the actual texts without the metadata inside a folder with no inner structure

#

it would make it waaaay faster for machines to read as you do not need to open a separate file stream for every file every time you scan the front matters
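
The "one metadata file" idea could be generated from the existing per-file layout instead of replacing it. A sketch, assuming every text starts with a `---` delimited front matter block; it stores the raw front matter text per path and does not parse the YAML. The function names and the JSON output format are assumptions.

```python
import json


def split_front_matter(raw):
    """Return (front_matter, body); front_matter is '' when the file has no '---' block."""
    lines = raw.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", raw
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return "\n".join(lines[1:i]), "\n".join(lines[i + 1:])
    return "", raw  # opening delimiter without a closing one: treat as no front matter


def build_index(files):
    """files maps path -> raw file content; returns one JSON document with all front matters."""
    index = {path: split_front_matter(raw)[0] for path, raw in sorted(files.items())}
    return json.dumps(index, ensure_ascii=False, indent=2)
```

A scanner then opens one file to read every front matter, while editors keep working on texts and metadata together.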

carmine quarry
nova gale
#

that is the way it is currently implemented in the mess that is lukin

carmine quarry
#

the current arrangement is more editor friendly, in the sense that we expect texts and metadata to be edited at the same time

nova gale
#

that is true... i was also wondering why the file structure is based on the dates. it seems kinda arbitrary to just choose one piece of metadata to group them by.

carmine quarry
#

the default option is a completely flat directory, which would get bad as (a) filename collisions will happen a lot, (b) some frontends like github wont show you all files in a folder if there are thousands of them

#

hence, a way of organising is needed

#

dates were chosen as the least questionable
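
For reference, the date-based layout can be expressed as a small path helper. The `plaintext/<year>/<month>/` shape and the `unknown year` / `unknown month` folder names follow paths mentioned earlier in the thread; combining a known year with `unknown month`, and the function name itself, are assumptions.

```python
def path_for(date, slug):
    """Build a plaintext/<year>/<month>/<slug>.md path from 'YYYY-MM-DD', 'YYYY-MM', 'YYYY' or None."""
    year, month = "unknown year", "unknown month"
    if date:
        parts = date.split("-")
        if len(parts[0]) == 4 and parts[0].isdigit():
            year = parts[0]
            # Only trust the month when the year was usable too.
            if len(parts) > 1 and len(parts[1]) == 2 and parts[1].isdigit():
                month = parts[1]
    return f"plaintext/{year}/{month}/{slug}.md"
```

Two texts with the same slug in the same month still collide under this scheme, so a check for existing paths would be needed on top of it.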

nova gale
#

hmmmm

boreal viper
#

how far behind am i, let's see

#

oh DAYUM

#

8

#

gonna start catching up on that soon, methinks

#

i'm cheerleading y'all from the sidelines on this project btw!!

carmine quarry
fading plover
#

that's my problem tho not yours

carmine quarry
#

true

carmine quarry
fading plover
shut flume
#

haven't forgotten about this

jagged burrow
#

poki lapo out of context

jagged burrow
#

lipu tenpo transcriptions are all done bby

heady vigil
#

woo

#

the lipu was tenpod

jagged burrow
#

these two mistakes are in a lot of files; they should be fixed when my work gets merged with the group's

  • i filled in the preprocessing: field the wrong way · it currently looks like this ↓
preprocessing:
  - wrote alt text

it should become this ↓

preprocessing: wrote alt text
  • in the front matter i used the wrong character U+2013 instead of the correct U+002D · please fix them all
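
The second fix requested above (U+2013 used where U+002D was intended in the front matter) could be scripted along these lines. This sketch assumes the `---` delimiters themselves are intact and only replaces characters between them, leaving the text body untouched; the function name is made up.

```python
EN_DASH = "\u2013"  # the character that was entered by mistake


def fix_front_matter_dashes(raw):
    """Replace U+2013 with U+002D inside the front matter only; the body is left as-is."""
    lines = raw.split("\n")
    if not lines or lines[0].strip() != "---":
        return raw  # no front matter block to fix
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            head = [line.replace(EN_DASH, "-") for line in lines[:i]]
            return "\n".join(head + lines[i:])
    return raw  # unterminated front matter: do not guess
```

A blanket replace like this would also change an en dash that legitimately belongs in a title or description, so the resulting diff is worth reviewing before committing.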
#

@carmine quarry ^

carmine quarry
jagged burrow
#

work for me to do:
start transcribing kijetesantakalu o!
create collection files for these series within lipu tenpo
tawa pi poki monsi pi ma Asija
jan kule pi tenpo pini
toki tu by jan Ke Tami
seme li mi
kijetesantakalu o (which i'll have to start transcribing, ehe)
and things to possibly tag:
leko nimi
comics
texts written in sitelen sitelen? (mostly seme li mi again)

carmine quarry
#

@fading plover 575 md files in the repo now

#

@jagged burrow ill do collections now

#

and by now i mean damn ill have to write code for this, 570 files are too many to do manually

nova gale
#

@carmine quarry if you haven't written it yet, i already have that implemented in lukin

carmine quarry
#

ooh!

#

if you can dump me a list of every collection's contents thatd be useful

#

it can't just be frontend-side because collection ordering is unrecoverable, and i want us to be able to preserve it

#

but this will be a good start

nova gale
#

just a secc...

nocturne rain
#

if i write a story, can it go in this library?

heady vigil
#

is it in some other collection of texts, or what?

#

if so, yes

nocturne rain
#

||i mean, if i made a new story (say an original novel or translation), would it be added to the library¿||

heady vigil
#

||please publish it in somewhere else first, so then we can archive it. anywhere really||

nocturne rain
#

pona!

carmine quarry
heady vigil
#

^

jagged burrow
#

can i commit new stuff without GitHub Desktop, using a Git that isn't GitHub Desktop

#

i don't want to use GitHub Desktopppp >3<

carmine quarry
#

to push to github using the git cli you need to authorise yourself with ssh

#

or tokens, but tokens are shittier, don't recommend

jagged burrow
#

i know nothing about ssh nwn

carmine quarry
#

short list:
- generate a private/public key locally using ssh-keygen
- add it to github, in github account settings
- test that your key works on both ends, by running ssh -T git@github.com
- git push should work now; if it doesn't, ping me ill elaborate

jagged burrow
#

looks promising

#

it looks like i need to run ssh-add /path/to/key every time; how can i set up my shell to do that automatically upon opening?

#

eyyy i got a commit in!

jagged burrow
#

@carmine quarry how're the collections files going

carmine quarry
jagged burrow
#

what heccin license should i apply to texts from ao3

#

most authors don't seem to specify a license

heady vigil
#

huh

#

then all rights reserved by default

#

-# unless AO3 has a license of its own

carmine quarry
#

i only put the string "all rights reserved" when the source says so explicitly

#

the field being empty should be interpreted as "all rights reserved, unless later confirmed otherwise"

nova gale
#

also, the copyrightability of fanfics is kinda questionable, as is ao3 being "a tangible format" required by the copyright laws

jagged burrow
#

wondering whether/how to store fics' tags

carmine quarry
#

decide for yourself

#

the problem is that tags are usually fairly platform specific

#

and merging together tags from different platforms creates a mess

vague coral
# nova gale also, the copyrightability of fanfics is kinda questionable, as is ao3 being "a ...

[ilo penpo o lukin ala]

"Fans own copyright in their own original contributions to a fanwork — they don’t own anything about the underlying work it’s based on, but they do own what they have made," Rosenblatt says.
https://www.syfy.com/syfy-wire/how-to-keep-fanfiction-legal-and-avoid-trouble-with-lawyers
and a text file is a tangible enough format, just like how a png file is for visual artwork :p

jagged burrow
#

i've got some stories with objectionable sexual content (specifically ||pedophilia||) in them

#

should i follow my own morals and exclude them, or put on the archivist hat and include them anyway

carmine quarry
#

or at least wasnt the author removed?

jagged burrow
#

these are from ao3

carmine quarry
#

oh

#

yeah just dont touch them

#

at most we can create a collection and then add a comment linking to that place and explaining we dont archive it

jagged burrow
#

alright

jagged burrow
#

stacking like 4 commands onto each other makes me feel very powerful

[]@[] MINGW64 /f/files-08-18/toki pona/poki Lapo/ao3
$ find . -maxdepth 1 -type f -not -name 't_*' | head -n 1 | xargs cat - | head
---
title: meli tu li lon tomo tawa suli
description: jan Fu Hua en jan Kiana li lon tomo tawa suli. jan Kiana li lape.
authors:
  - asona
date: '2021-10-21'
tags:
  - 'ao3'
sources:
  - https://archiveofourown.org/works/34637383

[]@[] MINGW64 /f/files-08-18/toki pona/poki Lapo/ao3
$ find . -maxdepth 1 -type f -not -name 't_*' | head -n 1 | xargs -i mv {} ~/Documents/poki-lapo-repo/plaintext/2021/10/
#

that find thing is like ls but it ignores directories. i just copied it off the internet

carmine quarry
#

impressive

jagged burrow
#

@nova gale please share the info on the collections

vague coral
jagged burrow
nova gale
# jagged burrow @nova gale please share the info on the collections

nnnnn, so about that, my puter's storage got corrupted, and i hadn't pushed the version that had the recursive front matter indexer done, so i kinda need to rewrite it... on the positive side i do have more time now as my current school period is more lean, so i have time and motivation to contribute to lukin, so it (really this time) should be in working condition soon!

#

i also didn't have backups to my nas set up correctly on the laptop that i used to program lukin so i only have backups of my root partition, which are kinda useless lol

nova gale
#

oooookay, here is a newline separated list of the collections:

Deltarune toki pona translations by RiemannHyperthesis
Fail Blue Dot
Tokipono: La lingvo de bono
jan Eka and jan Pani get up to stuff by janseme
kalama sin
kijetesantakalu o!
lipu kule
lipu monsuta
lipu tenpo
lipu tenpo nanpa akesi
lipu tenpo nanpa jaki
lipu tenpo nanpa kalama
lipu tenpo nanpa kasi
lipu tenpo nanpa kijetesantakalu
lipu tenpo nanpa kule
lipu tenpo nanpa kulupu
lipu tenpo nanpa lawa
lipu tenpo nanpa lete
lipu tenpo nanpa lili
lipu tenpo nanpa linja
lipu tenpo nanpa ma
lipu tenpo nanpa mama
lipu tenpo nanpa moku
lipu tenpo nanpa moli
lipu tenpo nanpa mun
lipu tenpo nanpa musi
lipu tenpo nanpa nasin
lipu tenpo nanpa nimi
lipu tenpo nanpa pan
lipu tenpo nanpa pipi
lipu tenpo nanpa seli
lipu tenpo nanpa sewi
lipu tenpo nanpa sin
lipu tenpo nanpa soweli
lipu tenpo nanpa suno
lipu tenpo nanpa tenpo
lipu tenpo nanpa toki
lipu tenpo nanpa tu
lipu tenpo nanpa walo
mun monsuta (2022)
utala musi pi toki lili pi lipu suli (2023)
utala pi lipu kalama tawa (2021)
utala pi lipu musi lili (2022)
utala pi toki musi (2020)
utala.pona.la
#

also, there are some syntax problems with the front matters of these files:

File number 196 at poki/plaintext/2021/06/toki-pona-li-toki-pi-ma-ale-anu-seme.md is problematic...
File number 245 at poki/plaintext/2021/05/toki-pi-kon-pona.md is problematic...
File number 326 at poki/plaintext/2021/08/kijetesantakalu-o-nanpa-luka-wan.md is problematic...
File number 442 at poki/plaintext/2022/02/kijetesantakalu-o-nanpa-luka-luka-wan.md is problematic...
File number 477 at poki/plaintext/2022/07/ijo-pi-suli-pi-nanpa-kipisi.md is problematic...
File number 583 at poki/plaintext/2024/02/o-kepeken-ala-ilo-ike-ni.md is problematic...
File number 586 at poki/plaintext/2024/02/o-mama-e-kasi-moku-lon-tomo-sina.md is problematic...
File number 596 at poki/plaintext/2024/02/toki-sona-lipu-lili-pi-kasi-ma.md is problematic...

#

im gonna fix them now

carmine quarry
nova gale
#

..?

carmine quarry
#

we need lists of files that each of these collections contains

nova gale
#

ooooh

#

just a sec...

#

how do you want them formatted?

carmine quarry
#

a newline-separated list of full filepaths

carmine quarry
#

one file per collection
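A minimal sketch of the requested dump, assuming the collection names have already been read from each file's front matter into a plain mapping (the `works` dict and both function names are hypothetical, not taken from lukin). It groups paths by collection and renders one newline-separated listing per collection. The path-sorted order is only a placeholder, since the real ordering is the manual step mentioned in this thread.

```python
from collections import defaultdict


def group_by_collection(works):
    """Group file paths by collection name.

    works: mapping of file path -> list of collection names
           (hypothetical shape, as if parsed from front matter).
    """
    grouped = defaultdict(list)
    for path in sorted(works):  # placeholder order, to be fixed by hand later
        for name in works[path]:
            grouped[name].append(path)
    return dict(grouped)


def render_listing(paths):
    """Render one collection as a newline-separated list of full file paths."""
    return "\n".join(paths) + "\n"
```

Writing each rendered listing to its own file gives the "one file per collection" layout.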

nova gale
#

ookay

carmine quarry
#

afterwards i will do the manual step of sorting the collections in the right order

#

which (as a reminder) is why we needed this rework in the first place

nova gale
#

( @carmine quarry )

carmine quarry
#

yep, awesome

nova gale
#

nice 👍

carmine quarry
#

god theres a lot of files

nova gale
#

:D

nova gale
carmine quarry
#

so now your frontend will look at a list of collections in the collections folder

nova gale
#

oookay

#

do you want me to make a script that removes all of the collections tags

#

probably push it to a different branch?

carmine quarry
#

yeah that works

nova gale
#

okay

carmine quarry
#

merge when im done with new collections files

#

@nova gale i pushed the collections files (without manual sorting for now) - you can delete the collections field on main now

nova gale
#

ok! i'll edit the README too

carmine quarry
#

awesome

#

if you get the time / energy, try:

  • writing a check that every file link in collections is a yaml file that exists
  • printing out any "orphan" work (a work that is not in any collection)
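Both checks can be sketched in a few lines, assuming the collection files have already been parsed into a mapping of collection name to ordered item paths and that the set of paths present in the repo is known (the function names and input shapes here are hypothetical). Nested collections would need their own yaml paths included in `existing`.

```python
def missing_items(collections, existing):
    """Return {collection name: [paths listed in it that do not exist]}.

    collections: mapping of name -> ordered list of item paths
    existing:    set of paths that actually exist in the repo
    """
    report = {}
    for name, items in collections.items():
        missing = [path for path in items if path not in existing]
        if missing:
            report[name] = missing
    return report


def orphans(collections, works):
    """Return works that are not referenced by any collection, sorted."""
    referenced = {path for items in collections.values() for path in items}
    return sorted(set(works) - referenced)
```

Run in CI, a non-empty result from `missing_items` could fail the build, while `orphans` is probably better as a warning.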
heady vigil
#

when the poki is lapo

nova gale
carmine quarry
#

@jagged burrow you may or may not have to manually sort lipu tenpo,,,,

#

(i hope my energy lasts long enough to do that but its a Possibility™️)

nova gale
#

btw i am technically ready with this but my front matter editor insists on alphabetically sorting the tags lol :D

#
---
archives:
- https://web.archive.org/web/20140305064535/http://failbluedot.com/toki_pona/chuang_tzu
authors:
- Zhuang Zhou
date: Unknown
license: CC BY-NC 3.0
original-title: The Book of Chuang Tzu
sources:
- http://failbluedot.com/toki_pona/chuang_tzu
tags:
- translation
- prose
title: toki tan lipu pi jan Suansu
translators:
- Martin Palmer
- Elizabeth Breuilly
- Michael F.
---
carmine quarry
#

hmmmm

#

on the one hand, fair, its a dict, it doesn't have an order

nova gale
#

yea...

jagged burrow
#

work work workkkk

carmine quarry
#

on the other, this is so unreadable a a

jagged burrow
#

should we use Ordered Lists for collection files? collections are in fact ordered

#

is that even a thing in yaml

#

like 1. 2. 3. instead of bullet points

carmine quarry
carmine quarry
#

false alarm, all good

#

- creates a list

#

it preserves order

jagged burrow
#

i spose yaml doesn't have the same <ol> vs. <ul> distinction as html haha

carmine quarry
#

thats up to the frontend ye

#

on our part, we just need to make sure its not a set (orderless)

#

and we're good on that

#

btw, look at how the new files help us organise:

#
# TODO: sort collection!

name: utala.pona.la
# source: https://utala.pona.la/
items:
- collections/utala-musi/2020-toki-musi-lili.yaml       # 2020-11-11
- collections/utala-musi/2021-lipu-kalama-tawa.yaml     # 2021-02-08
# https://utala.pona.la/sitelen-ma/                     # 2021-07-01   # A server icon contest for ma pona
- collections/utala-musi/2022-lipu-lili.yaml            # 2022-08-06
# https://utala.pona.la/musi-mu/                        # 2023-05-16   # A meme(?) contest
# https://utala.pona.la/sitelen-ma-nanpa-tu/            # 2023-02-01   # A server icon contest for ma pona
- collections/utala-musi/2023-toki-en-lipu.yaml         # 2023-08-15

# 2024: as of 2024-10-22, the 2024 contest results are not released yet, but it exists
jagged burrow
#

collection-ception

carmine quarry
#

yes

#

we only have a couple of those

#

haha i used items but its forbidden in yaml

#

ill fix it eventually

jagged burrow
#

reserved keyword or what

carmine quarry
#

something of that sort

nova gale
#

should we store the images for entries like the ones in o kijetesantakalu!...

carmine quarry
#

if you two (as a scraper and a frontend dev) can agree on how to handle em, sure??

#

i mostly care about text and searchability

nova gale
#

/images/*?

jagged burrow
#

i'm happy with keeping them as direct links to commons, or barring that web.archive

carmine quarry
#

oh nope we're not storing multimedia files in the repo

#

links sure

jagged burrow
#

i think it's messy to store images in the repo

nova gale
#

why tho..?

jagged burrow
#

-# gerund is pushing out infinitive in my english nasin

jagged burrow
# nova gale why tho..?

idk i dont like it. it feels weird
if you have a counter-argument i'm willing to budge lol

carmine quarry
# nova gale why tho..?

every version of a binary file is forever stored in git history inflating the repo size significantly

#

keep git for text

nova gale
#

I do not think we need to store multiple versions of the images tho

carmine quarry
nova gale
#

well, we could also make an image archive

#

i can host it

carmine quarry
#

like vivi has been doing already iirc

nova gale
#

i just think that leaving dangling out pointing links kinda goes against the idea of poki lapo...

carmine quarry
#

the point is to guarantee the preservation of text and its metadata foremost, and make the rest automatically archivable (by having links that can be scooped up and submitted to, say, internet archive (if it wasn't down itself))

#

oh, neat, its up now

carmine quarry
jagged burrow
nova gale
#

i gotta say, you've certainly got a point, but i still think it would be more ""nice"" to have the images archived by kulupu Lapo

#

just something like a simple api endpoint like sitelen.[kulupu lapo domain]/image.png

jagged burrow
#

idk. why do the work of image hosting/archiving when others (commons & the internet archive) are already doing it and better

#

on another topic: every "ao3" tag should become "fan fiction". what tool is good for doing this?

#

@carmine quarry ^

carmine quarry
#

of the above i would rather give up the "controlled by us" part

nova gale
#

the collections tag is now removed 👍

carmine quarry
#

awesome

nova gale
#

side effect is that our lists are now unindented but i think that's pretty minor

carmine quarry
#

upd: no, i should be fine, ill sort lipu tenpo

muted forge
#

hell yeah, preservation! good luck maintainers!

jagged burrow
fading plover
carmine quarry
nova gale
#

i just use python scripts and the python-frontmatter library

carmine quarry
#

note to self in the future: make a tool that moves a file + renames every mention of it in collections
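A rough sketch of such a tool, assuming collection files live under `collections/` as `.yaml` files and mention works by their repo-relative path (the function name and the `collections_dir` parameter are hypothetical). It uses plain substring replacement, so a path that is a prefix of another path would also be rewritten; a real tool would probably want to parse the YAML instead.

```python
from pathlib import Path


def move_work(repo: Path, old: str, new: str, collections_dir: str = "collections") -> list[str]:
    """Move repo/old to repo/new and rewrite mentions of it in collection files.

    Returns the repo-relative paths of the collection files that changed.
    """
    src, dst = repo / old, repo / new
    dst.parent.mkdir(parents=True, exist_ok=True)
    src.rename(dst)

    changed = []
    for coll in sorted((repo / collections_dir).rglob("*.yaml")):
        text = coll.read_text(encoding="utf-8")
        updated = text.replace(old, new)  # naive: substring match only
        if updated != text:
            coll.write_text(updated, encoding="utf-8")
            changed.append(coll.relative_to(repo).as_posix())
    return changed
```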

#

i cant believe we have a filename with ż

carmine quarry
#

we can (and should) make collections for lipu-tenpo-published series

#

like kijetesantakalu o

#

wow ive never even seen lipu tenpos new style. that says a lot about me and reading

nova gale
#

:D

carmine quarry
#

okay lipu tenpo is sorted

carmine quarry
carmine quarry
#

@jagged burrow when you have time:
look through collections/lipu-tenpo/, whenever you see my comment, check that it matches with your intentions; if it doesn't, make any edits necessary

nova gale
#

i added the date format used by Lapo to README.md and removed date: Unknowns from plaintext/unknown-year/unknown-month/*, as the fact that we do not know the dates is already heavily implied and removing them makes parsing the files easier

#

(every entry uses yyyy-mm-dd, i checked)

nova gale
#

lipu ale pona logo proposal :D

#

and here is the related svg file

carmine quarry
#

you know how pu ku su look

#

pu is obvs "lipu + toki + pona" but the other two got more creative

carmine quarry
#

@nova gale good news, im attempting a free backend for lapo which should make frontend life simpler

#

can even do github pages

carmine quarry
#

upd: no i suck at this (at serverless to be specific)

nova gale
#

i am like half done with an api for lapo

carmine quarry
#

ay awesome

nova gale
#

i have the abstractions, file scanning and database ready so i only need to make the api itself

#

im probably just gonna use FastAPI

carmine quarry
#

are you gonna be self hosting?

nova gale
#

yea, i have a quite solid homelab build with the networking in place already

carmine quarry
#

oh sick

#

given that my serverless attempt sucked we'll probably transfer api.lapo.pona.la to you when youre ready

#

unless you have a different dns in mind

nova gale
#

how about lipu.pona.la or lipu-ale.pona.la tho?

#

i think they would make way more sense than lapo.pona.la

carmine quarry
nova gale
#

☝️

carmine quarry
#

lipu is good in the sense that people already know what it means, but lapo is good in the sense of being unique & not hogging valuable subdomains someone else might need

#

so i understand either way

nova gale
carmine quarry
#

thats true

vague coral
#

la.pona.la

heady vigil
nova gale
fading plover
carmine quarry
nova gale
#

yea, i am just storing the entries in memory as python objects and recalculating them every time the server is restarted or it receives an update from my post-push hook over on github

#

it doesn't really matter as the size of the entire poki Lapo is only like 4.5mb
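A toy sketch of that approach, assuming the file contents are already loaded into a path-to-text mapping (the `Library` class and its methods are hypothetical and not nasin ilo's actual code). The whole index is thrown away and rebuilt on every update, which is cheap at a few megabytes.

```python
from dataclasses import dataclass, field


@dataclass
class Library:
    """Keeps every entry in memory; rebuilt from scratch on each update."""

    entries: dict = field(default_factory=dict)

    def rebuild(self, files):
        """Replace the whole index. files: mapping of path -> file text."""
        self.entries = {path: text.lower() for path, text in files.items()}

    def search(self, query):
        """Return the sorted paths whose text contains the query (case-insensitive)."""
        needle = query.lower()
        return sorted(path for path, text in self.entries.items() if needle in text)
```

A restart or a push notification would both just call `rebuild` with a fresh read of the repo.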

fading plover
carmine quarry
#

alr ill look into it

carmine quarry
vague coral
#

what does "better organised" mean here? as far as i can see, the text is mostly the same, only these words differ. also, jan Lentan's document has extra side commentary (not needed, or is it?)

carmine quarry
vague coral
#

ah, thanks

vague coral
carmine quarry
#

to me it's all fine

#

mun Kekan San has provided the community with a really good toki-pona-filter so if someone wants to bulk-analyse the data it shouldn't be an issue

#

and if this is a normal user reading, they shouldn't see html comments anyway

#

you can mark the file as "having a lot of english" in metadata notes

vague coral
#

then i'll do that

#

hm, there are multiple licenses

#

so should i write

license:
  - MIT
  - CC-BY-SA 3.0
  - CC-BY-SA 4.0

or

license: MIT OR CC-BY-SA-3.0 OR CC-BY-SA-4.0

or what?

carmine quarry
#

the second one

#

the license field is a string, not a list
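A small sketch of keeping `license` as one string, assuming alternatives are joined with ` OR ` as in the second option above (both helper names are hypothetical). It only handles a flat list of alternatives, not nested or `AND` expressions.

```python
def license_to_string(value):
    """Collapse a license value (None, string, or list) into one string or None."""
    if value is None:
        return None
    if isinstance(value, str):
        return value.strip() or None
    parts = [item.strip() for item in value if item and item.strip()]
    return " OR ".join(parts) or None


def license_options(expression):
    """Split a flat 'A OR B OR C' expression back into its alternatives."""
    if not expression:
        return []
    return [part.strip() for part in expression.split(" OR ")]
```

This keeps the schema field a plain string while still letting a frontend offer a per-license filter.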

vague coral
#

aa, in trying to match the convention i used an opening !!! note, but that's not a poki Lapo convention, it's an AO3 one or something aaa

carmine quarry
#

a a

nova gale
#

please no inconsistencies in the schema 🥺🙏🙏

vague coral
#

it describes how the document was made and has formatting stuff in it, so it's a mess. also, the front page doesn't link to it. but it's in the collection, so,,?

carmine quarry
#

is that jan telakoman or am i misremembering

vague coral
#

yes, that

carmine quarry
#

i think i encountered the sp-centric files while moving kalama sin(?) and ye this sucks

#

do what's needed

nocturne compass
#

sending message to follow thread

#

this looks interesting

vague coral
#

other than that, it's not much work

#

oh but: if a document lists the names of its upcoming sections at the start, should that be removed or not?
like this

## ijo lipu
- [mu]()
- [tu]()

## mu

## tu
vague coral
#

for YouTube video embeds, i'm only putting in a link for now. jan Telakoman used a full embed, so the video sits inside their page

vague coral
#

should this https://en.wikipedia.org/wiki/Luke_15 be "Luke 15" or "Gospel of Luke, chapter 15" or what?


vague coral
vague coral
vague coral
#

a

#

in my work i removed all the intros and all the outros. maybe i should keep the intros, but they're nearly identical in every document, so i don't know

kind mesaBOT
#

my side hasn't worked on this library in a long time...maybe i should start on it again in the future

carmine quarry
#

@jagged burrow when you add works now, please remember to add them in collections/!

#

oh interesting, you did kijetesantakalu o and seme li mi, but not the lipu tenpo collection

carmine quarry
#

oh my god

#

theres toki pona hermitcraft smut

nocturne compass
#

oh and its smut too

carmine quarry
#
sources:
- https://liputenpo.org/pdfs/0018tu.pdf
- https://liputenpo.org/lipu/nanpa-tu/
- https://commons.wikimedia.org/wiki/File:Lipu_tenpo_nanpa_tu_-_luka_Juke.png
- null
#

yknow null just in case

#

@plucky umbra
seme li pali e ni
mama li lon ala toki

kind mesaBOT
#

there’s what

kala Asi ↩️

[Reply to:](#1252224729977327647 message) theres toki pona hermitcraft smut

carmine quarry
#

doing work on ci, schemas will be validateable (tomorrow i hope)

vestal herald
nova gale
#

First search results from lukin 👀

#

(the ui isn't final (and the search algorithm probably needs tweaking too lol))

#

this means nasin ilo is feature complete and will not have breaking changes so often anymore

#

Oh and the funny thing is that for now at least, the frontend does not have any javascript :DD

plucky umbra
nova gale
#

opinions on this landing page design?

#

And by the way, all of the numbers etc are just placeholders :D

carmine quarry
#

i normally have opinions on ui/ux, but a library is fresh territory for me! i guess we'll figure out good ways of arranging information eventually

#

btw nanpa is spelled with an n

nova gale
#

oh! dyslexia strikes again :D

vague coral
#

oh, maybe cover images are needed, like Project Gutenberg

#

like this

carmine quarry
vague coral
#

then do it with a tool

carmine quarry
#

@vague coral rebase your Telakoman pr please, we have ci now

vague coral
#

on it

carmine quarry
#

damn, astro is really powerful

#

itd take me like a solid few months to figure it out tho

heady vigil
#

great work folks

heady vigil
#

if you want a simpler solution for images, try the method used by the lipu kule UI

vague coral
#

i know the method a little

#

but which programming language does it need?

heady vigil
#

i don't know

vague coral
#

from what i can see, it uses TypeScript

carmine quarry
#

@jagged burrow btw the schema is currently very loose, almost every field allows a value + null + undefined (= field missing from doc)

#

we might want to make it stricter

#

i did it this way for now because otherwise a lottt of files start failing

nova gale
#

(also python kinda makes it easy as Null, Undefined etc. are just "NoneType")

carmine quarry
#

easy isnt necessarily good

#

e.g. before i schemaed everything didnt just have Nones in there, we also had a couple [None]s

#

its good to have to handle only as many edge cases as we actually need
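One way to shrink those edge cases is to normalise list-like fields before validating or indexing. This is a hedged sketch, assuming a parsed front matter dict where a field may be missing, `None`, a bare value, or a list containing `None` (the helper name is hypothetical and this is not the project's actual schema code).

```python
def normalise_list_field(doc, key):
    """Return the field as a clean list: no None, no [None], missing -> []."""
    value = doc.get(key)
    if value is None:
        return []
    if not isinstance(value, list):
        value = [value]  # tolerate a bare scalar where a list was expected
    return [item for item in value if item is not None]
```

After this step, downstream code only ever sees a list, so a stricter schema can require one.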

nova gale
#

True that...

jagged burrow
nova gale
carmine quarry
#

i thought i fixed a lot of those on main? huh

jagged burrow
#

Problem child file
ijo ni li sama jan pi ike taso sk_musi

jagged burrow
#

surely you can just find-and-replace in every bad file

nova gale
#

no lol i need to manually fix them :D

jagged burrow
#

regarding UCSUR, should it be changed to the Latin alphabet or left unchanged?

heady vigil
#

personally I'd choose to keep UCSUR

#

possibly creating a copy in Latin? ?

carmine quarry
#

nope, we doin latin

#

attach a note that its an adaptation

heady vigil
#

oki

heady vigil
jagged burrow
toxic thorn
nocturne compass
#

for UCSUR, if there are non-standard characters, it should have some kind of note stating what character it was in the original text; and it should state the contents of cartouches as they were in the original UCSUR (alongside the latin transcription)

jagged burrow
heady vigil
#

mu

carmine quarry
#

im not doing much because im busy working on wasona btw

heady vigil
#

what's wasona

toxic thorn
#

looks like a teaching tool

#

oh, does it look similar to Duolingo or something?

#

the green bird's learning tool haha

carmine quarry
toxic thorn
#

nice

opal otter
#

I looked through some old files and found two versions of a toki pona short story written in 2003. It's not of high literary quality but by now the date makes it valuable.

#

I was told to upload it to poki Lapo, but how?

fading plover
carmine quarry
#

right now it doesn't have any

#

the reason: nobody has started working on it

fading plover
#

got it

#

and really, that's a big job, isn't it...

carmine quarry
#

when i archived lipu tenpo, lipu kule, the contest entries ..., each of those was a job of finite size

#

lots of people are creating music all the time

#

so this one is a job for the future

fading plover
#

got it!

vague coral
heady vigil
#

likujo

#

o luka tu o kulupu poki e lipu

heady vigil
#

maybe I will do this instead of doing my homework

heady vigil
#
carmine quarry
#

if you ever get works you're excluding from a collection, make sure to list their name and reason for exclusion in the collection file

heady vigil
#

I don't want to exclude it

carmine quarry
#

as for whether this should be included, i ... don't know

heady vigil
#

yeah ...

#

💥

#

I can simply like

#

cut the English out

carmine quarry
#

explain in notes that this is unfinished and never will be finished

heady vigil
#

the question is how to deal with the rest idk

#

true

#

indefinitely unfinished

#

I forgot how to do issues and PRs

heady vigil
#

now I wonder, how do you find images in PDFs

heady vigil
#

@carmine quarry how do I note a date where the year and month are known, but not the day

carmine quarry
#

write yyyy-mm and then hope the schema validator is chill

#

if its not chill ill uhh physically abuse it tomorrow ish

heady vigil
#

damn

#

related, what do I do if I don't know the author..........

#

it is not anonymous, I just don't know.

carmine quarry
#

dont think thats allowed

heady vigil
#

💥

#

this is a blog run by two people, it's just not noted who wrote a single post

carmine quarry
#

oh duh put the name of the blog

#

a group can be an author too

heady vigil
#

oki

#

gorgeous

#

-# don't look at the date, it's fucked up I just noticed

#

he makes a good point

nocturne compass
heady vigil
vague coral
heady vigil
#

ken.

#

also

#

how do you make captions in MD

vague coral
#

hmm, there's this syntax
![mu mu mu.](https://mu.mu/mu.stln)
but that's the hidden "alt" text; it doesn't show up underneath

heady vigil
#

exactly

vague coral
#

other than that there's no way. maybe wrap the image and the caption in a > blockquote

opal otter
#

@carmine quarry I've been told I've got a few things to add to the library -- two drafts of a bad story written in toki pona in 2003 and not published to any extant site -- but don't know how I'd start doing something like that.

vestal herald
#

nice

jagged burrow
#

i deleted the lipu-tenpo branch since i don't use it anymore

#

deleting it only took one click, no "are you sure" or anything.

#

scary

jagged burrow
# opal otter @carmine quarry I've been told I've got a few things to add to the library...

the contribution process should be familiar if you've used git—fork, make changes, create a pull request
you're presumably asking about the "not published anywhere" part. i think it makes sense to remove the sources: and archives: fields from the metadata, and explain in the notes: that this isn't published elsewhere. but wait for kala Asi's opinion bc i don't trust myself to make good choices lol

carmine quarry
#

afaik

tulip whale
#

i have my original songs and translations

#
jagged burrow
#

@carmine quarry some check failed on my latest commit, you know better than me what this means

heady vigil
heady vigil
#

how do I format the gap at the last line?

carmine quarry
#

@fading plover bringing you in for consultation
whats the least evil:

  1. accepting yyyy-mm alongside yyyy-mm-dd in the schema, making life difficult for the frontend thatll have to juggle two different types of dates
  2. forcing yyyy-mm to be extended to yyyy-mm-dd and just lying about the day (making it 01)
#

im leaning towards (2) but i want your thoughts

fading plover
#

hmm, this is kinda challenging but honestly i think it's better to convey the data as accurately as we have it rather than imply an accuracy we do not have

#

let the frontend choose how to handle that; ilo Muni would fill in the 1st of the month bc it rounds to months anyway, while the search page in this tool would convey the presented accuracy

#

bc otherwise, an uninformed reader of the data is ofc gonna assume the date is as accurate as presented without another indicator otherwise, and if the field doesn't allow lower accuracy, the only other option would be an awkward annotation or comment

#

solving that presentation problem feels worse than just being truthful at the outset
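The "store what we know, let the frontend round" idea can be sketched as a parser that returns both a date and its precision, assuming only `yyyy-mm-dd` and `yyyy-mm` are accepted (the function name and the precision labels are hypothetical). A consumer that rounds to months can ignore the precision, while a search page can display it.

```python
import re
from datetime import date

# Accepts yyyy-mm-dd or yyyy-mm; nothing else.
DATE_RE = re.compile(r"(\d{4})-(\d{2})(?:-(\d{2}))?")


def parse_partial_date(text):
    """Return (date, precision), where precision is 'day' or 'month'.

    For month precision the day is filled in as 1, but the precision
    tells the caller not to treat that day as real.
    """
    match = DATE_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"unsupported date format: {text!r}")
    year, month, day = match.groups()
    precision = "day" if day else "month"
    return date(int(year), int(month), int(day) if day else 1), precision
```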

carmine quarry
#

@heady vigil fixed schema issues on your branch

vague coral
#

for me, this

carmine quarry
#

good

#

@nova gale hows your web app going on

nova gale
#

Oh yea, i kinda forgot about Lapo as there is quite a lot of things going on in my life lol,, the api is ready and i only need to do the frontend, which is not that big of a job when i get into it

#

also, as you prolly know as a fellow programmer, there will likely be a couple of major bugs in the api, that i'll find as i work on the frontend, but no breaking changes will be made to the v1 api anymore

carmine quarry
#

ye ofc

#

rn you can safely assume you can iterate on your api without worrying youre breaking third party apps

#

getting a frontend up and starting to get users is the important bit

#

stability can come later

heady vigil
#

now I can't push despair

#
> git push origin toki-library
To https://github.com/kulupu-lapo/poki.git
 ! [rejected]        toki-library -> toki-library (non-fast-forward)
error: failed to push some refs to 'https://github.com/kulupu-lapo/poki.git'
hint: Updates were rejected because the tip of your current branch is behind
hint: its remote counterpart. Integrate the remote changes (e.g.
hint: 'git pull ...') before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.
#

lipamanka's works are all labeled as 'unknown year/unknown month'
what about just asking them....

carmine quarry
#

before pulling, your local repo has no idea ive done anything to the branch on the cloud

carmine quarry
# heady vigil I did do that!!!

if you cant figure this out on your own, we can do this:

  • you push your progress (the parts you cant add to origin/toki-library) into a new branch
  • ill move those changes into origin/toki-library
  • you delete your local toki-library and pull it from origin
#

but def try on your own first, maybe its not that painful to fix

heady vigil
#
> git pull --tags origin toki-library
From https://github.com/kulupu-lapo/poki
 * branch            toki-library -> FETCH_HEAD
hint: You have divergent branches and need to specify how to reconcile them.
hint: You can do so by running one of the following commands sometime before
hint: your next pull:
hint: 
hint:   git config pull.rebase false  # merge (the default strategy)
hint:   git config pull.rebase true   # rebase
hint:   git config pull.ff only       # fast-forward only
hint: 
hint: You can replace "git config" with "git config --global" to set a default
hint: preference for all repositories. You can also pass --rebase, --no-rebase,
hint: or --ff-only on the command line to override the configured default per
hint: invocation.
fatal: Need to specify how to reconcile divergent branches.
#

@carmine quarry my pull message

#

oh rebasing worked :>

carmine quarry
#

lemme know when you want this merged

heady vigil
#

when I merge it

carmine quarry
#

kk

heady vigil
#

just noticed that many of the files have their licence labelled as 'null'

#

and also the date wrapped in quotations

#

is this expected or

#

@fading plover

carmine quarry
#

as of right now, we are using all three interchangeably

heady vigil
#

a

carmine quarry
#

this is something that would be good to standardise

#

and we can enforce the standardisation through schemas

heady vigil
#

in my branch I labelled as 'all rights reserved' because that's how copyright normally works--

#

(pushing a correction in a bit)

carmine quarry
#

i don't mean for licenses specifically

#

i mean any field

heady vigil
#

yeah that's true

fading plover
heady vigil
#

i'll push it shortly

fading plover
#

awesome

#

i should get all of them into ilo Muni.

heady vigil
#

first time I see pu meaning something similar to a book but. not the one

carmine quarry
heady vigil
#

yeah

#

makes sense

fading plover
#

oh you are kinda saying that

#

ok fair distinction

heady vigil
#

does it seem a bit common among beginners to interpret 'X ijo' as 'a bit of X'

carmine quarry
#

never seen that before tbh

fading plover
#

do y'all think the livejournal is in scope for this project
there's a fair amount of posts there that are a pain to archive in any method i would prefer

heady vigil
#

I don't know what Livejournal is

#

I googled and the first thing that showed up was

'livejournal fanfiction'

#

sure. fuck it

fading plover
#

first gen social media alongside Myspace, where Twitter/Facebook are 2nd gen

heady vigil
#

ahh

#

I see

#

I don't see why not

#

well. I could

#

it's complicated to archive, but we'll find a way~

fading plover
heady vigil
#

oh it has titles. thank youu

fading plover
#

i mean if you're doing it manually it isn't That big of a deal, there is maybe a day's worth of work to copying and pasting
but i am lazy lol

heady vigil
#

it has comments

fading plover
#

another consideration tho
yeah comments

heady vigil
#

explodes

#

I AM archiving a blog right now

fading plover
#

this was why i was thinking it might be out of scope
but blogger has comments

heady vigil
fading plover
#

comments are weird in the context of poki lapo

carmine quarry
heady vigil
#

the comments are not in scope for the CURRENT WORK I AM DOING
so instead I just took the ones that are important (that explain the old grammar mistakes) and commented them out

fading plover
heady vigil
#

💥

fading plover
#

y'all want me to scrounge around for random documents needing saving
there are probably a lot by pije

heady vigil
#

I don't speak Polish so I'm just half understanding

fading plover
#

it's a translation

#

of the tower of babel story

heady vigil
carmine quarry
#

and then theyre forever in the repo until someone comes along and downloads the text and uncomments the item

fading plover
#

this is presumably a staging ground for somebody to come by and archive better later?

#

nice i see the comment now

carmine quarry
#

p much

heady vigil
#

ye

carmine quarry
#

its good to have a bit of specialisation

#

some folks want to dump links some folks want to process them

heady vigil
#

sdgsdfsd

#

[jaki] ||we're like animals and decomposers
one dumps and another processes||

fading plover
heady vigil
#

holy shit

heady vigil
#

because I have seemingly this same text from 2011

#

wait I could actually just check Wikipedia. the two texts point there...

#

toki ponans

#

also @vestal herald Ke jumpscare

#

Kwamigami edited this article a whole lot huh

#

oh hi there

heady vigil
#

@carmine quarry what do we do for duplicates

carmine quarry
#

if there are slight differences in texts, use whichever version you think is more polished, and leave a comment about that

heady vigil
#

the intro has a sample just with the poem

#

that's cool I think

heady vigil
#

as always none of my pushes are good 😔

carmine quarry
#

@heady vigil you have persuaded me to work on better error messages, thank you

#

can you read these easily now

#

(youll have to pull / rebase again btw)

heady vigil
#

putting you to work

heady vigil
#

@carmine quarry can one run the validation code locally
. how

carmine quarry
#

you might also need node assuming you dont have it installed

heady vigil
#

yay

#

ok what now

nova gale
#

we cannot prove that the tree wouldn't lead to multiple roots if we have not scanned every node tho

#

also it could lead into circles
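[editor's sketch] The point about needing to scan every node can be illustrated roughly like this. The surrounding context isn't shown, so this assumes a structure where each node stores at most one parent; `find_roots_and_cycles` and the `parent` mapping are hypothetical names, not anything from the project.

```python
def find_roots_and_cycles(parent):
    """Walk upward from every node in a hypothetical child -> parent mapping.

    Returns (roots, in_cycle): the set of distinct roots reached, and the set
    of nodes that sit on a loop. Both answers are only reliable once every
    node has been used as a starting point.
    """
    roots = set()
    in_cycle = set()
    for start in parent:
        path = []          # nodes visited on this walk, in order
        visited = set()    # same nodes, for fast membership checks
        node = start
        while node is not None and node not in visited:
            path.append(node)
            visited.add(node)
            node = parent.get(node)
        if node is None:
            # The walk ran off the top: the last node visited is a root.
            roots.add(path[-1])
        else:
            # The walk came back to a node it had already seen: a circle.
            in_cycle.update(path[path.index(node):])
    return roots, in_cycle
```

More than one entry in `roots`, or anything at all in `in_cycle`, means the structure is not a single tree.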

carmine quarry
#

its a good joke but in reality when you get into decently good territory, "computers" stops being a unified skill and the people you go to become specialised

#

your class has someone whos best at sports but the world does not have a single best at sports category

nocturne compass
heady vigil
#

stop rationalising my awesome post

toxic thorn
#

a pakala kala asi li toki e toki mi

nova gale
#

yes,, of course if we made a specific problem we could get some results

nova gale
#

I think it would be quite interesting to sample text from around when ku and more importantly pu were published and make an analysis of what kinds of differences are there

nova gale
#

Some info on lukin (and nasin ilo)

  • lukin was split to nasin ilo (the backend and api; will be hosted on my homelab) and lukin (the frontend; static website, will probably be hosted on github pages)
  • As I have said, the API and backend are in working-ish alpha phase and also almost feature complete
  • We will have a working test frontend by EOD today
    • The uptime WILL be under 75%
    • It WILL change often
    • It WILL be buggy
  • We will enter beta probably before next year
heady vigil
#

!!!

#

that makes sense tho

#

so we have like

#

poki Lapo
ilo Lapo
lukin Lapo

#

dkhdkdhdkf

carmine quarry
#

awesome!

fading plover
#

hi, the frontmatter typing needs some help

#

i'm gonna make a thing to double check all of it really quick. jk, i'll use the validation script, but here's stuff i've spotted while transcribing a type:

  • the readme really needs "not required" labels for some of these
  • preprocessing is a string in most places, but it's a list of strings in tenpo suno mama meli by jan alonola
  • licenses probably is a list of discrete possible strings, no? not that it makes a difference, but for example, you have separate CC-... and CC ... licenses throughout all the docs
    • on that note, the null vs all rights reserved distinction needs an explanation in the readme probably
  • the notes and accessibility notes are null as opposed to missing in around 5 documents
  • sources are null rather than missing in the jan pi kamalawala doc

also

  • you should probably include non-collection/stray sources in your sources list. only omission i'm immediately aware of is mun.la but there are plenty of independent sites that i don't think are on this list
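[editor's sketch] The inconsistencies listed above (string vs list, null vs missing) are the kind of thing a small audit can surface. This is a minimal sketch, not the project's validation script: it assumes the frontmatter has already been parsed into plain dicts, and `audit_fields` plus the field keys used in the usage note are hypothetical.

```python
from collections import defaultdict


def shape_of(frontmatter, field):
    """Classify how a field appears in one document's frontmatter."""
    if field not in frontmatter:
        return "missing"
    value = frontmatter[field]
    if value is None:
        return "null"
    return type(value).__name__  # e.g. "str" or "list"


def audit_fields(documents, fields):
    """Report fields that appear in more than one shape across documents.

    `documents` maps a document name to its parsed frontmatter dict.
    The result maps field -> {shape: [document names]} for mixed fields only.
    """
    shapes = defaultdict(lambda: defaultdict(list))
    for name, frontmatter in documents.items():
        for field in fields:
            shapes[field][shape_of(frontmatter, field)].append(name)
    return {
        field: dict(by_shape)
        for field, by_shape in shapes.items()
        if len(by_shape) > 1
    }
```

Calling `audit_fields(docs, ["preprocessing", "notes", "sources"])` would list, per field, which documents use a string, a list, an explicit null, or leave the field out.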
#

zod has a nullish type??? as in something can be either null or undefined i guess??

carmine quarry
#

the readme really needs "not required" labels for some of these
we've been pretty loose on the schema, letting the transcribers set policy more or less. we could for instance force all fields to be filled or : nulled, but we haven't yet

#

preprocessing is a string in most places, but it's a list of strings in tenpo suno mama meli by jan alonola
needs fixing yea. we only recently got a schema ci

fading plover
#

benefit of null is that contributors need to explicitly acknowledge it
benefit of undefined is them not needing to worry about it
but mixing them is a bit of a nightmare
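[editor's sketch] Picking exactly one of the two conventions could look something like this. It assumes parsed frontmatter dicts; `normalise_optional`, the policy names, and the field keys are made up for illustration and are not part of the actual schema.

```python
def normalise_optional(frontmatter, optional_fields, policy="omit"):
    """Return a copy of the frontmatter that follows a single convention.

    policy="omit":          optional fields that are null are dropped entirely
    policy="explicit_null": optional fields that are missing are set to None
    """
    if policy not in ("omit", "explicit_null"):
        raise ValueError(f"unknown policy: {policy!r}")
    result = dict(frontmatter)
    for field in optional_fields:
        if policy == "omit":
            # Drop the key only when it is present and null.
            if field in result and result[field] is None:
                del result[field]
        else:
            # Add the key as null only when it is absent.
            result.setdefault(field, None)
    return result
```

Either policy avoids the mixed state; the choice only decides whether contributors have to acknowledge each optional field explicitly.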

carmine quarry
#

might not be called undefined i don't remember

#

licenses probably is a list of discrete possible strings, no? not that it makes a difference, but for example, you have separate CC-... and CC ... licenses throughout all the docs
would be good to force an spdx compatibility check
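[editor's sketch] A check along these lines could fold the "CC ..." and "CC-..." spellings together. This is only a sketch: `KNOWN_SPDX_IDS` is a small illustrative subset of the SPDX list rather than the full thing, `normalise_license` is a hypothetical helper, and how null should relate to "all rights reserved" is left open because the thread doesn't settle it.

```python
import re

# Illustrative subset only; a real check would use the full SPDX license list.
KNOWN_SPDX_IDS = {
    "CC0-1.0",
    "CC-BY-4.0",
    "CC-BY-SA-4.0",
    "CC-BY-NC-4.0",
    "CC-BY-NC-SA-4.0",
}

_BY_UPPERCASE = {spdx_id.upper(): spdx_id for spdx_id in KNOWN_SPDX_IDS}


def normalise_license(raw):
    """Map spellings like 'CC BY-SA 4.0' onto one canonical identifier.

    None is passed through untouched. Anything that does not match a known
    identifier raises ValueError instead of being silently accepted.
    """
    if raw is None:
        return None
    candidate = re.sub(r"\s+", "-", raw.strip()).upper()
    try:
        return _BY_UPPERCASE[candidate]
    except KeyError:
        raise ValueError(f"not a recognised license identifier: {raw!r}") from None
```

Raising on unknown values is a deliberate choice here so that a typo in a license field fails the check rather than creating a new variant.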

fading plover
#

i'm pretty sure it isn't valid yaml to declare a field but not define it

carmine quarry
#

tools seem to read it fine

fading plover
#

that's horrifying

carmine quarry
#

either way we can force schemas to reject it

fading plover
#

i would highly recommend enforcing exactly one of these options

carmine quarry
#

the only reason i haven't done so is because it requires touching hundreds of files

#

and that im not currently hyperfixated enough on lapo to do that

#

on that note, the null vs all rights reserved distinction needs an explanation in the readme probably
it does need to be, yep

#

you should probably include non-collection/stray sources in your sources list
one of the tools id like to have but never got around to

carmine quarry
fading plover
#

in the meantime, i git checkout -b validation-improvements because this is a pre-requisite to me adding poki lapo to ilo muni

#

unless there's active work on that i am missing

#

also what's the story with the outstanding 2 PRs @heady vigil

#

are they / have they been ready to go and they're waiting on review or

carmine quarry
#

but ofc we want to anyway

heady vigil
#

I mean like, they are in progress

#

and will be for a long time

#

because it's a big ass collection

fading plover
#

do you have to do them all up front in order to merge them

heady vigil
#

but apart from being incomplete, they're all good

#

idk

#

I am bad at computer (Git)

fading plover
#

it might be better for morale, expediency, and completeness to merge as you go
it's not as though you'll lose progress; poki lapo is the list of done things itself

carmine quarry
#

my recommendation is

  • write a collections file first, comment out all the future files that youll be making
  • uncomment lines in collections as you add new files
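[editor's sketch] The uncomment-as-you-go step could be automated roughly like this. The real collections file format isn't shown in the thread, so this assumes one entry per line with future entries written as `# - path`; `uncomment_finished` is a hypothetical helper.

```python
def uncomment_finished(collection_text, finished_paths):
    """Uncomment entries whose files now exist, leaving everything else alone.

    Assumed format (hypothetical): finished entries look like '  - songs/a.md'
    and planned ones look like '  # - songs/b.md'.
    """
    output = []
    for line in collection_text.splitlines():
        stripped = line.lstrip()
        indent = line[: len(line) - len(stripped)]
        if stripped.startswith("#"):
            candidate = stripped[1:].lstrip()  # the text after the '#'
            path = candidate[2:] if candidate.startswith("- ") else candidate
            if path.strip() in finished_paths:
                output.append(indent + candidate)
                continue
        output.append(line)
    return "\n".join(output)
```

Each small PR would then add a few files plus the matching uncommented lines, so the collection stays valid at every merge.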
heady vigil
#

can you do that like

#

how do you do merging little by little

fading plover
#

that's a solid plan