#Codex


grizzled granite
#

The only caveat being that this naming convention was also considered for weierstrass P (pee), due to the convention for ell

#

And I'm not sure how many characters are encompassed by that. Do you remember which other snowflake symbols like the weierstrass P there are?

heady fulcrum
#

personally I really like the current dif

#

it's really semantic

#

or at least it feels to me

grizzled granite
heady fulcrum
#

I was about to say that

midnight tangle
heady fulcrum
#

even in integrals it feels semantic to me, for some reason

#

as general infinitesimals

grizzled granite
#

That's why dee would be good. It's how it's actually pronounced typically for both derivatives and integrals

grizzled granite
heady fulcrum
#

yeah yeah

#

I know

#

I did my measure theory

grizzled granite
heady fulcrum
#

I understand the reasoning, but for some reason dee isn't exactly clicking over here

grizzled granite
#

If not then I'd prefer dif over del (which is not a good name for reasons previously discussed)

heady fulcrum
#

don't have a logical reason for it though

#

so

#

crazy idea

#

what if it was dd?

grizzled granite
heady fulcrum
#

I don't think so?

#

I think the blackboard aliases are just for uppercase

grizzled granite
#

I still think we should reserve the possibility of using them for lowercase symbols

#

In the future

heady fulcrum
#

to be fair, I often personally map doubled lowercase chars to uppercase calligraphic letters

#

it's kinda handy

#

hmm

grizzled granite
#

I guess oo is infinity, so that ship has already sailed

heady fulcrum
#

Maybe part of the reason for why dee doesn't fit for me is that that's not how I (mentally) pronounce it (due to language reasons)

grizzled granite
heady fulcrum
#

yeah

#

but for some reason this is sticking out more in my mind

#

trying to figure out why

#

also, incidentally

#

looks like dif doesn't show up in the symbol search despite being mentioned in the paragraph preceding it...

grizzled granite
#

Because it's not a symbol

heady fulcrum
#

yeah, I thought so

#

but then we should remove the paragraph before the search

#

"The d in an integral's dx can be written as $dif x$..."

#

Anyways, back to dee

#

Was this mentioned already somewhere (e.g. forum)?

grizzled granite
#

No

#

Not to my knowledge at least

#

Anyway dif isn't terrible. (Unlike diff)

heady fulcrum
#

yeah diff was bad

midnight tangle
grizzled granite
#

Maybe it should be included in the symbol list even if it's not technically a symbol

heady fulcrum
#

but we absolutely can't have the paragraph there and not have the symbol in the search, I'd say

lapis moth
past python
#

I intentionally didn't deprecate diff right away in https://github.com/typst/codex/pull/44 because there was some pushback and I wanted to at least have one more discussion about it. There's quite a bit of discussion about dif above here, but not a lot about diff. I really can't judge this as I'm not a mathematician, but is everybody on the same page that diff -> partial is the right way?

flat pagoda
#

Not only does it align with LaTeX, which reduces friction for people who are transferring to Typst, but the LaTeX choice of symbol name is also not terrible.

grizzled granite
#

Regarding https://github.com/typst/typst/issues/5695, how do you feel about ang? It's short like abs, but it neither conflicts with angle nor is overly semantic (which is a problem when it has multiple semantic meanings)

#

we could potentially also rename all the angle brackets to be under ang as well, since they're actually fundamentally different from the angle symbols

#

The html names are lang and rang by the way

midnight tangle
#

I feel like this introduces a subtle, arbitrary, difference between two names, which is of the same kind as dif/diff (and we are trying to change that).
But at the same time, it works, and solves the issue. So it's probably fine.
I still prefer angles though.

tall quail
heady fulcrum
#

In my personal code I use #let inner(u, v) = $lr(angle.l #u, #v angle.r)$

#

now, if we add this, whichever the name, we need to make sure that the user can add commas inside of the function call and that it will be processed properly

#

(which my inner helper fails at)

#

maybe it's just a matter of taking in varargs

#

as for the name, I personally lean more towards something like inner since it's very semantic. But maybe the bra-ket people won't appreciate that name as much (though, to be fair, I would imagine this function by itself wouldn't be the most useful to them, since it's always just the angle brackets...)

#

there's also the matter that sometimes angle brackets are used beyond inner products, e.g., I've seen people use would-be inner(v) to denote the 'sequence of vs' (where one would already have v_1, v_2, ....). That said, the usage as an inner product is by far the canonical one, as far as I am aware, and other usages (e.g. the sequence one) are kind of on-the-fly and thus still feel like they fit in with the name inner, in my opinion

heady fulcrum
#

We could also rename the angle symbol, since it's much more uncommon. So then angle could be the bracket function, angle.l and angle.r as they are today, and something like angle.geom (very tentative?) the current angle symbol.

heady fulcrum
#

wait, really??????

#

No you can't?

#

uhm, wait

#

Oh, I think I see it

#

this works, and x is then (1,) and y is (2,)

#

yeah, so it might be just lr then

tall quail
#

yep

heady fulcrum
#

but I mean, it could be just a case of the function taking in varargs

#

#let angle(..args) = $lr(angle.l #args.join($, $) angle.r)$, or something like that

#

(maybe I'm missing a method call in the usage of the args)

#

#let angle(..args) = $lr(angle.l #args.pos().join($, $) angle.r)$

#

there we go, fixed

tall quail
#

I guess that could work... Is that how lr does it?

#

it does feel like a hack

#

Ideally it would just be easy to input the characters directly, then you get the lr behavior for free:

#

?r

$ ⟨ x/y ⟩ $
gaunt arch BOT
tall quail
#

The best solution could be to find shorthands for these characters. This would solve both the comma issue and the naming issue

#

but I can't find good shorthand ideas

heady fulcrum
#

I kinda like the idea of shorthands for angle brackets

#

but yeah, it's not obvious what would be a good shorthand

grizzled granite
tall quail
#

that makes sense semantically

#

the angle bracket function could do the same thing as binom I guess

#

maybe a better name than angles would be angled?

grizzled granite
#

I'm not a fan of names longer than 4 letters for this (like in norm), or at worst 5.

#

Either choose a name that people will actually use when it's used repeatedly (like ang), or drop all pretense

#

That being said, if you put a gun to my head then angled is preferable to the other options mentioned

lapis moth
#

I'm unequivocally in favor of ang, since that's literally what I called it in my personal math package

#

Well, I guess there's a bit of a difference between "I like using it" and "I think this would be good for everyone". Maybe I'm not 100% on that second one🤔

steel chasm
#

#discussions message

#

How can I add slash.circle? Apparently there is backslash.circle, but nothing for slash. That's odd, as the latter is more popular.

grizzled granite
grizzled granite
#

?r $ \u{2298} \u{29b8} $

gaunt arch BOT
grizzled granite
#

I don't know if they're more similar in other fonts, but that's the closest thing that exists

#

Yeah in stix two math they appear to be mirror images @steel chasm

#

You can create a feature request on the codex repository. It may possibly already be covered by one of @midnight tangle 's proposals though

midnight tangle
#

It's not explicitly covered as a proposal in the form of an open issue, but the specific character is probably somewhere in my drafts. Regardless, if you want the symbol to be added in the future version of Typst instead of keeping it on the side to be added as part of a later work on circled symbols, you should open an issue or PR to typst/codex.

storm whale
#

Why do alef, bet, dalet and gimel map to the Hebrew letter and not the mathematical one directly?

#

Like these are specifically for use as 'math symbols', and the whole Hebrew alphabet isn't mapped. I think then the unstyled variant in math should just be the mathematical glyph, not the Hebrew one

grizzled granite
#

The idea is that they then work in regular text too

midnight tangle
storm whale
#

As in, the upright math Greek alphabet is just the Greek alphabet in Unicode (use the same codepoints), unlike these four hebrew characters

#

Slightly unrelated, what is the letter shin used for? It is the odd one out of the Hebrew alphabet in codex (it has no math glyph)

#

I've traced back when it was added to before 0.1.0, where it appeared along with all the other symbols first added

flat pagoda
heady fulcrum
#

Thinking of the dirac comb, specifically

#

There's some disagreement on what letter is actually used for it. Most people say it's a cyrillic letter, but I've seen (and was originally taught) that shin is often used too

grizzled granite
heady fulcrum
#

Yeah, I agree that it's more usually represented by sha. But if I'm not mistaken there is a nontrivial amount of people that use shin for it

#

they are pretty similar

#

that's the only explanation I have for why shin is included in Typst 🤷

grizzled granite
#

That makes no sense though, since sha is the only one that actually looks like a dirac comb 😅

heady fulcrum
#

You do have a point... but when I first met the Dirac comb, it actually was introduced to me as being the letter shin

#

Probably someone along the way wanted another hebrew letter in math 🙃

#

anyways, I don't have any strong feelings about it

#

btw, do we have sha in codex?

#

doesn't seem like it

grizzled granite
#

Yeah, you're probably right about the shin explanation

heady fulcrum
#

also, tbf handwritten shin is a bit different from typeset shin (as most letters are)

#

when handwritten it's almost exactly equal to sha, it's just the middle stick that sometimes gets slightly slanted

#

so maybe that's why

grizzled granite
heady fulcrum
#

agreed

#

just sha should do fine, right?

#

or maybe it should be Sha? 🤔

grizzled granite
#

I don't know if everyone uses lowercase or uppercase

heady fulcrum
#

sure

grizzled granite
#

Unicode never ceases to surprise me

midnight tangle
#

At least I am aware of its existence

#

But it is essentially a Cyrillic sha, except for use in math with slightly different semantics I guess

grizzled granite
midnight tangle
#

Ok then maybe I should stop saying stuff without checking first

grizzled granite
#

I think we're all guilty of that

storm whale
#

If shin isn't really used, I think it would be fine to map the other letters to their math glyphs instead then. On the other hand, if it were used frequently, I think it'd then be better not to so that it's consistent atm (and possibly in the future if more Hebrew letters are added)

lapis moth
#

Afaik their names are phonetic, so especially for those like Be, where they're the same as the latin ones, we'd probably have to group them in a cyrillic module or sth

storm whale
grizzled granite
storm whale
grizzled granite
#

I think we should avoid adding all cyrillic and hebrew letters unless they have a specific use

past python
#

can someone give me a TLDR why we want to replace alef, bet, etc. with the mathematical codepoints instead of keeping the hebrew ones and letting math layout take care of the auto-conversion like we do in other cases?

grizzled granite
#

I don't necessarily see why we should do that if they're converted anyway

#

Neither New Computer Modern Math, Stix Two Math, nor Noto Sans Math contain shin, so I think removing it is a good idea though @storm whale @heady fulcrum

#

unicode-math also doesn't contain it

#

Stix Two Math and Noto Sans Math do contain sha (both lower and upper case)

#

surprisingly New Computer Modern Math does not contain sha

#

May want to suggest that to the maintainer

#

@vapid osprey you've been in contact with him right?

vapid osprey
#

i have but tbh i don't really like writing mails to him 😅

grizzled granite
#

I wish they had an issue tracker

vapid osprey
#

sending patches is even worse since he doesn't really use git

midnight tangle
lapis moth
#

fair enough

heady fulcrum
grizzled granite
#

Probably a good idea yes

midnight tangle
storm whale
# past python can someone give me a TLDR why we want to replace alef, bet, etc. with the mathe...

It came up as I was doing the changes on the Typst side for that PR and saw that the Hebrew mappings were there, but they don't really fit in with any of the styles (and would be odd to not move to Codex along with the rest I think). Those four Hebrew math letters aren't really "styles" of their original Hebrew letters, but different glyphs with different meaning entirely. Comparing it to say Greek, the upright letters (which are actually used in math) are just the Greek alphabet in Unicode (uses the same codepoints). This is unlike these four Hebrew characters and their math codepoint which are the same letter (and the same style), just encoded differently because they have different meaning. There's also maybe the point that the math ones are LTR characters, but the Hebrew letters aren't, so mapping them would maybe be a bit weird.
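
For anyone wanting to check the codepoint claims above, here is a minimal sketch using Python's standard `unicodedata` module. The pairing of each Hebrew letter with its letterlike symbol is taken from the Unicode names; the dictionary keys simply reuse the codex names mentioned in this thread and are only labels.

```python
import unicodedata

# Hebrew letters paired with the letterlike "math" symbols discussed above.
# Keys are just labels borrowed from the codex names in this thread.
PAIRS = {
    "alef": ("\u05D0", "\u2135"),
    "bet": ("\u05D1", "\u2136"),
    "gimel": ("\u05D2", "\u2137"),
    "dalet": ("\u05D3", "\u2138"),
}

def describe(ch):
    """Return (Unicode name, bidirectional class) for a single character."""
    return unicodedata.name(ch), unicodedata.bidirectional(ch)

for key, (letter, symbol) in PAIRS.items():
    print(key, describe(letter), describe(symbol))
    # Each symbol has a compatibility decomposition to the plain letter.
    same = unicodedata.normalize("NFKC", symbol) == letter
    print("  NFKC(symbol) == letter:", same)
```

The bidi classes illustrate the LTR/RTL point: the letters are right-to-left, while the symbols are left-to-right.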

grizzled granite
#

I'm not sure how that's significantly different from greek

#

the italic greek symbols (which are the ones you'll see unless you call upright) are mathematical symbols

#

The fact that they map to the text symbols means you can use sym.gimel in text and get the correct letter instead of the mathematical symbol (which is unlikely to be present in a text font)

storm whale
grizzled granite
#

In both cases therefore the Greek and Hebrew symbols point to the text versions, which get mapped to the math versions in math (though for Greek that mapping is trivial for upright)

#

It seems perfectly consistent to me

storm whale
#

Though you actually can't get the Hebrew letter back in math, as it is not treated as a style

#

so that's kinda where I am coming from as well

grizzled granite
#

Iirc

midnight tangle
#

I don't think this was mentioned earlier, but unicode-math defines \aleph, \beth, \gimel, and \daleth, which are mapped to the symbols (not the Hebrew letters).

storm whale
storm whale
#

Do the lowercase Greek map to the italic math symbol?

midnight tangle
#

There is no, e.g., \alpha in unicode-math apparently

#

Instead, there is a name for each style variant

grizzled granite
#

You can do it for Greek because they're the same codepoints

#

Upright that is

storm whale
#

Why not? So with the Hebrew then since they're not the same codepoints what's the point in even mapping to the non mathematical ones then?

midnight tangle
#

Two pull requests were opened to Codex.
I approved the first one because there seems to be an actual use for the symbols and the names don't conflict with anything. https://github.com/typst/codex/pull/57
Regarding the second one, I'm not sure how many currency symbols we should support, but cedi is a pretty unique name as well (Wikipedia redirects "Cedi" to the corresponding page, with no disambiguation) so it's probably fine. https://github.com/typst/codex/pull/58

GitHub

adds U+2322 ⌢ (frown) and U+2323 ⌣ (smile). it is used in certain cohomology theories for cap and cup products. in latex, it is represented by \frown and \smile.

GitHub

The ¢ "cent" symbol is used when referring to fractional dollars in US currency. The ₵ "cedi" (Ghanaian Cedi) is the legal tender of the Republic of Ghana.

storm whale
#

I mean we already have a fair few currency symbols and they are all quite distinct names (as in, can't really see them ever overlapping with something in the future), so I'd agree it is fine (I've approved it). With the other one, I think that's also fine, but might be good to leave it a little longer to give others a chance to comment (so I won't approve it atm)

midnight tangle
#

Given the PR was made by an "outsider", and there is some work to do to determine which currency symbols should be included, and under which name, I would say it's better to do it in a separate PR

#

I would feel bad asking them to do this work when they just wanted to add two symbols

grizzled granite
#

I'm actually working through the list of currencies right now @midnight tangle

midnight tangle
#

Then you can probably make a PR of your own if that's okay for you

grizzled granite
#

I think some discussion is required, but eventually sure

#

Actually I also realized we should deprecate a currency symbol we currently have

grizzled granite
#

the symbol was never even officially adopted

grizzled granite
#

@midnight tangle do you know what the deal is with all the "fullwidth" symbols?

midnight tangle
#

Without context I can't say for sure, but that sounds like the versions of latin notations meant to be used in CJK text, displayed in a square the same size as other CJK characters

grizzled granite
midnight tangle
#

I don't think they should be added to Codex. Most people will misuse them, and I would assume those who need them already have a way to insert them because they need to do it in other software

grizzled granite
#

Armenian Dram Sign - dram (Currency of Armenia)
Afghani Sign - afghani (Currency of Afghanistan)
Bengali Rupee Sign - taka (Currency of Bangladesh)
Tamil Rupee Sign - rupee.tamil (Alternative symbol for Indian rupee in the Tamil alphabet)
Thai Currency Symbol Baht - baht (Currency of Thailand)
Khmer Currency Symbol Riel - riel (Currency of Cambodia)
Colon Sign - colon.currency (Currency of Costa Rica)
Naira Sign - naira (Currency of Nigeria)
Rupee sign - rupee (Currency of Mauritius, Nepal, Pakistan, Seychelles, Sri Lanka)
New Sheqel Sign - shekel (Currency of Israel)
Dong Sign - dong (Currency of Vietnam)
Kip Sign - kip (Currency of Laos)
Tugrik sign - tugrik (Currency of Mongolia)
Guarani Sign - guarani (Currency of Paraguay)
Hryvnia Sign - hryvnia (Currency of Ukraine)
Tenge Sign - tenge (Currency of Kazakhstan)
Manat Sign - manat (Currency of Azerbaijan)
Lari Sign - lari (Currency of Georgia)
Wancho Ngun Sign - rupee.wancho (Alternative symbol for Indian rupee in the Wancho alphabet)

No longer in use (or have never been used):
Bengali Rupee Mark
Bengali Ganda Mark
Euro-Currency Sign
Cruzeiro Sign
Lira Sign
Mill Sign
Peseta sign
Drachma Sign
German Penny Sign
Austral Sign
Livre Tournois
Spesmilo Sign
Nordic Mark Sign
Rial Sign
Tamil Sign Kaacu
Tamil Sign Panam
Tamil Sign Pon
Tamil Sign Varaakan

Proposed deprecated:
Gujarati Rupee Sign, see https://unicode.org/L2/L2009/09331-gujarati-rupee-sign-deprec.pdf and https://www.unicode.org/charts/nameslist/n_0A80.html

I am confuse:
Nko Dorome Sign
Nko Taman Sign
North Indic Rupee Mark
Small Dollar Sign

Currently in typst, but should be renamed:
Indian Rupee Sign - rupee.indian (Rupees are used in several countries, and this symbol is only used in India)

Currently included in Typst, but should be removed:
French Franc Sign (Apparently proposed but never actually adopted or used)

#

I think that should be everything @midnight tangle

#

I guess there is some argument to be made for having obsolete currencies available at some point, but that's less pressing

#

The Wancho language has ~50 000 speakers, and the written language is only 15 years old, but if it's notable enough to be added to unicode I don't see why typst couldn't contribute to keeping endangered languages alive

#

I was most unsure about the naming of colon, since there is an obvious conflict. I ended up with colon.currency

midnight tangle
grizzled granite
midnight tangle
#

Unless we decide to put all those currencies under a currency submodule, but that would make them all longer to type so it's probably not a great idea

midnight tangle
grizzled granite
#

Okay I found the proposal for the Nko symbols I was confused about

#

They are described as letters for the "dorome" and the "taman", but I can't find currencies under that name

grizzled granite
#

Codex

#

How come this is just a comment? And not a line with @ at the beginning like the other deprecations?

midnight tangle
#

Because it is a variant

#

We only support deprecating symbols for now

grizzled granite
#

I created a PR for the currency symbols @midnight tangle

#

hopefully I did it correctly

#

I've only created one once before

midnight tangle
#

It looks right at first glance. I won't be able to properly review it for some time, but a pull request needs two approvals to be merged so it is fine

grizzled granite
#

no hurry

midnight tangle
#

@grizzled granite regarding your comment on the currency symbols PR ("I don't think there are any plans to have more categories than general symbols and emojis."), we have the ability as of now to create sym and emoji submodules, and are planning to do it for gender symbols (https://github.com/typst/codex/pull/2). This PR is the oldest still waiting for approval btw

grizzled granite
midnight tangle
#

I'm not sure what you mean by that?

#

What is serving what purpose?

grizzled granite
#

Unless I misunderstood

#

Oh nevermind, I mixed two prs

#

Sorry

#

So what exactly is the reason that they are a submodule? @midnight tangle

midnight tangle
#

Because it does not really make sense for gender alone to be a valid symbol. What would it be?

#

Essentially, unless there is a generic symbol for the concept of "gender", not making a submodule and instead having gender be a symbol by itself would have required choosing a default gender, which I kinda don't want to do

grizzled granite
#

Okay, I understand

grizzled granite
midnight tangle
#

Because then you can do #import sym.gender: x

#

In the case of currencies, e.g., #import sym.currency: euro

grizzled granite
#

Okay. That does make sense, though I suspect non-programmers will not do that

#

Currency is a very long word

midnight tangle
#

Yes this is a valid concern, but a separate issue. This could be explained somewhere in the doc, but it is probably hard to explain properly (from the user's pov, the difference between submodules and symbols is barely noticeable)

grizzled granite
#

Well it's not really separate, because it effectively means that anyone not using import will have to type a much longer name

midnight tangle
#

To be clear, I meant #1277628305142452306 message as a response to #1277628305142452306 message, not "currency is a very long word"

grizzled granite
#

With the exception of colon, the names are also weird enough that they shouldn't be interfering with anything.

midnight tangle
#

And also, the day they do, we can probably figure out a (possibly breaking) solution; it's not like those currency symbols are gonna become the most used symbols in Typst

grizzled granite
flat pagoda
grizzled granite
#

I've been incredibly swamped with work, so I haven't had the overhead to look into the codex PRs. Now I'm done with teaching, so hopefully I can find some time

grizzled granite
# flat pagoda What? 😅

It's a long word to add before every single currency. If I want to use the euro symbol on text I'd have to type #sym.currency.euro

#

How do you guys feel about maintaining more information about each symbol we add? I feel like there's a lot of contextual information that could be useful to have available in the documentation. This could include references to other relevant symbols.

flat pagoda
grizzled granite
#

We should strive to make names actually usable by default when possible

flat pagoda
#

Of course, but a part of usability is being intuitive. For example, the Unix habit of shortening program names by just removing all vowels is rather unintuitive, until you figure out the pattern.

grizzled granite
#

I don't think Unix naming is something we should look to for inspiration

flat pagoda
#

I agree. The most intuitive thing to do is to just call things by their real names, without any transformations applied. Unless of course a name is like 20 symbols long. That would be annoying (but only without autocompletion).

midnight tangle
grizzled granite
midnight tangle
#

You mean in the user doc?

grizzled granite
grizzled granite
midnight tangle
#

Like adding information about what symbols are, how they are used?

grizzled granite
#

Yes

midnight tangle
#

That could make sense but it would be quite a lot of work

grizzled granite
#

Indeed. I'm talking long term

midnight tangle
#

I guess this information could be sourced from Unicode

grizzled granite
#

Some, but it would be hidden in thousands of proposal documents

midnight tangle
#

Although doing that automatically could be a bit hard, because the information that is easy to get programmatically is not always the most relevant

grizzled granite
#

It doesn't have to be that detailed though, at least not as a start.

midnight tangle
#

I wish Unicode did that themselves honestly: a place where each codepoint links to all relevant information from the spec, charts, and proposal docs

grizzled granite
#

Maybe there is an internal database for members, but what I've found has just been all over the place

lapis moth
grizzled granite
#

It would aid discoverability too, since it would improve the symbol search

flat pagoda
proven birch
midnight tangle
#

Well, this is mainly just the technical information (apart from the Wikipedia integration). I would like links to relevant parts of the spec, not just data from the UCD. But sadly I don't think this is easy to automate, it would have to be done manually

grizzled granite
#

The Unicode convention seems very strange. The top 6 are numbered 1-3 on the left and 4-6 on the right, then the bottom two are 7 and 8

#

Is it because regular braille only has 6 dots?

#

Answer: yes

#

I guess it would be done similar to the math styling PR?

midnight tangle
#

If we decide to add them (which I'm unsure there is a need for, as people who use braille already use different input methods and probably don't want to display braille letters), the first thing I think about is that this is where the modifier system really shines! We could have braille, with .tl, .tr, .l, .r, .bl, and .br modifiers that could be combined in any way.

#

This would probably have to be automated in the build script though
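
A rough sketch of what such build-script automation might look like, in Python. The modifier names come from the message above, but their mapping to dot numbers (`l` as dot 2, `r` as dot 5, and so on) is an assumption for illustration, as is the `braille` helper itself. The bit layout relies on the Unicode Braille Patterns block, where dot n is stored in bit n-1 above U+2800.

```python
# Hypothetical modifier names mapped to the standard braille dot numbers:
# dots 1-3 run down the left column, dots 4-6 down the right column.
MODIFIER_TO_DOT = {"tl": 1, "l": 2, "bl": 3, "tr": 4, "r": 5, "br": 6}

BRAILLE_BASE = 0x2800  # U+2800 BRAILLE PATTERN BLANK

def braille(*modifiers):
    """Return the braille pattern character with the given dots raised."""
    bits = 0
    for mod in modifiers:
        dot = MODIFIER_TO_DOT[mod]
        bits |= 1 << (dot - 1)  # dot n lives in bit n-1
    return chr(BRAILLE_BASE + bits)

print(braille("tl"))             # dot 1 only
print(braille("tl", "l", "tr"))  # dots 1, 2 and 4
```

Because the dots are combined with a bitwise OR, the order of modifiers does not matter, which matches the idea that they "could be combined in any way".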

grizzled granite
#

The naming convention for the dots here seems to be a standard though

midnight tangle
#

Then it's kinda unfortunate that we can't use single digits as identifiers

grizzled granite
#

I guess not adding them would be the simplest option, but it feels like something that should exist at some point

midnight tangle
grizzled granite
#

We've probably discussed this before, but I'm not a big fan of the naming of dash()

flat pagoda
tall quail
midnight tangle
#

overline is already used for something else

grizzled granite
grizzled granite
#

I think that bar would be a better name than dash. It's short, descriptive, and should be intuitive for latex users.

#

The only pitfall is that it would not be the same as \bar in latex, since that corresponds to typst's macron

#

I think most people prefer the extensible variant though

#

So as I see it, there are two options:

  1. Rename dash to bar
  2. Rename macron to bar and find another better name for dash
#

I've never heard of dashes referring to anything other than hyphens, en- and em-dashes

midnight tangle
midnight tangle
grizzled granite
midnight tangle
#

Then I think it is

#

But I have not checked

#

Also, some parameters (such as line thickness) may be obtained from font metrics

grizzled granite
#

Has there been any discussion about how to deal with gender and skin color for human emojis?

midnight tangle
#

Not that I know of. Regardless, this would require supporting arbitrary strings as symbols.

#

I am waiting for the modifier resolving PR to be merged to implement that

#

(to prevent conflicts)

grizzled granite
midnight tangle
#

Indeed

lapis moth
grizzled granite
#

does u+2215 get turned into u+002f ? The former is a division slash

midnight tangle
#

slash.circle is addressed in Section 3.4 of the document. I did not add it for now because it would be nice to resolve #34 first

GitHub

Context For circled symbols, we currently use the .circle modifier. This creates a name conflict in the case of U+2A22 ⨢ "Plus sign with small circle above" and U+2295 ⊕ "Circled plu...

grizzled granite
midnight tangle
#

It's in Section 3.13 of the document, along with a gazillion other slashes apparently

grizzled granite
midnight tangle
#

I think it might cause some fonts to display the surrounding digits in sup-/sub-scripts

grizzled granite
midnight tangle
#

Oh yeah indeed

#

Then I think it's probably just a slash specifically for use as a division symbol, the same way "MINUS SIGN" is not the same as "HYPHEN MINUS"

#

But that would need to be confirmed

#

So "just a semantic thing" as you said earlier

#

ofc fonts may display it differently

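
One way to check the "semantic only" reading is to compare the Unicode names and general categories of the characters involved. This is a small sketch with Python's standard `unicodedata` module; the `info` helper is just an illustrative name. It says nothing about whether Typst converts one character to the other, or about how a given font draws them.

```python
import unicodedata

def info(ch):
    """Return (codepoint, Unicode name, general category) for one character."""
    return f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch)

# Plain slash vs. division slash, and hyphen-minus vs. minus sign.
for ch in ["/", "\u2215", "-", "\u2212"]:
    print(info(ch))
```

In both pairs the ASCII character is classified as punctuation (`Po`, `Pd`) and the dedicated one as a math symbol (`Sm`), which is consistent with the MINUS SIGN analogy.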
grizzled granite
#

How do you feel about switching to circled, but also adding "o" as an alias for that modifier?

#

These are the kind of symbols that are used kind of often

midnight tangle
#

LaTeX kinda uses that, and .o does not already exist as a modifier AFAIK so it would not conflict with anything I believe

#

To be clear, in my mind, .o would be to access a circled version of a symbol, not any version that has a circle somewhere

grizzled granite
grizzled granite
#

I guess it wouldn't even need to be an alias. It could be the canonical way

midnight tangle
#

I was gonna say it would be good to keep .circled for consistency with .triangled and the other ones, but actually keeping .o only makes sense, the same way we already have .t, .r, etc.

grizzled granite
#

It was mentioned that we have two different spellings for Hebrew letters. Is there any particular reason for that?

#

I don't see that as necessary to be honest. We should just pick a spelling and stick with it

grizzled granite
#

alef seems to be the more modern romanization

#

Though that would be inconsistent with Greek, since if one follows romanization it would be alfa instead of alpha

#

So deprecate alef etc?

midnight tangle
#

I don't think it's problematic to have multiple names for a single symbol. Both spellings exist, why only accept one of them? Let's accept whatever reasonable input users may try

#

Something I loved with Typst symbols when I started using Typst was that they were very easy to predict, and when I was using Typst for the first months, I would often be able to just guess the symbols I wanted to insert. Removing some common alternate spellings goes against that

grizzled granite
#

where one of the spellings is universal (aleph)

midnight tangle
#

Do you have sources for only aleph being used in math?

grizzled granite
#

The only reason why we have both is that they were added before we had the ability to deprecate

lapis moth
#

I'm biased as the author of the alias PR, but I think having two spellings is totally fine as long as neither one conflicts with something else

grizzled granite
#

that's why we only have 4 (well, 5 for some weird reason) hebrew letters

midnight tangle
grizzled granite
#

I'm opposed to having different spellings unless there is a very clear reason for it

grizzled granite
midnight tangle
#

I just wanted to be convinced that it was more than what you and other people in your field are used to

#

Still, I don't see a major issue with having alternate spellings

grizzled granite
#

I don't use hebrew letters in my field

grizzled granite
midnight tangle
#

Well... I would not see an issue with adding all reasonable transliterations here as well

grizzled granite
#

tugrik, tughrik, tugrug, togrog, tögrög

midnight tangle
grizzled granite
#

Different names when they're fundamentally different? Sure

#

But spelling differences are unnecessary

grizzled granite
#

@lapis moth what do you think about the semi-alternate proposal to https://github.com/typst/codex/issues/34 ? Using "triangled" and "squared" like you proposed, but the simpler modifier "o" for circled. Some circled symbols are extremely common, and there's already precedent for using "o" to refer to circular things. See "oo" for infinity

GitHub

Context For circled symbols, we currently use the .circle modifier. This creates a name conflict in the case of U+2A22 ⨢ "Plus sign with small circle above" and U+2295 ⊕ "Circled plu...

#

plus, it's cute!

midnight tangle
#

Just as a note, although of course everyone's opinion is welcome, this proposal was extracted from the document so it was not initiated by @lapis moth

grizzled granite
#

by the way, I discovered some weirdness while investigating this

midnight tangle
grizzled granite
#

like how "ast.circle" really should've been "ast.op.circle" and there was no corresponding "convolve.circle"

#

there's a couple of instances of this

midnight tangle
grizzled granite
#

also "ast.small" is in the small form variants block, which are compatibility symbols for chinese

midnight tangle
#

Maybe we should go over all the existing symbols and check for weirdness like that at some point, but that will take a lot of time

grizzled granite
#

"plus.small" is also a compatibility symbol

#

and "lt.small"

#

and "gt.small"

#

and "eq.small"

#

those are all of them
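
The compatibility status of these .small variants can be checked with Python's standard `unicodedata` module. This is a minimal sketch: the pairing of codex names with code points in the Small Form Variants block (U+FE50..U+FE6F) is assumed from the discussion above, not read from the codex source.

```python
import unicodedata

# Assumed pairing of codex names with Small Form Variants code points.
SMALL_FORMS = {
    "ast.small": "\uFE61",
    "plus.small": "\uFE62",
    "lt.small": "\uFE64",
    "gt.small": "\uFE65",
    "eq.small": "\uFE66",
}

def compat_base(ch):
    """Return the character that NFKC normalization maps `ch` to."""
    return unicodedata.normalize("NFKC", ch)

def is_small_form(ch):
    """True if `ch` carries a <small> compatibility decomposition."""
    return unicodedata.decomposition(ch).startswith("<small>")

for name, ch in SMALL_FORMS.items():
    print(name, f"U+{ord(ch):04X}", "->", compat_base(ch), is_small_form(ch))
```

U+220A (small element of) has no such decomposition, which is consistent with in.small not being on the list.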

midnight tangle
#

so not in.small?

grizzled granite
#

no

midnight tangle
#

weird

grizzled granite
#

Should we have a deprecation period, or just remove them?

#

I don't think there's a formal deprecation mechanism for modifiers

midnight tangle
#

No there's not

grizzled granite
grizzled granite
#

can a modifier be repeated more than once?

#

it's really weird that unicode has ⦼ but not the uncircled one

midnight tangle
#

I.e., not ordered, and repeats are ignored (maybe warned or even cause an error in recent versions)
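
A minimal sketch of what "not ordered, and repeats are ignored" means for lookup, assuming modifiers are treated as a set. The variant table and the function names are hypothetical, and the sketch only does exact matching; it silently drops repeats rather than warning or raising an error.

```python
def parse_modifiers(dotted):
    """Split 'r.long' into an unordered, duplicate-free set of modifiers."""
    return frozenset(part for part in dotted.split(".") if part)

# Hypothetical variant table for a single symbol, keyed by modifier set.
ARROW = {
    parse_modifiers(""): "→",
    parse_modifiers("l"): "←",
    parse_modifiers("r.long"): "⟶",
}

def lookup(variants, dotted):
    """Return the variant whose modifier set equals the requested one, else None."""
    return variants.get(parse_modifiers(dotted))

print(lookup(ARROW, "r.long"))    # same result as the next two lines
print(lookup(ARROW, "long.r"))    # order does not matter
print(lookup(ARROW, "r.long.r"))  # the repeated modifier is ignored
```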

#

But at some point I was seriously considering making my own similar website, because most string inspector/character lists online do not show all the information I want AND relevant links to Unicode charts, etc.
Then I realized I do not have the time to do that...

grizzled granite
#

I created a PR. I opted not to add any additional symbols right now, for the reasons mentioned in the issue

#

In particular, many of the circled symbols are variants of symbols we do not currently have at all. Such as the division slash and bullet operator

#

Then I realized I do not have the time to do that...
Story of my life

grizzled granite
#

Is there a motivation behind some symbols being written with a hex code instead of the actual symbol?

#

in sym.txt

lapis moth
storm whale
grizzled granite
storm whale
grizzled granite
#

The only deprecation mechanism for modifiers right now is a comment

#

Would it be hard to introduce something like the current thing we can do at the top level?
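
One way such a mechanism could look, sketched in Python rather than Rust: each variant carries an optional deprecation note, and lookup emits a warning when the note is present. The table, the note text and the plus.circle to plus.o mapping are hypothetical illustrations, not the codex implementation.

```python
import warnings

# Hypothetical variants of "plus": modifier set -> (glyph, deprecation note or None).
PLUS_VARIANTS = {
    frozenset(): ("+", None),
    frozenset({"o"}): ("⊕", None),
    frozenset({"circle"}): ("⊕", "use `plus.o` instead"),
}

def lookup(dotted):
    """Resolve a dotted modifier string, warning if the variant is deprecated."""
    key = frozenset(part for part in dotted.split(".") if part)
    glyph, note = PLUS_VARIANTS[key]
    if note is not None:
        warnings.warn(
            f"modifiers {sorted(key)} are deprecated: {note}",
            DeprecationWarning,
            stacklevel=2,
        )
    return glyph

print(lookup("o"))
```

The point of the sketch is that the note lives next to the variant as data, so tooling can surface it, instead of sitting in a comment.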

#

I'm clueless about rust

#

Today I learned that North Korea once requested that unicode add separate codepoints for korean letters that are used to spell out the names of kim il sung and kim jong il

#

what a loss

storm whale
grizzled granite
#

oh I thought you were the one that added it

storm whale
#

No lol, I think it was @midnight tangle

grizzled granite
storm whale
grizzled granite
# grizzled granite https://www.unicode.org/L2/L2001/01349-N2374-DPRK-AddSymbols.pdf

for compatibility with this standard https://en.wikipedia.org/wiki/KPS_9566 which they apparently keep updating

KPS 9566 ("DPRK Standard Korean Graphic Character Set for Information Interchange") is a North Korean standard specifying a character encoding for the Chosŏn'gŭl (Hangul) writing system used for the Korean language. The edition of 1997 specified an ISO 2022-compliant 94×94 two-byte coded character set. Subsequent editions have added additiona...

#

they added kim jong un's letters in 2011

#

man there's a lot of lore on that page

midnight tangle
storm whale
midnight tangle
#

Wait do we now know what Angzarr is!?

#

The angzarr (⍼) is an obscure typographical symbol representing azimuth
https://en.wikipedia.org/wiki/Angzarr

The angzarr (⍼) is an obscure typographical symbol representing azimuth, dating back to at least the mid 20th century, which became notorious during the first half of the 2020s for its obscurity and lack of a widely recognised meaning (compare ghost characters).
The name is from an abbreviation of its ISO 9573-13 name, "Angle with Down Zig-zag...

grizzled granite
#

Seems like it

grizzled granite
#

since right now the whole angle thing is weird

midnight tangle
#

yes

grizzled granite
#

regarding https://github.com/typst/codex/pull/65 how does one handle symbols that can be both a regular glyph and an emoji?

#

this is the case for instance for ☾ I believe

midnight tangle
grizzled granite
#

@heady fulcrum it seems like most math fonts either have all the cyrillic letters, or none

#

the default font (ncmm) doesn't have any

heady fulcrum
#

Ah. Ouch.

grizzled granite
#

unlike for greek, there are no specific codepoints for math cyrillic, since they're very rarely used

#

with the exception of perhaps sha

heady fulcrum
#

how does latex do it? the font over there has sha?

#

or is it some package that does trickery?

grizzled granite
#

I think adding sha makes sense, though we should probably ask if the maintainer of ncmm could add it

heady fulcrum
#

sounds good to me

#

they were a bit unresponsive though, right? iirc

grizzled granite
#

not sure

#

is it just the uppercase letter that is used?

heady fulcrum
grizzled granite
#

oh well, I added both for good measure

heady fulcrum
#

I'm sure there are people out there who decide to use lowercase, but I really don't think it's major, so no need to support both I think

heady fulcrum
grizzled granite
#

I saw some references to https://en.wikipedia.org/wiki/El_(Cyrillic) having been used, but that seems very obscure

El (Л л or Ʌ ʌ; italics: Л л or Ʌ ʌ) is a letter of the Cyrillic script.
El commonly represents the alveolar lateral approximant /l/. In Slavic languages it may be either palatalized or slightly velarized; see below.

storm whale
heady fulcrum
#

(the letters even look alike, again!)

heady fulcrum
storm whale
#

Actually NewCM has Cyrillic in the text font, so should just be a matter of copying the glyphs into the math fonts

grizzled granite
#

I found the solution to the angle bracket situation

#

brokets /s

#

It's not ideal, but how about "anglebracket", with a predefined lr-function "ang"? Presumably using the delimiter without creating some form of definition will be rare

#

Currently we have all sorts of entries for angle brackets and angle (as in acute) smushed together

#

Someone suggested chevron

#

Perhaps more ambiguous, but it's not bad as long as searching for angle brackets brings them up

heady fulcrum
#

I'm really not a fan of chevron

#

it's a fairly complex word (in terms of writing as well as pronunciation) and is hardly discoverable

grizzled granite
#

presumably not a lot of people are accessing delimiters by the symbol names directly

heady fulcrum
heady fulcrum
grizzled granite
#

the current situation where we have two entirely different families under angle is very awkward

heady fulcrum
#

the other family being the angle from geometry?

grizzled granite
#

yes

heady fulcrum
#

I agree it's a bit awkward, but I'm still not convinced that it's worth moving the angle brackets to a more obscure name

#

(I remember the previous discussion on this only somewhat)

grizzled granite
#

it's not like it's an unknown name

heady fulcrum
#

disagree

#

it's a technically correct name but very few people call them chevrons, never mind think of 'chevron' when they see the symbols

grizzled granite
#

I'd say that's even more obscure

heady fulcrum
#

no, but I also think we should change that

heady fulcrum
#

IMO it should be bar and the current bar should become something else

#

I think we discussed something like this a long while back, but I don't remember what came of it

grizzled granite
heady fulcrum
#

I mean, if the 'entry-point' for this notation was an lr-like function, then I wouldn't be too bothered about a more obscure name

#

but this lr-like function does not exist yet

grizzled granite
grizzled granite
#

it's a terrible name

#

?r $ dash(x y z) $

gaunt archBOT
heady fulcrum
#

hm, I see; I'd say I agree with that (dash -> bar)

#

though that's totally orthogonal

grizzled granite
#

I think most people want the extensible version regardless, so I think it's best to give them that as bar, even if it doesn't technically match latex

heady fulcrum
heady fulcrum
#

crazy idea, but maybe we could do something like bar.short for the current macron then? (In a world where bar is the current dash)

#

(not sure the parser would work well with this)

#

?ast $bar.short(a)$

gaunt archBOT
# heady fulcrum ?ast `$bar.short(a)$`
Markup: 14 [
    Equation: 14 [
        Dollar: "$",
        Math: 12 [
            FuncCall: 12 [
                FieldAccess: 9 [
                    MathIdent: "bar",
                    Dot: ".",
                    Ident: "short",
                ],
                Args: 3 [
                    LeftParen: "(",
                    MathText: "a",
                    RightParen: ")",
                ],
            ],
        ],
        Dollar: "$",
    ],
]
grizzled granite
heady fulcrum
#

hm

#

maybe so

grizzled granite
#

oops*

heady fulcrum
#

just figured that could be slightly more discoverable

#

since macron is such a weird name

grizzled granite
heady fulcrum
#

yeah

#

but if we don't have something better there, then I think it's worth having the geometry angle in the same hierarchy for the added discoverability

grizzled granite
#

I thought ang was a relatively neutral term,

heady fulcrum
#

when I see ang for some reason I think angstrom

grizzled granite
#

that would be å¨ng

#

ång*

heady fulcrum
#

yeah yeah, I know

#

(don't have the keyboard to type that)

#

(oh wait I do, I just don't know how to do it 😅 )

grizzled granite
#

I don't think we can make everyone happy, but we can try to make many people the least unhappy? xD

heady fulcrum
#

why not angles?

#

(I feel like we already discussed this option)

grizzled granite
heady fulcrum
#

for the lr function

grizzled granite
#

that sounds even more angle-like

heady fulcrum
#

hm, I suppose we could also do angles.l and angles.r for the symbols, though I'm not sure that's a good idea or even trivial with the current architecture

heady fulcrum
#

which is the issue, no?

#

I mean, people call these angle brackets

grizzled granite
#

I find angles even worse than angle to be honest

heady fulcrum
#

I find it natural that angle would be in the name

heady fulcrum
grizzled granite
#

because it's just a plural of the word angle, not the plural of angle bracket

grizzled granite
#

ok I'm done with PRs for a while I think :p

tall quail
#

the "angle" will still be there in the full symbol name (Mathematical Left Angle Bracket), so it will still be fairly discoverable

#

e.g. searching "angle" in the symbol documentation page will find it

tall quail
#

Also it's good practice to avoid user variables that shadow standard names. That gets harder if we populate the standard library with redundant names

#

@midnight tangle has a point regarding discoverability but this should rather be addressed in the tooling: the alternative names should be somewhere in the symbol metadata, and the documentation, tinymist, etc. should use that to help the user find the canonical name

heady fulcrum
heady fulcrum
heady fulcrum
#

but again, I don't think it's worth having a more "organized name" if it hurts usability because it's less discoverable. And angle brackets are common, as we've already discussed, so I think they merit some care.

grizzled granite
#

Although they're obviously related

heady fulcrum
#

oh yeah, you're right. which I then agree with

lapis moth
#

@midnight tangle @grizzled granite Re #62:
compose.o but not circle.stroked.tiny.o existing is an unprecedented addition of a unique variant to an alias. I could add that to the alias system from #27, but that would also make it (a bit) more complicated.
What are your thoughts on this?

midnight tangle
#

To me, aliases make sense for things that are different spellings of the same underlying concept. For different things that happen to be represented by the same symbol, not so much.

lapis moth
#

ah, so nabla and gradient are aliases, but since circle.stroked.tiny is only about the shape of the symbol, it's not considered the same thing. Makes sense👍
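
The distinction drawn here can be sketched as two separate tables: aliases point at a canonical name, while names for different concepts stay separate entries even when they share a glyph. All names and the table layout below are hypothetical and only illustrate the idea, not the alias system from #27.

```python
# Hypothetical symbol table: canonical name -> glyph.
SYMBOLS = {
    "nabla": "∇",
    "circle.stroked.tiny": "∘",
    "compose": "∘",  # separate entry: same glyph, different concept
}

# Hypothetical alias table: alternate spelling -> canonical name.
ALIASES = {"gradient": "nabla"}

def resolve(name):
    """Return (canonical name, glyph), following at most one alias."""
    canonical = ALIASES.get(name, name)
    return canonical, SYMBOLS[canonical]

print(resolve("gradient"))  # resolves through the alias to nabla
print(resolve("compose"))   # its own entry, not an alias of circle.stroked.tiny
```

Because compose is its own entry, it can gain a variant such as compose.o without circle.stroked.tiny gaining one too.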

midnight tangle
#

When making PRs that contain or plan breaking changes (i.e., remove or deprecate stuff), don't forget to add the breaking tag (if you have the permission). Also, remember that PRs containing breaking changes should be reviewed by Laurenz before being merged.

grizzled granite
#

I'll try to keep that in mind

#

Can a PR be unmerged?

midnight tangle
#

We can merge a PR that reverts it, but we can't unmerge it

#

It's not that big of a deal. Laurenz will see the changes at some point (when making the changelog if not before), and we can easily revert the PRs at this point if needed

past python
#

I'll take a look at the merged breaking PRs when I get back to PR review

storm whale
#

Completely forgot about that, sorry for merging some of those PRs too eagerly!

grizzled granite
#

@storm whale I'm not sure why the accents are present as symbols in the first place? When is a standalone non-combining accent useful?

storm whale
lapis moth
#

oh lol, that makes a surprising amount of sense

storm whale
#

I think we're doing it for all of them. Grave for example is \u{0060} and not U+0300 (but 0060 doesn't seem to have any decomposition?)
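
The decomposition question can be checked with Python's standard `unicodedata` module. A small sketch: U+0060 indeed has an empty decomposition field, whereas a spacing accent such as U+00B4 decomposes to a space plus the combining mark.

```python
import unicodedata

def describe(ch):
    """Return (code point, Unicode name, raw decomposition field) for `ch`."""
    return f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.decomposition(ch)

# Spacing grave, combining grave, spacing acute, combining acute.
for ch in ["\u0060", "\u0300", "\u00B4", "\u0301"]:
    print(describe(ch))
```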

grizzled granite
#

does anyone have opinions about the creative commons symbols?

#

I'm wondering if they can be cc.by, cc.nc, cc.nd, cc.sa, cc.zero and cc.public

#

or if we want to use longer names

#

Like, cc.attribution is fine, but cc.noderivativeworks is a mouthful

#

I was considering cc.not.derivative and cc.not.commercial

storm whale
#

It will work if we use the right codepoint!

storm whale
grizzled granite
#

I was thinking pdm would be hard to decipher

#

so that's why I went with cc.public

storm whale
#

Oh yeah public and zero I think are fine as they're short as well

#

What happened to no more PRs for a while? Aha

grizzled granite
#

I stumbled into some unicode rabbit hole, and this felt fairly isolated and uncontroversial

storm whale
#

That's how it starts...you'll have ten more open by the end of the day!

grizzled granite
#

my god there's a lot of chess symbols in unicode

flat pagoda
midnight tangle
midnight tangle
grizzled granite
#

I've been thinking. Should we separate math symbols from those that are not math symbols?

#

Because for regular users this distinction is not evident

midnight tangle
#

I've been wondering about that for some time, and it might make sense to at least have some sort of marker that a symbol is a math symbol. But I guess that's what the Unicode MathClass is for.

#

Could be used in the doc to have a more consistent font as well. For now, symbols in the symbol lists use the default website's font, with a fallback to NewCM Math. This means some symbols look quite ugly, such as → and ←, especially when next to, e.g., ↑.

grizzled granite
#

I was thinking like sym and mathsym or something along these lines

#

with only the latter being imported in math by default

midnight tangle
#

The thing is, is the distinction really useful? Any symbol can be a math symbol, and some symbols can be both math and non-math (e.g., punctuation marks)

#

why would we want to not import all symbols in math by default?

grizzled granite
#

It is useful yes, because it tells you which symbols are likely to be present in a math font

midnight tangle
#

Dice symbols are clearly not "math symbols", but I could see myself using them in math. Same for playing card suit symbols

grizzled granite
#

You would still be able to use them, they would just not be present by default

midnight tangle
grizzled granite
#

When codex is growing you would get auto-completion suggestions for symbols that are extremely unlikely to be present in a math font, but seem like they should be

midnight tangle
#

Regarding this specific symbol, maybe we shouldn't add it in the first place. In general, this seems like an autocompletion issue that could be tackled separately

grizzled granite
midnight tangle
grizzled granite
#

I see no reason to arbitrarily limit symbols, unless they're clearly compatibility characters and such

midnight tangle
#

But I would say we shouldn't add symbols that aren't useful, and this double parenthesis thing isn't useful imo because it's not meant for use in math, and whoever needs it in text probably has a better way of inputting it

grizzled granite
#

But that's why I'm also saying that there needs to be some way to tell math symbols apart from non-math symbols

#

and whoever needs it in text probably has a better way of inputting it
How would they do that?

midnight tangle
grizzled granite
#

without digging into proposals

midnight tangle
#

I agree this is quite annoying

grizzled granite
#

And they don't even make it easy to find the relevant proposal

grizzled granite
#

(but who's to say they don't want to use typst?)

#

Anyway, I'm not proposing adding them now, I just wanted to provide an example

midnight tangle
#

I think it would be better not to merge PRs immediately, in order to leave time for everyone to give their opinion. Ideally, I would wait a week after the PR was opened, but I understand that it can slow down development. At the very least, PRs should probably stay open for 24h.

river berry
lapis moth
#

yeah I don't merge PRs even when I'm the second approver for this exact reason

grizzled granite
#

there's no hurry, it just didn't occur to me

grizzled granite
#

Is there some setting that could prevent the merge button from showing up until a week has passed?

midnight tangle
#

maybe using Actions (which I don't know anything about), but it's probably not that big of a deal as long as we just commonly remember not to merge PRs too eagerly

grizzled granite
#

another reason for multi-character symbols: many ipa symbols consist of sequences with combining accents

#

so no unicode 14+

grizzled granite
#

dash.wave (U+301C, 〜) and dash.wave.double (U+3030, 〰) are both cjk punctuation characters, which we've tried to avoid so far

#

the latter is also an emoji (〰️)

grizzled granite
#

It seems like every batch of symbols I come across has at least one symbol in it that needs a text variation selector
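
For reference, a text variation selector is just U+FE0E appended to the base character (U+FE0F requests emoji presentation instead). A minimal sketch for single base characters; the helper names are made up for illustration.

```python
VS15 = "\uFE0E"  # VARIATION SELECTOR-15: request text presentation
VS16 = "\uFE0F"  # VARIATION SELECTOR-16: request emoji presentation

def strip_vs(s):
    """Remove any text/emoji variation selectors from `s`."""
    return s.replace(VS15, "").replace(VS16, "")

def as_text(s):
    """Base character followed by the text variation selector."""
    return strip_vs(s) + VS15

def as_emoji(s):
    """Base character followed by the emoji variation selector."""
    return strip_vs(s) + VS16

wavy = "\u3030"  # WAVY DASH, which also has an emoji presentation
print([f"U+{ord(c):04X}" for c in as_text(wavy)])
```

This is also why such entries end up as two code points rather than one.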

grizzled granite
#

Do we have all emojis apart from skin color variants and such?

#

the answer is no

grizzled granite
#

Does anyone have a good way of sorting emoji.txt and sym.txt?

midnight tangle
#

Not really. The thing is we kinda want to sort by meaning, but also by alphabetical order within some groups, so it's a bit messy at the moment

#

I feel like the best that can be done without putting unnecessary work into it is just reorganizing entries around the lines that are affected by a given PR when working on said PR

grizzled granite
#

it's like 99% alphabetical with some odd ones out

midnight tangle
#

If it's easy to fix by hand I would just do that. Otherwise, idk

grizzled granite
#

The ones I found would be easy, but if I were to actually make a PR I'd wanna make sure I caught all
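
A rough way to catch all the odd ones out would be a script that lists adjacent top-level names that are out of alphabetical order. The sketch below assumes a simplified file layout (top-level names start at column 0, variant lines are indented or start with a dot, comments start with `//`); the real sym.txt format may differ, and since some groups are sorted by meaning on purpose, the output is only a list of candidates.

```python
def top_level_names(lines):
    """Collect first tokens of lines that look like top-level entries
    (assumed format, see the note above)."""
    names = []
    for line in lines:
        if not line.strip() or line[0].isspace() or line.startswith((".", "//")):
            continue
        names.append(line.split()[0])
    return names

def out_of_order(names):
    """Return (previous, current) pairs where current sorts before previous."""
    return [(a, b) for a, b in zip(names, names[1:]) if b.lower() < a.lower()]

sample = [
    "alpha α",
    "gamma γ",
    "beta β",
    "  .alt ϐ",
    "delta δ",
]
print(out_of_order(top_level_names(sample)))
```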

grizzled granite
#

Does anyone know why we have "kai"? it's essentially a greek ampersand, and would likely not be useful for anyone not writing greek text

#

and they presumably have better ways to enter it

grizzled granite
midnight tangle
grizzled granite
midnight tangle
#

No

#

Specifically, it is explicitly not similar to the Omega / Ohm situation (see the link above)

grizzled granite
#

in stix two math it looks exactly like upright mu

midnight tangle
#

The ohm sign is canonically equivalent to the capital omega, and normalization would remove any distinction. Its use is therefore discouraged in favor of capital omega. The same equivalence does not exist between micro sign and mu, and use of either character as a micro sign is common. For Greek text, only the mu should be used.
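
The difference described in this passage shows up directly under normalization. A small check using Python's standard `unicodedata` module: canonical normalization (NFC) folds the ohm sign into capital omega but leaves the micro sign alone; only compatibility normalization (NFKC) maps micro to mu.

```python
import unicodedata

OHM, OMEGA = "\u2126", "\u03A9"  # OHM SIGN, GREEK CAPITAL LETTER OMEGA
MICRO, MU = "\u00B5", "\u03BC"   # MICRO SIGN, GREEK SMALL LETTER MU

def nfc(s):
    """Canonical composition."""
    return unicodedata.normalize("NFC", s)

def nfkc(s):
    """Compatibility composition."""
    return unicodedata.normalize("NFKC", s)

print(nfc(OHM) == OMEGA)   # canonical equivalence removes the distinction
print(nfc(MICRO) == MICRO) # micro survives canonical normalization
print(nfkc(MICRO) == MU)   # only compatibility normalization folds it
```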

grizzled granite
#

but I see there are differences in serifs for new computer modern

midnight tangle
grizzled granite
#

I honestly suspect that may not even be intentional

midnight tangle
#

Btw this character is present on all AZERTY (i.e., French) keyboards AFAIK

#

The visual difference?

grizzled granite
#

Yeah

midnight tangle
#

Above is NewCM, below is NewCM maths
Left is Mu, right is Micro

#

With fallback disabled

#

I agree it's probably not intentional

#

It's so annoying that NewCM does not have a public bug tracker where we could just report that instead of writing an e-mail

grizzled granite
#

It might be intentional that micro is the same for the text and math fonts

midnight tangle
#

It looks good near an upright "m" in both cases

#

So probably intentional actually

grizzled granite
#

Looking at the list of symbols defined by Unicode math, it doesn't seem to include a name for micro

midnight tangle
#

Maybe because it's not really a math symbol, although it is still a scientific symbol

grizzled granite
#

To me it makes more sense to typeset units using the text font and not the math font

midnight tangle
#

Above: Libertinus Serif, below: Libertinus Sans
Left: Mu, right: Micro
There is a slight difference between the Micros, but not between the Mus I think

midnight tangle
midnight tangle
#

I don't know LaTeX but I think it was said multiple times that it uses the text font for upright math at least in some scenarios

grizzled granite
#

That's the text font yes. Though I think you can typeset upright math with the math font at least in the Unicode engines

midnight tangle
#

I remember someone (maybe even you actually) saying that LaTeX uses the text font more than they expected in math

grizzled granite
#

Like \symrm or something

#

I would have to look it up

vapid osprey
#

i thought \mathrm would be upright math font while \text would be text font

grizzled granite
#

No, mathrm is the text font.

#

By the way, epsilon and epsilon.alt are reversed relative to latex

#

I'm not sure if it's intentional or not

midnight tangle
#

It is probably, because the LaTeX non-alt epsilon is a terrible choice

#

Dare I say, it was made by a 🌙 lunatic 🌙

grizzled granite
#

I mean, that's subjective. It's not uncommon to use both in the same paper for different things

#

Anyway, I don't think it's a big deal necessarily. I just happened to notice it because of epsilon.alt having a reversed variant, but not epsilon

#

It'll definitely trip some people up

weary bear
#

Isn't phi also reversed the same way. vs \phi and \varphi

grizzled granite
#

Yeah

#

Varphi is so much prettier than phi 🥰

#

How much effort does the inclusion of multi character symbols require?

#

I wish I understood rust

midnight tangle
#

The main reason I haven't implemented multi-character symbols in Codex yet is because multiple meta PRs are waiting to be merged, and I don't want to have to deal with conflicts

midnight tangle
#

PRs changing the thing that parses the symbol files

#

they have the meta tag on GitHub

storm whale
grizzled granite
#

I found a bug

storm whale
#

I know we said to leave PRs open for a bit but with something like digamma that seems fairly uncontroversial?

storm whale
grizzled granite
#

scratch that

#

or actually

#

I'm not sure

#

I can't find a font that has both

#

but the first symbol is an inverted L, and the last is a sans serif inverted L

#

the sans function should presumably be mapping it correctly

storm whale
#

There's an inverted L?

grizzled granite
storm whale
#

I am 99% certain it is not in any of the Mathematical alphabetic blocks in Unicode then

grizzled granite
#

probably used for inverse functions and such

grizzled granite
#

actually the sans serif one says math symbol there

storm whale
#

oh the sans ones are in the letter like symbols block

grizzled granite
#

sans serif inverted G, L, Y and reversed L

#

presumably the corresponding non-sans serif symbols should map to them?

#

otherwise there's no way to add them as symbols unless we use a sans modifier, which is inconsistent

#

I'll create an issue

storm whale
#

I can add them to my codex pr

grizzled granite
#

ok, I'll create the issue regardless

storm whale
grizzled granite
#

By the way, there are a few instances of inverted used in a different way in unicode than we do

#

In typst currently, inverted means rotated by 180 degrees, while here it is used in the sense of vertical mirroring

storm whale
#

We use .rev instead right?

grizzled granite
#

no rev is horizontal mirroring

#

I guess it would be .inv.rev
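
The reasoning behind .inv.rev can be checked on a toy bitmap: a 180 degree rotation followed by (or preceded by) a horizontal mirroring equals a vertical mirroring. In this sketch `inv` and `rev` model the Typst meanings described above, and `vflip` models the Unicode sense of "inverted"; the grid is made up for illustration.

```python
def rev(grid):
    """Horizontal mirroring: flip each row left to right."""
    return [row[::-1] for row in grid]

def inv(grid):
    """Rotation by 180 degrees: reverse the row order and each row."""
    return [row[::-1] for row in grid[::-1]]

def vflip(grid):
    """Vertical mirroring: flip top to bottom."""
    return grid[::-1]

L_SHAPE = [
    "#..",
    "#..",
    "###",
]

for row in inv(rev(L_SHAPE)):
    print(row)
```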

storm whale
#

ah

grizzled granite
#

actually I think the turned L is literally the only one that has a non-sans serif version @storm whale

#

why do you have to be so damn inconsistent unicode

#

In the days of printing with metal type sorts, it was common to rotate letters and digits 180° to create new symbols. This was a cheap way to extend the alphabet that didn't require purchasing or cutting custom sorts. The method was used for example with the Palaeotype alphabet, the International Phonetic Alphabet, the Fraser script, and for so...

#

the whole situation is a shitshow

#

there's a whole duplicated region for "fraser script" too

#

I gave up for now

earnest phoenix
past python
#

Also for phi

tall quail
#

I think that was a good choice, it's annoying in latex having to write var to get the better looking and less confusing symbols (as the others can be confused with in and emptyset)

grizzled granite
past python
#

We have a new naming problem, which fits well with codex I think: Numbering systems. We're considering using the existing CSS counter style names, but they are a bit inconsistent. For example, I find it a bit odd that CSS spells it "greek-lower-modern", but "lower-roman" (lower once before and once after the writing system). There is a bit of preexisting discussion in https://github.com/typst/typst/pull/5622 and new work in https://github.com/typst/typst/pull/6379.

We were also discussing, in the case of adding many more numbering systems, to move them into a separate crate. Perhaps this crate could be codex! Essentially (already with the math styling), we could expand codex from just naming symbols to more generally supporting Unicode- and internationalization-related efforts in the Typst ecosystem.

Thoughts?

pastel violet
#

I think I agree with categorising this under Codex—we're kinda like a Working Group/Task Force for i18n in Typst ;). I think it could be easier to organise everyone's thoughts if we opened a new document under the Codex organisation where we can look at all the numbering systems at large

midnight tangle
past python
midnight tangle
#

Done

pastel violet
#

This reminds me of this half baked RFC/discussion I've had floating around since Oct 2023 😛

Dedicated template syntax

The what

I believe Typst would benefit from dedicated syntax to create templates. These would be objects that generate content. Where today standard library APIs have to take in crude function parameters (like numberings beyond strings) or lack in flexibility (like supplements)

The why

A very large portion of all use-cases of advanced numberings, outline entries, reference syntax, and re-usable formatting in general would greatly benefit from such a capability being convenient and consistent across the ecosystem. This can also apply to .display() interfaces such as state's, counter's

References:

  • Issues: #2485
  • Discussions: #2479, #2353, #2243

The how

This is why this discussion exists of course! Please share your opinions and ideas!

Let's start by considering some prior art; these are mostly programming languages with f-string adjacent syntax:

  • "string {interpolation:format}": Rust, Python, C++
  • "string sigil{interpolation}": JavaScript, Java
  • "string sigilinterpolation": Perl, Most shells
    These are not used much outside of string interpolation in the aforementioned works, but their potential could extend to arbitrary content in both normal and maths mode

This is very incomplete and not fully thought through but make of it what you will

pastel violet
#

but if we do decide to adopt numberings I think a design doc just for them will be of great use

grizzled granite
#

How viable would it be to load codex dynamically? Breaking changes wouldn't be as much of a problem if people could fix a document to a particular version of codex

lapis moth
#

imo breaking changes have been few enough to not warrant further consideration. We're still in 0.x after all.

grizzled granite
#

Yes it's 0.x, but people do in fact use it. Having symbols stop working or change meaning in a document isn't ideal

midnight tangle
#

As long as Typst is in beta, I prefer a breaking change to an ecosystem split

#

Also, we might as well invest time into making modifiers deprecable instead

past python
midnight tangle
#

Yes. I don't remember why I did not do that in the first place. It probably required changing too many things. But retrospectively this was a bad decision.

#

I will try to work on that when the existing meta PRs are merged (I don't want to deal with conflicts)

pastel violet
midnight tangle
#

Yeah I was just referring to 0.x

grizzled granite
grizzled granite
#

also useful package @midnight tangle

#

thanks

lapis moth
#

Re #46 (comment): Anyone have any good ideas for alternatives to _unchecked?

midnight tangle
#

why not from_dotted_str?

#

I think this makes the intended format quite clear

midnight tangle
#

For insert_unchecked, you could maybe use insert_dotted?

past python
#

I think we need to find a different solution for PRs with breaking changes as I don't have time to think myself into each one. Perhaps we should just ensure we're more sure, e.g. with 3 or 4 approvals?

grizzled granite
#

A longer grace period could also work

#

Should also be less problematic as long as everything is deprecated properly. Worst case scenario there's always "undeprecation"...

midnight tangle
midnight tangle
past python
past python
grizzled granite
#

But I do understand if you'd like to avoid non-meta PRs altogether

grizzled granite
midnight tangle
#

I just approved the PR so that we can respect the new idea of requiring more than two approvals for breaking PRs. It's been open for quite a while (two months!?), so let's merge it!

grizzled granite
#

how can I see a list of maintainers?

#

I know of you, mkorje and t0mstone

#

@heady fulcrum maybe?

heady fulcrum
#

on it

grizzled granite
#

xD

heady fulcrum
#

ah

#

ooops

#

I thought I was being asked to review it LOL

#

but yeah I'm also a maintainer

grizzled granite
#

we already merged that one, but there are some other lingering ones. I'm just wondering who I can request reviews from

heady fulcrum
#

(even if I'm not that active...)

grizzled granite
#

no problem

heady fulcrum
grizzled granite
#

I'll just request reviews, but don't feel obligated to do all of them. It's just that I can't figure out a way to see who actually has maintainer status atm

heady fulcrum
#

is the deprecation note in the pr correct?

#

in particular, Laurenz mentions a t->top change, but the PR (including title) is about top->t, right?

#

either way is fine by me (iirc this would make things more consistent, right?), just wanna be sure which one was decided on

grizzled granite
grizzled granite
midnight tangle
past python
grizzled granite
#

I'm also wondering if it would be better to indeed include ¤ as currency after all, and make ₡ currency.colon instead of colon.currency.

#

though I think it would be more easily discoverable as colon.currency

lapis moth
grizzled granite
#

none of the currencies are under currency

lapis moth
#

Exactly, but with your proposed change, one of them would be

grizzled granite
#

well it's going to be inconsistent regardless, since none of the other currencies are under colon either :p

lapis moth
#

True ig, but it feels less inconsistent to have colon just have a name collision. Otherwise I can imagine a new user being confused why the currencies are not under currency.

grizzled granite
#

anyway if we're not moving colon then I see no reason to include currency. I've yet to see evidence that it's actually used.

lapis moth
grizzled granite
#

used for what?

lapis moth
#

It was for examples of monetary formatting. An excerpt:

/// The sign string comes before everything else.
///
/// Examples: `±¤1`, `±1¤`
grizzled granite
#

Ok, fair enough.

#

how about the rupee question?

lapis moth
#

generic feels slightly better. Idk if we have a convention for this or something

grizzled granite
#

we don't yet

lapis moth
#

Maybe we should make a document that keeps track of some established conventions that we're already using...🤔

#

(in the codex repo, to be clear)

midnight tangle
#

Idk maybe this comparison is stupid

lapis moth
#

I'm talking about conventions, not clearly established rules. It would be more of a list of guidelines and a collection of precedents, so you don't have to keep track of that in your head, because idk about you, but I certainly can't remember every modifier that exists on my own and I don't want to have to read the entirety of sym.txt to check for inconsistencies every time there's a new change.

grizzled granite
#

@midnight tangle I've done some thinking about the above/below and over/under thing

#

it's very relevant for bottom accents

#

What I ended up on is that under would be preferable for the reason that the short form b is already taken for bottom, while u is free.

#

that is, no modifier would default to above, while u would mean under

#

so macron and macron.u etc, and the same would apply to the operator decorations

#

does that make sense?

midnight tangle
#

I think bottom accents and the dot above/below equal sign can use different modifiers

#

But I haven't thought about it a lot

#

Also, I'm not sure I love .u. Very short modifiers make sense for common symbols and when there is no ambiguity. .t, .b, .r, and .l are good because I knew what they meant when I saw them for the first time, as a user. .u, I fear not so much

#

But I may be wrong

grizzled granite
#

I think .u would be very clear for diacritics at least. There aren't that many modifiers

midnight tangle
#

Although I guess why not use .b in this case?

grizzled granite
midnight tangle
#

Ok

grizzled granite
#

though this is obviously a worst case scenario...

midnight tangle
#

what the hell

#

How is this in Unicode but not this, which I needed recently

grizzled granite
midnight tangle
midnight tangle
grizzled granite
midnight tangle
#

Then I think we may want to not worry about it for a while

grizzled granite
#

the distinction between math and text isn't that easy. there are many more diacritics available in math fonts than those that have been assigned a math class

midnight tangle
midnight tangle
grizzled granite
#

(note the names are just a rough draft, I just wanted to demonstrate font support)

#

oops

#

this is the correct link

grizzled granite
midnight tangle
#

Yes, but the symbols we add should have a use. If this is only used in a specific script, users of that script already have a way to input it so they don't need us (it's not like they are gonna write every word by combining Codex symbols). If this is used in the IPA, then I guess there is an argument to be made, but for now we have left IPA symbols as a future work

grizzled granite
#

I think this falls more in the IPA camp, I can't imagine any natural language using this

grizzled granite
#

obviously some things are more pressing to add than others, but I still think we should have these kinds of things in mind when naming so we don't run into trouble later

midnight tangle
grizzled granite
#

my initial goal was adding names for everything in strix two math, which is a more manageable subset

#

at least almost everything

#

Cardinal directions are also available.

#

Would it make sense to use l,r,t,b for position, but n,s,e,w for direction?

#

Or l,r,u,d

#

Or the reverse, meaning l,r,t,b / l,r,u,d for direction but n,s,e,w for position

#

I'm not sure which would incur more breaking changes

midnight tangle
#

I think I need to see some concrete use cases

#

Right now, I have the feeling that the distinction between l/r/t/b and n/s/e/w would be too subtle and feel like an inconsistency to non-power users

grizzled granite
#

I was just thinking out loud. We should probably take our time to figure it out before adding additional symbols where this distinction is important

grizzled granite
#

By the way, I'm not sure if there are any modifiers removed between 0.13.1 and when we settled on a method of deprecation, but we should probably retroactively add them if there are

grizzled granite
#

naming would be so much simpler and more consistent if modifiers were order dependent..

midnight tangle
grizzled granite
#

I didn't mean to kick the hornet's nest, but I think it's helpful to have fully fleshed out proposals to look at

grizzled granite
#

I meant making the pr 😉

#

I expected pushback

midnight tangle
#

Yeah honestly for a while I did not make PRs because I wanted to reach a consensus before, but your recent PRs have shown that consensus is easier to reach when we have actual changes to discuss in the form of a PR

grizzled granite
midnight tangle
#

yeah

grizzled granite
#

the only way we have to remedy that right now is submodules, but that's only for the top modifier (or can you have nested submodules?)

midnight tangle
#

But I don't think turning every symbol into a module would be good. Submodules are great for what we use them for currently I think (i.e., grouping multiple closely related distinct symbols)

grizzled granite
#

I'm just thinking that it could be extremely useful to have the opportunity to use order to disambiguate in a few instances,

#

take ⪋ and ⪑

#

If we allowed the names lt.equiv.gt and lt.gt.equiv we wouldn't have to bend over backwards to come up with a solution

midnight tangle
#

Maybe a solution could be to allow modifiers to be ordered, but if only a single order exists for a set of modifiers, then the variant can be accessed in any order
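
A minimal sketch of that lookup rule, not how Codex resolves variants today. The names `build_index`, `lookup` and `LT` are made up for illustration, and the mapping of ⪋ and ⪑ to `equiv.gt` and `gt.equiv` is assumed from the names proposed above.

```python
def build_index(variants):
    """Group variants by their multiset of modifiers.

    `variants` maps an ordered tuple of modifiers to a symbol.
    """
    index = {}
    for modifiers, symbol in variants.items():
        key = tuple(sorted(modifiers))  # order-insensitive key
        index.setdefault(key, {})[tuple(modifiers)] = symbol
    return index


def lookup(index, modifiers):
    """Return the symbol for `modifiers`, or None if there is no match."""
    candidates = index.get(tuple(sorted(modifiers)), {})
    if len(candidates) == 1:
        # Only one order is defined for this set, so any order is accepted.
        return next(iter(candidates.values()))
    # Several orders are defined, so the order given must match exactly.
    return candidates.get(tuple(modifiers))


# Hypothetical variants of `lt`, mirroring the names proposed above.
LT = build_index({
    ("equiv", "gt"): "⪋",
    ("gt", "equiv"): "⪑",
    ("eq", "not"): "≰",
})
```

Here `lookup(LT, ("not", "eq"))` still resolves because only one order exists for that set, while `equiv.gt` and `gt.equiv` stay distinct. Sorting the modifiers (instead of using a set) keeps repeated modifiers such as `eq.eq` usable as keys.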

grizzled granite
#

This would also open up the possibility of repeated modifiers.

midnight tangle
#

It would make things more weird and opaque for the user I guess, but I don't think this is very important. I would tend to think users don't care about exactly how symbols work. When they want a symbol, they search for its name, and if the name makes sense they will remember it

grizzled granite
#

Like you said, I don't think most users even think about this

midnight tangle
grizzled granite
#

your famed ⩶ can be eq.eq.eq (only half joking)

grizzled granite
midnight tangle
#

At least it's not nonsensical

grizzled granite
#

How hard would it be to make sym.txt writable in a hierarchical way (independent of whether we change how modifiers actually work)?

#

As in

#

I think

triangle
  .stroked
    .t △
    .b ▽
    .r ▷
    .l ◁
    .bl ◺
    .br ◿
    .tl ◸
    .tr ◹
    .small
      .t ▵
      .b ▿
      .r ▹
      .l ◃
    .rounded 🛆
    .nested ⟁
    .dot ◬
  .filled
    .t ▲
    .b ▼
    .r ▶
    .l ◀
    .bl ◣
    .br ◢
    .tl ◤
    .tr ◥
    .small
      .t ▴
      .b ▾
      .r ▸
      .l ◂
#

is more maintainable than

#
triangle
  .stroked.t △
  .stroked.b ▽
  .stroked.r ▷
  .stroked.l ◁
  .stroked.bl ◺
  .stroked.br ◿
  .stroked.tl ◸
  .stroked.tr ◹
  .stroked.small.t ▵
  .stroked.small.b ▿
  .stroked.small.r ▹
  .stroked.small.l ◃
  .stroked.rounded 🛆
  .stroked.nested ⟁
  .stroked.dot ◬
  .filled.t ▲
  .filled.b ▼
  .filled.r ▶
  .filled.l ◀
  .filled.bl ◣
  .filled.br ◢
  .filled.tl ◤
  .filled.tr ◥
  .filled.small.t ▴
  .filled.small.b ▾
  .filled.small.r ▸
  .filled.small.l ◂
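
For illustration only, the two listings can be related mechanically. This is a rough sketch of a function that expands the indented form into the flat dotted form, assuming space indentation and one `name [symbol]` pair per line. `flatten` and `SAMPLE` are hypothetical names, and this is not the actual sym.txt parser.

```python
def flatten(text):
    """Expand an indented modifier tree into flat (dotted name, symbol) pairs."""
    stack = []   # (indent, dotted name) of the enclosing lines
    result = []
    for line in text.splitlines():
        if not line.strip():
            continue
        indent = len(line) - len(line.lstrip(" "))
        parts = line.split()
        name = parts[0]
        symbol = parts[1] if len(parts) > 1 else None
        # Drop entries that are not ancestors of the current line.
        while stack and stack[-1][0] >= indent:
            stack.pop()
        prefix = stack[-1][1] if stack else ""
        full = prefix + name
        if symbol is not None:
            result.append((full, symbol))
        stack.append((indent, full))
    return result


SAMPLE = """\
triangle
  .stroked
    .t △
    .small
      .t ▵
    .dot ◬
  .filled
    .t ▲
"""

for name, symbol in flatten(SAMPLE):
    print(name, symbol)
```

Lines without a symbol (such as `.stroked`) only contribute a prefix, so the hierarchical file would carry the same information as the flat one.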
midnight tangle
#

Although for now I think the parser could (but doesn't) not rely on whitespace

grizzled granite
#

It doesn't have to be that specific syntax, it was just for illustration purposes

midnight tangle
#

The roles are swapped, now I am the one to advocate for less whitespace dependence

midnight tangle
#

It is also more readable than having the same thing repeated everywhere

grizzled granite
lapis moth
#

if we don't want to depend on whitespace, it could also be something like

triangle
  .stroked
    ..t
(etc)

i.e. one additional . per level
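
A rough sketch of how that dot-counting variant could be read, where depth comes from the number of leading dots and indentation is ignored. `flatten_dotted` and `DOTTED_SAMPLE` are hypothetical names, not part of Codex, and the input is not validated (for example, a depth that skips a level is not rejected).

```python
def flatten_dotted(text):
    """Expand a dot-depth modifier tree into flat (dotted name, symbol) pairs."""
    path = []    # path[d] is the name currently open at depth d
    result = []
    for line in text.splitlines():
        parts = line.split()          # leading whitespace is irrelevant
        if not parts:
            continue
        token = parts[0]
        name = token.lstrip(".")
        depth = len(token) - len(name)   # number of leading dots
        symbol = parts[1] if len(parts) > 1 else None
        del path[depth:]              # close deeper or sibling levels
        path.append(name)
        if symbol is not None:
            result.append((".".join(path), symbol))
    return result


DOTTED_SAMPLE = """\
triangle
.stroked
..t △
..small
...t ▵
.filled
..t ▲
"""
```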

midnight tangle
#

I don't think whitespace dependence is a problem in this case, especially for an internal syntax that we are the only ones to use

lapis moth
#

I don't particularly care either way, just pointing out options

grizzled granite
#

We can leave out bullet.hole for now if you think it's problematic

#

@midnight tangle

midnight tangle
#

I don't necessarily think it is problematic, but I'm not sure exactly what it is meant to be used for

midnight tangle
#

Interesting. Then if it is used as a bullet point in lists, let's add it as well

past python
midnight tangle
#

One option is to just list symbols from nested modules the same way other symbols are listed (prefixing the name with the module of course)

grizzled granite
#

I don't think the fact that it is a submodule needs to be user facing at all, except perhaps an indication that it can be imported?

past python
midnight tangle
#

This is interesting, but I'm not sure this use of modules is "right". Extending the functionalities of symbols seems like a cleaner option to me for Codex.

grizzled granite
midnight tangle
#

You mean, Codex modules?

#

If so, indeed. I'm only saying that if we want to add symbols where the order of modifiers matters, we shouldn't do it through modules but rather by extending the functionalities of symbols

grizzled granite
midnight tangle
#

I think what Laurenz was saying is that a similar technique (i.e., having modules with content) could be used in Codex

grizzled granite
midnight tangle
#

I don't think it was connected to earlier discussions in this channel

grizzled granite
#

Oh I understood what you meant now

#

Anyway my initial point was that codex exclusively deals with actual Unicode symbols no? Which negative spaces are not

midnight tangle
#

Indeed, spaces in the math module, and in particular the proposed negative spaces, do not use Unicode characters

past python
#

The fact that I'll have to rerig the docs generator to somehow hide this from users is a symptom of the inconsistency

midnight tangle
past python
#

I dislike that it feels just like a symbol without being one

midnight tangle
#

#73 defines a sym.chess module, which groups different symbols without itself being a symbol (or looking like one)

GitHub

There's an ungodly number of these symbols, so I'd be surprised if I got everything right.
I mostly followed @MDLC01, though:

I used only black/white and black/red instead of havin...

past python
#

That feels different to me than the gender one for some reason

midnight tangle
#

I agree that this one feels more strongly like it should be a submodule, but in the end the conclusion is that there are scenarios where submodules make sense (responding to #1277628305142452306 message)

grizzled granite
#

Maybe it was a mistake to create so many PRs at the same time. Should I close the ones that are currently blocked?

midnight tangle
#

Blocked PRs are kind of annoying because they clutter the PR page. I think you can close those that are blocked on multi-character symbols for now (or just remove the part of the PR that blocks it)

#

The bullet one is fine to keep open I would say

lapis moth
grizzled granite
#

no, he doesn't want to do that

#

so anything breaking needs 3 reviews

lapis moth
#

ah, must've missed that

grizzled granite
#

I can't find the message now, but it's in this thread somewhere

midnight tangle
#

The current thing is:

  • Non-breaking symbol changes: 2 community reviews
  • Breaking symbol changes: 3 community reviews
  • Meta changes: review from Laurenz
grizzled granite
#

they were already deprecated, just not documented

midnight tangle
#

Yes it's non-breaking

grizzled granite
midnight tangle
#

Good luck

grizzled granite
#

it's the addition of "envelope"

#

I believe

#

so the question is if I just remove the top level symbol for now, or just close it

#

I'll just close it, there's no consensus anyway

midnight tangle
#

@lapis moth regarding #51 (comment), I tend to think minimal modifier sets are less of a priority than multi-character symbols.
Since the codebase is quite small and all meta PRs end up conflicting with each other, it's probably better to do them one at a time. So if you want to work on something, implementing multi-character symbols would be a better contribution imo.
Of course this is just my opinion so if you are more motivated by minimal modifier sets, feel free to work on that either way.

GitHub

Motivation Currently, when selecting a variant from a set of modifiers, the first variant from the list that contains all the modifiers, and a minimal amount of additional modifiers, is chosen.12 T...
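
The selection rule quoted in that excerpt can be sketched roughly as follows. This is an illustration written from the one-sentence description only, not the actual Codex implementation, and `select_variant` and `TRIANGLE` are hypothetical names.

```python
def select_variant(variants, requested):
    """Pick the first variant that has all requested modifiers and the
    fewest additional ones. `variants` is an ordered list of
    (modifiers, symbol) pairs. Returns None if nothing matches."""
    requested = set(requested)
    best = None
    best_extra = None
    for modifiers, symbol in variants:
        mods = set(modifiers)
        if not requested <= mods:
            continue                  # a requested modifier is missing
        extra = len(mods - requested)
        # Strict comparison keeps the earliest variant when there is a tie.
        if best_extra is None or extra < best_extra:
            best, best_extra = symbol, extra
    return best


# A few triangle variants, in list order.
TRIANGLE = [
    (("stroked", "t"), "△"),
    (("stroked", "small", "t"), "▵"),
    (("filled", "t"), "▲"),
    (("filled", "small", "t"), "▴"),
]
```

With this list, requesting only `t` gives △ (tie with ▲, first one wins), and requesting only `filled` gives ▲ because it needs one extra modifier rather than two.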

#

For now, multi-character symbols would not be laid out properly in math AFAIK, but they could already be useful for emojis and other symbols that are not primarily intended for use in math.

grizzled granite
#

There's also the question of what we discussed earlier, about order dependence.

midnight tangle
#

yeah

#

The future of modifiers probably needs a little more thought