Performance | Typst | Page 7

sturdy sequoia Jun 24, 2024, 8:52 AM

#

happythink

left night Jun 24, 2024, 8:52 AM

#

it's in a layout_grid in the last iteration

#

very interesting: since locate() provides context, I removed all uses of loc and replaced them with here() etc. and it's still fast

#

but flipping the switch and replacing locate with context makes it slow

sturdy sequoia Jun 24, 2024, 8:53 AM

#

yeah I know, I tried it too!

#

It's like something in the eval of an ast::Context is slow maybe?

#

but that doesn't make sense

left night Jun 24, 2024, 8:54 AM

#

it might be that the functions produced by context are harder to disambiguate for the locator or something like that

sturdy sequoia Jun 24, 2024, 8:55 AM

#

Interestingly, with the VM, both techniques take exactly the same time (which I would expect tbh)

left night Jun 24, 2024, 8:57 AM

#

yeah, probably you didn't port the minor difference that creates the problem

sturdy sequoia Jun 24, 2024, 8:57 AM

#

Maybe it's the CaptureVisitor?

#

it has a separate capturer

#

no, doesn't look like that would explain it

sturdy sequoia Jun 24, 2024, 10:05 AM

#

Ok, I found an awesome heuristic for whether to enable caching with the VM or not: the number of registers it uses

#

works surprisingly well

#

@left night I just had a realization: the VM isn't slower in incremental, it looks like multithreading has slowed down incremental even on main thinkies

#

Maybe the VM isn't that bad after all

feral imp Jun 24, 2024, 10:19 AM

#

What a twist.

sturdy sequoia Jun 24, 2024, 10:20 AM

#

maybe we should be clever about when we do multithreading: if the page count is low we can maybe do it single threaded to avoid the costs of syncing?
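
A rough sketch of that idea, in Python rather than Typst's Rust: skip the thread pool when there are only a few independent units of work. `layout_run` and `MIN_PARALLEL_RUNS` are hypothetical names made up for illustration, and the cutoff value is an assumption that would have to come from measurement.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical cutoff; a real value would have to come from measurement.
MIN_PARALLEL_RUNS = 4


def layout_run(run):
    """Stand-in for laying out one page run (here: just count words)."""
    return sum(len(paragraph.split()) for paragraph in run)


def layout_all(runs, min_parallel=MIN_PARALLEL_RUNS):
    """Lay out every run, using threads only when there are enough runs."""
    if len(runs) < min_parallel:
        # Few runs: skip the pool to avoid spawn and synchronization costs.
        return [layout_run(run) for run in runs]
    with ThreadPoolExecutor() as pool:
        # map() preserves input order, so both paths return the same list.
        return list(pool.map(layout_run, runs))
```

Both branches return the same results in the same order, so the cutoff only changes how the work is scheduled.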

sturdy sequoia Jun 24, 2024, 10:20 AM

#

feral imp What a twist.

indeed

#

@feral imp Mind you, it's not perfect and there are definitely areas of improvement

#

compile times are still too slow imo

slim sequoia Jun 24, 2024, 10:54 AM

#

sturdy sequoia Maybe the VM isn't that bad after all

Back to work then :p

sly pecan Jun 24, 2024, 11:06 AM

#

slim sequoia Back to work then :p

Wouldn't it make more sense to look back at a vm once other things have been optimized and stabilized first?

#

Sunk cost fallacy is real

slim sequoia Jun 24, 2024, 11:08 AM

#

👍

#

(but also it was obviously said in jest)

sturdy sequoia Jun 24, 2024, 11:14 AM

#

sly pecan Wouldn't it make more sense to look back at a vm once other things have been opt...

what things? thinkies

sly pecan Jun 24, 2024, 11:15 AM

#

sturdy sequoia what things? thinkies

sturdy sequoia Jun 24, 2024, 11:15 AM

#

Yeah okay, but specifics here 😄

#

I can't optimize everything all at once

sly pecan Jun 24, 2024, 11:16 AM

#

I'm just trying to save you from sinking more time into the vm only to realize it's not worth it again 😂

sturdy sequoia Jun 24, 2024, 11:17 AM

#

sly pecan I'm just trying to save you from sinking more time into the vm only to realize i...

oh don't worry I won't

#

but I feel like there's a gotcha somewhere

#

I mean it used to be 5x faster and now it's only 2x faster

#

so clearly I made something slooooooooooooooooooooooow

left night Jun 24, 2024, 11:19 AM

#

sturdy sequoia @left night I just had a realization: the VM isn't slower in increment...

what difference have you observed?

sturdy sequoia Jun 24, 2024, 11:22 AM

#

left night what difference have you observed?

on main, from right before mt it was ~300ms for my standard test, and went to ~500ms afterwards (likely due to overhead of mt), and the VM takes the same ~500ms incremental due to re-compiling

#

mt = multithreading but I am too lazy to write it out

sly pecan Jun 24, 2024, 11:34 AM

#

sturdy sequoia on main, from right before mt it was ~300ms for my standard test, and went to ~5...

Memory bandwidth?

sturdy sequoia Jun 24, 2024, 11:35 AM

#

sly pecan Memory bandwidth?

no, most likely synchronization, lemme test with a -j 8 if it's lower then I am right

#

Well no, this ain't it chief

#

angrythunk

left night Jun 24, 2024, 11:37 AM

#

sturdy sequoia on main, from right before mt it was ~300ms for my standard test, and went to ~5...

conceptually, multi-threading shouldn't really have overhead in this case though

#

only page runs are multi-threaded, so if just one page run changes -> no multithreading

sturdy sequoia Jun 24, 2024, 11:37 AM

#

left night conceptually, multi-threading shouldn't really have overhead in this case though

it's still slower mind you

sturdy sequoia Jun 24, 2024, 11:38 AM

#

left night only page runs are multi-threaded, so if just one page run changes -> no multith...

yes but if you look at the --timings it does show you that it's doing... something:

left night Jun 24, 2024, 11:44 AM

#

sturdy sequoia it's still slower mind you

not on my machine. a68a24157 is ~340ms incremental, 7fa86eed0 is ~326ms incremental for an edit in some paragraph in 4_phos.typ.

sturdy sequoia Jun 24, 2024, 11:45 AM

#

left night not on my machine. a68a24157 is ~340ms incremental, 7fa86eed0 is ~326ms incremen...

weird

#

I would need to do a bisect on a longer period then

#

I'll try and do that once I'm done working

#

'cause it is slower than it used to be

#

(or I am going insane)

sly pecan Jun 24, 2024, 11:56 AM

#

Platform dependent?

#

Bottleneck may shift

left night Jun 24, 2024, 11:58 AM

#

sturdy sequoia I would need to do a bisect on a longer period then

why bisect? did you test main vs before or immediately after multi-threading vs before?

left night Jun 24, 2024, 11:59 AM

#

sly pecan Platform dependent?

It may be, but I would be surprised since the workload is really the same

#

There are a few threads spawned that do basically no work (fully memoized) and one thread does the work of the changed page run

sturdy sequoia Jun 24, 2024, 12:03 PM

#

left night why bisect? did you test main vs before or immediately after multi-threading vs ...

I was on some commit before, and right after

#

but I don't know which commit before

#

so I need to do a proper bisect between v0.10.0 and main

#

which won't be fun 😭

sturdy sequoia Jun 24, 2024, 12:04 PM

#

sly pecan Platform dependent?

it's true that AMD CPUs struggle with memory latency (although not bandwidth) whereas the Apple Silicon M1 had unified memory

#

and I am on Windows

#

which is yucky

sturdy sequoia Jun 24, 2024, 12:22 PM

#

Almost 50% of the total time on my thesis is eval

#

🧐

#

(this is the wonder of the Intel VTune oneapi)

#

And guess where most of that is spent

#

||you guessed it, it goes in the hashing hole||

#

https://tenor.com/view/the-square-hole-square-hole-shapes-piklas-gif-24625312

#

The fact that actually running code takes 23ms is worrying frankly (out of 4s)

left night Jun 24, 2024, 1:06 PM

#

sturdy sequoia The fact that actually running code takes 23ms is worrying frankly (out of 4s)

Is this actually true? When I looked at timings from the VM recently, time spent on compile vs eval was relatively balanced. It was just the final eval that was fast because most eval was in imports and thus a child of a compile_module (or in context)

sturdy sequoia Jun 24, 2024, 1:22 PM

#

left night Is this actually true? When I looked at timings from the VM recently, time spent...

yes it's true, actually running opcode takes 23ms, it's hashing that takes all of the rest according to a VTune run I did where I added "tasks" which allows tracking more precisely where time is spent

#

Mind you, there are other calls it does that take longer, but overall, hashing accounts for the vast majority (~80%) of time spent. That's measured using hardware-based counters, so I would expect it to be somewhat accurate

sly pecan Jun 24, 2024, 1:37 PM

#

sturdy sequoia Mind you, there are other calls it does that take longer, but overall, hashing a...

What does typst use for hashing now?

sturdy sequoia Jun 24, 2024, 1:37 PM

#

sly pecan What does typst use for hashing now?

SipHasher 1-3 iirc

#

or 1-4

#

I would argue there's not much leeway in changing that 😐

#

It's mostly malloc and caching stuff

#

either Hasher::write or stuff in comemo

#

clearly we can do better

sly pecan Jun 24, 2024, 1:42 PM

#

sturdy sequoia SipHasher 1-3 iirc

128 bit right iirc?

sturdy sequoia Jun 24, 2024, 1:42 PM

#

sly pecan 128 bit right iirc?

yep

#

If we could reduce it to 64-bit maybe it would be faster but I don't know how much, and I am not sure it would be worth it even

sly pecan Jun 24, 2024, 1:44 PM

#

sturdy sequoia yep

There's https://github.com/Cyan4973/xxHash but it's a c library

sly pecan Jun 24, 2024, 1:45 PM

#

sturdy sequoia If we *could* reduce it to 64-bit maybe it would be faster but I don't know how ...

I don't think that would be a good idea

sturdy sequoia Jun 24, 2024, 1:46 PM

#

The problem is also quality, we need a very high quality hash

#

although perhaps having 128-bits decreases the need for quality somewhat

#

https://github.com/ogxd/gxhash

GitHub

GitHub - ogxd/gxhash: The fastest hashing algorithm 📈

#

there's also gxhash, but I tested it and saw no performance improvements

#

although it has had a lot of commits since then so perhaps it has improved

sly pecan Jun 24, 2024, 1:48 PM

#

sturdy sequoia there's also gxhash, but I tested it and saw no performance improvements

That one is apparently unsound

sturdy sequoia Jun 24, 2024, 1:48 PM

#

Like in theory it's king

sly pecan Jun 24, 2024, 1:48 PM

#

sturdy sequoia The problem is also quality, we need a very high quality hash

The quality section does indicate it's fairly good. Though I have no idea about theoretical guarantees

sturdy sequoia Jun 24, 2024, 1:49 PM

#

I should also mention: they still don't have a fallback version of the algorithm

#

meaning it only works on ARM & x64

sly pecan Jun 24, 2024, 1:49 PM

#

See https://www.reddit.com/r/rust/comments/1d5yq2l/gxhash_an_extremely_fast_hardwareaccelerated/l6pcd02?context=3 @sturdy sequoia

protestor's comment on "GxHash - an extremely fast hardware-acceler...

sly pecan Jun 24, 2024, 1:49 PM

#

sturdy sequoia meaning it only works on ARM & x64

No wasm you mean?

sturdy sequoia Jun 24, 2024, 1:50 PM

#

sly pecan No wasm you mean?

yep

sly pecan Jun 24, 2024, 1:50 PM

#

I thought wasm had simd

sturdy sequoia Jun 24, 2024, 1:51 PM

#

https://github.com/ogxd/gxhash/issues/7

GitHub

WASM support · Issue #7 · ogxd/gxhash

Hello, I find this project quite interesting, do you think there could be a WASM compatible version (perhaps just not manually vectorized) for applications that compile on both WASM and native targ...

#

I did open an issue early on

sly pecan Jun 24, 2024, 1:51 PM

#

I thought you meant xxhash

sturdy sequoia Jun 24, 2024, 1:51 PM

#

xxhash is in C anyway

#

so it's a no-go

sly pecan Jun 24, 2024, 1:52 PM

#

Did you see the reddit comment I linked? I think gxhash is a non-starter

sturdy sequoia Jun 24, 2024, 1:52 PM

#

sly pecan Did you see the reddit comment I linked? I think gxhash is a non-starter

it is for multiple reasons 😂

sly pecan Jun 24, 2024, 1:54 PM

#

https://crates.io/crates/xxhash-rust @sturdy sequoia

sturdy sequoia Jun 24, 2024, 1:55 PM

#

I'll give it a whirl 😉

sly pecan Jun 24, 2024, 1:55 PM

#

sturdy sequoia I'll give it a whirl 😉

I have no idea what i'm talking about, fyi

#

But it seems promising

sturdy sequoia Jun 24, 2024, 2:01 PM

#

I'm testing it, we'll know in a couple of minutes

#

@sly pecan it's slower, by quite a margin in fact

#

likely because it handles smaller inputs less well than SipHash 1-3

#

for the thesis: it's 0.2s slower on average compared to the VM, for the raytracer: it's 4s slower so 19s compared to 15s with SipHash 1-3

#

at least on my hardware *

sly pecan Jun 24, 2024, 2:08 PM

#

sturdy sequoia @sly pecan it's slower, by quite a margin in fact

Is this with the simd features enabled?

sturdy sequoia Jun 24, 2024, 2:08 PM

#

I mean there aren't many feature flags?

#

I enabled xxh3 for 128-bit support and that's it

#

and doing the whole -C target-cpu=native is cheating imo

#

since we can't distribute that

sly pecan Jun 24, 2024, 2:09 PM

#

sturdy sequoia likely because it handles smaller inputs less well than SipHash 1-3

I thought the idea was that small stuff wouldn't be memoized?

sturdy sequoia Jun 24, 2024, 2:09 PM

#

sly pecan I thought the idea was that small stuff wouldn't be memoized?

yes but the individual inputs (i.e pieces of data) that we pass to sh13 are smol

sly pecan Jun 24, 2024, 2:10 PM

#

Ok

sturdy sequoia Jun 24, 2024, 2:10 PM

#

I think that a good way of determining whether we need memoization would be to have a .len() method on Value that gives us an idea of the size of the input, that way if we have large arrays then we can check those, if we don't then we skip memoization, etc.

#

maybe more something like .size_hint()
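
A minimal sketch of what such a size-based check could look like, written in Python for brevity. `size_hint` and `memoize_if_large` are hypothetical names, not part of Typst or comemo, and the sketch assumes arguments are hashable (tuples rather than lists at the top level).

```python
from functools import wraps


def size_hint(value):
    """Rough element count of a nested value; scalars count as 1."""
    if isinstance(value, (str, bytes)):
        return len(value)
    if isinstance(value, dict):
        return sum(size_hint(k) + size_hint(v) for k, v in value.items())
    if isinstance(value, (list, tuple)):
        return sum(size_hint(item) for item in value)
    return 1


def memoize_if_large(min_size):
    """Cache results only for calls whose arguments look big enough."""
    def decorator(func):
        cache = {}

        @wraps(func)
        def wrapper(*args):
            if sum(size_hint(arg) for arg in args) < min_size:
                return func(*args)  # small input: skip the cache entirely
            if args not in cache:
                cache[args] = func(*args)
            return cache[args]

        wrapper.cache = cache
        return wrapper
    return decorator
```

Small calls never touch the cache, so they pay neither the hashing cost nor the memory cost. Where the threshold should sit is exactly the part that needs measuring.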

sly pecan Jun 24, 2024, 2:11 PM

#

sturdy sequoia I enabled xxh3 for 128-bit support and that's it

What is the talk about one shot vs streaming?

#

In the documentation

sturdy sequoia Jun 24, 2024, 2:13 PM

#

ah, it does look like it uses different algos then

#

using the Xxh3 struct will make it streaming, using the xxh3_128() function will make it one-shot

#

no idea if that has an impact
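
For readers unfamiliar with the distinction, here is a small illustration of the two API shapes. It uses BLAKE2b from Python's standard library as a stand-in, not XXH3, so it says nothing about how xxhash-rust behaves or performs; `one_shot` and `streaming` are names chosen for this sketch.

```python
import hashlib


def one_shot(data: bytes) -> bytes:
    """Hash a complete buffer in a single call (128-bit digest)."""
    return hashlib.blake2b(data, digest_size=16).digest()


def streaming(chunks) -> bytes:
    """Feed the same bytes piece by piece into a stateful hasher."""
    hasher = hashlib.blake2b(digest_size=16)
    for chunk in chunks:
        hasher.update(chunk)
    return hasher.digest()
```

With this stand-in, both paths give the same digest for the same bytes. The difference is that the streaming hasher has to carry state between writes, which is the shape a `Hasher::write`-style interface fed many small pieces ends up with.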

sly pecan Jun 24, 2024, 2:13 PM

#

sturdy sequoia and doing the whole `-C target-cpu=native` is cheating imo

Well more like x86-64-v3 in order to enable avx etc

sturdy sequoia Jun 24, 2024, 2:14 PM

#

sly pecan Well more like x86-64-v3 in order to enable avx etc

yes but native will select based on the hardware of the user 😉

#

To be fair, we could have a "launcher" that selects a typst version with or w/o AVX for better performance

#

I tested, using target-cpu makes zero difference

#

🤷‍♂️

stone pilot Jun 25, 2024, 2:10 PM

#

sly pecan What does typst use for hashing now?

A new much faster hasher has just been integrated into Rust. Maybe this can benefit Typst. Let me try to find a reference to that information.

#

No, sorry, the last changes were to the sorting algorithms, not hashing.

sturdy sequoia Jun 25, 2024, 2:39 PM

#

Yes indeed, mind you we do perform some sorting and the new algorithm will probably perform slightly better

cunning wadi Jun 25, 2024, 3:18 PM

#

stone pilot No, sorry, the last changes were to the sorting algorithms, not hashing.

They did switch to a different hash function (from fxhash to a wyhash-inspired one)

#

This isn't really an option for Typst though, because wyhash (not sure about their modification) only has 62 bits of collision resistance

sly pecan Jun 26, 2024, 2:42 PM

#

How big a contribution does shaping have towards compilation? Are we talking minuscule?

lunar kettle Jun 26, 2024, 2:48 PM

#

sly pecan How big a contribution does shaping have towards compilation? Are we talking min...

yeah probably

left night Jun 26, 2024, 3:37 PM

#

sly pecan How big a contribution does shaping have towards compilation? Are we talking min...

Depends on how text heavy the document is. If it's mostly text, it is a fair amount. If you have a lot of other stuff, not that much anymore.

slim sequoia Jun 26, 2024, 4:02 PM

#

Optimized linebreaks can really suck up performance

left night Jun 26, 2024, 4:05 PM

#

slim sequoia Optimized linebreaks can really suck up performance

Big performance improvements for that are coming soon :)

slim sequoia Jun 26, 2024, 4:05 PM

#

❤️

left night Jun 26, 2024, 4:05 PM

#

Just had an idea recently on how to optimize the optimization

#

And it's already mostly implemented, just needs polish

sturdy sequoia Jun 26, 2024, 9:23 PM

#

sly pecan How big a contribution does shaping have towards compilation? Are we talking min...

I wouldn't say minuscule, but much lower than hashing, image encoding, image decoding

sturdy sequoia Jun 26, 2024, 9:23 PM

#

left night Just had an idea recently on how to optimize the optimization

LET HIM COOK

glad urchin Jun 26, 2024, 9:31 PM

#

it's optimizations all the way down

stone pilot Jun 26, 2024, 9:56 PM

#

cunning wadi This isn't really an option for Typst though, because wyhash (not sure about the...

Latest rustc-hash from 3 weeks ago also has interesting speed improvements. Not sure though if these are suitable for typst https://github.com/rust-lang/rustc-hash/blob/master/CHANGELOG.md

GitHub

rustc-hash/CHANGELOG.md at master · rust-lang/rustc-hash

Custom hash algorithm used by rustc (plus hashmap/set aliases): fast, deterministic, not secure - rust-lang/rustc-hash

#

It looks like it's also 64bit. Probably the same wyhash-inspired one used for the sorting improvements

cunning wadi Jun 26, 2024, 10:06 PM

#

stone pilot Latest rustc-hash from 3 weeks ago also has interesting speed improvements. Not ...

yep that's the one I meant

cunning wadi Jun 26, 2024, 10:08 PM

#

stone pilot It looks like it's also 64bit. Probably the same wyhash-inspired one used for the sorting...

a quick ctrl-f in the design documents of the new sort algorithm tells me that there is no hasher included
EDIT: no hash found in source code either

#

and it would honestly surprise me if it was because the sort function only takes Ord

#

(if anyone's curious, here's the PR: https://github.com/rust-lang/rust/pull/124032)

stone pilot Jun 27, 2024, 5:32 AM

#

cunning wadi a quick ctrl-f in the design documents of the new sort algorithm tells me that t...

No, it probably uses the rustc-hash one, which is already an existing dependency.
It's a pity we can't find easy wins 🥹

cunning wadi Jun 27, 2024, 6:36 AM

#

stone pilot No, it probably uses the rustc-hash one, which is already an existing dependency...

How do you tell?

stone pilot Jun 27, 2024, 6:47 AM

#

The Fxhash implementation came from that crate as that's the one used in the compiler.

#

I find it amazing that Typst needs higher collision resistance than the compiler?

untold turret Jun 27, 2024, 6:54 AM

#

comemo needs effectively collision-free hashing: when there is a collision, typst may not (re)compute something and produce a wrong compilation.
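
To make that failure mode concrete, here is a toy cache in Python that, like the scheme described, stores only the hash of the input and never the input itself. It is not comemo's implementation. The hash is deliberately truncated so a collision can be found by brute force, and `HashOnlyCache`, `digest` and `find_collision` are hypothetical names.

```python
import hashlib


def digest(value: str, bits: int) -> int:
    """Hash `value` and keep only the lowest `bits` bits."""
    full = int.from_bytes(hashlib.blake2b(value.encode()).digest(), "big")
    return full & ((1 << bits) - 1)


class HashOnlyCache:
    """Memoizes by hash alone; the original input is never stored."""

    def __init__(self, bits):
        self.bits = bits
        self.results = {}

    def call(self, func, value):
        key = digest(value, self.bits)
        if key not in self.results:
            self.results[key] = func(value)
        return self.results[key]  # wrong if another input shares this key


def find_collision(bits):
    """Brute-force two distinct strings with the same truncated hash."""
    seen = {}
    i = 0
    while True:
        text = f"input-{i}"
        key = digest(text, bits)
        if key in seen:
            return seen[key], text
        seen[key] = text
        i += 1
```

With an 8-bit key, `find_collision(8)` returns two different inputs, and the cache then hands the first input's result back for the second one without any error. Widening the key makes such a pair harder to hit, but the cache still has no way to notice if it happens.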

cunning wadi Jun 27, 2024, 7:04 AM

#

stone pilot The Fxhash implementation came from that crate as that's the one used in the com...

Yeah but weren't you talking about sorting?

stone pilot Jun 27, 2024, 7:17 AM

#

Yes, I first misremembered the sorting changes as related to hashing improvements, but those are separate improvements. Sorry for the confusion!

molten kayak Jun 27, 2024, 8:28 AM

#

stone pilot I find it amazing that Typst needs higher collision resistance than the compiler?

FYI the Rust compiler uses multiple hashes. For normal HashMaps (which do verify whether a collision happened or not) they use rustc-hash. For on-disk incremental cache they only store the hash (thus risking collisions, like comemo) and for that they use SipHash with a 128 bit output.
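
Some back-of-the-envelope arithmetic for why the output width matters when only the hash is stored. This uses the standard birthday-bound approximation for an ideal hash. The item count of one billion is an arbitrary assumption for illustration, not a measured figure for Typst or rustc.

```python
def collision_probability(items: int, bits: int) -> float:
    """Birthday-bound approximation: n * (n - 1) / 2^(bits + 1)."""
    return items * (items - 1) / 2 ** (bits + 1)


# Compare the widths mentioned in the discussion for a hypothetical
# one billion cached entries.
for bits in (62, 64, 128):
    print(bits, collision_probability(10**9, bits))
```

The approximation is only meaningful while the result is small, but it shows the trend: each extra bit of collision resistance halves the estimate, so going from 64 to 128 bits shrinks it by a factor of 2^64.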

atomic violet Jun 27, 2024, 8:52 AM

#

I just had an idea. If hash slow, what if typst no hash typst eq?

There are a few different types of objects I'd say: the Copy ones which are usually O(1) comparable, the Rc ones (any reference counted objects: Arc, Eco*, ...) which are hashable, and the comemo tracked objects. I think the main issue is the second type (most typst values).

Consider PicoStr. These are interned strings managed by a specialized interner. Can comemo be the interner for Rc-type objects? When a function is called, comemo hashes the arguments, then looks up the result, and returns the result. In a sense, comemo "manages" the return value (it won't be dropped unless comemo decides to evict the returning call). What if comemo applied the same interning as with PicoStr, but for all Rc-type hashable returned objects? Actually, what if comemo managed input objects too?

Consider a CRc type (comemo reference counted) which is Arc but with one extra atomic bit indicating whether it is a canonical instance of the object. When calling a memoized function comemo would check if the bit of CRc argument is set, and if it is not, lookup the canonical instance of the object by hashing it and looking it up. It will then only use the location of that canonical instance in memory, instead of the hash. Then it will return the same canonical instance of the return value, if applicable.

The reason why this optimization might be good (if possible), is because now arguments are perfectly disambiguated by their memory locations. You no longer need high quality hash functions for interning (because Eq-ing objects one time per each instancing is acceptable), and you no longer need high quality hash functions for looking up the cache too (because all arguments are basically numbers now, and you can Eq arguments too). And it's not like you do much more work anyway: you hash everything before interning, but comemo would do that anyway. Although it's hard to estimate the memory usage.
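
A very reduced sketch of the proposal, in Python: equal values are mapped to one canonical object, and the memo table is then keyed by that object's identity instead of by a high-quality hash. `Interner` and `IdentityMemo` are hypothetical names. The "canonical bit" that would let an already-interned argument skip the lookup is left out, so this version always pays one hash and equality check per call.

```python
class Interner:
    """Maps equal values to one canonical object (hash + eq, done once)."""

    def __init__(self):
        self._canonical = {}

    def intern(self, value):
        # The dict lookup hashes `value` and confirms the match with ==,
        # so a hash collision cannot merge two different values.
        return self._canonical.setdefault(value, value)


class IdentityMemo:
    """Memoizes a one-argument function keyed by the argument's identity."""

    def __init__(self, func, interner):
        self.func = func
        self.interner = interner
        self.results = {}
        self.misses = 0

    def __call__(self, value):
        canon = self.interner.intern(value)
        key = id(canon)  # stable: the interner keeps `canon` alive
        if key not in self.results:
            self.misses += 1
            # Results are interned too, so they can feed other memos.
            self.results[key] = self.interner.intern(self.func(canon))
        return self.results[key]
```

As the message says, memory usage is the open question: in this sketch the interner never evicts anything, so every value ever seen stays alive.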

sly pecan Jun 27, 2024, 11:50 AM

#

left night Big performance improvements for that are coming soon :)

That's good. Further optimization would presumably also make performance headroom for more microtypography.

sly pecan Jun 27, 2024, 11:53 AM

#

atomic violet I just had an idea. If hash slow, what if typst no hash typst eq? There are a f...

Could the choice of whether to hash or not be based on size?

#

I'm too stupid to understand

atomic violet Jun 27, 2024, 12:00 PM

#

I think it can be, but it's probably extra work with little to no benefit. If the total size of the object is 1 kilobyte, that 1 kilobyte must be crunched through eq-check either way: be it hash or eq. And that size cannot be determined at compile time, because most objects are built from collections (which have dynamic size, obviously). This leaves us with simple structs, which are usually Copy. That's why I kind of distinguish only between Copy and Rc-types, and not based on size.

sly pecan Jun 27, 2024, 12:05 PM

#

atomic violet I think it can be, but it's probably extra work with little to no benefit. If th...

I was thinking of memory usage.

atomic violet Jun 27, 2024, 12:10 PM

#

In this case I am confused, I don't see how it will help memory usage 🤔

sly pecan Jun 27, 2024, 12:12 PM

#

atomic violet In this case I am confused, I don't see how it will help memory usage 🤔

Aren't the hashes much smaller?

#

I should stop talking 😂

atomic violet Jun 27, 2024, 12:18 PM

#

Oh, I get it now. Yeah, it might be a good idea, but a small counter argument is: if you have a big object, you probably haven't built it yourself. It was probably returned by a function, and this function is memoized anyway. So it probably won't help much, but one can never know before one measures so 🤷‍♂️

left night Jul 1, 2024, 10:52 AM

#

https://github.com/typst/typst/pull/4483

GitHub

Optimize par layout by laurmaedje · Pull Request #4483 · typst/typst

This PR optimizes the optimal paragraph layout used when justification is enabled. Typst uses the Knuth-Plass (TeX) algorithm for producing optimal paragraphs. This dynamic programming algorithm mi...

sly pecan Jul 1, 2024, 10:55 AM

#

left night https://github.com/typst/typst/pull/4483

Wow, that's significant. Do you have any examples of performance deltas in real-world documents?

#

@sturdy sequoia The Thesis

left night Jul 1, 2024, 10:57 AM

#

sly pecan @sturdy sequoia The Thesis

The thesis isn't really bounded by optimizing paragraphs

#

I tested it with an earlier version and it was a few percent I think

sly pecan Jul 1, 2024, 10:57 AM

#

I didn't think so, but it should have some effect

sly pecan Jul 1, 2024, 10:58 AM

#

left night I tested it with an earlier version and it was a few percent I think

That's still a nice gain

left night Jul 1, 2024, 10:58 AM

#

I think it will be most impactful with book projects

#

No cetz, no crazy stuff, just tons of paragraphs.

sly pecan Jul 1, 2024, 10:59 AM

#

Is it possible to use different strategies depending on paragraph length?

left night Jul 1, 2024, 11:00 AM

#

You mean a different way to bound the paragraph?

sly pecan Jul 1, 2024, 11:00 AM

#

yeah

left night Jul 1, 2024, 11:00 AM

#

maybe. could also be that there's a smarter way to apply the bound. that's just what I came up with.

sly pecan Jul 1, 2024, 11:00 AM

#

I guess there are diminishing returns. Few paragraphs have thousands of words anyway

left night Jul 1, 2024, 11:01 AM

#

yeah, I'm quite happy that it's this way around, rather than the bound only kicking in late

#

I also enjoy a lot that something that feels like algorithms class actually brings a lot of real-world performance here.

#

I did a course on bounds for graph vertex cover optimization in university, which was somewhat similar.

left night Jul 1, 2024, 11:07 AM

#

sly pecan That's still a nice gain

roughly 2.85s -> 2.65s

sly pecan Jul 1, 2024, 11:07 AM

#

"Unfortunately, this approach of computing widths falls down with proper text shaping"

#

What does luahbtex do when using harfbuzz?

sturdy sequoia Jul 1, 2024, 11:10 AM

#

left night roughly 2.85s -> 2.65s

hey, that's pretty good

left night Jul 1, 2024, 11:12 AM

#

sly pecan What does luahbtex do when using harfbuzz?

I wondered that too but haven't checked yet!

sturdy sequoia Jul 1, 2024, 11:22 AM

#

left night I wondered that too but haven't checked yet!

In my testing there is pretty much no difference on The Thesis™️

#

But that's okay

sly pecan Jul 1, 2024, 11:25 AM

#

left night I wondered that too but haven't checked yet!

https://github.com/latex3/luaotfload/issues/152 discussion here may possibly be relevant

GitHub

harf: missed hyphenation points with kerns · Issue #152 · latex3/lu...

With harf, almost no hyphenation points are found here. The reason is the harf mode is creating a huge discretionary with all those letters instead of inserting small ones with just the kern in rep...

#

sounds like latex may cheat a bit?

#

would probably be easiest just to ask the latex and/or harfbuzz devs

#

Another consideration is microtypographical features, which will reduce the need for hyphenation

feral imp Jul 1, 2024, 11:31 AM

#

That "oi wiki" document would probably show a delta.

sly pecan Jul 1, 2024, 11:31 AM

#

@untold turret

feral imp Jul 1, 2024, 11:32 AM

#

sly pecan @untold turret

My heart tells me we should ping @onyx furnace right? You're the arbiter of the big-wiki-file, right?

onyx furnace Jul 1, 2024, 11:33 AM

#

yes. but i cannot find the zip file now😂 i can regenerate one if it's needed

sturdy sequoia Jul 1, 2024, 11:35 AM

#

feral imp That "oi wiki" document would probably show a delta.

it's mostly bottlenecked by plugins

#

😭

#

The version I have is too old to even compile on main @onyx furnace 😂

left night Jul 1, 2024, 11:40 AM

#

sturdy sequoia In my testing there is pretty much no difference on *The Thesis™️*

well, it's for once an optimization that's not tuned for the thesis

feral imp Jul 1, 2024, 11:40 AM

#

onyx furnace yes. but i cannot find the zip file now😂 i can regenerate one if it's needed

We have to make laurmaedje feel better about his PR.

sturdy sequoia Jul 1, 2024, 11:40 AM

#

left night well, it's for once an optimization that's not tuned for the thesis

angryeyes

#

But but but.....

left night Jul 1, 2024, 11:40 AM

#

feral imp We have to make laurmaedje feel better about his PR.

I feel good since it's more than 8x for my test document.

sturdy sequoia Jul 1, 2024, 11:40 AM

#

left night I feel good since it's more than 8x for my test document.

that's pretty dang impressive

#

are there many docs that are bottlenecked by paragraph layout?

left night Jul 1, 2024, 11:41 AM

#

I would wager to guess that most documents that aren't bottlenecked by something else and use justification are bottlenecked by this.

#

Most prominently of course books

#

And test benchmarks of people coming from LaTeX ^^

sturdy sequoia Jul 1, 2024, 11:43 AM

#

left night I would wager to guess that most documents that aren't bottlenecked by something...

I wonder what the thesis is bottlenecked by? thinkies

#

'cause I have no idea at this point 😄

#

by images & codly

#

I guess I shouldn't be surprised 😄

sly pecan Jul 1, 2024, 11:51 AM

#

sturdy sequoia by images & codly

I would've thought images were pretty fast

sturdy sequoia Jul 1, 2024, 11:52 AM

#

sly pecan I would've thought images were pretty fast

nope

#

I mean the multithreading helps

#

but on each thread it's still fairly slow

#

not much we can do there I am afraid

sly pecan Jul 1, 2024, 11:52 AM

#

sturdy sequoia but on each thread it's still fairly slow

png?

sturdy sequoia Jul 1, 2024, 11:53 AM

#

sly pecan png?

yes

sly pecan Jul 1, 2024, 11:53 AM

#

sturdy sequoia yes

Are they high resolution?

sturdy sequoia Jul 1, 2024, 11:53 AM

#

sly pecan Are they high resolution?

decently yeah

#

I would love if we had impeccable SVG support

#

it would be so good

sly pecan Jul 1, 2024, 11:53 AM

#

No one needs more than 640x480

sturdy sequoia Jul 1, 2024, 11:54 AM

#

sly pecan No one needs more than 640x480

and no one will ever need more than 256MB of RAM and a 20GB Hard drive

sly pecan Jul 1, 2024, 11:54 AM

#

sturdy sequoia I would love if we had impeccable SVG support

It's mostly there isn't it?

lunar kettle Jul 1, 2024, 11:54 AM

#

sturdy sequoia I would love if we had impeccable SVG support

Are we not impeccable yet 😠😂😂

sturdy sequoia Jul 1, 2024, 11:54 AM

#

lunar kettle Are we not impeccable yet 😠😂😂

I think it's more of a draw.io issue at this point uwueasterhead

lunar kettle Jul 1, 2024, 11:54 AM

#

indeed 😦

sturdy sequoia Jul 1, 2024, 11:55 AM

#

they're like "we only want to support the web browser"

#

Like just add an option "inline everything" in your goddamn software ffs

lunar kettle Jul 1, 2024, 11:55 AM

#

@sly pecan I love how they marked your one comment in a draw io issue as a duplicate 😂

sturdy sequoia Jul 1, 2024, 11:55 AM

#

lunar kettle @sly pecan I love how they marked your one comment in a draw io issue...

Care to send it? 😄

sly pecan Jul 1, 2024, 11:55 AM

#

lunar kettle @sly pecan I love how they marked your one comment in a draw io issue...

Wait what

lunar kettle Jul 1, 2024, 11:56 AM

#

Anyway off topic

sturdy sequoia Jul 1, 2024, 11:57 AM

#

I wish we had a draw.io like tool that generates Typst code

#

(side project anyone?)

left night Jul 1, 2024, 11:59 AM

#

sturdy sequoia I wonder what the thesis is bottlenecked by? thinkies

general layout stuff I guess

lunar kettle Jul 1, 2024, 11:59 AM

#

@sturdy sequoia btw does vtune work with programs that only run for a few ms?

left night Jul 1, 2024, 11:59 AM

#

all the flows and pads and grids

sturdy sequoia Jul 1, 2024, 12:00 PM

#

lunar kettle @sturdy sequoia btw does vtune work with programs that only run for a few ...

no, use callgrind for those

#

you will get barely any samples at that point

sturdy sequoia Jul 1, 2024, 12:00 PM

#

left night all the flows and pads and grids

yeah, codly and images too

left night Jul 1, 2024, 12:00 PM

#

but maybe actually no. most of those layouts end up in style or context

sturdy sequoia Jul 1, 2024, 12:01 PM

#

There is one page that takes 450ms per iteration all on its own and it's a big table 😄

left night Jul 1, 2024, 12:01 PM

#

sturdy sequoia not much we can do there I am afraid

in some cases, it's possible to reuse the encoded PNG or JPEG data. but it's a bit of work since there is no crate yet that gives such low-level access to the image data. not hard to write, but needs some time.

sturdy sequoia Jul 1, 2024, 12:02 PM

#

left night in some cases, it's possible to reuse the encoded PNG or JPEG data. but it's a b...

and probably somewhat error prone *

#

right?

left night Jul 1, 2024, 12:02 PM

#

not necessarily

#

if you want to be sure, you'd still need to decode the image, but not encode

feral imp Jul 1, 2024, 12:03 PM

#

left night And test benchmarks of people coming from LaTeX ^^

Ah. Yes, you should feel good about this PR then.

left night Jul 1, 2024, 12:05 PM

#

sly pecan sounds like latex may cheat a bit?

yes, it does sound like it

#

XeTeX splits text at hyphenation points and then shapes only the parts between the hyphenation points. This ignores ligatures/kerning across discretionaries. Then these widths are used for linebreaking. After linebreaking, the horizontal lists are reshaped, this time taking kerning/ligatures across discretionaries into account.

#

so, it's sort of like if we used the approximate result directly instead of as a bound

#

which would speed up things much more. but it's a hack.

sturdy sequoia Jul 1, 2024, 12:11 PM

#

left night which would speed up things _much more._ but it's a hack.

I feel like we can do it as you've done, no need for a hack that might result in incorrect linebreaking imo

left night Jul 1, 2024, 12:12 PM

#

agreed

sly pecan Jul 1, 2024, 12:19 PM

#

The alternative would be more granular control than simple and optimized I guess

left night Jul 1, 2024, 12:20 PM

#

sly pecan The alternative would be more granular control than simple and optimized I guess

the thing is that the approximate layout can result in overfull lines

#

which, from my reading, XeTeX would also suffer from
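To make the overfull-line risk concrete, here is a toy Rust sketch. It assumes, purely for illustration, that the exact width of a line equals the sum of the separately shaped piece widths plus a small adjustment at each joint (kerning, ligatures). The functions `approx_width` and `exact_width` and all numbers are made up; none of this is taken from Typst or XeTeX.

```rust
/// Approximate line width: every piece is shaped on its own and the
/// widths (in font units) are simply added up.
fn approx_width(pieces: &[i64]) -> i64 {
    pieces.iter().sum()
}

/// "Exact" width in this toy model: the approximation plus one adjustment
/// per joint between neighbouring pieces (kerning, ligatures, ...).
fn exact_width(pieces: &[i64], joint_adjust: &[i64]) -> i64 {
    approx_width(pieces) + joint_adjust.iter().sum::<i64>()
}

fn main() {
    let line_width = 1000; // hypothetical available width
    let pieces = [400, 350, 245]; // hypothetical piece widths
    let joint_adjust = [5, 3]; // hypothetical positive adjustments at the two joints

    let approx = approx_width(&pieces);
    let exact = exact_width(&pieces, &joint_adjust);

    // The approximation claims that the line fits ...
    println!("approx {approx} fits: {}", approx <= line_width);
    // ... but after reshaping the whole line it is slightly overfull.
    println!("exact  {exact} fits: {}", exact <= line_width);
}
```

If the approximate width is only used as a bound, the final decision is still made on exact widths, so a case like this one would not slip through.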

sly pecan Jul 1, 2024, 12:21 PM

#

How overfull are we talking? Many sins can be hidden by adjusting spacing

left night Jul 1, 2024, 12:21 PM

#

typically not much, but fonts can do random shit

keen scroll Jul 1, 2024, 12:43 PM

#

Just blame it on the font if that happens 😈

left night Jul 1, 2024, 12:51 PM

#

keen scroll Just blame it on the font if that happens 😈

that's the LaTeX way :p we don't have to since we do it properly.

sturdy sequoia Jul 1, 2024, 1:27 PM

#

left night that's the LaTeX way :p we don't have to since we do it properly.

I mean we do things properly while being heaps faster

#

so you know... we're right 😎

left night Jul 1, 2024, 1:28 PM

#

sturdy sequoia I mean we do things properly while being heaps faster

at least a lot faster than LuaLaTeX

#

pdfLaTeX is a matter of how you view it:

#

I think it's still faster for a single plain compilation with just a lot of text (not 100% sure), but you need to run it multiple times for outline etc. so I guess a single run is more akin to a typst watch cycle

sly pecan Jul 1, 2024, 5:09 PM

#

What's the deal with this jump? The slope seems to change after too

slim sequoia Jul 1, 2024, 5:21 PM

#

The post suggested memory bottlenecking, perhaps that's where the bottleneck becomes apparent in the new algorithm

sly pecan Jul 1, 2024, 5:23 PM

#

slim sequoia The post suggested memory bottlenecking, perhaps that's where the bottleneck bec...

Oh you mean that's where l1/l2 cache is full?

slim sequoia Jul 1, 2024, 5:24 PM

#

I'm not sufficiently technologically literate to know what that means or if its the case

#

I'm a chemist simpleton 😛

sly pecan Jul 1, 2024, 5:29 PM

#

Upon further reflection, that doesn't make sense either, because the y axis isn't time, it's the number of lines built

#

It's just strange to me that there's such a distinct jump

left night Jul 1, 2024, 5:31 PM

#

sly pecan It's just strange to me that there's such a distinct jump

it's probably a very specific line being approximated badly leading to a less tight bound

#

I assume that this particular bump is specific to lorem ipsum with default settings

sly pecan Jul 1, 2024, 5:32 PM

#

That makes sense

sturdy sequoia Jul 1, 2024, 6:39 PM

#

sly pecan Oh you mean that's where l1/l2 cache is full?

most likely L3

#

L1/L2 have basically the same latency ||gross oversimplification|| most likely when it can't rely on L3 anymore and has to go to main mem

sly pecan Jul 1, 2024, 6:40 PM

#

sturdy sequoia most likely L3

Memory isn't the culprit here, since the y axis isn't time

sturdy sequoia Jul 1, 2024, 6:41 PM

#

sly pecan Memory isn't the culprit here, since the y axis isn't time

no but more memory means more latency means more slower

sly pecan Jul 1, 2024, 6:42 PM

#

sturdy sequoia no but more memory means more latency means more slower

Well yes

#

But it's not relevant here

sturdy sequoia Jul 1, 2024, 6:43 PM

#

hmm maybe

sly pecan Jul 1, 2024, 6:44 PM

#

sturdy sequoia hmm maybe

The y axis is just the number of lines

sturdy sequoia Jul 1, 2024, 6:45 PM

#

well

#

Ich bin dumb

sly pecan Jul 1, 2024, 6:49 PM

#

sturdy sequoia Ich bin dumb

I made the same mistake 😀

sturdy sequoia Jul 1, 2024, 6:51 PM

#

sly pecan I made the same mistake 😀

rip in pepperoni to the both of us

left night Jul 2, 2024, 8:28 AM

#

@sturdy sequoia does perfetto also always initialize fully zoomed in for you since a while? I think originally it showed the full trace by default but now I need to drag the slicer all the way to the right every time. I can't figure out what the cause is.

sturdy sequoia Jul 2, 2024, 8:29 AM

#

@left night yes, it's really gosh darn annoying

left night Jul 2, 2024, 8:29 AM

#

they must've shipped some bug

#

cause I deleted all local data

#

maybe this? https://github.com/google/perfetto/issues/810

sturdy sequoia Jul 2, 2024, 6:05 PM

#

left night maybe this? <https://github.com/google/perfetto/issues/810>

yes I think so

sturdy sequoia Jul 3, 2024, 9:17 AM

#

@left night interestingly, on my thesis, I resized (using a script 😅) all of the images and it had zero impact on compile times! (probably thanks to parallel compilation)

#

But, removing the glossary calls in my code blocks saves almost 0.4s of compile time (1.8s -> 1.4s cold)

#

Also makes incremental quite a bit faster 🎉

#

Although it does save almost 1GB of RAM when compiling lol

left night Jul 3, 2024, 9:19 AM

#

sturdy sequoia @left night interestingly, on my thesis, I resized (using a script 😅)...

time to write a more efficient glossary

sturdy sequoia Jul 3, 2024, 9:20 AM

#

left night time to write a more efficient glossary

Yes but this all makes me think that:

- There should be a basic "image" resizer in the webapp for convenience when uploading like "oh, you uploaded an image and it's quite large, wanna resize it?"
- We should have a quick guide on how to make fast docs
- And yes, we need a more efficient built-in glossary 😄

#

BTW @left night could I ask you (and it's likely a big ask) but to compare the VM and main in the webapp? (since obviously I can't do that), I am curious if the VM helps in WASM at all or not 🙂

#

The big advantage of the VM (in theory) is also that it could be JIT'ed in WASM

left night Jul 3, 2024, 9:28 AM

#

sturdy sequoia BTW @left night could I ask you (and it's likely a big ask) but to com...

I'll try that on a weekend sometime.

sturdy sequoia Jul 3, 2024, 9:29 AM

#

left night I'll try that on a weekend sometime.

Thanks 😄

sly pecan Jul 4, 2024, 4:58 PM

#

@sturdy sequoia did the new thing affect The Thesis™️?

sturdy sequoia Jul 4, 2024, 4:59 PM

#

sly pecan @sturdy sequoia did the new thing affect The Thesis™️?

Which one?

sly pecan Jul 4, 2024, 4:59 PM

#

sturdy sequoia Which one?

https://github.com/typst/typst/pull/4497

sturdy sequoia Jul 4, 2024, 5:01 PM

#

I shall try it then

slim sequoia Jul 4, 2024, 5:02 PM

#

wouldn't also mind knowing how it looks different if at all

sly pecan Jul 4, 2024, 5:02 PM

#

slim sequoia wouldn't also mind knowing how it looks different if at all

If it's just a refactor it shouldn't change at all

#

Apart from bug fixes

slim sequoia Jul 4, 2024, 5:03 PM

#

in which case dherse whatever you do please don't tell me if it looks different at all

sly pecan Jul 4, 2024, 5:22 PM

#

sturdy sequoia I shall try it then

proud sandal Jul 4, 2024, 5:24 PM

#

must've worsened the performance by quite a bit then

sturdy sequoia Jul 4, 2024, 5:24 PM

#

Y'all know I work like... a difficult job that you know, requires most of my brain power 😂

proud sandal Jul 4, 2024, 5:24 PM

#

it's still compiling

sturdy sequoia Jul 4, 2024, 5:24 PM

#

I just got on my PC :-p

#

And then I was stuck in traffic for an hour

#

because even the f-ing E40 was clogged for some f-ing reason that blows my mind

sly pecan Jul 4, 2024, 5:27 PM

#

sturdy sequoia Y'all know I work like... a difficult job that you know, requires most of my bra...

I was joking. Take your time ❤️

sturdy sequoia Jul 4, 2024, 5:27 PM

#

Plus I thought I would get fired yesterday, which obviously didn't happen and I was wayyy over reacting

#

but damn he drained me

sly pecan Jul 4, 2024, 5:27 PM

#

sturdy sequoia Plus I thought I would get fired yesterday, which obviously didn't happen and I ...

Whaaaaa

sturdy sequoia Jul 4, 2024, 5:28 PM

#

sly pecan Whaaaaa

Yeah, dumb company was mad that I used a colleague's license for their software to try if it was the right fit for my work

#

they got into a bit of a hissy fit while we were on call (just me and them)

#

had to calm them down

#

and immediately talked to my management who said that they were a pain in the neck and that I reacted correctly

#

I was shooketh for sure

#

I wish, I meant "it"

#

😭

slim sequoia Jul 4, 2024, 5:29 PM

#

The gods hath spoken

sturdy sequoia Jul 4, 2024, 5:32 PM

#

almost done, just waiting on main to compile release

#

I also optimized The Thesis™️ itself

#

it compiles in barely 1s now!

slim sequoia Jul 4, 2024, 5:33 PM

#

But but

sly pecan Jul 4, 2024, 5:33 PM

#

sturdy sequoia it compiles in barely 1s now!

Before or after recent commits?

slim sequoia Jul 4, 2024, 5:33 PM

#

How are we meant to say "wow goes vroom" if it's already vroom

#

😠

sturdy sequoia Jul 4, 2024, 5:33 PM

#

slim sequoia How are we meant to say "wow goes vroom" if it's already vroom

but it can vroom more

slim sequoia Jul 4, 2024, 5:33 PM

#

:3

sturdy sequoia Jul 4, 2024, 5:33 PM

#

https://tenor.com/view/cars-lightningmcqueen-kachow-gif-4810661

slim sequoia Jul 4, 2024, 5:34 PM

#

Someone needs to keep track of compile times in a chart

sly pecan Jul 4, 2024, 5:34 PM

#

No need for incremental compilation anymore

slim sequoia Jul 4, 2024, 5:34 PM

#

X axis release, y axis inverse of compile time

sly pecan Jul 4, 2024, 5:34 PM

#

slim sequoia Someone needs to keep track of compile times in a chart

Stonks

sturdy sequoia Jul 4, 2024, 5:34 PM

#

sly pecan No need for incremental compilation anymore

@left night in shambles right now 😂

#

To be fair, incremental is sloooooooooooooooooooooooooow now

#

it takes 0.4s per iteration

#

which is slow compared to cold compile

slim sequoia Jul 4, 2024, 5:35 PM

#

I blame a lack of VM

sly pecan Jul 4, 2024, 5:35 PM

#

It's gotten slower? Or just relative to cold

sturdy sequoia Jul 4, 2024, 5:35 PM

#

I'll do a custom build without comemo in a few minutes 😂

sturdy sequoia Jul 4, 2024, 5:35 PM

#

sly pecan It's gotten slower? Or just relative to cold

it's gotten slower at some point

#

but don't know which commit

sly pecan Jul 4, 2024, 5:35 PM

#

sturdy sequoia it's gotten slower at some point

Oh noes

glad urchin Jul 4, 2024, 5:36 PM

#

sturdy sequoia it's gotten slower at some point

sorry we had to intentionally slow down your thesis to ensure we can still run benchmarks on it

sturdy sequoia Jul 4, 2024, 5:38 PM

#

Note: PDF export on Windows native

#

so to me it does look like decent gains """for free"""

#

Wait no

#

I ran the VM runs with timings

#

DON'T LOOK AT THE RESULTS

sturdy sequoia Jul 4, 2024, 5:41 PM

#

branch     cold     inc
main pre:  1.63s   401ms
main post: 1.46s   406ms
vm pre:    1.45s   400ms
vm post:   1.36s   410ms

The VM is so f-ing useless on The Thesis™️ lol

#

Ok, There was only one bad result, it's fixed now 😉

#

note that "main pre" is a bit older than just before this commit afaik so the difference should be a bit lower

sly pecan Jul 4, 2024, 5:43 PM

#

0.1s isn't bad

sturdy sequoia Jul 4, 2024, 5:43 PM

#

sly pecan 0.1s isn't bad

yeah I agree

sly pecan Jul 4, 2024, 5:43 PM

#

Weird that incremental is comparatively slow

sturdy sequoia Jul 4, 2024, 5:43 PM

#

sly pecan Weird that incremental is comparatively slow

yeah incremental has become crazy slow at some point

#

but I don't know when

#

For some reason this citation is by far the slowest

#

taking 25 ms all on its own lol

sly pecan Jul 4, 2024, 5:47 PM

#

I guess incremental performance is harder to figure out

feral imp Jul 4, 2024, 5:47 PM

#

sly pecan I was joking. Take your time ❤️

sly pecan Jul 4, 2024, 5:48 PM

#

Have you tried the speculative execution branch?

sturdy sequoia Jul 4, 2024, 5:50 PM

#

sly pecan Have you tried the speculative execution branch?

https://tenor.com/view/the-what-smile-whut-weird-stare-gif-16592004

#

you mean in incremental? then yes but I don't remember the results

sly pecan Jul 4, 2024, 5:50 PM

#

sturdy sequoia you mean in incremental? then yes but I don't remember the results

Yeah

#

Ok

feral imp Jul 4, 2024, 5:51 PM

#

Incremental has slowed down... But more stuff happens now.... You are tablex free now? But incremental is slower than it used to be.

😬

sly pecan Jul 4, 2024, 5:53 PM

#

sturdy sequoia For some reason this citation is by far the slowest

25 Ms for a single citation is ridonkulous

sturdy sequoia Jul 4, 2024, 5:53 PM

#

feral imp Incremental has slowed down... But more stuff happens now.... You are tablex fre...

I optimized the heck out of my thesis (not pushed yet): smaller figures, replaced fixed refs with links (heaps faster), native tables, etc.

sturdy sequoia Jul 4, 2024, 5:54 PM

#

sly pecan 25 Ms for a single citation is ridonkulous

I agree, if a single citation took 80% of a year to complete I would be worried too

#

😎

sly pecan Jul 4, 2024, 5:54 PM

#

I blame autocorrect

left night Jul 4, 2024, 5:54 PM

#

sturdy sequoia ``` branch cold inc main pre: 1.63s 401ms main post: 1.46s 406ms vm...

what is pre and post here?

sturdy sequoia Jul 4, 2024, 5:55 PM

#

left night what is pre and post here?

pre and post the PR you did to refactor something to do with lines

left night Jul 4, 2024, 5:55 PM

#

sturdy sequoia it's gotten slower at some point

I still cannot reproduce that

sturdy sequoia Jul 4, 2024, 5:55 PM

#

left night I still cannot reproduce that

I mean it used to take 100ms on my machine

left night Jul 4, 2024, 5:55 PM

#

sturdy sequoia pre and post the PR you did to refactor something to do with lines

what why did it speed up so much. the goal of the PR wasn't even performance.

sturdy sequoia Jul 4, 2024, 5:55 PM

#

maybe it's Windows specific

sturdy sequoia Jul 4, 2024, 5:55 PM

#

left night what why did it speed up so much. the goal of the PR wasn't even performance.

main pre is a tad older, the VM is more comparable here

#

sowwy

left night Jul 4, 2024, 5:55 PM

#

I mean, I did optimize some stuff, but I didn't expect any real-world gains

feral imp Jul 4, 2024, 5:55 PM

#

Did you mess up with environment variables and the package system maybe?

sly pecan Jul 4, 2024, 5:56 PM

#

left night what why did it speed up so much. the goal of the PR wasn't even performance.

Happy little accidents?

left night Jul 4, 2024, 5:56 PM

#

let me see whether I can reproduce this result

feral imp Jul 4, 2024, 5:57 PM

#

Does your thesis use any package from the universe? It might be something like that... 🤡

sturdy sequoia Jul 4, 2024, 5:57 PM

#

feral imp Does your thesis use any package from the universe? It might be something like t...

in incremental I wouldn't expect so, no

sturdy sequoia Jul 4, 2024, 5:57 PM

#

left night I mean, I did optimize some stuff, but I didn't expect any real-world gains

I mean 0.1s is good, no?

#

You should really only compare with the VM here because the main pre was a slightly older build of main

#

(I can't be arsed to do a new one)

feral imp Jul 4, 2024, 5:58 PM

#

sturdy sequoia I mean 0.1s is good, no?

That's noticeable, so that's very good. If it happened unintentionally... That's suspicious, but good?

left night Jul 4, 2024, 5:58 PM

#

sturdy sequoia I mean 0.1s is good, no?

it is good, this is what I'm surprised about

#

let me see, just compiling --release and it always takes so long

sturdy sequoia Jul 4, 2024, 5:58 PM

#

left night it is good, this is what I'm surprised about

you did mention in the PR using some Cow maybe less cloning?
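For readers unfamiliar with the `Cow` being discussed: a minimal sketch of how `std::borrow::Cow` can avoid cloning when nothing needs to change. The helper `expand_tabs` is hypothetical and has nothing to do with the actual PR; it only illustrates the "borrow if possible, allocate if necessary" idea.

```rust
use std::borrow::Cow;

/// Replaces tabs with spaces, but only allocates when the input
/// actually contains a tab. (Hypothetical helper, not Typst code.)
fn expand_tabs(text: &str) -> Cow<'_, str> {
    if text.contains('\t') {
        // Something has to change, so we pay for a new String.
        Cow::Owned(text.replace('\t', " "))
    } else {
        // Nothing to do: hand back the original data without copying.
        Cow::Borrowed(text)
    }
}

fn main() {
    let untouched = expand_tabs("no tabs here");
    let changed = expand_tabs("a\tb");
    println!("{untouched} / {changed}");
}
```

Whether this kind of change is what produced the measured difference is not established in the conversation; it is just one plausible source of fewer allocations.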

sturdy sequoia Jul 4, 2024, 5:59 PM

#

left night let me see, just compiling --release and it always takes so long

yeah, it's that codegen-units=1 at play here

left night Jul 4, 2024, 5:59 PM

#

no, if anything it's Preparation::slice

#

cause that is O(n) -> O(1)

sturdy sequoia Jul 4, 2024, 5:59 PM

#

And perhaps accidentally better caching?

sturdy sequoia Jul 4, 2024, 5:59 PM

#

left night cause that is O(n) -> O(1)

maybe that too

#

@left night I think I actually found a """simple""" way of optimizing eval angrythunk

#

Without doing anything fancy

#

But y'all'll have to wait to know what it is 😈

left night Jul 4, 2024, 6:04 PM

#

sturdy sequoia I mean 0.1s is good, no?

for me, it's not really faster. It's pretty much the same.

#

I compared directly the commits immediately before and after the PR

sturdy sequoia Jul 4, 2024, 6:04 PM

#

left night for me, it's not really faster. It's pretty much the same.

I mean, for me it was repeatably faster by 0.1s and the run-to-run variance is basically zero

#

another AMD win 💪

left night Jul 4, 2024, 6:05 PM

#

or maybe it's actually Apple Silicon that dealt well with the crappy old code :p

sharp garnet Jul 4, 2024, 6:05 PM

#

sturdy sequoia But y'all'll have to wait to know what it is 😈

flat ast by any chance?

sturdy sequoia Jul 4, 2024, 6:06 PM

#

sharp garnet flat ast by any chance?

I mean, it kind of is? thinkies

#

No, it's Cow, with me it's always Cow

sharp garnet Jul 4, 2024, 6:06 PM

#

Ok, then here's a suggestion for optimization: https://www.cs.cornell.edu/~asampson/blog/flattening.html

Flattening ASTs (and Other Compiler Data Structures)

This is an introduction to data structure flattening, a special case of arena allocation that is a good fit for programming language implementations. We build a simple interpreter twice, the normal way and the flat way, and show that some fairly mechanical code changes can give you a 2.4× speedup.
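A rough sketch of what "flattening" means, in Rust: nodes live in one `Vec` and refer to each other by index instead of by `Box`. The types `Expr`, `ExprId`, and `Arena` are invented for this example and are unrelated to Typst's real syntax tree.

```rust
/// Index of a node inside the arena (instead of a Box pointer).
#[derive(Clone, Copy)]
struct ExprId(u32);

/// A tiny expression language, just for illustration.
enum Expr {
    Lit(i64),
    Add(ExprId, ExprId),
    Mul(ExprId, ExprId),
}

/// All nodes are stored contiguously in one vector.
#[derive(Default)]
struct Arena {
    nodes: Vec<Expr>,
}

impl Arena {
    /// Appends a node and returns its index.
    fn push(&mut self, expr: Expr) -> ExprId {
        let id = ExprId(self.nodes.len() as u32);
        self.nodes.push(expr);
        id
    }

    /// Evaluates the expression rooted at `id` by following indices.
    fn eval(&self, id: ExprId) -> i64 {
        match &self.nodes[id.0 as usize] {
            Expr::Lit(value) => *value,
            Expr::Add(a, b) => self.eval(*a) + self.eval(*b),
            Expr::Mul(a, b) => self.eval(*a) * self.eval(*b),
        }
    }
}

fn main() {
    // Build (2 + 3) * 4 in the flat representation.
    let mut arena = Arena::default();
    let two = arena.push(Expr::Lit(2));
    let three = arena.push(Expr::Lit(3));
    let sum = arena.push(Expr::Add(two, three));
    let four = arena.push(Expr::Lit(4));
    let product = arena.push(Expr::Mul(sum, four));
    println!("{}", arena.eval(product));
}
```

The potential win is fewer allocations and better locality; the potential cost, as raised in the chat, is code that is harder to maintain and a less natural fit for incremental reparsing.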

sturdy sequoia Jul 4, 2024, 6:07 PM

#

@left night isn't our AST flattened?

left night Jul 4, 2024, 6:07 PM

#

sturdy sequoia @left night isn't our AST flattened?

nope

sturdy sequoia Jul 4, 2024, 6:07 PM

#

left night nope

oh, then it might indeed be worth it thinkeyes

left night Jul 4, 2024, 6:07 PM

#

one might try it, but I do not believe that it will be faster

slim sequoia Jul 4, 2024, 6:07 PM

#

sturdy sequoia But y'all'll have to wait to know what it is 😈

You tease

sturdy sequoia Jul 4, 2024, 6:07 PM

#

slim sequoia You tease

it's Cows

#

I love Cows

left night Jul 4, 2024, 6:08 PM

#

and it might make the code less maintainable

sturdy sequoia Jul 4, 2024, 6:08 PM

#

the rust kind

sturdy sequoia Jul 4, 2024, 6:08 PM

#

left night and it might make the code less maintainable

that's what I'd be afraid of too

left night Jul 4, 2024, 6:08 PM

#

also it would mean a bit more overhead on incremental reparsing

#

but probably negligible

sturdy sequoia Jul 4, 2024, 6:08 PM

#

Does parsing actually have a performance impact?

left night Jul 4, 2024, 6:09 PM

#

not much I think

sturdy sequoia Jul 4, 2024, 6:09 PM

#

left night not much I think

figures, are we parsing in parallel?

left night Jul 4, 2024, 6:09 PM

#

nope

#

cause it happens on demand

sturdy sequoia Jul 4, 2024, 6:09 PM

#

yeah figures

left night Jul 4, 2024, 6:09 PM

#

but if imports were parallel then we would

sturdy sequoia Jul 4, 2024, 6:10 PM

#

Indeed

left night Jul 4, 2024, 6:10 PM

#

though maybe not

#

depends on the World actually

sturdy sequoia Jul 4, 2024, 6:10 PM

#

if an import or include returned a Deferred<T> i GUESS

left night Jul 4, 2024, 6:10 PM

#

cause it returns a parsed Source

sturdy sequoia Jul 4, 2024, 6:10 PM

#

Sorry caps

#

But indeed I don't think World is easy to Send + Sync?

#

or we'd need a kind of ScopedDeferred<T>

#

using scoped threads
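For context, a minimal sketch of the scoped-threads idea using `std::thread::scope`, which lets threads borrow data that is not `'static`. The `parse` function is a trivial stand-in (it just counts words) and `parse_all` is hypothetical; a real `ScopedDeferred<T>` does not exist in the source and is not modelled here.

```rust
use std::thread;

/// Stand-in for real parsing work: counts whitespace-separated words.
fn parse(source: &str) -> usize {
    source.split_whitespace().count()
}

/// "Parses" every source on its own thread while only borrowing the inputs.
fn parse_all(sources: &[String]) -> Vec<usize> {
    thread::scope(|scope| {
        // Spawn all threads first so they can run concurrently.
        let handles: Vec<_> = sources
            .iter()
            .map(|source| scope.spawn(move || parse(source)))
            .collect();
        // Then collect the results in input order.
        handles
            .into_iter()
            .map(|handle| handle.join().unwrap())
            .collect()
    })
}

fn main() {
    let sources = vec![
        String::from("#import \"a.typ\": *"),
        String::from("= Heading with four words"),
    ];
    println!("{:?}", parse_all(&sources));
}
```

The scope guarantees that all spawned threads finish before `parse_all` returns, which is why borrowing `sources` is allowed. The borrowed data still has to be `Sync` to be shared this way.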

left night Jul 4, 2024, 6:13 PM

#

sturdy sequoia But indeed I don't think `World` is easy to `Send + Sync`?

it already is?

sturdy sequoia Jul 4, 2024, 6:14 PM

#

left night it already is?

hmmmmmm thinkinglare

left night Jul 4, 2024, 6:16 PM

#

sturdy sequoia hmmmmmm thinkinglare

without send + sync world, no multithreading go brr

sturdy sequoia Jul 4, 2024, 6:18 PM

#

BTW, I just found a stupidly simple optimization, I'll open a PR for it

#

probably changes nothing in the real world

#

but it's so dumb I can't not open up a PR

feral imp Jul 4, 2024, 6:21 PM

#

Thesis is the real world (for now)

sturdy sequoia Jul 4, 2024, 6:22 PM

#

https://github.com/typst/typst/pull/4500

GitHub

Go from `String` to `&str` when passing font names to SVG code by D...

The name says it all, it serves no purpose (other than bloating my PR count 😎). I can't measure any difference, perhaps on some specific SVG heavy documents it might but it's unlikely. Anyw...

#

Pretty sure it's my smallest PR by an order of magnitude 😂

left night Jul 4, 2024, 6:27 PM

#

sturdy sequoia https://github.com/typst/typst/pull/4500

ah, that's a remnant of an older approach where the family name had to be kept around
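The general shape of a `String` to `&str` change, sketched in Rust. The functions `font_attr_owned` and `font_attr` and the family name are invented for illustration; the real signatures in the PR are not shown in this conversation.

```rust
/// Before: taking `String` forces callers to allocate or clone,
/// even though the name is only read.
fn font_attr_owned(family: String) -> String {
    format!("font-family=\"{family}\"")
}

/// After: borrowing is enough because the name does not need to be kept.
fn font_attr(family: &str) -> String {
    format!("font-family=\"{family}\"")
}

fn main() {
    let family = String::from("Example Sans");
    let before = font_attr_owned(family.clone()); // extra allocation for the clone
    let after = font_attr(&family); // argument is just a borrowed view
    assert_eq!(before, after);
    println!("{after}");
}
```

The output is identical either way; the only difference is that the borrowing version spares the caller one allocation per call.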

sturdy sequoia Jul 4, 2024, 6:34 PM

#

left night ah, that's a remnant of an older approach where the family name had to be kept a...

figured, it's a complete micro-optimization but I think it's okay, at the end of the day, it has no cost or complexity and it's cheaper

#

on very constrained devices it might even make the tiniest of differences 🤷‍♂️

left night Jul 4, 2024, 6:36 PM

#

sturdy sequoia figured, it's a complete micro-optimization but I think it's okay, at the end of...

yeah, it has no cost.

sly pecan Jul 4, 2024, 8:13 PM

#

@sturdy sequoia how much of the 400 ms is export?

sturdy sequoia Jul 4, 2024, 8:25 PM

#

sly pecan @sturdy sequoia how much of the 400 ms is export?

Like 80ms something like that

sly pecan Jul 4, 2024, 8:26 PM

#

Ok

#

Do you know why that citation takes 25 ms?

slim sequoia Jul 4, 2024, 9:23 PM

#

Found a replacement for @sturdy sequoia 's thesis https://github.com/typst/typst/issues/4501

GitHub

Extremely long compile time: 24h and running for a 270 mb typ file ...

Description my typst compile command on my Mac M2 runs for a full 24h already for a 270 mb file. Admittedly, it is a rather unusual document with one line only, consisting of around 57 million char...

glad urchin Jul 4, 2024, 9:24 PM

#

well

#

that's a lot of linebreaks

#

idk but this seems like an instance of https://www.youtube.com/watch?v=ibjLxdp6qg0

#

however i cannot reply with that video unfortunately

#

😂

sly pecan Jul 4, 2024, 9:44 PM

#

282 million characters is like 150 000 pages

#

What on earth is this person doing

left night Jul 4, 2024, 9:46 PM

#

if it has ultralong paragraphs, this is actually a case where my latest PR could help because they might be running into quadratic runtime

#

but still a very valid question what the heck they are doing

low sapphire Jul 4, 2024, 9:47 PM

#

sly pecan 0.1s isn't bad

I wonder how fast it would compile on a mid-range cpu^^

sly pecan Jul 4, 2024, 9:47 PM

#

left night if it has ultralong paragraphs, this is actually a case where my latest PR could...

I suspect they don't have enough memory to compile the document

low sapphire Jul 4, 2024, 9:47 PM

#

Dherse has insane specs IIRC

left night Jul 4, 2024, 9:47 PM

#

sly pecan I suspect they don't have enough memory to compile the document

but then it would be killed right?

glad urchin Jul 4, 2024, 9:48 PM

#

given they're on a mac, im assuming it's using a ton of swap

sly pecan Jul 4, 2024, 9:48 PM

#

left night but then it would be killed right?

Wouldn't it go to the swap file?

glad urchin Jul 4, 2024, 9:48 PM

#

iirc macos just doesnt care and will fill your whole disk with swap if needed

#

lol

#

so they might be the first person to have typst consume hundreds of gigabytes of memory. or something

#

i already thought my 10000 tables and grids consuming 60 GB of RAM was bad, but then someone in #quick-questions was doing this like for real

and now we have this

#

if anything this shows that the compiler is very resilient 😂

sly pecan Jul 4, 2024, 9:51 PM

#

Yeah this document is likely on the order of 1 TB of memory usage or more

glad urchin Jul 4, 2024, 9:51 PM

#

i wouldnt doubt it

#

especially since they said there are footnotes and whatever

#

i wonder if they're using something like AI or whatever to generate such a document

#

cuz theres no way i can think of someone actually typing all that

sly pecan Jul 4, 2024, 9:53 PM

#

glad urchin i wonder if they're using something like AI or whatever to generate such a docum...

Based on their description and their profile I suspect it may be a genome or something?

glad urchin Jul 4, 2024, 9:53 PM

#

hmmmmmm

#

their profile does suggest they'd do this kind of stuff yeah

#

😂

#

paging @feral imp for an assessment

sly pecan Jul 4, 2024, 9:55 PM

#

I guess at some point reducing memory usage should be looked into. But this particular one sounds pretty far outside of what would be considered reasonable

slim sequoia Jul 4, 2024, 9:55 PM

#

glad urchin idk but this seems like an instance of <https://www.youtube.com/watch?v=ibjLxdp6...

Do it

feral imp Jul 4, 2024, 9:55 PM

#

His main research interests are efficient algorithm development, developing automated pipelines for biological data analysis, epigenetics, phylogenetics and, more generally, finding creative solutions for a wide range of bioinformatics challenges.

glad urchin Jul 4, 2024, 9:56 PM

#

whatever they're cooking right now is no joke

slim sequoia Jul 4, 2024, 9:56 PM

#

Is the guy trying to print a human genome in hard case binding?

feral imp Jul 4, 2024, 9:56 PM

#

But people are really into this.. mega docs thing... Just the other day, someone had 250 pages note document........

slim sequoia Jul 4, 2024, 9:56 PM

#

(which NGL would be pretty cool)

glad urchin Jul 4, 2024, 9:57 PM

#

feral imp But people are really into this.. mega docs thing... Just the other day, someone...

those are rookie numbers

sly pecan Jul 4, 2024, 9:57 PM

#

feral imp But people are really into this.. mega docs thing... Just the other day, someone...

This is like 150 000 pages

feral imp Jul 4, 2024, 9:57 PM

#

sly pecan This is like 150 000 pages

Yes! I'm following.

glad urchin Jul 4, 2024, 9:57 PM

#

someone had 8k pages on #quick-questions the other day

#

lol

#

and this one idk

#

this one is just astronomical

#

i have no idea how many pages that would be

slim sequoia Jul 4, 2024, 9:57 PM

#

My second question is how the fuck did he have the patience to wait 24 hours before thinking perhaps he should ask if that's a normal amount of time

feral imp Jul 4, 2024, 9:57 PM

#

If you told me.. oh typst would be used to generate mega pages documents a year ago, I would have insulted you REPEATEDLY.

feral imp Jul 4, 2024, 9:58 PM

#

slim sequoia My second question is how the fuck did he have the patience to wait 24 hours bef...

Bioinformaticians have patience of a newton cradle.

glad urchin Jul 4, 2024, 9:58 PM

#

feral imp Bioinformaticians have patience of a newton cradle.

yeah i was gonna say, this is probably right up their alley

#

i assume biological simulations arent usually the cheapest stuff

slim sequoia Jul 4, 2024, 9:59 PM

#

feral imp Bioinformaticians have patience of a newton cradle.

(TBF I'd say a Newton's cradle level of patience is more like ADHD: short loud bursts surrounding long periods of stagnation)

feral imp Jul 4, 2024, 10:00 PM

#

slim sequoia (TBF I'd say a Newton's cradle level of patience is more like ADHD: short loud b...

It is hard for me to be maximally entertaining all the time. Sometimes, it is just noise 😛 🤣

slim sequoia Jul 4, 2024, 10:00 PM

#

👀

#

Lots of innuendos in this server today

sly pecan Jul 4, 2024, 10:16 PM

#

@glad urchin they added some more information. I have to admit I don't really understand what they're talking about.

glad urchin Jul 4, 2024, 10:17 PM

#

guess our estimates were way off

#

lol

sly pecan Jul 4, 2024, 10:18 PM

#

glad urchin guess our estimates were way off

You mean memory?

#

They haven't compiled the entire document yet though

#

Anyway do I understand correctly that they have 2.5 million calls to underline()?

glad urchin Jul 4, 2024, 10:19 PM

#

i mean

#

i guess using #let would make parsing faster

#

but thats about it

#

memoization would work the same way regardless i think, since the parameters are the same

left night Jul 5, 2024, 6:11 AM

#

glad urchin memoization would work the same way regardless i think, since the parameters are...

I don't think that's actually the case. Built in functions are not memoized by default, only user defined ones. So the fact that they didn't define a closure might actually conserve memory.
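A toy sketch of why memoization can cost memory: every distinct input keeps an entry in the cache. This is a plain `HashMap` cache written for illustration only; it is not how Typst's caching works, and `Memo` and its `underline` method are hypothetical.

```rust
use std::collections::HashMap;

/// Toy memoizer for a pure string -> string function.
struct Memo {
    cache: HashMap<String, String>,
    hits: usize,
}

impl Memo {
    fn new() -> Self {
        Memo { cache: HashMap::new(), hits: 0 }
    }

    /// Hypothetical "user-defined" function wrapped in a cache.
    fn underline(&mut self, text: &str) -> String {
        if let Some(cached) = self.cache.get(text).cloned() {
            self.hits += 1;
            return cached;
        }
        let result = format!("<u>{text}</u>");
        // Every distinct input stays in memory for later reuse.
        self.cache.insert(text.to_string(), result.clone());
        result
    }
}

fn main() {
    let mut memo = Memo::new();
    memo.underline("ACGT");
    memo.underline("ACGT"); // served from the cache
    memo.underline("TTGA");
    println!("hits: {}, cached entries: {}", memo.hits, memo.cache.len());
}
```

With millions of calls on mostly distinct inputs, such a cache would grow with the number of inputs while rarely producing a hit, which is the memory concern raised above.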

sturdy sequoia Jul 5, 2024, 8:55 AM

#

@left night does Args need to contain an EcoVec of items?

#

couldn't it just as well be a Vec?

#

is it cloned often? and does it matter when it is cloned?

#

because I would expect that it doesn't get cloned often

#

Ah I see, Vec is bigger than EcoVec

#

I am assuming that's the motivation here

#

Interestingly, increasing the size of Args does not increase the size of Value

#

That being said, on my machine it makes no difference

#

:/

left night Jul 5, 2024, 9:09 AM

#

sturdy sequoia I am assuming that's the motivation here

yeah since it's in Value it must be <= 24 bytes. but with arg sinks it's also cloned from time to time.
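The size argument can be checked with `std::mem::size_of`. The sketch below uses only the standard library: `Vec<T>` is three words (pointer, capacity, length), while a boxed slice is two words, which, as far as I know, is also the size of `EcoVec`. The structs `ArgsWithVec` and `ArgsThin` are hypothetical stand-ins with an 8-byte span field; they are not the real `Args` type.

```rust
use std::mem::size_of;

/// Hypothetical stand-in for an argument list backed by `Vec`
/// (pointer + capacity + length).
#[allow(dead_code)]
struct ArgsWithVec {
    span: u64,
    items: Vec<u8>,
}

/// Hypothetical stand-in for a thinner container: a boxed slice is just
/// pointer + length, standing in here for a two-word vector type.
#[allow(dead_code)]
struct ArgsThin {
    span: u64,
    items: Box<[u8]>,
}

fn main() {
    println!("Vec<u8>:     {} bytes", size_of::<Vec<u8>>());
    println!("Box<[u8]>:   {} bytes", size_of::<Box<[u8]>>());
    println!("ArgsWithVec: {} bytes", size_of::<ArgsWithVec>());
    println!("ArgsThin:    {} bytes", size_of::<ArgsThin>());
}
```

On a typical 64-bit target the thin variant comes out at 24 bytes and the `Vec` variant at 32, which matches the "must be <= 24 bytes" constraint mentioned above.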

sturdy sequoia Jul 5, 2024, 9:10 AM

#

left night yeah since it's in `Value` it must be <= 24 bytes. but with arg sinks it's also ...

yeah but some heavy(er) cloning for rare cases is fine imo

left night Jul 5, 2024, 9:10 AM

#

sturdy sequoia Interestingly, increasing the size of `Args` does not increase the size of `Valu...

weird

sturdy sequoia Jul 5, 2024, 9:10 AM

#

since Args are more mutably accessed than anything else afaik

left night Jul 5, 2024, 9:10 AM

#

some strange optimization

sturdy sequoia Jul 5, 2024, 9:10 AM

#

left night weird

or just some rust-analyzer quirkiness

left night Jul 5, 2024, 9:10 AM

#

maybe

#

seems more likely

#

since rust isn't that smart usually

sturdy sequoia Jul 5, 2024, 9:13 AM

#

left night since rust isn't _that_ smart usually

RIIZ

#

IS THAT WHY THE KID TALK ABOUT RIIZ?

#

/s

left night Jul 5, 2024, 9:14 AM

#

zig?

sturdy sequoia Jul 5, 2024, 9:14 AM

#

left night zig?

yes that's the joke

left night Jul 5, 2024, 9:14 AM

#

hadn't seen RIIZ before

sturdy sequoia Jul 5, 2024, 9:14 AM

#

I just invented it 😎

left night Jul 5, 2024, 9:14 AM

#

^^

sly pecan Jul 6, 2024, 6:01 PM

#

https://github.com/typst/typst/issues/4512

GitHub

Partial Evaluation for Huge Documents · Issue #4512 · typst/typst

Description Brief Provide a partial evaluation API, with semantics: generate PDF from page $i$ to page $i+m$. This feature makes sense only if intermediate results can be offloaded to disk storage....

sly pecan Jul 6, 2024, 6:21 PM

#

I'm confused

sturdy sequoia Jul 7, 2024, 11:15 AM

#

sly pecan I'm confused

I think this person assumes that we could just "render page n" and it would be faster than exporting a whole document

#

but it isn't

#

since exporting is actually among the fastest steps

sly pecan Jul 7, 2024, 11:16 AM

#

sturdy sequoia I think this person assumes that we could just "render page n" and it would be f...

yeah, though it's not even clear what they wanted, since they added offloading to storage into the mix

sturdy sequoia Jul 7, 2024, 11:16 AM

#

sly pecan yeah, though it's not even clear what they wanted, since they added offloading t...

Probably hopeful we could do something that fits their exact workload

sly pecan Jul 7, 2024, 11:17 AM

#

sturdy sequoia Probably hopeful we could do something that fits their exact workload

4*365 paragraphs isn't that obscenely huge, probably on the order of 500 pages?

#

I feel like there's a different reason why it doesn't compile

sturdy sequoia Jul 7, 2024, 11:21 AM

#

sly pecan 4*365 paragraphs isn't that obscenely huge, probably on the order of 500 pages?

Indeed, doesn't seem that huge imo

molten kayak Jul 7, 2024, 11:44 AM

#

Looking at their script, they are generating ~10000 tags (there could be duplicates, that should be pretty rare), each of which associated with a paragraph (considering 4 paragraphs a day this would take 6-7 years to write) of 100 words or tags (with tags being ~1/5 of the total). That's roughly 800000 words and 200000 refs in total.
If you bring down the number of tags to 1000 then it takes ~15 seconds (I'm on the latest release, it might be faster on main) and produces 540 pages, which shouldn't be that bad.

sly pecan Jul 7, 2024, 11:45 AM

#

molten kayak Looking at their script, they are generating ~10000 tags (there could be duplica...

When you say tags you mean labels?

molten kayak Jul 7, 2024, 11:48 AM

#

sly pecan When you say tags you mean labels?

The script calls them tags and generates a #tag[a_word] for each of them. I haven't dug into what that function does, but I guess it creates a label with that word, because the same words are used as refs with @a_word

sly pecan Jul 7, 2024, 11:48 AM

#

@sturdy sequoia seems it's introspection then. That's a lot of labels, but I still would expect it to compile

sly pecan Jul 7, 2024, 11:49 AM

#

molten kayak Looking at their script, they are generating ~10000 tags (there could be duplica...

What happens if you try to compile the original with 10 000 labels?

molten kayak Jul 7, 2024, 11:52 AM

#

sly pecan What happens if you try to compile the original with 10 000 labels?

I did try it and gave up after leaving it on for a minute. I guess it would take at least 10 times as long as it takes with 1000 labels, so at least two and a half minutes

sly pecan Jul 7, 2024, 11:53 AM

#

molten kayak I did try it and gave up after leaving it on for a minute. I guess it would take...

It would surprise me if it scaled linearly

#

Try 2000?

molten kayak Jul 7, 2024, 11:54 AM

#

Ah I should also mention that those 15 and 5 seconds are with --timings (which I guess might slow down compilation a bit)

#

Well, with 2000 it took 37 seconds to spit out a bunch of errors because the script didn't ensure that labels are unique...

sly pecan Jul 7, 2024, 11:58 AM

#

molten kayak Well, with 2000 it took 37 seconds to split out a bunch of errors because the sc...

Okay, that might be part of OP's problem

#

Maybe you could reply to their issue?

molten kayak Jul 7, 2024, 12:09 PM

#

sly pecan Maybe you could reply to their issue?

Done

#

Btw if I modify the script to ensure unique labels it doesn't change the timings much (it takes 40 seconds to compile with 2000 tags)

sturdy sequoia Jul 7, 2024, 12:21 PM

#

sly pecan @sturdy sequoia seems it's introspection then. That's a lot of labels, but...

Okay but even then, how would we improve this?

#

I guess we could do parallel shenanigans for large numbers of labels, but that seems overkill

sturdy sequoia Jul 7, 2024, 12:22 PM

#

molten kayak Ah I should also mention that those 15 and 5 seconds are with `--timings` (which...

depending on the CPU timing can have basically no effect, on my Desktop (7950x3d) it slows down my thesis by maybe 0.1s

molten kayak Jul 7, 2024, 12:53 PM

#

I've also tried the version of typst in main and it seems much better. It takes ~5 seconds for 1000 tags (vs 15 seconds for the latest release) and ~2 seconds for the 1000 tags without refs (vs 5 seconds for the latest release). The full 10000 tags version takes 1m30s, which is kinda slow but not that bad for that amount of text. Also seems much better than the previously predicted 2m30s for the latest release (which also assumes linear scaling).
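The numbers quoted in this thread allow a rough check of the "does it scale linearly?" question. The sketch below assumes a power law `time ∝ n^k` and estimates `k` from two measurements; the helper names are made up for illustration, and the inputs are the rounded timings from the messages above, so the result is only indicative.

```rust
/// Estimate the exponent `k` in `time ∝ n^k` from two (size, seconds) samples.
fn scaling_exponent(n1: f64, t1: f64, n2: f64, t2: f64) -> f64 {
    (t2 / t1).ln() / (n2 / n1).ln()
}

/// Extrapolate a timing to a larger input assuming purely linear scaling.
fn linear_estimate(n1: f64, t1: f64, n2: f64) -> f64 {
    t1 * n2 / n1
}

/// Exponent on the latest release: 1000 tags in ~15 s, 2000 tags in ~40 s.
fn release_exponent() -> f64 {
    scaling_exponent(1000.0, 15.0, 2000.0, 40.0)
}

/// Exponent on main: 1000 tags in ~5 s, 10000 tags in ~90 s.
fn main_exponent() -> f64 {
    scaling_exponent(1000.0, 5.0, 10000.0, 90.0)
}
```

Both exponents come out above 1 (roughly 1.4 on the release and roughly 1.3 on main), which is consistent with the suspicion that compile time grows faster than linearly. `linear_estimate(1000.0, 15.0, 10000.0)` reproduces the 150 s (2m30s) prediction for the release.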

sly pecan Jul 7, 2024, 12:59 PM

#

molten kayak I've also tried the version of typst in main and it seems much better. It takes ...

That doesn't seem unreasonable

sly pecan Jul 7, 2024, 12:59 PM

#

sturdy sequoia Okay but even then, how would we improve this?

I guess there's nothing to worry about for the time being

gritty hazel Jul 9, 2024, 2:20 PM

#

glad urchin lol

Thought I'd report back on this one, I'm the dude who had the 8k doc 😄. After making the tweaks you recommended @glad urchin of removing the grids I went from 32GB memory consumption down to 24GB memory consumption all the while the doc has grown to 9k pages! So we're looking good, thanks for the help!

glad urchin Jul 9, 2024, 2:26 PM

#

Sounds great 👍

sly pecan Jul 9, 2024, 2:37 PM

#

gritty hazel Thought I'd report back on this one, I'm the dude who had the 8k doc 😄. After ...

Have you tried to compile on main?

gritty hazel Jul 9, 2024, 4:20 PM

#

sly pecan Have you tried to compile on main?

Not yet

quaint blaze Jul 9, 2024, 10:34 PM

#

gritty hazel Thought I'd report back on this one, I'm the dude who had the 8k doc 😄. After ...

24 gigs of ram to compile a pdf :ferrisBallSweat:

feral imp Jul 10, 2024, 7:03 AM

#

quaint blaze 24 gigs of ram to compile a pdf :ferrisBallSweat:

8k pages is crazy on its own.

low sapphire Jul 10, 2024, 8:00 AM

#

Is there no way to limit the memory usage?

#

Less threads? Can you even set that manually?

left night Jul 10, 2024, 8:11 AM

#

low sapphire Less threads? Can you even set that manually?

threading is only used on main, there you can use -j

sturdy sequoia Jul 10, 2024, 8:50 AM

#

low sapphire Is there no way to limit the memory usage?

none, the enabled argument for comemo was just released, we'll see if that helps 😉

#

(it needs to be used, right, but I'll try to open a PR this week that uses it)

sly pecan Jul 10, 2024, 10:21 AM

#

sturdy sequoia none, the `enabled` argument for comemo was just released, we'll see if that hel...

Is that just a binary switch for the entire document?

left night Jul 10, 2024, 10:24 AM

#

sly pecan Is that just a binary switch for the entire document?

no, it's a condition that's evaluated for every memoizable function call, whether to actually memoize

sturdy sequoia Jul 10, 2024, 11:02 AM

#

sly pecan Is that just a binary switch for the entire document?

No, it would be per-function, for example disabling memoization on small functions, or small layout calls, these kinds of things

sly pecan Jul 10, 2024, 11:59 AM

#

sturdy sequoia No, it would be per-function, for example disabling memoization on small functio...

Manual, or based on a heuristic?

sturdy sequoia Jul 10, 2024, 12:20 PM

#

sly pecan Manual, or based on a heuristic?

Heuristic hopefully

left night Jul 10, 2024, 12:20 PM

#

sturdy sequoia Heuristic hopefully

I still think we could do something time-based

#

not sure whether it would work on wasm

#

but maybe later, some manual heuristics go a longer way initially I think

sturdy sequoia Jul 10, 2024, 12:32 PM

#

left night I still think we could do something time-based

but that would still require hashing the arguments, no?

#

since it would need to check whether the call is already in the cache?

left night Jul 10, 2024, 12:38 PM

#

sturdy sequoia but that would still require hashing the arguments, no?

my original idea was that we could opt out of hashing when we detect that the function call is almost always cheap and go back to caching if we observe that it got expensive

#

but I now think it wouldn't really work out well

#

cause it would be static per function and e.g. call_closure just is unpredictable in that way. sometimes the function is expensive, sometimes it's cheap.

#

so whenever the opt-out would work statically, we might as well remove the memoize in the source code

sturdy sequoia Jul 10, 2024, 1:39 PM

#

left night cause it would be static per function and e.g. `call_closure` just is unpredicta...

Perhaps we can do it as an AtomicBool in Closure?

#

that says whether that closure is memoizeable, and this boolean is set based on timings?

#

for user-code at least it should help!

left night Jul 10, 2024, 1:42 PM

#

sturdy sequoia Perhaps we can do it as an `AtomicBool` in `Closure`?

not a bad idea! although there might of course still be closures that are sometimes very cheap and sometimes very expensive.

#

but it would be a good approximation I think

#

maybe something with a few more bits than a bool

#

so that we can say like "if this was cheap a few times in a row, skip it"

#

but maybe a bool is actually better because it kicks in faster

sturdy sequoia Jul 10, 2024, 1:43 PM

#

I was thinking of having the concept of a Value::len that gives an approximate length of the items, which gives us a hint towards the hashing cost & running cost of whatever function is being called on it

left night Jul 10, 2024, 1:43 PM

#

but a function can be very cheap even if the value is big

#

e.g. array.at

sturdy sequoia Jul 10, 2024, 1:43 PM

#

Yes but the hashing is still there (if it's a closure)!

#

That's the trick, it doesn't remove the expensive bit

#

unless we make a LazyEcoVec that contains the hash on the heap too 😂

left night Jul 10, 2024, 1:44 PM

#

the bool could skip the hashing

sturdy sequoia Jul 10, 2024, 1:44 PM

#

Indeed, but my idea with len is that we can check if it's going to be costly to hash

left night Jul 10, 2024, 1:44 PM

#

overall, if we can somewhat reasonably observe the runtime behaviour, it will be always better than a hand-written heuristic

#

at least that's my theory

left night Jul 10, 2024, 1:45 PM

#

sturdy sequoia Indeed, but my idea with `len` is that we can check if it's going to be costly t...

what would you do if Value::len is large?

sturdy sequoia Jul 10, 2024, 1:45 PM

#

I agree

sturdy sequoia Jul 10, 2024, 1:45 PM

#

left night what would you do if `Value::len` is large?

that's the problem, there is an argument either way 😂

#

Perhaps having two bools: one that tells us when len is big is slow, and when it's fast

left night Jul 10, 2024, 1:46 PM

#

sounds a little over-engineered tbh

sturdy sequoia Jul 10, 2024, 1:46 PM

#

yeah probably

left night Jul 10, 2024, 1:46 PM

#

what's the problem with a single bool that says whether it's cheap?

sturdy sequoia Jul 10, 2024, 1:47 PM

#

Yeah no, it's good

left night Jul 10, 2024, 1:47 PM

#

we would need to measure to find out whether there are too many cache misses due to it

sturdy sequoia Jul 10, 2024, 1:47 PM

#

It's fairly easy using a MISS_COUNT and HIT_COUNT, I've actually used this in the past for measuring cache efficiency

left night Jul 10, 2024, 1:48 PM

#

wait, are we talking about the same kind of cache?

#

I meant comemo's cache, not the CPU's

sturdy sequoia Jul 10, 2024, 1:48 PM

#

I meant in comemo
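A minimal sketch of the hit/miss counting idea, assuming a toy `HashMap` cache rather than comemo's real internals. `HIT_COUNT` and `MISS_COUNT` are the names from the message above; `cached` and `hit_rate` are hypothetical helpers.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};

// Global counters, cheap enough to leave enabled while profiling.
static HIT_COUNT: AtomicUsize = AtomicUsize::new(0);
static MISS_COUNT: AtomicUsize = AtomicUsize::new(0);

/// Look up `key` in a toy cache, computing and storing the value on a miss.
fn cached(cache: &mut HashMap<u64, u64>, key: u64, compute: impl FnOnce(u64) -> u64) -> u64 {
    if let Some(&value) = cache.get(&key) {
        HIT_COUNT.fetch_add(1, Ordering::Relaxed);
        return value;
    }
    MISS_COUNT.fetch_add(1, Ordering::Relaxed);
    let value = compute(key);
    cache.insert(key, value);
    value
}

/// Fraction of lookups that were served from the cache so far.
fn hit_rate() -> f64 {
    let hits = HIT_COUNT.load(Ordering::Relaxed) as f64;
    let misses = MISS_COUNT.load(Ordering::Relaxed) as f64;
    if hits + misses == 0.0 { 0.0 } else { hits / (hits + misses) }
}
```

Comparing `hit_rate()` before and after a change to the memoization policy would show whether skipping memoization costs too many misses.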

left night Jul 10, 2024, 1:48 PM

#

ok

#

so we would check enabled = !func.is_cheap()

#

and then in the body unconditionally update is_cheap after the function, correct?

#

idk whether we can measure anything on wasm though
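The scheme sketched in the last few messages could look roughly like this. Everything here is a hypothetical stand-in: `Closure`, `call`, the `u64`-valued toy cache and the `threshold` parameter are made up for illustration and are not Typst's or comemo's actual API. The flag is checked before hashing, and updated after every real execution of the body.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::sync::atomic::{AtomicBool, Ordering};
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a closure value carrying a "cheap" flag.
struct Closure {
    cheap: AtomicBool,
}

impl Closure {
    fn new() -> Self {
        // Start out as "not cheap", so memoization is enabled by default.
        Self { cheap: AtomicBool::new(false) }
    }

    fn is_cheap(&self) -> bool {
        self.cheap.load(Ordering::Relaxed)
    }

    /// Update the flag from the most recent measured run.
    fn record(&self, elapsed: Duration, threshold: Duration) {
        self.cheap.store(elapsed < threshold, Ordering::Relaxed);
    }
}

/// Call `body`, memoizing through `cache` only while the closure is not cheap.
fn call<A: Hash>(
    closure: &Closure,
    cache: &mut HashMap<u64, u64>,
    arg: &A,
    threshold: Duration,
    body: impl FnOnce(&A) -> u64,
) -> u64 {
    // enabled = !func.is_cheap(): a cheap closure skips hashing entirely.
    let key = if closure.is_cheap() {
        None
    } else {
        let mut hasher = DefaultHasher::new();
        arg.hash(&mut hasher);
        let key = hasher.finish();
        if let Some(&hit) = cache.get(&key) {
            return hit;
        }
        Some(key)
    };

    // The body ran for real, so update the flag unconditionally.
    let start = Instant::now();
    let out = body(arg);
    closure.record(start.elapsed(), threshold);

    if let Some(key) = key {
        cache.insert(key, out);
    }
    out
}
```

A single bool reacts after one observation, which matches the "kicks in faster" argument; a small saturating counter would be the "few more bits" variant. The sketch relies on `Instant::now()`, so the open question about measuring time on wasm applies to it as well.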

sly pecan Jul 15, 2024, 8:21 PM

#

Should I just close https://github.com/typst/typst/issues/4560 ? It doesn't seem like an actual issue, apart from somewhat high memory usage

GitHub

Slow compilation to PDF if a lot of tables are present · Issue #456...

Description Currently we have an invoice generation system which generates PDF by converting from HTML. I was looking it to migrate to typst for PDF generation. Issue is our invoice is 8 column wit...

left night Jul 15, 2024, 8:23 PM

#

sly pecan Should I just close https://github.com/typst/typst/issues/4560 ? It doesn't seem...

feels similar to the recent issue

#

the 24h one

#

which is still open ...

left night Jul 18, 2024, 8:48 AM

#

fwiw, the hypermedia systems book goes from 6.5s (0.11.1) to 1.8s (main) on my machine. pretty nice win.

#

it also found a panic in main ... but I fixed it

feral imp Jul 18, 2024, 10:24 AM

#

Double win.

sturdy sequoia Jul 18, 2024, 12:15 PM

#

left night fwiw, the hypermedia systems book goes from 6.5s (0.11.1) to 1.8s (main) on my m...

Nice!

south apex Jul 18, 2024, 12:49 PM

#

left night fwiw, the hypermedia systems book goes from 6.5s (0.11.1) to 1.8s (main) on my m...

I didn’t realize that book is written with typst; cool!!

pearl sedge Jul 18, 2024, 12:56 PM

#

south apex I didn’t realize that book is written with typst; cool!!

blog post is out
https://dz4k.com/2024/new-hypermedia-systems/

Building the new Hypermedia Systems

stone pilot Jul 18, 2024, 4:35 PM

#

@pearl sedge very nice! and I didn't know about shiroa, which is also very cool!

sturdy sequoia Jul 22, 2024, 6:27 PM

#

Ok, I was watching this conference and I just got ideas on how to make the VM heaps faster

#

thinkies

slim sequoia Jul 22, 2024, 6:35 PM

#

ooo

untold turret Jul 22, 2024, 6:40 PM

#

You could prepare a gift, the vm-3, for laurmaedje, who's on vacation.

sturdy sequoia Jul 22, 2024, 6:41 PM

#

untold turret You could prepare a gift, the vm-3, for laurmaedje, who's on vacation.

I am thinking of first doing an AST flattening stage

#

then a VM-3 based on that

#

Currently the big problems of the VM are:

1. Complexity
2. Non-parallel compile (something I'd like to solve)
3. Inefficient data structures

#

With flattening I fix 3 and reduce 1; additionally, during flattening, I would look for static paths and launch compilation of those

untold turret Jul 22, 2024, 6:44 PM

#

what's ast flattening? a sequence or a thinner tree of IR converted from ast?

sly pecan Jul 22, 2024, 6:46 PM

#

sturdy sequoia Ok, I was watching this conference and I just got ideas on how to make the VM he...

Sunk cost fallacy go brrrrrrrr

#

😂

sturdy sequoia Jul 22, 2024, 8:08 PM

#

untold turret what's ast flattening? a sequence or a thinner tree of IR converted from ast?

still a tree but that uses indices instead of pointers

#

usize -> u32 (size reduction) and more cache friendly
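A minimal sketch of what "a tree that uses indices instead of pointers" can look like, assuming a tiny arithmetic AST. `NodeId`, `Node` and `Ast` are hypothetical names for illustration, not types from Typst.

```rust
/// Index into the node arena; 4 bytes instead of an 8-byte pointer.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct NodeId(u32);

/// Children are referenced by index, so nodes hold no pointers.
#[derive(Debug)]
enum Node {
    Int(i64),
    Add(NodeId, NodeId),
    Mul(NodeId, NodeId),
}

/// All nodes live contiguously in one `Vec`, which is friendlier to CPU caches
/// than individually boxed nodes scattered across the heap.
#[derive(Default)]
struct Ast {
    nodes: Vec<Node>,
}

impl Ast {
    /// Append a node and return its index.
    fn push(&mut self, node: Node) -> NodeId {
        assert!(self.nodes.len() < u32::MAX as usize, "too many nodes");
        let id = NodeId(self.nodes.len() as u32);
        self.nodes.push(node);
        id
    }

    /// Evaluate the subtree rooted at `id`.
    fn eval(&self, id: NodeId) -> i64 {
        match &self.nodes[id.0 as usize] {
            Node::Int(value) => *value,
            Node::Add(a, b) => self.eval(*a) + self.eval(*b),
            Node::Mul(a, b) => self.eval(*a) * self.eval(*b),
        }
    }
}
```

Because children are always pushed before their parent, the arena ends up in post-order, and the whole tree can be dropped or copied as one allocation.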

untold turret Jul 23, 2024, 12:41 AM

#

sturdy sequoia usize -> u32 (size reduction) and more cache friendly

why don't we use indices when building the ast? like rowan, the parsing framework of rust-analyzer, which can reuse (green) nodes via a shared cache from the last or even the current parse.

lost meteor Jul 23, 2024, 2:50 AM

#

The node struct in typst-syntax is actually meant to be the same structure as in Rowan, and we have an incremental parsing algorithm in reparser.rs, but I'm not sure how optimal it is

sturdy sequoia Aug 7, 2024, 11:56 AM

#

@left night What about a hotness and depth based system for disabling memoization?

#

If the function is called within a function that is cold (rarely updated), its sub functions could be skipped from memoization

#

essentially trying to memoize at the highest level we can as often as possible

#

If the caller is not hot, we disable memoization

#

(locally right)

#

Unless the function itself is very hot (called often)

left night Aug 7, 2024, 11:58 AM

#

since we don't initially know which functions are hot, isn't that something that should rather be handled by eviction?

#

if a sub-result isn't reused quickly, it will get evicted

sturdy sequoia Aug 7, 2024, 11:58 AM

#

Hmm, I suppose yeah

left night Aug 7, 2024, 11:58 AM

#

and then it will not be computed since higher levels are memoized

sturdy sequoia Aug 7, 2024, 11:59 AM

#

I guess that's true that eviction will evict small functions quickly-ish

#

But if they were evicted previously, we could skip memoization

left night Aug 7, 2024, 11:59 AM

#

doesn't help with peak memory usage after first compilation, but after a few seconds it kicks in

sturdy sequoia Aug 7, 2024, 11:59 AM

#

left night doesn't help with peak memory usage after first compilation, but after a few sec...

true

left night Aug 7, 2024, 12:00 PM

#

sturdy sequoia But if they **were** evicted previously, we could skip memoization

perhaps yeah, though it could also mean that user changed which part of the document they are writing in

#

and then something that was previously cold is suddenly hot

sturdy sequoia Aug 7, 2024, 12:00 PM

#

That's the tricky bit and the point of keeping track of "hotness"

#

which can be integrated into comemo

left night Aug 7, 2024, 12:00 PM

#

yeah

sturdy sequoia Aug 7, 2024, 12:03 PM

#

so it would be something like `enabled = hotness > something or not_previously_evicted`. The problem is checking whether it was previously evicted (could be at the function level, but it would still require hashing arguments to perform that check intelligently)

#

Other thing we could try: we could evict more quickly if the function is deep in the call stack

sly pecan Aug 31, 2024, 10:41 PM

#

Have you tested regressions recently @sturdy sequoia ? Curious if all the big layout stuff has had any impact

sturdy sequoia Aug 31, 2024, 10:51 PM

#

sly pecan Have you tested regressions recently @sturdy sequoia ? Curious if all the ...

There were big layout stuffs? 😮

#

@lunar kettle when you have time, can you test the PR I have opened, #4871 it should fix all Oklab colors in PDF (not the way I'd like...)

lunar kettle Aug 31, 2024, 11:03 PM

#

will test right away

sturdy sequoia Aug 31, 2024, 11:03 PM

#

lunar kettle will test right away

bruh it's 1 AM, it can wait for tomorrow ❤️

sly pecan Aug 31, 2024, 11:04 PM

#

Yes

lunar kettle Aug 31, 2024, 11:04 PM

#

dw the final thing before going to bed 😂

sturdy sequoia Aug 31, 2024, 11:04 PM

#

lunar kettle dw the final thing before going to bed 😂

I can also just give you the PDF 😄

lunar kettle Aug 31, 2024, 11:04 PM

#

I was working on my pdf crate 👀 and it's soon ready for the first release I think, hehe

sly pecan Aug 31, 2024, 11:04 PM

#

https://github.com/typst/typst/pull/4840

GitHub

Improve realization and page layout by laurmaedje · Pull Request #4...

This pull request refactors and improves realization and page layout, fixing various bugs in the process.
Changes

Page styles are now resolved in a smarter way: Beyond taking the intersection of ...

#

For instance this

sturdy sequoia Aug 31, 2024, 11:05 PM

#

@lunar kettle

📎 gradient-presets.pdf

lunar kettle Aug 31, 2024, 11:05 PM

#

sturdy sequoia Aug 31, 2024, 11:05 PM

#

Well isn't that nice

lunar kettle Aug 31, 2024, 11:05 PM

#

https://tenor.com/view/hell-yeah-thats-what-im-talking-about-kevin-malone-the-office-gif-17805914

Tenor

sturdy sequoia Aug 31, 2024, 11:05 PM

#

lunar kettle https://tenor.com/view/hell-yeah-thats-what-im-talking-about-kevin-malone-the-of...

https://tenor.com/view/mood-lilo-sad-depressed-me-gif-12825220

Tenor

#

Still depressed having removed "native" oklab

lunar kettle Aug 31, 2024, 11:06 PM

#

this should in theory make it possible to encode as cmyk too, right? even if it's a bit lossy

#

for the future I mean

lunar kettle Aug 31, 2024, 11:06 PM

#

sturdy sequoia Still depressed having removed "native" oklab

yeah :(

sly pecan Aug 31, 2024, 11:06 PM

#

sturdy sequoia Still depressed having removed "native" oklab

sturdy sequoia Aug 31, 2024, 11:06 PM

#

lunar kettle this should in theory make it possible to encode as cmyk too, right? even if it...

cmyk should already work

#

there was a PR to fix it

sturdy sequoia Aug 31, 2024, 11:07 PM

#

sly pecan

https://github.com/typst/typst/pull/4871

GitHub

Nuked custom PDF export code by Dherse · Pull Request #4871 · typst...

Not gonna lie, this one makes me a bit sad 😭
Changes:

Removes Oklab export script (the last PostScript code in Typst)
Removes Oklab DeviceN color
Makes Oklab use sRGB in PDF export
Adds additional...

#

The PR is well titled 😂

sly pecan Aug 31, 2024, 11:07 PM

#

Did you see the pr I linked above?

sturdy sequoia Aug 31, 2024, 11:07 PM

#

sly pecan Did you see the pr I linked above?

yes, I am building but it's slow

sly pecan Aug 31, 2024, 11:09 PM

#

sturdy sequoia yes, I am building but it's slow

Gotta get one of those new 192 core epycs

#

Speaking of which, I just ordered computer parts, including a 7800x3d

#

Couldn't really bring my computer with me to the us

#

Though I did bring a PS5, XSX, GPU, ram and ssds 😂

sturdy sequoia Aug 31, 2024, 11:12 PM

#

Using a super simple "best of 10" approach, I get the following results:

 commit    cold   t1    t2
c4dd6fa0 = 1.99s 467ms 115ms
a7c4aae3 = 2.26s 440ms 117ms

#

So there has been a significant slowdown on masterproef
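For reference, a "best of N" measurement like the one used above can be sketched as follows. This is a generic harness under stated assumptions, not the script that produced the table; `best_of` and `slowdown_percent` are hypothetical helper names.

```rust
use std::time::{Duration, Instant};

/// Run `work` `runs` times and keep the fastest wall-clock time.
/// Taking the minimum filters out noise from other processes.
fn best_of<T>(runs: usize, mut work: impl FnMut() -> T) -> Duration {
    assert!(runs > 0, "need at least one run");
    let mut best = Duration::MAX;
    for _ in 0..runs {
        let start = Instant::now();
        let out = work();
        let elapsed = start.elapsed();
        // Keep the result observable so the work is not optimized away.
        std::hint::black_box(out);
        best = best.min(elapsed);
    }
    best
}

/// Relative slowdown of `after` compared to `before`, in percent.
fn slowdown_percent(before: Duration, after: Duration) -> f64 {
    (after.as_secs_f64() / before.as_secs_f64() - 1.0) * 100.0
}
```

Applied to the cold column of the table, `slowdown_percent` for 1.99 s versus 2.26 s gives a bit under 14%, while the t1 and t2 columns differ by only a few percent in either direction.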

sly pecan Aug 31, 2024, 11:12 PM

#

Oh no 😦

sturdy sequoia Aug 31, 2024, 11:12 PM

#

sly pecan Speaking of which, I just ordered computer parts, including a 7800x3d

https://tenor.com/view/noice-nice-click-gif-8843762

Tenor

sly pecan Aug 31, 2024, 11:12 PM

#

You had a 7950x3d?

sturdy sequoia Aug 31, 2024, 11:12 PM

#

sly pecan Though I did bring a PS5, XSX, GPU, ram and ssds 😂

Wait you're already in the US? 😱

sly pecan Aug 31, 2024, 11:12 PM

#

Yeah

sturdy sequoia Aug 31, 2024, 11:13 PM

#

Congrats ❤️

sly pecan Aug 31, 2024, 11:13 PM

#

Since august 16

#

Thanks

sturdy sequoia Aug 31, 2024, 11:13 PM

#

sly pecan You had a 7950x3d?

Yes but my RAM is no longer overclocked, it just wasn't stable enough

#

and kind of pissing me off

#

4 sticks of RAM OC'ed on this platform just isn't that nice

sly pecan Aug 31, 2024, 11:13 PM

#

As in no Expo?

sturdy sequoia Aug 31, 2024, 11:13 PM

#

yes

#

I had managed to make it super stable but at the cost of insane temps on the I/O die by manual tweaking of timings

sly pecan Aug 31, 2024, 11:13 PM

#

4 sticks of ram is a shitshow on any platform. Honestly they should just ship motherboards with only 2 slots

sturdy sequoia Aug 31, 2024, 11:14 PM

#

sly pecan 4 sticks of ram is a shitshow on any platform. Honestly they should just ship mo...

probably yeah

sly pecan Aug 31, 2024, 11:14 PM

#

I'm doing an itx build

#

In a fractal terra

sturdy sequoia Aug 31, 2024, 11:14 PM

#

I used to have a Ryzen 2950x with eight sticks, that thing was rock stable

#

||that gif was annoying me||

sly pecan Aug 31, 2024, 11:16 PM

#

Didn't wanna have to deal with all the issues with the multi CCD

#

So that's why 7800

sturdy sequoia Aug 31, 2024, 11:16 PM

#

sly pecan Didn't wanna have to deal with all the issues with the multi CCD

frankly, it wasn't that bad, in NUMA mode Windows was actually clever enough

#

(on the 2950x)

sturdy sequoia Aug 31, 2024, 11:16 PM

#

sly pecan So that's why 7800

yeah there's some weird scheduling things with core parking

#

just Windows being windows

sly pecan Aug 31, 2024, 11:19 PM

#

sturdy sequoia So there has been a significant slowdown on masterproef

Laurenz is purposely doing it so you'll have something to work on

sturdy sequoia Aug 31, 2024, 11:19 PM

#

sly pecan Laurenz is purposely doing it so you'll have something to work on

haha 😂

#

I like speaking to you

#

but it's almost 1:30 and I am tired

#

😪

#

Have a good one ❤️

sly pecan Aug 31, 2024, 11:22 PM

#

Nighty night

glossy shore Sep 1, 2024, 12:58 PM

#

sturdy sequoia so it would be something like `enabled = hotness > something or not_previously_e...

why would you need to hash arguments for function-level eviction checking

sturdy sequoia Sep 1, 2024, 3:58 PM

#

glossy shore why would you need to hash arguments for function-level eviction checking

because you still need the hash to store it in case it was needed

glossy shore Sep 1, 2024, 5:39 PM

#

I don't follow

sly pecan Sep 2, 2024, 12:33 PM

#

https://github.com/typst/typst/pull/4876 @sturdy sequoia you gotta test performance again 😎

GitHub

New realization / Text show rules now work across elements by laurm...

This pull request contains a full rewrite of Typst's realization subsystem. This work is the result of a long time of planning and incremental improvements toward making these changes possi...

feral imp Sep 2, 2024, 12:40 PM

#

Sounds more like a call to not do too much performance testing.. But a quickie The Thesis performance stats would be informative..

lunar kettle Sep 2, 2024, 12:50 PM

#

ill give it a try on my computer

#

Before:
typst c main.typ --root ../ --font-path ../fonts  7,67s user 1,12s system 323% cpu 2,721 total
typst c main.typ --root ../ --font-path ../fonts  7,68s user 1,20s system 320% cpu 2,775 total

After:
typst c main.typ --root ../ --font-path ../fonts  7,43s user 0,96s system 316% cpu 2,647 total
typst c main.typ --root ../ --font-path ../fonts  7,52s user 1,10s system 309% cpu 2,782 total
typst c main.typ --root ../ --font-path ../fonts  7,45s user 0,99s system 316% cpu 2,668 total

lament fulcrum Sep 2, 2024, 12:59 PM

#

sly pecan https://github.com/typst/typst/pull/4876 @sturdy sequoia you gotta test p...

holy f, the oldest still open issue

sly pecan Sep 2, 2024, 1:00 PM

#

lunar kettle ill give it a try on my computer

Does that compare it to the previous commit?

#

I'm surprised it's actually faster. But that'll depend on the document I guess

lunar kettle Sep 2, 2024, 1:01 PM

#

sly pecan Does that compare it to the previous commit?

yeah I just realized i forgot to check what I had before that 😂

#

will try again

lament fulcrum Sep 2, 2024, 1:04 PM

#

should maybe also test a document with lots of show text rules?

glossy shore Sep 2, 2024, 1:56 PM

#

assuming it even still gives the correct results

left night Sep 2, 2024, 2:06 PM

#

glossy shore assuming it even still gives the correct results

rude

glossy shore Sep 2, 2024, 2:11 PM

#

lol 😆

#

I mean this is still a breaking change

left night Sep 2, 2024, 2:11 PM

#

*bug fix

#

but I get what you mean now

glossy shore Sep 2, 2024, 2:12 PM

#

even if the new behaviour would be less surprising any documents written for the old one could have worked around it in an incompatible way

left night Sep 2, 2024, 2:12 PM

#

I just looked at the thesis before/after and it seems to be exactly the same. But I'm not sure it even uses text show rules.

glossy shore Sep 2, 2024, 2:12 PM

#

nice

left night Sep 2, 2024, 2:12 PM

#

But generally, I would not be surprised if some bug sneaked in. It's a complete rewrite after all.

#

Though I did rewrite it very incrementally, constantly running the tests

#

Something like 100 WIP commits that nobody will ever see

glossy shore Sep 2, 2024, 2:13 PM

#

I wouldn't worry about that anyways, this is a change for the best

#

and Typst is in beta for a reason

left night Sep 2, 2024, 2:13 PM

#

actually we dropped the whole "beta" wording ^^

#

but it's 0.x of course

#

the beta label was mostly for the web app

slim sequoia Sep 2, 2024, 2:15 PM

#

From "beta" to "better" 🥳

glossy shore Sep 2, 2024, 2:26 PM

#

wops

#

I wonder what a 1.0 might entail

slim sequoia Sep 2, 2024, 2:31 PM

#

Holographic content realization - content appearance depends on what angle you look at it from (only IPS panel screens supported)

left night Sep 2, 2024, 2:34 PM

#

glossy shore I wonder what a 1.0 might entail

Me too :)

glossy shore Sep 2, 2024, 2:35 PM

#

certainly imagined as much, don't be mistaken

sturdy sequoia Sep 2, 2024, 3:46 PM

#

lunar kettle ``` Before: typst c main.typ --root ../ --font-path ../fonts 7,67s user 1,12s s...

that actually looks like a nice lil' performance bump

lunar kettle Sep 2, 2024, 3:46 PM

#

eh it varies a lot

#

i think it should be about the same

sturdy sequoia Sep 3, 2024, 5:27 PM

#

@cunning wadi Easiest 20% gains in masterproef compile time yet: upgrading my RAM so that I can use EXPO 😎

cunning wadi Sep 3, 2024, 5:43 PM

#

sturdy sequoia @cunning wadi Easiest 20% gains in masterproef compile time yet: upgradi...

😎

#

Wait you weren't using EXPO/XMP?

sturdy sequoia Sep 3, 2024, 5:44 PM

#

cunning wadi Wait you weren't using EXPO/XMP?

With four sticks, as BIOS updates rolled out it was too buggy/crashy

cunning wadi Sep 3, 2024, 5:45 PM

#

I should check how fast my new computer is at mosterproef™️

sturdy sequoia Sep 3, 2024, 5:52 PM

#

cunning wadi I should check how fast my new computer is at mosterproef™️

masterproef * :-p

#

https://tenor.com/view/jack-nicholson-jack-nicholson-yes-evil-smile-gif-13030909

Tenor

sly pecan Sep 3, 2024, 9:01 PM

#

sturdy sequoia @cunning wadi Easiest 20% gains in masterproef compile time yet: upgradi...

Using 2 sticks now?

sturdy sequoia Sep 3, 2024, 9:14 PM

#

sly pecan Using 2 sticks now?

yes

sly pecan Sep 3, 2024, 9:14 PM

#

2*32 or 2*48?

sturdy sequoia Sep 3, 2024, 9:15 PM

#

64GiB

placid pivot Sep 4, 2024, 2:14 AM

#

There is 48*2?

#

I'm a bit late but yeah

sturdy sequoia Sep 4, 2024, 8:10 AM

#

placid pivot There is 48*2?

Yes but I think it’s only supported on intel CPU’s

placid pivot Sep 4, 2024, 8:10 AM

#

sturdy sequoia Yes but I think it’s only supported on intel CPU’s

the only CPU to kill themselves

sturdy sequoia Sep 4, 2024, 8:11 AM

#

placid pivot the only CPU to kill themselves

I mean amd had explosive cpu with the x3d on ASUS motherboards 😂

#

But to be fair it wasn’t their own fault

#

Unlike Intel

placid pivot Sep 4, 2024, 8:11 AM

#

sturdy sequoia I mean amd had explosive cpu with the x3d on ASUS motherboards 😂

AHAHAHAH REALLY???

sturdy sequoia Sep 4, 2024, 8:12 AM

#

placid pivot AHAHAHAH REALLY???

Yes early on with the 7000x3d some motherboards would shove wayyyyy too much voltage down the 3D stacked cores causing them to die

#

But it was motherboards not respecting the specs from amd about max voltage

placid pivot Sep 4, 2024, 8:14 AM

#

Aaaaah I see

#

Like intel no?

sturdy sequoia Sep 4, 2024, 8:16 AM

#

No it was purely the motherboard's fault, unlike intel whose guidelines led to degradation long term due to poor design and a need to keep bumping the clock to stay competitive

sly pecan Sep 4, 2024, 12:13 PM

#

sturdy sequoia Yes but I think it’s only supported on intel CPU’s

Fairly sure that's not the case

shy sage Sep 4, 2024, 12:17 PM

#

sturdy sequoia Yes but I think it’s only supported on intel CPU’s

No I have 2x48 with a ryzen, works great 👍

sturdy sequoia Sep 4, 2024, 12:37 PM

#

Huh I didn’t know

sly pecan Sep 4, 2024, 12:41 PM

#

sturdy sequoia Huh I didn’t know

sturdy sequoia Sep 4, 2024, 1:06 PM

#

sly pecan

not really, i'll survive with 64 Gig of ram as long as it's not crashy 😄

sly pecan Sep 4, 2024, 2:24 PM

#

If that's what you tell yourself to sleep at night!

glossy shore Sep 4, 2024, 5:34 PM

#

16 is too much ;-;

cunning wadi Sep 4, 2024, 11:26 PM

#

okay I ran masterproef on main

#

1sec 847ms 406µs 833ns

#

on my new computer

sly pecan Sep 4, 2024, 11:38 PM

#

cunning wadi 1sec 847ms 406µs 833ns

Those 833 ns really made a difference

cunning wadi Sep 4, 2024, 11:38 PM

#

sly pecan Those 833 ns really made a difference

Imagine if it was 834ns!

#

Horrible!

hoary dew Sep 5, 2024, 7:05 AM

#

cunning wadi okay I ran masterproef on main

Did you compare with 0.11.1?

cunning wadi Sep 5, 2024, 7:08 AM

#

I didn't

#

I mainly wanted to see how fast my new computer can compile

#

I can do so later though

feral imp Sep 5, 2024, 7:09 AM

#

1 sec thesis is really nice.

#

Main is increasing capabilities, whilst still retaining respectable performance.

sturdy sequoia Sep 5, 2024, 8:48 AM

#

cunning wadi 1sec 847ms 406µs 833ns

that's with the optimized masterproef, right? (latest commit)

#

I made the figures smaller and optimized a few things

sturdy sequoia Sep 5, 2024, 8:49 AM

#

feral imp 1 sec thesis is really nice.

and 400ms hot, which is too long imo

#

it should be no more than 100ms incremental even for 164 pages

feral imp Sep 5, 2024, 8:51 AM

#

🤷‍♂️ I don't know man, performance would be nice, but we are getting a ton of features... And those are a bit necessary. Execution performance isn't necessary... Lower memory usage is though.

It is a balance.

Feel free to make typst fast again

sturdy sequoia Sep 5, 2024, 8:51 AM

#

feral imp 🤷‍♂️ I don't know man, performance would be nice, but we are getting a ton of f...

The thing is I keep doomscrolling issues on GH but I don't know where to contribute

#

performance has gotten so tricky to optimize (multithreading really isn't helping :()

#

I found a few low hanging fruit but I think that's it

feral imp Sep 5, 2024, 8:52 AM

#

... What about what laurmaedje wrote in the PR... Let me find the link...

feral imp Sep 5, 2024, 8:53 AM

#

sly pecan https://github.com/typst/typst/pull/4876 @sturdy sequoia you gotta test p...

This one specifically..

sturdy sequoia Sep 5, 2024, 8:54 AM

#

I wonder if just re-allocating Vec for StyleChain would be more efficient than linked lists

#

I mean linked lists are known to be quite bad for CPU caches and allocations end up being fairly cheap

#

I guess the cloning into the new vec would be the most expensive

proud sandal Sep 5, 2024, 9:00 AM

#

How many style chains are there on average, and how large are they?

sturdy sequoia Sep 5, 2024, 9:00 AM

#

proud sandal How many style chains are there on average, and how large are they?

each set and show rule adds one or more element to the chain

#

one for each field set in a set but afaik they're grouped in an array

proud sandal Sep 5, 2024, 9:00 AM

#

Mhm

sturdy sequoia Sep 5, 2024, 9:00 AM

#

StyleChain is a linked list of slices

#

But I guess in a large document that can get quite long

proud sandal Sep 5, 2024, 9:01 AM

#

Do they share common nodes?

sturdy sequoia Sep 5, 2024, 9:01 AM

#

proud sandal Do they share common nodes?

yes afaik since they're built from the top of the doc down

proud sandal Sep 5, 2024, 9:01 AM

#

Like if there's two code paths and each one is taken once

#

OK

sturdy sequoia Sep 5, 2024, 9:01 AM

#

But there @left night is far far far more knowledgeable than me

#

I probably shouldn't be talking without confirmation

glossy shore Sep 5, 2024, 9:19 AM

#

oh is it like a cons list?
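
A minimal sketch of the "linked list of slices" shape described above, to make the sharing concrete. It is not Typst's actual `StyleChain`: the `Style` and `Chain` types, their fields, and the integer values are all hypothetical stand-ins. Each link borrows a slice of local styles plus an optional parent, so two branches can share the same outer links without copying, and a lookup walks from the innermost link outwards.

```rust
/// Hypothetical stand-in for one style entry: a property name and a value.
#[derive(Debug, Clone, PartialEq)]
pub struct Style {
    pub key: &'static str,
    pub value: i64,
}

/// A cons-list-like chain: a borrowed slice of local styles plus an
/// optional link to the outer (parent) chain.
#[derive(Debug, Clone, Copy)]
pub struct Chain<'a> {
    head: &'a [Style],
    tail: Option<&'a Chain<'a>>,
}

impl<'a> Chain<'a> {
    /// Creates a chain with a single link and no parent.
    pub fn new(root: &'a [Style]) -> Self {
        Chain { head: root, tail: None }
    }

    /// Adds a new innermost link. The outer links are shared, not copied.
    pub fn chain<'b>(&'b self, local: &'b [Style]) -> Chain<'b> {
        Chain { head: local, tail: Some(self) }
    }

    /// Looks a property up, innermost link first. Within one slice the
    /// last matching entry wins. The cost grows with the chain length.
    pub fn get(&self, key: &str) -> Option<i64> {
        let mut link = Some(self);
        while let Some(chain) = link {
            if let Some(style) = chain.head.iter().rev().find(|s| s.key == key) {
                return Some(style.value);
            }
            link = chain.tail;
        }
        None
    }

    /// Number of links that a failed lookup has to visit.
    pub fn depth(&self) -> usize {
        1 + self.tail.map_or(0, |tail| tail.depth())
    }
}
```

Building two chains on top of the same base (for example `base.chain(&heading)` and `base.chain(&emph)`) reuses the base link in both, which is the sharing asked about above. The trade-off is that every lookup is a pointer chase through all links until a match is found.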

molten kayak Sep 5, 2024, 9:54 AM

#

I wonder if something like a persistent map would be more efficient. It would retain the sharing of the style chain for the nodes up the chain while being faster to index into.
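
To illustrate what "persistent map" means here, the following is a rough sketch, not a proposal for Typst's data structure. `PMap` is a hypothetical name, and an unbalanced binary search tree with path copying is used only because it is short; a real implementation would more likely use a balanced tree or a hash array mapped trie. An insert copies only the nodes on the path to the key and shares every other subtree with the previous version.

```rust
use std::cmp::Ordering;
use std::rc::Rc;

struct Node<K, V> {
    key: K,
    value: V,
    left: Option<Rc<Node<K, V>>>,
    right: Option<Rc<Node<K, V>>>,
}

/// Hypothetical persistent map: inserting returns a new map and leaves
/// the old one untouched. Untouched subtrees are shared via `Rc`.
pub struct PMap<K, V> {
    root: Option<Rc<Node<K, V>>>,
}

impl<K: Ord + Clone, V: Clone> PMap<K, V> {
    pub fn new() -> Self {
        PMap { root: None }
    }

    /// Lookup follows one root-to-leaf path instead of scanning a list.
    pub fn get(&self, key: &K) -> Option<&V> {
        let mut current = self.root.as_ref();
        while let Some(node) = current {
            match key.cmp(&node.key) {
                Ordering::Less => current = node.left.as_ref(),
                Ordering::Greater => current = node.right.as_ref(),
                Ordering::Equal => return Some(&node.value),
            }
        }
        None
    }

    /// Returns a new version of the map containing `key -> value`.
    pub fn insert(&self, key: K, value: V) -> Self {
        PMap { root: Some(insert_node(self.root.as_ref(), key, value)) }
    }
}

/// Copies the nodes along the search path; everything else is shared.
fn insert_node<K: Ord + Clone, V: Clone>(
    node: Option<&Rc<Node<K, V>>>,
    key: K,
    value: V,
) -> Rc<Node<K, V>> {
    match node {
        None => Rc::new(Node { key, value, left: None, right: None }),
        Some(n) => match key.cmp(&n.key) {
            Ordering::Less => Rc::new(Node {
                key: n.key.clone(),
                value: n.value.clone(),
                left: Some(insert_node(n.left.as_ref(), key, value)),
                right: n.right.clone(),
            }),
            Ordering::Greater => Rc::new(Node {
                key: n.key.clone(),
                value: n.value.clone(),
                left: n.left.clone(),
                right: Some(insert_node(n.right.as_ref(), key, value)),
            }),
            Ordering::Equal => Rc::new(Node {
                key,
                value,
                left: n.left.clone(),
                right: n.right.clone(),
            }),
        },
    }
}
```

An inner scope would call `outer.insert(...)` and get its own version while the outer version stays valid, which keeps the sharing that the chain has today. Whether it is actually faster depends on the operations involved, which this sketch does not measure.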

left night Sep 5, 2024, 10:04 AM

#

sturdy sequoia I wonder if just re-allocating `Vec` for `StyleChain` would be more efficient th...

As hinted in the PR, I want to rework all StyleChain<'a> into &'a Styles. Styles is planned to become more than just Vec<Style>. I don't have very concrete plans yet, but I want to make it somewhat efficient to clone, but at the same time very fast lookup for style properties, recipe matching, and lazily initialized RegexSets for text show rule matching.

left night Sep 5, 2024, 10:04 AM

#

sturdy sequoia performance has gotten so tricky to optimize (multithreading really isn't helpin...

what do you mean with "multithreading really isn't helping :("?

sturdy sequoia Sep 5, 2024, 10:06 AM

#

left night what do you mean with "multithreading really isn't helping :("?

it makes flamegraphs really hard to decipher

#

even with just two threads (the minimum) it's very difficult to interpret

left night Sep 5, 2024, 10:06 AM

#

I think we could make it not use a worker if threads=1

#

that would make it easier to optimize single-threaded performance
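
A sketch of the idea, using only the standard library and not Typst's actual engine code: `run_jobs` is a hypothetical helper that runs everything inline on the calling thread when `threads` is 1, and only spawns workers otherwise. With the inline path, a flamegraph or a VTune profile shows one call stack instead of work scattered across worker threads.

```rust
use std::thread;

/// Applies `f` to every item, preserving order.
/// With `threads <= 1`, no worker thread is spawned at all.
pub fn run_jobs<T, R, F>(items: Vec<T>, threads: usize, f: F) -> Vec<R>
where
    T: Send,
    R: Send,
    F: Fn(T) -> R + Sync,
{
    if threads <= 1 {
        // Single-threaded path: plain iteration on the current thread.
        return items.into_iter().map(f).collect();
    }

    // Split the items into at most `threads` contiguous chunks.
    let chunk_size = ((items.len() + threads - 1) / threads).max(1);
    let mut chunks: Vec<Vec<T>> = Vec::new();
    let mut rest = items.into_iter();
    loop {
        let chunk: Vec<T> = rest.by_ref().take(chunk_size).collect();
        if chunk.is_empty() {
            break;
        }
        chunks.push(chunk);
    }

    let f = &f;
    thread::scope(|scope| {
        // One scoped worker per chunk.
        let handles: Vec<_> = chunks
            .into_iter()
            .map(|chunk| scope.spawn(move || chunk.into_iter().map(f).collect::<Vec<R>>()))
            .collect();
        // Join in spawn order so the output order matches the input order.
        handles
            .into_iter()
            .flat_map(|handle| handle.join().expect("worker thread panicked"))
            .collect::<Vec<R>>()
    })
}
```

Both paths return the same result, so the `threads` value can be switched for profiling without changing behavior.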

sturdy sequoia Sep 5, 2024, 10:07 AM

#

I think that would be better indeed, would make profiling with say VTune a lot easier

left night Sep 5, 2024, 10:07 AM

#

Feel free to give that a shot

proud sandal Sep 5, 2024, 10:14 AM

#

molten kayak I wonder if something like a persistent map would be more efficient. It would re...

Perhaps, depends on the exact operations I reckon but I was thinking of persistent data structures too

left night Sep 5, 2024, 10:16 AM

#

proud sandal Perhaps, depends on the exact operations I reckon but I was thinking of persiste...

Me too

cunning wadi Sep 5, 2024, 11:23 AM

#

sturdy sequoia that's with the optimized masterproef, right? (latest commit)

Yeah

sturdy sequoia Sep 5, 2024, 11:31 AM

#

cunning wadi Yeah

it's nice to have been bumped in performance by tweaking the doc itself 😄

sturdy sequoia Sep 5, 2024, 11:32 AM

#

left night Feel free to give that a shot

I'll do that after work

sly pecan Sep 24, 2024, 3:01 PM

#

@sturdy sequoia you know the drill

#

🤣

feral imp Sep 24, 2024, 3:03 PM

#

In my very unscientific testing on a few documents, performance was not affected much. While working on the PR, I primarily focused on getting things right and not hurting performance too much, so that's a good result I would say. Fundamentally, doing relayouts is of course more expensive than not doing them.
Source: https://github.com/typst/typst/pull/5017

sly pecan Sep 24, 2024, 3:06 PM

#

Yeah but he is the performance guru

sturdy sequoia Sep 24, 2024, 5:51 PM

#

sly pecan @sturdy sequoia you know the drill

Care to explain? 😄

feral imp Sep 24, 2024, 6:00 PM

#

feral imp > In my very unscientific testing on a few documents, performance was not affect...

@sturdy sequoia I'm sure he meant this. That's why I paste'd it in.

sly pecan Sep 24, 2024, 6:00 PM

#

sturdy sequoia Care to explain? 😄

New layout dropped

#

https://github.com/typst/typst/pull/5017

GitHub

New flow layout, with multi-column floats by laurmaedje · Pull Requ...

This is a full, fundamental redesign and rewrite of Typst's most central layouter, the flow layouter. This layouter is responsible for arranging paragraphs, blocks, spacing, placed elements...

sturdy sequoia Sep 24, 2024, 7:55 PM

#

sly pecan New layout dropped

NEW WHAT

#

😮

sly pecan Sep 26, 2024, 8:13 PM

#

https://github.com/typst/typst/pull/5046 does this only address regressions since 0.11.1, or even stuff from before then?

GitHub

Better block caching by laurmaedje · Pull Request #5046 · typst/typ...

Some caching opportunities got lost through the various refactors. This speeds up incremental a bunch!

sturdy sequoia Sep 26, 2024, 9:07 PM

#

sly pecan https://github.com/typst/typst/pull/5046 does this only address regressions sinc...

But it will come with increased ram usage

#

balanced as all things should be

sly pecan Sep 26, 2024, 9:08 PM

#

sturdy sequoia But it will come with increased ram usage

I've said it before, typst pro should come with a stick of ram

molten kayak Sep 27, 2024, 6:29 AM

#

sturdy sequoia balanced as all things should be

Meanwhile me and my 24+ GB being used by typst while compiling a presentation 😭

feral imp Sep 27, 2024, 7:36 AM

#

molten kayak Meanwhile me and my 24+ GB being used by typst while compiling a presentation 😭

Polylux? Png?

molten kayak Sep 27, 2024, 7:39 AM

#

feral imp Polylux? Png?

Polylux initially, then switched to touying. To be fair though there are a bunch of fletcher diagrams and I reached 24GB while editing them (just compiling when using touying is much more manageable with 1-2GB now). The presentation is like 70 pdf pages, it would be 17 slides but with a bunch of pauses in slides with diagrams.

feral imp Sep 27, 2024, 7:40 AM

#

Ok.

left night Sep 27, 2024, 8:02 AM

#

sly pecan https://github.com/typst/typst/pull/5046 does this only address regressions sinc...

a bit of both

#

many blocks were cached before (at least I'm fairly sure they were), but some weren't, in particular equations

sturdy sequoia Sep 27, 2024, 9:46 AM

#

molten kayak Polylux initially, then switched to touying. To be fair though there are a bunch...

Yeah I’ve had similar issues with large cetz diagrams or when updating godly

#

Coldly *

sturdy sequoia Sep 27, 2024, 9:46 AM

#

sly pecan https://github.com/typst/typst/pull/5046 does this only address regressions sinc...

Btw I couldn’t measure any performance improvement on my machine 🥲

left night Sep 27, 2024, 10:23 AM

#

sturdy sequoia Btw I couldn’t measure any performance improvement on my machine 🥲

not every optimization is tuned for your thesis 😉

#

I have a math-heavy document I got from someone and incremental is 6x faster with this PR

sturdy sequoia Sep 27, 2024, 10:26 AM

#

left night not every optimization is tuned for your thesis 😉

Noooooooooo

sturdy sequoia Sep 27, 2024, 10:26 AM

#

left night I have a math-heavy document I got from someone and incremental is 6x faster wit...

Wow that’s huge

left night Sep 27, 2024, 10:28 AM

#

sturdy sequoia Wow that’s huge

equations were just not cached at all before. oops.

#

on main, actually no single/multi blocks were cached, only user blocks

#

on 0.11.1, only LayoutMultiple was cached, not LayoutSingle (at least I think so, I didn't 100% validate that)
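
As a rough illustration of why caching a previously uncached kind of block helps incremental compiles, and why it costs memory, here is a generic memoization sketch. It is not how Typst's caching is implemented; `Memo`, `get_or_compute`, and `stats` are hypothetical names. The expensive computation runs only on a miss, and every result stays in the map afterwards, which is where the extra RAM goes.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Hypothetical memoization table keyed by the input of a computation.
pub struct Memo<K, V> {
    entries: HashMap<K, V>,
    hits: usize,
    misses: usize,
}

impl<K: Hash + Eq, V: Clone> Memo<K, V> {
    pub fn new() -> Self {
        Memo { entries: HashMap::new(), hits: 0, misses: 0 }
    }

    /// Returns the cached result for `key`, or runs `compute` once and
    /// stores its result for later calls.
    pub fn get_or_compute(&mut self, key: K, compute: impl FnOnce(&K) -> V) -> V {
        if let Some(value) = self.entries.get(&key) {
            self.hits += 1;
            return value.clone();
        }
        self.misses += 1;
        let value = compute(&key);
        self.entries.insert(key, value.clone());
        value
    }

    /// Returns `(hits, misses)`.
    pub fn stats(&self) -> (usize, usize) {
        (self.hits, self.misses)
    }
}
```

If equations never go through such a table, every recompile pays the full cost for each of them, which fits a math-heavy document benefiting the most.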

sturdy sequoia Sep 27, 2024, 10:34 AM

#

Good news is that it’s now cached!

#

I have a math heavy doc of mine but it’s on the webapp so I wonder if it will benefit from it

glad urchin Sep 27, 2024, 10:38 AM

#

I'll definitely be testing that on some of my own documents too

#

I do have a few with some math which eventually got slow in incremental , purely anecdotal though

#

And it wasn't that much to bother me, was just a bit more than usual :p

#

Thought it was normal for the size of the doc, but maybe it improves anyway 👀

glad urchin Oct 14, 2024, 5:37 AM

#

so

#

while answering a forum post, i found one small document that got 7x slower on 0.12.0-rc1 for some reason, which is interesting

#

(from https://forum.typst.app/t/how-to-create-a-table-with-round-corners-like-in-rect-function/1051/)

#table(
  fill: (x, y) =>
    if x == 0 or x == 6 or x == 7 or y == 1 { silver },
  columns: (0.45cm, 0.45cm, 0.45cm, 0.45cm, 0.45cm, 0.45cm, 0.45cm, 0.45cm),
  inset: 2pt,
  stroke: 0.2pt,
  align: center,
  table.header(
    table.cell(colspan: 8, fill: silver)[*Dezember*],
    [*KW*], [*Mo*], [*Di*], [*Mi*], [*Do*], [*Fr*], [*Sa*], [*So*],
  ),
  [48], [], [], [], [], [1], [2], [3],
  [49], [4], [5], [6], [7], [8], [9], [10],
  [50], [11], [12], [13], [14], [15], [16], [17],
  [51], [18], [19], [20], [21], [22], [23], [24],
  [52], [25], [26], [27], [28], [29], [30], [31],
)

#

0.11.1 -> 5 ms, 0.12.0-rc1 -> 35 ms

#

very odd

#

timings: (0.11.1)

#

0.12.0-rc1

#

seems like pad is more expensive now

#

okay i can confirm this happens even for this MWE

#table(
  columns: 2,
  [a], [b], [c], [d]
)

#

3 ms -> 10 ms (~3.3x slower)

#

big sad

glad urchin Oct 14, 2024, 5:58 AM

#

okay false alarm, i blundered, sorry folks lol

#

testing environment was exceedingly unscientific

#

so we can ignore that

#

after deleting everything and starting from scratch it works fine so yeah

#

my fault 😄

untold turret Oct 14, 2024, 6:02 AM

#

wondering whether we could have a repo that tracks performance, and you can run results on action bots by triggering actions of the repo, like https://github.com/ziglang/gotta-go-fast?tab=readme-ov-file

GitHub

GitHub - ziglang/gotta-go-fast: Performance Tracking for Zig

Performance Tracking for Zig. Contribute to ziglang/gotta-go-fast development by creating an account on GitHub.

glad urchin Oct 14, 2024, 6:02 AM

#

would be cool to have something more automated yeah

#

i would have probably used such a thing before posting anything here

#

haha

#

Basically the person in the forum thread reported a 10x slowdown by wrapping tables in blocks, so i tried to reproduce and noticed there was a slowdown even without blocks
But turns out i was just using some bad binary , maybe built without optimization or smth as it was in some random testing folder
Using the binary from gh releases fixed it (for the case without blocks)

#

With blocks there is a slowdown on both Typst versions , but nowhere near 10x for me

untold turret Oct 14, 2024, 6:06 AM

#

Beyond that, it would be great to have something like typst-test bench that provides machine-independent statistics; then we could make such a repo with the tool. We could simply run the tool locally.
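
A small sketch of the kind of measurement loop such a tool could start from. It is not the typst-test tool mentioned above, and it does not produce machine-independent statistics; `bench` and `median` are hypothetical helpers. Warm-up runs plus the median of several samples at least make a one-off outlier less likely to be mistaken for a regression.

```rust
use std::time::{Duration, Instant};

/// Returns the median sample (the upper middle one for an even count),
/// or `None` if there are no samples.
pub fn median(mut samples: Vec<Duration>) -> Option<Duration> {
    if samples.is_empty() {
        return None;
    }
    samples.sort();
    Some(samples[samples.len() / 2])
}

/// Runs `f` for `warmup` untimed iterations, then times `runs`
/// iterations and returns the median wall-clock duration.
pub fn bench<F: FnMut()>(warmup: usize, runs: usize, mut f: F) -> Option<Duration> {
    for _ in 0..warmup {
        f();
    }
    let samples: Vec<Duration> = (0..runs)
        .map(|_| {
            let start = Instant::now();
            f();
            start.elapsed()
        })
        .collect();
    median(samples)
}
```

The closure passed to `bench` would wrap whatever is being measured, for example a call that compiles one document. Both binaries being compared need to be built with the same optimization settings for the numbers to mean anything.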

glad urchin Oct 14, 2024, 6:06 AM

#

More like 1.5x or smth

sturdy sequoia Oct 14, 2024, 10:41 AM

#

untold turret wondering whether we could have a repo that tracks performance, and you can run ...

I had sort of worked on that at some point

#

But it's tricky

sturdy sequoia Oct 14, 2024, 10:41 AM

#

glad urchin With blocks there is a slowdown on both Typst versions , but nowhere near 10x fo...

can also be that their CPU is old and stuff like that, or low memory bandwidth and a branch mispredict is very expensive for them?

glad urchin Oct 15, 2024, 1:30 PM

#

sturdy sequoia can also be that their CPU is old and stuff like that, or low memory bandwidth a...

not sure, but now they say that, in 0.11, tables without blocks took 60s to compile, while tables with blocks took 600s

whereas, in 0.12.0-rc1, tables + blocks is taking 10s
so that's impressive

violet axle Oct 15, 2024, 2:54 PM

#

For our practical university course, the time has also gone down from ~10s to ~3s (on a rather old pc). So thank you for the improvements!

unborn geyser Oct 22, 2024, 10:13 PM

#

untold turret wondering whether we could have a repo that tracks performance, and you can run ...

prob not relevant, but in python there's airspeed velocity (https://asv.readthedocs.io/en/latest/) (example site: https://pv.github.io/numpy-bench/). Maybe something similar exists for rust

left night Nov 24, 2024, 5:39 PM

#

@sturdy sequoia some interesting stuff I dug up about the history of pre-interned strings in rustc:

https://github.com/rust-lang/rust/pull/59655
https://github.com/rust-lang/rust/pull/59655#issuecomment-480598659
https://github.com/rust-lang/rust/pull/60630
https://github.com/rust-lang/rust/pull/95726
in short: rustc has a list with all the stuff in one place. I'd love to have a const/static PicoStr, but such a list is kind of a pain. I wish the compiler could somehow do it for us, but looks like the rustc folks also didn't come up with a better way.

sturdy sequoia Nov 24, 2024, 6:01 PM

#

left night @sturdy sequoia some interesting stuff I dug up about the history of pre-i...

Indeed, having static strings to at least skip initialization of all "built-in" strings would be so nice, I wonder if eventually we'll be able to do it with const functions?

#

Or I could just spend the two hours to make it by hand lol

#

replacing the pico macro with (essentially) a static list of strings it matches over

left night Nov 24, 2024, 6:03 PM

#

sturdy sequoia Indeed, having static strings to at least skip initialization of all "built-in" ...

even if const fn evolved a lot, I think it'd be hard/impossible because it needs global state

#

maybe we should first discuss what kind of strings we'd like to have statically interned in the first place

#

right now PicoStr is used exclusively for Labels

#

I think it would be nice to use it for all fields (possibly replacing the u8 field ids?)

#

I also would like to use it for HTML tags & attributes in the future

#

and I'd like to have the tags available in const contexts
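
One way the "static list" idea could look, as a sketch only: the names `Sym`, `STATIC_STRS`, and `Interner` are hypothetical and this is not the actual `PicoStr` implementation. Strings in the list get fixed IDs that a `const fn` can compute at compile time, so no global state is touched for them. The runtime interner checks the static list first, so the same text always maps to the same ID regardless of how it was created.

```rust
use std::collections::HashMap;

/// Hypothetical list of strings that are interned ahead of time.
const STATIC_STRS: &[&str] = &["div", "span", "href"];

/// An interned string, represented by a small integer ID.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct Sym(u32);

/// Byte-wise equality that is usable in const contexts.
const fn bytes_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut i = 0;
    while i < a.len() {
        if a[i] != b[i] {
            return false;
        }
        i += 1;
    }
    true
}

impl Sym {
    /// Looks a string up in the static list. When used to initialize a
    /// `const`, a missing string becomes a compile-time error.
    pub const fn constant(s: &str) -> Sym {
        let mut i = 0;
        while i < STATIC_STRS.len() {
            if bytes_eq(STATIC_STRS[i].as_bytes(), s.as_bytes()) {
                return Sym(i as u32);
            }
            i += 1;
        }
        panic!("string is not in the static list")
    }
}

/// Available in const contexts without any runtime initialization.
pub const DIV: Sym = Sym::constant("div");

/// Runtime interner for everything that is not in the static list.
#[derive(Default)]
pub struct Interner {
    ids: HashMap<String, u32>,
    strings: Vec<String>,
}

impl Interner {
    pub fn intern(&mut self, s: &str) -> Sym {
        // Static strings keep their fixed IDs, so `intern("div") == DIV`.
        if let Some(i) = STATIC_STRS.iter().position(|t| *t == s) {
            return Sym(i as u32);
        }
        if let Some(&id) = self.ids.get(s) {
            return Sym(id);
        }
        // Runtime IDs start after the static range.
        let id = (STATIC_STRS.len() + self.strings.len()) as u32;
        self.strings.push(s.to_owned());
        self.ids.insert(s.to_owned(), id);
        Sym(id)
    }

    pub fn resolve(&self, sym: Sym) -> &str {
        let i = sym.0 as usize;
        if i < STATIC_STRS.len() {
            STATIC_STRS[i]
        } else {
            &self.strings[i - STATIC_STRS.len()]
        }
    }
}
```

The linear scan over the list is only there to keep the sketch short. The maintenance cost discussed above remains: every string that should be usable in a const has to be added to the list by hand.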

sturdy sequoia Nov 24, 2024, 6:06 PM

#

I know that using it for fields and function args actually makes a decent perf bump even with the current impl

#

(since I had done it for the VM notably)

left night Nov 24, 2024, 6:07 PM

#

I think variables might also want to use it

#

i.e. anything in the LHS of a Scope

sturdy sequoia Nov 24, 2024, 6:07 PM

#

Huh it's funny because I hadn't tested it, since the VM used IDs for variables

left night Nov 24, 2024, 6:07 PM

#

we probably checked before, but are rust static strings guaranteed to be dedup-ed?

sturdy sequoia Nov 24, 2024, 6:07 PM

#

maybe each scope could have its own list of IDs for variables (so as to not pollute the global interner)

sturdy sequoia Nov 24, 2024, 6:07 PM

#

left night we probably checked before, but are rust static strings guaranteed to be dedup-e...

const ones are but iirc only within one crate

left night Nov 24, 2024, 6:08 PM

#

so we can't rely on pointer comparison for string literals?

sturdy sequoia Nov 24, 2024, 6:08 PM

#

I am not sure

left night Nov 24, 2024, 6:08 PM

#

also unfortunate that strings have alignment of 1

#

kinda prevents using padding bits for tricks

sturdy sequoia Nov 24, 2024, 6:10 PM

#

https://stackoverflow.com/questions/74228163/if-the-same-string-literal-appears-in-code-twice-does-it-appear-in-the-executab

Stack Overflow

If the same string literal appears in code twice, does it appear in...

Let's say that I created some Rust source code that combines lots of duplicate string literals. Are they de-duplicated during the compilation process?

#

This is the best source I can find

left night Nov 24, 2024, 6:10 PM

#

but I guess distinguishing between static strings and runtime ones doesn't work anyway

sturdy sequoia Nov 24, 2024, 6:10 PM

#

left night also unfortunate that strings have alignment of 1

😦

left night Nov 24, 2024, 6:10 PM

#

we can't know if a runtime interned string also exists in the static strings

#

so we can't compare properly
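
A tiny sketch of the comparison problem, with hypothetical helper names. A string built at runtime lives on the heap, so it never has the same address as a literal with the same text. Equality by address therefore gives the wrong answer unless every string goes through one interner first.

```rust
/// True if both strings refer to the very same bytes in memory.
pub fn same_address(a: &str, b: &str) -> bool {
    std::ptr::eq(a.as_ptr(), b.as_ptr()) && a.len() == b.len()
}

/// True if both strings have the same text.
pub fn same_content(a: &str, b: &str) -> bool {
    a == b
}
```

For a literal `"heading"` and a `String::from("heading")`, `same_content` is true while `same_address` is false. Whether two separate literals with identical text share an address is left to the compiler and linker, so it is not something to rely on either.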
