Rich Text - Uncreatively Named Text Handling Prototype Tool | Graphics Programming | Page 2

old harbor Dec 6, 2023, 7:51 PM

#

rip

olive sapphire Dec 6, 2023, 7:51 PM

#

These are the only ones I have:

molten galleon Dec 6, 2023, 7:51 PM

#

I just keep 2 builds, debug with the debug and run benchmarks/profiling with release, luckily I don't use ue so I don't have any cases where debug is literally unusable

olive sapphire Dec 6, 2023, 7:53 PM

#

Oh! I found it:

#

Wow, changing build types is so weird/complicated in VS.

old harbor Dec 6, 2023, 8:03 PM

#

it should be a drop down list in the toolbar somewhere already, out of the box

molten galleon Dec 7, 2023, 12:16 AM

#

#

random thing

#

#

someone mentioned obsidian here a while ago, so I tried it out and it's honestly pretty great

#

it's a solid markdown editor for starters but it also lets me keep a random ass pile of notes in pretty much the same way you'd want from talking to yourself (or others) on discord

#

would definitely recommend

#

it has ```language``` markdown support natively, instantly great

old harbor Dec 7, 2023, 12:19 AM

#

yeah obsidian is great

nocturne hill Dec 7, 2023, 12:43 AM

#

molten galleon

are these all root tasks 💀

#

(i.e. is this necessary for some other project or are these tasks just self contained so to say)

molten galleon Dec 7, 2023, 12:45 AM

#

no they're all completely unrelated

#

they're just the main projects I find fun atm, they do sort of feed off each other in abstract ways

#

in the sense that my raytracer uses my vulkan abstraction and I plan to use this as my general purpose text handling lib in the future

molten galleon Dec 7, 2023, 8:59 PM

#

#

thinking up some ways to generate test data for benchmarking

left junco Dec 7, 2023, 9:01 PM

#

Is notepad really able to render all that

#

That's honestly impressive

#

I never considered it as anything but an ASCII editor lol

molten galleon Dec 7, 2023, 9:01 PM

#

what I ended up doing was using a normal distribution to generate word lengths from randomly picked unicode blocks (out of a subset of unicode blocks I know I have font info for), probably the best I can do without using a dictionary, which is too much of a pain in the ass

#

yeah it clearly has fallback font support too, of some extent

#

but I think hardcoded or using some windows registry default, I have my font set to noto sans but it wants to fallback to Segoe UI or its list of fallbacks when it wants

#

#

also this is laggy as fuck

#

this is 1MiB (1024 * 1024) bytes of UTF8 text

#

#

I just tried inserting some emoji (which, note to self, I need to add some block ranges for so I can test hitting that in my font cache too)

#

and it took a few seconds

#

#

n++ also struggles with it

#

resizing notepad is so slow too lol

molten galleon Dec 7, 2023, 11:00 PM

#

generated randomized font size, weight, and style runs as well, and a debug html file dump

old harbor Dec 7, 2023, 11:07 PM

#

froge_love

#

do you also support that wonky ass unicode overflow thingymajigg where the characters go crazy?

molten galleon Dec 7, 2023, 11:08 PM

#

zalgo text? yeah

old harbor Dec 7, 2023, 11:08 PM

#

yep 😄

molten galleon Dec 7, 2023, 11:08 PM

#

the thread image is a pic of zalgo text rendered with my thing

old harbor Dec 7, 2023, 11:08 PM

#

ah

#

indeed 😄 i was looking at the thread pics earlier

olive sapphire Dec 7, 2023, 11:57 PM

#

left junco Is notepad really able to render all that

It's using the system's/window's text drawing APIs which are pretty bad but can render anything.

molten galleon Dec 7, 2023, 11:59 PM

#

windows text rendering is not bad at all though

#

cleartype is like the best thing around

old harbor Dec 8, 2023, 12:00 AM

#

i find windows' text rendering superior

olive sapphire Dec 8, 2023, 12:00 AM

#

molten galleon windows text rendering is not bad at all though

I'm not saying it's bad in quality, but it's very slow.

#

and Notepad does not do texture atlases or any kind of basic optimization.

old harbor Dec 8, 2023, 12:01 AM

#

how is that relevant?

molten galleon Dec 8, 2023, 12:02 AM

#

with a lot of text on the screen you can see the double buffering from when it tries to redraw on the WM_PAINT message

olive sapphire Dec 8, 2023, 12:02 AM

#

left junco That's honestly impressive

I was giving a bit more info on this.

#

@molten galleon BTW, how do you handle texture atlases? Is it per font?

#

Just wondering about different methods

molten galleon Dec 8, 2023, 12:03 AM

#

nope lazily allocated like I mentioned

#

which is the method roblox ships to all platforms with

olive sapphire Dec 8, 2023, 12:04 AM

#

What do you mean by that? I'm not too familiar with graphics terminology yet.

molten galleon Dec 8, 2023, 12:04 AM

#

I start with an empty atlas, I request a glyph from my atlas, if it doesn't have it then it adds it

#

lazily [x] (just as general programming terminology) just means "only do it when someone asks for the result of it"

olive sapphire Dec 8, 2023, 1:43 AM

#

molten galleon I start with an empty atlas, I request a glyph from my atlas, if it doesn't have...

Ok, so that's the same method I do right now, but I'm confused on how to do multiple fonts and multiple atlases when the first atlas is completely filled up.

molten galleon Dec 8, 2023, 1:10 PM

#

https://stackoverflow.com/questions/3980035/performance-of-win32-memory-mapped-files-vs-crt-fopen-fread
https://manybutfinite.com/post/page-cache-the-affair-between-memory-and-files/

Ended up doing a little digging into memory mapped files on wingdings because of the discussion the other day, this is some good info

#

I'm realizing that while memory mapping isn't the optimal thing for loading a file for a single read, like for a glTF or an image, it might be a lot better than trying to handroll some kind of LRU cache for font file memory, for a whole host of reasons:

an LRU cache like MuPDF's works on variably sized data and just holds whole handles to file memory, its eviction scheme isn't actually true LRU but it tries to scavenge for objects of closer size to the new memory you need, more like an allocator
evicting a file whole means that you might have to eat the cost of reading a whole file into memory multiple times during the course of a single text layouting operation
font files have disproportionately hotter areas (the certain charmaps/tables you want), and trying to serve that need would basically cause you to reinvent the OS's paging, virtual memory, and file caching

molten galleon Dec 9, 2023, 12:51 PM

#

#

preliminary benchmarks (before any fancy alterations to font loading) show I can layout 1MiB of text in about 1.8 seconds on release

#

#

could probably shave some time here and there if I wanted to go extra hard but the vast, vast, majority of the time is spent on harfbuzz actually computing the shaping

#

so I'd say my main goal now would be to just try to reduce overall memory usage without causing too massive of a regression, and make this thread safe (since rn you can't use the font cache from multiple threads), since that's about all you can do to improve runtime

#

but really, it seems the main way to speed up layout is to tailor it to how you're actually using the text

old harbor Dec 9, 2023, 3:15 PM

#

that function naming 😄

molten galleon Dec 9, 2023, 3:19 PM

#

lol which ones? most of the functions in that profile are from hb/FT

old harbor Dec 9, 2023, 3:20 PM

#

yeah underscore mixed with capitalized snek

#

PascalCase and lower_case_snek

#

its all in there : )

molten galleon Dec 9, 2023, 3:21 PM

#

oh yeah lol, just seemed right to name it like that since it narrows the benchmarks by category

#

so you can see how BM_Layout_MultiFont_LineBreak would fit in

#

(which is currently disabled due to epic font cache bugs)

old harbor Dec 9, 2023, 3:27 PM

#

yeah

molten galleon Dec 10, 2023, 1:35 PM

#

fakk how do you make a multi-line-graph in excel

#

#

some random tut tells me my data needs to be kinda arranged like this

#

but how do I do that, if a pivot table didn't

#

fuckkkk how tf do I use excel

old harbor Dec 10, 2023, 1:42 PM

#

shift+enter

#

or

#

even better

#

you remove the grid lines in that range 😛

livid patrol Dec 10, 2023, 1:44 PM

#

One time I tried making a macro for something I was frequently doing in Excel. I never succeeded and I still haven't mentally recovered from it

molten galleon Dec 10, 2023, 2:01 PM

#

#

so I got it to treat string length as numbers, that's a start, it still thinks its all 1 dataset though

#

I just want it to show multiple lines for each test

#

ack

#

rather, 1 line for each test

#

#

I need to figure out how to squeeze it out of this because this is how google benchmark dumps the data to me

#

what a pain

#

you'd think pivot tables would be the way you aggregate it by run name

molten galleon Dec 10, 2023, 2:25 PM

#

ok I've given up on getting a nice graph view for now, excel has won for now

molten galleon Dec 10, 2023, 3:34 PM

#

https://learn.microsoft.com/en-us/globalization/fonts-layout/fonts#font-linking
surprised I didn't come across this before

#

and linux's fontconfig

#

it looks like I've been wandering down essentially the right path, minus the specifics and complexity of my fallback/linking scheme

molten galleon Dec 13, 2023, 4:23 PM

#

got my FontRegistry API looking like how I'd want it, still some internal stuff to smooth over but things are looking good

#

looks like there was basically no performance change before/after either, including changing from loading font files into memory to just mapping them

#

but now building text layouts is safely multithreadable

old harbor Dec 28, 2023, 11:16 AM

#

i found something

#

which might be something for you

#

when you decide to bin vulkan and come back to the holy opengl land

#

or want to use gl for some dummy project

#

NV_path_rendering 😄

#

requires gl1.1 at least, and you can apparently feed novideo cards SVG and postscript path syntax

#

        const char *svgPathString =
          // star
          "M100,180 L40,10 L190,120 L10,120 L160,10 z"
          // heart
          "M300 300 C 100 400,100 200,300 100,500 200,500 400,300 300Z";
        glPathStringNV(pathObj, GL_PATH_FORMAT_SVG_NV,
                       (GLsizei)strlen(svgPathString), svgPathString);

#

probably compiles that into some draw list of sorts under the hood

#

        const char *psPathString =
          // star
          "100 180 moveto"
          " 40 10 lineto 190 120 lineto 10 120 lineto 160 10 lineto closepath"
          // heart
          " 300 300 moveto"
          " 100 400 100 200 300 100 curveto"
          " 500 200 500 400 300 300 curveto closepath";
        glPathStringNV(pathObj, GL_PATH_FORMAT_PS_NV,
                       (GLsizei)strlen(psPathString), psPathString);

#

        static const GLubyte pathCommands[10] =
          { GL_MOVE_TO_NV, GL_LINE_TO_NV, GL_LINE_TO_NV, GL_LINE_TO_NV,
            GL_LINE_TO_NV, GL_CLOSE_PATH_NV,
            'M', 'C', 'C', 'Z' };  // character aliases
        static const GLshort pathCoords[12][2] =
          { {100, 180}, {40, 10}, {190, 120}, {10, 120}, {160, 10},
            {300,300}, {100,400}, {100,200}, {300,100},
            {500,200}, {500,400}, {300,300} };
        glPathCommandsNV(pathObj, 10, pathCommands, 24, GL_SHORT, pathCoords);

#

turtle grafics en novideo

#

ideal to draw glyphs 😛

molten galleon Dec 28, 2023, 1:03 PM

#

wow lmao

molten galleon Dec 28, 2023, 1:57 PM

#

kinda makes me wonder too if this is/was a vector for exploits

#

because you have to think that at least at some point some browser might've been pasting raw <svg> strings into this

old harbor Dec 28, 2023, 4:57 PM

#

molten galleon kinda makes me wonder too if this is/was a vector for exploits

haha that was my first thought too when i saw that - forgot to post the sauce https://github.com/KhronosGroup/OpenGL-Registry/blob/main/extensions/NV/NV_path_rendering.txt

olive sapphire Dec 28, 2023, 4:58 PM

#

don't browsers just use bigboi graphics apis like Skia

#

don't think they would need anything like this in the first place prob

old harbor Dec 28, 2023, 4:58 PM

#

yes

old harbor Dec 28, 2023, 4:59 PM

#

olive sapphire don't think they would need anything like this in the first place prob

ofc not, they use whatever libs are there already

olive sapphire Dec 28, 2023, 4:59 PM

#

I saw this during my research but it's kinda useless because only Nvidia GPUs support it lmao

old harbor Dec 28, 2023, 4:59 PM

#

ye

#

it might not be useless though

#

if your vector grafics editor you are making can detect nvidia, you could potentially use it and have hardware accellerated vetor grafics

olive sapphire Dec 28, 2023, 5:01 PM

#

yeah but if your other option doesn't use cpu prerendered in the first place, it's probably too dogshit to exist

#

and I don't think you can use it in place of a CPU prerender

old harbor Dec 28, 2023, 5:01 PM

#

you dont know

#

and its there

#

use it or dont : ) doesnt mean its dogshit

olive sapphire Dec 28, 2023, 5:02 PM

#

nah this is cool, but for 99% of people that are trying to make their games accessible at all, i don't think this would be an option 😔

#

i wonder if you could build a resizing algo that efficiently resizes based on the vectors instead of having freetype generate them for you again

molten galleon Dec 28, 2023, 5:04 PM

#

I'd wanna figure out how the driver implements it

olive sapphire Dec 28, 2023, 5:04 PM

#

good luck

molten galleon Dec 28, 2023, 5:04 PM

#

I'm guessing there's a 'standard' set of techniques to rasterizing vectors tho

olive sapphire Dec 28, 2023, 5:04 PM

#

not like the whole linux community spends decades trying to figure that shit out and still fails

old harbor Dec 28, 2023, 5:04 PM

#

olive sapphire nah this is cool, but for 99% of people that are trying to make their games acce...

obviously

old harbor Dec 28, 2023, 5:05 PM

#

molten galleon I'm guessing there's a 'standard' set of techniques to rasterizing vectors tho

glCreateLists, glCompileLists, glDrawLists 😛

molten galleon Dec 28, 2023, 5:05 PM

#

stuff like this is probably bundled as special shaders the driver people wrote

olive sapphire Dec 28, 2023, 5:05 PM

#

old harbor glCreateLists, glCompileLists, glDrawLists 😛

wait is that actually a thing

old harbor Dec 28, 2023, 5:05 PM

#

yes, fixed function stuff from before you were born probably

molten galleon Dec 28, 2023, 5:05 PM

#

kinda like the BVH builder shaders that pixel writes

#

I bet mesa has an implementation of this ext

olive sapphire Dec 28, 2023, 5:05 PM

#

i looked it up and nothing showed up

#

🤔

#

https://community.khronos.org/t/multi-threaded-loading-screen/56944

#

it's mentioned here but i can't find docs for it

old harbor Dec 28, 2023, 5:06 PM

#

ah its glNewList(... GL_COMPILE)/glCallList

#

my DSA brain turned it into glCreate...

#

its been a while but i have written code using those

#

23 years

olive sapphire Dec 28, 2023, 5:11 PM

#

it renders subsequent draw commands?

#

is that kind of like a vulkan command buffer?

old harbor Dec 28, 2023, 5:12 PM

#

no

#

much simpler

#

vulkan commandbuffer stuff is more like NV_command_list

molten galleon Dec 28, 2023, 5:13 PM

#

iirc graphite posted a while back how he figured out how to do some ptr offset hacks into VkCommandBuffer on nvidia and write in NV device commands directly

olive sapphire Dec 28, 2023, 5:13 PM

#

oh ok

molten galleon Dec 28, 2023, 5:13 PM

#

fun cursed stuff

olive sapphire Dec 28, 2023, 5:14 PM

#

molten galleon iirc graphite posted a while back how he figured out how to do some ptr offset h...

oh like he just looked at the VkCommandBuffer internal code and got a pointer back from it? lol

molten galleon Dec 28, 2023, 5:14 PM

#

idk how he figured it out

olive sapphire Dec 28, 2023, 5:15 PM

#

i'm not experienced but i don't there there are too many diverse ways to go about that

old harbor Dec 28, 2023, 5:17 PM

#

molten galleon iirc graphite posted a while back how he figured out how to do some ptr offset h...

yeh but thats not necessary probably

#

ah you said it, fun cursed stuff, indeed

molten galleon Dec 28, 2023, 5:18 PM

#

yeah not at all, its just a neat hack, and a look into how commandbuffers are actually implemented on nv drivers

old harbor Dec 28, 2023, 5:18 PM

#

a list of numbers with other numbers in between 🙂

#

hehe reminds me of my super naive commandlist implementation https://github.com/deccer/Xacor/blob/master/Xacor.Graphics.Api.D3D11/D3D11CommandList.cs

nocturne hill Jan 4, 2024, 5:19 PM

#

molten galleon idk how he figured it out

have you ever made cheats for video games

#

same thing

molten galleon Jan 4, 2024, 5:19 PM

#

haven't done anything more advanced than modifying values in cheat engine

nocturne hill Jan 4, 2024, 5:19 PM

#

well those values you were modifying were inside a struct of some sort

#

you figure out some amount of the layout of that struct somehow

#

https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/amd/vulkan/radv_private.h#L1603

#

VkCommandBuffer is always a real pointer, basically

#

and so you do ((nv_cmd_buffer*)myVkCb)->things

#

very simple

#

but also if graphite learned this by poking at nv driver with a disassembler, the resulting work is potentially not legally shippable

#

this also won't work in presence of any layers that implement VkCommandBuffer

old harbor Apr 13, 2024, 1:26 PM

#

https://www.youtube.com/watch?v=SO83KQuuZvg

YouTube

Sebastian Lague

Coding Adventure: Rendering Text

This... is text! Let's figure out how to draw it.
Starring: Bézier curves and (so many) floating point problems.

Source code: currently in early access on Patreon, but will be freely available on the 25th. If you'd like to support my work (and get early access to new projects) you can do so here:
https://www.patreon.com/SebastianLague
https://k...

▶ Play video

molten galleon Apr 13, 2024, 4:20 PM

#

pretty cool, I like how he goes from first principles and explores all the differentm ethods

balmy gulch Apr 13, 2024, 4:21 PM

#

https://tenor.com/view/cat-revives-gif-24153050

Tenor

old harbor Apr 13, 2024, 4:32 PM

#

sebastian can explain all the stuff he does so nicely

nocturne hill Apr 13, 2024, 8:49 PM

#

old harbor https://www.youtube.com/watch?v=SO83KQuuZvg

#wip message

old harbor Apr 13, 2024, 8:51 PM

#

noice

nocturne hill Apr 13, 2024, 8:52 PM

#

:frog_ancient:

#

I should redo my path rasterizer at some point, I'm not really sure I agree with lague's takeaways tbh

#

finding roots of even quadratics is a huge floating point pain

#

which he does demonstrate in the vids

#

in fact IIRC my solution to numeric robustness was to switch to iterative root finder

#

it was simply more robust against higher power coefficients being close to 0 than solving analytically

#

I was considering tessellating my stuff to linear segments

#

that stuff is very robust

#

or maybe I should do something else idk

#

and also I rather want path/vector textures

#

instead of just drawing text on-screen

#

though

#

that might be a meme use case

#

because there's no way to author that

#

and a much better solution for vector-looking memes on models is still SDF/threshold masking

#

because that's actually authorable in something like blender

old harbor Apr 13, 2024, 9:04 PM

#

NV_path_rendering my beloved 🙂 (jk)

molten galleon Apr 13, 2024, 9:27 PM

#

wdym no way to author that, inkscape?

nocturne hill Apr 13, 2024, 9:31 PM

#

molten galleon wdym no way to author that, inkscape?

in blender

#

well I guess you could do that but idk it's not the most convenient looking thingy

#

I don't think you can use SVG as a texture in blender

molten galleon Apr 13, 2024, 9:33 PM

#

oh I see what you mean, using svgs as texture maps for models

#

that'd be cool but you'd probably have to roll your own tooling via a plugin

nocturne hill Apr 13, 2024, 9:34 PM

#

that tooling would just be converting stuff to bitmap at like load time and it would suck a lot

#

I tried doing ktx2 images in blender that way once

molten galleon Apr 18, 2024, 2:37 PM

#

Finally got around to removing all my optional git submodules and just pulling them if you have tests enabled and don't have them in tree

#

now the library is officially usable and not clutter

molten galleon Apr 19, 2024, 6:09 PM

#

#

don't love synthetic bold, but it works

#

need to properly override the advance lookup

molten galleon Apr 19, 2024, 7:31 PM

#

synthetic italic isn't nearly as bad

old harbor Apr 19, 2024, 7:31 PM

#

i prefer the light and regular ones tbh

#

and medium for specific fonts

molten galleon Apr 19, 2024, 7:33 PM

#

regular is the standard font weight

#

that's the one that's being scaled

#

most of the time you only ever see regular and bold

old harbor Apr 19, 2024, 7:36 PM

#

yeah

#

i put Light on purpose for coding fonts sometimes

#

or imgui :3

molten galleon Apr 20, 2024, 6:00 PM

#

#

and (non-synthetic) smallcaps

#

slowly working through the remaining feature list

molten galleon Apr 23, 2024, 12:34 PM

#

oh boy I am getting to the fun parts, so far I've been doing smallcaps, superscript, and subscript glyph substitutions by passing the relevant opentype font tags to the harfbuzz shaper, and it just does the substitution lookup from the gsub table if possible, but now I'm looking into how to synthesize them if they're not present and harfbuzz doesn't seem to provide too much for that

#

I think relying on opentype features is on the whole, the wrong way of going about it, and if I want synthetic substitution glyphs I'd have to do a pre-scan in the step where I generate font runs and have some kind of font instance for handling the substitutions via the hb nominal glyph function override

#

luckily this is a CSS feature https://developer.mozilla.org/en-US/docs/Web/CSS/font-synthesis so I am thinking the easiest way to figure out how to do this is to just make a debug build of chromium on linux, make a simple html file, and insert a debug break where they call harfbuzz stuff

#

if at all possible though I want to avoid the scenario where I need to parse the otf tables (or worse, write into them)

left junco Apr 24, 2024, 2:43 AM

#

What is the input to this whole pipeline

#

Just font files?

#

Or is there some database that fonts only affect a subset of

molten galleon Apr 24, 2024, 9:30 AM

#

there's font files, font family description files, and the formatting runs data for the target string

#

#

font family files describe the relationship between individual faces, which is mainly necessary for changing the weight, (style) italicness, and finding the correct sources for languages and symbols

#

#

and the formatting runs just describe the intervals of fonts, colors, italic, underline, stroke, etc. for the string, in my library I provide a generator that parses inline XML from your string for formatting, but your source could be anything, like syntax highlighting

molten galleon Apr 24, 2024, 6:52 PM

#

LLVM ERROR: IO failure on output stream: No space left on device

#

fuck guess I'm not building chromium, only 450 files left

#

I could probably dig up an HDD and move the chromium dir there

molten galleon Apr 26, 2024, 10:06 AM

#

in the absence of actually building chromium, I think I found where they do the smallcaps transformation

#

haven't traced entirely what goes on but it mimics what I assume I'd have to do, which is to split out smallcaps runs during my run segmentation stage

#

based on everything I've seen though, I think now is the time to start building custom harfbuzz font callbacks instead of relying on hb-ft.h

molten galleon Apr 26, 2024, 12:08 PM

#

one of my original ideas was going to be to emit bitflags on the high bits of glyphs to signal if they need synthesis, I think it's actually ok since hb_codepoint_t is unsigned but truetype fonts max out at 16 bit glyph indices I think, but I am not confident enough in that to risk it, and it seems chromium doesn't, instead what they seem to do is split a string like Hello World to H ello W orld and mark those ello and orld ranges as handled by a font that always does a synthesis

#

though I'll have to see how to handle non-synthetic caps since from probing the hb buffer callback, it only does a glyph substitution if it actually has the smcp table and hits a possible substitution, so my get nominal glyph would have to look something like

if is OTF and has smcp table and smcp[chr]:
  return glyphof(chr)
else
  return glyphof(toupper(chr))

molten galleon Apr 26, 2024, 12:30 PM

#

however I don't see any evidence of that in chromium, I might have to build it after all

molten galleon Apr 29, 2024, 3:23 PM

#

been spending a few days fucking around with building chromium

#

whatever version I pulled from main and built (~9 hours) seems to crash on startup, with a mile deep stack trace I have no idea how to debug, so I had to get the git history and start changing tags (like an hour to pull and overnight to do the deltas)

#

switched to the latest tagged version and I'm rebuilding, only 37k files I need to rebuild

#

all to find out I probably already found all the references I need in the source code search, but I still want to have a working version to probe

molten galleon Apr 30, 2024, 12:38 PM

#

I think I've placed too much trust in chromium, I finally did what I should have done to start and put together a simple test page to invoke synthetic smallcaps and subscript/superscript, and I realized that the reason I can't find these features in the source code is that chromium doesn't even support them

#

but my test page works in firefox, so time to start snooping through firefox instead

nocturne hill Apr 30, 2024, 12:39 PM

#

@molten galleon hello where do I license your path rasterizer

molten galleon Apr 30, 2024, 12:39 PM

#

I haven't bothered looking through actual glyph rasterization methods yet

#

been too deep in font features

nocturne hill Apr 30, 2024, 12:40 PM

#

💀

molten galleon Apr 30, 2024, 12:40 PM

#

for now I just rasterize via freetype to an atlas, and have path rasterization for "later"

#

though this is my last major feature before I do that

molten galleon May 1, 2024, 10:07 AM

#

so firefox does pretty much what I expected I'd need to do

#

for synthetic subscript and superscript it basically completely overrides the font settings and doesn't emit a subs/sups tag, just creates a copy of the base font with scaling and translation applied, there's no mixing of synthetic and whatever scant few letters are maybe provided as subscript/superscript, which I guess works out best overall

#

for smallcaps it looks like substitution is never done in hb_font_get_nominal_glyph, but as a part of the subfont-splitting stage of run generation, where it writes out a transformed string based on the actual capping rules (smcp, pcap, c2sc), gonna have to check what ICU provides me here since mozilla is full of a lot of handrolled unicode support, and of particular note is that Greek casing is apparently complex enough to warrant a lot of notes

#

gonna have to look further to see if they check tables/glyphs for availability, but I think due to how CSS font-synthesis tags work, deciding what gets synthesized is mainly user driven

molten galleon May 1, 2024, 3:24 PM

#

so the chromium SmallCapsIterator just marks out runs of small-cappable chars by using u_hasBinaryProperty(UCHAR_CHANGES_WHEN_UPPERCASED) and tacking on u_getCombiningClass() to make sure combining marks join with their adjacent run

#

in the firefox impl they seem to use their custom unicode functions to lookup if a char has some kind of alternate capitalization mapping, but the main thing is that they run a toupper transformation using their custom uppercase transformer thing, which I think I'd have to do with ICU's transformers

molten galleon May 3, 2024, 12:35 PM

#

https://www.unicode.org/reports/tr21/tr21-5.html

#

Case mappings may produce strings of different length than the original.
For example, the German character U+00DF "ß" small letter sharp s expands when uppercased to the sequence of two characters "SS". This also occurs where there is no precomposed character corresponding to a case mapping, such as with U+0149 "ŉ" latin small letter n preceded by apostrophe.

nocturne hill May 3, 2024, 12:40 PM

#

ǹ

molten galleon May 3, 2024, 12:40 PM

#

gonna have to test out how this applies to editable text, if it does at all, since this messes with any kind of logic that'd let me build a remapping between the u_strToUpper result and the original string

#

so true

nocturne hill May 3, 2024, 12:41 PM

#

unicode more like unicöck

molten galleon May 3, 2024, 12:41 PM

#

mozilla has a custom implementation of locale-aware str toupper that seems to obey turkish and greek casing rules but might just ignore funky stuff

molten galleon May 3, 2024, 9:16 PM

#

hmm, so I actually don't know what kind of funky logic they use to actually decide whether to synthesize or not, because it seems to trip on the Straße, but if I force enable smallcap synthesis in the code I get der STRASSE just as I expected, and just as u_strToUpper gives me with lang="de", and based on how the selection cursor interacts with it, the 2 S's are treated logically as 1 character

#

I think I have enough info to start building my own solution, perhaps slightly better than firefox/chromium in these minor ways

old harbor May 3, 2024, 9:21 PM

#

i have a huge respect for you putting up with this text stuff 🙂

#

23:23:23.232

molten galleon May 3, 2024, 9:50 PM

#

I just want it to work well lmao

#

I like how smallcaps look, and from the frontend of the renderer they're just toupper(c) scaled down a few percent, so how hard could it be?

#

little did I know...

molten galleon May 4, 2024, 11:39 AM

#

so one minor thing I failed to consider was that mozilla's TransformString does also fill a bitset with what character indices it performed a special case substitution on, and marks if the string needs a merge, which I can't do without having my own implementation

#

which means it's NIH time baby

#

as a bonus, mozilla and ICU's uppercase transformations are both UTF-16 native so rolling my own UTF-8 native one means I can continue to avoid having to convert there and back, which I've successfully avoided so far

molten galleon May 6, 2024, 10:59 PM

#

#

finally piped everything through to get synthesis for subscript and superscript (here it's actually seamlessly combining with natural smallcaps, not synthetic yet)

#

so now I should have everything I need for synthetic smallcaps from the perspective of glyph transformations, just need to add the char substitutions

old harbor May 6, 2024, 11:18 PM

#

you could toss a little sneak peak in there as well "libFancyFont" with a superscript "tm" 😄

molten galleon May 7, 2024, 10:37 AM

#

true

#

here's how it looks with the ™️ emoji

old harbor May 7, 2024, 10:51 AM

#

that does look neat

molten galleon May 7, 2024, 11:05 PM

#

notepad flashes the cursor exactly 5 times after you make an input then holds solid

#

this is important information

left junco May 7, 2024, 11:29 PM

#

Linux terminals do the same thing, it's to save having to redraw the screen since on an old terminal it would have been the only thing moving

#

Idk if there's any reason for it these days beyond historical

#

But I like it

molten galleon May 8, 2024, 1:13 PM

#

honestly shocked that it more or less "just works"

#

#

just using a hardcoded substitution for now, but it just treats the 2 glyphs like 1 character, just like it's supposed to

#

although it does need to be improved a bit since clearly only the 2nd S glyph is being used for the bounds

#

but now synthetic smallcaps just work, all that's left is to implement the unicode case mapping algorithm

molten galleon May 8, 2024, 8:29 PM

#

https://www.unicode.org/Public/UNIDATA/SpecialCasing.txt

#

ah yes, parseable raw data where some comments are data and some are comments

#

both ICU and firefox seem to have some kind of mix of generated lookup tables and handwritten logic

molten galleon May 10, 2024, 1:05 PM

#

oh boy, I found the icu::Edits class and icu::CaseMap::utf8ToUpper, no need to do any complex casemapping crap myself

#

I think now I've covered every core font styling feature I wanted to add

#

and only 2 layout features left for later, vertical text and tab alignment

#

so now it's time to fix some bugs and start on glyph rendering

old harbor May 10, 2024, 1:07 PM

#

what about angled text 😛

#

i do have some 45deg text in some of my excelsheeterinos hehe

molten galleon May 10, 2024, 1:08 PM

#

like globally angled?

#

iirc there's a param in the GDI font api to set individual character spins

old harbor May 10, 2024, 1:10 PM

#

ye globally, as in the whole text in the cell is angled

molten galleon May 10, 2024, 1:12 PM

#

that's on the UI layer to do, and the glyph renderer to make not look aliased as fugg

#

since you just get rect positions in the space of your parent textbox

#

https://learn.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-createfonta

#

I found what it was called again, escapement and orientation

#

orientation is the rotation of the baseline (so the angle of the string) and escapement is the rotation of each char

#

so to draw a string at an angle, you set orientation = escapement, otherwise your characters are slanted

#

and I assume the reason it does this is probably because your glyph in your atlas has to be rendered as (char = 'A', font = ..., escapement = 45) so that the final bitmap doesn't alias, which is maybe neat to have support for, but I think in a game engine you'd rather handle that with your glyph shaders

#

with an SDF or vector based glyph

old harbor May 10, 2024, 1:30 PM

#

tell that eve online players

#

with their excelsheets in game 😛

#

jk

molten galleon May 14, 2024, 12:24 PM

#

hmm I feel sort of dumb, for a while I thought scaled fonts had some hidden internal scale factor, but I realize I knew how to calculate font metrics the whole time, given a ppem size (4/3 * pt size), it's just upem / ppem

old harbor May 15, 2024, 11:14 AM

#

@radiant spear feel free to pester DR about font isms too, he knows his shit most on this server i think 🙂

molten galleon May 17, 2024, 12:25 PM

#

gonna interrupt the regularly scheduled text content to post about sails so I don't clutter other people's threads further lol

#

that convo got me thinking about taking another crack at modeling sail force

#

#

and generally you have these 2 main forces

#

#

plus your air drag which is proportional to v_effectiveWind^2 in the direction of -v_effectiveWind

#

I realized just now you could probably get a good enough approximation of the lift using standard wing models and using the chord/sweep/twist/whatever other descriptors of standard wings you have to interpolate the models, and just account for the fact that when v_effectiveWind is parallel to t_sail, the force is 0 because the sail goes flat, and it flips if dot(v_effectiveWind, n_sail) < 0

#

#

#

this btw is why you see a peak of power somewhere around a broad reach, since you get the maximal combo of lift and reaction

#

@left junco if you were further curious about the model

#

there's some interesting behaviors that arise from it too, since in either case the power is delivered in the direction of the sail's normal, but your forward speed is proportional to dot(d_ship, n_sail), so with an "idealized hull" with no lateral drift, there is an optimal angle to maximize the force delivered in the forward direction, but that can change with the lift force since that's a function of effective wind, which depends on your speed, so there's a feedback loop

#

and you'd probably have to solve a differential equation to find its optimal stable state

#

I might have better luck figuring this out 'acquiring' a book on aircraft physics instead of trying to get one of the rare sailing oriented ones, dunno why I never thought to do that

left junco May 17, 2024, 4:11 PM

#

I'll read this in a bit

#

But yeah it could be helpful

molten galleon May 18, 2024, 12:57 PM

#

back to text stuff, I've been coping with the fact that I can't seem to tune my MSDF generation by copying the windows menus for my debug vars

#

#

and it's a little compressed in the video so it takes away from it, but man my MSDF settings suck

#

I've been picking through the code in msdf-atlas-gen to match what he does

#

but I haven't quite got it

#

and I want to at least get a fair setting of MSDF before I compare it to Qin or direct curve rendering

molten galleon May 19, 2024, 3:03 PM

#

#

#

so I tightened it up a tiny bit and double checked that my math matches msdf-atlas-gen, which it does, but MSDF is still fuzzy at large fonts and ugly at small fonts, I think this is down to how I'm rendering it perhaps

#

to render it I referenced this article which referenced in turn this github issue on msdfgen

Medium

Implementing MSDF Font In OpenGL

MSDF is an efficient, modern way of drawing fonts at any size and this article looks at how you can implement it in your application!

GitHub

Improved anti-aliasing. · Issue #22 · Chlumsky/msdfgen

Hi Chlumsky, I noticed some aliasing artifacts when rendering fonts using msdfgen. By changing the shader a bit, I was able to improve the rendering quality: float sigDist = median( sample.r, sampl...

#

which gave me the following code:

vec3 msdf = texture(u_texture, v_texCoord).rgb;
float sigDist = median(msdf.r, msdf.g, msdf.b);
float w = fwidth(sigDist);
float opacity = smoothstep(0.5 - w, 0.5 + w, sigDist);
    
outColor = vec4(v_color.rgb, v_color.a * opacity);

#

which is notably different from the example given on the msdfgen readme: https://github.com/Chlumsky/msdfgen?tab=readme-ov-file#using-a-multi-channel-distance-field
in that it lacks any kind of notion of my input scale or range

#

hmmmmmmmm

#

might need to open it as a generic GP question because I don't really know how antialiased SDF rendering works

molten galleon May 20, 2024, 12:44 AM

#

hmmm I think I'm getting there except for this annoying artifact

molten galleon May 20, 2024, 6:52 PM

#

a lot of experimentation and yet I think I had the best solution the whole time

#

I think small letters just suck with SDF fonts

#

float w = fwidth(sd);
float opacity = smoothstep(0.5 - w, 0.5 + w, sd);

seems to just work, and is especially good for upsampling

#

https://drewcassidy.me/2020/06/26/sdf-antialiasing/

#

this article with its more primitive version seems to show it even works in 3D

#

wonder what I can even do for downsampling, I tried generating mipmaps (automatically) and that didn't really give me much

nocturne hill May 20, 2024, 6:54 PM

#

molten galleon I think small letters just suck with SDF fonts

get high density display

#

if you have chunky pixels you need to do subpixel rendering which can get real headacheful with blending and stuff

#

but it's just evaluating the sdf at 3 points instead of 1

nocturne hill May 20, 2024, 6:55 PM

#

molten galleon wonder what I can even do for downsampling, I tried generating mipmaps (automati...

what filter did you use

#

I don't think pretending SDF is a signal and low pass filtering it is a good idea

#

but let me think for a few seconds

#

ah actually

#

right

molten galleon May 20, 2024, 6:56 PM

#

whatever glGenerateTextureMipmaps uses, I thought about it for a second too and since SDFs discretely sample a lienar thing it seemed fine

#

it didn't even seem necessary actually

nocturne hill May 20, 2024, 6:57 PM

#

anyway

#

I don't think you can do anything at all with SDFs

molten galleon May 20, 2024, 6:57 PM

#

though it makes sense overall why small images suck, you completely miss the important samples that describe the edges

#

potentially

nocturne hill May 20, 2024, 6:57 PM

#

aliasing when letters are small is because you can't represent the edges or whatever

#

SDF can't help you there

#

well

#

you don't need normal images

#

you can just sprinkle some TAA at it

#

assuming you're drawing in 3D

#

also >using gl

#

cringe!

molten galleon May 20, 2024, 6:59 PM

#

this is my shitty test program and not my main vulkan renderer so my goal is to just have as small of a renderer footprint as possible

nocturne hill May 20, 2024, 7:00 PM

#

ok

#

well anyway

#

in 3D I just wouldn't care and let TAA eat it

#

but if you do just switch to normal bitmap with full mip chain

#

well I guess there's also the sucky case of when you're doing a very anisotropic sample

molten galleon May 20, 2024, 7:03 PM

#

yeah, it seems good enough overall for 3D but I guess the lesson here is you can't have MSDF be a 1 size fits all solution for 2D text and 3D

nocturne hill May 20, 2024, 7:03 PM

#

sdf just kinda sucks tbh

#

msdf too

#

msdf actually even worse because the algorithm is something crazy, slow and falls apart in some cases

molten galleon May 20, 2024, 7:05 PM

#

yeah, I'm hoping the Qin paper that jasper mentioned might at least be a saner formulation of MSDF-like stuff

nocturne hill May 20, 2024, 7:05 PM

#

just rasterize paths as is

#

no need for this msdf bs

#

you can also modify your path rasterizer slightly and have it render sdf

molten galleon May 20, 2024, 7:06 PM

#

that's what I will probably end up doing, but the Qin thing seems interesting to at least try, esp. if it ends up being better than MSDF

nocturne hill May 20, 2024, 7:06 PM

#

this way you can also generate bitmap representation at runtime + mips

molten galleon May 20, 2024, 7:06 PM

#

since everyone seems to shill MSDF

#

https://dl.acm.org/doi/abs/10.1145/1111411.1111433

#

and I don't think there's a publicly available implementation of it anywhere

nocturne hill May 20, 2024, 7:08 PM

#

frog_turtle

#

I'll take a look once it downloads LMAO

#

the pdf I mean

#

but it's probably a nexttonothingburger

#

ah nvm found the pdf that opens quickly actually

molten galleon May 20, 2024, 7:09 PM

#

just paste the link into scihub

nocturne hill May 20, 2024, 7:14 PM

#

ok I think I understand what it's doing

#

IIUC there's a list of lines per cell in the grid and they compute min or max { distances to all lines in the list }

#

or segments

#

something along these lines

#

hmm yea it's segments

#

section 5 goes in detail

molten galleon May 20, 2024, 7:27 PM

#

yeah this is pretty weird and itneresting, this is the first time I actually inspected it closely

#

the way it was initiallly described to me was as some kind of much earlier proto-MSDF but with a saner distance function

#

the data representation seems a little funky for modern use though with the voronoi diagram and the indirection

nocturne hill May 20, 2024, 7:40 PM

#

why

molten galleon May 20, 2024, 7:41 PM

#

there's some kind of per-texel indirection and linear search

#

so each glyph seems to point to an index in the line segment table and then it also looks at the next 4 neighbors

molten galleon May 22, 2024, 10:21 AM

#

been rolling the Qin stuff around in my head for a bit and I'm starting to appreciate the elegance of it

molten galleon Jul 6, 2024, 1:36 PM

#

FUCK

#

I have 2 projects, 1 that's just a tester for my lib and 1 that's my actual engine

#

the tester can link the library and run code, my real engine tells me it can't find libicudt

#

ld.lld: error: undefined symbol: icudt73_dat
>>> referenced by libicuuc.a(udata.cpp.obj):(.refptr.icudt73_dat)

#

and I even set up my tester to have a sub-library like the LibEngine in my main repo

#

this literally makes 0 sense

#

the raw compile commands even reference the same libs in the same order

old harbor Jul 6, 2024, 1:49 PM

#

are you on windows still?

molten galleon Jul 6, 2024, 1:50 PM

#

yep

old harbor Jul 6, 2024, 1:50 PM

#

delete build/ (or whatever $(IntermediateDir) in vs perhaps?

molten galleon Jul 6, 2024, 1:51 PM

#

ok now I'm really confused

#

I'm building with gcc

old harbor Jul 6, 2024, 1:51 PM

#

with cmake?

molten galleon Jul 6, 2024, 1:51 PM

#

I copied the libicudt.dll.a from my working project to the non-working project

#

yep

#

not even the actual dll, just the dll.a file

#

and that fixed the build

#

so for some reason, just my engine project is building a bad dll.a file

old harbor Jul 6, 2024, 1:51 PM

#

dll.a should just be the forward for the dll

molten galleon Jul 6, 2024, 1:51 PM

#

very consistently

old harbor Jul 6, 2024, 1:52 PM

#

hmm do you have a rogue setting somewhere in your cmakelists perhaps

#

which switches BUILD_SHARED or something

molten galleon Jul 6, 2024, 1:52 PM

#

I shouldn't

#

hmm brb

#

I will get back to this

old harbor Jul 6, 2024, 1:53 PM

#

: )

#

welcome back btw hehe

molten galleon Jul 6, 2024, 4:38 PM

#

ok I came back, made a new cmake build directory and config'd it, and that directory doesn't have this issue

#

and I did try doing ninja clean in my old one and rebuilding completely

#

so I guess my old build dir had some magic state in the cmake cache that told it to fuck my shit up

#

average saturday morning C++ session

old harbor Jul 6, 2024, 4:40 PM

#

: )

#

deleting build/ only didnt do the trick?

molten galleon Jul 6, 2024, 4:41 PM

#

I just made a build2 in case I had some kind of untracked junk in my build dir I wanted to copy over

#

which is essentially the same as deleting build/

old harbor Jul 6, 2024, 4:41 PM

#

ah

molten galleon Jul 6, 2024, 4:41 PM

#

I don't think there's any particular setting that would've caused this, I don't have BUILD_SHARED_LIBS enabled in either, and my ICU setup is handrolled to build everything static except icudt.dll

molten galleon Jul 9, 2024, 1:50 PM

#

so, asset manager = finally integrate LibRichText into my engine = set up a fresh text atlas using my new UploadPool = I don't have the desired subregion upload behavior

#

I think I understand why devsh has a synchronous mode for his upload pool

#

however there is no way I'm waiting on a semaphore to upload glyph subregions

wet drift Jul 10, 2024, 11:35 AM

#

molten galleon I think I understand why devsh has a synchronous mode for his upload pool

it only kicks in when you run out of staging space

#

we don't actually wait per subregion/glyph

#

timeline semaphores are truly bae here

molten galleon Jul 10, 2024, 12:06 PM

#

yeah that's the conclusion I ended up coming to

#

I added a "semi-synchronous" flush() operation to my upload pool where it will wait on intermediate submissions if it needs to wait to stage more upload jobs, but it won't wait on the last submission

#

so if there's only 1 submission required it doesn't wait at all

molten galleon Jul 10, 2024, 12:10 PM

#

wet drift timeline semaphores are truly bae here

btw, if I issue my upload submissions w/ a final transition barrier and it's issued to the graphics queue, I don't have to wait on the semaphore in my main render submission because it's also on the same queue, right?

wet drift Jul 10, 2024, 12:31 PM

#

yes

molten galleon Jul 10, 2024, 12:38 PM

#

epic

molten galleon Jul 11, 2024, 10:46 PM

#

#

briefly derailing my own thread to come to what is perhaps the most obvious realization

#

displaying an ECS style level in editor is as simple as displaying the entities in their transform hierarchy in the explorer/tree view, and displaying the components in the properties pane, with each of the subheaders (Data, Selection, State, Text in the examples) just being representatives of the components

#

also >OpenTypeFeatures
I need to cook...

old harbor Jul 11, 2024, 10:48 PM

#

: )

#

dotnet peddler spotted

molten galleon Jul 11, 2024, 10:49 PM

#

weak, pathetic

#

I sleep

old harbor Jul 11, 2024, 10:49 PM

#

you can even use "Category" on the left side to "group" properties into categories or "components"

molten galleon Jul 11, 2024, 10:49 PM

#

yeah, that's how it seems to be used here

#

but this is a pretty convenient way to hijack this scheme to display ECS instead of a polymorphic node tree

old harbor Jul 11, 2024, 10:55 PM

#

yeah

#

you cheapskate

livid patrol Jul 11, 2024, 11:17 PM

#

molten galleon displaying an ECS style level in editor is as simple as displaying the entities ...

I came to this realization somewhat recently as well. Before, I had the component editor inline with the transform hierarchy and it was just a mess

#

It also helped to look at existing things that do something similar (blender, ue)

molten galleon Jul 11, 2024, 11:20 PM

#

I didn't know blender was ECS-like in any way

#

it has pretty strongly typed nodes afaict

left junco Jul 11, 2024, 11:20 PM

#

Now you just have to add all the serialization code bleakekw

#

Or the reflection rather

molten galleon Jul 11, 2024, 11:20 PM

#

that's for my python scripts to do

left junco Jul 11, 2024, 11:20 PM

#

Aha smort

#

I forgot you were codegen-pilled

molten galleon Jul 11, 2024, 11:21 PM

#

behold the IDL for my UI classes

left junco Jul 11, 2024, 11:22 PM

#

left junco I forgot you were codegen-pilled

A phrase which my phone suggests to me in auto complete I might add lmao

livid patrol Jul 11, 2024, 11:26 PM

#

molten galleon I didn't know blender was ECS-like in any way

By "components" I just mean editable properties

#

Idk how blender is structured internally

molten galleon Jul 11, 2024, 11:28 PM

#

oh, yeah the explorer+properties dual-pane thing is pretty common once you start noticing it

#

I think it originates in VS actually, and it kinda makes sense that application developers would be most inspired by the tool they're using to make the app

#

"hmmm, I'm in the forms editor, but how should I make my application look?"

old harbor Jul 11, 2024, 11:31 PM

#

wpf makes this also super sexy

#

you design the control you want for a specific type, add it as a resource, and have a DataTemplateSelector select it based on the current object'type

#

its just a big switch block

tropic shoal Jul 12, 2024, 2:29 PM

#

Blender is more ecs base internally as the data model is mostly C and not cpp. The python api is a generated wrapper on top of the c data model.

molten galleon Jul 12, 2024, 5:11 PM

#

that's pretty interesting, also it does look like blender is keen on visually displaying "has a" relationships in the tree view

#

I hadn't really thought too deeply about it

molten galleon Jul 13, 2024, 12:03 AM

#

fuck I found a fun case 足ＩＫ.R

#

#

the I and K are full width (so funky shift-jis compatability stuff), but they register as latin so it selects the latin face instead of the CJK face

#

I think I might just fix this by having my CJK face be the lowest priority general fallback for my latin face

#

ok that did it

old harbor Jul 13, 2024, 10:19 AM

#

that explains why CJK people when writing latin text always have ugly looking ms sans serif stretched, wide text

molten galleon Jul 14, 2024, 1:36 PM

#

adding engine features vs. scrolling through famfamfam/fatcow icons to find the perfect ones

old harbor Jul 14, 2024, 1:47 PM

#

ah man that looks neat

#

famfamfam do be timeless

molten galleon Jul 25, 2024, 8:55 PM

#

in my recent bikeshed rabbithole I started NIHing an LSP client to get syntax highlighting, only to find out LSPs don't feed you basic token info

#

so now I have an LSP client and need to roll some kind of tokenizer and syntax highlighting provider on top of that

#

and I think I was supposed to be working on asset libraries or something...

left junco Jul 25, 2024, 9:13 PM

#

What's LSP

old harbor Jul 25, 2024, 9:17 PM

#

langauge service/server protocol

#

the thing which gives you syntax highlighting or intellisense in your vim or else where

left junco Jul 25, 2024, 9:19 PM

#

Ah

molten galleon Jul 25, 2024, 9:22 PM

#

it doesnt' give you syntax highlighting, but it does give you a lot, like go to declaration/definition, code lens, semantic tokens, etc.

old harbor Jul 25, 2024, 9:24 PM

#

it could though

#

LSPs usually run on a syntax tree

molten galleon Jul 25, 2024, 9:25 PM

#

https://github.com/Microsoft/language-server-protocol/issues/682
yeah, it could technically be a custom command on the server

#

but what I found in practice was that the document tokens request doesn't actually give you tokens with a fine enough grain on the range in text for highlighting

#

it seems like both document tokens and semantic tokens are for augmenting syntax highlighting in ways that simple regex passes from textmate grammars wouldn't

#

and the recommendations all basically point to flagging tokens with a preliminary highlighting pass then augmenting with semantic tokens

old harbor Jul 25, 2024, 9:45 PM

#

probably makes more sense to just regex the text you see right now in that editor view, and color that accordingly, rather than trying to color the whole 10mb worth of your current.cpp 🙂

molten galleon Jul 25, 2024, 9:53 PM

#

yeah, lsps support sending partial text and partial requests afaik

old harbor Jul 25, 2024, 10:18 PM

#

the asset system of yours will greatly benefit from it either way 😛

#

i will allow the side quest

molten galleon Sep 12, 2024, 5:50 PM

#

@balmy gulch

balmy gulch Sep 12, 2024, 5:50 PM

#

ah it was this KEKW

molten galleon Sep 18, 2024, 1:47 PM

#

all this text talk reminds me I need to finish my articles

old harbor Sep 18, 2024, 4:24 PM

#

thisqf

outer dune Sep 18, 2024, 4:36 PM

#

It would be amazing forgelove . You are literally working on stuff that very very few people work on and most take it for granted. Information is scarce and often times mixed (just looking around, the name "font" is used in so many different context for so many different people).

molten galleon Sep 18, 2024, 10:58 PM

#

https://github.com/forenoonwatch/forenoonwatch.github.io/blob/master/text/My Explorations in Text - Part 1.md
here is my rough draft of part1, too lazy to figure out hugo rn

#

tried to make it as short and direct as possible but have enough narrative structure that it wasn't just a dump of "WATCH OUT FOR THIS"

#

I'm kinda worried that it's perfectly clear to me but way too terse for anyone to actually find helpful lol

old harbor Sep 21, 2024, 11:13 AM

#

that screenshot of your ui thing which looks like a winforms knockoff, needs to go into a neat library eventually

#

so that we can make use of it too : >

outer dune Sep 25, 2024, 3:15 PM

#

I was checking in your library how you handle text line breaks, but I wasn't able to find it.
Any tips/pointer to make correct line breaks ?

molten galleon Sep 25, 2024, 3:30 PM

#

I piggyback off of SheenBidi here, this is in my unpublished part2

molten galleon Sep 25, 2024, 3:31 PM

#

outer dune I was checking in your library how you handle text line breaks, but I wasn't abl...

wait, do you mean soft or hard linebreaks

#

I didn't even write that part in my blog yet, actually

outer dune Sep 25, 2024, 3:57 PM

#

Well, I was looking for both. I will start by looking how sheenbidi works and what it does.
I'm looking at the whole concept here, soft/hard line break as well as hyphenation.

molten galleon Sep 25, 2024, 4:14 PM

#

basically everything is in build_layout_info_utf8.cpp

#

I rely on SheenBidi for hard breaks, (aka character-induced linebreaks from CR, LF, LSEP, PSEP), those are really easy to iterate manually because you just search for those 4 characters

#

but SheenBidi just does it, so I let it happen

#

for soft linebreaks, it's basically how you'd expect it to happen conceptually - you do all your shaping first, then you try to find the max run of characters such that width(run) <= textAreaWidth

#

but the trick is you use an icu::LineBreakIterator to find safepoints to break the line

#

so it's max(safe linebreak point) s.t. width <= textAreaWidth

outer dune Sep 25, 2024, 4:18 PM

#

Much much thanks 🙏
What is s.t ?

max(safe linebreak point) s.t. width <= textAreaWidth

molten galleon Sep 25, 2024, 4:18 PM

#

such that lol

outer dune Sep 25, 2024, 4:19 PM

#

Ah lol

#

So it could be done this way :

Shape the whole text through harfbuzz
Find the closest linebreak to textWidth thanks to ICU
Discard everything beyond into a new buffer
Re-Shape the line
Validate or re-iterate from 2
Go down a line through lineHeight found in the font metrics by scale
Restart from 2 for the rest of the text

#

For soft line breaks.
Hard lines breaks can be a pre-pass to that

#

I think I get the gist of it. I will look into your build_layout_info_utf8.cpp file 🙏 forgelove

molten galleon Sep 25, 2024, 4:25 PM

#

you don't reshape the line, the shaping data is always valid, you literally just cut it

#

that's the contract that the icu::LineBreakIterator helps you fulfill - it tells you based on unicode data what is visually safe to cut, so you never end up making a slice in a point that'd require reshaping

outer dune Sep 25, 2024, 4:35 PM

#

Oh wow

outer dune Oct 1, 2024, 8:42 PM

#

Just wanted to say a huge thanks for the pointers. 🙏
I finally got the text rendering at an acceptable level and I would not have succeed without your help.

molten galleon Oct 11, 2024, 9:47 AM

#

briefly back on the horse

#

#

https://drafts.csswg.org/css-text/#tab-size-property

#

maybe I'm the last to find this out, but apparently in text boxes the tab width is entirely based on visual metrics

#

so a 4 space tab aligns to sizeof(' ') not the 4th printable character position

#

#

and CSS gives you two options for it, either supplying the integer space count or specifying the size outright

#

notepad (or whatever text box it derives from) seems to have some kind of particular curve to the tab width in px as a function of the font size, I can't tell where it derives from, but maybe I measured wrong with sharex

#

it seems generally that it's ~6-6.25 pixels per font pt, so a 12 pt font has a 72px tab, an 8pt font has a 50px tab, a 72pt font has a 450px tab, with various values in between

#

I should probably graph it

#

#

it's also approximately, but not quite, 19 spaces

#

so essentially the tab step is applied after shaping, during the point where you sum the advances, you make the advance of any encountered \t chars equal to alignas(tabWidth) cursorX, however the slightly complicating part is that if you have to softbreak the line, I think you have to readjust the size of the tabs

outer dune Oct 11, 2024, 12:23 PM

#

Everytime I see a post in this thread now, I expect greatness and good findings 🙏
I believe if you have a soft break, in some cases that just considered part of the end of the line 1.
In fact it seems... harder to define ? I just tested in Notes. After a certain amount of tabs, it is displayed in line 2.

It seems to be if we are lower than maxWidth, allow one tab at maximum in the line (not visibly displayed) even if it gets bigger than maxWidth. If we are higher or equal maxWidth, 1 tab gets to the next line and is visibly displayed. This would currently break the display in my engine. facepalm

molten galleon Oct 11, 2024, 12:54 PM

#

the status of softbreaks as lines is its own thing

#

but in this context because tabs are a purely visual operation, softbreaks affect it because the tab length affects the length of each line

#

also 99% sure tabs will break your display by default because by default they resolve to a tofu

#

well not "break" but look ugly

molten galleon Oct 11, 2024, 7:45 PM

#

#

got myself worried for a second, but it looks like L1.1 enforces the fact that a bidi shuffle won't affect having to recalculate the dynamic width of tabs

left junco Oct 11, 2024, 7:47 PM

#

You could probably make a living off of this library lol

molten galleon Oct 11, 2024, 7:48 PM

#

I'm going to make the much smarter decision and use this technology solely for textboxes in a game engine that will never ship a game

left junco Oct 11, 2024, 7:49 PM

#

Based

molten galleon Oct 11, 2024, 7:51 PM

#

also idk why I didn't think to do it before but I've also got the CSS spec open

outer dune Oct 11, 2024, 9:11 PM

#

Yeah the best implementation AFAIK is still CSS spec (so far that I found)

molten galleon Oct 12, 2024, 12:06 AM

#

uh oh, codebase is calling in my technical debt

#

I can't properly accumulate line-space offsets to do tabs because of my shitty combination of bases I'm working in, I basically have some data buffers in whole-string space, paragraph space, logical run space, and some redundant representations of glyph widths

#

mainly because I started from ICU layoutex and I wanted to make sure I didn't fuck up the conversion to utf-8

#

but now I think it's refactor and rewrite time

molten galleon Oct 12, 2024, 10:50 AM

#

#

I like how the CSS spec complexity is literally split up in levels

#

I wonder where the spec for text editing is, maybe it's in the html spec

#

https://www.w3.org/2021/06/web-editing-wg-charter.html
this might be something

#

I think I just don't know enough about how you define a text box in HTML+CSS to know who'd be responsible for specifying it, I found the html spec for input type="text" but it doesn't specify the actual input controls, and the CSS specs seem to only cover display

molten galleon Oct 12, 2024, 11:25 AM

#

I guess the actual editing is left as an exercise to the implementer because individual stuff like the cursor is roughly spec'd but I guess it covers for onscreen vs. physical keyboard

#

or other funky input methods

molten galleon Oct 13, 2024, 1:45 PM

#

hmm I really left a big stinky for myself, I didn't document the directionality any of my data is expected to be in so now I have to figure out what steps need what data in what order

#

my final layout info is strictly in (post-UBA) visual order, the data from shaping (which HB gives you in visual order in logical-run-space) I homogenize into logical order (following the order of the string) and then potentially reverse again into visual order based on the bidi level

#

the position data (float) stays forever in visual order without reversal but I currently generate a list of widths (in 26.6 fixed) in visual order that I want to get rid of but that might mean doing something funky like storing my secondary axis in visual order and passing it through to the end, but storing my primary axis in 26.6 fixed in logical order so I can do line breaks and then flipping it during the final run write

#

either way I need to document this mess...

molten galleon Oct 17, 2024, 5:48 PM

#

did you know there's an hb-cplusplus.hh harfbuzz header with a unique_ptr for the buffers?

#

I've been obsessing over how to minimize the number of copies/reversals I need to do in total but maintain my logic, yet it seems like firefox calls hb_buffer_reverse to throw things into logical order... and I can't seem to tell what chromium does because they definitely don't call that, and I don't see them doing much directional branching

outer dune Oct 17, 2024, 6:10 PM

#

That is interesting!

molten galleon Oct 17, 2024, 6:13 PM

#

I also need to set up some benchmarks for conditional reverse iteration and binary search

#

I think I remember from that andrei alexandrescu video that binary search is slower up to like 4 or 5 digit arrays due to the power of cache

#

could be completely missing the mark with that estimate though

molten galleon Oct 18, 2024, 6:52 PM

#

hmm, this might be a perf neutral thing but I think I'm gonna give up on this logic, despite having gotten it to work partially successfully

#

I have decided to say fuck people who use RTL languages, you will pay the price of an O(n) reversal of 32 bytes/glyph

#

and by perf neutral I mean "so far it's been inconclusive in the noise of my profiling" which may or may not have shit methodology

outer dune Oct 18, 2024, 8:03 PM

#

I don't know if there is a lorem ipsum for RTL but that would be a good test. Like you, I expect that to be negligible(until it isn't). You are doing one search that is O(n) right ? That should be fine.

molten galleon Oct 18, 2024, 8:05 PM

#

I just generate random mixture strings at roughly 50/50 distribution but realistically RTL langs are wayyy less common

#

the problem is basically I have 2 choices:

convert everything to LTR order, perform logic on it, convert it back when necessary
keep everything in native order but have a bunch of branching logic that looks like if (RTL) { ... } else { ... } but only have to do reversals during the really rare case where logicalRunDir != visualRunDir

old harbor Oct 18, 2024, 8:08 PM

#

.mυɿodɒl Ɉƨɘ bi minɒ Ɉillom Ɉnυɿɘƨɘb ɒiɔiʇʇo iυp ɒqlυɔ ni Ɉnυƨ ,Ɉnɘbioɿq non ɈɒɈɒbiqυɔ Ɉɒɔɘɒɔɔo Ɉniƨ ɿυɘɈqɘɔxƎ .ɿυɈɒiɿɒq ɒllυn Ɉɒiϱυʇ υɘ ɘɿolob mυlliɔ ɘƨƨɘ Ɉilɘv ɘɈɒɈqυlov ni Ɉiɿɘbnɘʜɘɿqɘɿ ni ɿolob ɘɿυɿi ɘɈυɒ ƨiυႧ .Ɉɒυpɘƨnoɔ obommoɔ ɒɘ xɘ qiυpilɒ Ɉυ iƨin ƨiɿodɒl oɔmɒllυ noiɈɒɈiɔɿɘxɘ bυɿɈƨon ƨiυp ,mɒinɘv minim bɒ minɘ ɈU .ɒυpilɒ ɒnϱɒm ɘɿolob Ɉɘ ɘɿodɒl Ɉυ Ɉnυbibiɔni ɿoqmɘɈ bomƨυiɘ ob bɘƨ ,Ɉilɘ ϱniɔƨiqibɒ ɿυɈɘɈɔɘƨnoɔ ,Ɉɘmɒ Ɉiƨ ɿolob mυƨqi mɘɿo⅃
``` there you go @outer dune

outer dune Oct 18, 2024, 8:08 PM

#

What.The.Hell

molten galleon Oct 18, 2024, 8:08 PM

#

what even is that language

old harbor Oct 18, 2024, 8:09 PM

#

lorem upsom mirrored

outer dune Oct 18, 2024, 8:09 PM

#

Is that parsed RTL ? I 99% sure it's LTR mirrored

molten galleon Oct 18, 2024, 8:09 PM

#

it looks like all LTR glyphs to me

#

the only RTL languages in use today are arabic and hebrew

old harbor Oct 18, 2024, 8:09 PM

#

this was just me do a little trolling, found a mirror text generator and pasted in lorem ipsum

molten galleon Oct 18, 2024, 8:09 PM

#

then a bunch of ancient/obscure languages that are just technically listed

old harbor Oct 18, 2024, 8:11 PM

#

yeah there are only like 8 or so common languages which are RTL by default

outer dune Oct 18, 2024, 8:12 PM

#

I would go option 1. Make everything LTR and convert back if necessary. The other option will introduce a lot of conditionnal branching I believe.

#

Which might be a pain for maintenance later on

molten galleon Oct 18, 2024, 9:26 PM

#

yeah I think I'm gonna stick with option 1 since that makes the code simplest

#

option 2 gives me literally double the code in certain places despite being technically slightly better

#

plus this is microscopic in terms of execution time, most of it comes from allocation (that I can improve)

#

and the rest comes from hb shaping, but that cost is basically immutable

molten galleon Oct 19, 2024, 12:22 PM

#

I think the best data model for positions might just be to only store widths, in logical order

#

the offset0 for each logical run

#

though the more I test the less I'm sure I need to care about that

old harbor Oct 19, 2024, 12:24 PM

#

then what do you need to care about?

#

if not widths and offsets

molten galleon Oct 19, 2024, 12:24 PM

#

just the widths, in aggregate

old harbor Oct 19, 2024, 12:24 PM

#

ah

molten galleon Oct 19, 2024, 12:24 PM

#

basically, (and the more I investigate this the less I'm sure about it)

#

the concept of separating offset and advance is to homogenize layout math over vertical and horizontal text

#

and the aggregate offsetX + cursorX (where cursorX is the sum of all advances 0..n-1) is the only critical position for each glyph

old harbor Oct 19, 2024, 12:25 PM

#

you could pick the one right now which your belly tells to use

#

also the one which might be more readable (code/shader wise)

molten galleon Oct 19, 2024, 12:26 PM

#

I'm picking widths because it makes tabs the easiest to implement

#

this is internal representation while I'm building the text, the final object stores positions (since that makes rendering the fastest/simplest)

#

because when you're dealing with tabs you have to basically apply realWidth = isTab ? alignUpToNext(lineWidthSoFar, tabWidth) : width

old harbor Oct 19, 2024, 12:27 PM

#

yeah

molten galleon Oct 19, 2024, 12:30 PM

#

but the more I think about it, the LayoutEx code I referenced originally already takes the assumption that offset0 = 0 because otherwise accumulating widths wouldn't even get you the line width

#

and also a line split down the middle assumes that the offset/advance of the glyph - 1 doesn't matter

#

#

I think I'm gonna have to store them heterogenously based on what's the primary and secondary axis

#

because basically these widths are what line breaking (and your cursor position) care about

#

and the green arrows are the offset on your secondary axis, which I guess I could also store as widths but I definitely would have to store o0 separately in that case and it would effectively just be difference encoded

molten galleon Oct 20, 2024, 1:11 PM

#

ok epic, having a hybrid width/position data model was definitely the way to go, actually simplified my code a bit

#

and now I have no redundant position/width representations in the data, at my goal of 16 bytes per glyph

#

which is like the theoretical minimum I can achieve uncompressed/unquantized, and doesn't count the memory required for stuff that doesn't scale directly with glyph/character count

#

and that raises some pretty interesting stuff to think about when you consider absolutely huge blocks of text, since they require at least 16x the memory to display (and I am being more efficient with my redundant data than chromium or firefox for sure)

#

so rendering absolutely fuckhuge text files (obviously) requires some pretty aggressive culling, and in turn some pretty efficient scans/heuristics to determine what you wanna display in the viewport

#

and I wonder what you'd do there with soft line breaks enabled, since just scanning the string for newlines up front wouldn't necessarily give you accurate positioning for what your viewport is looking at

old harbor Oct 20, 2024, 1:27 PM

#

isnt it just counting

#

you always know how many characters fit in your viewport

#

and the viewport shows a span of text

molten galleon Oct 20, 2024, 1:31 PM

#

old harbor you always know how many characters fit in your viewport

not necessarily because they all have different glyph sizes

#

and some aren't even visible

#

imagine the world's most adversarial 10GB logfile where the first GB is just several billion ZWNJ

#

so line 0 starts over 1gb into the file

#

﷽

#

𒐫

#

and imagine some of these

#

notepad++ knows the right thing to do though

old harbor Oct 20, 2024, 1:40 PM

#

ah i forgot about those things

#

hmm

#

you do seek within the bytestream of your text still

#

when you scroll through a file for example

#

then you read the span, and inteprete it somehow, and afterwards you know all the characters which are supposed to be visible and whichnt

#

hmm this isnt trivial 🙂

molten galleon Oct 20, 2024, 1:50 PM

#

old harbor then you read the span, and inteprete it somehow, and afterwards you know all th...

the problem there is that this is not just nontrivial, but requires A) dealing with variable sized utf8 encoded codepoints and B) is essentially data-driven since you need the unicode metadata on a codepoint's properties

#

the way I'd do it is just to populate it lazily, first by scanning the entire string for hard newlines (with SIMD + multithreading you could probably rip through several GB/second) and use that as the initial assumption, then actually try laying out whatever's currently supposed to be visible in your viewport

#

then use the softbreaks from that as an "adjustment"

#

but this also makes me wonder if I can use simdjson/simdutf8's code tricks + parsing UnicodeData.txt directly to do some kind of codegen for the theoretical most optimal text prescan possible, based on the actual unicode class queries I need

#

but that's a project for the far future

old harbor Oct 20, 2024, 2:08 PM

#

man, a true brainworm hole

molten galleon Oct 20, 2024, 3:39 PM

#

yeah Big Text™️ is its own separate rabbithole on top of regular small string rendering

left junco Oct 20, 2024, 5:28 PM

#

I'm gonna have to make a mesopotamian faction whose soldiers talk in cuneiform

molten galleon Oct 20, 2024, 5:40 PM

#

the dispute over low quality copper has gone violent

molten galleon Oct 21, 2024, 11:29 PM

#

replaced building vectors and iterating them with iterators for all of my logical run building, benchmark result: inconclusive

#

however I know in my heart that like 8 less std::vectors get allocated per string, and that is enough to bring me joy

#

need to probably run profiling with MSVC so I can get the cool heatmap on the code and see what it tells me beyond the obvious hb_shape and "why are you allocating a hb_buffer and icu::LineBreakIterator per string shaping?"

molten galleon Oct 22, 2024, 10:26 AM

#

I see why no one bothers to make nontrivial optimizations to anything outside of shaping now, lmao

#

#

something minorly stupid and annoying though is that calling icu::Locale::getDefault() too frequently has a tangible cost

#

though other than that, the main thing that registers outside of hb_shape (as like 3% of total time lol) is the allocation costs of SheenBidi

#

actually this was a bad test, since the profiling grouped all sizes of string (up to 1MB) into the same calculation

#

for my smallest test (string size = 64) hb_shape is 'only' 71.81% of the execution time

#

but as expected this only magnifies allocation snafus, e.g. I don't do any reserve calls on my final layout struct, but I do for my temp build vectors, and as such there's a huge cost of allocation from appending positions and glyphs

#

the allocation for the icu::LineBreakIterator is also significantly more expensive than the SBParagraph and hb_buffer, but that's fine because I can factor it out, and seemingly using it is mostly fine

#

and interestingly some attrition from using UText in computing subfonts instead of a dedicated UTF-8/16 iterator

molten galleon Oct 22, 2024, 12:45 PM

#

profiling even shorter strings basically tells me what I expected - initialization and allocation dominates even harder and small buffer optimization (and finally not being lazy and factoring out the reusable stuff) will get me basically the only meaningful wins I can conceivably get

#

however... will I eventually try to create some kind of simd version of sheenbidi just so I can say I beat it in benchmarks? perhaps

#

and hb_shape in turn is basically bottlenecked by however quickly I can read glyphs

#

so, harfbuzz is itself pretty impressive in its optimization

outer dune Oct 22, 2024, 1:04 PM

#

I'm happy to hear that I found the same issue. The bottleneck is also hb_shape on my end.

#

And I can't do much in there (or I don't want to look inside)

#

The other one I found was UText and LineBreakIterator that I heavily re-use when possible.

molten galleon Oct 22, 2024, 1:08 PM

#

yeah, though luckily it's the LineBreakIterator allocation that's the majority of it, not even setText or preceding

#

finding sub-fonts is the only place I use a UText where I don't need to, just because I wanted to unify my code for handling UTF-16 for doing regression testing against ICU

#

but now that I'm confident I don't need to use ICU's stuff anymore I can just drop that

#

but UText is the best interface you have for the break iterators, and it's fine enough there

molten galleon Oct 22, 2024, 3:11 PM

#

with the icu::LineBreakIterator creation factored out, hb_shape is about 80% of an 8 char string, with a scant few % (5 or less) in allocations for appending final data, get_sub_font, and SBAlgorithmCreateParagraph

#

still like 13% speedup just from getting rid of it lol

molten galleon Oct 22, 2024, 9:47 PM

#

ack

#

yet another (several) annoyances I didn't account for

#

SBParagraphDefaultLTR/RTL doesn't seem to correctly swap the run dir (I assume because these characters are strongly LTR), so to specify what I think are level overrides in ICU I need to explicitly specify base paragraph levels of 0/1 (equivalent to CSS direction: ltr | rtl)
harfbuzz is actually smarter than I thought for my own good, and actually resolves glyph mirroring for me, however I think this ends up fine overall because the selection I posted above resolves to llo>, which is correct

#

#1 is a tiny bit annoying because the UBA spec has no definition for what a "default" level is supposed to mean...

#

#

#

ah

molten galleon Oct 23, 2024, 10:17 PM

#

#

big data model refactor is done, L4 character mirroring is confirmed to be a non-issue, directionality overrides work, and tabs now work, also fixed an issue with string-terminating newlines

#

though what I'm seeing here is that I need to revisit subfont selection because it's able to use Twemoji for the numbers and that gets on my nerves

old harbor Oct 23, 2024, 10:32 PM

#

can you render composite eomjis too?

#

like a female black vampire

#

i think thats vampire+darkskintone color + zwj + female sign

#

ah yeah

#

https://emojipedia.org/woman-vampire-dark-skin-tone

Emojipedia

🧛🏿‍♀️ Woman Vampire: Dark Skin Tone Emoji

molten galleon Oct 23, 2024, 10:38 PM

#

#

yeah I can

old harbor Oct 23, 2024, 10:40 PM

#

where is the female coloured vampire? 😄

molten galleon Oct 23, 2024, 10:40 PM

#

#

didn't realize you sent a specific emoji lol

old harbor Oct 23, 2024, 10:40 PM

#

hehe

#

that is cool

molten galleon Oct 23, 2024, 10:41 PM

#

glad I added clipboard support

old harbor Oct 23, 2024, 10:41 PM

#

data point magic

molten galleon Oct 23, 2024, 10:46 PM

#

new knowledge has decided to reveal itself to me:
https://simoncozens.github.io/a-duffers-guide-to-fontconfig-and-harfbuzz/
https://gist.github.com/CrendKing/c162f5a16507d2163d58ee0cf542e695

#

also the spec for how CSS picks fonts https://www.w3.org/TR/css-fonts-4/#font-matching-algorithm

#

but this is more of the macro scale, front-end of font picking where you match a named family to some file or files, not necessarily to figure out what file can render a substring

old harbor Oct 23, 2024, 11:07 PM

#

somehow fonts feel as complex as operating systems, and browsers to me

#

AAA engines are nothing compared to that either hehe

molten galleon Oct 23, 2024, 11:18 PM

#

making text "just work" is a huge part of browser magic

#

and OS magic, for that matter

balmy gulch Oct 23, 2024, 11:27 PM

#

this is epic, but how complex of text can you render without dying?

molten galleon Oct 23, 2024, 11:30 PM

#

idk, hypothetically anything

#

my coverage is pretty good, I can do emoji, I have inline markup for underlines, italic, bold, smallcaps, subscript, superscript (with synthetic glyph gen for all those if they're not available in any font files)

#

emoji with any joiners, zalgo text

#

I render tabs now

#

the single biggest thing I'm missing atm is vertical text, and I've been working up to it in my latest refactor

balmy gulch Oct 23, 2024, 11:32 PM

#

nice

#

how do emojis work btw

#

are they like regular glyphs with color or something else

#

because if they are regular glyphs then they must have a whole lot of codepoints innit?

molten galleon Oct 23, 2024, 11:33 PM

#

molten galleon

yeah stuff like this is 4? codepoints

#

#

5

balmy gulch Oct 23, 2024, 11:34 PM

#

no I mean

molten galleon Oct 23, 2024, 11:34 PM

#

the first 2 are U+1xxxx which means they're 4 bytes I think

balmy gulch Oct 23, 2024, 11:34 PM

#

points in the glyph

#

the actual unicode scalars you use to bezier the thingy

#

or is that not how this works

#

(my understanding is extremely limited)

molten galleon Oct 23, 2024, 11:35 PM

#

oh yeah probably, atm I have freetype rasterize them precomposed because I've been too lazy to deal with too many actual rendering techniques

#

since I've been focusing mostly on layout

#

but you can also have freetype give you them as mono-color layers

balmy gulch Oct 23, 2024, 11:35 PM

#

smh can't believe you didn't make harfbuzz and freetype from scratch

molten galleon Oct 23, 2024, 11:35 PM

#

(really you get the outline 1 layer at a time + a color and you have to rasterize it yourself either through freetype or some other way)

#

the thing I always say is that it's amazing how much you still have to do even if you grab all the libs you need from the start

balmy gulch Oct 23, 2024, 11:36 PM

#

but yeah cool stuff

#

yeah text is a lot of work

molten galleon Oct 23, 2024, 11:36 PM

#

I literally have 4 giant dependencies and there's a ton I need to do on top of that

#

freetype+harfbuzz+icu+sheenbidi (which I technically wouldn't need if I wasn't stubborn about making everything work natively in utf-8 without utf-16 conversion)

#

it gets a shade more annoying because I have to make sure the data I generate from layout is optimal and reconciles with being able to use it for text editing

#

stuff like, e.g. pango is pretty full-featured but read only afaik

#

just shits finalized pixels to a render target

molten galleon Oct 24, 2024, 9:53 AM

#

a simple but long awaited change

molten galleon Oct 25, 2024, 10:34 AM

#

https://developer.mozilla.org/en-US/docs/Web/CSS/text-orientation
https://developer.mozilla.org/en-US/docs/Web/CSS/writing-mode

https://drafts.csswg.org/css-writing-modes/#text-flow

molten galleon Oct 25, 2024, 12:09 PM

#

#

nervous

outer dune Oct 25, 2024, 12:29 PM

#

Oh god that’s horrible to implement

molten galleon Oct 25, 2024, 12:33 PM

#

yeah, you have to split the string into first char and the rest, use the base metrics of the text (line height and ascender) to figure out the glyph metrics of the first char, shape the first char, then shape the rest of the text in 2 pieces, determining the split where it falls off the first block

#

luckily this is way, way out of scope

#

also hyphenated word breaking

molten galleon Oct 25, 2024, 1:42 PM

#

straight up can't find what firefox uses for vertical line height (or I guess line width, as it'd be in this case), I suspect they might just use the horizontal metrics

#

meanwhile

#

in chromium:

left junco Oct 25, 2024, 6:18 PM

#

outer dune Oh god that’s horrible to implement

But so so sexy

molten galleon Oct 25, 2024, 7:21 PM

#

So it looks like what you're supposed to do, roughly speaking, is hit the TrueType vhea table for vertical ascent/descent https://learn.microsoft.com/en-us/typography/opentype/otspec180/vhea
and failing that, CJK fonts may provide a vertical advance via the OS/2 typographic ascender/descender, which itself is a little rabbit hole because it briefly covers the OTF concept of the "ideographic em-box", which luckily I can just skip the fancy stuff for because its secondary metrics are just em-width
https://learn.microsoft.com/en-us/typography/opentype/spec/recom#tad
https://learn.microsoft.com/en-us/typography/opentype/spec/baselinetags#ideoembox

#

#

whoever wrote the FreeType docs decided to copy paste this line for TT_HoriHeader and TT_VertHeader which sent me on a bit of a wild goose chase because this is actually true for horizontal metrics, and in CJK fonts the typographic ascender/descender affects the vertical ascent metrics, just not the secondary axis

outer dune Oct 25, 2024, 8:24 PM

#

What's the fallback on fonts that do not support the field ?
Can't you just assume "easily" assume a sized glyph which is offsetted by it's normal height * size increment ?
Tho, it will affect the following lines x offset... UGH. I don't even know what it gives in Kanji for Top to Bottom lines.

molten galleon Oct 25, 2024, 8:25 PM

#

I'm thinking either you use the line height as the line width, or just the em-width

#

what I'm going to do is find a font that does have vertical metrics and see which value is generally closer

outer dune Oct 25, 2024, 8:26 PM

#

KEKW it breaks some renderers as expected

molten galleon Oct 25, 2024, 8:27 PM

#

is that a browser

outer dune Oct 25, 2024, 8:27 PM

#

Yes, but it's a little bit of a hack, I'm using <span> instead of css ::first-letter

#

Works great for Latin caps

#

Won't work for those with y-bearing though

molten galleon Oct 25, 2024, 8:31 PM

#

outer dune <:KEKW:666849321462792234> it breaks some renderers as expected

I think this is technically something I handle better than CSS can

#

at least if you use spans, because I think CSS can't use the spans as context for a line (but I may be wrong)

#

but the ascender and descender of my lines are max(ascender) and min(descender), so the large glyphs stay on the same baseline, at least

#

they look dumb, but don't collide

outer dune Oct 25, 2024, 8:32 PM

#

I'm checking but :initial-letter is not fully supported on all browser. ::first-letter is though.

#

Yeah, CSS prioritise line-space to be the same. I have the same result as you (bigger gaps between baseline) in my engine

#

At least for <span>, it seems

outer dune Oct 25, 2024, 8:34 PM

#

left junco But so so sexy

I'm getting convinced by that, but I have no usage in-game except in the context of an old letter

left junco Oct 25, 2024, 8:34 PM

#

molten galleon I think this is technically something I handle better than CSS can

molten galleon Oct 25, 2024, 8:34 PM

#

MS word also does this

outer dune Oct 25, 2024, 8:35 PM

#

Yeah, I think it's CSS being dumb (Firefox implementation) 🤔

molten galleon Oct 25, 2024, 8:35 PM

#

you should try it in a chromium based browser and see what's different

outer dune Oct 25, 2024, 8:36 PM

#

Chrome KEKW

#

Somebody stole someone else implementation
"Lemme copy your code, I will change it I swear"

molten galleon Oct 25, 2024, 8:38 PM

#

here's a snippet from the firefox source

#

everyone is copying everyone else because there's no unified body of work for how text should be rendered

#

with the exceptions of tidbits explicitly in the unicode or CSS spec

#

which isn't much for either

molten galleon Oct 25, 2024, 8:41 PM

#

outer dune I'm getting convinced by that, but I have no usage in-game except in the context...

I'm tempted to eventually at least add a path for people to be able to do this manually if they really try, maybe by adding some polymorphism to the layout builder

#

because mostly the biggest change is customizing the linebreak logic to break at a different width for the first n lines

outer dune Oct 25, 2024, 8:43 PM

#

molten galleon here's a snippet from the firefox source

This is amazingly funny, so now we know Pango, Firefox and Chrome are wrong IMO.

molten galleon Oct 25, 2024, 8:44 PM

#

it's hard to say pango is wrong because idk what they did and didn't copy, I should honestly clone pango and snoop in it as well

#

I think a lot of linux distros use pango as the OS text display

#

at the very least, I believe gtk uses it and thus everything that uses gtk probably does too

outer dune Oct 25, 2024, 8:46 PM

#

Yes they do use pango in most (all?) Linux distro. I guess not a lot of people uses that part of it (if it's wrong in pango).
It's quite rare to see such usage anyway 🤔

molten galleon Oct 25, 2024, 9:36 PM

#

pango calls writing mode gravity lol

molten galleon Oct 26, 2024, 11:05 AM

#

#

so it looks like generally non-CJK fonts don't have vhea, but the vertical fallback for non-CJK fonts is the standard ascender/descender, not the emwidth or typographic one

#

I should actually measure what the browsers choose for CJK line widths, because based on this (the metrics from Noto Sans CJK jp) they spec vertical line height as = to emwidth

molten galleon Oct 26, 2024, 7:20 PM

#

yet another minor annoyance about vertical text is that 0 on the Y causes the offset of the glyph to end up above the line, so you still need the normal ascent either way

old harbor Nov 30, 2024, 7:51 PM

#

am i stupid or is discord too stupid to render arabeeque?

#

i wanted this

#

it fails to render الْحَكَم

#

uh

#

wait

livid patrol Nov 30, 2024, 7:53 PM

#

gm habibi

old harbor Nov 30, 2024, 7:53 PM

#

🇬🇲 habibi 🙂

#

discord nick length is fixed : ( it cut off

#

it cut "the ruler" off hehe

livid patrol Nov 30, 2024, 7:54 PM

#

Can you type something again

old harbor Nov 30, 2024, 7:54 PM

#

something again

livid patrol Nov 30, 2024, 7:54 PM

#

You did it too quick

#

I'm trying to get the "x is typing" thing

#

Ty

old harbor Nov 30, 2024, 7:55 PM

#

ahahsdjalkjalksjdlkjaskljlkjdflksdjfklwjerlkwjlksdjfkljasdkljlkwjerkljsldkfjslkdfjlkjwerkljlksdjflkjsdfkjlkasjdlkjsdflksjdflkjsdflkjslkdjflksjdflkjslkdjflksjdflksjdflkjsldkfjlskjdf

#

😛

livid patrol Nov 30, 2024, 7:55 PM

#

old harbor Nov 30, 2024, 7:55 PM

#

ja it cut "the ruler" off

#

because the name is too long

livid patrol Nov 30, 2024, 7:55 PM

#

mixed rtl and ltr is funny

old harbor Nov 30, 2024, 7:55 PM

#

heh it do be

#

Abd al-Malik

#

that does look neat on the dark background

molten galleon Nov 30, 2024, 7:57 PM

#

this is what I see

old harbor Nov 30, 2024, 7:58 PM

#

and with the larger fontsize

livid patrol Nov 30, 2024, 7:58 PM

#

It do

old harbor Nov 30, 2024, 7:58 PM

#

yep i oh

#

molten galleon Nov 30, 2024, 7:59 PM

#

ah neat its arabic combining characters

#

old harbor Nov 30, 2024, 8:08 PM

#

interesting

#

i also like how United Arab Emirates uses those more straight looking characters for their stuff

#

like some sort of monospace 🙂

#

"Allah"

#

almost looks like some simple power tower in arabic heh

#

latin alphabet is so boring

#

do you guys understand the word sprachbund?

#

thats english apparently

livid patrol Nov 30, 2024, 8:22 PM

#

no

#

"a linguistic area"

old harbor Nov 30, 2024, 8:26 PM

#

i should have studied languages

#

rather than doing computer bs 🙂

molten galleon Nov 30, 2024, 9:05 PM

#

for some reason linguistics attracts programmers

old harbor Nov 30, 2024, 9:14 PM

#

: )

outer dune Nov 30, 2024, 11:07 PM

#

It's because we are just typing and talking about text all day frogs

livid patrol Nov 30, 2024, 11:55 PM

#

C++ is just another lingu to istic

molten galleon Jun 1, 2025, 7:29 PM

#

ok time to wake this thread up for some new features

#

so I was thinking how in order to append the results of layouts together (e.g. if you wanted to insert another GUI in the middle of your text, like I do), you would need to provide an initial offset to the first line to join up where the last text block ended... so I thought, "why not allow the params to have an offset and count so you can implement drop caps", and then I thought "wow it'd be crazy and overengineered if I just let you pass a callback to dynamically resize the lines"

so that's what I did

#

#

so now drop-caps are totally possible

#

and then following that up, I added text truncation (i.e. stop laying out if you go out of bounds), which I needed for a different feature, but the combo of text truncation and line resizing means that you can actually relayout text around obstacles very dynamically

#

(imagine I actually put the big E here )

#

that little implementation is a bit buggy because I half assed it for a proof of concept

#

but the important part is that I can

left junco Jun 1, 2025, 10:20 PM

#

insane stuff

#

We need to think up some more exotic typesetting behavior that games might benefit from

molten galleon Jun 2, 2025, 12:29 AM

#

when I was putting this together I could only come up with 2 things it reminded me of: greeting cards (where you have text conforming around some clipart) and newspapers with images slapped between the text

#

but my actual usecase is driven primarily by 2 things, 1 is embedding frogapprove images in frognant chat without needing a whole-ass typeface for them, and the other is things like this

#

where you have dynamic controls in your text

molten galleon Jun 3, 2025, 8:09 PM

#

one more feature in this series: dummy blocks

#

I considered this a while ago, but basically I reserved a special font family that's not the invalid family, which gets special-cased to get skipped in layout and rendering, and I can provide an extra width and height for

#

its function sort of overlaps with setting the line offsets/limits/truncation, but it's much more specifically the sticker/custom emoji feature, since it can behave like any other character with a little bit of bridging to the editor features

#

and of course the rects are drawn in afterwards, and can be anything (more UI controls, images, etc)

#

it also behaves a little bit more like a microsoft word image insertion into text, rather than a tool for morphing text around a first-class asset

molten galleon Aug 13, 2025, 9:01 PM

#

So I tried out RmlUi for the first time just now, I have it scoped out for a secret project

#Rich Text - Uncreatively Named Text Handling Prototype Tool