J40 discussion | JPEG XL | Page 1

ruby fable Sep 15, 2022, 7:41 PM

#

https://github.com/lifthrasiir/j40
Older discussions are around #off-topic.

GitHub

GitHub - lifthrasiir/j40: J40: Independent, self-contained JPEG XL ...

J40: Independent, self-contained JPEG XL decoder. Contribute to lifthrasiir/j40 development by creating an account on GitHub.

#

so, after wasting one full day with build systems, my next topic is probably a proper render system

#

essentially I would need something similar to libjxl's render_pipeline 😮

unkempt knot Sep 15, 2022, 7:49 PM

#

Do you want to be able to do streaming decode with progressive rendering and all that? Because if not, things are quite a bit simpler...

ruby fable Sep 15, 2022, 7:51 PM

#

I already have lots of different "planes" (typed bitmaps, really) around, so I figured out that I already need some sort of pipeline anyway

#

for now, J40 decodes global modular channels (if any) and produces per-group XYB samples that are immediately converted to sRGB before being integrated into the aforementioned global modular channels

#

there are multiple issues with this architecture. to name a few: J40 in VarDCT always creates u16 planes for color channels, while ideally they should be f32

#

and sRGB conversion should happen a lot later

#

honestly speaking, the point of having a pipeline is that I can easily verify and adjust how planes are combined and converted

#

that's the main goal of the pipeline, progressive rendering is a nice bonus but ultimately a side effect 🙂

unkempt knot Sep 15, 2022, 7:55 PM

#

I think you're probably right

#

@minor lark is the render pipeline architect in libjxl so he might have some tips...

ruby fable Sep 15, 2022, 7:56 PM

#

I had the following in my multi-month-old notes:

render ops:
J40_OP_LOAD planeref (-- plane)
J40_OP_COPY x y sx sy (dst src -- dst)
J40_OP_IDCT coeffref (-- plane)
J40_OP_UNSQUEEZEH (avg residu -- merged)
J40_OP_UNSQUEEZEV (avg residu -- merged)
J40_OP_UNRCT

#

as you can see this is a render graph encoded in forth-like IL
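
A minimal sketch of how a stack-based op list like the one in the notes could be interpreted. The op names mirror the notes, but `op_t`, the integer plane handles, and `run_ops` are hypothetical and not taken from J40. The sketch only tracks which plane handles sit on the stack and does not touch any pixels.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of the ops from the notes; the semantics are assumptions. */
enum { OP_LOAD, OP_COPY, OP_UNSQUEEZEH };

typedef struct { int code; int arg; } op_t;

#define STACK_MAX 16

/* Runs `n` ops over a stack of integer plane handles.
 * Returns the final stack depth, or -1 on underflow, overflow or unknown op.
 * `next_plane` is the first handle handed out to planes created by an op. */
int run_ops(const op_t *ops, size_t n, int *stack, int next_plane) {
    int depth = 0;
    size_t i;
    assert(ops && stack);
    for (i = 0; i < n; ++i) {
        switch (ops[i].code) {
        case OP_LOAD:        /* (-- plane) */
            if (depth >= STACK_MAX) return -1;
            stack[depth++] = ops[i].arg;
            break;
        case OP_COPY:        /* (dst src -- dst) */
            if (depth < 2) return -1;
            --depth;         /* drop src, keep dst */
            break;
        case OP_UNSQUEEZEH:  /* (avg residu -- merged) */
            if (depth < 2) return -1;
            depth -= 2;
            stack[depth++] = next_plane++;
            break;
        default:
            return -1;
        }
    }
    return depth;
}
```

Checking the stack effects up front like this is one way to verify a render graph before any plane is allocated.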

#

I'm yet to fully figure out which ops would be needed and which ops can be axed

ruby fable Sep 16, 2022, 11:54 PM

#

today in J40: fuzzing! which means that I have really... a lot of things to fix.

ruby fable Sep 17, 2022, 9:24 PM

#

with fuzzing, I really feel the need for a testing architecture. I actually had an ad hoc testing infra using labrat https://github.com/squarewave/labrat but I quickly learned its limitations.

GitHub

GitHub - squarewave/labrat: Simple, single-file test harness for C/C++

Simple, single-file test harness for C/C++. Contribute to squarewave/labrat development by creating an account on GitHub.

#

one unorthodox possibility is to use Rust as a testing infrastructure (!)

prime grail Sep 18, 2022, 8:01 AM

#

https://news.ycombinator.com/item?id=32885203 front

ksec

J40: Independent, self-contained JPEG XL decoder

ruby fable Sep 18, 2022, 8:11 AM

#

oh thank you for noting that.

ruby fable Sep 18, 2022, 4:49 PM

#

today in j40: now pretty much every fuzzer outcome is panicking at // TODO midbits can overflow! comment in j40__hybrid_int, which means I need to do something with this...

ruby fable Sep 19, 2022, 12:43 AM

#

while not guaranteed, you can always file an issue.

#

to me the biggest blocker is that we don't have any freely available copy of ISO/IEC 18181-2

#

other aspects of container format can be inferred from libjxl code, but jbrd is complex

ruby fable Sep 19, 2022, 4:02 AM

#

ruby fable today in j40: now pretty much every fuzzer outcome is panicking at `// TODO midb...

my fuzzing process now regularly stalls at j40__icc, which is given a very large enc_size and zero-bit symbols :p

ruby fable Sep 19, 2022, 11:14 AM

#

okay, I polished and pushed most commits in my working copy

#

you can see how I usually work, I start by tackling a particular task, solve any side tasks as needed, split commits for ease of reading and/or reviewing and push a bulk of commits.

#

(this is a bad strategy when that task turned out to be much larger than imagined though, I would have to detect this to avoid stalling)

#

yes, lots of edge cases!

#

it shouldn't affect valid images, except for the very last commit which finally limits the input to the Main level 5 (API pending).

ruby fable Sep 20, 2022, 7:21 PM

#

today in j40: dealing with fallouts in MSVC.

ruby fable Sep 21, 2022, 2:57 AM

#

https://github.com/lifthrasiir/j40/commit/0ce79d31c833780238e76633c65a8bf9543a865f took a while but here it is.

GitHub

Add dj40-o0g target to Makefile, and support Windows directly. · li...

We emulate make in Windows using ninja, with some batch file hack
to allow both MSVC cl.exe and gcc/clang to be used with a simple
environment variable or argument (CC=msvc etc).

#

you can run make and get dj40.exe if you have Visual Studio (!)

#

possibly, I think it should be done via CI and this commit is a part of preparation

ruby fable Sep 21, 2022, 5:18 AM

#

okay, I've pushed all local commits (which took a while as I had tons of them in the queue)

#

and you can now fuzz this damn thing! well, not too long, my current corpus crashes after 500K iterations :p

#

(after fixing a lot of low-hanging fruits)

unkempt knot Sep 21, 2022, 6:12 AM

#

Of course robustness is always good to have, but maybe it's not super critical for the main use cases of j40 — which I imagine would be things like games that don't want to introduce a dependency on libjxl to decode game assets.

ruby fable Sep 21, 2022, 6:48 AM

#

Yeah, but it is far less than ideal that just a few minutes of fuzzing finds yet another bug.

#

And fuzzing does help for finding leaks, which would be equally important even for trusted inputs as well.

unkempt knot Sep 21, 2022, 7:00 AM

#

true

ruby fable Sep 21, 2022, 7:41 AM

#

the latest bug is confirmed to be a swapped grows vs. gcolumns (i.e. groupwise height and width). it turns out that my quick test corpus doesn't have any image that has more than 1 group and in which grows and gcolumns differ... facepalm
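
A minimal sketch of the kind of computation where rows and columns are easy to swap. `group_grid_t`, `group_grid` and `group_index` are hypothetical names, and the group size is a plain parameter here rather than anything read from a JPEG XL header.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint32_t rows, columns; } group_grid_t;

/* Number of groups needed to cover a width x height image with square
 * groups of `group_dim` pixels per side (ceiling division on each axis). */
group_grid_t group_grid(uint32_t width, uint32_t height, uint32_t group_dim) {
    group_grid_t g;
    assert(group_dim > 0);
    g.columns = width / group_dim + (width % group_dim != 0);   /* along x */
    g.rows = height / group_dim + (height % group_dim != 0);    /* along y */
    return g;
}

/* Row-major group index: the row is scaled by the column count. */
uint32_t group_index(group_grid_t g, uint32_t gy, uint32_t gx) {
    assert(gy < g.rows && gx < g.columns);
    return gy * g.columns + gx;
}
```

A swap only shows up on images where the two counts differ, so a test corpus needs at least one non-square multi-group image, for example 700x300 with 256-pixel groups.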

ruby fable Sep 21, 2022, 8:07 AM

#

yet another bug:

-         for (i = 0; i < 4; ++i) J40__TRY(j40__modular_channel(st, &m, i, sidx2));
+         for (i = 0; i < m.num_channels; ++i) J40__TRY(j40__modular_channel(st, &m, i, sidx2));

#

basically modular subimages can have transformations and can have a different number of channels to decode

#

yet, yet another bug or something else: palette with nb_colours == 0, is it even possible????

#

ah it is possible, because every palette index will be synthetic, oh okay.

#

so it's just J40 not handling this edge case.

unkempt knot Sep 21, 2022, 8:28 AM

#

yes, the default palette is quite useful: everything with index >= nb_colours maps to two color cubes, and everything with index < 0 maps to default delta palette entries

#

so you can actually encode a pretty nice image without specifying any custom palette colors

ruby fable Sep 21, 2022, 8:29 AM

#

now I have to deal with a possibility that those zero-width images can be further transformed 😛

unkempt knot Sep 21, 2022, 8:31 AM

#

oh right we still nominally insert a 0x0 palette channel in that case, just to keep the invariant that every palette transform adds one metachannel

#

empty channels do not end up getting encoded but you do need to take them into account for the channel indices

ruby fable Sep 21, 2022, 8:32 AM

#

for example there can be 3 palette metachannels and they can undergo RCT... which should be valid but may need a special handling

unkempt knot Sep 21, 2022, 8:32 AM

#

squeeze can also introduce empty channels btw

ruby fable Sep 21, 2022, 8:33 AM

#

yeah, a large enough number of transforms

#

and I think it will be hairier than other transforms

unkempt knot Sep 21, 2022, 8:35 AM

#

yeah rct on channel-palettes could actually be useful, current encoder doesn't do that but it would help a bit for the case where you e.g. have a 10-bit image encoded nominally as a 16-bit one where only 1024 sample values actually get used; in that case the encoder will currently produce 3 channel palettes that are identical but still encoded 3 times; doing an RCT on that would reduce it to 1 channel with some entropy and 2 channels that are just zeroes.

#

so we kept that possibility open in the spec even if the current encoder isn't doing it yet
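
A minimal sketch of the effect described above, using a simple reversible "subtract the first channel" transform. This is an illustration only and is not claimed to match any specific RCT type defined by JPEG XL; `rct_forward` and `rct_inverse` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Forward: keep channel 0, store channels 1 and 2 as differences from it. */
void rct_forward(const int32_t *c0, int32_t *c1, int32_t *c2, size_t n) {
    size_t i;
    assert(c0 && c1 && c2);
    for (i = 0; i < n; ++i) {
        c1[i] -= c0[i];
        c2[i] -= c0[i];
    }
}

/* Inverse: add channel 0 back, restoring the original samples exactly. */
void rct_inverse(const int32_t *c0, int32_t *c1, int32_t *c2, size_t n) {
    size_t i;
    assert(c0 && c1 && c2);
    for (i = 0; i < n; ++i) {
        c1[i] += c0[i];
        c2[i] += c0[i];
    }
}
```

With three identical channel palettes as input, the forward step leaves one channel with the actual values and two channels that are all zeroes, which an entropy coder can represent almost for free.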

#

but all the transforms (rct,palette and squeeze) have to operate either on metachannels or on real channels but not a mix of them

ruby fable Sep 21, 2022, 8:37 AM

#

I came to realize that the fuzzing corpus from J40 can be used against other implementations including libjxl

#

I guess libjxl already has a lot of them though

unkempt knot Sep 21, 2022, 8:37 AM

#

because that introduces too much weirdness: it's then no longer clear what channel is meta and what channel is nonmeta

#

yes, you could probably get a fuzzing corpus from libjxl to try it on j40 too

#

in libjxl we also have a funky way to make fuzzing more effective, which is to have a variant of the encoder/decoder where entropy coding is skipped and things are just raw bits that the fuzzer can flip directly — or at least that's how I remember it, @minor lark or @golden basin probably know more about this

ruby fable Sep 21, 2022, 8:44 AM

#

unkempt knot because that introduces too much weirdness: it's then no longer clear what chann...

yeah vshift and hshift are important to keep, otherwise we will have an ambiguity (was afk)

minor lark Sep 21, 2022, 1:02 PM

#

unkempt knot in libjxl we also have a funky way to make fuzzing more effective, which is to h...

nah, just using huffman with 8-bit symbols everywhere

#

(just for the corpus)

#

also removing the check for the final state

unkempt knot Sep 21, 2022, 1:13 PM

#

ah right, huffman with 8-bit symbols so it's actually still a valid bitstream, just poorly compressed and symbols nicely byte-aligned, right?

ruby fable Sep 21, 2022, 1:42 PM

#

Heh, I thought the thread is dead. Will reply after I get back home.

#

Spoiler: not that good I think 😉

#

I believe GCC is a bit more aggressive on autovectorization, which J40 heavily relies on

ruby fable Sep 24, 2022, 4:38 PM

#

today in j40: after a few days of fuzzing I've reached the point where I need a radically different corpus or source code modification to continue fuzzing, which seems like the perfect moment to stop active fuzzing 🙂

#

so far fuzzing covered almost all modular code but not much vardct code, as expected

#

I tried to manually put known vardct images to the corpus but that didn't help much, possibly because those inputs are too large

unkempt knot Sep 24, 2022, 5:31 PM

#

So what's next? Implementing the missing coding tools?

ruby fable Sep 24, 2022, 5:38 PM

#

for now, finishing up restoration filters is the highest priority and that probably involves rendering

ruby fable Sep 25, 2022, 5:43 AM

#

oh wow, it's based on ccgo, which is kinda unexpected

ruby fable Sep 26, 2022, 1:57 AM

#

today in j40: I'm populating the issue tracker, and here is an initial issue about the Rust version: https://github.com/lifthrasiir/j40/issues/10

GitHub

Rust version · Issue #10 · lifthrasiir/j40

As I noted in the orange site, it is a long-term goal to produce a parallel Rust version of J40. There are many unanswered questions before starting this process, however. Should we cooperate with ...

#

@boreal cairn will want to be pinged

boreal cairn Sep 26, 2022, 2:42 PM

#

thanks! I gave my 2 cents to the issue.

#

I don't know enough about J40 or JXL in general to help with the real details, but happy to work out some of the architectural kinks with you if you wanna chat about it. Obviously as soon as there is code I'm happy to contribute!

#

I didn't get to work on my own implementation in a long time now, I will 100% make time for this sooner or later, so either way I'm gonna either contribute to J40 or roll my own thing, whatever happens first 🙂

ruby fable Oct 2, 2022, 9:43 AM

#

back from a bit of vacation, playing too much Horizon: Zero Dawn (what year is this), thinking about rendering again

ruby fable Nov 3, 2022, 1:33 PM

#

I would say that libjxl will be lightweight in this case, because you absolutely need NEON-specific optimizations that are in libjxl

unkempt knot Nov 3, 2022, 2:03 PM

#

The ARM build of libjxl-dec is about 200kb iirc

unkempt knot Nov 3, 2022, 2:36 PM

#

Maybe has debug symbols still in there or something?

tender heath Jan 28, 2023, 6:44 PM

#

Freely suggested improvements to j40 makefile to increase portability and readability and brevity:

CFLAGS     = -O3 $(CFLAGS_WRN)
CFLAGS_DBG = -DJ40_DEBUG -g -Og $(CFLAGS_WRN)
CFLAGS_WRN = -W -Wall -Wconversion -Wc++-compat
LDFLAGS    = -lm
CLANG      = clang

dj40: dj40.c j40.h extra/stb_image_write.h
    $(CC) $(CFLAGS) $(LDFLAGS) -o $@ dj40.c

dj40-cxx: dj40.c j40.h extra/stb_image_write.h
    $(CC) -xc++ $(CFLAGS) $(LDFLAGS) -o $@ dj40.c

dj40-o0g: dj40.c j40.h extra/stb_image_write.h
    $(CC) $(CFLAGS_DBG) -fsanitize=address,undefined $(LDFLAGS) -o $@ dj40.c

j40-fuzz: extra/j40-fuzz.c j40.h
    $(CLANG) $(CFLAGS_DBG) -fsanitize=fuzzer,address,undefined $(LDFLAGS) -o $@ extra/j40-fuzz.c

#

It should be pretty much equivalent...

tender heath Jan 28, 2023, 7:59 PM

#

Since it's an all-in-one-file lib the makefile is probably not that important anyways.

unkempt knot Jul 16, 2023, 7:23 PM

#

@ruby fable any plans to resume work on j40 at some point?

ruby fable Jul 17, 2023, 12:14 AM

#

unkempt knot @ruby fable any plans to resume work on j40 at some point?

I hope so, for the last 6 months I didn't really have much energy to spare (not just about J40 but more generally) though

ruby fable Jul 19, 2023, 2:24 AM

#

@unkempt knot by the way, I think jxl-oxide already surpassed what I wanted to achieve by J40 and wonder if I should keep working on J40 as I originally intended

#

specifically there were two goals I had in mind: producing a complete reimplementation from the spec is one, more or less achieved by now (not by J40, of course)

#

the second goal was to provide a minimal ground for working with JPEG XL

#

the minimal ground here means, for example, a test suite completely independent from libjxl

#

I expected J40 would eventually need one anyway, and it might be easier to produce one from J40 than from libjxl

#

that's another goal, and I'm still unsure of the optimal way to achieve that

#

it was another reason I didn't have much progress recently, if I had some concrete plans maybe I could have tried to make one (I indeed had other side projects that were going very slowly but still steady in the same period), but I didn't have any actionable plan

#

so if libjxl had some (ideally concrete) ideas that might be hard to do themselves, maybe I can look at them instead

ruby fable Aug 27, 2023, 1:49 AM

#

rethinking about J40 rn, mostly about how to restructure it to avoid known pains

#

(yes, I've got a new job since then and a heavy milestone has been passed, so I now have some peace in my mind.)

#

(still recovering from the mental health issues though, mine is not as critical as others' but nevertheless medications really helped, anyway)

#

one of the main PITA in J40 was the cleanup path, which is... uh... always painful in C to be frank

#

it greatly relates to the testing strategy as well, because there should be a guarantee that the cleanup fully restores the known-good state

#

I knew this from the beginning, but after months of hiatus with fresh eyes I feel it more acutely

#

I'm starting to think about a lightweight preprocessor (still written in C) that would help quite a bit, but not sure

#

my initial idea was to make the source code itself written in a non-portable C (i.e. allowing some GNU extensions) but it can be converted to a portable C with that preprocessor

#

but __attribute__((cleanup)) was something inferior compared to what I actually wanted, so... I'm not sure how to develop this idea further

fervent crescent Aug 30, 2023, 7:31 AM

#

I always saw other projects implementing cleanup by prologue and epilogue macros.

fervent crescent Aug 30, 2023, 7:52 AM

#

I don't think there are any other options to do it portably.

ruby fable Aug 30, 2023, 8:06 AM

#

yes, the current J40 also does this, but it is increasingly harder to deal with new features compared to other languages like C++. (but I don't like to write C++.)

#

let me give a concrete example. I eventually want to support a progressive decoding, i.e. the decoder can signal increasingly precise renders at any time.

#

if the language supports a coroutine this is really an easy task, because you can park the decoder after the last available input and resume it whenever you want.

#

but it is really annoying and error-prone to write an equivalent state machine in C.

#

as a practical and very relevant example here is a decoder loop of Brotli: https://github.com/google/brotli/blob/ed738e842d2fbdf2d6459e39267a633c4a9b2f5d/c/dec/decode.c#L2264

#

Brotli decoder is composed of tons of state machines with delicate invariants, and it is possible as you can see, but it is inhumane to be honest

#

and JPEG XL is many times more complex than Brotli (in fact, JPEG XL contains a big portion of Brotli)

#

so both J40 and libjxl are designed to roll back to the known-good state when the input is not enough.

#

for example, if the signature has been read but the image header cannot be fully read, you roll back to the end of the signature and next time you will try to re-read the image header.

#

this is suboptimal (if you supply a single byte every time, it will decode the same thing over and over) but practically easier to manage...

#

...if it is easy to roll back. which is mostly the case in C++, while it is not even trivial in C.
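
A minimal sketch of the rollback idea for a single parsing step: work on a local cursor and commit it only on success, so that running out of input leaves the reader at the last known-good position. `reader_t`, `parse_header` and the 4-byte big-endian header are hypothetical and much simpler than a real JPEG XL image header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { PARSE_OK, PARSE_NEED_MORE } parse_status_t;

typedef struct {
    const uint8_t *buf;  /* all input received so far */
    size_t len;          /* bytes available in buf */
    size_t pos;          /* committed read position (known-good state) */
    uint32_t header;     /* valid only after parse_header returns PARSE_OK */
} reader_t;

/* Tries to read a 4-byte big-endian header starting at r->pos.
 * On a short read nothing in *r is modified, so the call can simply be
 * repeated once more input has arrived. */
parse_status_t parse_header(reader_t *r) {
    size_t cur;
    uint32_t v = 0;
    int i;
    assert(r && r->pos <= r->len);
    cur = r->pos;                                   /* checkpoint */
    for (i = 0; i < 4; ++i) {
        if (cur >= r->len) return PARSE_NEED_MORE;  /* r->pos is untouched */
        v = (v << 8) | r->buf[cur++];
    }
    r->header = v;
    r->pos = cur;                                   /* commit */
    return PARSE_OK;
}
```

This stays easy only while a step has a single cursor to restore. Once a step also allocates memory or mutates decoder tables, every one of those side effects needs its own undo, which is where the difficulty in C comes from.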

#

so in C you can't have an easy coroutine nor an easy state rollback. what to do then? this is my current question.

#

many C projects can cope with macros because they generally have (or constrain themselves to have) a small number of exit conditions per each function.

#

for example:

#

int foo(state_t *st) {
    // precondition: st->a thru st->c are not initialized
    st->a = malloc(sizeof(*st->a)); // size of the pointed-to object, not of the pointer
    if (!st->a) goto error;
    st->b = malloc(sizeof(*st->b));
    if (!st->b) goto free_a;
    st->c = malloc(sizeof(*st->c));
    if (!st->c) goto free_a_and_b;
    // postcondition when return value is 1: st->a thru st->c are all valid
    return 1;

free_a_and_b:
    free(st->b);
free_a:
    free(st->a);
error:
    // postcondition when return value is 0: st->a thru st->c are not initialized
    // and no memory has been leaked
    return 0;
}

#

this is a contrived example, people know this is really wordy so they actually use shortcuts, but you'd get my point

#

people generally write the following instead (and J40 does this as well):

int foo(state_t *st) {
    // ensure that `error` is always free to jump
    st->a = NULL;
    st->b = NULL;
    st->c = NULL;

    st->a = malloc(sizeof(*st->a)); // size of the pointed-to object, not of the pointer
    if (!st->a) goto error;
    st->b = malloc(sizeof(*st->b));
    if (!st->b) goto error;
    st->c = malloc(sizeof(*st->c));
    if (!st->c) goto error;
    return 1;

error:
    free(st->a); // safe to call for NULL
    free(st->b);
    free(st->c);
    return 0;
}

fervent crescent Aug 30, 2023, 8:29 AM

#

How about using a callback cleanup mechanism?

Basically you introduce the concept of a callback structure, and a stack of those. Add callbacks to the structure and push it to the stack. Popping them from the stack would call those callback functions.

How does it sound?
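
A minimal sketch of such a callback stack in portable C, assuming a fixed capacity and callbacks of the form `void fn(void *)`. All names here (`cleanup_stack_t`, `cleanup_push`, `cleanup_unwind`, `alloc_two`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CLEANUP_MAX 8

typedef void (*cleanup_fn)(void *);

typedef struct {
    struct { cleanup_fn fn; void *arg; } items[CLEANUP_MAX];
    size_t count;
} cleanup_stack_t;

/* Registers fn(arg) to run later; returns 0 if the stack is full. */
int cleanup_push(cleanup_stack_t *s, cleanup_fn fn, void *arg) {
    assert(s && fn);
    if (s->count == CLEANUP_MAX) return 0;
    s->items[s->count].fn = fn;
    s->items[s->count].arg = arg;
    ++s->count;
    return 1;
}

/* Runs registered callbacks in reverse order until `mark` entries remain. */
void cleanup_unwind(cleanup_stack_t *s, size_t mark) {
    assert(s && mark <= s->count);
    while (s->count > mark) {
        --s->count;
        s->items[s->count].fn(s->items[s->count].arg);
    }
}

/* Example user: two allocations, one failure path that undoes both. */
int alloc_two(cleanup_stack_t *s, size_t n1, size_t n2) {
    size_t mark;
    void *a, *b;
    assert(s);
    mark = s->count;
    a = malloc(n1);
    if (!a || !cleanup_push(s, free, a)) { free(a); goto fail; }
    b = malloc(n2);
    if (!b || !cleanup_push(s, free, b)) { free(b); goto fail; }
    return 1;
fail:
    cleanup_unwind(s, mark);  /* releases whatever was pushed since mark */
    return 0;
}
```

Because the callbacks are ordinary file-scope functions taking a `void *`, no nested functions are needed, but the fixed capacity is a real limitation when the number of cleanups is not known in advance.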

ruby fable Aug 30, 2023, 8:43 AM

#

first, C doesn't have a portable nested function (sadly). and second, I don't know how many of them are required :S

#

for now I'm considering an arena-based approach, trading strict memory usage for convenience.

#

but the arena will not fully solve my problem because I have so many states...
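
A minimal sketch of what an arena with marks could look like, assuming a single fixed-size backing allocation and a 16-byte alignment that is taken to be sufficient here. `arena_t` and its functions are hypothetical, not J40 code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

enum { ARENA_ALIGN = 16 };  /* assumed to be enough alignment for this sketch */

typedef struct {
    unsigned char *base;  /* single backing allocation */
    size_t cap, used;
} arena_t;

int arena_init(arena_t *a, size_t cap) {
    assert(a);
    a->base = malloc(cap);
    a->cap = a->base ? cap : 0;
    a->used = 0;
    return a->base != NULL;
}

/* Bump-allocates n bytes rounded up to ARENA_ALIGN; NULL when exhausted. */
void *arena_alloc(arena_t *a, size_t n) {
    size_t rounded;
    assert(a);
    if (n > SIZE_MAX - (ARENA_ALIGN - 1)) return NULL;  /* rounding would overflow */
    rounded = (n + ARENA_ALIGN - 1) / ARENA_ALIGN * ARENA_ALIGN;
    if (rounded > a->cap - a->used) return NULL;
    a->used += rounded;
    return a->base + (a->used - rounded);
}

/* A mark is the current offset; resetting to it drops everything newer. */
size_t arena_mark(const arena_t *a) { assert(a); return a->used; }

void arena_reset(arena_t *a, size_t mark) {
    assert(a && mark <= a->used);
    a->used = mark;
}

void arena_free(arena_t *a) {
    assert(a);
    free(a->base);
    a->base = NULL;
    a->cap = a->used = 0;
}
```

Rolling back memory then becomes a single `arena_reset` to a mark taken before the step started. It does nothing for non-memory state such as read positions or partially updated tables, which matches the concern above.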

ruby fable Aug 30, 2023, 9:18 AM

#

to be accurate, if there is a way to make a non-portable but working equivalent of Go defer, I'll seriously consider that

#

I'm okay with the non-portability if it i) can be mechanically translated later and ii) works as is with GCC and clang

#

that would comfortably cover pretty much every use case

fervent crescent Aug 30, 2023, 9:27 AM

#

ruby fable to be accurate, if there is a way to make a non-portable but working equivalent ...

There isn't to my knowledge.

#

Plans to have it in C23 got cancelled.

#

It's effectively pushed back to C30.

ruby fable Aug 30, 2023, 9:28 AM

#

yeah I know (cf. https://thephd.dev/lambdas-nested-functions-block-expressions-oh-my), but I can't use it in the portable, translated version anyway, so the base version doesn't have to be that portable

fervent crescent Aug 30, 2023, 10:01 AM

#

Your only non-portable option in that case would be GCC's cleanup attribute, which you did try.

But yes, make sure that the source file is machine-preprocessable to portable C code if you decide to go with that option.

broken flint Mar 26, 2024, 10:21 PM

#

@ruby fable i know quite some time has passed since this discussion stopped, but I normally use this macro to implement a Go-like defer in C:

#define PPCAT2(n,x) n ## x
#define PPCAT(n,x) PPCAT2(n,x)
#define DEFER2(stmt, counter) \
    void PPCAT(__cleanup, counter) (int* u) { stmt; } \
    int PPCAT(__var, counter) __attribute__((unused, cleanup(PPCAT(__cleanup, counter ))));
#define DEFER(stmt) DEFER2(stmt, __COUNTER__)

#

which is based on __attribute__((cleanup)) which you mention above, so I'm not sure if this is something you already attempted or not

#

Sample usage:

void *buf = malloc(128);
DEFER(free(buf));

#

this works on GCC only though as clang doesn't implement nested functions in C
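
For comparison, a sketch of a narrower variant that avoids nested functions by pointing `__attribute__((cleanup))` at a file-scope handler, which both GCC and clang accept. It is still a compiler extension rather than standard C, and it only covers the "free this pointer at scope exit" case instead of arbitrary deferred statements. `release_ptr`, `SCOPED_PTR`, `use_scratch` and the `g_released` counter are hypothetical names for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static int g_released = 0;  /* counts handler runs, for demonstration only */

/* A cleanup handler receives a pointer to the annotated variable. */
static void release_ptr(void *slot) {
    void **p = (void **)slot;
    assert(p);
    free(*p);               /* free(NULL) is a no-op */
    *p = NULL;
    ++g_released;
}

#define SCOPED_PTR __attribute__((cleanup(release_ptr)))

/* Returns n if a scratch buffer could be allocated and zeroed, else 0. */
size_t use_scratch(size_t n) {
    SCOPED_PTR void *buf = malloc(n);
    if (!buf) return 0;     /* handler still runs on this path */
    memset(buf, 0, n);
    return n;               /* handler runs here as well */
}
```

The handler runs on every exit from the scope, so early returns need no `goto` ladder. The trade-off is the one raised in this thread: the code no longer builds with compilers that lack the attribute unless it is mechanically translated first.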

#

I have also one question; is there a way to use cjxl to force to encode files that j40 will know how to decode? Or put in other terms, is there a way to disable features that j40 doesn't support, so that the current version of j40 can successfully be used without fear of generating images that can't be later decoded?

ruby fable Mar 27, 2024, 12:01 AM

#

broken flint which is based on `__attribute__((cleanup))` which you mention above, so I'm not...

I didn't use that mainly because it is not really portable, and I wanted to use it unconditionally 🙂

#

for j40-compatible images, any low enough level (I think it was up to -e 7?) works I think. see the README.

broken flint Mar 27, 2024, 12:07 AM

#

Thanks, should I expect some runtime error for images using unsupported features, or just a corrupted picture?

ruby fable Mar 27, 2024, 12:44 AM

#

modulo any bugs, it will probably reject them.

#

I haven't touched J40 for a long time now though, so there will be several bugs around...

#

I'm slowly investigating several paths to revive the project, but all paths depend on how to reliably write C code without making it too large, and that's a big problem

ruby fable Mar 27, 2024, 7:16 AM

#

ruby fable I'm slowly investigating several paths to revive the project, but all paths depe...

I should make a more accurate statement on this:

#

I will revive J40 if I have a way to write C code roughly equivalent to the current J40 code without exactly using barebones C

#

if Zig could have been translated to C I would have picked it, but AFAIK Zig-to-C is not supported

broken flint Mar 27, 2024, 9:11 AM

#

in my humble opinion, you're already using several clever macro tricks in the current codebase to push the C boundaries

#

i don't think there's much more that can be done in the realm of C

#

i wish clang didn't decide to skip nested functions in C, that would have helped me greatly, but alas they went for that decision years ago and haven't looked back

#

that would solve defer, but then if you want to have more solid coroutines or stuff like that, I guess that's not something that really matches C

ruby fable Mar 27, 2024, 9:14 AM

#

yeah, that is my current dilemma 😦

broken flint Mar 27, 2024, 9:15 AM

#

i guess translation from a higher level language is one option as you said

ruby fable Mar 27, 2024, 9:15 AM

#

something like https://github.com/google/wuffs/ but for a broader library would be beneficial

GitHub

GitHub - google/wuffs: Wrangling Untrusted File Formats Safely

Wrangling Untrusted File Formats Safely. Contribute to google/wuffs development by creating an account on GitHub.

#

(Wuffs is truly great and I would like it to be more widespread, but it doesn't fit my use case)

broken flint Mar 27, 2024, 9:17 AM

#

yes i like wuffs too but i think it fits more into the realm of safety, not sure if that's also your focus

#

i haven't really evaluated that as a higher level language by itself

ruby fable Mar 27, 2024, 9:18 AM

#

I'd like to ensure that my library is reasonably safe, but not necessarily in the formally verifiable way (which is... hard I know)

#

it doesn't look so but there are several guiding principles throughout the J40's code to avoid usual problems, like consistent coding styles

broken flint Mar 27, 2024, 9:19 AM

#

yeah, though some fuzzing will bring you somewhere into a safe area

ruby fable Mar 27, 2024, 9:20 AM

#

J40 has been surely fuzzed, but it will need an additional restructuring to make it fuzz-friendly

#

I think the current fuzzing attempt covers roughly a half of the entire code

broken flint Mar 27, 2024, 9:21 AM

#

I will test decompression speed on my target platform for this project, which is a Nintendo 64; that'll give me a first number to see whether a full porting is viable or not

ruby fable Mar 27, 2024, 9:22 AM

#

oh, that would be adventurous to be sure 😉

#

does libjxl compile on that platform after all?

broken flint Mar 27, 2024, 9:23 AM

#

i haven't tried; I have been working on that platform for quite some time and I ported several modern formats like h264 and opus to it. I'm looking for a solution for lossy encoding, so i figured i'd start from jpegxl and move back in case of trouble 🙂

#

notice that porting doesn't mean only recompiling (that's the easy part), because optimizing for n64 means offloading part of the calculations to a DSP with SIMD instructions that must be programmed in assembly

#

so the work ends up being similar to a task like "adding arm+neon acceleration to a C codebase" or something like that

ruby fable Mar 27, 2024, 9:25 AM

#

is it just out of curiosity or do you have some concrete goal like a homebrew game?

broken flint Mar 27, 2024, 9:26 AM

#

i maintain an open source library that's used for homebrew games (https://github.com/DragonMinded/libdragon), I'd like to offer a lossy image compression solution to my users

ruby fable Mar 27, 2024, 9:27 AM

#

does JPEG work? if it doesn't, there is not much chance for JPEG XL either (because it includes a baseline JPEG as a part of backward compatibility)

broken flint Mar 27, 2024, 9:27 AM

#

yes, jpeg was also used by commercial games back in the day

ruby fable Mar 27, 2024, 9:27 AM

#

oh, that's good to hear

broken flint Mar 27, 2024, 9:27 AM

#

we are pushing the boundaries more than they were ever able to do, this is why i was targeting something more modern

ruby fable Mar 27, 2024, 9:28 AM

#

I think a VarDCT subset of JPEG XL might be actually viable enough

#

that is, a single-frame lossy subset

#

(JPEG XL is designed for many more use cases, so you don't need the full library)

broken flint Mar 27, 2024, 9:28 AM

#

yeah probably. BTW I have ported mpeg1 already so for jpeg actually i should have most of the blocks

#

the DSP with SIMD is fixed point though, i think VarDCT is defined with floating points?

#

H264 is luckily integer only, that helped quite a bit 🙂

#

and Opus reference implementation supports both floating and fixed, that also helped

ruby fable Mar 27, 2024, 9:30 AM

#

yeah, J40 also assumes a working floating point impl

broken flint Mar 27, 2024, 9:30 AM

#

there are floating points on the CPU; it's more about the parts that I want to offload to the DSP ... those would have to be converted to fixed point
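
A minimal sketch of what converting a floating-point multiply to fixed point involves. The choice of 12 fractional bits in a 32-bit word is an arbitrary assumption for illustration and is not tied to any particular DSP; `fix_t`, `fix_from_double` and `fix_mul` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define FIX_SHIFT 12                 /* assumed number of fractional bits */
#define FIX_ONE (1 << FIX_SHIFT)

typedef int32_t fix_t;

/* Converts with round-half-away-from-zero; x must fit in the integer part. */
fix_t fix_from_double(double x) {
    assert(x > -524288.0 && x < 524288.0);
    return (fix_t)(x * FIX_ONE + (x >= 0 ? 0.5 : -0.5));
}

/* Multiplies two fixed-point values using a 64-bit intermediate and
 * rounding. Relies on >> of a negative value being an arithmetic shift,
 * which is implementation-defined in C but holds on mainstream compilers. */
fix_t fix_mul(fix_t a, fix_t b) {
    int64_t wide = (int64_t)a * b + (1 << (FIX_SHIFT - 1));
    return (fix_t)(wide >> FIX_SHIFT);
}
```

Every multiply by a transform constant would go through something like `fix_mul`, so the precision of the result depends directly on how many fractional bits the target can afford.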

unkempt knot Mar 27, 2024, 9:37 AM

#

ruby fable I think a VarDCT subset of JPEG XL might be actually viable enough

You can't have fully VarDCT-only since even VarDCT uses Modular for the LF image etc, but yes, we're thinking about a "lightweight" profile that would also have a hardware implementation. It would restrict the use of Modular to put constraints on the kind of MA trees and predictors that can be used, probably not have extra channels at all, definitely no splines and patches, etc. How it will look will depend on what the hardware folks are willing to implement, it's still too early to tell.

ruby fable Mar 27, 2024, 9:40 AM

#

unkempt knot You can't have fully VarDCT-only since even VarDCT uses Modular for the LF image...

not to necessarily mean a proper subset of lossy JPEG XL, the LF image can be possibly encoded in other means for example.

unkempt knot Mar 27, 2024, 9:43 AM

#

There's a subset of Modular that is just no-context with uniform West prediction, that's basically what JPEG does. Probably we can define a somewhat larger subset of Modular that still gives some of the gains while still being simple enough for a hardware implementation, where you obviously don't want to deal with arbitrary MA trees and funky predictors like the self-correcting Weighted predictor.
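
A minimal sketch of West prediction on one row: each sample is stored as its difference from the left neighbour. Predicting 0 for the first sample of the row is an assumption of this sketch, and `west_encode` / `west_decode` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Replaces each sample by its residual against the left neighbour.
 * Walks right to left so each neighbour is still the original value. */
void west_encode(int32_t *row, size_t n) {
    size_t i;
    assert(row || n == 0);
    if (n < 2) return;
    for (i = n - 1; i > 0; --i) row[i] -= row[i - 1];
}

/* Inverse: walks left to right, adding the already reconstructed neighbour. */
void west_decode(int32_t *row, size_t n) {
    size_t i;
    assert(row || n == 0);
    for (i = 1; i < n; ++i) row[i] += row[i - 1];
}
```

The decoder needs only the previous sample and no context modelling, which is what makes this kind of subset attractive for small or fixed-function implementations.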

ruby fable Mar 27, 2024, 9:43 AM

#

yeah, that might be a possible alternative

#

the point is that, I think N64 will need some specialized format and/or subset for a desired performance

broken flint Mar 27, 2024, 12:34 PM

#

I think it’s fine, we usually control both sides of the pipeline (encoding and decoding)

#

For h264 I compress videos only in baseline profile and I disable the in loop filter for instance as that creates performance issues

#

For opus I select Celt only (disable SILK) and fix a few internal parameters leaving a bit less flexibility to the encoder

#

So yes I’m ready to disable a few things at encoding time, that’s not an issue for my use case

#

I'm just wondering if I should base the work on j40 or not given that its development is paused; I don't necessarily need it to be maintained and improved if what's there is sufficient, I'm just afraid of bugs

quasi carbon Apr 19, 2024, 7:23 AM

#

@broken flint I think I can answer that, I had trouble until I settled for -e 4

#

(decodable with j40.h)

broken flint Apr 19, 2024, 5:17 PM

#

quasi carbon @broken flint I think I can answer that, I had trouble until I settled f...

Thanks!

broken flint Jan 25, 2025, 7:01 PM

#

@ruby fable i have a question on j40; can you please explain at the high level why j40__advance is designed as a coroutine? In what case is it necessary for it to yield leaving the work uncompleted?

ruby fable Jan 26, 2025, 12:10 AM

#

broken flint @ruby fable i have a question on j40; can you please explain at the hi...

the whole setup was designed for incremental parsing, as any individual step can stop at the end of currently available inputs in addition to genuine errors.

broken flint Jan 26, 2025, 12:17 AM

#

ruby fable the whole setup was designed for incremental parsing, as any individual step can...

but is there an API for that? It seems like you can fetch from either a file or a memory callback but in both cases they seem to assume full consuming of the input

ruby fable Jan 26, 2025, 12:18 AM

#

nope, it was never implemented to this day

#

I should mention that this approach is also similar to how libjxl does incremental parsing

#

(which inspired my design)

#

I already knew that a resumable coroutine would be necessary for incremental parsing in general, but C is too primitive to support that in a pleasant way, so I used that coroutine macro hack to retain a reasonable chunk of resumable routines

#

that said, in retrospect I think it wasn't enough because each individual routine does have to roll back perfectly on error, which was quite hard to do in general (especially in C!)
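
A minimal sketch of the well-known switch-based coroutine trick in C, shown on a toy parser. This is not J40's actual macro set: `CO_BEGIN`, `CO_AWAIT_BYTE`, `CO_END`, `parser_t` and the 2-byte `width` field are hypothetical. Only the 0xFF 0x0A signature bytes correspond to a real JPEG XL codestream.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { CO_YIELD = 0, CO_DONE = 1, CO_ERROR = -1 };

typedef struct {
    int line;            /* resume point; 0 means start from the top */
    const uint8_t *buf;  /* input received so far */
    size_t len, pos;
    uint32_t width;      /* hypothetical header field */
    int i;               /* loop counter lives in the state, not on the stack */
} parser_t;

#define CO_BEGIN(st) switch ((st)->line) { case 0:
#define CO_END(st)   } (st)->line = -1; return CO_DONE
/* Parks the routine here until at least one unread byte is available. */
#define CO_AWAIT_BYTE(st) \
    do { (st)->line = __LINE__; case __LINE__: \
         if ((st)->pos >= (st)->len) return CO_YIELD; } while (0)

int parse(parser_t *st) {
    assert(st);
    CO_BEGIN(st);
    CO_AWAIT_BYTE(st);
    if (st->buf[st->pos++] != 0xFF) return CO_ERROR;
    CO_AWAIT_BYTE(st);
    if (st->buf[st->pos++] != 0x0A) return CO_ERROR;
    st->width = 0;
    for (st->i = 0; st->i < 2; ++st->i) {
        CO_AWAIT_BYTE(st);  /* resuming jumps back into the loop body */
        st->width = (st->width << 8) | st->buf[st->pos++];
    }
    CO_END(st);
}
```

Each call resumes at the last await point instead of re-reading earlier input. The price is that everything that must survive a yield has to live in the state struct, and each await point must sit on its own source line because the case labels are derived from `__LINE__`.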

broken flint Jan 26, 2025, 12:23 AM

#

i think it's complicated in general

#

a stackful coroutine would have a much easier life of course

ruby fable Jan 26, 2025, 1:50 AM

#

if we are not using C... 😉

broken flint Jan 27, 2025, 10:49 AM

#

yes but on the other hand, C is the standard for embedding and it is like that specifically because it is a simple language 🙂

#

so of course one has to take compromises here; in my case for instance i don't need incremental parsing so i will just remove that

jagged iron Aug 3, 2025, 1:29 PM

#

@ruby fable Hello there how's things going with the library?

ruby fable Aug 5, 2025, 11:11 PM

#

jagged iron @ruby fable Hello there how's things going with the library?

see above for the current situation.

#

I'm currently trying to revive the project with some AI sprinkles, let me see whether it would work or not...
