#Limits of image compression


mortal sorrel
#

Hi all,

I do not have a computer science background or similar, but I find the developments around image compression super fascinating, as they combine perception, information theory and CS. I have a question, though, that maybe people with image compression experience could answer:

Would it be theoretically possible to develop algorithms that compress photographic content visually losslessly to, let's say, 1/50 of the original size, or are we already near the limit of what can realistically be achieved? The fact that we haven't achieved something like this in the 30 years since the original JPEG hints in that direction. I know improving on JPEG, and now AVIF and JXL, will be very difficult, but this is more of a general question: what are the theoretical and practical limits of lossy image compression?

sacred basin
#

it is definitely possible and in fact some recent ML-based algorithms already achieve that under some conditions, but the question itself is ultimately not well defined because "visually lossless" is really hard to define.

#

generally lossy image compression works by transforming the original array of pixels into some representation which can be changed much more aggressively without visible changes to the image

#

if we can aggressively change the representation, we can throw away as much of it as possible to get the compressed data.

#

JPEG, JPEG XL and AVIF are all based on the Discrete Cosine Transform (DCT), which is known to be a good transformation for this purpose. but it is hardly the only option: JPEG XR for example used a substantially different, DCT-like lapped transform, while some others (e.g. JPEG 2000) used the Discrete Wavelet Transform (DWT) etc.
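To make the transform-then-discard idea concrete, here is a minimal sketch of it on a single 8x8 block, using NumPy only. It is not how any of the codecs above are actually implemented: the helper names (`dct_matrix`, `compress_block`, `decompress_block`), the single uniform quantization step of 16 and the synthetic gradient block are all assumptions made for illustration.

```python
import numpy as np

def dct_matrix(n: int = 8) -> np.ndarray:
    """Orthonormal DCT-II basis as an n x n matrix (rows = frequencies)."""
    k = np.arange(n).reshape(-1, 1)  # frequency index
    i = np.arange(n).reshape(1, -1)  # sample index
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)       # DC row has its own scale factor
    return m

def compress_block(block: np.ndarray, step: float) -> np.ndarray:
    """2D DCT followed by uniform quantization (the lossy step)."""
    d = dct_matrix(block.shape[0])
    coeffs = d @ block @ d.T
    return np.round(coeffs / step).astype(int)

def decompress_block(q: np.ndarray, step: float) -> np.ndarray:
    """Dequantize and apply the inverse 2D DCT."""
    d = dct_matrix(q.shape[0])
    return d.T @ (q * step) @ d

# Hypothetical input: a smooth gradient, typical of low-detail photo areas.
x, y = np.meshgrid(np.arange(8), np.arange(8))
block = 100.0 + 4.0 * x + 2.0 * y

step = 16.0  # assumed quantization step; larger means smaller files, more error
q = compress_block(block, step)
restored = decompress_block(q, step)

nonzero = int(np.count_nonzero(q))                  # coefficients left to store
max_error = float(np.abs(restored - block).max())   # worst per-pixel error
print(nonzero, "of", q.size, "coefficients are nonzero; max error", max_error)
```

For a smooth block like this, most quantized coefficients come out as zero, which is what the later entropy-coding stage exploits. A block full of fine texture would keep far more nonzero coefficients at the same step.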

#

so if we can somehow find a better transformation, one that is feasible to implement efficiently, the whole game can change. ML happens to be one way to achieve that (many ML models work on an "embedding" or "latent", which is a fancy name for the internal representation used by these models). there may be other ways too.

orchid helm
#

noise used to be an artifact from compression but can be 'separated' in its own channel to ensure the 'denoised' image compresses better

#

but compression's already really efficient so i suspect further advances will need to come from better capture from camera sensors

sacred basin
#

but the noise doesn't have to be exactly reproduced, its distribution is more important to reproduce

#

so I believe that is not the limiting factor per se, in fact recent enough codecs (JPEG XL and AVIF) already try to model it
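A rough sketch of "reproduce the distribution, not the exact noise", assuming Gaussian noise and a plain box blur as a stand-in denoiser. Real codecs use considerably more elaborate noise models; `box_blur`, the synthetic ramp image and the single `sigma` parameter are assumptions for illustration only.

```python
import numpy as np

def box_blur(img: np.ndarray, radius: int = 2) -> np.ndarray:
    """Very crude denoiser: mean over a (2*radius+1)^2 window, edges replicated."""
    padded = np.pad(img, radius, mode="edge")
    size = 2 * radius + 1
    out = np.zeros_like(img)
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (size * size)

# Hypothetical capture: a smooth ramp plus sensor-like Gaussian noise.
rng = np.random.default_rng(0)
clean = np.tile(np.linspace(50.0, 200.0, 64), (64, 1))
noisy = clean + rng.normal(0.0, 5.0, clean.shape)

# Encoder side: keep the denoised image and ONE number describing the noise.
denoised = box_blur(noisy)
sigma = float((noisy - denoised).std())

# Decoder side: synthesize fresh noise with the same strength.
decoder_rng = np.random.default_rng(1)
decoded = denoised + decoder_rng.normal(0.0, sigma, denoised.shape)
resynth_sigma = float((decoded - denoised).std())

print("estimated sigma:", sigma, "resynthesized sigma:", resynth_sigma)
```

The decoded image differs from the capture pixel by pixel, but its noise has roughly the same strength, and only the smooth `denoised` image plus one parameter had to be stored.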

#

I think the main problem we have today is the medium frequency content, which has to be more or less exactly reproduced but isn't well captured by transformations

#

so it has to be encoded as residuals, which don't compress well

#

the distribution of residuals can of course be calculated, but that's not enough for medium-frequency content
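One way to put a number on "don't compress well": the empirical Shannon entropy of the residuals is roughly the bits per symbol an ideal entropy coder needs when the distribution is all it knows. The sketch below assumes Laplacian-distributed integer residuals, a common modelling assumption rather than anything measured from a real codec, and `entropy_bits` is a hypothetical helper.

```python
import numpy as np

def entropy_bits(symbols: np.ndarray) -> float:
    """Empirical Shannon entropy in bits per symbol."""
    _, counts = np.unique(symbols, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

rng = np.random.default_rng(0)

# Well-predicted content: residuals cluster tightly around zero.
small = np.round(rng.laplace(0.0, 1.0, 100_000)).astype(int)

# Poorly predicted content: residuals are spread out.
large = np.round(rng.laplace(0.0, 20.0, 100_000)).astype(int)

h_small = entropy_bits(small)
h_large = entropy_bits(large)
print("bits/residual, tight:", h_small, "spread out:", h_large)
```

Knowing the distribution exactly only gets the coder down to this entropy. When the transform or predictor leaves wide residuals, the cost per sample stays high no matter how well the distribution is modelled.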

#

in fact drawings and synthetic images used to be such medium frequency contents decades ago

#

now they are better handled by modern codecs though

orchid helm
#

reconstructing images using ML could be worth investigating but i doubt it'll ever be a substitute for proper images
for example, capture a low resolution image with a few areas of interest (mainly faces) specified to be photographed at real quality. creatively upscale the whole thing using your device's AI chip and slap the faces where they belong

sacred basin
#

I too don't have much hope for contemporary ML-based transformations, but the principle itself is promising

#

maybe we are able to arrive at some promising embedding matrix useful for the vast majority of images
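As a toy illustration of a transform derived from data instead of fixed in advance, the sketch below learns a linear basis from example patches with PCA (computed via SVD) and keeps only a few coefficients per patch. This is a classical technique swapped in for illustration, not an ML codec, and the synthetic training patches, `encode` and `decode` are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical training set: 2000 smooth 8x8 patches with a little noise.
n, size = 2000, 8
x, y = np.meshgrid(np.arange(size), np.arange(size))
a = rng.normal(0.0, 1.0, (n, 1, 1))
b = rng.normal(0.0, 1.0, (n, 1, 1))
c = rng.normal(0.0, 1.0, (n, 1, 1))
noise = rng.normal(0.0, 0.05, (n, size, size))
patches = (a * x + b * y + c * x * y / size + noise).reshape(n, -1)

# "Learn" the transform: rows of `basis` are principal directions,
# ordered from most to least variance in the training data.
mean = patches.mean(axis=0)
_, _, basis = np.linalg.svd(patches - mean, full_matrices=False)

def encode(patch: np.ndarray, k: int) -> np.ndarray:
    """Project a flattened patch onto the first k learned directions."""
    return basis[:k] @ (patch - mean)

def decode(code: np.ndarray) -> np.ndarray:
    """Rebuild a flattened patch from its k coefficients."""
    return mean + basis[:len(code)].T @ code

patch = patches[0]
err_3 = float(np.abs(decode(encode(patch, 3)) - patch).max())
err_full = float(np.abs(decode(encode(patch, 64)) - patch).max())
print("max error with 3 of 64 coefficients:", err_3, "with all 64:", err_full)
```

The learned basis works well here only because the patches were generated from three underlying factors. Whether any single learned representation generalizes to the vast majority of real images is exactly the open question raised above.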

blissful sinew