JPEG Compression (sophielwang.com)

by vinhnx 125 comments 399 points

[−] chromakode 59d ago
The DCT is a cool primitive. By extracting the low frequency coefficients, you can get a compact blurry representation of an image. This is used by preload thumbnail algorithms like blurhash and thumbhash. It's also used by some image watermarking techniques to target changes to a detail level that will be less affected by scaling or re-encoding.

I made a notebook a few years back which lets you play with / filter the DCT coefficients of an image: https://observablehq.com/d/167d8f3368a6d602
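A minimal sketch of the idea in Python (using scipy's `dctn`, not the actual blurhash/thumbhash code): take the 2D DCT of an image, zero out everything but the low-frequency corner, and invert to get a compact blurry approximation.

```python
import numpy as np
from scipy.fft import dctn, idctn

# Toy grayscale "image": a smooth gradient plus some noise.
rng = np.random.default_rng(0)
img = np.linspace(0, 255, 64 * 64).reshape(64, 64) + rng.normal(0, 10, (64, 64))

# Forward 2D DCT of the whole image.
coeffs = dctn(img, norm="ortho")

# Keep only the 8x8 corner of lowest-frequency coefficients.
k = 8
low = np.zeros_like(coeffs)
low[:k, :k] = coeffs[:k, :k]

# Inverse transform: a blurry approximation described by just k*k numbers.
blurry = idctn(low, norm="ortho")
```

Those 64 retained coefficients are essentially what a blurhash-style preview stores.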

[−] dipflow 59d ago
Whenever I saw those 'blocky' artifacts on a low-quality image, I used to just think of them as 'bad tech.' After reading this, it's cool to realize you're actually seeing the 8x8 DCT grid itself. You're literally seeing the math break down because there wasn't enough bit-budget to describe those high-frequency cosine waves. It’s like looking at the brushstrokes on a digital painting.
[−] netsharc 59d ago
The default decoding implementation is what adds the artifacts.

This tool uses more clever math to replace what's missing: https://github.com/victorvde/jpeg2png

[−] axiolite 59d ago
It just blurs out the details. I'd rather have a sharp image with artifacts.
[−] crazygringo 59d ago
Why?

You're not seeing the actual details either way.

The blurred version feels honest -- it's not showing you anything more than what has been encoded.

The sharp image feels confusing -- it's showing you a ton of detail that is totally wrong. "Detail" that wasn't in the original, but is just artifacts.

Why would you prefer distracting artifacts over a blurred version?

[−] pxndxx 59d ago
The details were destroyed long ago by the poor compression, you aren't getting them back either way.
[−] axiolite 58d ago
You're talking utter nonsense.

Get a picture of grass, save it as a JPEG at 15% quality... It still looks like grass. Then run it through jpeg2png... The output looks like a green smear. You might not even be able to tell that it's supposed to be grass. jpeg2png just blurs the hell out of images.

Here's a side-by-side: https://ibb.co/99C0F34d

[−] iggldiggl 59d ago
Also if your software for whatever reasons is using the original libjpeg in its modern (post classic version 6b) incarnation [1], right from version 7 onwards the new (and still current) maintainer switched the algorithm for chroma up-/downsampling from classic pixel interpolation to DCT-based scaling, claiming it's mathematically more beautiful and (apart from the unavoidable information loss on the first downscaling) perfectly reversible [2].

The problem with that approach however is that DCT-scaling is block-based, so for classic 4:2:0 subsampling, each 16x16 chroma block in the original image is now individually being downscaled to 8x8, and perhaps more importantly, later-on individually being upscaled back to 16x16 on decompression.

Compared to classic image resizing algorithms (bilinear scaling or whatever), this block-based upscaling can and does introduce additional visual artefacts at the block boundaries, which, while somewhat subtle, are still large enough to be actually borderline visible even when not quite pixel-peeping. ([3] notes that the visual differences between libjpeg 6b/turbo and libjpeg 7-9 on image decompression are indeed of a borderline visible magnitude.)

I stumbled across this detail after having finally upgraded my image editing software [4] from the old freebie version I'd been using for years (it was included with a computer magazine at some point) to its current incarnation, which came with a libjpeg version upgrade under the hood. Not long afterwards I noticed that for quite a few images, the new version introduced some additional blockiness when decoding JPEG images (also subsequently exacerbated by some particular post-processing steps I was doing on those images), and then I somehow stumbled across this article [3] which noted the change in chroma subsampling and provided the crucial clue to this riddle.

Thankfully, the developers of that image editor were (still are) very friendly and responsive and actually agreed to switch out the jpeg library to libjpeg-turbo, thereby resolving that issue. Likewise, luckily few other programs and operating systems seem to actually use modern libjpeg, usually preferring libjpeg-turbo or something else that continues using regular image scaling algorithms for chroma subsampling.

[1] Instead of libjpeg-turbo or whatever else is around these days.

[2] Which might be true in theory, but I tried de- and recompressing images in a loop with both libjpeg 6b and 9e, and didn't find a significant difference in the number of iterations required until the image converged to a stable compression result.

[3] https://informationsecurity.uibk.ac.at/pdfs/BHB2022_IHMMSEC....

[4] PhotoLine
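The DCT-based scaling described above can be sketched as a toy numpy model (an illustration of the math, not libjpeg's actual code): downscale a 16x16 chroma block by keeping the low-frequency 8x8 corner of its DCT, and upscale by zero-padding the spectrum back out. The information loss happens once, on the first downscale; after that the round trip is exact, which is the "perfectly reversible" claim.

```python
import numpy as np
from scipy.fft import dctn, idctn

def dct_downscale(block16):
    """Downscale 16x16 -> 8x8 by keeping the low-frequency DCT coefficients."""
    c = dctn(block16, norm="ortho")
    return idctn(c[:8, :8], norm="ortho") / 2  # /2 compensates the size change

def dct_upscale(block8):
    """Upscale 8x8 -> 16x16 by zero-padding the DCT spectrum."""
    c = np.zeros((16, 16))
    c[:8, :8] = dctn(block8, norm="ortho") * 2
    return idctn(c, norm="ortho")

rng = np.random.default_rng(1)
chroma = rng.uniform(0, 255, (16, 16))

small = dct_downscale(chroma)   # lossy: high frequencies are gone
big = dct_upscale(small)        # block-based upscale back to 16x16

# Downscaling again reproduces the same 8x8 exactly: no further loss.
assert np.allclose(dct_downscale(big), small)
```

The block-boundary artefacts come from doing this per block: each 16x16 upscale knows nothing about its neighbours, unlike bilinear-style resizing.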

[−] adgjlsfhk1 59d ago
eh, it is bad tech. modern compression algorithms hide the blocks a lot more because blocking is the most visible artifact
[−] pornel 59d ago
It's a perfectly pragmatic engineering choice. Blocking is visible only when the compression is too heavy. When degradation is imperceptible, then the block edges are imperceptible too, and the problem doesn't need to be solved (in JPEG imperceptible still means 10:1 data size reduction).

Later compression algorithms were focused on video, where the aim was to have good-enough low-quality approximations.

Deblocking is an inelegant hack.

Deblocking hurts high quality compression of still images, because it makes it harder for codecs to precisely reproduce the original image. Blurring removes details that the blocks produced, so the codec has to either disable deblocking or compensate with exaggerated contrast (which is still an approximation). It also adds a dependency across blocks, which complicates the problem from independent per-block computation to finding a global optimum that happens to flip between frequency domain and pixel hacks. It's no longer a neat mathematical transform with a closed-form solution, but a pile of iterative guesswork (or just not taken into account at all, and the codec wins benchmarks on PSNR, looks good in side by side comparisons at 10% quality level, but is an auto-airbrushing texture-destroying annoyance when used for real images).

The Daala project tried to reinvent it with better mathematical foundations (lapped transforms), but in the end a post-processing pass of blurring the pixels has won.

[−] bambax 59d ago
So the idea behind JPEG is the same as behind MP3: we filter out what we don't perceive.

I wonder if other species would look at our images or listen to our sounds and register with horror all the gaping holes everywhere.

[−] danwills 59d ago
Having played a bit with Discrete FFT (with FFTW on 2D images in a Shake plugin we made at work ages ago) makes the DCT coefficients make so much more sense! I really wonder whether the frequency-decomposition could happen at multiple scale levels though? Sounds slightly like wavelets and maybe that's how jpeg2000 works?.. Yeah I looked it up, uses DWT so it kinda sounds like it! Shame it hasn't taken off so far!? Or maybe there's an even better way?
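That's exactly it: a wavelet transform is the multi-scale version of the frequency decomposition, and JPEG 2000 builds on the DWT. A one-level 2D Haar transform (the simplest wavelet, not the one JPEG 2000 actually uses) can be sketched like this:

```python
import numpy as np

def haar2d_level(img):
    """One level of a 2D Haar wavelet transform: a half-resolution
    approximation (LL) plus horizontal/vertical/diagonal detail bands."""
    a = (img[0::2, :] + img[1::2, :]) / 2   # row-pair averages
    d = (img[0::2, :] - img[1::2, :]) / 2   # row-pair differences
    ll = (a[:, 0::2] + a[:, 1::2]) / 2      # smooth approximation
    lh = (a[:, 0::2] - a[:, 1::2]) / 2      # horizontal detail
    hl = (d[:, 0::2] + d[:, 1::2]) / 2      # vertical detail
    hh = (d[:, 0::2] - d[:, 1::2]) / 2      # diagonal detail
    return ll, lh, hl, hh

img = np.arange(64, dtype=float).reshape(8, 8)
ll, lh, hl, hh = haar2d_level(img)
# Recurse on `ll` for coarser scales -- that recursion is the multi-scale
# decomposition JPEG 2000 uses instead of independent 8x8 DCT blocks.
```
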
[−] petercooper 59d ago
I've been working on a pure Ruby JPEG encoder and a bug led me to an effect I wanted. The output looked just like the "crunchy" JPEGs my 2000-era Kodak digital camera used to put out, but it turns out the encoder wasn't following the zig-zag pattern properly but just going in raster order. I'm now on a quest to figure out if some early digital cameras had similar encoding bugs because their JPEG output was often horrendous compared to what you'd expect for the filesize.
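For anyone curious what the bug skips: the zig-zag scan visits the 8x8 coefficients along anti-diagonals so that low frequencies come first, which is what makes the trailing runs of zeros cheap to encode. A sketch of the correct order (raster order, by contrast, would just read row by row):

```python
# Generate the JPEG zig-zag scan order for an 8x8 block. Coefficients on each
# anti-diagonal (r + c = const) are visited in alternating directions.
def zigzag_order(n=8):
    s = lambda rc: rc[0] + rc[1]  # anti-diagonal index
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        key=lambda rc: (s(rc), -rc[1] if s(rc) % 2 else rc[1]),
    )

order = zigzag_order()
print(order[:6])  # [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2)]
```

Scanning in raster order instead leaves high-frequency coefficients scattered through the stream, so the entropy coder wastes bits and quantization damage lands in odd places -- plausibly the "crunchy" look.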
[−] Paulo75 59d ago
The part about green getting 58.7% weight in the luminance calculation is one of those details that seems arbitrary until you realize it's literally modeled on the density of cone cells in the human retina. The whole algorithm is basically a map of what human eyes can't see.
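Those are the BT.601 weights JPEG uses for its RGB to YCbCr conversion; a one-liner shows how lopsided they are:

```python
# BT.601 luma weights: green dominates because the eye is most sensitive there.
def luma(r, g, b):
    return 0.299 * r + 0.587 * g + 0.114 * b

print(luma(255, 255, 255))  # pure white -> ~255.0
print(luma(0, 255, 0))      # pure green alone carries ~59% of full luma
print(luma(0, 0, 255))      # pure blue carries only ~11%
```
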
[−] meindnoch 59d ago
What would happen if the Cr and Cb channels used different chroma subsampling patterns? E.g. Cr would use the 4:2:0 pattern, and Cb would use the 4:1:1 pattern.
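As far as I know, JPEG's per-component sampling factors in the frame header make mixed patterns like that expressible, though decoder support would be the practical question. The two patterns trade resolution in different directions while keeping the same sample count, which a toy sketch makes concrete:

```python
import numpy as np

def subsample_420(ch):
    """4:2:0 -- average each 2x2 block: half resolution both ways."""
    return (ch[0::2, 0::2] + ch[0::2, 1::2] + ch[1::2, 0::2] + ch[1::2, 1::2]) / 4

def subsample_411(ch):
    """4:1:1 -- average each 1x4 run: quarter horizontal, full vertical."""
    return (ch[:, 0::4] + ch[:, 1::4] + ch[:, 2::4] + ch[:, 3::4]) / 4

ch = np.arange(64, dtype=float).reshape(8, 8)
print(subsample_420(ch).shape)  # (4, 4) -- 16 samples
print(subsample_411(ch).shape)  # (8, 2) -- also 16 samples, same data rate
```

So mixing them wouldn't change the compression ratio, just the shape of the chroma smearing: 4:1:1 artifacts stretch horizontally, 4:2:0 artifacts are square-ish.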
[−] momojo 59d ago
I've seen many a JPEG explainer, but this one wins for most aesthetic. The interactive visuals were also nice. My only criticism is the abrupt ending; should have concluded with the "now lets put it all together" slider.
[−] mdavid626 59d ago
JPEG still rules the world. Many new alternatives were developed, none of them really that much better than JPEG.
[−] NooneAtAll3 59d ago

> Application error: a client-side exception has occurred (see the browser console for more information).

Seems like the website doesn't work without WebGL enabled... why?

[−] vanderZwan 59d ago
This is a really great article, and I really appreciate how it explains the different parts of how JPEG works with so much clarity and interactive visualizations.

However, I do have to give one bit of critique: it also makes my laptop fans spin like crazy even when nothing is happening at all.

Now, this is not intended as a critique of the author. I'm assuming that she used some framework to get the results out quickly, and that there is a bug in how that framework handles events and reactivity. But it would still be nice if whatever causes this issue could be fixed. It would be sad if the website had the same issue on mobile and caused my phone battery to drain quickly when 90% of the time is spent reading text and watching graphics that don't change.

[−] tmilard 59d ago
Thanks for sharing: I now understand better how image sampling works, and how going from RGB to luminance+chroma works. Interesting and useful.
[−] Alen_P 59d ago
Really enjoyed this. It's easy to forget how much engineering went into JPEG. The explanation of compression and quality tradeoffs was clear without oversimplifying. Impressive how well the format still holds up today. Curious how you think it compares to newer formats like AVIF or WebP in everyday use.
[−] vismit2000 59d ago
The Unreasonable Effectiveness of JPEG: A Signal Processing Approach - https://www.youtube.com/watch?v=0me3guauqOU
[−] tehjoker 59d ago
Maybe it's because I've read a few pieces on JPEG before so I have some prior knowledge, but I was looking to review this and this presentation was one of the clearest I've seen. Good job!
[−] Zamicol 59d ago
Very impressive work. Well done on the blog.

This reminds me of the sort of work Nayuki does: https://www.nayuki.io

[−] account42 58d ago

> Application error: a client-side exception has occurred (see the browser console for more information).

OK, but what does this have to do with JPEG compression?

[−] aanet 59d ago
Just a lil comment to say that the blog is impressive -- the aesthetics, the explanations, the visuals -- all look great. Kudos.
[−] vmilner 59d ago
The weirdest thing to me is that the quantisation matrix isn’t symmetrical in the top left to bottom right diagonal.
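It really isn't symmetric -- here's the example luminance quantization table from Annex K of the JPEG spec, which most encoders scale for their quality settings:

```python
import numpy as np

# Example luminance quantization table from Annex K of the JPEG standard.
Q = np.array([
    [16, 11, 10, 16,  24,  40,  51,  61],
    [12, 12, 14, 19,  26,  58,  60,  55],
    [14, 13, 16, 24,  40,  57,  69,  56],
    [14, 17, 22, 29,  51,  87,  80,  62],
    [18, 22, 37, 56,  68, 109, 103,  77],
    [24, 35, 55, 64,  81, 104, 113,  92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103,  99],
])

# Not symmetric: horizontal and vertical frequencies are quantized differently,
# e.g. Q[0, 1] = 11 vs Q[1, 0] = 12.
print(np.array_equal(Q, Q.T))  # False
```

The usual explanation is that the table came from psychovisual experiments rather than a closed-form model, and measured sensitivity to horizontal vs vertical frequencies isn't identical.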
[−] jbverschoor 59d ago
Application error: a client-side exception has occurred (see the browser console for more information).
[−] greenavocado 59d ago
I would love to see this document extended to explain the optional arithmetic coding in JPEG
[−] j2kun 59d ago
Beyond the content, I have to say I love the aesthetic vibe of this website.