Should I bother adding qoi support? My gut feeling is that it's a pretty meh codec that doesn't push any of the existing boundaries. Does anyone use it at all?
It's actually quite annoying how much attention and widespread adoption it's gotten, when far more viable improvements are right there on the table, but no one's interested in them.
For example, I took my 16k×16k test image and saved it in gimp with compression level 0, resulting in a 1 GB file. This presumably applies the png filters, but skips deflate. Then I recompressed that file with general-purpose compressors:
gzip -9: 57 MB, 59.8 s to save, 2.6 s to load.
zstd -18: 43 MB, 25.9 s to save, 0.28 s to load.
xz -9: 39 MB, 17.3 s to save, 0.48 s to load.
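For what it's worth, the zstd side of that experiment needs nothing exotic; here's a minimal sketch in C using the one-shot API at level 18 as above (the function name and error handling are mine, not from any existing codebase):

```c
#include <stdlib.h>
#include <zstd.h>

/* Minimal sketch of the zstd -18 leg of the experiment above:
 * compress an in-memory buffer (e.g. the filtered-but-stored png
 * data) with zstd's one-shot API. Function name and error handling
 * are hypothetical. */
int compress_buffer(const void *src, size_t src_size,
                    void **dst, size_t *dst_size)
{
    size_t bound = ZSTD_compressBound(src_size);
    void *out = malloc(bound);
    if (!out)
        return -1;

    size_t written = ZSTD_compress(out, bound, src, src_size, 18);
    if (ZSTD_isError(written)) {
        free(out);
        return -1;
    }

    *dst = out;
    *dst_size = written;
    return 0;
}
```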
@wolfpld none of those have a specification that fits on one page
@wolfpld IIRC at the time of release it significantly beat all general-purpose compressors on compression speed.
It led to the creation of https://github.com/richgel999/fpng and https://github.com/veluca93/fpnge which are backwards compatible with png and faster than qoi... So it wasn't a waste of time, but I wouldn't support it.
FPNGE slides: https://www.lucaversari.it/FJXL_and_FPNGE.pdf
Old benchmarks (from https://x.com/jonsneyers/status/1483000547934449668):
@dougall If I remember correctly, Rich was toying with png (specifically replacing zlib with zstd) before he stumbled upon qoi.
Are fpnge/fjxl viable at all? Or do they need special support in the decompressor to be fast and not fall into the slow backwards compatibility mode?
The problem with new image formats is that they need widespread adoption to be useful, but somehow we are still stuck with png and jpeg (and maybe some webp here and there).
@wolfpld Yeah, it was topical...
I haven't looked at fjxl.
fpnge is an AVX2 proof-of-concept (no ARM support), so I don't consider it viable, but it is just a compressor. It's not really trying to make decompression faster.
fpng is similar but works on ARM, and has a fast decoder specifically for files it encoded itself. I think this made it strictly more competitive with qoi, but its decoder is only about 10% faster than Wuffs' general-purpose PNG decompression (so I wouldn't bother using it).
@dougall Yeah, the specific use case I'm most interested in is reading png files created with tools I have no control over, so these solutions are not particularly interesting to me. And png decoding is the bottleneck.
@wolfpld @dougall I was able to speed up standard zlib inflate by adding a faster pattern copy. The zlib code works one byte at a time, both to avoid unaligned access and to correctly copy short overlapping patterns. I replaced that with register-sized writes (32/64-bit) and smarter pattern replication. For well-compressed files (i.e., long runs of repeating patterns), decoding is dramatically faster.
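To illustrate the idea, here's a minimal sketch of such a wide match copy (my own reconstruction, not the actual patch): short overlapping patterns are first doubled in place until 8-byte chunks no longer overlap, after which memcpy lowers to single 64-bit loads and stores:

```c
#include <stddef.h>
#include <string.h>

/* Sketch of the idea, not the actual patch: copy an inflate
 * length/distance match in 8-byte chunks instead of one byte at a
 * time. out: next write position, dist: match distance, len: match
 * length, safe_end: 8 bytes before the end of the output buffer, so
 * every wide store below stays in bounds. Returns the new write
 * position. */
static unsigned char *copy_match_wide(unsigned char *out, size_t dist,
                                      size_t len, unsigned char *safe_end)
{
    const unsigned char *src = out - dist;
    unsigned char *end = out + len;

    /* Overlapping match (dist < 8): the output repeats with period
     * dist, so duplicate the pattern until source and destination are
     * at least 8 bytes apart and wide copies become overlap-free. */
    while (dist < 8 && out <= safe_end) {
        memcpy(out, src, dist);        /* one full period, no overlap */
        out += dist;
        if (out >= end)
            return end;                /* slight overshoot lands inside
                                          the buffer and is overwritten
                                          by later output */
        dist = (size_t)(out - src);    /* available period has doubled */
    }

    /* Fast path: register-sized chunks; memcpy lowers to one 64-bit
     * load/store pair. May overshoot end by up to 7 bytes, again
     * staying inside the buffer thanks to safe_end. */
    while (out < end && out <= safe_end) {
        memcpy(out, src, 8);
        out += 8;
        src += 8;
    }

    /* Near the end of the buffer: fall back to the classic
     * byte-at-a-time copy, which handles any overlap. */
    while (out < end)
        *out++ = *src++;
    return end;
}
```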
@fast_code_r_us @dougall Have you looked at how it works in zlib-ng? They're doing SIMD all over the place, so I would imagine there would be no such problems there.
In Arch, you can install the zlib-ng-compat package to replace zlib system-wide, so now I don't even bother building zlib-ng in my projects, especially considering how problematic it is to get libpng to pick up the library you just built instead of the system one.
@wolfpld That code has the right idea, but I went a little further. For my own use, I did away with the "safe" part and instead require the output buffer to be 8 bytes larger than the decoded size, so writes can run past the end. A small speed bump, but I only wrote it for myself.
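In code, that trick amounts to nothing more than padding the allocation (hypothetical sketch, tied to the copy routine sketched earlier):

```c
#include <stdlib.h>

/* Hypothetical caller side of the trick described above: over-allocate
 * the decode buffer by 8 bytes so every wide store in the match copy
 * stays in bounds and the per-copy safe_end check can be dropped. */
unsigned char *alloc_decode_buffer(size_t decoded_size)
{
    return malloc(decoded_size + 8);  /* 8 bytes of write slack */
}
```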