https://github.com/richgel999/fpng
Disclaimer: have not actively tried either.
Huffman encoding lets you store frequently used values in fewer bits than rarely occurring values, but the cost of a naïve implementation is a branch on every encoded bit. You can mitigate this by making a state machine keyed by "accumulated prefix bits" plus as many bits as you want to process in one whack, but those tables will blow out your L1 data cache and trash a lot of your L2 cache as well.¹
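To make the table-driven idea concrete, here is a minimal Python sketch of multi-bit Huffman decoding with a toy (invented) code. It shows the simplest variant, where every code fits in one K-bit probe, so each symbol costs one table lookup instead of a branch per bit; the "accumulated prefix" state machine described above generalizes this to codes longer than K, and a real table (K of 10–12 bits, entries for every state) is what eats the cache.

```python
# Toy table-driven Huffman decoder: one table lookup per symbol instead of
# one branch per encoded bit. The code below is hypothetical, not from any
# real format, and assumes all code lengths are <= K.
K = 4  # bits examined per table probe; the table has 2**K entries

# toy prefix code: symbol -> (code value, code length in bits)
CODES = {"a": (0b0, 1), "b": (0b10, 2), "c": (0b110, 3), "d": (0b111, 3)}

def build_table(codes, k):
    table = [None] * (1 << k)
    for sym, (code, length) in codes.items():
        # every k-bit window whose top `length` bits equal `code`
        # decodes to this symbol
        for pad in range(1 << (k - length)):
            table[(code << (k - length)) | pad] = (sym, length)
    return table

def decode(bits, nsyms, codes=CODES, k=K):
    table = build_table(codes, k)
    bits = bits + "0" * k          # padding so the final probe has k bits
    out, pos = [], 0
    for _ in range(nsyms):
        window = int(bits[pos:pos + k], 2)
        sym, length = table[window]
        out.append(sym)
        pos += length              # consume only the bits actually used
    return "".join(out)
```

Decoding "abcd" (bits 0, 10, 110, 111 concatenated) takes four probes into a 16-entry table; scale K up and the same trick decodes several symbols per probe, at the cache cost described above.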
The "opcode" strategy in QOI is going to give you branches, but they appear nearly perfectly predictable for common image types², so that helps. It has a table of recent colors, but that is only of a few cache lines.
In all, it seems a better fit for the deep pipelines and wildly varying access speeds across cache and memory layers which we find today.
␄
¹ I don't think it ever made it into a paper, but in the mid-80s, when the best our VAX Ethernet adapters could do was ~3 Mbps, I was getting about 10 Mbps of decompressed 12-bit monochrome imagery out of a ~1.3 MIPS computer using this technique.
² I also wouldn't be surprised if this statement is false. It just seems that for continuous tone images one of RGBA, DIFF, or LUMA is going to win for any given region of a scan line.
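The per-pixel choice footnote 2 is talking about can be sketched from the difference ranges in the QOI spec (alpha must be unchanged for DIFF and LUMA; the wrapping byte arithmetic and the op names are from the spec, the function itself is mine):

```python
# Rough sketch of the per-pixel op choice: try the cheapest encoding that
# fits the (wrapping) difference from the previous pixel.
def pick_op(prev, cur):
    dr = (cur[0] - prev[0] + 128) % 256 - 128  # wrapping byte difference
    dg = (cur[1] - prev[1] + 128) % 256 - 128
    db = (cur[2] - prev[2] + 128) % 256 - 128
    if cur[3] != prev[3]:
        return "QOI_OP_RGBA"                   # 5 bytes
    if -2 <= dr <= 1 and -2 <= dg <= 1 and -2 <= db <= 1:
        return "QOI_OP_DIFF"                   # 1 byte
    if -32 <= dg <= 31 and -8 <= dr - dg <= 7 and -8 <= db - dg <= 7:
        return "QOI_OP_LUMA"                   # 2 bytes
    return "QOI_OP_RGB"                        # 4 bytes
```

In a smooth region of a scan line the differences stay tiny and DIFF wins run after run, which is exactly the kind of repetition a branch predictor locks onto.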
(I write comments with footnotes in the same style as you, but use “—⁂—” as the separator, via Compose+h+r (name from the HTML tag horizontal rule). Good fun being able to use Compose+E+O+T, Compose+E+T+X and Compose+F+F in this comment; I added the full set to my .XCompose years ago.)
https://github.com/toitlang/toit/commit/65c6c1bd7138f9ebced4... It's not as highly optimized as this effort, though, and it just uses the standard Huffman table that is built into deflate, rather than a static-but-custom one.
- "The QOI File Format Specification" 214 points | 3 days ago | 54 comments: https://news.ycombinator.com/item?id=29625084
- "QOI: Lossless Image Compression in O(n) Time" 1057 points | 29 days ago | 293 comments: https://news.ycombinator.com/item?id=29328750
At the very least, break it into chunks and add an offset directory header. I'm sure one could do something much better, but it's a start.
EDIT: typo
Seriously, who?
This project is interesting because of how well it does compared to other systems of much higher complexity and without optimizing the implementation to high heaven. We can all learn something from that.
Of all the artifacts in our industry, few things live longer than formats. E.g. we are still unpacking tar files (Tape ARchive), transmitted over IPv4, decoded by machines running x86 processors (and others, sure). None of these formats could possibly have anticipated the evolution that followed, nor predicted the explosive popularity they would have. And all of them (the latter two notably) have overheads with real material costs. IPv6 fixed all the misaligned fields, but IPv4 is still dominant. Ironically, RISC-V didn't learn from x86 but added variable-length instructions, making decoding harder to scale than necessary.
I'm not sure what positive lessons you think we should learn from QOI. It's not hard to come up with simple formats. It's much harder coming up with a format that learns from past failures and avoids future pitfalls.
https://news.ycombinator.com/item?id=29625084
[1] https://news.ycombinator.com/pool
I wanted a simple format that allows you to load/save images quickly, without dealing with the complexity of JPEG or PNG. Even BMP, TIFF and other "legacy" formats are way more complicated to handle when you start looking into it. So that's what QOI aims to replace.
There's a lot of research for a successor format ongoing. Block based encoding, conversion to YUV, more OP-types etc. have all shown improved numbers. Better support for metadata, different bit-depths and allowing restarts (multithreading) is also high on the list of things to implement.
But QOI will stay as it is. It's the lowest of all hanging fruits that's not rotten on the ground.
XPM? Compressed with gzip?
EDIT: I'm thinking about a format that would be suitable as a replacement for uncompressed WAV files in DAWs. Rendered tracks often have large sections of silence and uncompressed WAVs have always seemed wasteful to me.
Using the command line "flac" tool, "flac -0" is the fastest, "flac -8" is the slowest, but produces the smallest files.
In my experience, 0-2 all produce roughly equivalent sized files, as do 4-8.
My application is to send audio (podcast recordings) to a remote audio engineer friend who will do the post processing, then round trip it to me to complete the editing.
WAV is so big it makes a 1-hour podcast a difficult proposition.
MP3 is unsuitable because compression introduces too many artefacts; the quality suffers unacceptably.
What do other people do in these circumstances?
(but seriously, MODs can encode hours of audio into kilobytes, the downside is of course that they require a special authoring process which seems to be a bit of a lost art today)
This would eliminate the need for a header (bloat!) as the end of file is clearly defined, the size is defined after decoding the top and right line (second turn), and it's not so sensitive to orientation (a pathological image can compress very differently in portrait vs landscape in line oriented formats). Color profile can be specified in the spec.
Also allows skipping altogether some image-wide bands or columns that are of the background color (defined by the first pixel) as you do not need to walk over all the pixels.
Speed, yes, it is a fair objection, until hardware adopts spiral encoding :-)
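For anyone wanting to play with the idea, the traversal order it implies is just a boundary walk that spirals inward — defined for any width/height without the encoder emitting either up front. A sketch (function name is mine):

```python
# Visit every pixel of a w-by-h image in clockwise spiral order,
# outside edge first, working inward.
def spiral_order(w, h):
    top, bottom, left, right = 0, h - 1, 0, w - 1
    coords = []
    while top <= bottom and left <= right:
        coords += [(x, top) for x in range(left, right + 1)]       # top edge
        coords += [(right, y) for y in range(top + 1, bottom + 1)] # right edge
        if top < bottom:
            coords += [(x, bottom) for x in range(right - 1, left - 1, -1)]
        if left < right:
            coords += [(left, y) for y in range(bottom - 1, top, -1)]
        top, bottom, left, right = top + 1, bottom - 1, left + 1, right - 1
    return coords
```

Each pixel is visited exactly once, and — as noted above — the decoder learns the dimensions as soon as the walk makes its second turn.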
I think the Rust png library is ~4x faster than libpng, which could erase the decompression advantage, but that 50x faster compression speed is extremely impressive.
Can anybody tell if there are any significant feature differences that might explain the gap (color space, pixel formats, etc.)?
You can choose not to do filtering when encoding PNG. Fast deflate settings are literally RLE-only, and you can see elsewhere in this thread people have developed specialized encoders that ignore most deflate features.
The only misfeature PNG has that slows down encoding is CRC. Decoders don't have to check the CRC, but encoders need to put one in to be spec-compliant.
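Those fast settings are directly exposed by zlib as a compression strategy: Z_RLE restricts deflate to run-length matches only, skipping the expensive LZ77 match search. A quick sketch using Python's zlib bindings (the scanline data is made up for the demo):

```python
# "Fast" deflate as used by speed-oriented PNG encoders: strategy Z_RLE
# limits the matcher to run-length matches, which is far cheaper than a
# full LZ77 search and still squashes filtered scanlines well.
import zlib

def rle_deflate(data):
    co = zlib.compressobj(1, zlib.DEFLATED, 15, 8, zlib.Z_RLE)
    return co.compress(data) + co.flush()

# e.g. a filtered PNG scanline: filter byte followed by a long flat run
scanline = bytes([0]) + bytes(255 for _ in range(1024))
packed = rle_deflate(scanline)
```

The output is still a perfectly standard deflate stream, so any PNG decoder reads it — the encoder just declines to spend time finding fancier matches.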
I tried a few standard compression formats, with very little luck.
Canon has devised a very smart (slightly lossy) compression format for newer cameras, but there's no converter that I know of for my old camera files.
So, unless I shell out large amounts of money for a new camera, I'm stuck sending twice the data over the internet. Talk about pollution...
Come to think of it, have you tried running a modern compression algorithm on the data? I don't think I did. Could be cool if combined with ZFS or similar to get the compression done transparently.
I should, as you suggest, test a more exhaustive list of formats, who knows...
Here, look: some people adapted it to iOS in one hour of faffing around on Twitch: https://www.twitch.tv/videos/1241476768?tt_medium=mobile_web...
So this will probably see a JS / Webasm shim, and if that proves popular, Blink and Gecko will consider it.
The day might come soon when browsers just greenlight a webasm interface for codecs. "We'll put packets in through this function, and take frames out through this function, like ffmpeg. Other than that, you're running in a sandbox with X MB of RAM, Y seconds of CPU per frame, and no I/O. Anything you can accomplish within that, with user opt-in, is valid."
https://floooh.github.io/qoiview/qoiview.html
A QOI decoder should fit into a few hundred bytes of WASM at most, maybe a few kilobytes for a "proper" polyfill.