H.264 for image compression

Google recently presented the WebP initiative, which proposes a replacement for JPEG based on the intra-frame compression technique of the WebM project (the VP8 codec).
Not everybody knows it, but H.264 is also very good at encoding still pictures and, unlike WebM or WebP, 97% of PCs are already able to decode it (did someone say ‘Flash Player’?).
The intra-frame compression techniques implemented in H.264 are very efficient, much more advanced than the old JPG and superior to WebP too. So let’s take a look at how to produce such files and at the advantages of using them inside Flash Player.

JPG image compression

JPG is an international standard approved in 1992-94. It has been one of the most important technologies for the web, because without an efficient way to compress still pictures the web would not be what it is today. JPEG can usually compress an image to about one tenth of its original size (1:10). The encoder performs these steps:

1. Color space conversion from RGB to YCbCr
2. Chroma subsampling, usually to 4:2:0 (4:2:2 and 4:4:4 are also supported)
3. Discrete Cosine Transform of 8×8 blocks
4. Quantization
5. Entropy Coding (ZigZag RLE and Huffman)
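
The steps above can be sketched in a few lines of Python. This is only a minimal illustration, not a real JPEG encoder: it skips chroma subsampling and Huffman coding, uses a single quantization step instead of the standard 8×8 quantization tables, and all function names are made up for this example.

```python
import math

def rgb_to_ycbcr(r, g, b):
    """Step 1: full-range RGB -> YCbCr conversion (JFIF-style coefficients)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 128
    cr = 0.5 * r - 0.418688 * g - 0.081312 * b + 128
    return y, cb, cr

def dct_8x8(block):
    """Step 3: naive (slow) 2-D DCT-II of an 8x8 block."""
    def c(k):
        return math.sqrt(0.5) if k == 0 else 1.0
    out = [[0.0] * 8 for _ in range(8)]
    for u in range(8):
        for v in range(8):
            s = 0.0
            for x in range(8):
                for y in range(8):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / 16)
                          * math.cos((2 * y + 1) * v * math.pi / 16))
            out[u][v] = 0.25 * c(u) * c(v) * s
    return out

def quantize(coeffs, step):
    """Step 4: uniform quantization (assumption: one step for all frequencies)."""
    return [[round(v / step) for v in row] for row in coeffs]

def zigzag(block):
    """First half of step 5: read the 8x8 block along anti-diagonals."""
    order = sorted(
        ((r, c) for r in range(8) for c in range(8)),
        # Odd diagonals run top-right to bottom-left, even ones the other way.
        key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else p[1]),
    )
    return [block[r][c] for r, c in order]
```

A flat block shows why this pipeline compresses smooth areas so well: after the DCT only the “DC” coefficient is non-zero, so the zig-zag scan yields one value followed by a long run of zeros that RLE collapses.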

The algorithm is well known and robust, and it is used in almost every electronic device with a color display, but obviously in the last 15 years researchers have developed more advanced algorithms to encode still pictures. One of these is JPEG 2000, which uses wavelets to encode the picture. Improving intra-frame compression is also very important in video encoding, because this is the kind of compression used for keyframes. So first H.263 and then H.264 introduced more optimized ways to encode a single picture.

H.264 intra frame compression

H.264 contains a number of new features that allow it to compress images much more efficiently than JPG.

New transform design

Unlike JPG, which uses the well-known 8×8 DCT, H.264 uses an exact-match 4×4 integer spatial block transform. It is conceptually similar to the DCT but produces fewer ringing artifacts. There is also an 8×8 integer transform (a High profile feature) for less detailed areas.
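
As a rough illustration of what “exact-match integer transform” means, here is a small Python sketch of the 4×4 forward core transform. The matrix is the commonly documented H.264 one; the inverse shown here is only a mathematical check built with exact fractions (an assumption of this example), whereas a real decoder uses its own integer inverse with the scaling folded into dequantization.

```python
from fractions import Fraction

# Forward core transform matrix: an integer approximation of a 4x4 DCT.
CF = [[1, 1, 1, 1],
      [2, 1, -1, -2],
      [1, -1, -1, 1],
      [1, -2, 2, -1]]

def matmul(a, b):
    """Plain 4x4 matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(m):
    return [list(row) for row in zip(*m)]

def forward_4x4(block):
    """Y = CF * X * CF^T, computed with integer arithmetic only."""
    return matmul(matmul(CF, block), transpose(CF))

def inverse_4x4(coeffs):
    """Exact inverse, for illustration only.

    The rows of CF are orthogonal with squared norms 4, 10, 4, 10, so
    dividing each coefficient by the product of its row and column norms
    and applying CF^T ... CF recovers the input without any rounding.
    """
    norms = [4, 10, 4, 10]
    scaled = [[Fraction(coeffs[i][j], norms[i] * norms[j]) for j in range(4)]
              for i in range(4)]
    return matmul(matmul(transpose(CF), scaled), CF)
```

Because every step is integer (or exact rational) arithmetic, encoder and decoder cannot drift apart through rounding differences, which is a known weakness of floating-point DCT implementations.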

A secondary Hadamard transform (2×2 on chroma and 4×4 on luma) can also be applied to the “DC” coefficients to obtain even more compression in smooth regions.

There is also an optimized quantization stage and two possible zig-zag patterns for run-length encoding of the transformed coefficients.

Intra-frame compression

H.264 introduces complex spatial prediction for intra-frame compression.
Rather than the “DC”-only prediction found in MPEG-2 and the transform coefficient prediction found in H.263+, H.264 defines 9 prediction modes (8 directional modes plus DC) to predict spatial information from neighbouring blocks when the 4×4 transform is used. The encoder tries to predict each block by extrapolating the pixel values of adjacent, already encoded blocks, so only the difference (delta) signal has to be transmitted.
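
To make the idea concrete, here is a simplified Python sketch of spatial prediction on a 4×4 block. It implements only three modes (vertical, horizontal and DC) and picks the best one by sum of absolute differences; the mode names, the function names and the SAD criterion are choices made for this example, and a real encoder would typically use a rate-distortion cost.

```python
def predict_4x4(top, left, mode):
    """Build a 4x4 prediction from the 4 pixels above and the 4 to the left."""
    if mode == "vertical":      # copy the row above downwards
        return [list(top) for _ in range(4)]
    if mode == "horizontal":    # copy the left column rightwards
        return [[left[r]] * 4 for r in range(4)]
    if mode == "dc":            # rounded mean of the 8 neighbouring pixels
        dc = (sum(top) + sum(left) + 4) >> 3
        return [[dc] * 4 for _ in range(4)]
    raise ValueError("unknown mode: %s" % mode)

def sad(a, b):
    """Sum of absolute differences between two 4x4 blocks."""
    return sum(abs(a[r][c] - b[r][c]) for r in range(4) for c in range(4))

def best_mode(block, top, left):
    """Return (mode, residual) for the prediction closest to the block."""
    modes = ("vertical", "horizontal", "dc")
    mode = min(modes, key=lambda m: sad(block, predict_4x4(top, left, m)))
    pred = predict_4x4(top, left, mode)
    residual = [[block[r][c] - pred[r][c] for c in range(4)] for r in range(4)]
    return mode, residual
```

For a block of vertical stripes that continues the row above it, the vertical mode predicts the block perfectly and the residual is all zeros, so almost nothing has to be transmitted.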

There are also 4 prediction modes for smooth color zones (16×16 blocks). The residual data is coded with 4×4 transforms, and a further 4×4 Hadamard transform is applied to the DC coefficients.

Improved quantization

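A new logarithmic quantization step is used: each increment of the quantization parameter increases the step size by about 12% (compound), so the step doubles every 6 increments. It is also possible to use frequency-customized quantization scaling matrices, selected by the encoder, for perceptual quantization optimization.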
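
The relation between the quantization parameter (QP) and the step size can be sketched as follows. The six base values are the ones usually quoted for 8-bit H.264; the function name is just for this example.

```python
# Quantizer step sizes for QP 0..5; every further 6 QP doubles the step.
BASE_QSTEP = [0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125]

def qstep(qp):
    """Quantizer step size for a QP in the 8-bit range 0..51."""
    if not 0 <= qp <= 51:
        raise ValueError("QP must be in 0..51")
    return BASE_QSTEP[qp % 6] * (1 << (qp // 6))

# Average growth per QP increment: the sixth root of 2, i.e. about 12%.
GROWTH_PER_QP = 2 ** (1 / 6)
```

With this table the step is 1.0 at QP 4, 2.0 at QP 10 and 224 at QP 51, which is why a few QP (or CRF) points make such a visible difference in file size.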

In-loop deblocking filter

An adaptive deblocking filter is applied to reduce any blocking artifacts that appear at high compression ratios.

Advanced Entropy Coding

H.264 can use the state of the art in entropy coding: Context-Adaptive Binary Arithmetic Coding (CABAC), which is much more efficient than the standard Huffman coding used in JPG.
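
One way to see where the gain comes from: a Huffman code that codes symbols one at a time has to spend at least one whole bit per symbol, while an arithmetic coder can approach the entropy of the source. The sketch below only computes that entropy bound for a binary symbol; it is not a CABAC implementation, and the 95% figure is an arbitrary example.

```python
import math

def binary_entropy(p):
    """Shannon entropy, in bits, of a binary symbol with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

# A very predictable bit (95% zeros) carries well under 0.3 bits of
# information, yet a per-symbol Huffman code still needs 1 bit for it.
skewed_bits = binary_entropy(0.95)
```

The more predictable the context model makes each binary decision, the further below one bit per decision an arithmetic coder can go.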

JPEG vs H.264

The techniques used in H.264 roughly double the efficiency of the compression; that is, you can achieve the same quality at about half the size. The gain is even larger at very high compression ratios (1:25 and above), where JPG introduces so many artifacts that the result is completely unusable.

This is a detail of a 1024×576 image compressed to around 50 KBytes both in JPG (using PaintShopPro) and in H.264 (using FFMPEG). The picture is shown at 2× zoom to make the artifacts easier to see:

I have estimated a reduction in size of around 40-50% at the same perceived quality, especially at high compression ratios.
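
For reference, this is the arithmetic behind the compression ratio of that test image, assuming an uncompressed 24-bit RGB source and taking 50 KBytes as 50 × 1024 bytes (both are assumptions of this sketch, as is the function name).

```python
def compression_stats(width, height, compressed_bytes, bytes_per_pixel=3):
    """Return (raw size in bytes, compression ratio, compressed bits per pixel)."""
    raw = width * height * bytes_per_pixel
    ratio = raw / compressed_bytes
    bits_per_pixel = compressed_bytes * 8 / (width * height)
    return raw, ratio, bits_per_pixel

raw, ratio, bpp = compression_stats(1024, 576, 50 * 1024)
```

Under these assumptions the raw frame is 1,769,472 bytes, so 50 KBytes corresponds to a ratio of roughly 1:35, or about 0.7 bits per pixel, well inside the range where JPG starts to break down.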

WebP vs H.264

WebP is based on the intra-frame compression technique of the VP8 codec. I compared H.264 with VP8 in this article. VP8 is a good codec and its intra-frame compression is very similar to that of H.264. The differences are that VP8 does not support the 8×8 block transform (a feature of the H.264 High profile) and can only encode in 4:2:0 (H.264 supports 4:4:4). So both should have approximately the same performance on their common ground (4:2:0). The problem with WebP is support, which is currently almost zero, while H.264 can be decoded by Flash (97% of desktops, plus Android and RIM devices) and also by iOS devices (via HTML5).

How to encode and display on a page

Now let’s encode a picture in H.264. The container can be .mp4 or .flv; FLV is lighter than .mp4, but .mp4 has far more support outside Flash. This is the command line to use with FFMPEG:

ffmpeg.exe -i INPUT.jpg -an -vcodec libx264 -coder 1 -flags +loop -cmp +chroma -subq 10 -qcomp 0.6 -qmin 10 -qmax 51 -qdiff 4 -flags2 +dct8x8 -trellis 2 -partitions +parti8x8+parti4x4 -crf 24 -threads 0 -r 25 -g 25 -y OUTPUT.mp4

The -crf parameter changes the quality level: lower values give higher quality and bigger files. Try values from 15 to 30 (the final effect depends on the frame size). You can also resize the frame before encoding using the parameter -s WxH (e.g. -s 640x360).
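
To compare several quality levels quickly, the encode can be scripted. The following Python sketch is an assumption-laden helper, not the command used for the tests in this post: it uses a much shorter command with current ffmpeg option spellings (-c:v, -frames:v, -pix_fmt), expects ffmpeg to be on the PATH, and the function names and output file names are made up.

```python
import subprocess

def build_ffmpeg_args(src, dst, crf=24, size=None):
    """Build a minimal ffmpeg argument list for a single-frame H.264 encode."""
    args = ["ffmpeg", "-y", "-i", src, "-an", "-c:v", "libx264",
            "-crf", str(crf),
            "-pix_fmt", "yuv420p",   # 4:2:0 needs an even width and height
            "-frames:v", "1"]        # a still picture is a single frame
    if size is not None:
        args += ["-s", "%dx%d" % size]
    args.append(dst)
    return args

def encode_crf_sweep(src, crfs=(15, 20, 25, 30)):
    """Encode src once per CRF value and return the output file names."""
    outputs = []
    for crf in crfs:
        dst = "out_crf%d.mp4" % crf  # hypothetical naming scheme
        subprocess.run(build_ffmpeg_args(src, dst, crf), check=True)
        outputs.append(dst)
    return outputs
```

Calling encode_crf_sweep("INPUT.jpg") would then produce one file per CRF value, so the sizes and the visual quality can be compared side by side.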

To display the picture encoded in H.264 you can use this simple AS3 code:

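// Assumes a Video instance named "video" is already on the stage.
var nc:NetConnection = new NetConnection();
nc.client = this;
nc.connect(null);                          // null = progressive download, no media server
var ns:NetStream = new NetStream(nc);      // must be created after connect()
ns.client = this;                          // callbacks such as onMetaData are looked up here
function onMetaData(info:Object):void {}   // empty handler, avoids an async error
video.attachNetStream(ns);
video.smoothing = true;
stage.scaleMode = "noBorder";
ns.play("OUTPUT.mp4");                     // the file produced by the FFMPEG command above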

The advantage of using Flash for serving pictures in H.264

The main advantage of using H.264 for pictures is the superior compression ratio. But in an everyday scenario it is not practical to replace every instance of the common <img> tag with an SWF.
However, there is one kind of application that can benefit enormously from this approach: displaying big, high-quality copyrighted pictures. Instead of offering low-quality, watermarked JPGs, you could serve such big, high-quality pictures as H.264 streams from a Flash Media Server and protect the delivery using the RTMPE protocol and SWF authentication. On top of that, for bullet-proof protection, you could even protect the H.264 payload by encrypting the content with a robust DRM like Adobe Access 2.0. Not bad.

12 thoughts on “H.264 for image compression”

  1. Pretty neat…but I wouldn’t use the words “Bullet-proof protection”, unless Flash can somehow manage to pop the Print Screen key off my keyboard 😛

    1. Yes, you are right. Indeed FP 10.1 DRM supports HDMI protection but it is supported only in Windows Vista / 7 so screen grabbing is an issue.
      But suppose you have a big picture that you can scroll and zoom inside a preview window, with a watermark stamped at runtime. It would be difficult to
      reconstruct the original picture piece by piece.

  2. Hello,

    Very interesting technique to save bandwidth with photo galleries.

    I have just given it a try with the latest ffmpeg to date (23/10/10), and ffmpeg seems to output just an empty mp4 container (output.mp4, 260 bytes long).

    Which version of ffmpeg did you use?

    Thank you


  3. I’d like to point out that your example picture is a .jpg, so is not an accurate representation of the artifacts. (You stored the two versions into one jpeg after compressing in h264/jpeg) It would be better to use a lossless format like .png–that way you’re only seeing the artifacts from the first compression as intended.

  4. Hmm.. wait a minute, in your command line example, you’re using a JPG as input?
    Wasn’t the purpose of using h264 that it produces less artifacts than jpeg, but then you use jpeg as input.

    Reminds me of somebody converting MP3 to high bitrate AAC….

  5. Hi, I would like to find a way to convert an h264 frame to jpeg. I need to migrate the implementation on a Windows CE platform. Does anyone have any suggestion as to how to achieve this with the current open source implementation such as FFMpeg

  6. On OS X, with an ffmpeg installed via Homebrew, the following command works and results in a file that is compatible with Chrome, QuickTime and the latest iDevices:
    ffmpeg -i FILENAME.png -an -vcodec libx264 -coder 1 -flags +loop -cmp +chroma -subq 10 -qcomp 0.6 -qmin 10 -qmax 51 -qdiff 4 -profile:v high -level 4.2 -pix_fmt yuv420p -vf "scale=trunc(iw/2)*2:trunc(ih/2)*2" -trellis 2 -partitions +parti8x8+parti4x4 -crf 24 -threads 0 -r 25 -g 25 -y FILENAME.mp4

    Basically you have to remove “-flags2 +dct8x8” from the command in the original post, I added a couple more parameters for apple optimizations
