How H.264 works – Part III: VP6 vs H.264.

The time has come for the third and last part of our H.264 analysis. You can find the first and the second part here and here. In this concluding part we compare H.264, the new “kid on the block”, with VP6, the Flash Video “veteran”.

VP6 video codec

For Flash Player 8, Macromedia needed a state-of-the-art video codec to sustain the adoption of the Flash Player as the ideal video delivery platform. The choice fell on On2’s VP6. You may ask why, in 2005, Macromedia chose a proprietary technology instead of H.264, whose specification had already been released. The reasons were the excessive complexity of H.264 and its licensing.

As a result of the licensing scheme, the cost of deploying professional H.264 video can be very high. Under the MPEG LA terms, in larger deployments encoder and decoder fees are $0.20 per unit up to five million units (beyond that, the price is $0.10 per unit). There are also content-based fees. The cap for these fees was $3.5 million (per firm) in 2005 and 2006, $4.25 million in 2007 and 2008 and $5 million in 2009 and 2010.
Under the terms of the VIA License, in larger deployments, encoder and decoder fees are $0.25 per unit, and there are content fees as well. As an annual cap on content fees, non-PC OEMs pay $2.5 million a year and PC OEMs pay $4 million.
These royalties cover only the intellectual property licenses, not a working codec. To this cost we must therefore add the cost of developing an efficient H.264-compliant codec, which was not an easy task.
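To get a feeling for the numbers, here is a minimal Python sketch of the per-unit arithmetic quoted above. It covers only the per-unit encoder/decoder fees as stated in this post; content fees, caps and any other license terms are deliberately left out, and the function names are ours, purely for illustration.

```python
# Per-unit fee arithmetic using only the rates quoted in this post.
# Content fees, annual caps and other license terms are NOT modeled.
# Amounts are computed in integer cents to avoid floating-point drift.

MPEG_LA_TIER_LIMIT = 5_000_000   # units charged at the first rate
MPEG_LA_FIRST_RATE_CENTS = 20    # $0.20 per unit up to the limit
MPEG_LA_SECOND_RATE_CENTS = 10   # $0.10 per unit beyond the limit
VIA_RATE_CENTS = 25              # $0.25 per unit


def mpeg_la_unit_fees(units: int) -> float:
    """Return the per-unit fees in dollars under the MPEG LA rates above."""
    first = min(units, MPEG_LA_TIER_LIMIT)
    rest = max(units - MPEG_LA_TIER_LIMIT, 0)
    cents = first * MPEG_LA_FIRST_RATE_CENTS + rest * MPEG_LA_SECOND_RATE_CENTS
    return cents / 100


def via_unit_fees(units: int) -> float:
    """Return the per-unit fees in dollars under the VIA rate above."""
    return units * VIA_RATE_CENTS / 100


if __name__ == "__main__":
    for units in (1_000_000, 5_000_000, 8_000_000):
        print(units, mpeg_la_unit_fees(units), via_unit_fees(units))
```

For a hypothetical deployment of eight million units, the first five million are charged at $0.20 and the remaining three million at $0.10, before any content fee is considered.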

The choice of VP6, which is royalty-free, allowed Macromedia to avoid paying several million dollars in fees. But this early advantage initially hid VP6’s main disadvantage: On2 firmly holds the rights to the production and use of VP6 encoders. The lack of a free VP6 encoder has slowed wider adoption of this technology.

On2 is a firm specializing in high-quality video codecs. In the last 15 years, On2 has released a set of very good codecs: VP3 (later released as open source, becoming the basis of the Theora project), VP4, VP5, VP6 and VP7. Macromedia licensed both VP6 and VP7 but, so far, has included only a VP6 decoder in Flash Player 8+ and an encoder in the Flash Player 8 Professional Video Exporter. After the Adobe-Macromedia deal, it’s not clear what use Adobe will make of the VP7 license.

On2 declares VP6 to be an H.264-class codec that even surpasses it in Peak Signal-to-Noise Ratio (PSNR) in many scenarios. Because VP6 is a proprietary technology, we don’t have detailed information about the compression techniques it uses. All the information below derives from unofficial web materials, the VP3 format documentation, On2’s white papers and interviews.
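Since PSNR is the metric On2 refers to, a short reminder of how it is computed may help. The sketch below is the standard textbook definition for 8-bit samples (PSNR = 10·log10(peak² / MSE)); it is not On2’s measurement tool, and the function name and the flat-list input format are our own assumptions.

```python
import math


def psnr(reference, decoded, peak=255):
    """Peak Signal-to-Noise Ratio in dB between two equal-length sample lists.

    `reference` and `decoded` are flat sequences of pixel values
    (for example the luma plane of a frame); `peak` is the maximum
    possible sample value (255 for 8-bit video).
    """
    if len(reference) != len(decoded) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the original and the decoded samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals: no noise at all
    return 10 * math.log10(peak ** 2 / mse)


if __name__ == "__main__":
    original = [100, 100, 100, 100]
    compressed = [101, 99, 101, 99]   # every sample is off by one
    print(round(psnr(original, compressed), 2))
```

A higher PSNR means the decoded frame is numerically closer to the source, which does not always coincide with what the eye perceives.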

Technical details

VP6 makes use of intra-compressed frames (I-frames) and unidirectionally predicted frames (P-frames) only. There are no B-frames, but P-frames can have multiple reference frames in the past.

VP6 uses a somewhat traditional 8×8 iDCT-class transform for the spatial-to-frequency domain transformation (VP7 uses a 4×4 transform, similar to H.264). Intra-compression makes use of spatial prediction modes, although not as advanced as those found in VP7 or in H.264 (probably similar to what is used in H.263++).
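To make the idea of an 8×8 spatial-to-frequency transform concrete, here is a minimal sketch of the textbook floating-point 2D DCT-II on an 8×8 block. This is the generic transform family the paragraph refers to, not VP6’s actual integer implementation, which is not publicly documented in detail; the function name is ours.

```python
import math

N = 8  # block size used by the transform


def dct_8x8(block):
    """Textbook orthonormal 2D DCT-II of an 8x8 block (list of 8 lists of 8 numbers)."""

    def scale(k):
        # Normalization factor that makes the transform orthonormal.
        return math.sqrt(0.5) if k == 0 else 1.0

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):          # vertical frequency index
        for v in range(N):      # horizontal frequency index
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            coeffs[u][v] = 0.25 * scale(u) * scale(v) * total
    return coeffs


if __name__ == "__main__":
    flat = [[128] * N for _ in range(N)]   # a perfectly uniform block
    result = dct_8x8(flat)
    print(round(result[0][0], 3), round(result[0][1], 3))
```

A uniform block puts all its energy in the single DC coefficient and leaves the other 63 at zero, which is exactly why smooth image areas compress so well after the transform.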

Macroblocks are arrays of 16×16 pixels, and motion prediction is done with one vector per macroblock or four vectors (one for each 8×8 block). There are a number of motion vector prediction modes, and for each macroblock it is possible to choose between two reference frames: the previous frame and a previously bookmarked frame.
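The one-vector versus four-vector choice can be sketched as follows. This is a simplified, full-pixel illustration under our own assumptions (frames as lists of rows, vectors as (dy, dx) pairs, edge pixels repeated outside the frame); the function names are hypothetical and the real VP6 bitstream and prediction modes are not reproduced here.

```python
def predict_block(ref, top, left, size, mv):
    """Copy a size x size block from `ref`, displaced by mv = (dy, dx).

    Coordinates falling outside the frame are clamped to the nearest edge pixel.
    """
    height, width = len(ref), len(ref[0])
    dy, dx = mv
    return [[ref[min(max(top + y + dy, 0), height - 1)]
                [min(max(left + x + dx, 0), width - 1)]
             for x in range(size)]
            for y in range(size)]


def predict_macroblock(ref, top, left, vectors):
    """Predict a 16x16 macroblock from one vector, or from four 8x8 vectors.

    With four vectors, they apply to the 8x8 blocks in raster order:
    top-left, top-right, bottom-left, bottom-right.
    """
    if len(vectors) == 1:
        return predict_block(ref, top, left, 16, vectors[0])
    if len(vectors) != 4:
        raise ValueError("expected 1 or 4 motion vectors")
    pred = [[0] * 16 for _ in range(16)]
    for i, mv in enumerate(vectors):
        by, bx = (i // 2) * 8, (i % 2) * 8
        sub = predict_block(ref, top + by, left + bx, 8, mv)
        for y in range(8):
            pred[by + y][bx:bx + 8] = sub[y]
    return pred


def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(p - q) for row_a, row_b in zip(a, b) for p, q in zip(row_a, row_b))


def best_reference(current_mb, predictions):
    """Return the index of the candidate prediction closest to the current macroblock."""
    return min(range(len(predictions)), key=lambda i: sad(current_mb, predictions[i]))
```

An encoder built along these lines would compute one prediction from the previous frame and one from the bookmarked frame, then keep whichever `best_reference` selects. Using SAD as the selection criterion is our assumption; real encoders typically also weigh the bit cost of each choice.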

In older VPx codecs, this second reference frame was necessarily the previous I-frame (keyframe). The bookmarked reference frame approach is more accurate and is very useful for reducing bitrate in fast-changing scenes. Indeed, using more than 2-3 reference frames produces only modest improvements in compression ratio.

VP6 uses “adaptive sub-pixel motion estimation”: the filters used for motion estimation are sensitive to content, which allows the motion estimation to better preserve detail.
Quarter-pel motion compensation is supported, as well as unrestricted motion compensation. The range of motion compensation has been increased relative to previous codecs (extended long range).
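To show what “quarter-pel” means in practice, the sketch below samples a reference frame at positions expressed in quarter-pixel units, using plain bilinear interpolation. This is only an illustration of sub-pixel sampling: VP6’s content-adaptive filters are not publicly specified in detail and are not reproduced here, and the function name and the clamping at the frame border are our assumptions.

```python
def sample_quarter_pel(ref, y4, x4):
    """Sample `ref` at (y4 / 4, x4 / 4) with bilinear interpolation.

    `y4` and `x4` are integer coordinates in quarter-pixel units, so
    (4, 4) is the full pixel at row 1, column 1 and (0, 2) lies halfway
    between columns 0 and 1 on row 0. Positions outside the frame reuse
    the nearest edge pixel.
    """
    height, width = len(ref), len(ref[0])
    y0, fy = divmod(y4, 4)   # integer row and fractional part (0..3)
    x0, fx = divmod(x4, 4)   # integer column and fractional part (0..3)

    def px(y, x):
        return ref[min(max(y, 0), height - 1)][min(max(x, 0), width - 1)]

    # Interpolate horizontally on the two neighbouring rows...
    top = px(y0, x0) * (4 - fx) + px(y0, x0 + 1) * fx
    bottom = px(y0 + 1, x0) * (4 - fx) + px(y0 + 1, x0 + 1) * fx
    # ...then vertically, with rounding (the weights sum to 16).
    return (top * (4 - fy) + bottom * fy + 8) // 16


if __name__ == "__main__":
    frame = [[0, 40],
             [80, 120]]
    print(sample_quarter_pel(frame, 0, 2))   # halfway between 0 and 40
```

A motion vector with quarter-pel precision simply adds such fractional offsets to each pixel position of the block being predicted, so slow or sub-pixel movements can be followed without leaving a large residual to encode.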

VP6 also takes advantage of better prediction of low-order frequency coefficients and an improved quantization strategy that preserves more detail in the output.

For entropy coding, VP6 uses various techniques depending on complexity and/or overall frame size, including VLC and context-modeled binary coding.

To achieve any requested data rate, the codec automatically chooses whether to adjust quantization levels, adjust encoded frame dimensions, or drop frames altogether. Much of the efficiency of the codec is due to an efficient in-loop deblocking filter (as in H.264).

The relatively high bandwidth available to Internet users today and the improvement in processing power (with multi-core CPUs) are strongly pushing HD content on the web. Starting from FP9 update 2, Flash Player is able to leverage modern multi-core CPUs and hardware video acceleration. Today the Flash Player is finally able to decode high-definition video on a personal computer at full screen. Since Full HD (1920×1080) decoding is still a complex task and a lot of processors out there have only one core, Adobe and On2 have defined a new, fast “simple profile” for VP6, called VP6-S. This profile, aimed at making HD footage easier to decode on older systems, disables some CPU-hungry techniques such as the deblocking filter and arithmetic coding, and reduces the complexity of sub-pixel filtering. At high bitrates (1.5-2 Mbit/s or more) there is only a small loss in efficiency, but 30-40% less processing power is needed.

VP6 vs H.264 and Conclusions

From what we have said, it is clear that VP6 is much simpler than H.264, and this is its main advantage. VP6 requires less complexity in both the decoding and encoding stages.

How can VP6 obtain encoding performance comparable to H.264?

We must consider that VP6 uses fewer, but efficient, encoding techniques. A mix of adaptive sub-pixel motion estimation, better prediction of low-order frequency coefficients, an improved quantization strategy, de-blocking and de-ringing filters, and enhanced context-based entropy coding produces compressed movies with very good “perceptual” quality.

Furthermore, we have already described the difference between a codec technology and a codec implementation. H.264 has a lot of modes, options and annexes, and building an efficient encoder is quite difficult.
Fortunately, the implementation of the H.264 decoder in the Flash Player is quite complete and fully supports the Baseline, Main, High and High 10 profiles. The decoder exploits multi-core CPUs and hardware video layer acceleration for full-screen playback. MainConcept (a firm specializing in video codecs) has written the decoder in only 100 KB (compressed).
Encoders on the market are becoming more efficient every day, but only recently has it become possible to see the real potential of H.264.

In conclusion, I think VP6 has been a very good codec, and even today it offers an interesting power/performance ratio. H.264 requires more power and requires licensing, but produces better results, especially at lower bitrates.

I’ll soon post a couple of examples of state-of-the-art H.264 encoding, where you will be able to judge what is possible with the latest encoders.
