How H.264 works – Part II.
Let’s continue the analysis of the main H.264 compression specifications and techniques:
H.264 uses P frames (predicted) and B frames (interpolated) with full-pixel, half-pixel and quarter-pixel resolution. Half-pixel samples are computed with a 6-tap interpolation filter; quarter-pixel samples are obtained by bilinear interpolation (averaging) of neighbouring full-pixel and half-pixel samples.
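The sub-pixel interpolation described above can be sketched in one dimension as follows. This is a simplified illustration, not the complete procedure from the standard: it uses the 6-tap kernel (1, -5, 20, 20, -5, 1) / 32 for half-pixel positions and a rounded average for quarter-pixel positions, assumes 8-bit samples, and ignores picture borders. The function names are made up for this example.

```python
def clip8(value):
    """Clamp a value to the 8-bit sample range."""
    return max(0, min(255, value))


def half_pel(samples, i):
    """Half-pixel sample between samples[i] and samples[i + 1].

    Applies the 6-tap kernel to the six nearest full-pixel samples.
    Requires 2 <= i <= len(samples) - 4 (no border handling here).
    """
    taps = (1, -5, 20, 20, -5, 1)
    window = samples[i - 2:i + 4]
    acc = sum(t * s for t, s in zip(taps, window))
    return clip8((acc + 16) >> 5)  # divide by 32 with rounding


def quarter_pel(a, b):
    """Quarter-pixel sample: rounded average of two neighbouring samples."""
    return (a + b + 1) >> 1


row = [10, 10, 10, 200, 200, 200, 200, 200]
h = half_pel(row, 2)          # halfway between row[2] and row[3]
q = quarter_pel(row[2], h)    # halfway between row[2] and h
print(h, q)                   # 105 58
```

The longer filter preserves edges better than a plain average would on textured content, which is why it is used for the half-pixel positions, while the cheaper average is considered good enough for the quarter-pixel ones.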
Motion compensation is done using seven macroblock configurations, with block sizes as large as 16×16 and as small as 4×4. Each macroblock can have a different reference picture. The picture below lists the possible motion estimation vector configurations.
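As a quick illustration of what the seven configurations imply, the sketch below lists the seven block shapes and counts how many blocks of each shape fit in one 16×16 macroblock. Since each block carries its own motion vector, this is also the number of vectors per macroblock for a single prediction direction. The names used here are chosen for this example only.

```python
MACROBLOCK_SIZE = 16

# The seven block shapes (width, height) used for motion compensation.
BLOCK_SHAPES = [
    (16, 16), (16, 8), (8, 16), (8, 8),  # macroblock partitions
    (8, 4), (4, 8), (4, 4),              # sub-partitions of an 8x8 block
]


def blocks_per_macroblock(width, height):
    """Number of blocks of the given shape that tile one macroblock."""
    return (MACROBLOCK_SIZE // width) * (MACROBLOCK_SIZE // height)


for width, height in BLOCK_SHAPES:
    count = blocks_per_macroblock(width, height)
    print(f"{width}x{height}: {count} motion vector(s)")
```

Small blocks follow complex motion more precisely, but a macroblock split entirely into 4×4 blocks needs 16 vectors instead of one, so the encoder has to trade prediction accuracy against the bits spent on vectors.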
B frames are predicted from previous and/or future pictures using five prediction modes (intra, forward, backward, interpolated and direct), designed to suit different scenarios.
Weighted prediction allows an encoder to specify a scaling factor and an offset to apply when performing motion compensation, providing a significant performance benefit in special cases such as fade-to-black, fade-in, and cross-fade transitions.
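A minimal sketch of the scaling-and-offset idea, for a single reference sample: the weight is expressed as an integer over a power-of-two denominator (`log_wd` is the base-2 logarithm of that denominator), and the result is clipped to the 8-bit range. The function name and the example values are illustrative assumptions, and bi-directional weighting is left out.

```python
def weighted_sample(ref, weight, offset, log_wd):
    """Scale a reference sample by weight / 2**log_wd, then add an offset."""
    if log_wd >= 1:
        rounding = 1 << (log_wd - 1)
        value = ((ref * weight + rounding) >> log_wd) + offset
    else:
        value = ref * weight + offset
    return max(0, min(255, value))  # clip to the 8-bit sample range


# Fade-to-black example: the current frame is about half as bright as the
# reference, so a weight of 32/64 predicts it well without extra residual.
print(weighted_sample(200, weight=32, offset=0, log_wd=6))  # 100
# A weight of 64/64 leaves the sample unchanged.
print(weighted_sample(200, weight=64, offset=0, log_wd=6))  # 200
```

Without weighting, a fade makes every block differ from its reference by a brightness change, and that difference has to be coded as residual data in every macroblock.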
It is also possible to use B-Frames as reference for other B-Frames (B-pyramid).
In-Loop Deblocking Filter
Loop filtering is a mandatory part of the encoding loop. It identifies blocking artifacts using two threshold factors (alpha and beta). Much of H.264's efficiency is due to the loop filter. The filter strength depends on intra/inter coding, motion vector differences and the quantization level. Up to 40% of the total processing power may be required by this filter. Filtering the reference frames before they are used in prediction can significantly improve objective and perceptual quality, especially at low or medium bitrates.
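The role of the two thresholds can be sketched as the decision taken for one line of samples across a block edge. Here `p1, p0` are the two samples on one side of the edge and `q0, q1` the two on the other. In the standard, alpha and beta are derived from the quantization level; in this sketch they are simply passed in, and the numeric values are arbitrary assumptions for the example.

```python
def should_filter_edge(p1, p0, q0, q1, alpha, beta, strength):
    """Decide whether the samples around a block edge should be smoothed.

    A small step across the edge (below alpha) with flat content on both
    sides (below beta) is treated as a blocking artifact. A large step is
    assumed to be a real edge in the picture and is left untouched.
    """
    return (strength > 0
            and abs(p0 - q0) < alpha
            and abs(p1 - p0) < beta
            and abs(q1 - q0) < beta)


# Small discontinuity between two flat blocks: likely a blocking artifact.
print(should_filter_edge(80, 82, 90, 91, alpha=20, beta=5, strength=2))    # True
# Large discontinuity: likely genuine picture content, so it is preserved.
print(should_filter_edge(40, 42, 180, 181, alpha=20, beta=5, strength=2))  # False
```

Because the thresholds grow with coarser quantization, the filter acts more aggressively exactly where blocking is most visible, which matches the gains at low and medium bitrates mentioned above.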
For entropy coding, H.264 may use an enhanced VLC, the more complex context-adaptive variable-length coding (CAVLC), or the even more complex context-adaptive binary arithmetic coding (CABAC). These techniques losslessly compress syntax elements in the video stream by exploiting the probabilities of syntax elements in a given context. CABAC can improve compression by around 5-7%, but it may require 30-40% of the total processing power.
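To give a feel for variable-length coding, here is a sketch of the unsigned Exp-Golomb code, which H.264 uses for many syntax elements. Treating it as the "enhanced VLC" mentioned above is an assumption of this example. Small values get short codewords, so the scheme pays off when small values are the most probable ones. CAVLC and CABAC are considerably more involved and are not shown. Bit strings are used instead of a real bit writer to keep the example readable.

```python
def exp_golomb_encode(code_num):
    """Encode a non-negative integer as an unsigned Exp-Golomb bit string."""
    value = code_num + 1
    prefix_len = value.bit_length() - 1       # number of leading zeros
    return "0" * prefix_len + format(value, "b")


def exp_golomb_decode(bits):
    """Decode a single unsigned Exp-Golomb codeword from a bit string."""
    zeros = len(bits) - len(bits.lstrip("0"))  # count the leading zeros
    value = int(bits[zeros:2 * zeros + 1], 2)
    return value - 1


for n in range(5):
    print(n, exp_golomb_encode(n))
# 0 1
# 1 010
# 2 011
# 3 00100
# 4 00101
```

This code assumes a fixed probability model. The context-adaptive schemes improve on it by switching tables (CAVLC) or updating probability estimates (CABAC) according to what has already been coded.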
These techniques, along with several others, help H.264 perform significantly better than any prior standard, under a wide variety of circumstances and in a wide variety of application environments. H.264 can often perform radically better than MPEG-2 video, typically obtaining the same quality at half the bit rate or less. It also performs better than MPEG-4-class video codecs like DivX. Today, H.264 is state of the art in video encoding and has been widely adopted in industry applications ranging from mobile video (3GPP) to high-definition content production (AVCHD camcorders) and delivery (HD DVD, Blu-ray Disc and satellite HD broadcasts).
H.264 is a very complex standard, and there are other interesting features such as lossless encoding, strategies optimized for interlaced video (MBAFF, PAFF), data partitioning, slices and frame reordering, and error resilience strategies.
A number of profiles exist, which define exactly which of the available techniques and strategies are used. Simpler profiles require less processing power and less memory but achieve a worse quality/bitrate ratio.
- Baseline Profile (BP): Primarily for lower-cost applications with limited computing resources, this profile is used widely in videoconferencing and mobile applications.
- Main Profile (MP): Originally intended as the mainstream consumer profile for broadcast and storage applications, the importance of this profile faded when the High profile was developed for those applications.
- Extended Profile (XP): Intended as the streaming video profile, this profile has relatively high compression capability and some extra tricks for robustness to data losses and server stream switching.
- High Profile (HiP): The primary profile for broadcast and disc storage applications, particularly for high-definition television applications (this is the profile adopted into HD DVD and Blu-ray Disc).
- High 10 Profile (Hi10P): Going beyond today’s mainstream consumer product capabilities, this profile builds on top of the High Profile, adding support for up to 10 bits per sample of decoded picture precision.
| Feature | Baseline | Extended | Main | High | High 10 |
|---|---|---|---|---|---|
| I and P Slices | Yes | Yes | Yes | Yes | Yes |
| SI and SP Slices | No | Yes | No | No | No |
| Multiple Reference Frames | Yes | Yes | Yes | Yes | Yes |
| In-Loop Deblocking Filter | Yes | Yes | Yes | Yes | Yes |
| CAVLC Entropy Coding | Yes | Yes | Yes | Yes | Yes |
| CABAC Entropy Coding | No | No | Yes | Yes | Yes |
| Flexible Macroblock Ordering (FMO) | Yes | Yes | No | No | No |
| Arbitrary Slice Ordering (ASO) | Yes | Yes | No | No | No |
| Redundant Slices (RS) | Yes | Yes | No | No | No |
| Interlaced Coding (PicAFF, MBAFF) | No | Yes | Yes | Yes | Yes |
| 4:2:0 Chroma Format | Yes | Yes | Yes | Yes | Yes |
| Monochrome Video Format (4:0:0) | No | No | No | Yes | Yes |
| 4:2:2 Chroma Format | No | No | No | No | No |
| 4:4:4 Chroma Format | No | No | No | No | No |
| 8 Bit Sample Depth | Yes | Yes | Yes | Yes | Yes |
| 9 and 10 Bit Sample Depth | No | No | No | No | Yes |
| 11 to 14 Bit Sample Depth | No | No | No | No | No |
| 8×8 vs. 4×4 Transform Adaptivity | No | No | No | Yes | Yes |
| Quantization Scaling Matrices | No | No | No | Yes | Yes |
| Separate Cb and Cr QP control | No | No | No | Yes | Yes |
| Separate Color Plane Coding | No | No | No | No | No |
| Predictive Lossless Coding | No | No | No | No | No |
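As a small usage sketch, a few rows of the table above can be turned into a lookup that answers "which profiles offer all the tools I need?". The column order is assumed to be Baseline, Extended, Main, High, High 10, which is what the Yes/No pattern of the rows suggests, and only a handful of rows are copied here. The function name is made up for this example.

```python
PROFILES = ("Baseline", "Extended", "Main", "High", "High 10")

# A subset of the feature table; True means the profile supports the tool.
FEATURES = {
    "SI and SP Slices":                   (False, True,  False, False, False),
    "CABAC Entropy Coding":               (False, False, True,  True,  True),
    "Flexible Macroblock Ordering (FMO)": (True,  True,  False, False, False),
    "Interlaced Coding (PicAFF, MBAFF)":  (False, True,  True,  True,  True),
    "8x8 vs. 4x4 Transform Adaptivity":   (False, False, False, True,  True),
    "9 and 10 Bit Sample Depth":          (False, False, False, False, True),
}


def profiles_supporting(*required):
    """Return the profiles that support every feature in `required`."""
    return [name for i, name in enumerate(PROFILES)
            if all(FEATURES[feature][i] for feature in required)]


print(profiles_supporting("CABAC Entropy Coding",
                          "Interlaced Coding (PicAFF, MBAFF)"))
# ['Main', 'High', 'High 10']
print(profiles_supporting("CABAC Entropy Coding",
                          "Flexible Macroblock Ordering (FMO)"))
# []
```

The empty result in the second query shows that the profiles are not strictly nested: the error resilience tools of the Baseline and Extended profiles are not available together with CABAC in any of the five profiles listed.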
TO BE CONTINUED…