How H.264 works – Part II.

Let’s continue the analysis of H.264’s main compression features and techniques:

Motion Compensation

H.264 uses P frames (predicted) and B frames (bi-predicted) with full-pixel, half-pixel and quarter-pixel motion vector resolution. Half-pixel positions are interpolated with a 6-tap filter; quarter-pixel positions are obtained by bilinear interpolation (averaging) of neighbouring full-pixel and half-pixel samples.
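To make the interpolation concrete, here is a minimal one-dimensional sketch of the luma case. It assumes 8-bit samples and the 6-tap kernel (1, -5, 20, 20, -5, 1) with a divide by 32; the function names and the simple edge clamping are illustrative choices, not taken from any particular encoder.

```python
def clip255(x):
    """Clamp a value to the 8-bit sample range."""
    return max(0, min(255, x))


def half_pel(row, i):
    """Half-pixel sample between row[i] and row[i + 1] (6-tap filter)."""
    taps = (1, -5, 20, 20, -5, 1)
    acc = 0
    for k, t in enumerate(taps):
        # Samples i-2 .. i+3; positions outside the row repeat the edge sample.
        j = min(max(i - 2 + k, 0), len(row) - 1)
        acc += t * row[j]
    # The taps sum to 32, so add 16 for rounding and shift right by 5.
    return clip255((acc + 16) >> 5)


def quarter_pel(a, b):
    """Quarter-pixel sample: rounded average of two neighbouring samples."""
    return (a + b + 1) >> 1


row = [10, 10, 10, 20, 20, 20]
h = half_pel(row, 2)          # between row[2] = 10 and row[3] = 20
q = quarter_pel(row[2], h)    # between row[2] and the half-pixel sample
print(h, q)
```

A real codec applies the same filter horizontally and vertically on two-dimensional blocks; the sketch only shows the arithmetic on a single row.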
Motion compensation is done using seven block-size configurations, from 16×16 down to 4×4 (16×16, 16×8, 8×16, 8×8, 8×4, 4×8 and 4×4). Each macroblock, and each partition down to 8×8, can use a different reference picture. The picture below lists the possible motion vector configurations.

B frames are predicted from previous and/or future pictures with five prediction modes (intra, forward, backward, interpolated and direct) designed to suit different scenarios.

Weighted prediction allows an encoder to specify a scaling factor and an offset to apply during motion compensation, which gives a significant gain in special cases such as fade-to-black, fade-in and cross-fade transitions.
It is also possible to use B frames as references for other B frames (B-pyramid).
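The scaling and offset used by weighted prediction can be sketched for a single reference sample as follows. This is a simplified illustration assuming 8-bit samples and a fixed-point weight with `log_wd` fractional bits; the parameter names are mine, and a real encoder signals these values per reference picture in the bitstream.

```python
def weighted_pred(sample, weight, offset, log_wd=5):
    """Scale a motion-compensated sample by weight / 2**log_wd, then add offset."""
    if log_wd >= 1:
        rounding = 1 << (log_wd - 1)
        value = ((sample * weight + rounding) >> log_wd) + offset
    else:
        value = sample * weight + offset
    # Clamp to the 8-bit sample range.
    return max(0, min(255, value))


# With log_wd = 5, a weight of 32 means "unchanged" and 16 means "half brightness",
# which is the kind of adjustment a fade-to-black needs.
print(weighted_pred(200, 32, 0))
print(weighted_pred(200, 16, 0))
print(weighted_pred(200, 16, -10))
```

Without this tool, a fade would show up as a large residual in every block, because plain motion compensation cannot express a global change in brightness.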

In-Loop Deblocking Filter

Loop filtering is a mandatory part of the codec: it sits inside the encoder's prediction loop, so the decoder has to apply exactly the same filter. It detects blocking artifacts at block edges using two threshold factors (alpha and beta). The filter strength depends on intra/inter coding, motion vector differences and the quantization level. A large share of H.264's efficiency is due to the loop filter, but it may require up to 40% of the total processing power. Filtering the reference frames before they are used for prediction can significantly improve both objective and perceptual quality, especially at low and medium bitrates.
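The role of the two thresholds can be illustrated with a sketch of the decision for one edge, where p1, p0 are the samples on one side of the block boundary and q0, q1 the samples on the other. In the standard, alpha and beta come from tables indexed by the quantization parameter and the clipping value depends on the boundary strength; here `alpha`, `beta`, `bs` and `tc` are simply passed in as assumed inputs, and only the weaker filter mode that adjusts p0 and q0 is shown.

```python
def clip255(x):
    """Clamp a value to the 8-bit sample range."""
    return max(0, min(255, x))


def should_filter(p1, p0, q0, q1, alpha, beta, bs):
    """True when the step across the edge looks like a blocking artifact."""
    return (bs > 0
            and abs(p0 - q0) < alpha    # step across the edge is small enough
            and abs(p1 - p0) < beta     # both sides are smooth, so the step
            and abs(q1 - q0) < beta)    # is unlikely to be real picture detail


def filter_edge(p1, p0, q0, q1, alpha, beta, bs, tc):
    """Return the (possibly smoothed) pair (p0, q0) next to the edge."""
    if not should_filter(p1, p0, q0, q1, alpha, beta, bs):
        return p0, q0
    delta = (((q0 - p0) << 2) + (p1 - q1) + 4) >> 3
    delta = max(-tc, min(tc, delta))    # limit how far a sample may move
    return clip255(p0 + delta), clip255(q0 - delta)


# A small step (likely an artifact) is smoothed; a large one (a real edge) is kept.
print(filter_edge(60, 60, 70, 70, alpha=20, beta=5, bs=2, tc=3))
print(filter_edge(60, 60, 120, 120, alpha=20, beta=5, bs=2, tc=3))
```

Because the thresholds grow with the quantization level, coarse quantization (low bitrate) leads to more edges being filtered, which matches the observation that the filter helps most at low and medium bitrates.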

Entropy Coding

For entropy coding, H.264 may use an enhanced VLC, the more complex context-adaptive variable-length coding (CAVLC), or the even more complex context-adaptive binary arithmetic coding (CABAC). The latter two losslessly compress syntax elements in the video stream by exploiting the probabilities of those elements in a given context. Using CABAC can improve compression by around 5-7%, but it may require 30-40% of the total processing power.
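As a small illustration of the variable-length side, many H.264 syntax elements are written with Exp-Golomb codes, where small (frequent) values get short codewords. The sketch below encodes and decodes unsigned Exp-Golomb values using strings of '0' and '1' for readability; the function names are illustrative, and CAVLC and CABAC themselves are considerably more involved than this.

```python
def ue_encode(value):
    """Unsigned Exp-Golomb codeword for a non-negative integer, as a bit string."""
    code = bin(value + 1)[2:]               # binary form of value + 1
    return "0" * (len(code) - 1) + code     # prefix of leading zeros, then the code


def ue_decode(bits):
    """Decode one codeword from the front of bits; return (value, remaining bits)."""
    zeros = 0
    while bits[zeros] == "0":
        zeros += 1
    end = 2 * zeros + 1                     # codeword length in bits
    return int(bits[zeros:end], 2) - 1, bits[end:]


for v in range(5):
    print(v, ue_encode(v))

stream = "".join(ue_encode(v) for v in (3, 0, 7))
decoded = []
while stream:
    v, stream = ue_decode(stream)
    decoded.append(v)
print(decoded)
```

The codewords are self-delimiting, so a decoder can read them back to back without separators. Context-adaptive coding goes further by changing the code (or the probability model) depending on previously coded neighbours.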

These techniques, along with several others, help H.264 perform significantly better than any prior standard in a wide variety of circumstances and application environments. H.264 can often perform radically better than MPEG-2 video, typically obtaining the same quality at half the bit rate or less. It also performs better than MPEG-4 Part 2 class codecs like DivX. Today, H.264 is state of the art in video encoding and has been widely adopted in industry applications ranging from mobile video (3GPP) to high-definition content production (AVCHD camcorders) and delivery (HD DVD, Blu-ray Disc and satellite HD broadcasts).

Other features

H.264 is a very complex standard, and there are other interesting features such as lossless encoding, strategies optimized for interlaced video (MBAFF and PAFF), data partitioning, slices, frame reordering and error resilience strategies.

Codec Profiles

A number of profiles exist which define exactly which of the available techniques and strategies may be used. Simpler profiles require less processing power and less memory but achieve a worse quality/bitrate ratio.

  • Baseline Profile (BP): Primarily for lower-cost applications with limited computing resources, this profile is used widely in videoconferencing and mobile applications.
  • Main Profile (MP): Originally intended as the mainstream consumer profile for broadcast and storage applications, the importance of this profile faded when the High profile was developed for those applications.
  • Extended Profile (XP): Intended as the streaming video profile, this profile has relatively high compression capability and some extra tricks for robustness to data losses and server stream switching.
  • High Profile (HiP): The primary profile for broadcast and disc storage applications, particularly for high-definition television applications (this is the profile adopted into HD DVD and Blu-ray Disc).
  • High 10 Profile (Hi10P): Going beyond today’s mainstream consumer product capabilities, this profile builds on top of the High Profile, adding support for up to 10 bits per sample of decoded picture precision.

Feature                             Baseline  Extended  Main  High  High 10
I and P Slices                      Yes       Yes       Yes   Yes   Yes
B Slices                            No        Yes       Yes   Yes   Yes
SI and SP Slices                    No        Yes       No    No    No
Multiple Reference Frames           Yes       Yes       Yes   Yes   Yes
In-Loop Deblocking Filter           Yes       Yes       Yes   Yes   Yes
CAVLC Entropy Coding                Yes       Yes       Yes   Yes   Yes
CABAC Entropy Coding                No        No        Yes   Yes   Yes
Flexible Macroblock Ordering (FMO)  Yes       Yes       No    No    No
Arbitrary Slice Ordering (ASO)      Yes       Yes       No    No    No
Redundant Slices (RS)               Yes       Yes       No    No    No
Data Partitioning                   No        Yes       No    No    No
Interlaced Coding (PicAFF, MBAFF)   No        Yes       Yes   Yes   Yes
4:2:0 Chroma Format                 Yes       Yes       Yes   Yes   Yes
Monochrome Video Format (4:0:0)     No        No        No    Yes   Yes
4:2:2 Chroma Format                 No        No        No    No    No
4:4:4 Chroma Format                 No        No        No    No    No
8 Bit Sample Depth                  Yes       Yes       Yes   Yes   Yes
9 and 10 Bit Sample Depth           No        No        No    No    Yes
11 to 14 Bit Sample Depth           No        No        No    No    No
8×8 vs. 4×4 Transform Adaptivity    No        No        No    Yes   Yes
Quantization Scaling Matrices       No        No        No    Yes   Yes
Separate Cb and Cr QP control       No        No        No    Yes   Yes
Separate Color Plane Coding         No        No        No    No    No
Predictive Lossless Coding          No        No        No    No    No
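
One practical use of the feature table is to find which profiles can carry a given set of coding tools. The sketch below hard-codes a few rows of the table as Python data; the structure and the function name `profiles_supporting` are illustrative assumptions, not part of any codec API.

```python
PROFILES = ("Baseline", "Extended", "Main", "High", "High 10")

# A few rows copied from the table, in the same column order as PROFILES.
FEATURES = {
    "B Slices":                           (False, True,  True,  True,  True),
    "CABAC Entropy Coding":               (False, False, True,  True,  True),
    "Flexible Macroblock Ordering (FMO)": (True,  True,  False, False, False),
    "Interlaced Coding (PicAFF, MBAFF)":  (False, True,  True,  True,  True),
    "9 and 10 Bit Sample Depth":          (False, False, False, False, True),
}


def profiles_supporting(*features):
    """Profiles in which every requested feature is available."""
    return [
        name
        for column, name in enumerate(PROFILES)
        if all(FEATURES[feature][column] for feature in features)
    ]


print(profiles_supporting("B Slices", "CABAC Entropy Coding"))
print(profiles_supporting("B Slices", "Flexible Macroblock Ordering (FMO)"))
print(profiles_supporting("9 and 10 Bit Sample Depth"))
```

An encoder front end could use a lookup like this to reject settings that the target profile does not allow, for example CABAC with the Baseline profile.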

TO BE CONTINUED…
