How H.264 works – Part II. – Video Encoding & Streaming Technologies

Let’s continue the analysis of the main H.264 compression tools and techniques:

Motion Compensation

H.264 uses P frames (predicted) and B frames (interpolated) with full-pixel, half-pixel and quarter-pixel motion vector resolution. Half-pixel samples are interpolated with a 6-tap filter; quarter-pixel samples are obtained by bilinear interpolation (averaging) of the neighbouring full-pixel and half-pixel samples.
Motion compensation uses seven block sizes, from 16×16 down to 4×4: a macroblock can be split into 16×16, 16×8, 8×16 or 8×8 partitions, and each 8×8 partition can be further split into 8×4, 4×8 or 4×4 blocks. Each partition of 8×8 or larger can refer to a different reference picture. The picture below lists the possible motion vector configurations.
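The partitioning scheme can also be written out as a small sketch. It assumes the usual two-level tree (macroblock partitions, then sub-partitions inside each 8×8 block); the names `MB_PARTITIONS`, `SUB_PARTITIONS` and `motion_vector_count` are made up for this illustration.

```python
# Macroblock-level partitions (width, height) of a 16x16 macroblock.
MB_PARTITIONS = [(16, 16), (16, 8), (8, 16), (8, 8)]
# Sub-partitions allowed inside each 8x8 partition.
SUB_PARTITIONS = [(8, 8), (8, 4), (4, 8), (4, 4)]

def block_sizes():
    """All distinct motion-compensation block sizes, largest first."""
    sizes = set(MB_PARTITIONS) | set(SUB_PARTITIONS)
    return sorted(sizes, key=lambda wh: (wh[0] * wh[1], wh[0]), reverse=True)

def motion_vector_count(mb_partition, sub_partitions=None):
    """Motion vectors needed (per prediction direction) for one macroblock.

    sub_partitions holds four (w, h) choices, one per 8x8 block, and is
    only used when mb_partition is (8, 8).
    """
    w, h = mb_partition
    if mb_partition != (8, 8):
        return (16 // w) * (16 // h)
    if sub_partitions is None:
        sub_partitions = [(8, 8)] * 4
    return sum((8 // sw) * (8 // sh) for sw, sh in sub_partitions)

print(block_sizes())
print(motion_vector_count((16, 8)))
print(motion_vector_count((8, 8), [(4, 4)] * 4))
```

A macroblock split entirely into 4×4 blocks carries sixteen motion vectors, so the encoder has to weigh the better prediction against the extra bits spent on vectors.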

B frames are predicted from previous and/or future pictures with 5 prediction modes (intra, forward, backward, interpolated and direct) designed to suit different scenarios.

Weighted prediction allows an encoder to specify a scaling factor and an offset to apply when performing motion compensation, which provides a significant performance benefit in special cases such as fade-to-black, fade-in and cross-fade transitions.
It is also possible to use B frames as references for other B frames (B-pyramid).
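To make the weighted prediction idea concrete, here is a minimal sketch of the scaling-and-offset step for a single reference picture. It assumes 8-bit samples and an integer weight expressed relative to a power-of-two denominator (`log_wd`); the function and parameter names are illustrative, not taken from any encoder API.

```python
def clip8(value):
    """Clamp a value to the 8-bit sample range."""
    return max(0, min(255, value))

def weighted_sample(pred, weight, offset, log_wd):
    """Scale and offset one motion-compensated sample.

    The effective scaling factor is weight / 2**log_wd; the offset is
    added after scaling and the result is clipped to 8 bits.
    """
    if log_wd >= 1:
        scaled = (pred * weight + (1 << (log_wd - 1))) >> log_wd
    else:
        scaled = pred * weight
    return clip8(scaled + offset)

def weighted_block(block, weight, offset, log_wd):
    """Apply the same weight and offset to every sample of a block."""
    return [[weighted_sample(s, weight, offset, log_wd) for s in row]
            for row in block]

reference = [[200, 180], [160, 140]]
# Fade toward black: the current frame is about half as bright as the
# reference, so the weight is 32 / 2**6 = 0.5.
faded = weighted_block(reference, weight=32, offset=0, log_wd=6)
```

Without this tool a fade looks like a large change in every block, even though the motion is trivial; with it the residual left to encode stays small.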

In-Loop Deblocking Filter

The loop filter is a mandatory part of the standard, applied inside the coding loop. It detects blocking artifacts at block edges using two threshold factors (alpha and beta). A good part of H.264's efficiency is due to the loop filter. The filter strength depends on intra/inter coding, motion vector differences and the quantization level. The filter may require up to 40% of the total processing power. Filtering the reference frames before they are used for prediction can significantly improve objective and perceptual quality, especially at low or medium bitrates.
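The edge decision can be sketched roughly as follows. This is a simplified illustration for progressive frames: the boundary-strength rules are condensed, and `alpha` and `beta` are passed in as plain numbers, whereas a real codec derives them from the quantization parameter. All names and example values are assumptions made for this sketch.

```python
def boundary_strength(p_intra, q_intra, on_mb_edge, has_coeffs,
                      different_ref, mv_diff_qpel):
    """Simplified boundary strength (0 = no filtering, 4 = strongest).

    p_* and q_* describe the blocks on either side of the edge;
    mv_diff_qpel is the largest motion vector component difference
    between them, in quarter-pixel units.
    """
    if p_intra or q_intra:
        return 4 if on_mb_edge else 3
    if has_coeffs:
        return 2
    if different_ref or mv_diff_qpel >= 4:
        return 1
    return 0

def should_filter(p1, p0, q0, q1, bs, alpha, beta):
    """True if the step between p0 and q0 looks like a blocking artifact.

    p1, p0 are the samples on one side of the edge (p0 nearest to it),
    q0, q1 the samples on the other side.
    """
    return (bs > 0
            and abs(p0 - q0) < alpha
            and abs(p1 - p0) < beta
            and abs(q1 - q0) < beta)

bs = boundary_strength(False, False, True, True, False, 0)
small_step = should_filter(100, 100, 108, 108, bs, alpha=20, beta=6)
real_edge = should_filter(50, 50, 200, 200, bs, alpha=20, beta=6)
```

A small step across the edge is treated as a coding artifact and smoothed, while a large step is assumed to be real picture content and left alone. That is how the filter avoids blurring genuine edges.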

Entropy Coding

For entropy coding, H.264 may use an enhanced VLC, the more complex context-adaptive variable-length coding (CAVLC) or the even more complex context-adaptive binary arithmetic coding (CABAC). The context-adaptive schemes losslessly compress the syntax elements of the video stream by exploiting the probabilities of those elements in a given context. CABAC can improve compression by around 5-7%, but it may require 30-40% of the total processing power.
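As an illustration of the simplest of these schemes, the sketch below implements unsigned Exp-Golomb codes, the variable-length code H.264 uses for many header syntax elements. Treating this as the "enhanced VLC" mentioned above is an assumption; CAVLC and CABAC are considerably more involved and are not shown. Bit strings stand in for a real bitstream writer.

```python
def ue_encode(code_num):
    """Unsigned Exp-Golomb code word for a non-negative integer.

    The code word is the binary form of code_num + 1, preceded by as
    many zeros as that binary form has bits after the first.
    """
    if code_num < 0:
        raise ValueError("code_num must be non-negative")
    binary = bin(code_num + 1)[2:]
    return "0" * (len(binary) - 1) + binary

def ue_decode(bits, pos=0):
    """Decode one code word from a bit string.

    Returns (value, position of the next unread bit).
    """
    zeros = 0
    while bits[pos + zeros] == "0":
        zeros += 1
    end = pos + 2 * zeros + 1
    return int(bits[pos + zeros:end], 2) - 1, end

stream = "".join(ue_encode(n) for n in (0, 1, 3))
first, pos = ue_decode(stream)
second, pos = ue_decode(stream, pos)
third, pos = ue_decode(stream, pos)
```

Small values get short code words, so the code works well when small values are the most frequent ones. The context-adaptive schemes go further by changing the code, or the probability model, according to what has already been coded nearby.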

These techniques, along with several others, help H.264 perform significantly better than any prior standard under a wide variety of circumstances and application environments. H.264 can often perform radically better than MPEG-2 video, typically obtaining the same quality at half the bit rate or less. It also performs better than MPEG-4 Part 2 codecs like DivX. Today, H.264 is the state of the art in video encoding and has been widely adopted in industry applications ranging from mobile video (3GPP) to high-definition content production (AVCHD cameras) and delivery (HD DVD, Blu-ray Disc and satellite HD broadcasts).

Other features

H.264 is a very complex standard, and it has other interesting features such as lossless encoding, strategies optimized for interlaced video (MBAFF and PAFF), data partitioning, slices, frame reordering and error resilience strategies.

Codec Profiles

A number of profiles exist, which define exactly which of the available techniques and strategies can be used. Simpler profiles require less processing power and less memory but achieve a worse quality/bitrate ratio.

Baseline Profile (BP): Primarily for lower-cost applications with limited computing resources, this profile is used widely in videoconferencing and mobile applications.
Main Profile (MP): Originally intended as the mainstream consumer profile for broadcast and storage applications, the importance of this profile faded when the High profile was developed for those applications.
Extended Profile (XP): Intended as the streaming video profile, this profile has relatively high compression capability and some extra tricks for robustness to data losses and server stream switching.
High Profile (HiP): The primary profile for broadcast and disc storage applications, particularly for high-definition television applications (this is the profile adopted into HD DVD and Blu-ray Disc).

High 10 Profile (Hi10P): Going beyond today’s mainstream consumer product capabilities, this profile builds on top of the High Profile—adding support for up to 10 bits per sample of decoded picture precision.

	Baseline	Extended	Main	High	High 10
I and P Slices	Yes	Yes	Yes	Yes	Yes
B Slices	No	Yes	Yes	Yes	Yes
SI and SP Slices	No	Yes	No	No	No
Multiple Reference Frames	Yes	Yes	Yes	Yes	Yes
In-Loop Deblocking Filter	Yes	Yes	Yes	Yes	Yes
CAVLC Entropy Coding	Yes	Yes	Yes	Yes	Yes
CABAC Entropy Coding	No	No	Yes	Yes	Yes
Flexible Macroblock Ordering (FMO)	Yes	Yes	No	No	No
Arbitrary Slice Ordering (ASO)	Yes	Yes	No	No	No
Redundant Slices (RS)	Yes	Yes	No	No	No
Data Partitioning	No	Yes	No	No	No
Interlaced Coding (PicAFF, MBAFF)	No	Yes	Yes	Yes	Yes
4:2:0 Chroma Format	Yes	Yes	Yes	Yes	Yes
Monochrome Video Format (4:0:0)	No	No	No	Yes	Yes
4:2:2 Chroma Format	No	No	No	No	No
4:4:4 Chroma Format	No	No	No	No	No
8 Bit Sample Depth	Yes	Yes	Yes	Yes	Yes
9 and 10 Bit Sample Depth	No	No	No	No	Yes
11 to 14 Bit Sample Depth	No	No	No	No	No
8×8 vs. 4×4 Transform Adaptivity	No	No	No	Yes	Yes
Quantization Scaling Matrices	No	No	No	Yes	Yes
Separate Cb and Cr QP control	No	No	No	Yes	Yes
Separate Color Plane Coding	No	No	No	No	No
Predictive Lossless Coding	No	No	No	No	No
	Baseline	Extended	Main	High	High 10
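The table lends itself to a simple lookup. The sketch below copies a handful of rows from it into a dictionary and finds the profiles that support a given set of features; the structure and the function name `profiles_supporting` are just one possible way to organize the data.

```python
PROFILES = ("Baseline", "Extended", "Main", "High", "High 10")

# A few rows copied from the table above: feature -> support per profile,
# in the same column order as PROFILES.
FEATURES = {
    "B Slices": (False, True, True, True, True),
    "CABAC Entropy Coding": (False, False, True, True, True),
    "Flexible Macroblock Ordering (FMO)": (True, True, False, False, False),
    "Interlaced Coding (PicAFF, MBAFF)": (False, True, True, True, True),
    "Quantization Scaling Matrices": (False, False, False, True, True),
    "9 and 10 Bit Sample Depth": (False, False, False, False, True),
}

def profiles_supporting(*required):
    """Profiles (in table order) that support every requested feature."""
    return [name for i, name in enumerate(PROFILES)
            if all(FEATURES[feature][i] for feature in required)]

print(profiles_supporting("B Slices", "CABAC Entropy Coding"))
print(profiles_supporting("CABAC Entropy Coding",
                          "Flexible Macroblock Ordering (FMO)"))
```

The second query returns an empty list: among these five profiles, none combines CABAC with FMO, so an encoder that wants both has no profile to signal.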

TO BE CONTINUED…

Published by sonnati
