Flash + H.264 = H.264 Squared – Part II
I have finally found the time to write the second part of my article on how to enhance video playback using Flash.
But let’s start from the beginning. Do you remember this recent experiment, where I showed a near-HD* video encoded at only 250Kbit/s?
A lot of people congratulated me on the quality/bitrate ratio, while others complained that the final result is far from being “canonical” HD.
Yes, it is not HD anymore, but how could it be at 1/20th of the original (already compressed) bitrate? The rationale of such experiments is not to reach transparent encoding at mad bitrates but to find a balance between quality and bitrate, where quality needs to be satisfactory for a web delivery scenario and the bitrate needs to be as low as possible.
Low bitrates mean wider reach and lower delivery costs, so my quest has these objectives: find methodologies, approaches and techniques to reduce delivery costs while still offering good quality.
Part of this strategy involves the use of Flash Player to enhance video playback using filters and playback control. Here I explain my path to ultra-low-bitrate compression + playback enhancement.
If you have not seen my experiment, I suggest taking a look at this page before proceeding.
1. Rate distortion curves
Every lossy video encoding technique, like H.264, VP6, VP8 and so on, reduces the amount of detail transmitted to the output in order to compress the video bitstream (psycho-visual model). Fine details are the first to go, and as you raise the compression ratio an increasing amount of detail and features is washed away from the original picture sequence. Different codecs do this in different ways, so it’s very important to perform the compression cleverly in order to reduce the bitrate as much as possible without hurting the perceived final quality too much.
This is an example of an (approximate) rate-distortion curve that illustrates the qualitative relationship between bitrate and subjective quality for a given 720p HD content (the exact curve depends on content complexity and encoding techniques). Perceived quality does not change linearly with bitrate. The curve flattens at higher bitrates, while at lower bitrates the perceived quality deteriorates very quickly.
This set of curves illustrates the relationship between quality and bitrate at different resolutions. As you can see, the curves cross each other at specific “tipping points”. This means there is no advantage in lowering the bitrate too much for a given resolution, because at some point quality collapses. Past the tipping point it’s much better to switch to a lower resolution: you will get better overall perceived quality.
It is easy to understand that at very low bitrates the encoder can have difficulty “filling” every block or macroblock of the picture with meaningful information. When you work in such a critical area it is better to lower the total number of blocks to encode. In this way the average amount of information per block will be higher, and the encoder will be able to reduce the artifacts and reconstruct the picture in a more pleasant way.
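To put rough numbers on this, here is a small Python sketch (an illustration only, not part of the original experiment) that counts the 16×16 macroblocks per frame at each resolution and divides the bit budget among them. The 25 fps frame rate is an assumption, and audio and container overhead are ignored, so treat the figures as orders of magnitude.

```python
import math

def macroblocks_per_frame(width, height, mb_size=16):
    """Number of mb_size x mb_size macroblocks needed to cover one frame."""
    return math.ceil(width / mb_size) * math.ceil(height / mb_size)

def bits_per_macroblock(bitrate_bps, fps, width, height):
    """Average bit budget per macroblock (video only, no overhead)."""
    return bitrate_bps / (fps * macroblocks_per_frame(width, height))

if __name__ == "__main__":
    BITRATE = 250_000  # 250 Kbit/s, the target used in the experiment
    FPS = 25           # assumed frame rate, not stated in the article
    for w, h in [(1280, 720), (1024, 576), (960, 544)]:
        mbs = macroblocks_per_frame(w, h)
        budget = bits_per_macroblock(BITRATE, FPS, w, h)
        print(f"{w}x{h}: {mbs} macroblocks, {budget:.2f} bits each")
```

Under these assumptions the average budget grows from under 3 bits per macroblock at 1280×720 to almost 5 at 960×544, which is the extra headroom the encoder gets from the lower resolution.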
So let’s return to my quest. I want to encode a 720p video at a very low bitrate, but below 500Kbit/s everything is a mess: jagged borders, washed-out colors, blocking artifacts, B-frame flickering and so on. Reducing the resolution a bit can help to reduce the artifacts. Obviously for the details there is no hope. They are gone forever.
2. Enter the Flash Player
If the details have been washed away, we can still try to reconstruct them using filters. The Flash Player offers a very good level of control over video and media elements, so it is possible to create filters with very specific instruments (convolution matrices or Pixel Bender filters). The simplest option is to use a convolution filter to sharpen the image and try to reconstruct part of the high-frequency content of the original picture. This is only a perceptual reconstruction, but it can help to improve the subjective quality.
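As a rough illustration of what a sharpening convolution does, here is a minimal sketch in Python rather than ActionScript. The 3×3 kernel is a common textbook sharpening matrix chosen only for illustration, and the function name is hypothetical. The image is a grayscale grid of 0-255 values, and borders are handled by repeating the edge pixels.

```python
# Textbook 3x3 sharpening kernel (illustrative choice). Its weights sum to 1,
# so flat areas keep their brightness while edges are exaggerated.
SHARPEN = [
    [0, -1, 0],
    [-1, 5, -1],
    [0, -1, 0],
]

def convolve3x3(image, kernel, divisor=1.0):
    """Apply a 3x3 kernel to a 2D list of 0-255 values, clamping the result."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky in range(3):
                for kx in range(3):
                    # Repeat edge pixels when the kernel hangs off the border.
                    sy = min(max(y + ky - 1, 0), h - 1)
                    sx = min(max(x + kx - 1, 0), w - 1)
                    acc += image[sy][sx] * kernel[ky][kx]
            out[y][x] = min(255, max(0, round(acc / divisor)))
    return out

if __name__ == "__main__":
    soft_edge = [[100, 100, 120, 120] for _ in range(3)]
    for row in convolve3x3(soft_edge, SHARPEN):
        print(row)
```

On the soft vertical edge in the demo, the dark side of the edge gets darker and the bright side gets brighter, which is what the eye reads as extra sharpness. The same overshoot is applied to blocking artifacts, which is why the input has to be clean.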
In the picture above you can see an example of that. On the right you have a movie from YouTube: a 720p video encoded at 2Mbit/s. On the left you have the same, already compressed file, re-encoded at 500Kbit/s and processed with an enhancement filter. As you can see the image is better: more contrast, more texture, more detail, more sharpness.
Important note: enhancement filters can enhance artifacts too, so it is very important to start from a clean, smooth picture, possibly with few details but also with few artifacts.
Furthermore it is also possible to add video noise.
3. The importance of noise
Every camera introduces noise in the acquisition process, especially when shooting in low light. Every CCD, CMOS sensor or film does that, and a very light noise (aka film grain) is typical of high-quality, high-bitrate video. It is very hard to retain film grain during encoding, because from a mathematical point of view such grain is a very low-power, high-frequency signal… exactly the kind of information a lossy encoder tries to cut in order to reduce the bitrate and compress the video. But reducing the high frequencies too much leads to a “flat” picture. Reintroducing noise in the decoding phase can enhance the perceived quality and hide some compression artifacts.
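A minimal sketch of grain synthesis, again in Python and under simplifying assumptions: the grain is modelled as zero-mean Gaussian noise added to each grayscale sample, with `strength` (a hypothetical parameter) as its standard deviation. Real film grain is not purely Gaussian, so this only approximates the idea.

```python
import random

def add_grain(image, strength=4.0, seed=None):
    """Return a copy of a 2D list of 0-255 values with Gaussian noise added.

    strength is the standard deviation of the noise in gray levels;
    seed makes the noise pattern reproducible.
    """
    rng = random.Random(seed)
    return [
        [min(255, max(0, round(p + rng.gauss(0.0, strength)))) for p in row]
        for row in image
    ]

if __name__ == "__main__":
    flat_skin = [[128] * 8 for _ in range(4)]  # a "too flat" decoded area
    for row in add_grain(flat_skin, strength=4.0, seed=1):
        print(row)
```

In a player the noise pattern would have to change from frame to frame (a different seed per frame in this sketch), otherwise it looks like a dirty screen rather than grain.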
So, after a long series of tests, I have found my personal equation to encode a near-HD video at 250Kbit/s:
Video Encoding Pipeline
1280×720 -> temporal denoising -> resizing to 1024×576 (or 960×544) -> encoding at 250Kbit/s
Video Decoding Pipeline (in Flash Player)
Expand the encoded video from 1024×576 (or 960×544) to 1280×720 -> enhance video’s sharpness* -> add film grain**
It’s very important to note that:
* Sharpness is restored on a 1280×720 grid and not at the encoded video resolution. This introduces details at the final, higher resolution.
** The same is done for the grain. The added grain has a pixel resolution of 1280×720. The sum of the 960×544 video scaled to 1280×720, details at 1280×720 and noise at 1280×720 reconstructs a credible 1280×720 video.
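The order of operations is the point here: scale first, then enhance. The sketch below (Python, hypothetical function names) upscales a frame to the target size and only then runs the enhancement steps, so that sharpening and grain operate on the 1280×720 grid. Bilinear interpolation is an assumption, since the scaler used by the player is not specified, and the enhancement steps are passed in as plain functions.

```python
def upscale_bilinear(image, new_w, new_h):
    """Resize a 2D list of gray values with bilinear interpolation."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(new_h):
        # Map the output pixel centre back into source coordinates.
        fy = min(max((y + 0.5) * h / new_h - 0.5, 0.0), h - 1)
        y0 = int(fy)
        y1 = min(y0 + 1, h - 1)
        ty = fy - y0
        row = []
        for x in range(new_w):
            fx = min(max((x + 0.5) * w / new_w - 0.5, 0.0), w - 1)
            x0 = int(fx)
            x1 = min(x0 + 1, w - 1)
            tx = fx - x0
            top = image[y0][x0] * (1 - tx) + image[y0][x1] * tx
            bottom = image[y1][x0] * (1 - tx) + image[y1][x1] * tx
            row.append(top * (1 - ty) + bottom * ty)
        out.append(row)
    return out

def decode_pipeline(frame, target_w, target_h, steps):
    """Upscale to the display size, then apply each enhancement step in order."""
    out = upscale_bilinear(frame, target_w, target_h)
    for step in steps:  # e.g. a sharpening function, then a grain function
        out = step(out)
    return out

if __name__ == "__main__":
    tiny = [[0, 30, 60]] * 3  # stand-in for a decoded frame
    # 3 -> 4 pixels is the same 4/3 factor as 960 -> 1280.
    for row in decode_pipeline(tiny, 4, 4, steps=[]):
        print(row)
```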
Let’s take a look at this comparison (click for a detailed version of the picture):
A = This is the original video (720p @ 4Mbit/s). Notice the film grain, the beard and the sweater’s details.
B = This is the same video @ 250Kbit/s. Notice that the beard is gone, on the shoulder line there are artifacts and the sweater’s details are not accurate.
C = This is the resized video (960×544) @ 250Kbit/s. Notice that the average level of detail is similar but there are fewer artifacts on the shoulder line and on the sweater.
D = This is the enhanced video, restored to 720p. Notice the enhancement on the beard, the good noise restoration and the details of the sweater.
A = Original 720p video. Notice the beard and hair detail and the film grain.
B = Video encoded at 250Kbit/s. The skin is too flat, and the beard and hair have little detail.
C = Video resized to 960×544. Slightly better details retention.
D = Video scaled and enhanced. Compare with the original: D is compressed 16 times more than A.
Now it’s interesting to take another look at my last experiment from this new perspective. The original video is a 720p video found on YouTube. I clipped it, scaled it down to 960×480, encoded it at 250Kbit/s, then enhanced it in the Flash Player and restored it to 1280×720 with the addition of film grain, exactly as discussed above.
The results are there for all to see, but there’s a problem. The processing power required for such post-processing is high. Decoding 960×544 instead of 1280×720 helps, but performing the filtering and blending in the grain with transparency is a complex task. But don’t worry… I’m working on that (did someone say GPU acceleration & compositing?). Flash is very good at ensuring an excellent user experience. The flexibility of the tool allows developers to enhance the standard features and optimize the delivery.
In conclusion, these are only experiments at the frontier of what is possible today for video on the Internet. But in my opinion, exploring the limits is important for understanding how to perform better in everyday work too.
I will be at MAX 2010 in Los Angeles to discuss various techniques for Flash video optimization. If anyone wants to discuss these and other topics, see you there.