Archive for the ‘Uncategorized’ Category

Let’s rediscover the good old PSNR

26 November 2018

In recent years I’ve been involved in interesting projects around how to measure and/or estimate perceptual quality in video encoding. Measuring quality is by itself useful during the development of encoding optimizations, and for monitoring, to assess the benefits of an optimized pipeline. But it’s even more interesting to estimate the quality you can achieve with specific parameters before encoding, so as to be able to implement advanced logic in Content-Aware Encoding.

It’s a complex topic, but in this post I’d like to focus on the role that PSNR can still play in measuring quality.

Is PSNR poorly correlated with perception?

PSNR is well known to correlate poorly with perception. SSIM is somewhat better, but both appear weakly correlated from a general point of view, or at least so it seems:


The scatter plot shows a number of samples encoded at different resolutions and levels of impairment, plotting PSNR against MOS (perceptual quality in standard TV-like viewing conditions). You can see that the relationship between PSNR and MOS is far from linear: a value of 40 dB, for example, corresponds to MOS values ranging from 1.5 to 5, depending on the content.

It is clear that we cannot use PSNR as an indicator of absolute quality in an encoding. Only at very high values, such as 48 dB, is the measure really meaningful.
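As a refresher, PSNR is just a logarithmic remapping of the mean squared error between the reference and the distorted picture. A minimal sketch (the function name and the toy pixel values are mine, not from any specific tool):

```python
import math

def psnr(ref, dist, max_val=255):
    """Peak Signal-to-Noise Ratio between two equal-length pixel sequences."""
    mse = sum((a - b) ** 2 for a, b in zip(ref, dist)) / len(ref)
    if mse == 0:
        return float("inf")  # identical signals
    return 10 * math.log10(max_val ** 2 / mse)

# Toy example: four luma samples with a small distortion
print(round(psnr([100, 120, 140, 160], [101, 119, 141, 160]), 2))  # 49.38
```

The logarithm is what makes two encodings of the same source easy to compare in dB, but, as the scatter plot shows, the same dB value can map to very different MOS values across different sources.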

It is because of that weak correlation with MOS (an ideal metric would lie on the green line in the graph above) that Netflix defined a metric like VMAF.

VMAF uses a different approach: it calculates multiple elementary metrics and, using machine learning, builds a “blended” estimator that works much better than the individual elementary metrics.

I have worked on a different, but conceptually similar, perception-aware metric in the past, so I know that such metrics can have a problem: they are somewhat expensive to calculate. This is not because of the ML, which is very fast at inference time, but because you need accurate, slow elementary metrics (or a higher number of faster metrics) to get good estimates.

Can PSNR still play a role?

In common experience, PSNR still communicates something to compressionists. Professionals like Jan Ozer continue to advocate PSNR for certain types of assessment, and I agree that, for example, it is particularly valuable in relative comparisons, probably thanks to its “linearity” within a specific testing frame. Knowing how ML-based metrics work, I admit that PSNR is much more linear and monotonic “locally”, while this is not guaranteed for ML-based estimators (it depends heavily on the ML algorithm).


So, let’s take a look at this scatter plot. The cloud of points provides little information, but if we connect the points related to the same video source, a structure starts to emerge.

The relationship between PSNR and MOS for the same source can be linearized in the most important part of the chart, with an error that in some projects can be negligible.

So what do we lack to use PSNR as a perceptual quality estimator?

We need some other information, extracted from the source, that provides us with a starting point and a slope. With a fixed point on the chart (e.g. the PSNR needed to reach MOS 4.5 for a given source) and a slope (the angular coefficient of the linear approximation of the PSNR-MOS relationship for that specific content), we would be able to use PSNR as an absolute perceptual quality estimator, simply by projecting the PSNR onto the line and reading off the corresponding MOS.
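To make the idea concrete, here is a minimal sketch of the projection, assuming we already have the two per-content parameters (the anchor PSNR at MOS 4.5 and the slope). The function name and all values are hypothetical, not measurements from a real model:

```python
def mos_from_psnr(psnr_db, psnr_at_mos45, slope):
    """Linear per-content PSNR-to-MOS projection (hypothetical model).

    psnr_at_mos45: PSNR (dB) at which this source reaches MOS 4.5.
    slope: estimated MOS gained per extra dB of PSNR for this source.
    """
    mos = 4.5 + slope * (psnr_db - psnr_at_mos45)
    return max(1.0, min(5.0, mos))  # clamp to the 1-5 MOS scale

# A source that reaches MOS 4.5 at 42 dB, gaining 0.25 MOS per dB:
print(mos_from_psnr(38.0, 42.0, 0.25))  # 3.5
```

The hard part, of course, is not the projection itself but estimating the anchor point and slope cheaply from the source, scene by scene.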


Now I’m experimenting exactly around how to quickly derive a starting point and a slope from the characteristics of the specific source (scene by scene, of course). The objective is to find a quick method to estimate those parameters (point and slope) so as to be able to measure absolute perceptual quality an order of magnitude more quickly than with VMAF (probably with lower accuracy too, but still with good local linearity and monotonic behavior).

This may be very useful in advanced Content-Aware Encoding logic to measure/estimate the final MOS of the current encoding and, for example, adjust parameters to achieve the desired level (CAE in live encoding is the typical use case).




Categories: Uncategorized

Be quick! 5 promo codes to save $200 on a MAX 2011 full conference pass

22 September 2011

I’m happy to offer 5 of my readers the opportunity to get a $200 discount on a full conference pass at Adobe MAX 2011. The first 5 of you who use the promo code ESSONNATI while registering for the conference will get the discounted rate of $1,095 instead of $1,295. Be quick! MAX is approaching fast!

If you get in, remember to attend my presentation: “Encoding for performance on multiple devices”


Mobile development with AIR and Flex 4.5.1

8 July 2011

Recently I had a very pleasant experience developing a set of mobile applications for the BlackBerry PlayBook using Adobe Flex 4.5.1.

In the past I was critical of Adobe because I believed that Flex for mobile was not sufficiently smooth on devices and that the workflow was not efficient, but after this project I had to think again. The main application is not very complex, but it has given me the opportunity to evaluate, in a real scenario, the efficiency of the framework and the level of performance on multiple devices.

The final impression is that Adobe is doing really well: after a year of tests and improvements, Flex is becoming an efficient and powerful cross-device development framework. There are still some points to improve and some features to implement or enhance, but I’m not so critical anymore.

The application I developed is a classic multimedia app commissioned by a media client (Virgin Radio Italy) to offer a variety of multimedia content for the entertainment of their mobile users. The app offers:

– A selection of thematic web radios plus the live broadcast of the main radio channel (Virgin Radio Italy)
– A selection of podcasts (MP3) from the main programs of the radio
– The charts/playlists created by Virgin’s sound designers or voted by the users
– A multi-touch photo gallery
– A selection of VOD content such as video clips, interviews and concerts

The application is now under approval and should be available in the PlayBook App World in a few days. In the meantime you can take a look at the UX in this preview video:


More information about Flash Player 11, AIR 2.7 and beyond

20 April 2011

In a recent post I summarized the information that can be found on the Internet about the features that will be implemented in the next releases of Flash Player (not necessarily 11, maybe 10.x) and AIR.

At Flash Camp Brazil, Arno Gourdol (Adobe Flash Runtime team) provided a lot of additional information about what we will be able to see in the near future. Many improvements are framed from a mobile perspective, and this is natural because mobile clients are expected to outnumber desktop clients by 2013.

Take a look at the list of future improvements:

1. Faster Garbage Collection
An incremental GC will be implemented to avoid clusters of GC pauses. More control and reduced allocation cost.

2. Faster ActionScript
Type-based optimizations and a new “float” type (32-bit) will be introduced to enhance JIT compiler performance.

3. Multi Threading
A worker-thread model will be implemented to run multiple ActionScript tasks without blocking the UI and to leverage multi-core CPUs.

4. Threaded video pipeline
Video decoding optimizations for the standard Video and StageVideo objects that leverage multi-core CPUs and the GPU in parallel.

5. HW Renderer
Use the GPU as much as possible for graphics, vectors and 3D. Stage3D. Hardware compositing.

I think the last point especially is very important for mobile if Adobe wants to close the gap between native mobile applications and AIR/Flex applications. Take a look at the picture below to understand what I mean: with HW acceleration, AIR 2.7 will be able to improve performance significantly, especially on iOS devices and even for Flex applications. Let’s hope to see it very soon.


BBC iPlayer and Flash Player, a 3-year love story

28 December 2010

In late December 2007 the BBC released the Flash-based iPlayer, creating the concept of “catch-up TV” and starting one of the most important success stories in video distribution over the Internet. The service has seen a very rapid adoption rate and is now incredibly popular in the UK, with over 140 million video requests recorded in November 2010. Being a long-format video site, the total amount of streamed video is very high: approximately 25 million hours/month. This makes iPlayer one of the top “video streaming sources” in the world. For comparison, Yahoo! and VEVO, the second and third video sites in the U.S. by number of users, deliver “only” around 15 million hours/month.

The success of the service can be attributed to the very fluid user experience and good video quality. The use of Flash Player has been the main catalyst of this process, and even now that iPlayer is available on multiple devices and platforms, the desktop version is still the most used. Only 4% of video is consumed on mobile devices, 7% on PS3 and Wii (both using Flash), 16% on the Virgin TV STB, and the remaining 73% on desktop PCs. Part of the mobile traffic is related to the Android version of iPlayer, which is Flash-based too.

Overall, more than 80% of the video streaming is consumed using the Flash Player version of iPlayer. This percentage is similar to the worldwide market share of Flash for Internet video delivery. Today the BBC is using features that only Adobe Flash can offer: streaming protection, dynamic buffering, bitrate switching, DRM-protected download (with Adobe Access and an AIR client), and very high market penetration (near 100% for desktops, plus Android, PS3 and Wii). So it’s no mystery why the service has become so popular. And the BBC is not alone: all of the top 10 video sites deliver video using Flash as their main infrastructure.

To see more statistics about iPlayer, read the latest report published by the BBC.


Testing StageVideo in Flash Player 10.2

2 December 2010

Adobe has launched the public beta of Flash Player 10.2. This minor update offers us a limited but very important set of improvements:

  • Internet Explorer 9 hardware-accelerated rendering – Flash Player 10.2 exploits GPU-accelerated graphics (vector rendering, compositing, etc.) in Internet Explorer 9.
  • Stage Video hardware acceleration – H.264 decoding, scaling and compositing are performed entirely by the GPU.
  • Native custom mouse cursors – Developers can define custom native mouse cursors, enabling user experience enhancements and improving performance.
  • Support for full-screen mode with multiple monitors – Full-screen content will remain in full screen on secondary monitors, especially useful during video playback.

From my personal point of view, the most important improvement is the new StageVideo API. I introduced the technology in this post, talking about the AIR for TV runtime, but now it is becoming reality for the desktop too.

The StageVideo technology allows direct use of the video acceleration features of the underlying hardware. When using a StageVideo object instead of the classic Video object you accept some limitations (essentially because the video is not part of the display list but is an external video plane composited by the GPU with the Flash stage), but you get excellent performance, with zero dropped frames and high rendering quality.

The aim of StageVideo is to optimize decoding performance even in low-power CPU scenarios (netbooks, set-top boxes, smartphones and tablets) where it is very important to exploit dedicated HW acceleration features instead of using the CPU.

Flash Player 10.1 (mobile or not) already exploited HW acceleration for H.264 decoding and scaling but, especially on Mac and mobile, some steps were still performed by the CPU. For example, compositing into the display list was very CPU-intensive on both Mac and mobile. With Stage Video this is going to change.

For the moment Flash Player 10.2 is only available for the desktop, but AIR for TV already supports StageVideo. I hope to see the mobile version of Flash Player 10.2 very soon, because that is the kind of platform that can benefit most from StageVideo (perfect performance, higher quality, lower battery consumption).

The performance

If you want to compare the performance of a video decoded with StageVideo against the classic method, install the player and go to this test page. You can also go to YouTube, which has already started to support StageVideo. Below are my results:

Note: the video is 1920×1080 25p at a very high bitrate, so I suggest starting playback, then pausing it (with SPACE) and letting it buffer for a while. Press O to switch from StageVideo to the standard Video object and monitor the CPU usage.

Laptop Core 2 Duo 2.1 GHz – Windows Vista – IE8 – GeForce 8400M

With standard Video object: 45-50% (H.264 decoding is accelerated on Windows 7/Vista, but not the full path to the screen)

With StageVideo : 3-5%

Desktop Quad Core 2.4 GHz – Windows XP sp3 – IE7 – ATI Radeon 3400

With standard Video object: 30-35% (H.264 is decoded in software here, on 4 cores)

With StageVideo: 15-20% (this GPU is not very powerful, but still impressive considering that VLC requires 20-25%)

Desktop Core i7 2.8 GHz – Windows 7 x64 -FF 3.6 – ATI Radeon 5750

With standard Video object: 10% (H.264 decoding is accelerated on Windows 7/Vista, but not the full path to the screen)

With StageVideo: 0%! (Yes, you read that right: zero percent)

Mac iBook Core 2 Duo 2.2 GHz – OSX – Safari – GeForce 8400M

With standard Video object: >40% (H.264 decoding “should” be accelerated on NVIDIA cards with the latest OSX and Safari)

With StageVideo: <20%

Very, very impressive.

To learn more about how to support StageVideo in your players, read this article by Thibault Imbert.


H.264 for image compression

19 October 2010

Google has recently presented the WebP initiative, which proposes a substitute for JPEG for image compression using the intra-frame compression technique of the WebM project (the VP8 codec).
Not everybody knows it, but H.264 is also very good at encoding still pictures and, unlike WebM or WebP, 97% of PCs are already capable of decoding it (someone said ‘Flash Player’?).
The intra-frame compression techniques implemented in H.264 are very efficient: much more advanced than the old JPG and superior to WebP too. So let’s take a look at how to produce such files and the advantages of using them inside Flash Player.

JPG image compression

JPG is an international standard approved in 1992-94. It has been one of the most important technologies for the web, because without an efficient way to compress still pictures the web would not be what it is today. JPEG is usually capable of compressing images at around 1:10. The encoder performs these steps:

1. Color space conversion from RGB to YCbCr
2. Chroma subsampling, usually to 4:2:0 (4:2:2 and 4:4:4 are also supported)
3. Discrete Cosine Transform of 8×8 blocks
4. Quantization
5. Entropy coding (zig-zag RLE and Huffman)

The algorithm is well known and robust and is used in almost every electronic device with a color display, but in the last 15 years researchers have naturally developed more advanced algorithms for encoding still pictures. One of these is JPEG2000, which leverages wavelets. The problem of improving intra-frame compression is also very important in video encoding, because this is the kind of compression used for keyframes. So first H.263 and then H.264 proposed more optimized ways to encode a single picture.
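The transform and quantization steps (3 and 4 above) can be sketched in a few lines. This is my own naive toy code, not an actual JPEG encoder; it shows how a flat block collapses into a single DC coefficient, which is what makes the subsequent entropy coding so effective:

```python
import math

def dct2(block):
    """Naive 2D DCT-II of an 8x8 block, as used in JPEG (step 3)."""
    n = 8
    c = lambda k: math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = sum(block[x][y]
                    * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                    * math.cos((2 * y + 1) * v * math.pi / (2 * n))
                    for x in range(n) for y in range(n))
            out[u][v] = c(u) * c(v) * s
    return out

block = [[128] * 8 for _ in range(8)]  # a flat mid-grey 8x8 block
coeffs = dct2(block)
# Step 4: uniform quantization (a real encoder uses a per-frequency matrix)
quantized = [[round(coeffs[u][v] / 16) for v in range(8)] for u in range(8)]
ac = sum(abs(quantized[u][v]) for u in range(8) for v in range(8) if (u, v) != (0, 0))
print(quantized[0][0], ac)  # all the energy sits in the DC term; every AC term is 0
```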

H.264 intra frame compression

H.264 contains a number of new features that allow it to compress images much more efficiently than JPG.

New transform design

Unlike JPG, an exact-match integer 4×4 spatial block transform is used instead of the well-known 8×8 DCT. It is conceptually similar to the DCT but produces fewer ringing artifacts. There is also an 8×8 spatial block transform for less detailed areas and chroma.

A secondary Hadamard transform (2×2 on chroma and 4×4 on luma) can usually be performed on the “DC” coefficients to obtain even more compression in smooth regions.

There is also an optimized quantization scheme and two possible zig-zag patterns for run-length encoding of the transformed coefficients.

Intra-frame compression

H.264 introduces complex spatial prediction for intra-frame compression.
Rather than the “DC”-only prediction found in MPEG-2 and the transform coefficient prediction found in H.263+, H.264 defines 9 prediction modes (8 directions plus DC) for blocks encoded with the 4×4 transform, predicting spatial information from neighbouring blocks. The encoder tries to predict the block by extrapolating the pixel values of adjacent blocks, so only the delta signal needs to be transmitted.

There are also 4 prediction modes for smooth color zones (16×16 blocks). Residual data is coded with 4×4 transforms, and a further 4×4 Hadamard transform is applied to the DC coefficients.
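A tiny sketch of the idea, using only the “horizontal” mode on a 4×4 block (hypothetical toy code, not from any codec): the block is predicted from the reconstructed pixels to its left, and only the small residual would then be transformed and transmitted:

```python
def predict_horizontal(left_col):
    """H.264-style horizontal intra prediction: each row copies its left neighbour."""
    return [[left_col[r]] * 4 for r in range(4)]

block = [[100, 101, 102, 101],
         [110, 110, 111, 110],
         [120, 121, 120, 119],
         [130, 130, 131, 130]]
left = [100, 110, 120, 130]  # reconstructed pixels just left of the block

pred = predict_horizontal(left)
residual = [[block[r][c] - pred[r][c] for c in range(4)] for r in range(4)]
print(residual[0])  # [0, 1, 2, 1] -- small values, cheap to entropy-code
```

A real encoder tries every mode and keeps the one whose residual costs the fewest bits.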

Improved quantization

A new logarithmic quantization scale is used: the step size grows by about 12% per QP increment, doubling every 6 steps. It’s also possible to use frequency-customized quantization scaling matrices, selected by the encoder, for perceptual quantization optimization.

Inloop deblocking filter

An adaptive deblocking filter is applied to reduce any blocking artifacts at high compression ratios.

Advanced Entropy Coding

H.264 can use the state of the art in entropy coding: Context-Adaptive Binary Arithmetic Coding (CABAC), which is much more efficient than the standard Huffman coding used in JPG.

JPEG vs H.264

The techniques used in H.264 double the efficiency of the compression; that is, you can achieve the same quality at half the size. The gain is even higher at very high compression ratios (1:25+), where JPG introduces so many artifacts that it becomes completely unusable.

This is a detail of a 1024×576 image compressed to around 50 KB both in JPG (using PaintShop Pro) and H.264 (using FFmpeg). The picture is shown at 2× zoom to better show the artifacts:

I have estimated a reduction in size of around 40-50% at the same perceived quality, especially at high compression ratios.

WebP vs H.264

WebP is based on the intra-frame compression technique of the VP8 codec. I compared H.264 with VP8 in this article. VP8 is a good codec and its intra-frame compression is very similar to H.264’s. The difference is that VP8 does not support the 8×8 block transform (which is a feature of the H.264 High profile) and can only encode 4:2:0 (H.264 supports up to 4:4:4). So both should have approximately the same performance at the common settings (4:2:0). The problem with WebP is support, which is currently almost non-existent, while H.264 can be decoded by Flash (97% of desktops, plus Android and RIM) and also by iOS devices (via HTML5).

How to encode and display on a page

Now let’s start encoding pictures in H.264. The container can be .mp4 or .flv. FLV is lighter than .mp4, but .mp4 has far more support outside Flash. This is the command line to use with FFmpeg:

ffmpeg.exe -i INPUT.jpg -an -vcodec libx264 -coder 1 -flags +loop -cmp +chroma -subq 10 -qcomp 0.6 -qmin 10 -qmax 51 -qdiff 4 -flags2 +dct8x8 -trellis 2 -partitions +parti8x8+parti4x4 -crf 24 -threads 0 -r 25 -g 25 -y OUTPUT.mp4

The -crf parameter changes the quality level: try values from 15 to 30 (the final effect depends on the frame size). You can also resize the frame prior to encoding using the -s WxH parameter (e.g. -s 640x360).

To display the picture encoded in H.264 you can use this simple AS3 code:

var nc:NetConnection = new NetConnection();
nc.connect(null); // null = progressive download, no media server
nc.client = this;
var ns:NetStream = new NetStream(nc);
ns.client = this; // 'this' should expose an onMetaData() handler
var video:Video = new Video();
video.smoothing = true;
video.attachNetStream(ns);
addChild(video);
stage.scaleMode = "noBorder";"OUTPUT.mp4");

The Advantage of using Flash for serving picture in H.264

The main advantage of using H.264 for pictures is the superior compression ratio. But it is not practical, in an everyday scenario, to substitute every instance of the common <img> tag with an SWF.
However, there is a kind of application that can benefit enormously from this approach: the display of big, high-quality copyrighted pictures. Instead of offering access to low-quality, watermarked JPGs, it would be possible to serve such big, high-quality pictures as H.264 streams from a Flash Media Server and protect the delivery using the RTMPE protocol and SWF authentication. On top of that, for bullet-proof protection, you could even encrypt the H.264 payload with a robust DRM like Adobe Access 2.0. Not bad.
