In my last post I discussed what I consider to be the current arch-enemy of video encoding: “banding”.
Banding can be a consequence of quantization in various scenarios today, particularly when the source contains a gradient or a low-power textured area and your CAE (Content Aware Encoding) algorithm is using an excessive QP.
Banding is more frequent in 8-bit encoding, but it is also possible in 10-bit encoding, and it is frequent even in high-quality source files or mezzanines that have been through many encoding passes.
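To make the 8-bit versus 10-bit difference concrete, here is a minimal Python sketch. It assumes a hypothetical dark gradient that spans only 2% of the luma range across 1920 pixels (the numbers are illustrative, not measured on a real clip) and counts how many distinct code values each bit depth has available to represent it. The function name band_stats is made up for this example.

```python
def band_stats(width, lo, hi, bits):
    """Quantize a linear ramp from lo to hi (normalized 0..1) to `bits` bits.

    Returns (number of distinct code values, average band width in pixels).
    """
    max_code = (1 << bits) - 1
    codes = [round((lo + (hi - lo) * x / (width - 1)) * max_code)
             for x in range(width)]
    levels = len(set(codes))
    return levels, width / levels

# Hypothetical gradient: 2% of the luma range over a 1920-pixel-wide frame.
levels8, width8 = band_stats(1920, 0.20, 0.22, 8)
levels10, width10 = band_stats(1920, 0.20, 0.22, 10)
print(f"8-bit:  {levels8} levels, bands ~{width8:.0f} px wide")
print(f"10-bit: {levels10} levels, bands ~{width10:.0f} px wide")
```

With these assumed numbers, 8-bit has only a handful of code values to cover the whole gradient, so each band is hundreds of pixels wide, while 10-bit has several times more levels and smaller steps between them. The bands are still there in 10-bit, only narrower and shallower, which is consistent with banding being less frequent but not impossible.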
Modern block-based codecs are all prone to banding. Indeed, I find H.265, VP9 and AV1 to be even more prone to banding than H.264 because of their larger block transforms (and that has contributed to an increase of banding in YouTube and Netflix videos in recent times).
As discussed in the previous post, it is easy to run into banding, partly because it is subtle and not easy to measure. Metrics like PSNR and SSIM, and even VMAF, are not sensitive to banding, even though an average viewer can easily spot it, at least in optimal viewing conditions.
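A toy experiment shows why a fidelity metric such as PSNR can miss banding. This is only a sketch under stated assumptions, not a real encode: a smooth floating-point ramp stands in for the source, and a staircase with a hypothetical step of 4 code values stands in for the banded output.

```python
import math

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length signals."""
    mse = sum((r - t) ** 2 for r, t in zip(ref, test)) / len(ref)
    return float("inf") if mse == 0 else 10 * math.log10(peak ** 2 / mse)

WIDTH, STEP = 1024, 4  # illustrative values, not taken from any codec

# Smooth ramp covering the full 0..255 range.
smooth = [255.0 * x / (WIDTH - 1) for x in range(WIDTH)]
# Same ramp snapped to multiples of STEP (clamped to 255): a staircase.
banded = [min(255, STEP * round(v / STEP)) for v in smooth]

score = psnr(smooth, banded)
bands = len(set(banded))
print(f"PSNR = {score:.1f} dB with {bands} flat bands")
```

The per-pixel error never exceeds half a step, so PSNR lands in the high 40s of dB, a range usually associated with very good quality, yet the ramp has become a sequence of flat steps with visible edges. The metric averages small errors; the eye locks onto their structure.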
This is an example of banding:
The background shows a considerable amount of banding, especially in motion, when the “edges” of the bands move coherently and form a perceptually significant and annoying pattern. Below, the picture has its gamma boosted to better show the banding.
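For readers who want to reproduce this kind of view, a gamma boost can be sketched in a few lines of Python. The exponent 2.2 is an assumption for illustration (the value used for the pictures here is not stated), and boost_gamma is a hypothetical helper operating on a frame stored as a list of rows of 8-bit luma values.

```python
def boost_gamma(pixels, gamma=2.2, peak=255):
    """Brighten dark areas: out = peak * (in / peak) ** (1 / gamma)."""
    # Precompute a lookup table so each pixel is a single index operation.
    lut = [round(peak * (v / peak) ** (1.0 / gamma)) for v in range(peak + 1)]
    return [[lut[v] for v in row] for row in pixels]

# Hypothetical dark frame where neighbouring bands differ by one code value.
frame = [[16, 17, 18, 19],
         [16, 17, 18, 19]]
boosted = boost_gamma(frame)
print(boosted[0])
```

Black and white stay where they are, while the dark codes are lifted and spread apart, so a one-code step between two bands becomes a larger and more visible step on screen.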
Seek to prevent
To prevent banding, it is first of all necessary to be able to identify it. This by itself is a complex problem.
Recently I’ve been trying to find a way (there are many different approaches) to estimate the likelihood of perceptually significant banding in a specific portion of a video.
I’m using an auto-correlation approach that is giving interesting preliminary results. This “banding metric” analyzes only the final picture, without reference to source files (which, in the case of mezzanines or sources, you obviously do not have anyway).
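The details of the metric are not described here, so the following Python sketch is not that algorithm. It only illustrates the general idea of a no-reference, autocorrelation-based measure on the simplest possible case: the normalized lag-1 autocorrelation of a single row of luma values. The two test rows are hypothetical.

```python
import random

def autocorr(signal, lag=1):
    """Normalized autocorrelation of a 1-D signal at the given lag."""
    n = len(signal)
    mean = sum(signal) / n
    var = sum((s - mean) ** 2 for s in signal)
    if var == 0:
        return 1.0  # perfectly flat signal: treat as fully self-similar
    cov = sum((signal[i] - mean) * (signal[i + lag] - mean)
              for i in range(n - lag))
    return cov / var

# Hypothetical luma rows: a staircase (banded gradient) and a noisy texture.
staircase = [i // 32 for i in range(256)]
rng = random.Random(0)
texture = [rng.randint(0, 7) for _ in range(256)]

banded_score = autocorr(staircase)   # close to 1
texture_score = autocorr(texture)    # close to 0
print(f"staircase: {banded_score:.3f}, texture: {texture_score:.3f}")
```

Only the decoded picture is needed, which is the property that matters when no source is available. Note that a clean, unbanded gradient would also score high with this naive version, so a usable detector has to do more than this, for instance look at the structure of the steps rather than the raw signal.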
For example, here we have a short video sequence. When you watch it in optimal viewing conditions, you can spot some banding in flat areas. The content is quite dark (maybe you can spot someone familiar in the background 😉) so, as usual, in what follows I’ll preferably show the frames with boosted gamma.
The algorithm produces the following frame-by-frame report where an index of banding is expressed for each quadrant of the picture (Q1 = Top Left quadrant, Q2 = Top Right quadrant, Q3 = Bottom Left quadrant, Q4 = Bottom Right quadrant).
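The layout of such a report can be sketched in Python as follows. The quadrant naming matches the one above; everything else is an assumption for illustration: frames are lists of rows of luma values, and the scoring function is passed in as a parameter (a plain mean is used as a stand-in below, not a banding index).

```python
def split_quadrants(frame):
    """Split a frame (list of rows) into Q1..Q4 as named in the text."""
    h, w = len(frame), len(frame[0])
    top, bottom = frame[: h // 2], frame[h // 2:]
    return {
        "Q1": [row[: w // 2] for row in top],      # top left
        "Q2": [row[w // 2:] for row in top],       # top right
        "Q3": [row[: w // 2] for row in bottom],   # bottom left
        "Q4": [row[w // 2:] for row in bottom],    # bottom right
    }

def quadrant_report(frames, score):
    """One dict per frame, mapping quadrant name to score(quadrant)."""
    return [{name: score(q) for name, q in split_quadrants(f).items()}
            for f in frames]

def mean_luma(block):
    """Placeholder score: average value of the block."""
    values = [v for row in block for v in row]
    return sum(values) / len(values)

frame = [[1, 1, 2, 2],
         [1, 1, 2, 2],
         [3, 3, 4, 4],
         [3, 3, 4, 4]]
report = quadrant_report([frame], mean_luma)
print(report[0])
```

Scoring each quadrant separately keeps a small banded region, such as a patch of sky, from being averaged away by the rest of the frame.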
Below you can see Frame 1 with boosted gamma. From the graph above, we see that the quadrant with the highest banding likelihood is Q2. I have not yet calculated the most appropriate threshold for perceptually visible banding, but empirically it is near 0.98 (horizontal red line). So in this frame we have a low likelihood of banding, with only a minor probability for Q2.
In the frame below we have an increasing amount of banding, especially in Q1 but also in Q2 (on the tree and sky). The graph above shows an increasing probability of perceptually visible banding in quadrants Q1 and Q2, and in fact they are above the threshold, while Q3 and Q4 are below it.
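Applying the empirical threshold to a per-quadrant report is straightforward. In this Python sketch the 0.98 value comes from the text, while the index values in the sample report are invented for illustration and are not read from the graph.

```python
THRESHOLD = 0.98  # empirical value quoted in the text, not yet calibrated

def flag_quadrants(report, threshold=THRESHOLD):
    """For each frame, list the quadrants whose index exceeds the threshold."""
    return [sorted(q for q, v in frame.items() if v > threshold)
            for frame in report]

# Hypothetical index values for two frames.
sample_report = [
    {"Q1": 0.950, "Q2": 0.975, "Q3": 0.900, "Q4": 0.880},
    {"Q1": 0.992, "Q2": 0.985, "Q3": 0.940, "Q4": 0.910},
]
flags = flag_quadrants(sample_report)
for number, quadrants in enumerate(flags, start=1):
    print(f"frame {number}: {quadrants or 'no banding flagged'}")
```

Keeping the threshold as a parameter makes it easy to adjust once a properly calibrated value is available.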
Then there’s a scene change, and for the new scene the graph reports a high probability of banding for quadrants Q1 and Q3, an oscillating behaviour for Q2 (the hands are moving and the dark areas exhibit banding in some parts of the scene), while quadrant Q4 is completely free of banding.
As discussed, it’s very important to start from the identification and measurement of banding, because if you can find it, you can correct encoding algorithms to better retain detail and avoid introducing this annoying artifact. It’s also useful to analyze sources and reject them when any banding is found, otherwise every subsequent encoding will only worsen the problem. The journey to defeat banding is only at the beginning… wish me good luck 😉