The global bandwidth consumption is growing every day and one of the main causes is the explosion of bandwidth hungry Internet services based on Video On Demand (VOD) or live streaming. Youtube is accounted for a considerable portion of the whole Internet bandwidth usage, but also Hulu and NetFlix are first class consumers.
One of the cause of this abnormal consumption (apart from the high popularity) is the low level of optimization used in video encoding: for example Youtube encodes 480p video @1Mbit/s, 720p @2.1Mbit/s and 1080p @3.5Mbit/s which are rather high values. But also NetFlix, BBC and Hulu use conservatives settings. You may observe that NetFlix and Hulu use adaptive streaming to offer different quality levels depending by network conditions but such techniques are aimed at improving the QoS and not reduce the bandwidth consumption, so it’s very important to offer a quality/bitrate ratio as high as possible and not underestimate all the consequences of an un-optimized encoding.
The main consequence of a not optimized video is an high overall bandwidth consumption and therefore an high CDN bill. But for those giants this is not always a problem because, thanks to very high volumes, they can negotiate a very low cost per GByte.
However it is not only a matter of pure bandwith cost. There are many other hidden “costs”. For example at peek hours may be difficult to stream HD videos from Youtube without frequent, and annoying, rebuffering. Furthermore a lot of users nowadays use mobile connections for their laptop/tablet and rarely such connections offer more than 1-2 Mbit/s of real average bandwidth. If the video streaming service, differently from YouTube, uses dynamic streaming (like Hulu, NetFlix, EpicHD, etc…) the user is still able to see the video without re-buffering but it is very likely that he will obtain one of the lower quality versions of the stream in this bandwidth constrained scenarios and not the high quality one.
Infact, the use of dynamic streaming is today very often used as an alibi for poorly optimized encoding workflows…
This state of insufficient bandwidth is more frequent in less developed countries. But even highly developed countries can have problems if we think at the recent data trasfer restrictions introduced in Canada or in USA by some network providers (AT&T – 150GB/month, for example).
These limits are established especially because at peak hours the strong video streaming consumption can saturate the infrastructure, even of an entire nation as was happening in 2008-2009 in UK after the launch and the consequent extraordinary success of BBC’s iPlayer.
So dynamic streaming can help but must not to be used as an excuse to poorly optimized encodings and it is absurd to advertise a streaming service as HD when it requires to have 3-4Mbit/s+ of average bandwidth to stream the higher quality bitrate while in USA the average is around 2.9Mbit/s (meaning that more than 50% of users will stream a lower quality stream and not the HD one).
How many customers are really able to see an HD stream from start to end in a real scenario with this kind of bitrates ?
The solution is : invest in video optimization
Fortunately today every first class video provider uses H.264 for their video and H.264 offers still much room for improvement.
In the past I have shown several examples of optimized encodings. They were often experiments to explore the limit of H.264, or the possibilities of further quality improvements that the Flash Player can provide to a video streaming service (take a look at my “best articles” area).
In such experiments I have usually tried to encode a 720p video at a very low bitrate like 500Kbit/s. 500Kbit/s is more than a psycological threshold because at this level of bitrate it is really-really complex to achieve a satisfactory level of quality in 720p. Therefore my first experiments used to be accomplished on not too much complex contents.
But in these last 3 years I have improved considerably my skills and the knowledge of the inner principles of H.264. I have worked for first class media companies and contributed to the creation of advanced video platforms capable to offer excellent video quality for desktop (Flash, Silverlight, Widevine), mobile (Flash, HLS, native) and STB (vbr or cbr .ts).
So now I’m able to show you some examples of complex content encoded with very good quality/bitrate ratios in a real world scenario.
I’m not afraid
To show you this new level of optimization of H.264 I have choosen one of the most watched video in Youtube: Not afraid by Eminem.
This is a complex clip with a lot of movements, dark scenes, some transparencies, lens flares and a lot of fine details on artist’s face.
Youtube offers the video in these 4 versions (plus a 240p):
1080p @ 3.5Mbit/s
720p @ 2.1Mbit/s
480p @ 1Mbit/s
360p @ 0.5Mbit/s
Starting from this “state of art”, I have tryed to show what can be obtained with a little bit of optimization.
Why not try to offer the quality of the first three stream options but at half the bitrate ? Let’s say:
1080p @ 1.7Mbit/s
720p @ 1Mbit/s
576p @ 0.5Mbit/s
A such replacement would lead to two consequences:
A. Total bandwidth consumption reduced approximately by 2.
B. Much more users would be able to watch high quality video, even in low speed scenarios (mobile, capped connections, peak hours and developing countries).
But first of all, let’s take a look at the final result. Here you find a comparison page. On the left you have the YouTube video, on the right the optimized set of encodings. It is not simple to compare two 1080p or 720p videos (follow the instructions in the comparison page), so I have extracted some screenshots to compare the original Youtube version with the optimized encoding.
Notice the skin details and imperfections. The optimized encoding offers virtually the same quality at half the bitrate. Consequently, the quality of 1080p at 15% less bitrate than the 720p version of Youtube.
2. Youtube 720p @ 2Mbit/s vs optimized 720p @ 1Mbit/s
Again virtually the same quality at half the bitrate. Consequently 720p video can be offered instead of 480p which has the same bitrate:
3. Youtube 480p @ 1Mbit/s vs 720p @ 1Mbit/s
Optimized 720p offers higher quality (details, grain, spatial resolution) at the same bitrate.
4. Youtube 480p @ 1Mbit/s vs optimized 576p@ 500Kbit/s
Instead of using a 854×480 @ 500Kbit/s resolution I have preferred to use a 1024×576 (576p). I have also tryed to encode in 720p @ 600-700Kbit/s with very good results but I liked the factor 2 reduction in bitrate, so in the end, I opted for 576p which offered more stable results across the whole video. In this case the quality, details level and spatial resolution is higher than the original but at half the bitrate.
5. Youtube 360p @ 500Kbit/s vs optimized 576 @ 500Kbit/s
Again much higher spatial resolution, details level and overal quality at the same bitrate.
For the sake of optimization
How have I obtained a bitrate / quality ratio like this ? Well, it is not simple but I will try to explain the base principle.
Modern encoders do a lot of work to optimize the encoding from a mathematical / machine point of view. So for example a metric is used for Rate Distortion Optimization (like PSNR or SSIM). But this kind of approach is not always usefull at low bitrates, or when a high quality/bitrate ratio is required. In this scenario the standard approach may not lead to the best encoding because it is not capable to forecast what pictures are more important to enhance the quality perceived by the average user. Not every keyframes or portions of video are equally important.
These examples of optimized encodings are obtained with a mix of automated video analysis tool (for dynamic filtering, for istance) and human-guided fitting approach (for keyframe placement and quality burst). I’m actually developing a fully automated pipeline but by now, if an expert eye guides the process, it produces better results.
Unfortuntely there is a downside in using an ultra-optimized encoding: the encoding time rises consistently, so it is not realistic to think that Youtube could re-encode every single video with new optimized profiles.
But, you know, when we talk about big numbers, there’s an empiric law which may help use in a real world scenario: the Pareto principle. Let’s apply the Pareto principle to Youtube…
The Pareto principle
The Pareto principle (aka the 80-20 law) states that, for many events, roughly 80% of the effects come from 20% of the causes. Applying this rule to YouTube, it’s very likely that 80% of traffic comes from 20% of videos. A derivation of Pareto law known as 64-4 rule states that 64% of effects come from 4% of causes (and so on). So optimizing a reduced set of most popular videos would lead to huge savings and optimal user experience with only a limited amount of extra effort (the 4%).
But “Not Afraid” belongs to the top 10 of most popular video on YouTube, so it’s a perfect candidate for an extremization of Pareto law.
Let’s do some calculation. My samples reduce the bandwith of a factor 2 at every versions. So if we suppose that the most preferite version of the video is 720p and consider that the video has been watched more than 250 M times in the last 12 months, YouTube has consumed : 64MB * 250 M views = 16 PBytes, only to stream Not Afraid for 1 year.
Supposing an “equivalent” cost of 2c$/GByte*, this means 320.000$ (* it’s the lowest cost in CDN industry for huge volumes; probably YouTube uses different models of billing so consider it as a rough evaluation).
So an hand-made encoding of only 1 video could generate a saving of 160.000$. Wow… Encoding even only the TOP10 Youtube videos means probably at least 1M$ of saving…multiply this for the TOP1000 video and probably we talk of tens of millions per year…what to say…youtube, you know where to find me 😉
Moral of the story
The proposed application of Pareto rule is an example of adaptive strategy. Instead of encode all the video with a complex process that could not be affordable, why not encode only a limited subset of very popular videos ? Why not encode them with the standard set and then re-process without hurry only if the rate of popularity rise over an interesting threshold ?
Adaptive strategies are always the most productive. So if you apply this to the Youtube model, you get huge bandwidth (money) savings, if you apply this to a NetFlix model (dynamic streaming) you get a sudden increase in average quality delivered to clients and so on.
Concluding, the moral of the story is that every investment in encoding optimizations and adaptive encoding workflows can have very positive effects on user experience and/or business balance.
PS: I’ll speak about encoding and adaptive strategies during Adobe MAX 2011 (2-5 October) – If you are there and interested in encoding join my presentation : http://bit.ly/qvKjP0