Monday, March 06, 2006

How caching supersedes multicast

After some discussions with Alexander I've spent some time during the last week thinking about multicast and the distribution of television. I decided it would probably be a good idea to write it down.

Multicast is an effective method to deliver the same data to several nodes at the same time, but what are the applications of multicast? There are two basic criteria that must be fulfilled for an application to benefit from it.

  • There must be more than one node requesting the same data at the same time. If only one node wants the data at that specific moment, the multicast becomes a unicast.
  • There can't be any strict requirement that the data is delivered correctly. It is a one-way communication channel, and resending data that one node has lost would deliver it to all nodes. Yes, there are schemes of varying complexity to make it "reliable", but they are at best an ugly hack.

This makes the group of applications where multicast is useful very limited. There is actually only one application that clearly involves many nodes requesting the same data at the same time: television and radio. So does triple play, putting every service into IP packets, really make us need multicast-enabled networks?

The way television and radio work today is just a consequence of the technology used to deliver the data to the viewers: the data is broadcast (either via radio waves or via shared media in CATV networks) to many receivers at the same time. Broadcasting by definition forces the viewers to watch the programmes at the same time (or use a time-shifting device such as a VHS recorder). But why would I want to watch the latest episode of "Lost" on Wednesdays at 21:00 just because it is scheduled for broadcast at that time? Almost everyone can name a time that would suit them better. Why should this misfeature of TV and radio be reimplemented when the technology used to distribute it changes?

All TV and radio should be on demand, giving the viewer the power to decide when he or she wants to watch the programme. This is already happening to a large extent, although in a way that is not blessed by the creators of the content: people download programmes with file-sharing applications such as BitTorrent. The two main reasons people download TV programmes are that the commercial breaks are removed and, probably more important, that they can watch them at any time. Another data point is how podcasting has changed how people listen to radio. People download the shows they are interested in and listen to them when it suits them. The huge popularity of podcasting is a strong indication that people want to consume media when it suits them, not at a time scheduled by the broadcasting company because of a limitation in the technology.

Such a paradigm shift in how media is delivered will void the need for multicast to distribute television and radio. There will be very few cases where many people watch the same thing at the same time: live broadcasts, which in principle are limited to large sport events such as the Olympic Games, world championships or the final of some league. That is only a very, very small share of all media, so little of it benefits from multicast.

But can the network cope with unicasting all that data to the viewers? Let's say that 5 million people (a big city or a small country) each watch about one hour of TV every evening between 18:00 and 24:00, and that each stream is 4 Mbps. Then we have 5M × 4 Mbps × 1h/6h ≈ 3.3 Tbps of average aggregate bandwidth. That's quite some bandwidth required. So maybe fully deployed on-demand TV is impossible?
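As a rough sanity check, here is the same arithmetic spelled out in a few lines of Python. The numbers are just the assumptions from the paragraph above, nothing more:

    # Back-of-envelope: aggregate bandwidth if every stream is unicast.
    viewers = 5_000_000      # people watching each evening
    stream_mbps = 4          # bitrate of one TV stream, in Mbit/s
    hours_watched = 1        # average viewing time per person
    window_hours = 6         # the 18:00-24:00 evening window

    # Average number of simultaneous streams, assuming viewing is
    # spread evenly over the window.
    concurrent = viewers * hours_watched / window_hours

    aggregate_tbps = concurrent * stream_mbps / 1_000_000
    print(f"{concurrent:,.0f} concurrent streams")   # 833,333
    print(f"{aggregate_tbps:.2f} Tbps on average")   # 3.33 Tbps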

It can be assumed that there will be temporal locality; people will watch the same things at approximately the same time. The evening a new episode of "Lost" is released, many people will watch it, or at least during the following days. There is also a high probability of some geographical locality; some programmes have an audience based on the region they live in, regional news being one example. Both temporal and geographical locality are strong cases for caching. The caches would work as a kind of relay station. Their placement is crucial: they can't be too far from the end users and serve too many of them, because then their load would affect large parts of the network, but they can't be too close either, because then there is less benefit from the caching. Still, it should be easy to create a model that finds the optimal placement.
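To illustrate how far temporal locality can carry a cache, here is a toy simulation. The Zipf-like popularity distribution and the cache size are pure assumptions for the sake of the example, not measurements:

    import random
    from collections import OrderedDict

    # Minimal LRU cache: evicts the least recently requested programme.
    class LruCache:
        def __init__(self, capacity):
            self.capacity = capacity
            self.items = OrderedDict()

        def request(self, programme):
            # Returns True on a hit, False on a miss (a cache fill).
            if programme in self.items:
                self.items.move_to_end(programme)
                return True
            self.items[programme] = True
            if len(self.items) > self.capacity:
                self.items.popitem(last=False)   # evict the oldest entry
            return False

    # Zipf-like popularity: a few fresh episodes draw most requests,
    # which is exactly the temporal locality the caches exploit.
    catalogue = [f"programme-{i}" for i in range(10_000)]
    weights = [1 / (rank + 1) for rank in range(len(catalogue))]

    cache = LruCache(capacity=500)   # the cache holds 5% of the catalogue
    requests = random.choices(catalogue, weights=weights, k=100_000)
    hits = sum(cache.request(p) for p in requests)
    print(f"hit rate: {hits / len(requests):.0%}")   # roughly two thirds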

The caches would grab their content from the content provider when they don't have the data available. The content provider could be far away, but there should at least be plenty of peak bandwidth available, even though the QoS can't be guaranteed. That shouldn't cause any problems, because the cache would easily be able to download the full media file much faster than the end user views it. The path from the cache to the end user is much shorter and a much more controllable environment where QoS applies. The caches are actually doing a kind of multicast, but on layer 7 instead of the IP layer, and they implement time shifting as well.
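A minimal sketch of that fill logic, in the cache-aside style; the origin URL and the cache directory are made-up placeholders, not a real service:

    import pathlib
    import urllib.request

    CACHE_DIR = pathlib.Path("/var/cache/tv")       # hypothetical cache location
    ORIGIN = "http://origin.example.net/media"      # hypothetical content provider

    def serve(programme_id: str) -> pathlib.Path:
        # Cache-aside fill: serve locally if present, otherwise fetch the
        # whole file from the (possibly distant) origin first. The origin
        # path has no QoS guarantees, but the bulk download only has to
        # finish faster than real-time playback; the short, controlled
        # path from cache to viewer handles the QoS-sensitive streaming.
        CACHE_DIR.mkdir(parents=True, exist_ok=True)
        local = CACHE_DIR / programme_id
        if not local.exists():                      # miss: fill from the origin
            with urllib.request.urlopen(f"{ORIGIN}/{programme_id}") as src:
                local.write_bytes(src.read())
        return local                                # now streamed over the last mile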

The cache operators would be the new television networks, buying content and distributing it to the customers. It could be the ISP that owns and operates the caches. It could also be a separate entity, maybe even a separate AS, but the caches need to be closer to the end users than the normal peering points allow for, so that would require some new scheme for exchanging traffic between ASes much closer to the end users. Most likely it would be easier to just put the caches in the ISP's network.

To conclude: media wants to be consumed at the most convenient time; therefore it will not be consumed simultaneously, and multicast is not beneficial.

Categories: multicast IpTV VOD