Wednesday, November 18, 2009

Scalable Application Layer Multicast

This paper presents NICE, a system for application-layer multicast. The authors focus on a class of applications known as "data stream applications," which simultaneously deliver streaming data to large numbers of receivers. They view multicast as an overlay network, with all the recipients of the multicast communication being the nodes in the overlay. They set out three ways to evaluate the "performance" of an application-layer multicast scheme: (1) quality of the data delivery path, using metrics like stress, stretch, and node degree; (2) robustness of the overlay; (3) control overhead.
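
To make these path-quality metrics concrete, here is a minimal sketch (in Python; the input representation and function names are my own assumptions, not the paper's) of how stretch and link stress could be computed for a given overlay delivery tree, once the underlying unicast latencies and physical paths are known.

```python
from collections import defaultdict

# Hypothetical inputs (my own representation, not from the paper):
#   overlay_parent[host]     -> the host's parent in the data delivery tree
#   unicast_latency[(a, b)]  -> latency of the direct unicast path a -> b
#   unicast_links[(a, b)]    -> physical links the unicast path a -> b traverses

def overlay_path(member, source, overlay_parent):
    """Overlay hops (parent, child) on the delivery path from source down to member."""
    path, node = [], member
    while node != source:
        parent = overlay_parent[node]
        path.append((parent, node))
        node = parent
    return list(reversed(path))

def stretch(member, source, overlay_parent, unicast_latency):
    """Stretch: latency along the overlay tree divided by the direct unicast latency."""
    overlay_latency = sum(unicast_latency[hop]
                          for hop in overlay_path(member, source, overlay_parent))
    return overlay_latency / unicast_latency[(source, member)]

def link_stress(source, members, overlay_parent, unicast_links):
    """Stress: how many identical copies of a packet each physical link carries."""
    stress, seen_hops = defaultdict(int), set()
    for m in members:
        for hop in overlay_path(m, source, overlay_parent):
            if hop in seen_hops:        # each overlay edge carries the packet only once
                continue
            seen_hops.add(hop)
            for link in unicast_links[hop]:
                stress[link] += 1
    return dict(stress)
```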

Their multicast system is hierarchical. End hosts are arranged in layers. The bottom layer (L0) contains all hosts, organized into small clusters based on "closeness." Each cluster has a cluster leader located at its "center"; that leader also belongs to the next layer up (L1). L1 members are in turn clustered into small groups, each with its own cluster leader who belongs to L2, and so on recursively up to a single node at the top. At the very top of the pyramid/tree is the data provider. The data delivery path goes down the layers, with each cluster leader distributing data to its cluster members.
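
As a rough sketch of how this layered hierarchy and the top-down delivery path could be represented: the Cluster/Hierarchy classes and the deliver function below are my own illustration of the description above, not the paper's protocol.

```python
class Cluster:
    def __init__(self, layer, leader, members):
        self.layer = layer          # 0 = L0 (all hosts), 1 = L1 (cluster leaders), ...
        self.leader = leader        # the "center" host; also a member of a cluster one layer up
        self.members = list(members)

class Hierarchy:
    def __init__(self, clusters):
        self.clusters = clusters    # every cluster, across all layers

    def clusters_led_by(self, host, layer):
        return [c for c in self.clusters if c.layer == layer and c.leader == host]

def deliver(data, host, layer, hierarchy, send):
    """Push data down the tree: `host`, acting as a leader at `layer`, sends to its
    cluster members, and each member recurses into the lower-layer cluster it leads."""
    for cluster in hierarchy.clusters_led_by(host, layer):
        for member in cluster.members:
            if member != host:
                send(member, data)              # transport is left abstract here
            if layer > 0:
                deliver(data, member, layer - 1, hierarchy, send)

# e.g. the data provider at the top layer kicks off delivery:
#   deliver(packet, rp, top_layer, hierarchy, send=udp_send)
```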

Members of a cluster exchange "heartbeat" messages. The cluster leader sends out a digest with info about all the cluster members, and members tell each other their distance estimates to every other group member. Based on this info, the cluster leader can be replaced if, under changing network conditions, it is no longer at the "center" of the cluster. The cluster leader also monitors cluster size and splits the cluster if it gets too big.
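
Here is a hedged sketch of those two maintenance decisions, assuming the heartbeat distance estimates are available as a pairwise table; the function names and the naive halving split are mine, though the 3k-1 size bound is the one the paper uses.

```python
def pick_center(members, dist):
    """The 'center': the member whose worst-case distance to the others is smallest,
    using the pairwise distance estimates exchanged in heartbeats (dist[a][b])."""
    if len(members) == 1:
        return members[0]
    return min(members, key=lambda m: max(dist[m][o] for o in members if o != m))

def maintain_cluster(members, dist, k=3):
    """Re-center the leader, and split the cluster if it exceeds the 3k-1 size bound.
    Returns a list of (leader, members) clusters: one, or two after a split."""
    leader = pick_center(members, dist)        # leadership may move as conditions change
    if len(members) <= 3 * k - 1:
        return [(leader, members)]
    # Naive halving split, for illustration only; the paper's split keeps both halves compact.
    half = len(members) // 2
    parts = [members[:half], members[half:]]
    return [(pick_center(p, dist), p) for p in parts]
```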

When a new node joins the overlay network, it first contacts the data provider (the "RP," for rendezvous point) at the top. The RP responds with the hosts in the highest layer; the joining host probes those hosts to find the closest match, then contacts that leader's cluster one layer down and picks among its members, and so on until it is finally mapped into an L0 cluster in the hierarchy. Nodes may leave the network either intentionally (with a REMOVE message) or due to failure (in which case their loss is noticed by the absence of heartbeats). New cluster leaders may be chosen when nodes enter or leave a cluster.
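
A small sketch of that join walk, assuming the new host can latency-probe candidates and that each leader can report the cluster it leads one layer down; the dictionary representation and names are placeholders of mine.

```python
def join(new_host, top_layer_members, clusters_led_by, top_layer, probe_latency):
    """Descend the hierarchy one layer at a time, always moving toward the closest
    cluster leader, until the new host is mapped into an L0 cluster.

    clusters_led_by[(host, layer)] -> members of the layer-`layer` cluster that `host` leads
    probe_latency(a, b)            -> measured latency between hosts a and b
    """
    candidates = top_layer_members             # returned by the RP for the highest layer
    for layer in range(top_layer, 0, -1):
        closest = min(candidates, key=lambda h: probe_latency(new_host, h))
        candidates = clusters_led_by[(closest, layer - 1)]
    return candidates                          # the L0 cluster the new host will join
```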

I understand their overlay network topology and maintenance protocols, but I feel like I'm missing something about how the actual data is transferred. OK, so data goes down the tree. What happens with failures/retransmissions? How are those relayed, and who in the hierarchy is responsible for them? Are they just assuming that reliability is handled beneath the overlay by TCP? And if one node is very flaky (thereby generating lots of failures/retransmissions), won't nodes higher up in the hierarchy see their bandwidth suffer, since the cluster leaders will have to carry the retransmission traffic from that flaky node in L0?

-------------

Notes -

Protocol stack:
NICE (application layer)
----
UDP
IP
