Modelling of Minimum Cost Overlay Multicast Tree in Two Layer Networks

Multicast delivers data to a group of destinations simultaneously while using minimum network resources. The first implementation of multicasting was built on specialized multicast routers and is well known as IP Multicast. Such a network has several drawbacks: a complex addressing and routing scheme, the need to deploy special routers that are rather expensive, and the lack of a reasonable business model for cross-provider multicast realization. Therefore, over the past few years new interest in delivering multicast traffic has arisen, and several multicast systems defined for end-host overlay networks have been successfully proposed. Overlay multicast implements a multicast technique on top of computer networks and creates a virtual topology of clients which duplicate packets and maintain a multicast structure. This overlay structure forms an independent layer with logical links between the nodes, without knowledge of the underlying topology. In this work we look at flow cost minimization of a multicast stream in a system which combines the advantages of overlays and underlying network awareness. This paper presents three independent linear-based models aimed at optimization of the multicast tree topology and its network-level unicast realization. The proposed formulations can be applied either for deriving a lower bound on flow costs in existing systems or for designing new cooperative multilayer protocols for effective multicast transmission.
Keywords: Modelling, overlay multicast, multilayer networks.


I. INTRODUCTION
MULTICAST is a network technique that efficiently delivers information to a group of destinations simultaneously, with minimum bandwidth and minimum load on a server or source. Multicasting implemented by network-aware approaches using the Internet Protocol (IP Multicast) is afflicted with problems of scalability, addressing scheme management, and flow or congestion control. In contrast, an overlay architecture offers potential scalability and easy deployment of new, network-layer-independent protocols at relatively low cost, even for a network delivery method such as multicast. In overlay networks this strategy expands end-system multicast [1]-[4], and overlay-based systems have become an increasingly popular approach for multicast and live media streaming over the Internet. Over the past decade, a lot of research, development, and testing effort has been devoted to overlay multicast techniques [5]-[15]. In general, this approach is referred to as P2P multicast or application layer multicast (streaming), where participating peers actively contribute their upload bandwidth to serve other peers in the same streaming session by forwarding their available content. Overlay multicast flows are realized as multiple unicast flows (e.g., TCP or UDP) at the level of a lower layer with physical resources.
The research leading to these results is partially supported by the National Science Centre under the grant which is being realized in years 2011-2014.
Multilayer networks are networks composed of more than one layer of resources [16]. Overlay multicast protocols create an overlay topology that represents a set of logical demands without general knowledge of the physical network structure. This topology is managed by protocols and mechanisms working in the application layer. We can therefore distinguish a separate overlay layer (upper layer) with its own class of resources. The lower layer comprises the physical network devices and protocols. Overlay traffic is realized by means of flows in the network layer. Although logical overlay links represent logical capacities, physical links in the network layer are limited to real capacities and real connections.
So far, the discussion of the state of the art has concentrated on single-layer overlay multicast. On the other hand, most research papers on multilayer networks consider a given set of demands in a logical overlay layer and are devoted to unicast-based flows realized among different layers of resources (compare, e.g., [16], [17] and references therein). In this paper we combine the two: we take an overlay multicast system and apply a multilayered network view to optimize flows in overlay multicast with knowledge of the physical layer topology. A system that is aware of the physical network topology can form an overlay routing scheme that is more efficient in terms of flow cost minimization. We propose and discuss three linear optimization models. To guarantee that a feasible multicast tree is created, which specifies the routing demand, all three formulations employ single commodity flows in the upper (logical) layer. The main differences between the models concern the realization of multicast links by means of multiple unicast flows in the lower layer. The first model is node-oriented and uses multicommodity flows in the lower layer to route multicast traffic from the upper layer. The second presents an alternative edge-oriented notation, while the last is based on sets of predefined paths, which are used to form the logical multicast links of the overlay layer. This paper continues and extends our previous work [18], presented at the 5th International Conference on Broadband and Biomedical Communications (IB2Com 2010), held in Malaga, Spain, on December 15-17, 2010.
The rest of this paper is organized as follows. In Section II we briefly present and compare multicast techniques that depend on the network layer and on the overlay concept. Then, in Section III, we describe the optimization problem of multicast flow assignment in two layer networks. The problem has two main elements: creating a single multicast tree in the overlay layer, and realizing it physically by unicast flows at the network level so as to satisfy the demanded streaming rate. Section IV contains three linear formulations of the problem and analyses the sizes of the models in terms of numbers of variables and constraints. Finally, in Section V we summarize the paper and indicate directions for future work.

A. Network Layer Multicast
Network layer multicast is realized by routers inside the network: data packets are replicated at routers (Fig. 1). The most developed and best discussed approach to network-level multicast was proposed by Deering and Cheriton about two decades ago [19] and is known as IP Multicast. This architecture implements multicast functionality in the IP layer, and from the point of view of network flows it is the most efficient way to perform group data distribution, because packet replication is reduced to the minimum necessary.
However, deployment of IP Multicast is limited and hindered for several reasons. First, it requires a special addressing scheme. Routing and forwarding tables must contain additional entries for each unique multicast group address. In contrast to unicast IP addresses, IP Multicast addresses are not easily aggregated, which introduces problems related to the scalability and dynamics of the architecture. Moreover, routers have to perform more complicated calculations, which increases overhead and requires extra mechanisms to be implemented. Second, some issues concerning reliability and congestion control are not clearly settled, which leads to different router configurations depending on the ISP. Third, there is the problem of pricing multicast traffic.

B. Overlay Multicast
This alternative technique for multicasting shifts the multicast functionality to the application layer, i.e., the end hosts play the role of routers in replicating data packets (Fig. 2). Logically, the peers form and maintain an overlay multicast structure for efficient data transmission. The physical realization of the logical links is performed as sets of directed unicasts, mostly using well-defined IP unicast communication (Fig. 3). However, since the network-layer realization of overlay multicast uses the same network link multiple times, it is less efficient than IP Multicast at minimizing network redundancy. Nevertheless, multicast defined for overlay networks has outranked IP Multicast in recent years. This follows from the fact that the overlay architecture for multicast streaming provides potential scalability and easy deployment of new protocols, independent of network layer solutions, at relatively low cost.
There are several concepts for building a multicast topology. Basic examples include Scribe [20], Narada [2] and HMTP [21] (see also the references therein). Other proposals and interesting aspects of overlay multicast can be found in [22]-[27]. A comprehensive comparison of the advantages and disadvantages of overlay multicast systems is given in [28].
In the remainder of this work we abstract from any signalling and control topology in the overlay and assume a general single-tree topology for data delivery.

III. PROBLEM DESCRIPTION
We consider a network-aware overlay end-host system with a set of physical routers and links, and a set of end hosts to be arranged in a single multicast tree. The problem is to define an overlay multicast tree that joins all end hosts, together with a unicast routing realization of each logical overlay multicast link in the physical layer, so that the overall cost of flows in the physical layer is minimized.
The network can be represented as a directed graph $G = (N, A)$, where $N$ is the set of physical and end-host nodes and $A$ is the set of arcs. For the overlay multicast system, $V \subseteq N$ represents the set of overlay nodes (end hosts), as shown in Fig. 4. The main assumption for the logical layer is that a direct connection is possible between each pair of end hosts.
The main aim of the problem is to assign flows simultaneously in the logical overlay layer and in the physical layer so as to minimize physical transportation costs. The optimization result is thus twofold: first, a single multicast tree topology based on logical links is formed in the overlay layer; second, an optimal routing for the resulting multicast demand is provided in the physical layer.
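As a concrete illustration of this representation, the following minimal Python sketch builds a toy two-layer instance. The node numbers, arcs and capacities are hypothetical and are not taken from the paper; `overlay_arcs` is an illustrative helper name.

```python
from itertools import permutations

# Hypothetical toy instance: nodes 1-3 are end hosts, nodes 4-5 are routers.
N = {1, 2, 3, 4, 5}
V = {1, 2, 3}            # overlay nodes (end hosts), a subset of N
r = 1                    # root of the multicast tree (stream source)

# Directed physical arcs (i, j) with capacities c_ij; absent arcs have capacity 0.
capacity = {
    (1, 4): 10, (4, 1): 10,
    (2, 4): 10, (4, 2): 10,
    (4, 5): 10, (5, 4): 10,
    (3, 5): 10, (5, 3): 10,
}
A = set(capacity)


def overlay_arcs(V, r):
    """Candidate logical links (w, v): every ordered pair of end hosts,
    except arcs that would enter the root."""
    return [(w, v) for w, v in permutations(sorted(V), 2) if v != r]


arcs = overlay_arcs(V, r)
```

With three end hosts the sketch yields four candidate overlay arcs, which agrees with the count $V^2 - 2V + 1$ used later in the quantitative comparison.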

IV. MODELLING
In the following formulations we consider an overlay system that can be represented as a complete directed subgraph on the nodes $V \setminus \{r\}$, together with the root node $r$ and arcs from $r$ to every $v \in V$, $v \neq r$. The lower layer is composed of $N$ nodes and a set of directed arcs $(i, j) \in A$. To model the arcs we use a connection matrix over pairs $(i, j)$ with $i \in N$, $j \in N \setminus \{i, r\}$, and a capacity table $c_{ij}$ whose elements differ from 0 if and only if arc $(i, j)$ exists in the system. In the remaining notation, the sets $N$ and $V$ are replaced by the index ranges $1, \dots, N$ and $1, \dots, V$, respectively, similar to the notation used in [16].
All three formulations use the same two sets of variables, $x$ and $f$, which model the multicast routing structure in the upper layer. The variables $x$ indicate which overlay arcs are included in the multicast topology, while $f$ comprises auxiliary variables that constrain the overlay topology and force it to be a tree. We assume that the content to be multicast is given as a constant stream of rate $s$ (e.g., in kbps or packets per second).

A. Node-oriented Notation
The model uses a node-indexing scheme that represents the multilayered structure of the network. It was first introduced in our previous work [18].
constants
$\xi_{ij}$  cost of a data unit transferred from $i$ to $j$
$c_{ij}$  capacity limit of the link from $i$ to $j$
$s$  streaming rate of the overlay multicast tree
variables
$x_{wv}$  = 1 if the spanning overlay tree contains overlay arc $(w, v)$; 0 otherwise (binary variable)
$f_{wv}$  flow on the overlay multicast link from $w$ to $v$ (integer, auxiliary variable)
$y_{wvij}$  flow on the physical arc from $i$ to $j$ which realizes overlay-level arc $(w, v)$, in kbps (continuous/integer)
The objective (1) minimizes the total cost of the flows that traverse the lower layer of the two layer network. In this model, equation (2) concerns tree completion and states that every overlay node $v$ except the root has exactly one parent (from the set of overlay nodes $1, \dots, V$). Constraints (3) bind the multicast-layer tree variables $x_{wv}$ and the auxiliary single commodity flow variables $f_{wv}$. These forcing constraints state that the multicast flow on arc $(w, v)$ equals zero if $x_{wv} = 0$. On the other hand, if the multicast tree contains overlay arc $(w, v)$, the flow $f_{wv}$ on this arc cannot exceed $V - 1$ unit flows if $w = r$, and $V - 2$ otherwise. However, to force nonzero flow $f_{wv}$ on an arc $(w, v)$ that belongs to the multicast tree ($x_{wv} = 1$), constraints (4) must be satisfied. The balance equation (4) is derived from the flow conservation concept and guarantees that the solution of the single commodity flows is a directed spanning tree rooted at node $r$. For every overlay node $v$ apart from the root $r$, the number of incoming unit flows must exceed the number of outgoing unit flows by 1, i.e., one unit of flow is always absorbed at node $v$. A single solution of the auxiliary virtual single commodity flows $f_{wv}$ and the mapped multicast tree $x_{wv}$ is illustrated in Fig. 5.
Note that there are no capacity constraints in the overlay layer, and the number of unit flows on each overlay arc is limited by the number of arcs in the spanning (multicast) tree. Thus $f_{wv}$ represents virtual flows without any capacity unit assigned to them.
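The role of the auxiliary flows can be checked on a small example. The sketch below is an illustration only, with hypothetical helper names (`tree_flows`, `check_upper_layer`): it derives $x_{wv}$ and $f_{wv}$ from a tree given as a child-to-parent map, taking $f_{wv}$ as the size of the subtree below arc $(w, v)$, and then tests the conditions described for (2), (3) and (4).

```python
def tree_flows(parent):
    """Return (x, f) for a tree given as {child: parent}.

    f[(w, v)] is the number of nodes in the subtree rooted at v, i.e. the
    number of unit flows that must cross overlay arc (w, v).
    """
    children = {}
    for v, w in parent.items():
        children.setdefault(w, []).append(v)

    def subtree_size(v):
        return 1 + sum(subtree_size(c) for c in children.get(v, []))

    x = {(w, v): 1 for v, w in parent.items()}
    f = {(w, v): subtree_size(v) for v, w in parent.items()}
    return x, f


def check_upper_layer(x, f, nodes, root):
    """Check the conditions described in the text for (2), (3) and (4)."""
    V = len(nodes)
    for v in nodes:
        if v == root:
            continue
        # (2): every non-root node has exactly one parent.
        if sum(x.get((w, v), 0) for w in nodes if w != v) != 1:
            return False
        # (4): inflow exceeds outflow by exactly one unit at v.
        inflow = sum(f.get((w, v), 0) for w in nodes if w != v)
        outflow = sum(f.get((v, u), 0) for u in nodes if u not in (v, root))
        if inflow - outflow != 1:
            return False
    # (3): flow only on selected arcs, at most V-1 from the root, V-2 otherwise.
    for (w, v), flow in f.items():
        bound = (V - 1) if w == root else (V - 2)
        if flow > bound * x.get((w, v), 0):
            return False
    return True


# Hypothetical overlay: root 1 feeds nodes 2 and 3, node 2 feeds node 4.
nodes = [1, 2, 3, 4]
x, f = tree_flows({2: 1, 3: 1, 4: 2})
tree_ok = check_upper_layer(x, f, nodes, root=1)

# A cycle 2 -> 3 -> 2 gives every node one parent but violates the balance (4).
x_bad = {(1, 4): 1, (2, 3): 1, (3, 2): 1}
cycle_ok = check_upper_layer(x_bad, dict(x_bad), nodes, root=1)
```

The second case shows why the parent condition alone is not enough: the cycle gives every node a parent, but no flow assignment can deliver one unit to each of its nodes.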
Since $x_{wv}$ defines the multicast routing demand in the upper overlay layer, the demand must now be realized by multiple flows through the lower layer. The set of constraints (5) describes a flow balance based on multicommodity flows and is defined for all $w$ and all $v \neq w, r$. Therefore, every logical multicast link of the overlay ($x_{wv}$) may be routed over lower layer resources as multiple separate fractional flows. The value $x_{wv} = 1$ indicates that a demand of size $s$ is to be routed from $w$ to $v$. This forces the total outflow from $w$, $\sum_{j \notin \{w, r\}} y_{wvwj}$, to equal $s$; hence the unicast flows directed from $w$ to $v$ and relayed by any intermediate nodes $j$ arrive at $v$ with total $\sum_{j \neq v} y_{wvjv} = s$. If a node $i$ is only an intermediate node ($i \neq w, v$), the flow balance of demand $(w, v)$ at this node is 0, i.e., the inflow equals the outflow.
Finally, the network resources are limited by the link capacities of the lower layer, and the capacity of any arc $(i, j)$ in the network cannot be exceeded. Constraint (6) is introduced to limit the maximum possible flows on arcs. The overall flow realizing all overlay demands, $\sum_{w} \sum_{v \neq w, r} y_{wvij}$, on each arc $(i, j)$ cannot be greater than the given capacity limit $c_{ij}$.
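Collecting the descriptions above, the node-oriented program referred to as (1)-(6) can be written out as follows. This is a reconstruction from the prose and not a verbatim copy of the original display. In particular, the exact index sets (which arcs are excluded for a given demand) are assumptions, and $y_{wvji}$ is taken as 0 wherever the variable is not defined.

```latex
\begin{align}
\min \quad & \sum_{w} \sum_{v \neq w, r} \sum_{i} \sum_{j \notin \{w, r, i\}} \xi_{ij}\, y_{wvij} \tag{1} \\
\text{s.t.} \quad & \sum_{w \neq v} x_{wv} = 1, \qquad v \neq r \tag{2} \\
& f_{wv} \leq (V-1)\, x_{wv} \ \ (w = r,\ v \neq r), \qquad
  f_{wv} \leq (V-2)\, x_{wv} \ \ (w \neq r,\ v \neq w, r) \tag{3} \\
& \sum_{w \neq v} f_{wv} - \sum_{u \neq v, r} f_{vu} = 1, \qquad v \neq r \tag{4} \\
& \sum_{j \notin \{w, r, i\}} y_{wvij} - \sum_{j \neq i} y_{wvji} =
  \begin{cases}
    s\, x_{wv}  & \text{if } i = w \\
    -s\, x_{wv} & \text{if } i = v \\
    0           & \text{otherwise}
  \end{cases}
  \qquad \forall w,\ \forall v \neq w, r,\ \forall i \tag{5} \\
& \sum_{w} \sum_{v \neq w, r} y_{wvij} \leq c_{ij}, \qquad \forall i,\ \forall j \neq i, r \tag{6} \\
& x_{wv} \in \{0, 1\}, \quad f_{wv} \in \mathbb{Z}_{\geq 0}, \quad y_{wvij} \geq 0 \notag
\end{align}
```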

B. Edge-oriented Notation
The edge-oriented notation replaces $(i, j)$ by $e$ to denote directed edges in the network (as illustrated in Fig. 6). To describe a given network structure completely, binary constants $a_{ei}$ and $b_{ei}$ indicate whether directed link $e$ originates at node $i$ or terminates at node $i$, respectively. By analogy, $c_e$ replaces $c_{ij}$ and $\xi_e$ replaces $\xi_{ij}$. Instead of the flow variables $y_{wvij}$, a set of variables $z_{wve}$ is used. The variable $z_{wve}$ represents a fractional flow of overlay demand $(w, v)$ that is realized on link $e$.
From a practical point of view, the edge-oriented notation is equivalent to the node-oriented one; however, it may be clearer and easier to read and analyse when the number of edges in the lower layer is sufficiently small.
Since the index $e$ is introduced in place of $(i, j)$, the objective function (7) appears instead of (1). The flow conservation equations (8) replace (5) for every $w$ and every $v \neq w, r$. These constraints ensure unicast routing in the lower layer of all demands from the upper layer. If virtual link $(w, v)$ is selected to realize multicast transmission of size $s$, the total outflow of $(w, v)$-flows from $w$ must equal $s$, the total inflow to $v$ must also equal $s$, and the balance of $(w, v)$-flows traversing any node $i$ other than $w$ or $v$ must equal 0. The capacity limit is expressed by (9), which repeats condition (6) of the node-oriented notation.
It is important to note that all the models use the same sets of variables and constraints to constrain the multicast tree realization in the upper layer. The edge-oriented problem is thus formulated by (7), (2), (3), (4), (8) and (9).
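Under the same reading, the edge-oriented counterparts (7)-(9) can be sketched as below. Again this is a reconstruction from the description, with the ranges of the indices assumed rather than copied from the original.

```latex
\begin{align}
\min \quad & \sum_{w} \sum_{v \neq w, r} \sum_{e} \xi_{e}\, z_{wve} \tag{7} \\
\text{s.t.} \quad & \sum_{e} a_{ei}\, z_{wve} - \sum_{e} b_{ei}\, z_{wve} =
  \begin{cases}
    s\, x_{wv}  & \text{if } i = w \\
    -s\, x_{wv} & \text{if } i = v \\
    0           & \text{otherwise}
  \end{cases}
  \qquad \forall w,\ \forall v \neq w, r,\ \forall i \tag{8} \\
& \sum_{w} \sum_{v \neq w, r} z_{wve} \leq c_{e}, \qquad \forall e \tag{9} \\
& z_{wve} \geq 0 \notag
\end{align}
```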

C. Path-oriented Notation
The path-oriented model differs from the formulations above. It requires predefined sets of paths $p \in P_{wv}$, so that the virtual links of the overlay layer are formed using paths of the lower layer. A path is defined by the set of links that belong to it. We also introduce a new set of assignment variables $g_{wvp}$, which indicate the portion of the flow $s$ replicated at $w$ and sent to $v$ that traverses path $p$. For example, assume there is an overlay link between nodes $w = 1$ and $v = 4$. With the indexing shown in Fig. 6, the demand $x_{14} = 1$ might be realized using at least two separate paths, $P_{14} = \{p = 1;\ p = 2\} = \{(1, 6, 8, 14);\ (1, 3, 12, 10, 14)\}$. Note that the routing lists $P_{wv}$ do not necessarily contain all possible paths. On the one hand, a limited number of paths decreases the chance of finding the same routing realization as with the node- and edge-oriented notations; on the other hand, it can give a smaller number of variables $g_{wvp}$, reducing the time and memory required to solve the problem.
To maintain the path-link relation we use a set of binary constants $\delta_{wvep}$, which are equal to 1 if link $e$ belongs to the path $p$ that realizes the overlay demand $(w, v)$, and 0 otherwise. Additional indices, constants, variables and constraints of the path-oriented notation are listed below.
The objective (10) expresses the minimization of the cost of realizing flows on the links $e$ that are traversed by the paths $p$ forming the routing lists between nodes $w$ and $v$. The completion condition (11) requires that every designated virtual link $(w, v)$ realizes the stream $s$, and that the stream is realized by means of fractional flows such that $\sum_{p} g_{wvp} = s$ if $x_{wv} = 1$ and $\sum_{p} g_{wvp} = 0$ otherwise. Formula (12) limits the total flow on each link $e$ so that it does not exceed its capacity $c_e$.
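Written out from this description, the path-oriented objective and constraints (10)-(12) would take roughly the following form. This is a reconstruction from the prose, not the original display, and the summation ranges are assumptions.

```latex
\begin{align}
\min \quad & \sum_{w} \sum_{v \neq w, r} \sum_{p \in P_{wv}} \sum_{e} \xi_{e}\, \delta_{wvep}\, g_{wvp} \tag{10} \\
\text{s.t.} \quad & \sum_{p \in P_{wv}} g_{wvp} = s\, x_{wv}, \qquad \forall w,\ \forall v \neq w, r \tag{11} \\
& \sum_{w} \sum_{v \neq w, r} \sum_{p \in P_{wv}} \delta_{wvep}\, g_{wvp} \leq c_{e}, \qquad \forall e \tag{12} \\
& g_{wvp} \geq 0 \notag
\end{align}
```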
The path-oriented problem is formulated by the objective (10); the upper layer routing (2), (3), (4); and the lower layer flow realization (11) and (12). This path-based approach to flow realization in multilayer networks is adopted and widely discussed in [16]; however, the authors there do not consider multicast flows in the overlay, and the upper layer demands are given administratively.
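The path-link bookkeeping can be illustrated with the routing list $P_{14}$ from the example above. In the following minimal Python sketch, the streaming rate, the split of the flow between the two paths, the unit costs and the capacities are all hypothetical values chosen for illustration, and the function names are not from the paper. Paths are indexed from 0 in the code, whereas the text numbers them from 1.

```python
def link_loads(paths, g):
    """Load on each link e: the sum of delta_wvep * g_wvp over all (w, v, p)."""
    load = {}
    for (w, v), routing_list in paths.items():
        for p, links in enumerate(routing_list):
            flow = g.get((w, v, p), 0.0)
            for e in links:      # delta_wvep = 1 exactly for the links of path p
                load[e] = load.get(e, 0.0) + flow
    return load


def total_cost(load, xi):
    """Cost of the given link loads under unit costs xi[e]."""
    return sum(xi[e] * flow for e, flow in load.items())


def within_capacity(load, c):
    """True if no link carries more than its capacity c[e]."""
    return all(flow <= c[e] for e, flow in load.items())


# Routing list from the example in the text (link indices as in Fig. 6).
paths = {(1, 4): [(1, 6, 8, 14), (1, 3, 12, 10, 14)]}

s = 100.0                                   # hypothetical streaming rate (kbps)
g = {(1, 4, 0): 60.0, (1, 4, 1): 40.0}      # hypothetical split, sums to s
xi = {e: 1.0 for e in range(1, 15)}         # hypothetical unit costs
c = {e: 100.0 for e in range(1, 15)}        # hypothetical capacities

load = link_loads(paths, g)
cost = total_cost(load, xi)
feasible = within_capacity(load, c)
```

Links 1 and 14 are shared by both paths and therefore carry the full rate $s$, which shows how the capacity condition couples paths that belong to the same routing list.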

D. Quantitative Comparison
All three integer programs use the same sets of variables to formulate the overlay multicast tree. The set of variables $x_{wv}$ covers all possible connections between the $V - 1$ nodes other than the root ($V^2 - 3V + 2$) and the links from $w = r$ to the other nodes ($V - 1$). The number of variables $f_{wv}$ is the same as that of $x_{wv}$. Analogously, the number of constraints concerning the upper layer is the same for all formulations. Formulas (2) and (4) define $V - 1$ constraints each, and (3) consists of $V - 1$ plus $V^2 - 3V + 2$ linear conditions.
Table I presents upper bounds on the numbers of variables in all three models. As for modelling the flows in the lower layer, the node-oriented notation uses at most $(V^2 - 2V + 1)(N^2 - 2N + 1)$ variables $y_{wvij}$, given that links to the root $r$ are not considered. The set of $y_{wvij}$ can be narrowed by removing variables for links that should not exist in the network, e.g., $y_{rwrv}$ for $w = v$, due to the lack of direct physical connections between overlay nodes; variables $y_{wviw}$, which have no proper interpretation because a demand $(w, v)$ cannot be routed to node $w$; and finally all variables $y_{wvij}$ with $c_{ij} = 0$, which are effectively constant and may be removed from the model.
The edge-oriented formulation has at most $(V^2 - 2V + 1)E$ variables $z_{wve}$, where $E \leq N^2 - 2N + 1$. However, the model can obviously be reduced by removing variables $z_{wve}$ where $b_{ew} = 1$ or $a_{ev} = 1$, because a link $e$ that terminates at $w$ cannot be used by a demand originating at $w$, and a link originating at $v$ cannot carry packets to node $v$. A similar situation can be observed for the path-based integer program. Let $P$ denote the total number of predefined paths in the system; thus $V^2 - 2V + 1$ possible demands might be realized on these $P$ paths, although a demand $(w, v)$ cannot use, for example, a path that has been defined for $(v, w)$. Note that the number of variables may differ with the number of edges and predefined paths even if the network contains the same number of nodes, and may be adjusted in order to reduce the complexity.
Table II lists upper bounds on the numbers of constraints. Formula (5) of the node-oriented formulation includes at most $(V^2 - 2V + 1)N$ constraints, and the link capacity limit (6) uses no more than $N^2 - N$ conditions.
The program formulated in the edge-oriented way requires $(V^2 - 2V + 1)N$ constraints defined by (8) and $E$ defined by (9), whereas the path-oriented program uses only $V^2 - 2V + 1$ and $E$ conditions to model the lower layer realization of overlay demands, given by (11) and (12), respectively.
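The bounds quoted in this section can be tabulated for a concrete instance with a few lines of Python. The sketch below only restates the counts given in the text. The function name `model_sizes` and the sample sizes are hypothetical, and counting exactly $P$ variables $g_{wvp}$ (one per predefined path, with $P$ summed over all routing lists) is an interpretation of the text rather than a figure taken from Table I.

```python
def model_sizes(V, N, E, P):
    """Upper bounds on variables and constraints for the three formulations.

    V: overlay nodes, N: lower-layer nodes, E: directed links,
    P: total number of predefined paths over all routing lists (assumed).
    """
    demands = (V - 1) ** 2                # V^2 - 2V + 1 candidate overlay arcs
    upper_vars = 2 * demands              # x_wv and f_wv
    upper_cons = 2 * (V - 1) + demands    # (2), (4): V-1 each; (3): (V-1) + (V^2-3V+2)
    lower = {
        # name: (lower-layer variables, lower-layer constraints)
        "node": (demands * (N - 1) ** 2, demands * N + N * N - N),  # y; (5)+(6)
        "edge": (demands * E, demands * N + E),                     # z; (8)+(9)
        "path": (P, demands + E),                                   # g; (11)+(12)
    }
    return {
        name: {"variables": upper_vars + n_vars,
               "constraints": upper_cons + n_cons}
        for name, (n_vars, n_cons) in lower.items()
    }


# Hypothetical instance: 4 end hosts, 10 nodes, 20 links, 18 predefined paths.
sizes = model_sizes(V=4, N=10, E=20, P=18)
```

For these sample sizes the path-oriented program is far smaller than the other two, which matches the trade-off described above between model size and the range of routings that can be represented.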

V. SUMMARY
In this paper we defined and presented three integer formulations for multicast flow assignment in two layer networks. We posed the problem as the creation of a single multicast tree in the overlay layer and its realization as a set of multicommodity unicast flows in the lower network layer. The main goal is to minimize the overall transportation cost, given that all costs are derived from the physical network layer and that the demanded streaming rate is satisfied. The first model uses a node-oriented notation, where a pair of nodes (i, j) indicates a directed arc, while the edge-oriented notation replaces it simply by e. The third formulation is based on the path-link approach, in which a predefined set of paths is required. These models can be applied for finding a lower bound on flow costs in two layer networks or for designing advanced cooperative multilayered protocols for effective overlay multicast transmission. The first two models provide optimal flow assignment in multilayered systems, whereas the path-oriented notation is in general a simplified version of the problem, since only the predefined paths are available. However, all of them retain the ease of deploying new protocols, with all the advantages of overlay multicast, while minimizing the network flows derived from the physical structure. Moreover, the models can be used to design and develop exact, approximation or heuristic algorithms.
Natural extensions or generalizations of these models include streaming over multiple trees in the overlay layer, additionally constrained trees (limited in the number of hops, degrees, leaves, etc.), limits on cross-ISP connections at the network level, and extra constraints covering fairness or survivability. Furthermore, the modelling of trees may use multicommodity concepts instead of a single commodity, as well as models based on layered Steiner trees, levels or parent relations. On the other hand, the objective function might target aspects other than flow cost minimization, such as system throughput maximization, optimal reliability, delay minimization, or network design and dimensioning. Finally, the two layer network can be extended to more than two layers. In such cases, the links of an overlay layer are formed using flows on paths of the layer below, and this pattern repeats as one goes down the resource hierarchy. We believe that all these models will be formulated and discussed in the future, that algorithms and solution methods will be documented and evaluated for numerous scenarios, and finally that effective protocols for realizing overlay multicast flows in multilayer networks will be proposed and implemented.

Fig. 2. Multicast tree realized by end-hosts in overlay layer without knowledge of underlying physical layer.

Fig. 3. Overlay multicast tree and its unicast-based realization in lower layer.

Fig. 6. Directed link representation used in edge-oriented and path-oriented notations.

TABLE I. NUMBER OF VARIABLES APPLIED TO INTEGER MODELS