Universidad Publica de Valencia_T2_encaminamiento
8/10/2019 Universidad Publica de Valencia_T2_encaminamiento
Transmisión de Datos Multimedia http://www.grc.upv.es/docencia/tdm Master IC 2007/2008
Topic 2: Routing aspects
Basic routing algorithms
  Link state
  Distance vector
Internet routing
  RIP
  OSPF
  BGP
Multi-Protocol Label Switching (MPLS)
IP multicast
-
Interplay between routing and forwarding
[Figure: the routing algorithm fills the local forwarding table; the value in the arriving packet's header (here 0111) selects one of the output links 1, 2, 3.]
Local forwarding table:
header value   output link
0100           3
0101           2
0111           2
1001           1
Source: Computer Networking: A Top-Down Approach Featuring the Internet, 3rd edition. Jim Kurose, Keith Ross. Addison-Wesley, July 2004.
-
Graph abstraction
[Figure: six-node graph with nodes u, v, w, x, y, z and a cost on each link.]
Graph: G = (N,E)
N = set of routers = { u, v, w, x, y, z }
E = set of links = { (u,v), (u,x), (u,w), (v,x), (v,w), (x,w), (x,y), (w,y), (w,z), (y,z) }
Remark: the graph abstraction is useful in other network contexts.
Example: P2P, where N is the set of peers and E is the set of TCP connections.
-
Graph abstraction: costs
[Figure: the same six-node graph with its link costs.]
c(x,x') = cost of link (x,x')
  e.g., c(w,z) = 5
The cost could always be 1, or inversely related to bandwidth, or related to congestion (a higher cost on congested links).
Cost of path (x1, x2, x3, ..., xp) = c(x1,x2) + c(x2,x3) + ... + c(xp-1,xp)
Question: What is the least-cost path between u and z?
Routing algorithm: an algorithm that finds the least-cost path.
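The path-cost formula can be tried out with a small sketch. The link costs below are assumed values for the six-node graph (the slide text itself only states c(w,z) = 5), and the names `COSTS`, `link_cost` and `path_cost` are illustrative.

```python
# Assumed link costs for the six-node example graph (undirected links).
COSTS = {
    ("u", "v"): 2, ("u", "x"): 1, ("u", "w"): 5,
    ("v", "x"): 2, ("v", "w"): 3, ("x", "w"): 3,
    ("x", "y"): 1, ("w", "y"): 1, ("w", "z"): 5,
    ("y", "z"): 2,
}


def link_cost(a, b):
    """Cost of the undirected link (a, b); raises KeyError if there is no link."""
    return COSTS[(a, b)] if (a, b) in COSTS else COSTS[(b, a)]


def path_cost(path):
    """c(x1,x2) + c(x2,x3) + ... + c(xp-1,xp) for a list of nodes."""
    return sum(link_cost(a, b) for a, b in zip(path, path[1:]))


cost_via_y = path_cost(["u", "x", "y", "z"])   # 1 + 1 + 2
cost_via_w = path_cost(["u", "w", "z"])        # 5 + 5
```

Under these assumed costs the path u, x, y, z is cheaper than the two-hop path through w, which is the kind of comparison a routing algorithm automates.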
-
Routing algorithm classification
Global or decentralized information?
Global:
  all routers have the complete topology and link cost info
  "link state" algorithms
Decentralized:
  a router knows its physically connected neighbors and the link costs to them
  iterative process of computation and exchange of info with neighbors
  "distance vector" algorithms
Static or dynamic?
Static:
  routes change slowly over time
Dynamic:
  routes change more quickly
  periodic updates, or updates in response to link cost changes
-
A link-state routing algorithm: Dijkstra's algorithm
Net topology and link costs are known to all nodes
  accomplished via "link state broadcast"
  all nodes have the same info
Computes least-cost paths from one node ("source") to all other nodes
  gives the forwarding table for that node
Iterative: after k iterations, the least-cost paths to k destinations are known
Notation:
  c(x,y): link cost from node x to y; = ∞ if x and y are not direct neighbors
  D(v): current value of the cost of the path from the source to destination v
  p(v): predecessor node along the path from the source to v
  N': set of nodes whose least-cost path is definitively known
Initialization:
  N' = {u}
  for all nodes v
    if v adjacent to u
      then D(v) = c(u,v)
      else D(v) = ∞
Loop
  find w not in N' such that D(w) is a minimum
  add w to N'
  update D(v) for all v adjacent to w and not in N':
    D(v) = min( D(v), D(w) + c(w,v) )
    /* new cost to v is either the old cost to v or the known
       shortest-path cost to w plus the cost from w to v */
until all nodes in N'
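The pseudocode above can be sketched in Python as follows. This is a minimal illustration, not the course's implementation: it uses a binary heap instead of the linear scan for the minimum, and the link costs in `EDGES` are assumed values for a six-node graph with nodes u to z.

```python
import heapq
import math


def dijkstra(graph, source):
    """Return (D, p): least costs and predecessors from source.

    graph maps each node to a dict {neighbor: link cost}.
    """
    D = {node: math.inf for node in graph}
    p = {}
    D[source] = 0
    done = set()                       # plays the role of N'
    heap = [(0, source)]
    while heap:
        cost, w = heapq.heappop(heap)
        if w in done:
            continue                   # stale heap entry, w already final
        done.add(w)
        for v, c in graph[w].items():
            if v not in done and cost + c < D[v]:
                D[v] = cost + c        # D(v) = min(D(v), D(w) + c(w,v))
                p[v] = w
                heapq.heappush(heap, (D[v], v))
    return D, p


# Assumed link costs (undirected) for illustration only.
EDGES = [
    ("u", "v", 2), ("u", "x", 1), ("u", "w", 5), ("v", "x", 2),
    ("v", "w", 3), ("x", "w", 3), ("x", "y", 1), ("w", "y", 1),
    ("w", "z", 5), ("y", "z", 2),
]
GRAPH = {}
for a, b, c in EDGES:
    GRAPH.setdefault(a, {})[b] = c
    GRAPH.setdefault(b, {})[a] = c

D, p = dijkstra(GRAPH, "u")
```

With these costs, `D["z"]` comes out as 4 and the predecessor chain z, y, x, u gives the least-cost path from u to z.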
-
Dijkstra's algorithm example
[Figure: six-node graph with nodes A, B, C, D, E, F and a cost on each link.]
step  SPT     D(b),P(b)  D(c),P(c)  D(d),P(d)  D(e),P(e)  D(f),P(f)
0     A       2, A       5, A       1, A       ∞          ∞
1     AD      2, A       4, D                  2, D       ∞
2     ADE     2, A       3, E                             4, E
3     ADEB               3, E                             4, E
4     ADEBC                                               4, E
5     ADEBCF
-
Dijkstra's algorithm example
Resulting shortest-path tree from A:
  A-B, A-D, D-E, E-C, E-F
Resulting forwarding table in A:
destination  link
B            (A,B)
D            (A,D)
E            (A,D)
C            (A,D)
F            (A,D)
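The forwarding table follows mechanically from the predecessor values p(v) of the example. A minimal sketch, with `first_hop` as an illustrative helper name:

```python
def first_hop(pred, source, dest):
    """Walk the predecessors back from dest to the node adjacent to source."""
    node = dest
    while pred[node] != source:
        node = pred[node]
    return node


# Predecessors p(v) read off the final rows of the example table.
PRED = {"B": "A", "D": "A", "E": "D", "C": "E", "F": "E"}

# destination -> outgoing link of A
forwarding_table = {
    dest: ("A", first_hop(PRED, "A", dest)) for dest in sorted(PRED)
}
```

Every destination except B is reached through D, so four of the five entries use link (A,D).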
-
Dijkstra's algorithm, discussion
Algorithm complexity: n nodes
  each iteration: need to check all nodes, w, not in N'
  n(n+1)/2 comparisons: O(n^2)
  more efficient implementations possible: O(n log n)
Oscillations possible:
  e.g., link cost = amount of carried traffic
[Figure: four-node example (A, B, C, D) showing the link costs initially and after each of three routing recomputations; the chosen routes oscillate.]
-
Distance Vector Algorithm
Bellman-Ford equation (dynamic programming)
Define: dx(y) := cost of the least-cost path from x to y
Then
  dx(y) = min_v { c(x,v) + dv(y) }
where the min is taken over all neighbors v of x
-
Bellman-Ford example
[Figure: the six-node graph with nodes u, v, w, x, y, z and its link costs.]
Clearly, dv(z) = 5, dx(z) = 3, dw(z) = 3
The B-F equation says:
  du(z) = min { c(u,v) + dv(z),
                c(u,x) + dx(z),
                c(u,w) + dw(z) }
        = min { 2 + 5,
                1 + 3,
                5 + 3 } = 4
The node that achieves the minimum is the next hop in the shortest path, and goes into the forwarding table.
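The same computation as a short sketch; the numbers are the ones from the example, and `bellman_ford_step` is an illustrative name.

```python
def bellman_ford_step(link_costs, neighbor_dists):
    """Apply dx(y) = min_v { c(x,v) + dv(y) } for one destination y.

    link_costs:     {neighbor v: c(x,v)}
    neighbor_dists: {neighbor v: dv(y)}
    Returns (least cost, neighbor that achieves it).
    """
    best = min(link_costs, key=lambda v: link_costs[v] + neighbor_dists[v])
    return link_costs[best] + neighbor_dists[best], best


c_u = {"v": 2, "x": 1, "w": 5}       # c(u,v), c(u,x), c(u,w)
d_to_z = {"v": 5, "x": 3, "w": 3}    # dv(z), dx(z), dw(z)
cost, next_hop = bellman_ford_step(c_u, d_to_z)
```

The returned neighbor is the next hop that u would install for destination z.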
-
Distance Vector Algorithm
Node x maintains its distance vector: Dx = [Dx(y): y ∈ N]
  where Dx(y) = estimate of the least cost from x to y
Node x also maintains its neighbors' distance vectors
  for each neighbor v, x maintains Dv = [Dv(y): y ∈ N]
Node x knows the cost to each neighbor v: c(x,v)
-
Distance vector algorithm
Basic idea:
Each node periodically sends its own distance vector estimate to its neighbors
When a node x receives a new DV estimate from a neighbor, it updates its own DV using the B-F equation:
  Dx(y) ← min_v { c(x,v) + Dv(y) }   for each node y ∈ N
Under minor, natural conditions, the estimates Dx(y) converge to the actual least cost dx(y)
-
Distance Vector Algorithm
Iterative, asynchronous: each local iteration is caused by:
  a local link cost change
  a DV update message from a neighbor
Distributed:
  each node notifies its neighbors only when its DV changes
  neighbors then notify their neighbors if necessary
Each node:
  wait for (change in local link cost or msg from neighbor)
  recompute estimates
  if the DV to any dest has changed, notify neighbors
-
[Figure: three nodes x, y, z with link costs c(x,y) = 2, c(y,z) = 1, c(x,z) = 7.]
Each table row gives the distance vector known "from" a node (cost to x, y, z); ∞ marks a vector not yet received. Columns t0, t1, t2 are successive points in time.

node x table
from   t0       t1       t2
x      0 2 7    0 2 3    0 2 3
y      ∞ ∞ ∞    2 0 1    2 0 1
z      ∞ ∞ ∞    7 1 0    3 1 0

node y table
from   t0       t1       t2
x      ∞ ∞ ∞    0 2 7    0 2 3
y      2 0 1    2 0 1    2 0 1
z      ∞ ∞ ∞    7 1 0    3 1 0

node z table
from   t0       t1       t2
x      ∞ ∞ ∞    0 2 7    0 2 3
y      ∞ ∞ ∞    2 0 1    2 0 1
z      7 1 0    3 1 0    3 1 0

At node x, after the first exchange:
Dx(y) = min{c(x,y) + Dy(y), c(x,z) + Dz(y)} = min{2+0, 7+1} = 2
Dx(z) = min{c(x,y) + Dy(z), c(x,z) + Dz(z)} = min{2+1, 7+0} = 3
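The three-node example can be replayed with a small simulation. This is a sketch under a simplifying assumption: all nodes exchange vectors in synchronous rounds, whereas the algorithm itself is asynchronous. The names `LINKS`, `cost` and `run_distance_vector` are illustrative.

```python
import math

LINKS = {("x", "y"): 2, ("y", "z"): 1, ("x", "z"): 7}
NODES = ["x", "y", "z"]


def cost(a, b):
    """Direct link cost, 0 to itself, infinity if not neighbors."""
    if a == b:
        return 0
    return LINKS.get((a, b), LINKS.get((b, a), math.inf))


def run_distance_vector():
    """Iterate the B-F update until no vector changes.

    Returns (final vectors, number of rounds in which something changed).
    """
    # Initially each node only knows the cost of its own links.
    dv = {n: {m: cost(n, m) for m in NODES} for n in NODES}
    rounds = 0
    while True:
        new = {
            x: {
                y: 0 if x == y else min(
                    cost(x, v) + dv[v][y]
                    for v in NODES
                    if v != x and cost(x, v) < math.inf
                )
                for y in NODES
            }
            for x in NODES
        }
        if new == dv:
            return dv, rounds
        dv, rounds = new, rounds + 1


vectors, rounds = run_distance_vector()
```

The final vectors match the last column of the tables: x and z both learn that the cost between them is 3 via y rather than 7 over the direct link.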
-
Comparison of LS and DV algorithms
Message complexity
  LS: with n nodes, E links, O(nE) msgs sent
  DV: exchange between neighbors only
    convergence time varies
Speed of convergence
  LS: O(n^2) algorithm requires O(nE) msgs
    may have oscillations
  DV: convergence time varies
    may be routing loops
    count-to-infinity problem
Robustness: what happens if a router malfunctions?
  LS:
    a node can advertise an incorrect link cost
    each node computes only its own table
  DV:
    a node can advertise an incorrect path cost
    each node's table is used by others
      errors propagate through the network
-
Topic 2: Routing aspects
Basic routing algorithms
  Link state
  Distance vector
Internet routing
  RIP
  OSPF
  BGP
Multi-Protocol Label Switching (MPLS)
IP multicast
-
Hierarchical Routing
Our routing study thus far is an idealization:
  all routers identical
  network "flat"
  ... not true in practice
Scale: with 200 million destinations:
  can't store all destinations in routing tables!
  routing table exchange would swamp links!
Administrative autonomy:
  internet = network of networks
  each network admin may want to control routing in its own network
-
Hierarchical Routing
Aggregate routers into regions, "autonomous systems" (AS)
Routers in the same AS run the same routing protocol
  "intra-AS" routing protocol
  routers in different ASes can run different intra-AS routing protocols
Gateway router
  has a direct link to a router in another AS
-
Interconnected ASes
[Figure: three interconnected ASes. AS1 contains routers 1a, 1b, 1c, 1d; AS2 contains 2a, 2b, 2c; AS3 contains 3a, 3b, 3c. In each router, the intra-AS and inter-AS routing algorithms both feed the forwarding table.]
The forwarding table is configured by both the intra- and inter-AS routing algorithms
  intra-AS sets entries for internal destinations
  inter-AS and intra-AS set entries for external destinations
-
Inter-AS tasks
[Figure: the same three interconnected ASes (AS1, AS2, AS3).]
Suppose a router in AS1 receives a datagram whose destination is outside AS1
  the router should forward the packet towards one of the gateway routers, but which one?
AS1 needs:
  to learn which destinations are reachable through AS2 and which through AS3
  to propagate this reachability info to all routers in AS1
Job of inter-AS routing!
-
Example: Choosing among multiple ASes
Now suppose AS1 learns from the inter-AS protocol that subnet x is reachable from AS3 and from AS2.
To configure its forwarding table, router 1d must determine towards which gateway it should forward packets for destination x.
This is also the job of the inter-AS routing protocol!
Hot potato routing: send the packet towards the closer of the two routers.
Steps:
  1. Learn from the inter-AS protocol that subnet x is reachable via multiple gateways.
  2. Use routing info from the intra-AS protocol to determine the costs of the least-cost paths to each of the gateways.
  3. Hot potato routing: choose the gateway that has the smallest least cost.
  4. Determine from the forwarding table the interface I that leads to the least-cost gateway. Enter (x,I) in the forwarding table.
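The gateway choice can be sketched in a few lines. Everything concrete here is hypothetical: the gateway names, the intra-AS costs seen from router 1d and the interface names are invented for illustration.

```python
def hot_potato(gateways, intra_as_cost, interface_to):
    """Choose the gateway with the smallest intra-AS least cost.

    Returns (gateway, interface leading to that gateway).
    """
    gateway = min(gateways, key=lambda g: intra_as_cost[g])
    return gateway, interface_to[gateway]


# Hypothetical view from router 1d.
reachable_via = ["1b", "1c"]                  # gateways that can reach subnet x
cost_from_1d = {"1b": 2, "1c": 1}             # from the intra-AS protocol
interface_from_1d = {"1b": "if0", "1c": "if1"}

gateway, interface = hot_potato(reachable_via, cost_from_1d, interface_from_1d)
forwarding_table = {"x": interface}           # the (x, I) entry
```

Note that the choice ignores what happens beyond the gateway: the router only minimizes the cost of getting the packet out of its own AS.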
-
Intra-AS Routing
Also known as Interior Gateway Protocols (IGP)
Most common intra-AS routing protocols:
  RIP: Routing Information Protocol
  OSPF: Open Shortest Path First
  IGRP: Interior Gateway Routing Protocol (Cisco proprietary)
-
RIP (Routing Information Protocol)
Distance vector algorithm
Included in the BSD-UNIX distribution in 1982
Distance metric: number of hops (max = 15 hops)
[Figure: routers A, B, C, D interconnecting subnets u, v, w, x, y, z.]
From router A to subnets:
destination  hops
u            1
v            2
w            2
x            3
y            3
z            2
-
RIP advertisements
Distance vectors: exchanged among neighbors every 30 sec via Response Message (also called advertisement)
Each advertisement: list of up to 25 destination nets within the AS
Table processing:
  RIP routing tables are managed by an application-level process called route-d (daemon)
  advertisements are sent in UDP packets, periodically repeated
[Figure: two protocol stacks (physical, link, network (IP) with its forwarding table, transport (UDP)), each with a routed process on top.]
-
RIP: Link Failure and Recovery
If no advertisement is heard after 180 sec, the neighbor/link is declared dead
  routes via the neighbor are invalidated
  new advertisements are sent to neighbors
  neighbors in turn send out new advertisements (if their tables changed)
  link failure info quickly propagates to the entire net
  poison reverse is used to prevent ping-pong loops (infinite distance = 16 hops)
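Poison reverse and route invalidation can be illustrated with a toy routing table. This is a sketch, not the behavior of any particular RIP daemon; the table contents and the function names are hypothetical.

```python
INFINITY = 16   # RIP's "infinite" distance in hops


def build_advertisement(routes, to_neighbor):
    """Build the distance vector sent to one neighbor.

    routes maps destination -> (hops, next_hop). With poison reverse,
    routes learned through to_neighbor are advertised back to it as
    unreachable.
    """
    return {
        dest: INFINITY if next_hop == to_neighbor else hops
        for dest, (hops, next_hop) in routes.items()
    }


def expire_neighbor(routes, dead_neighbor):
    """Invalidate every route that goes through a neighbor declared dead."""
    return {
        dest: (INFINITY, next_hop) if next_hop == dead_neighbor else (hops, next_hop)
        for dest, (hops, next_hop) in routes.items()
    }


# Hypothetical table: u is directly attached, w is reached via B, z via C.
routes = {"u": (1, None), "w": (2, "B"), "z": (2, "C")}
advert_to_b = build_advertisement(routes, "B")
after_timeout = expire_neighbor(routes, "B")
```

Because B is told that w is 16 hops away through this router, B will never try to route to w back through it, which is what breaks the two-node ping-pong loop.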
-
OSPF (Open Shortest Path First)
"Open": publicly available
Uses the link state algorithm
  LS packet dissemination
  topology map at each node
  route computation using Dijkstra's algorithm
An OSPF advertisement carries one entry per neighbor router
Advertisements are disseminated to the entire AS (via flooding)
  carried in OSPF messages directly over IP (rather than TCP or UDP)
-
OSPF "advanced" features (not in RIP)
Security: all OSPF messages are authenticated (to prevent malicious intrusion)
Multiple same-cost paths allowed (only one path in RIP)
For each link, multiple cost metrics for different TOS (e.g., satellite link cost set "low" for best effort; "high" for real time)
Integrated uni- and multicast support:
  Multicast OSPF (MOSPF) uses the same topology database as OSPF
Hierarchical OSPF in large domains.
-
Hierarchical OSPF
Two-level hierarchy: local area, backbone.
  link-state advertisements only within an area
  each node has the detailed area topology; it only knows the direction (shortest path) to nets in other areas.
Area border routers: "summarize" distances to nets in their own area, advertise to other area border routers.
Backbone routers: run OSPF routing limited to the backbone.
Boundary routers: connect to other ASes.
-
Internet inter-AS routing: BGP
BGP (Border Gateway Protocol): the de facto standard
BGP provides each AS a means to:
  1. Obtain subnet reachability information from neighboring ASes.
  2. Propagate the reachability information to all routers internal to the AS.
  3. Determine "good" routes to subnets based on reachability information and policy.
Allows a subnet to advertise its existence to the rest of the Internet: "I am here"
-
BGP basics
Pairs of routers (BGP peers) exchange routing info over semi-permanent TCP connections: BGP sessions
Note that BGP sessions do not correspond to physical links.
When AS2 advertises a prefix to AS1, AS2 is promising it will forward any datagrams destined to that prefix towards the prefix.
  AS2 can aggregate prefixes in its advertisement
[Figure: AS1 (routers 1a, 1b, 1c, 1d), AS2 (2a, 2b, 2c) and AS3 (3a, 3b, 3c), with eBGP sessions between routers of different ASes and iBGP sessions between routers of the same AS.]
-
BGP route selection
A router may learn about more than one route to some prefix. The router must select a route.
Elimination rules:
  1. Local preference value attribute: policy decision
  2. Shortest AS-PATH
  3. Closest NEXT-HOP router: hot potato routing
  4. Additional criteria
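The first three elimination rules can be expressed as a sort key. This is a simplified sketch: real BGP implementations apply more tie-breakers (the "additional criteria"), and the `Route` fields, prefix and AS numbers below are illustrative values taken from documentation ranges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str
    local_pref: int       # policy value, higher is preferred
    as_path: tuple        # sequence of AS numbers
    next_hop_cost: int    # intra-AS cost to the NEXT-HOP router


def select_route(routes):
    """Highest local preference, then shortest AS-PATH, then closest NEXT-HOP."""
    return min(
        routes,
        key=lambda r: (-r.local_pref, len(r.as_path), r.next_hop_cost),
    )


a = Route("192.0.2.0/24", 100, (64500, 64510), 5)
b = Route("192.0.2.0/24", 100, (64501,), 9)
c = Route("192.0.2.0/24", 100, (64502,), 3)
d = Route("192.0.2.0/24", 200, (64503, 64504, 64505), 20)

best_equal_pref = select_route([a, b, c])
best_with_policy = select_route([a, b, c, d])
```

Route d wins the second selection despite its longer AS-PATH and more distant next hop, which shows how policy (local preference) dominates the other rules.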
-
Why different intra- and inter-AS routing?
Policy:
  Inter-AS: admin wants control over how its traffic is routed and who routes through its net.
  Intra-AS: single admin, so no policy decisions needed
Scale:
  hierarchical routing saves table size and reduces update traffic
Performance:
  Intra-AS: can focus on performance
  Inter-AS: policy may dominate over performance
-
Topic 2: Routing aspects
Basic routing algorithms
  Link state
  Distance vector
Internet routing
  RIP
  OSPF
  BGP
Multi-Protocol Label Switching (MPLS)
IP multicast
-
MPLS - The Motivation
IP protocol suite: the most predominant networking technology.
Voice and data convergence on a single network infrastructure.
Continual increase in the number of users.
Demand for higher connection speeds.
Increase in traffic volumes.
Ever-increasing number of ISP networks.
MPLS Working Groups and Standards
  Standardized by the IETF - currently in Draft stage.
  MPLS recommendations are made by IP players for IP services
  MPLS core components are generic
  MPLS doesn't depend on a specific technology
-
MPLS and ISO model
Layer    Protocols
5 to 7   Applications
4        TCP, UDP
3        IP
         MPLS
2        PPP, Frame Relay, ATM (*)
1        Physical (Optical - Electrical)
(*) The ATM overlay model (without addressing and P-NNI) is considered an ISO layer 2 protocol.
The IETF's main goal is that when a layer is added, no modification is needed on the existing layers.
All new protocols must be backward compatible.
-
Some MPLS Terms...
LER - Label Edge Router
LSR - Label Switch Router
FEC - Forwarding Equivalence Class
Label - associates a packet with a FEC
Label Stack - multiple labels containing information on how a packet is forwarded.
Shim - header containing a Label Stack
Label Switch Path - path that a packet follows for a specific FEC
LDP - Label Distribution Protocol, used to distribute label information between MPLS-aware network devices
Label Swapping - manipulation of labels to forward packets towards the destination.
-
MPLS Architecture
[Figure: an LSP (Label-Switched Path) crossing an ingress node, core nodes and an egress node, each running a routing protocol (OSPF) over layers 1 and 2. The ingress node performs classification and label assignment using its FEC (Forwarding Equivalence Class) table, attributes and precedence; the core nodes perform label swapping using their label tables; the egress node performs label removal.]
-
MPLS process
[Figure: ingress, core and egress nodes along a Label Switch Path, with routing by OSPF / RIP / IS-IS. The ingress node classifies packets into a FEC (using precedence) and assigns a label, the core node swaps labels using its label table, and the egress node removes the label.]
-
MPLS Cloud
[Figure: an MPLS cloud with LERs at the edge and LSRs in the core. Plain IP packets are handled by L3 routing; inside the cloud, IP packets with a label are forwarded by label swapping.]
LER: Label Edge Router
LSR: Label Switch(ing) Router
-
FEC Classification
Ingress Label  FEC                   Attribute  Egress Label
6              138.120.6/24 - xxxx   A          9
6              138.120.6/24 - xxxx   B          12
FECs are manually initiated by the operator
A FEC is associated with at least one label
A packet can be mapped to a particular FEC based on the following criteria:
  destination IP address,
  source IP address,
  TCP/UDP port,
  in case of inter-AS MPLS, Source-AS and Dest-AS,
  class of service,
  application used,
  any combination of the previous criteria.
-
What is a Label?
L2 Type   Port  Ingress Label    FEC  Egress Label     Port  L2 Type
ATM       1-1   12 (i.e. 4/65)   F1   22 (i.e. 5/65)   3-4   ATM
ATM       1-1   15 (i.e. 0/25)   F4   9 (i.e. 101)     5-1   FR
Gig Eth   5-1   7                F1   22 (i.e. 4/65)   3-4   ATM
A label is a short, fixed-length, locally significant identifier used to identify a FEC.
The label can be identified by the L2 technology identifier (e.g. VPI/VCI for ATM, DLCI for FR, or the MPLS label for PPP/Ethernet).
MPLS Label Assignment Schemes
  Topology driven: label assignment in response to routing protocol (OSPF and BGP) updates
  Control driven: label assignment in response to RSVP, CR-LDP requests
  Traffic driven: label assignment in response to flow detection and triggering
-
The MPLS Shim Header
| Label (20 bits) | Exp (3 bits) | S (1 bit) | TTL (8 bits) |
Label: label value (0 to 15 are reserved for special use)
Exp: experimental use
S: bottom of stack (set to 1 for the last entry in the label stack)
TTL: time to live
The label (shim header) is represented as a sequence of Label Stack Entries
Each Label Stack Entry is coded on 4 bytes (32 bits) as described above
20 bits are reserved for the label identifier (also named label)
Based on the contents of the label, a swap, push (impose) or pop (dispose) operation can be performed on the packet's label stack
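The 32-bit layout of a label stack entry can be made concrete with a pack/unpack sketch. The field order follows the layout above and the entry is written in network byte order; the function names are illustrative.

```python
import struct


def pack_entry(label, exp, s, ttl):
    """Encode one label stack entry: 20 + 3 + 1 + 8 bits, big-endian."""
    if not (0 <= label < 2**20 and 0 <= exp < 8 and s in (0, 1) and 0 <= ttl < 256):
        raise ValueError("field out of range")
    word = (label << 12) | (exp << 9) | (s << 8) | ttl
    return struct.pack("!I", word)


def unpack_entry(data):
    """Decode 4 bytes into (label, exp, s, ttl)."""
    (word,) = struct.unpack("!I", data)
    return word >> 12, (word >> 9) & 0x7, (word >> 8) & 0x1, word & 0xFF


# Label 12, bottom of stack, TTL 64.
entry = pack_entry(12, 0, 1, 64)
fields = unpack_entry(entry)
```

A label stack is then simply several such 4-byte entries in a row, with S set to 1 only on the last one.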
-
Label Switched Path
[Figure: four MPLS switches with numbered interfaces, attached to networks 138.120, 192.168 and 127.20. Packets for FEC 138.120 carry label 5 on one hop and label 12 on the next.]
Label tables along the path (x = no label):
Node     Ingress Interface  Ingress Label  FEC      Egress Interface  Egress Label
ingress  1                  x              138.120  3                 5
core     1                  5              138.120  3                 12
egress   1                  12             138.120  4                 x
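Label swapping along a path can be sketched as a chain of table lookups. The tables below are hypothetical and only loosely modeled on the figure: interface numbers, label values and the assumption that every node receives on interface 1 are illustrative, and the ingress classification by FEC is reduced to a lookup on "no label".

```python
# (ingress interface, ingress label) -> (egress interface, egress label)
# None stands for "no label".
TABLES = {
    "ingress": {(1, None): (3, 5)},    # classify into the FEC, push label 5
    "core":    {(1, 5): (3, 12)},      # swap 5 -> 12
    "egress":  {(1, 12): (4, None)},   # pop the label, forward plain IP
}
PATH = ["ingress", "core", "egress"]


def forward(in_interface=1, label=None):
    """Follow a packet along PATH and record (node, out interface, out label)."""
    trace = []
    for node in PATH:
        out_interface, label = TABLES[node][(in_interface, label)]
        trace.append((node, out_interface, label))
        in_interface = 1   # assumed: the next node receives on interface 1
    return trace


trace = forward()
```

Only the ingress node looks at the IP header; the other nodes forward on the label alone, which is the point of a label switched path.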
-
Hop by Hop IP forwarding
[Figure: the same four MPLS switches and networks (138.120, 192.168, 127.20). A packet for 138.120.6.12 carries no label, so every switch has to look up the destination address ("??") and forwards the packet using its Default entry; the label fields of the tables are empty (None, x).]
-
MPLS Label Distribution Protocol
LDP - a set of procedures by which one LSR informs the other of the FEC-to-label binding it has made.
Currently, several protocols that can be used for label distribution are available:
  RSVP-TE (MPLS extension)
  LDP and CR-LDP
  BGP-4 MPLS extensions
Label Distribution schemes
-
Downstream on demand
The label is requested by the upstream node, and the downstream node defines the label used.
[Figure: the same four MPLS switches. "Request 138.120" messages travel downstream towards network 138.120; "Mapping 12" and "Mapping 5" messages travel back upstream and fill in the label tables.]
Resulting label tables (x = no label):
Node     Ingress Interface  Ingress Label  FEC      Egress Interface  Egress Label
ingress  1                  x              138.120  3                 5
core     1                  5              138.120  3                 12
egress   1                  12             138.120  4                 x
-
Unsolicited Downstream
The downstream node defines the label and advertises it to the upstream node.
[Figure: the same four MPLS switches. "Mapping 12" and "Mapping 5" messages are sent upstream without a prior request, filling in the label tables for FEC 138.120.]
-
Edge LSR Features
Routing protocols
FEC classification
Initiates LSP setup for the Downstream On Demand method
Adaptation of non-MPLS data to MPLS data
Layer 2 translation for MPLS data
Terminates MPLS-VPNs
At least one label distribution protocol
The edge LSR is counted in the TTL count as a regular router
-
Core LSR Features
Routing protocols
Propagates the Downstream On Demand method (request and mapping)
Layer 2 translation
High-speed label forwarding/switching
At least one label distribution protocol
-
MPLS Advantages
Simplified forwarding
Efficient explicit routing
Traffic engineering
QoS routing
Mappings from IP packet to Forwarding Equivalence Class (FEC)
Partitioning of functionality
Common operation over packet and cell media
-
Topic 2: Routing aspects
Basic routing algorithms
  Link state
  Distance vector
Internet routing
  RIP
  OSPF
  BGP
Multi-Protocol Label Switching (MPLS)
IP multicast
-
Multicast = Efficient Data Distribution
[Figure: two network diagrams, each with a source node (Src).]
-
Why Multicast?
Need for efficient one-to-many delivery of the same data
Applications:
  News/sports/stock/weather updates
  Distance learning
  Configuration, routing updates, service location
  Pointcast-type "push" apps
  Teleconferencing (audio, video, shared whiteboard, text editor)
  Distributed interactive gaming or simulations
  Email distribution lists
  Content distribution; software distribution
  Web-cache updates
  Database replication
-
Why Not Broadcast or Unicast?
Broadcast:
  Send a copy to every machine on the net
  Simple, but inefficient
  All nodes must process the packet even if they don't care
  Wastes more CPU cycles of slower machines ("broadcast radiation")
  Network loops lead to "broadcast storms"
Replicated unicast:
  Sender sends a copy to each receiver in turn
  Receivers need to register or the sender must be pre-configured
  Sender is the focal point of all control traffic
  Reliability => per-receiver state, separate sessions/processes at the sender
-
Multicast Apps Characteristics
Number of (simultaneous) senders to the group
The size of the groups
  Number of members (receivers)
  Geographic extent or scope
  Diameter of the group measured in router hops
The longevity of the group
Number of aggregate packets/second
The peak/average used by the source
Level of human interactivity
  Lecture mode vs interactive
  Data-only (e.g. database replication) vs multimedia
-
Reliable Multicast vs. Unreliable Multicast
When a multicast message is sent by a process, the runtime support of the multicast mechanism is responsible for delivering the message to each process currently in the multicast group.
Each participating process may be on a separate host. Because of factors such as failures of network links and/or network hosts, routing delays, and differences in software and hardware, the time between when a message is sent and when it is received may vary among the recipient processes.
Moreover, a message may not be received at all by one or more of the processes.
-
Classification of multicasting mechanisms in terms of message delivery
Unreliable multicast:
  The arrival of the correct message at each process is not guaranteed.
Reliable multicast:
  Guarantees that each message is eventually delivered in a non-corrupted form to each process in the group.
  The definition of reliable multicast requires that each participating process receives exactly one copy of each message sent. It does not put any restriction on the order in which the messages are delivered.
Reliable multicast can be further classified based on the order of delivery of the messages: unordered, FIFO, causal order, atomic order.
-
Classification of reliable multicast - unordered
An unordered reliable multicast system guarantees the safe delivery of each message, but it provides no guarantee on the delivery order of the messages.
Example: Processes P1, P2, and P3 have formed a multicast group. Three messages, m1, m2, m3, have been sent to the group. An unordered reliable multicast system may deliver the messages to each of the three processes in any of these orders:
  m1-m2-m3,
  m1-m3-m2,
  m2-m1-m3,
  m2-m3-m1,
  m3-m1-m2,
  m3-m2-m1
C2007/2008
Classification of reliable multicast - FIFO
If process P sent messages mi and mj, in that order, then each process in the multicast group will be delivered the messages mi and mj, in that order.
Note that FIFO multicast places no restriction on the delivery order among messages sent by different processes. For example, P1 sends messages m11 then m12, and P2 sends messages m21 then m22. It is possible for different processes to receive any of the following orders:
m11-m12-m21-m22,
m11-m21-m12-m22,
m11-m21-m22-m12,
m21-m11-m12-m22,
m21-m11-m22-m12,
m21-m22-m11-m12.
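The FIFO rule can be checked mechanically. The following is a minimal sketch, not part of the original slides: the class name FifoCheck and the method isFifo are hypothetical, and message names are assumed to be unique strings. It tests whether one delivery sequence keeps every sender's messages in their sending order.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FifoCheck {
    /**
     * Returns true if 'delivered' respects FIFO order: for every sender, its
     * messages appear in the same relative order in which they were sent.
     * A missing message also fails the check, since reliable multicast
     * requires every message to be delivered.
     */
    public static boolean isFifo(Map<String, List<String>> sentBySender,
                                 List<String> delivered) {
        // Record the position of each message in the delivery sequence.
        Map<String, Integer> pos = new HashMap<>();
        for (int i = 0; i < delivered.size(); i++) {
            pos.put(delivered.get(i), i);
        }
        for (List<String> sent : sentBySender.values()) {
            int last = -1;
            for (String m : sent) {
                Integer p = pos.get(m);
                if (p == null || p < last) {
                    return false; // message missing or delivered out of order
                }
                last = p;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, List<String>> sent = new HashMap<>();
        sent.put("P1", Arrays.asList("m11", "m12"));
        sent.put("P2", Arrays.asList("m21", "m22"));
        // Interleaving across senders is allowed.
        System.out.println(isFifo(sent, Arrays.asList("m11", "m21", "m22", "m12"))); // true
        // m12 before m11 violates P1's sending order.
        System.out.println(isFifo(sent, Arrays.asList("m12", "m11", "m21", "m22"))); // false
    }
}
```

Note that the check is per sender only; it says nothing about causal or atomic order.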
Classification of reliable multicast - Causal order
If message mi causes (results in) the occurrence of message mj, then mi will be delivered to each process prior to mj. Messages mi and mj are said to have a causal or happens-before relationship.
For example, P1 sends a message m1, to which P2 replies with a multicast message m2. Since m2 is triggered by m1, the two messages share a causal relationship m1 -> m2. A causal-order multicast system ensures that these two messages are delivered to each process in the order m1-m2.
Classification of reliable multicast - Atomic order
In an atomic-order multicast system, all messages are guaranteed to be delivered to each participant in exactly the same order. Note that the delivery order does not have to be FIFO or causal, but it must be identical for each process.
Example: P1 sends m1, P2 sends m2, and P3 sends m3.
An atomic system guarantees that the messages are delivered to every process in one and the same of the six possible orders:
m1-m2-m3, m1-m3-m2, m2-m1-m3,
m2-m3-m1, m3-m1-m2, m3-m2-m1.
IP Multicast Architecture
Hosts
Routers
Service model
Host-to-router protocol (IGMP)
Multicast routing protocols (various)
IP Multicast model: RFC 1112
Message sent to multicast group (of receivers)
Senders need not be group members
A group is identified by a single group address
Use the group address instead of a destination address in IP packets sent to the group
Groups can have any size
Group members can be located anywhere on the Internet
Group membership is not explicitly known
Receivers can join/leave at will
Packets are not duplicated or delivered to destinations outside the group
Distribution tree constructed for delivery of packets
No more than one copy of a packet appears on any subnet
Packets delivered only to interested receivers => multicast delivery tree changes dynamically
Network has to actively discover paths between senders and receivers
IP Multicast Addresses
Class D IP addresses: 224.0.0.0 - 239.255.255.255
Address allocation:
Well-known (reserved) multicast addresses, assigned by IANA: 224.0.0.x and 224.0.1.x
Transient multicast addresses, assigned and reclaimed dynamically, e.g., by the sdr program
Each multicast address represents a group of arbitrary size, called a host group
There is no structure within the class D address space like subnetting => flat address space
Address format: 1 1 1 0 | Group ID
IP Multicast Service
Sending
Uses the normal IP-Send operation, with an IP multicast address specified as the destination
Must provide the sending application a way to:
Specify the outgoing network interface, if >1 available
Specify the IP time-to-live (TTL) on outgoing packets
Enable/disable loop-back if the sending host is/isn't a member of the destination group on the outgoing interface
Receiving
Two new operations:
Join-IP-Multicast-Group(group-address, interface)
Leave-IP-Multicast-Group(group-address, interface)
Receive multicast packets for joined groups via the normal IP-Receive operation
Link-Layer Transmission/Reception
Transmission
An IP multicast packet is transmitted as a link-layer multicast on those links that support multicast
The link-layer destination address is determined by an algorithm specific to the type of link
Reception
Necessary steps are taken to receive desired multicasts on a particular link, such as modifying address reception filters on LAN interfaces
Multicast routers must be able to receive all IP multicasts on a link, without knowing in advance which groups will be used
Using Link-Layer Multicast Addresses
Ethernet and other LANs using 802 addresses:
Direct mapping! Simpler than unicast! No ARP etc.
32 class D addresses may map to one MAC address
Special OUI for IETF: 0x01-00-5E
No mapping needed for point-to-point links
[Figure: the LAN multicast address starts with the fixed 25-bit prefix 0000 0001 0000 0000 0101 1110 0 (the low-order bit of the first octet is the group bit); its remaining 23 bits are copied from the low-order 23 bits of the 28-bit group ID that follows the 1110 prefix of the IP multicast address]
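The direct mapping can be written out in a few lines. The following is a minimal sketch, not taken from the slides: the class MulticastMac and its methods are hypothetical names, and the address is parsed by hand to keep the example self-contained. It copies the low-order 23 bits of the group address behind the fixed 01-00-5E prefix, which also shows why 32 class D addresses share one MAC address (the 5 dropped bits).

```java
public class MulticastMac {
    /** Parses a dotted-quad IPv4 address into its four octets (each 0-255). */
    public static int[] parse(String dotted) {
        String[] parts = dotted.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("not an IPv4 address: " + dotted);
        }
        int[] octets = new int[4];
        for (int i = 0; i < 4; i++) {
            octets[i] = Integer.parseInt(parts[i]);
            if (octets[i] < 0 || octets[i] > 255) {
                throw new IllegalArgumentException("octet out of range: " + dotted);
            }
        }
        return octets;
    }

    /** Class D addresses start with the bits 1110, i.e. first octet 224-239. */
    public static boolean isClassD(int[] octets) {
        return (octets[0] & 0xF0) == 0xE0;
    }

    /** Maps an IPv4 multicast address to its Ethernet multicast MAC address. */
    public static String toMac(String group) {
        int[] o = parse(group);
        if (!isClassD(o)) {
            throw new IllegalArgumentException("not a multicast address: " + group);
        }
        // Keep only the low 23 bits: 7 bits of octet 2, all of octets 3 and 4.
        return String.format("01:00:5E:%02X:%02X:%02X", o[1] & 0x7F, o[2], o[3]);
    }

    public static void main(String[] args) {
        System.out.println(toMac("239.1.2.3"));   // 01:00:5E:01:02:03
        System.out.println(toMac("224.129.2.3")); // 01:00:5E:01:02:03 (same MAC)
    }
}
```

Because two different groups can share a MAC address, the receiving IP layer still has to filter on the full group address.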
Multicast over LANs & Scoping
Multicasts are flooded across MAC-layer bridges along a spanning tree
But flooding may steal sending opportunities from non-member stations that want to transmit
Almost like broadcast!
Scope: How far do transmissions propagate?
Implicit scoping: reserved multicast addresses => don't leave the subnet
Also called link-local addresses
TTL-based scoping:
Multicast routers have a configured TTL threshold
Multicast datagram dropped if TTL
TTL expanding-ring search to reach or find a nearby subset of a group
Rings can be nested, but not overlapping
[Figure: source s surrounded by nested rings 1, 2, 3]
IP Multicast Architecture
Hosts
Routers
Service model
Host-to-router protocol (IGMP)
Multicast routing protocols (various)
Internet Group Management Protocol
IGMP: signaling protocol to establish, maintain, and remove groups on a subnet.
Objective: keep the router up to date with the group membership of the entire LAN
Routers need not know who all the members are, only that members exist
Each host keeps track of which multicast groups it is subscribed to
Socket API informs IGMP process of all joins
How IGMP Works (cont.)
When a host's timer for group G expires, it sends a Membership Report to group G, with TTL = 1
Other members of G hear the report and stop (suppress) their timers
Routers hear all reports, and time out non-responding groups
[Figure: routers send a query (Q); hosts answer with reports for group G]
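The report-suppression rule can be illustrated with a tiny model. This is a simplified sketch under stated assumptions, not an IGMP implementation: the class IgmpSuppression and its methods are hypothetical, the per-host timer delays are passed in rather than drawn at random, and every host is assumed to hear the first report instantly.

```java
public class IgmpSuppression {
    /**
     * Given the report delay (in ms) that each member of group G picked after
     * a query, returns the index of the host whose timer expires first. That
     * host sends the Membership Report. Ties go to the lowest index, which is
     * a simplification of this sketch.
     */
    public static int reporter(int[] delaysMs) {
        if (delaysMs.length == 0) {
            throw new IllegalArgumentException("no group members");
        }
        int first = 0;
        for (int i = 1; i < delaysMs.length; i++) {
            if (delaysMs[i] < delaysMs[first]) {
                first = i;
            }
        }
        return first;
    }

    /** Number of hosts that hear the report and cancel their own timers. */
    public static int suppressed(int[] delaysMs) {
        reporter(delaysMs); // validates that at least one member exists
        return delaysMs.length - 1;
    }

    public static void main(String[] args) {
        int[] delays = {700, 250, 900, 400};
        System.out.println("reporter: host " + reporter(delays));   // host 1
        System.out.println("suppressed: " + suppressed(delays));    // 3
    }
}
```

With four members, a single report reaches the router instead of four, which is the point of the suppression mechanism.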
How IGMP Works (cont.)
Normal case: only one report message per group present is sent in response to a query
Query interval is typically 60-90 seconds
When a host first joins a group, it sends reports immediately, instead of waiting for a query
IGMPv2: Hosts may send a Leave Group message to the all-routers address (224.0.0.2)
Querier responds with a Group-Specific Query message to see if any group members are still present
Lower leave latency
The Java Basic Multicast API
At the transport layer, the basic multicast supported by Java is an extension of UDP (the User Datagram Protocol)
For the basic multicast, Java provides a set of classes which are closely related to the datagram socket API classes
Datagram - recap
[Figure: the sender process wraps a byte array and the receiver's address in a DatagramPacket object and calls send on a DatagramSocket object; the receiver process calls receive on its own DatagramSocket object with a DatagramPacket object backed by a byte array. The legend distinguishes object references from data flow.]
The Java Basic Multicast API - 2
There are four major classes in the API, the first three of which we have already seen in the context of datagram sockets.
InetAddress: In the datagram socket API, this class represents the IP address of the sender or receiver. In multicasting, this class can be used to identify a multicast group.
DatagramPacket: As with datagram sockets, an object of this class represents an actual datagram; in multicast, a DatagramPacket object represents a packet of data sent to all participants or received by each participant in a multicast group.
DatagramSocket: In the datagram socket API, this class represents a socket through which a process may send or receive data.
MulticastSocket: A MulticastSocket is a DatagramSocket with additional capabilities for joining and leaving a multicast group. An object of the multicast datagram socket class can be used for sending and receiving multicast packets. In the Java API, a MulticastSocket object is bound to a port, e.g. 3456, and methods of the object allow for joining and leaving a multicast address, e.g. 239.1.2.3
Joining a multicast group
To join a multicast group at IP address m and UDP port p, a MulticastSocket object must be instantiated with p, then the object's joinGroup method can be invoked specifying the address m:
// join a Multicast group at IP address
// 239.1.2.3 and port 3456
InetAddress group = InetAddress.getByName("239.1.2.3");
MulticastSocket s = new MulticastSocket(3456);
s.joinGroup(group);
Sending to a multicast group
A multicast message can be sent using syntax similar to that of the datagram socket API.
String msg = "a multicast message.";
InetAddress group = InetAddress.getByName("239.1.2.3");
MulticastSocket s = new MulticastSocket(3456);
s.joinGroup(group); // optional
byte[] data = msg.getBytes(); // use the byte count, which can differ from msg.length()
DatagramPacket hi = new DatagramPacket(data, data.length, group, 3456);
s.send(hi);
Receiving messages sent to a multicast group
A process that has joined a multicast group may receive messages sent to the group using syntax similar to receiving data using a datagram socket API.
byte[] buf = new byte[1000];
InetAddress group = InetAddress.getByName("239.1.2.3");
MulticastSocket s = new MulticastSocket(3456);
s.joinGroup(group);
DatagramPacket recv = new DatagramPacket(buf, buf.length);
s.receive(recv); // blocks until a datagram arrives
Leaving a multicast group
A process may leave a multicast group by invoking the leaveGroup method of a MulticastSocket object, specifying the multicast address of the group.
s.leaveGroup(group);
Setting the time-to-live
The runtime support needs to propagate a multicast message from a host to a neighboring host using an algorithm which, when executed properly, will eventually deliver the message to all the participants.
Under some anomalous circumstances, however, it is possible that the algorithm which controls the propagation does not terminate properly, resulting in a packet circulating in the network indefinitely.
Indefinite message propagation causes unnecessary overhead on the systems and the network.
To avoid this, it is recommended that a time-to-live parameter be set with each multicast datagram.
The time-to-live (ttl) parameter, when set, limits the number of network links or hops over which the packet will be forwarded.
Setting the time-to-live
The recommended ttl settings are:
0 if the multicast is restricted to processes on the same host
1 if the multicast is restricted to processes on the same subnet
32 if the multicast is restricted to processes on the same site
64 if the multicast is restricted to processes on the same region
128 if the multicast is restricted to processes on the same continent
255 if the multicast is unrestricted
Setting the time-to-live
String msg = "Hello everyone!";
InetAddress group = InetAddress.getByName("239.1.2.3");
MulticastSocket s = new MulticastSocket(3456);
s.setTimeToLive(1); // set time-to-live to 1 hop
byte[] data = msg.getBytes(); // use the byte count, which can differ from msg.length()
DatagramPacket hi = new DatagramPacket(data, data.length, group, 3456);
s.send(hi);
The value specified for the ttl must be in the range 0
To join a group, you use the setsockopt() kernel service call with a new parameter. The new parameter is the ip_mreq structure, shown below.
The imr_multiaddr field specifies the multicast group you want to join. It has the same format as the sin_addr field in the sockaddr_in structure. The imr_interface field lets you choose a particular host interface. This is similar to a bind(), which lets you specify the host interface (or leave the host option wide open with an INADDR_ANY value).
/************************************************************/
/*** The ip_mreq structure for selecting a multicast addr ***/
/************************************************************/
struct ip_mreq {
    struct in_addr imr_multiaddr; /* known multicast group */
    struct in_addr imr_interface; /* network interface */
};
The C version: Joining Multicast Groups
The following code snippet shows you how to join a group using the ip_mreq structure. It sets the imr_interface field to INADDR_ANY merely for demonstration. Do not use it unless you have only one interface on your host; the results can be unpredictable.
/* Needs <sys/socket.h>, <netinet/in.h>, and <arpa/inet.h>.            */
/* sd is an open UDP socket; panic() is assumed to be an error helper. */
/************************************************************/
/*** Join a multicast group                               ***/
/************************************************************/
const char *GroupID = "224.0.0.10";
struct ip_mreq mreq;
if ( inet_aton(GroupID, &mreq.imr_multiaddr) == 0 )
    panic("address (%s) bad", GroupID);
mreq.imr_interface.s_addr = htonl(INADDR_ANY);
/* SOL_IP is Linux-specific; IPPROTO_IP is the portable level name */
if ( setsockopt(sd, SOL_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq)) != 0 )
    panic("Join multicast failed");
/************************************************************/
/*** Drop a multicast group                               ***/
/************************************************************/
if ( setsockopt(sd, SOL_IP, IP_DROP_MEMBERSHIP, &mreq, sizeof(mreq)) != 0 )
    panic("Drop multicast failed");
IP Multicast Architecture
Hosts
Routers
Service model
Host-to-router protocol (IGMP)
Multicast routing protocols
Multicast Routing
Basic objective: build a distribution tree for multicast packets
The leaves of the distribution tree are the subnets containing at least one group member (detected by IGMP)
Multicast service model makes it hard
Anonymity
Dynamic join/leave
Simple Multicast Routing Techniques
Flood and prune
Begin by flooding traffic to entire network
Prune branches with no receivers
Examples: DVMRP, PIM-DM
Unwanted state where there are no receivers
Link-state multicast protocols
Routers advertise groups for which they have receivers to entire network
Compute trees on demand
Example: MOSPF
Unwanted state where there are no senders
How to Flood Efficiently ?
A router forwards a packet from source (S) iff it arrives via the shortest path from the router back to S
Reverse path check!
Packet is replicated out all but the incoming interface
Reverse shortest paths are easy to compute
Just use the info in the DV routing tables
DV gives shortest reverse paths
Efficient if costs are symmetric
[Figure: source S and routers x, y, z, t. Forward packets that arrive on the shortest path from t to S (assume symmetric routes)]
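The reverse path check amounts to one table lookup per packet. The following is a minimal sketch under stated assumptions, not the code of any routing daemon: the class ReversePathCheck, its methods, and the interface names are hypothetical, and the unicast routing table is reduced to a map from source to the interface this router would use to reach that source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ReversePathCheck {
    // Hypothetical unicast table: source -> interface used to reach that source.
    private final Map<String, String> ifaceToward = new HashMap<>();
    private final List<String> interfaces;

    public ReversePathCheck(List<String> interfaces) {
        this.interfaces = interfaces;
    }

    public void addRoute(String source, String iface) {
        ifaceToward.put(source, iface);
    }

    /** Accept a packet only if it arrived on the interface that leads back to its source. */
    public boolean accept(String source, String arrivalIface) {
        return arrivalIface.equals(ifaceToward.get(source));
    }

    /** Interfaces to replicate on: none if the check fails, else all but the incoming one. */
    public List<String> forward(String source, String arrivalIface) {
        List<String> out = new ArrayList<>();
        if (!accept(source, arrivalIface)) {
            return out; // failed the reverse path check: drop the packet
        }
        for (String iface : interfaces) {
            if (!iface.equals(arrivalIface)) {
                out.add(iface);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        ReversePathCheck router = new ReversePathCheck(Arrays.asList("eth0", "eth1", "eth2"));
        router.addRoute("S", "eth0"); // shortest path back to S leaves via eth0
        System.out.println(router.forward("S", "eth0")); // [eth1, eth2]
        System.out.println(router.forward("S", "eth1")); // []
    }
}
```

The check relies on symmetric costs: the interface used to reach S is assumed to be the one on which the shortest path from S arrives.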
Problem
Flooding can cause a given packet to be sent multiple times over the same link: can filter better than this!
Solution: Reverse Path Broadcasting
[Figure: source S, routers x, y, z, and links a and b; a duplicate packet is marked]
Reverse Path Broadcasting (RPB)
Basic idea: forward a packet from S only on child links for S
Child link of router x for source S: link that has x as parent on the shortest path from the link to S
[Figure: source S, routers x, y, z, and links a and b, with costs 5 and 6; the child link of x for S is marked, and x forwards only to the child link]
How to Find Child Links?
Routing updates!
If z tells x that it can reach S through y,
and if the cost of this path is >= the cost of the path from z to S through x,
then x knows that the link to z is a child link
In case of tie, lower address wins
[Figure: source S, routers x, y, z, and links a and b, with costs 5 and 6; the child link of x for S is marked, and x forwards only to the child link]
Truncated RPB
This is still a broadcast algorithm: the traffic goes everywhere, lousy filtering!
First-order solution: Truncated RPB
Don't forward traffic onto networks with no receivers
Identify leaves
Leaf links are the child links that no other router uses to reach source S
Use periodic updates of the form: "this is my next-link to source S"
If a child link is not the next-link for anyone, it is a leaf
Detect group membership in leaf (IGMP)
Reverse Path Multicast (RPM)
Prune back transmission so that only absolutely necessary links carry traffic
Use on-demand pruning so that router group state scales with number of active groups
[Figure: source S, routers x, y, t, v, and links a and b; the legend distinguishes data messages from prune messages]
Basic RPM Idea
Prune (Source,Group) at leaf if no members
Send Non-Membership Report (NMR) up the tree
If all children of router R prune (S,G)
Propagate prune for (S,G) to the parent of R
On timeout:
Prune dropped
Flow is reinstated
Downstream routers re-prune
Note: this is a soft-state approach
Grafting: Explicitly reinstate sub-tree when IGMP detects new members at leaf, or when a child asks for a graft.
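The bottom-up prune rule can be sketched as a recursive walk over the distribution tree. This is an illustrative model only, not a DVMRP implementation: the class PruneTree, its methods, and the router names are hypothetical, the tree is given as a parent-to-children map, and timeouts and grafts are left out.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class PruneTree {
    /**
     * Returns the routers that end up pruned for (S,G): a router is pruned if
     * it has no local members and all of its children are pruned. The root is
     * included only if nothing downstream needs the traffic.
     */
    public static Set<String> pruned(Map<String, List<String>> children,
                                     Set<String> hasMembers, String root) {
        Set<String> result = new HashSet<>();
        needsTraffic(root, children, hasMembers, result);
        return result;
    }

    // Returns true if router r must keep receiving (S,G) traffic.
    private static boolean needsTraffic(String r, Map<String, List<String>> children,
                                        Set<String> hasMembers, Set<String> pruned) {
        boolean needed = hasMembers.contains(r);
        for (String c : children.getOrDefault(r, Collections.<String>emptyList())) {
            // Visit every child, even if traffic is already known to be needed,
            // so that each subtree records its own prunes.
            if (needsTraffic(c, children, hasMembers, pruned)) {
                needed = true;
            }
        }
        if (!needed) {
            pruned.add(r); // r sends a prune for (S,G) to its parent
        }
        return needed;
    }

    public static void main(String[] args) {
        Map<String, List<String>> children = new HashMap<>();
        children.put("S", Arrays.asList("x", "y"));
        children.put("x", Arrays.asList("a", "b"));
        children.put("y", Arrays.asList("c"));
        Set<String> members = new HashSet<>(Arrays.asList("a"));
        // b and c have no members; y is pruned because its only child pruned.
        System.out.println(pruned(children, members, "S"));
    }
}
```

In the example, x stays on the tree because its child a still has members, while the prune from c propagates through y.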
Putting it together: Topology
[Figure: example topology with source S and three group members G]
Flood with Truncated Broadcast
[Figure: the example topology with source S and three group members G, showing traffic flooded with truncated broadcast]
Pruning
[Figure: the example topology with source S and three group members G; two Prune (s,g) messages are sent]
Grafting
[Figure: the example topology with source S and a new, fourth group member G; the new member triggers Report (g), followed by two Graft (s,g) messages]
After Grafting Complete
[Figure: the example topology with source S and four group members G after grafting is complete]
Reliable Multicast: The Goal
Implement reliability on top of IP multicast
Why is this hard?
Sender cannot keep state for an unknown number of dynamic receivers
Remember the open & dynamic group semantics?
Algorithms like TCP that estimate path properties such as RTT and congestion window don't generalize to trees.
Remember: TCP is only for a unicast session!
Has to address (N)ACK implosions
Implosion
Packet 1 is lost
All 4 receivers request a resend
[Figure: two panels with sender S and receivers R1, R2, R3, R4. Left: S sends packets 1 and 2. Right: resend requests travel back to S.]
Retransmission
Re-transmitter
Options: sender, other receivers
How to retransmit
Unicast, multicast, scoped multicast, retransmission group, ...
Problem: retransmissions (a.k.a. repairs) may reach destinations that don't require a retransmission
A.k.a. the exposure problem
Solution: subcast the retransmission only to receivers that need it.
Why Subcast? Exposure problem
Packet 1 does not reach R1; Receiver 1 requests a resend
Packet 1 resent to all 4 receivers
[Figure: two panels with sender S and receivers R1, R2, R3, R4. Left: S sends packets 1 and 2 and a resend request is issued. Right: the resent packet 1 is delivered.]
Ideal Recovery Model
Packet 1 reaches R1 but is lost before reaching other receivers
Only one receiver sends a NACK to the nearest S or R with the packet
Repair sent only to those that need the packet
[Figure: two panels with sender S and receivers R1, R2, R3, R4, showing the resend request and the resent packet 1]
Reliable Multicast Transport: Issues
Retransmission can make reliable multicast as inefficient as replicated unicast
(N)ACK-implosion if all destinations ack at once
Crying baby: a bad link affects entire group
Heterogeneity: receivers, links, group sizes
Anonymous/Open/Dynamic Group Model:
Source does not know # of destinations, and destinations may vanish
Multicast applications do not need strong reliability of the type provided by TCP.
Can tolerate some reordering, delay, etc.