| Group name | 05_Dijkstra |
| --- | --- |
| Member 1 | Lukas Heimes |
| Member 2 | Dan Kluser |
| Member 3 | Patrick Ziegler |
We use a centralized controller that computes paths by solving a series of multi-commodity flow (MCF) problems. The user selects a subset of the SLAs, which gives rise to a set of (base-traffic) flows that must be considered. The simulation is divided into fixed-size intervals of a few seconds each. For each interval, the controller constructs an MCF problem consisting of the relevant base-traffic flows. This global view allows us to approximate an optimal solution. The MCF problems are solved by means of Linear Programming (LP); the solutions are then converted into paths and installed on the switches.
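To make the interval handling concrete, here is a small, self-contained sketch of how the 60-second simulation could be cut into fixed-size intervals and how the flows for each interval could be selected. The helper names and the example flows are illustrative only and do not reflect the actual code in `controller.py` or `flow_manager.py`:

```python
# Minimal sketch of per-interval flow selection; names are illustrative only.
TOTAL_TIME = 60        # seconds, mirrors the total_time parameter
MCF_INTERVAL_SIZE = 5  # seconds, mirrors the mcf_interval_size parameter

def intervals(total_time=TOTAL_TIME, size=MCF_INTERVAL_SIZE):
    """Split [0, total_time) into fixed-size intervals."""
    return [(start, min(start + size, total_time))
            for start in range(0, total_time, size)]

def flows_in_interval(flows, start, end):
    """A flow is relevant for an interval if its active time overlaps it."""
    return [f for f in flows if f["start"] < end and f["end"] > start]

base_traffic = [
    {"id": "udp_h1_h2", "start": 0, "end": 12},
    {"id": "tcp_h3_h4", "start": 7, "end": 30},
]
for start, end in intervals():
    active = flows_in_interval(base_traffic, start, end)
    # Here, one MCF problem would be constructed from `active` and solved.
    print((start, end), [f["id"] for f in active])
```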
In the forwarding plane, paths are added to the packets at the ingress switches in the form of an MPLS header stack. Forwarding is based on these labels, except for the last hop, which is IPv4 based. Packets that do not belong to an installed path are dropped at the ingress switch.
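As a rough illustration of the forwarding scheme, the following sketch shows how a computed path could be turned into an MPLS label stack at the ingress. It assumes that each label encodes the egress port of the corresponding switch; this assumption and all names are ours, and the actual label semantics and table layout live in `table_manager.py` and `switch.p4` and may differ:

```python
# Illustrative only: derive an MPLS label stack from a computed path.
# port_map[(sw, next_hop)] gives the egress port on switch `sw` towards
# `next_hop`; the assumption that labels are egress ports is ours.
def path_to_label_stack(path, port_map):
    """path: [ingress_sw, ..., egress_sw, dst_host].
    Returns the labels pushed by the ingress switch."""
    labels = [port_map[(sw, nxt)] for sw, nxt in zip(path, path[1:])]
    # The last hop (egress switch -> host) is forwarded based on IPv4,
    # so no label is needed for it.
    return labels[:-1]

port_map = {("s1", "s2"): 2, ("s2", "s3"): 3, ("s3", "h2"): 1}
print(path_to_label_stack(["s1", "s2", "s3", "h2"], port_map))  # [2, 3]
```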
The above approach is complemented by the following techniques:
- Failure detection: Link state changes are detected using heartbeats and trigger a recomputation of the MCF solution.
- Additional Traffic: If enabled, the switches detect additional traffic and forward it to the controller, which can then decide to install an appropriate path on the switch if the residual graph has sufficient capacity.
- Load balancing: When the MCF solution distributes a flow among multiple paths, the ingress switch chooses a path uniformly at random at flowlet (TCP) or packet (UDP) granularity.
Individual contributions:

- Implemented failure detection/handling and flowlet-like load balancing. Contributed to the MCF approach and detected numerous bugs through careful code reviews.
- Devised the MCF approach, formulated the Linear Program, and implemented the interval handling. Really, the heart of our solution.
- Implemented MPLS forwarding as well as the detection and integration of additional traffic flows. Refactored the controller for modularity and extensibility.
```
.
|-- controllers
|   |-- parameters.py // Definition of configurable parameters
|   |-- controller.py // Main file for the centralized controller
|   |-- flow_manager.py // Computes paths using mcf.py, used by controller.py
|   |-- table_manager.py // Installs paths on the switches, used by controller.py
|   |-- heartbeat_generator.py // Generates heartbeats for failure detection; copied from Ex. 7
|   |-- mcf.py // Encodes the MCF problem as an LP, solves it, and transforms the solution into paths
|   |-- commodity.py // Class definition for a commodity, used by mcf.py
|   |-- flow_endpoint.py // Class definition for a flow endpoint, used by mcf.py
|   |-- flow.py // Class definition for a flow in the network
|   |-- graph.py // Data structure for representing a network graph
|   |-- edge.py // Edge class used in graph.py
|   |-- node.py // Node class used in graph.py
|   `-- test_mcf.py // Test cases for mcf.py
|-- p4src
|   |-- include
|   |   |-- headers.p4 // Packet headers, metadata, and constant definitions
|   |   `-- parsers.p4 // Parser and deparser for supported headers
|   `-- switch.p4 // Main dataplane file, used by all switches
`-- README.md
```
We start by defining what a multi-commodity flow (MCF) problem on a directed graph $G = (V, E)$ is.

For each edge $(u,v) \in E$, we are given a capacity $c(u,v)$ and a cost $a(u,v)$ (in our case the link delay, or the hop count if `use_num_hops_cost` is enabled).

Furthermore, we have a set of commodities $K$. Each commodity $i \in K$ consists of a source node $s_i$, a sink node $t_i$, and a demand $d_i$.

So intuitively, having a commodity $i$ means that we want to send $d_i$ units of flow from $s_i$ to $t_i$.

This intuition translates to the following constrained optimization problem to minimize the total cost for all flows:

Objective:

$\min \sum_{i \in K} \sum_{(u,v) \in E} a(u,v) \cdot f_i(u,v)$

where $f_i(u,v) \geq 0$ denotes the amount of flow of commodity $i$ sent over edge $(u,v)$.
Constraints:

- Link capacity: $\forall (u,v) \in E: \sum_{i \in K} f_i(u,v) \leq c(u,v)$
- Flow conservation (transit nodes): $\forall i \in K, \forall u \in V \setminus \{s_i, t_i\}: \sum_{w\in V} f_i(u,w) - \sum_{w \in V} f_i(w,u) = 0$
- Flow conservation (source node): $\forall i \in K: \sum_{w\in V} f_i(s_i,w) - \sum_{w \in V} f_i(w,s_i) = d_i$
- Flow conservation (sink node): $\forall i \in K: \sum_{w\in V} f_i(t_i,w) - \sum_{w \in V} f_i(w,t_i) = -d_i$
If we constrain our solutions to have a single flow per commodity, i.e., require each commodity to be routed along a single path, the problem becomes NP-hard. We therefore allow a commodity's demand to be split across multiple paths (fractional flows).

The above constraints translate in a straightforward manner to a linear program: simply introduce a non-negative variable for each $f_i(u,v)$, i.e., one variable per commodity and edge.

The number of variables and constraints is polynomial in the size of the graph and the number of commodities ($O(|K| \cdot |E|)$ variables and $O(|E| + |K| \cdot |V|)$ constraints), so the LP can be solved efficiently.
So far, the LP is feasible only if all of the commodities' demands can be satisfied. Since we are also interested in solutions that satisfy only a part of our requirements, we introduce slack variables (edges) into our LP (as in [1]).

For every commodity, we add an edge from its source $s_i$ directly to its sink $t_i$ with infinite capacity and a very high cost. Demand that cannot be routed through the actual network is then carried by this slack edge instead of rendering the LP infeasible, and the high cost ensures that the slack edge is only used when no real capacity is left.
[1]: Suri S., Waldvogel M., Warkhede P.R. (2001) Profile-Based Routing: A New Framework for MPLS Traffic Engineering.
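For illustration, here is a small, self-contained version of this LP for a toy topology. It uses the PuLP library, which is an assumption on our side and not necessarily the solver interface used in `mcf.py`; the topology, demands, and slack cost are made up, and the slack edges are modeled as per-commodity slack variables for brevity:

```python
import pulp

# Toy directed graph: edge -> (capacity, cost)
edges = {
    ("A", "B"): (10, 1),
    ("B", "C"): (10, 1),
    ("A", "C"): (5, 3),
}
nodes = {"A", "B", "C"}

# Commodities: (source, sink, demand)
commodities = [("A", "C", 8), ("A", "C", 4)]

prob = pulp.LpProblem("toy_mcf", pulp.LpMinimize)

# One non-negative flow variable per commodity and edge: f[i][(u, v)]
f = {
    i: {e: pulp.LpVariable(f"f_{i}_{e[0]}_{e[1]}", lowBound=0) for e in edges}
    for i in range(len(commodities))
}
# Slack variable per commodity: demand that cannot be routed, heavily penalized.
slack = {i: pulp.LpVariable(f"slack_{i}", lowBound=0) for i in range(len(commodities))}
SLACK_COST = 1000

# Objective: total cost of all flows plus the penalty for unrouted demand.
prob += (
    pulp.lpSum(cost * f[i][e] for i in f for e, (_, cost) in edges.items())
    + SLACK_COST * pulp.lpSum(slack.values())
)

# Link capacity constraints.
for e, (cap, _) in edges.items():
    prob += pulp.lpSum(f[i][e] for i in f) <= cap

# Flow conservation, with the slack variable absorbing unsatisfied demand.
for i, (src, dst, demand) in enumerate(commodities):
    for u in nodes:
        outflow = pulp.lpSum(f[i][e] for e in edges if e[0] == u)
        inflow = pulp.lpSum(f[i][e] for e in edges if e[1] == u)
        if u == src:
            prob += outflow - inflow + slack[i] == demand
        elif u == dst:
            prob += outflow - inflow - slack[i] == -demand
        else:
            prob += outflow - inflow == 0

prob.solve()
for i in f:
    routed = {e: f[i][e].value() for e in edges}
    print(f"commodity {i}: {routed}, unrouted: {slack[i].value()}")
```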
We start with the graph of the network topology. Consider a TCP base-traffic flow `shost:sport -> dhost:dport` with demand $d$.

For both flow endpoints (`shost:sport:TCP`) and (`dhost:dport:TCP`), we add a new node to our graph, each connected to the respective host node (`shost` and `dhost`) with infinite capacity and zero cost edges.

We then need two commodities: one from `shost:sport:TCP` to `dhost:dport:TCP` with demand $d$ for the data direction, and one in the reverse direction for the acknowledgements.

Note that we reduce the bandwidth of the reverse flow by the `tcp_ack_bw_multiplier` parameter. The parameter defaults to 0.5; a lower value saves bandwidth, but risks losing ACKs and thereby triggering unnecessary retransmissions.

Note that for UDP flows we do exactly the same thing, but without the reverse-direction commodity.
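The following sketch shows this mapping from base-traffic flows to commodities. The class names are hypothetical and do not correspond to the actual classes in `flow.py`, `flow_endpoint.py`, or `commodity.py`:

```python
# Illustrative sketch only: turning one base-traffic flow into commodities.
from dataclasses import dataclass

TCP_ACK_BW_MULTIPLIER = 0.5  # default of the tcp_ack_bw_multiplier parameter

@dataclass(frozen=True)
class FlowEndpoint:
    host: str
    port: int
    protocol: str  # "tcp" or "udp"

@dataclass
class Commodity:
    source: FlowEndpoint
    sink: FlowEndpoint
    demand_mbps: float

def tcp_flow_to_commodities(src: FlowEndpoint, dst: FlowEndpoint, rate_mbps: float):
    """Forward commodity for the data direction, reverse commodity for the ACKs."""
    return [
        Commodity(src, dst, rate_mbps),
        Commodity(dst, src, rate_mbps * TCP_ACK_BW_MULTIPLIER),
    ]

def udp_flow_to_commodities(src: FlowEndpoint, dst: FlowEndpoint, rate_mbps: float):
    """UDP gets no reverse commodity."""
    return [Commodity(src, dst, rate_mbps)]

if __name__ == "__main__":
    a = FlowEndpoint("h1", 5001, "tcp")
    b = FlowEndpoint("h2", 5001, "tcp")
    print(tcp_flow_to_commodities(a, b, 8.0))
```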
We may directly add waypointing as a further constraint to our LP. Let us assume we want to waypoint some commodity $i$ (source $s_i$, sink $t_i$, demand $d_i$) via a waypoint node $w$.

The requirement is simple: every flow in our solution for commodity $i$ must pass through $w$.

For this we split the commodity into two sub-commodities: one from $s_i$ to $w$ and one from $w$ to $t_i$, both with demand $d_i$.

To prevent a situation where we have a flow from $s_i$ to $w$ without a matching flow continuing from $w$ to $t_i$ (or vice versa), we couple the two sub-commodities with an additional constraint.

Waypointing constraint: the amount of flow routed for the first sub-commodity must equal the amount of flow routed for the second sub-commodity.

This ensures that in every feasible solution, if we have a flow of $x$ units from $s_i$ to the waypoint, then $x$ units also continue from the waypoint to $t_i$, so all traffic of commodity $i$ that is routed at all traverses the waypoint.
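A tiny sketch of the splitting step follows; the names are hypothetical and the coupling of the two halves is only indicated in a comment, since its exact formulation depends on the LP encoding in `mcf.py`:

```python
# Illustrative only: split a waypointed commodity (source, sink, demand)
# into two sub-commodities that force routed traffic through the waypoint.
def split_waypointed_commodity(source, sink, demand, waypoint):
    first = (source, waypoint, demand)   # s_i -> w
    second = (waypoint, sink, demand)    # w -> t_i
    return first, second

# In the LP, the two sub-commodities are additionally coupled so that the
# amount routed for the first half equals the amount routed for the second
# half (e.g., by tying their slack variables together).
print(split_waypointed_commodity("h1:5001:TCP", "h2:5001:TCP", 8.0, "s5"))
```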
Our solution might return fractional flows. This means that for some flows we might have multiple paths, each carrying a different fraction of the flow. We load balance across those paths by selecting one uniformly at random at the ingress switch. This selection is performed at flowlet granularity for TCP flows and at packet granularity for UDP flows.
While this might not respect the exact fractional split derived by our LP, experimental evidence suggests that this is not a big problem.
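The following is a simplified, controller-side model of this selection logic with made-up names; the real implementation lives in the data plane in `switch.p4`:

```python
# Simplified model of random path selection at flowlet/packet granularity.
import random
import time

FLOWLET_TIMEOUT = 0.2  # seconds, mirrors FLOWLET_TIMEOUT in switch.p4

_last_seen = {}      # flow 5-tuple -> timestamp of the last packet
_flowlet_path = {}   # flow 5-tuple -> index of the currently selected path

def select_path(five_tuple, paths, is_tcp):
    """Pick one of the candidate paths uniformly at random.

    TCP: re-pick only when a new flowlet starts (gap > FLOWLET_TIMEOUT),
    so packets within a flowlet stay on one path and reordering is limited.
    UDP: re-pick for every packet.
    """
    now = time.time()
    if not is_tcp:
        return random.choice(paths)
    gap = now - _last_seen.get(five_tuple, 0.0)
    _last_seen[five_tuple] = now
    if gap > FLOWLET_TIMEOUT or five_tuple not in _flowlet_path:
        _flowlet_path[five_tuple] = random.randrange(len(paths))
    return paths[_flowlet_path[five_tuple]]

paths = [["s1", "s2", "s4"], ["s1", "s3", "s4"]]
print(select_path(("h1", 34567, "h2", 5001, "tcp"), paths, is_tcp=True))
```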
Switches drop all packets for which no explicit paths are installed. This includes any flows from the additional traffic because they are not known beforehand.
If detection of additional traffic is enabled, the switch will detect UDP packets on ports above 60000 and send them to the controller. The controller then solves an MCF problem on an approximate residual capacity graph (see section below) using all known additional traffic flows and link failures. As with the base traffic, solutions are converted into paths and installed on the switches.
As soon as the paths are installed, the corresponding additional traffic packets are treated like any other traffic, i.e., the ingress switch adds the installed MPLS header stack and the packet is forwarded according to these labels.
Any packets arriving in the interval between the controller receiving the first packet and the new paths being installed are also sent to the controller. The controller can optionally add the appropriate MPLS headers in the control plane and send the packets back to the switch so that they are not lost.
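Purely as an illustration of this decision logic, a controller-side handler might look roughly as follows; the function and action names are placeholders and not the actual `controller.py` API:

```python
# Illustrative decision logic for a packet that a switch sent to the controller.
ADDITIONAL_TRAFFIC_PORT = 60000

known_flows = set()  # additional-traffic flows for which paths were requested

def handle_packet_in(proto, dport, flow_id, controller_forward_mpls=True):
    """Returns the list of actions the controller would take for this packet."""
    actions = []
    if proto != "udp" or dport <= ADDITIONAL_TRAFFIC_PORT:
        return actions  # not additional traffic; nothing to do
    if flow_id not in known_flows:
        known_flows.add(flow_id)
        # Solve an MCF on the residual graph for all known additional flows
        # and install the resulting paths on the switches.
        actions.append("compute_and_install_paths")
    if controller_forward_mpls:
        # Until the paths are installed, packets keep arriving here; push the
        # MPLS stack in the control plane and send the packet back.
        actions.append("forward_with_mpls")
    return actions

print(handle_packet_in("udp", 60123, ("h1", 60123, "h2", 60123, "udp")))
```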
The MCF for all additional traffic flows is solved on an approximation of the residual graph after subtracting the base-traffic bandwidth demands. The residual graph is approximated as follows:
- The bandwidth of each base-traffic flow is normalized over the entire 60-second simulation.
- A single MCF for all normalized base-traffic flows is solved.
- The bandwidth requirements of all resulting paths are subtracted from the edge capacities in the base graph.
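The last step could look roughly like the following sketch (a hypothetical helper, not the actual `mcf.py` code):

```python
# Approximate the residual graph: subtract the bandwidth of the base-traffic
# paths from the original edge capacities.
def residual_capacities(edge_capacities, base_traffic_paths):
    """edge_capacities: {(u, v): Mbps}
    base_traffic_paths: list of (path, rate_mbps) with path = [n0, n1, ...]."""
    residual = dict(edge_capacities)
    for path, rate in base_traffic_paths:
        for u, v in zip(path, path[1:]):
            residual[(u, v)] = max(0.0, residual[(u, v)] - rate)
    return residual

caps = {("s1", "s2"): 10.0, ("s2", "s3"): 10.0}
paths = [(["s1", "s2", "s3"], 4.0)]
print(residual_capacities(caps, paths))  # {('s1','s2'): 6.0, ('s2','s3'): 6.0}
```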
Each detected additional traffic flow is assigned a bandwidth requirement of 10 Mbps (configurable), because the effective rate is not known and could be as high as 10 Mbps in the worst case.
Since we also don't know when an additional traffic flow ends, we periodically purge all detected additional flows and their associated table entries in the switch.
If some additional traffic flow is still active, the switch will send the next packet to the controller again and the controller will compute new paths.
Having table entries for stale paths is no big deal, but stale flows in the MCF use up capacity which could otherwise be used for flows that are still active.
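A self-contained sketch of this purging scheme (hypothetical names, not the actual `controller.py` code) is shown below; purging of the corresponding switch table entries is omitted:

```python
# Periodic purging of detected additional-traffic flows.
import time

ADDITIONAL_TRAFFIC_PURGE_INTERVAL = 5.0  # seconds, see the parameter below

class AdditionalFlowRegistry:
    def __init__(self):
        self.flows = set()          # detected additional-traffic flows
        self.last_purge = time.time()

    def register(self, flow):
        """Returns True if the flow is new, i.e., paths must be computed."""
        self.maybe_purge()
        if flow in self.flows:
            return False
        self.flows.add(flow)
        return True

    def maybe_purge(self):
        """Forget all additional flows periodically. Flows that are still
        active will be re-detected by the switch and re-registered on their
        next packet; stale flows stop occupying capacity in the MCF."""
        now = time.time()
        if now - self.last_purge >= ADDITIONAL_TRAFFIC_PURGE_INTERVAL:
            self.flows.clear()
            self.last_purge = now
```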
In the code version we hand in, additional traffic detection is disabled because we find this to be the more promising option for the leaderboard :-) To enable the feature, you need to do three things:

- In `controller.py`, select at least one of the SLAs for the additional traffic, e.g., `prr_31`.
- In the same file, enable purging (if desired) by setting the parameter `additional_traffic_purge=True`.
- In `switch.p4`, comment out the line `#define DO_ADDITIONAL 0`.
The user can select a set of SLAs that the controller should try to satisfy. This is done by including the names of the selected SLAs in the `slas` parameter in `controller.py`.

The controller then includes, in its MCF computation, all base-traffic flows to which a selected SLA applies. Note, however, the following limitations of SLA selection:

- The type of the SLA is not considered (except for waypointing). For example, it does not matter whether it is a flow completion time or a packet reception rate SLA. An SLA only determines which base-traffic flows are added to the MCF, and the MCF will try its best to allocate bandwidth for these flows without any formal guarantees on delay or packet reception rate.
- As such, it is not possible to consider SLAs only up to a particular target value. For example, including the SLA `prr_26` is equivalent to including any other SLA with the same protocol and port ranges (like `prr_27` or `prr_28`).
Our controller exposes a multitude of configuration options. While we set sensible defaults, different choices lead to different tradeoffs, which may be preferred depending on the situation.
We give a brief overview of the parameters (and their default values) here, with a focus on the tradeoffs. Please also refer to the description of the parameters in `controllers/parameters.py`.
p4src/switch.p4:
FLOWLET_TIMEOUT=48w200000 (microseconds)
Threshold when a new TCP flowlet starts. Lower value might allow for faster reaction to congestion,
but also leads to more reordering within TCP flows.
FAILURE_THRESHOLD=48w500000 (microseconds)
Threshold after which a link is considered failed if no heartbeats are received. Lower value leads
to faster detection of failures, but also to a higher false positive rate.
DO_ADDITIONAL=0 (boolean, 0 or 1)
Whether detected additional traffic should be reported to the controller.
controllers/controller.py:
total_time=60 (seconds)
The total duration of the simulation.
mcf_interval_size=5 (seconds)
The size of time intervals for which an MCF problem is solved. Lower value leads to a better
approximation of the bandwidth demands at a particular point in time, but increases the
computational cost at the beginning and for link state changes.
normalize_bw_across_time=False
Whether flows that are active only during a part of a time interval should be normalized as
if they spanned the entire interval. Enabling this might lead to less wasted bandwidth, but
may also increase congestion.
tcp_default_bw=10 (Mbps)
The bandwidth demand that is assumed for (size-based) TCP flows. Lower value might make it possible to
accommodate more flows, but generally leads to more congestion.
udp_cost_multiplier=1
A multiplier for the cost of UDP flows. Higher value prioritizes short paths for UDP flows
relative to TCP flows.
tcp_cost_multiplier=1
Same for TCP flows.
udp_bw_multiplier=1
A multiplier for the bandwidth demand of UDP flows. A value >1 means that some spare capacity
will be allocated for UDP flows, which reduces congestion but might waste bandwidth.
tcp_bw_multiplier=1
Same for TCP flows.
tcp_ack_bw_multiplier=0.5
Determines the bandwidth demand for the reverse direction of a TCP flow. Lower value saves
bandwidth but may lead to lost ACKs, which lead to unnecessary retransmissions.
use_num_hops_cost=False
Use the hop count instead of delay as the cost measure in the MCF problem. Enabling this generally
leads to shorter paths in terms of hop count, which reduces the number of switches involved, but
may increase the length of the path in terms of delay.
heartbeat_frequency=0.1 (seconds)
The time interval between heartbeat messages. Lower value (together with a lower value of
FAILURE_THRESHOLD in switch.p4) leads to faster failure detection, but increases the overhead in
the network.
tcp_duration_multiplier=1.5
A multiplier on the estimated duration of (size-based) TCP flows. Note that the same effect could
be achieved by changing the tcp_default_bw parameter.
additional_traffic_bw=10 (Mbps)
The assumed bandwidth demand of additional traffic flows. Lower value wastes less bandwidth for
small additional traffic flows, but may lead to congestion if the true demand is higher.
controller_forward_mpls=True
Whether the controller should add MPLS headers in the control plane and forward the packet to the
appropriate output port during the period where there are no paths installed for an additional
traffic flow. Disabling this decreases the load on the controller, but leads to more packet loss
for additional traffic flows.
additional_traffic_purge_interval=5 (seconds)
The interval at which additional traffic flow paths are purged from the switches. Higher value
decreases load on both the switches and controller, but leads to more wasted bandwidth.
additional_traffic_purge=False
Whether to purge installed additional traffic flows at a regular interval (specified by the
additional_traffic_purge_interval parameter above).
slas=[...] (list of strings)
The names of the SLAs that the controller should try to satisfy (see above).