Initial implementation for explicit resource allocations on the HPWREN network

Hans-Werner Braun, 15 July 2007

Assumptions

The need for an HPWREN resource allocation implementation is driven by the vastly diverse networking requirements of HPWREN applications, ranging from hundreds of gigabytes per day of astronomer bulk data to the continuous trickle traffic of individual sensors, where the data may only have a short usable half-life. In addition, both high priority traffic during emergency events, such as intelligence data related to firefighter Incident Command Posts at large wildfires, and deferable traffic, such as background flows from environmental video and still cameras, have to be supported.

This leads to a requirement for at least four groups of network traffic sets:

  high priority traffic (e.g., emergency/fire related data)
  priority bulk traffic (e.g., astronomer bulk data)
  standard traffic (everything not otherwise classified)
  deferable traffic (e.g., background environmental camera flows)

Another assumption is that it should be possible, including in an automated fashion across the network, to reallocate network resources in real time, for example to accommodate the day/night differences in astronomer traffic, or requirements arising from a sudden fire or other emergency. The implementation will utilize the Quality of Service (QoS) capabilities of the Cisco 3560/3550 routers that currently predominate in the network, based on Differentiated Services Code Point (DSCP) tags.
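
As an illustration, such a reallocation could be pushed to a router simply by rewriting the per-port SRR share weights described later in this document. The interface name and weight values below are hypothetical examples only, not part of the current configuration:

  # Hypothetical night-time reallocation on an (example) backbone port,
  #  shifting bandwidth share toward the priority bulk (astronomer) queue
    interface FastEthernet0/22
     srr-queue bandwidth share 10 5 15 70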

Concepts

A primary goal is to create an implementation without a requirement to hand off state between routers, outside of the IP packets themselves. The network will differentiate between access links to individual sites and backbone links between HPWREN nodes. All traffic on access links will be tagged with a DSCP value; that value will not change throughout the network, and outbound traffic on each link will be sorted by it. Since congestion happens on individual links and the system is stateless (e.g., another link downstream may be uncongested), there is no need to modify DSCP values on backbone links within the network, and the only decision on traffic outbound to backbone links is whether or not to drop a specific packet. Outbound interfaces on the backbone will utilize the router queuing system to enforce packet-drop policies.
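
In concrete terms, and as detailed in the sections below, this splits the configuration into an ingress marking policy on access ports and DSCP trust plus queue management on backbone ports, roughly:

  # Access port: classify incoming traffic and tag it with a DSCP value
    interface FastEthernet0/4
     service-policy input hpwren-access
  # Backbone port: trust existing DSCP tags and enforce queue weights
    interface FastEthernet0/22
     mls qos trust dscp
     srr-queue bandwidth share 15 5 30 50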

Cisco 3560 implementation

The Cisco implementation consists of tagging on access links, to define which of the four egress queues to use on the outbound interfaces on the HPWREN backbone, and queue management on those backbone link interfaces, to determine when a packet is to be dropped from its queue. QoS is enabled in all HPWREN routers with the

  # mls qos

command.
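
Whether QoS is actually active on a given router can be checked afterwards with the standard show command on these platforms:

  # Verify that QoS is enabled on the router
  # show mls qos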

The tagging and policing of packets on access links with DSCP values is based on the following DSCP and queue/threshold mapping table:


Columns are the three ToS bits (Drop.Throughput.Reliability), rows are the three IP precedence bits, and each cell shows the DSCP name (or "-" where no name is defined) followed by the decimal DSCP value; the precedence and ToS of a cell follow from its row and column.

  Prec\ToS  000        001    010       011    100       101    110      111
  000       default/0  -/1    -/2       -/3    -/4       -/5    -/6      -/7
  001       CS1/8      -/9    AF11/10   -/11   AF12/12   -/13   AF13/14  -/15
  010       CS2/16     -/17   AF21/18   -/19   AF22/20   -/21   AF23/22  -/23
  011       CS3/24     -/25   AF31/26   -/27   AF32/28   -/29   AF33/30  -/31
  100       CS4/32     -/33   AF41/34   -/35   AF42/36   -/37   AF43/38  -/39
  101       CS5/40     -/41   -/42      -/43   -/44      -/45   EF/46    -/47
  110       CS6/48     -/49   -/50      -/51   -/52      -/53   -/54     -/55
  111       CS7/56     -/57   -/58      -/59   -/60      -/61   -/62     -/63

Default queue/threshold mappings after MLS QoS is enabled:
  Ingress: Queue1-Threshold1  (all DSCP values except 40-47)
  Ingress: Queue2-Threshold1  (DSCP 40-47, the CS5 row)
  Egress:  Queue1-Threshold1  (DSCP 40-47)
  Egress:  Queue2-Threshold1  (DSCP 0-15)
  Egress:  Queue3-Threshold1  (DSCP 16-31)
  Egress:  Queue4-Threshold1  (DSCP 32-39 and 48-63)

Of those table entries, CS1 will be used for deferable traffic, CS2 for normal traffic, CS4 for priority bulk traffic, and CS5 for high priority traffic. That choice was made so the traffic maps to appropriate default queues and thresholds.

Tagging on access links

Two ingress queues each support two user-configurable drop thresholds, plus one non-configurable threshold preset to the queue-full state. After policing/marking, incoming packets are mapped to queue/threshold pairs based on CoS or DSCP values and subjected to Weighted Tail Drop (WTD) to determine whether a packet is inserted into the queue or dropped. Given the default configuration being used, all packets will be mapped to ingress queue 1, threshold 1, except for the CS5 row in the table above, which will be mapped to ingress queue 2, threshold 1. Shaped Round Robin (SRR) weights determine the dequeuing of packets from an ingress queue onto the router's internal ring, from which they are moved to the appropriate egress queue.
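
The DSCP-to-ingress-queue assignments and the ingress queue parameters in effect can be inspected with standard show commands on the 3560 platform:

  # Display the DSCP to ingress queue/threshold map
  # show mls qos maps dscp-input-q
  # Display ingress queue buffer, bandwidth, and threshold settings
  # show mls qos input-queue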

The initial HPWREN implementation leaves the ingress queue/threshold maps and buffer management at their default values, and only tags the packets with DSCP values during the marking phase. The following example is from a benchtest setup, used to test QoS performance with iperf streams on various TCP or UDP ports. A real configuration needs to replace the access-list entries with the appropriate traffic profiles.

  # Create access lists for the policy/class maps
  #  Fire/etc on CS5:
     access-list 140 permit tcp any any eq 7000
     access-list 140 permit udp any any eq 7000
  #  MPO on CS4:
     access-list 132 permit tcp any any eq 7001
     access-list 132 permit udp any any eq 7001
  #  Default on CS2:
     # undefined, i.e., all other traffic
  #  Deferable on CS1:
     access-list 108 permit tcp any any eq 7003
     access-list 108 permit udp any any eq 7003

  # Create class maps
  #  Fire/etc:
     class-map match-all 140-in
      match access-group 140
  #  MPO:
     class-map match-all 132-in
      match access-group 132
  #  Deferable:
     class-map match-all 108-in
      match access-group 108
  
  # Define policy map
    policy-map hpwren-access
  #  High Priority (Fire/etc.) --> default maps to Q1T1
     class 140-in
      set dscp cs5
  #  Priority bulk (MPO) --> default map to Q4T1:
     class 132-in
      set dscp cs4
  #  Deferable (video) --> default map to Q2T1:
     class 108-in
      set dscp cs1
  #  default (everything else) --> default map to Q3T1:
     class class-default
      set dscp cs2

  # Add policy map to (example) access port
    interface FastEthernet0/4
    service-policy input hpwren-access
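
Whether the marking takes effect can be spot-checked with per-interface QoS statistics (interface number from the example above):

  # Show the classification/marking policy
  # show policy-map hpwren-access
  # Show per-DSCP ingress/egress packet counts on the access port
  # show mls qos interface FastEthernet0/4 statistics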

To make this work across multiple HPWREN backbone routers, a downstream router will have to trust the DSCP values for traffic coming from the upstream router:

  # Add DSCP trust for (example) traffic port from an upstream router.
    interface FastEthernet0/22
    mls qos trust dscp
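
The trust state of a backbone-facing port can be confirmed with:

  # show mls qos interface FastEthernet0/22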

Queue management on backbone link interfaces

Two queue-set templates each define four egress queues, with four threshold parameters per queue (two WTD drop thresholds, a reserved threshold, and a maximum threshold). The queue-set to be used is selected in the interface configuration, like:

  # interface FastEthernet0/22
  #  queue-set 1

which is not displayed in "show conf", as using queue-set 1 is the default behavior. CoS or DSCP values can each be mapped to an egress queue and threshold identifier. This implementation uses the default maps, per the illustration in the table above.
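
The default DSCP-to-egress-queue map being relied upon can be displayed with:

  # Display the DSCP to egress queue/threshold map
  # show mls qos maps dscp-output-q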

An initial assumption was made to allocate resources based on the four classifications:

  deferable:     5 percent guaranteed
  standard:      30 percent guaranteed
  priority bulk: 50 percent guaranteed
  high priority: 15 percent guaranteed

which has been reflected in the buffer allocations, along with near-default threshold settings, in the router-global configuration:

  mls qos queue-set output 1 buffers 15 5 30 50

  # the commented-out lines are the same as their default values
  # mls qos queue-set output 1 threshold 1 100 100 50 400
  mls qos queue-set output 1 threshold 2 100 100 50 400
  # mls qos queue-set output 1 threshold 3 100 100 50 400
  # mls qos queue-set output 1 threshold 4 100 100 50 400

and the per-port SRR configuration on specific output ports to other backbone nodes:

  # Add SRR characteristics to (example) output port
  interface FastEthernet0/22
   srr-queue bandwidth share 15 5 30 50
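
The resulting buffer/threshold allocation of the queue-set, and the SRR weights in effect on a backbone port, can be verified with:

  # Buffer and threshold allocation of queue-set 1
  # show mls qos queue-set 1
  # Egress queueing (queue-set assignment and SRR weights) on the example port
  # show mls qos interface FastEthernet0/22 queueing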

Bench testing

Initial iperf testing with TCP created very confusing, and basically useless, results because of the systemic interactions across various components in the setup (e.g., a host TCP stack reacting to bandwidth limiters). A much more predictable way of testing is to use the clocked UDP streams that iperf supports. This was facilitated by limiting the router interconnection Ethernet interfaces to 10Mbps, to force a queue buildup. With two senders and one receiver across the bench test network, and the 10Mbps bandwidth limit between two routers in the path, the receiver runs two tasks:

  iperf -i 1 -s -p 7001 -u
  iperf -i 1 -s -p 7002 -u

With sender 1 using a 20Mbps stream:

  iperf -i 1 -t 9999 -c receiver_host -p 7001 -u -b 20m

and sender 2 also using a 20Mbps stream:

  iperf -i 1 -t 9999 -c receiver_host -p 7002 -u -b 20m

The received rates then show the effect of the configured queue weights: with only the port 7001 (CS4, priority bulk) and port 7002 (default, CS2) classes active, the 10Mbps bottleneck should be shared roughly in proportion to their 50 and 30 percent allocations.
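
The same approach could be extended to the other two classes from the benchtest access lists above (port 7000 for CS5 and port 7003 for CS1); a possible, here untested, variant:

  # Receiver:
  iperf -i 1 -s -p 7000 -u
  iperf -i 1 -s -p 7003 -u
  # Sender 1 (high priority, CS5):
  iperf -i 1 -t 9999 -c receiver_host -p 7000 -u -b 20m
  # Sender 2 (deferable, CS1):
  iperf -i 1 -t 9999 -c receiver_host -p 7003 -u -b 20m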

Acknowledgments

Gaurav Dhiman and Brian Dunne at UCSD were very helpful in researching the initial QoS system on which this summary is in large part based. In addition, Parker Foster of Cisco has been of great help.