* Updated the utils/run_exercise.py to allow exercises to customize host configuration from the topology.json file. Now hosts can `ping` each other in the basic exercise. Other Linux utilities should work as well (e.g. iperf).
  ```
  mininet> h1 ping h2
  PING 10.0.2.2 (10.0.2.2) 56(84) bytes of data.
  64 bytes from 10.0.2.2: icmp_seq=1 ttl=62 time=3.11 ms
  64 bytes from 10.0.2.2: icmp_seq=2 ttl=62 time=2.34 ms
  64 bytes from 10.0.2.2: icmp_seq=3 ttl=62 time=2.15 ms
  ^C
  --- 10.0.2.2 ping statistics ---
  3 packets transmitted, 3 received, 0% packet loss, time 2003ms
  rtt min/avg/max/mdev = 2.153/2.540/3.118/0.416 ms
  mininet> pingall
  *** Ping: testing ping reachability
  h1 -> h2 h3
  h2 -> h1 h3
  h3 -> h1 h2
  *** Results: 0% dropped (6/6 received)
  ```
  Only updated the basic exercise; still need to update the other exercises. Also updated root-bootstrap.sh because I was running into issues with the latest version of vagrant.
* Accidentally added the solution to the basic exercise in the previous commit. Undoing that here ...
* Updated the topology.json file and table entries for the basic_tunnel exercise.
* Updated P4Runtime exercise with new topology and table entries.
* Fixed MAC addresses in P4Runtime exercise. It is working now.
* Fixed MAC addresses in P4Runtime exercise starter code
* Updated ECN exercise to use new topology.json file. Updated the table entries / MAC addresses as well.
* Updated the topology.json file and table entries for the MRI exercise.
* Updated source_routing exercise with new topology file and verified correct functionality.
* Updated load_balance exercise with new topology.
* Moved basic exercise triangle topology into a separate folder
* Added new topology for the basic exercise: a single pod of a fat-tree.
* Updated Makefiles and run_exercise.py to allow exercises to configure each switch with a different P4 program. This is mainly for the firewall exercise.
* Updated Makefiles of project to work with new utils/Makefile
* Updated load_balance and p4runtime exercise Makefiles
* Initial commit of the firewall exercise, which is a simple stateful firewall that uses a bloom filter. Need to update README files
* Initial commit of the path_monitor exercise. It is working but still need to update the README and figure out what we want the tutorial attendees to implement.
* Updated README file in firewall exercise. Also removed the bits from the starter code that we want the tutorial attendees to implement
* Renamed path_monitor exercise to link_monitor
* Updated the README in the link_monitor exercise and removed the bits from the starter code that we want the tutorial attendees to implement.
* Updated README for the firewall exercise
* Adding pod-topo.png image to basic exercise
* Added firewall-topo.png image to firewall exercise
* Added link-monitor-topo.png to link_monitor exercise
* Updated README files to point to topology images
* Updated top-level README to point to new exercises.
* Fixed link for VM dependencies script in README
* Updated bmv2/pi/p4c commits
* Updated README files for exercises to fix some typos and added a note about the V1Model architecture.
* Added a note about food for thought in the link_monitor README
* Updated the firewall.p4 program to use two register arrays rather than a single one. This is to make the design more portable to high line rate devices which can only support a single access to each register array.
* Minor fix to firewall exercise to get rid of compiler warning.
* Updated comment in firewall exercise.
* Minor (typo) fixes in the firewall README
* More info in firewall exercise README step 2
* Updated firewall.p4 to reuse direction variable
* More testing steps, small fixes in firewall exercise README
* Added food for thought to firewall README
* Cosmetic fixes to firewall README
* Made a few updates to the basic exercise README and added more details to the link_monitor exercise README. Also added a command to install grip when provisioning the VM. This could be useful for rendering the markdown README files offline.
* Updated top level README so it can be merged into the master branch.
* Moved cmd to install grip from root-bootstrap to user-bootstrap

# Implementing Link Monitoring

## Introduction

The objective of this exercise is to write a P4 program that enables
a host to monitor the utilization of all links in the network. This
exercise builds upon the basic IPv4 forwarding exercise so be sure
to complete that one before attempting this one. Specifically, we
will modify the basic P4 program to process a source routed probe
packet such that it is able to pick up the egress link utilization
at each hop and deliver it to a host for monitoring purposes.

Our probe packet will contain the following three header types:
```
// Top-level probe header, indicates how many hops this probe
// packet has traversed so far.
header probe_t {
    bit<8> hop_cnt;
}

// The data added to the probe by each switch at each hop.
header probe_data_t {
    bit<1>    bos;
    bit<7>    swid;
    bit<8>    port;
    bit<32>   byte_cnt;
    time_t    last_time;
    time_t    cur_time;
}

// Indicates the egress port the switch should send this probe
// packet out of. There is one of these headers for each hop.
header probe_fwd_t {
    bit<8> egress_spec;
}
```

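
To make the wire layout of `probe_data_t` concrete, here is a minimal Python sketch that packs and unpacks one such header. It assumes `time_t` is a 48-bit unsigned field (this README does not state its width; 48 bits matches the V1Model timestamp fields). The helper names `pack_probe_data` and `unpack_probe_data` are hypothetical and are not part of the exercise scripts.

```python
import struct

# Assumption: time_t is 48 bits wide; this README does not state the width.
TIME_BYTES = 6


def pack_probe_data(bos, swid, port, byte_cnt, last_time, cur_time):
    """Serialize one probe_data_t header in network (big-endian) byte order."""
    # bit<1> bos and bit<7> swid share the first byte, bos in the top bit.
    first_byte = ((bos & 0x1) << 7) | (swid & 0x7F)
    return (
        struct.pack("!BBI", first_byte, port, byte_cnt)
        + last_time.to_bytes(TIME_BYTES, "big")
        + cur_time.to_bytes(TIME_BYTES, "big")
    )


def unpack_probe_data(raw):
    """Parse the bytes produced by pack_probe_data back into named fields."""
    first_byte, port, byte_cnt = struct.unpack("!BBI", raw[:6])
    return {
        "bos": first_byte >> 7,
        "swid": first_byte & 0x7F,
        "port": port,
        "byte_cnt": byte_cnt,
        "last_time": int.from_bytes(raw[6:6 + TIME_BYTES], "big"),
        "cur_time": int.from_bytes(raw[6 + TIME_BYTES:6 + 2 * TIME_BYTES], "big"),
    }
```

Under that assumption each `probe_data_t` header occupies 18 bytes: 1 (bos + swid) + 1 (port) + 4 (byte_cnt) + 6 + 6 (timestamps).
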
We will use the pod-topology for this exercise, which consists of
four hosts connected to four switches that are wired up as they
would be in a single pod of a fat tree topology.



In order to monitor the link utilization our switch will maintain
two register arrays:
* `byte_cnt_reg` - counts the number of bytes transmitted out of
  each port since the last probe packet was transmitted out of
  the port.
* `last_time_reg` - stores the last time that a probe packet was
  transmitted out of each port.

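
The intended behavior of the two register arrays can be modeled in plain Python. This is only a sketch of the semantics described above, not the P4 solution: the class and method names are hypothetical, and details such as whether the probe packet's own bytes are counted are deliberately left out.

```python
class PortMonitor:
    """Toy model of the per-port state a switch keeps for link monitoring."""

    def __init__(self, num_ports):
        # One entry per egress port, mirroring the two register arrays.
        self.byte_cnt_reg = [0] * num_ports
        self.last_time_reg = [0] * num_ports

    def on_data_packet(self, port, packet_length):
        # Every packet leaving the port adds its length to the counter.
        self.byte_cnt_reg[port] += packet_length

    def on_probe_packet(self, port, now):
        # A probe reads the state accumulated since the previous probe...
        byte_cnt = self.byte_cnt_reg[port]
        last_time = self.last_time_reg[port]
        # ...then resets the counter and records its own departure time.
        self.byte_cnt_reg[port] = 0
        self.last_time_reg[port] = now
        return byte_cnt, last_time, now
```

For example, after two data packets of 1000 and 500 bytes leave port 2, a probe leaving at time 100 would report `(1500, 0, 100)`, and the next probe at time 250 would report `(0, 100, 250)` if no traffic was sent in between.
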
Our P4 program will be written for the V1Model architecture implemented
on P4.org's bmv2 software switch. The architecture file for the V1Model
can be found at: /usr/local/share/p4c/p4include/v1model.p4. This file
describes the interfaces of the P4 programmable elements in the architecture,
the supported externs, as well as the architecture's standard metadata
fields. We encourage you to take a look at it.

> **Spoiler alert:** There is a reference solution in the `solution`
> sub-directory. Feel free to compare your implementation to the
> reference.

## Step 1: Run the (incomplete) starter code

The directory with this README contains a skeleton P4 program,
`link_monitor.p4`, which implements basic IPv4 forwarding, as well
as source routing of the probe packets. Your job will be to
extend this skeleton program to fill out the fields in the probe
packet.

Before that, let's compile and test the incomplete `link_monitor.p4`
program:

1. In your shell, run:
   ```bash
   make run
   ```
   This will:
   * compile `link_monitor.p4`, and
   * start the pod-topo in Mininet and configure all switches with
     the `link_monitor.p4` program + table entries, and
   * configure all hosts with the commands listed in
     [pod-topo/topology.json](./pod-topo/topology.json)

2. You should now see a Mininet command prompt. Open two terminals
   on `h1`:
   ```bash
   mininet> xterm h1 h1
   ```
3. In one of the xterms run the `send.py` script to start sending
   probe packets every second. Each of these probe packets takes the
   path indicated in link-monitor-topo.png.
   ```bash
   ./send.py
   ```
4. In the other terminal run the `receive.py` script to start
   receiving and parsing the probe packets. This allows us to monitor
   the link utilization within the network.
   ```bash
   ./receive.py
   ```
   The reported link utilization and the switch port numbers will
   always be 0 because the probe fields have not been filled out yet.

5. Run an iperf flow between h1 and h4:
   ```bash
   mininet> iperf h1 h4
   ```
6. Type `exit` to leave each xterm and the Mininet command line.
   Then, to stop mininet:
   ```bash
   make stop
   ```
   And to delete all pcaps, build files, and logs:
   ```bash
   make clean
   ```

The measured link utilizations will not agree with what iperf reports
because the probe packet fields have not been populated yet. Your
goal is to fill out the probe packet fields so that the two
measurements agree.

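
Once the fields are populated, a host can turn one `probe_data` record into a rate. The sketch below shows the arithmetic under the assumption that `last_time` and `cur_time` are in microseconds (this README does not state the unit); the function name is hypothetical, and this is not necessarily how `receive.py` computes its output.

```python
def link_utilization_mbps(byte_cnt, last_time_us, cur_time_us):
    """Average rate over the interval between two probes, in Mbit/s.

    Assumes both timestamps are in microseconds, so that
    bits per microsecond equals megabits per second.
    """
    elapsed_us = cur_time_us - last_time_us
    if elapsed_us <= 0:
        # No measurable interval (e.g. the fields are still all zero).
        return 0.0
    return 8.0 * byte_cnt / elapsed_us
```

With these units, 125000 bytes sent during a one-second interval (1000000 microseconds) corresponds to 1 Mbit/s.
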
### A note about the control plane

A P4 program defines a packet-processing pipeline, but the rules
within each table are inserted by the control plane. When a rule
matches a packet, its action is invoked with parameters supplied by
the control plane as part of the rule.

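
The split between pipeline and rules can be illustrated with a toy longest-prefix-match table in Python. This is a conceptual model only: the class, the action, and the rule contents below are hypothetical and are not taken from the exercise's runtime files.

```python
import ipaddress


class LpmTable:
    """Toy match-action table: the program defines it, the control plane fills it."""

    def __init__(self):
        self.rules = []

    def insert(self, prefix, action, params):
        # Control plane side: a rule is a match key, an action, and its parameters.
        self.rules.append((ipaddress.ip_network(prefix), action, params))

    def apply(self, dst_addr):
        # Data plane side: find the most specific matching rule and run its action.
        addr = ipaddress.ip_address(dst_addr)
        matches = [rule for rule in self.rules if addr in rule[0]]
        if not matches:
            return None  # table miss
        _, action, params = max(matches, key=lambda rule: rule[0].prefixlen)
        return action(**params)


def ipv4_forward(dst_mac, port):
    """Hypothetical action: parameters come from the rule, not from the packet."""
    return {"dst_mac": dst_mac, "egress_port": port}
```

For instance, after inserting a `/8` rule that forwards to port 1 and a `/32` rule for `10.0.2.2` that forwards to port 2, a lookup of `10.0.2.2` invokes the action with the `/32` rule's parameters.
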
In this exercise, we have already implemented the control plane
logic for you. As part of bringing up the Mininet instance, the
`make run` command will install packet-processing rules in the tables of
each switch. These are defined in the `sX-runtime.json` files, where
`X` corresponds to the switch number.

**Important:** We use P4Runtime to install the control plane rules. The
contents of the `sX-runtime.json` files refer to specific names of tables, keys, and
actions, as defined in the P4Info file produced by the compiler (look for the
file `build/link_monitor.p4.p4info.txt` after executing `make run`). Any
changes in the P4 program that add or rename tables, keys, or actions
will need to be reflected in these `sX-runtime.json` files.

## Step 2: Implement Link Monitoring Logic

The `link_monitor.p4` file contains a skeleton P4 program with key pieces of
logic replaced by `TODO` comments. Your implementation should follow
the structure given in this file---replace each `TODO` with logic
implementing the missing piece.

Here are a few more details about the design:

**Parser**
* The parser has been extended to support parsing of the source routed probe packets.
  The parser is the most complicated part of the design so spend a bit of time
  reading over it. Note that it does not contain any TODO comments so there is
  nothing you need to change here.
* To parse the probe packets, we use the `hdr.probe.hop_cnt` to determine how many
  hops the packet has traversed prior to reaching the switch. If this is the first
  hop then there will not be any `probe_data` in the packet so we skip that state
  and transition directly to the `parse_probe_fwd` state. In the `parse_probe_fwd`
  state, we use the `hdr.probe.hop_cnt` field to figure out which `egress_spec`
  header field to use to perform forwarding and we save that port value into a
  metadata field which is subsequently used to perform forwarding.

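
The parser's decisions described above can be traced with a small Python model. It is a sketch, not a translation of the P4 parser: the function name and the `parse_probe` / `parse_probe_data` state names are hypothetical labels, and it assumes that the `probe_fwd` header to use is the one at index `hop_cnt`.

```python
def parser_walk(hop_cnt, probe_fwd_ports):
    """Return the parser states visited and the egress port chosen for a probe.

    hop_cnt         -- value of hdr.probe.hop_cnt when the packet arrives
    probe_fwd_ports -- egress_spec values of the probe_fwd headers, in order
    """
    states = ["parse_probe"]
    if hop_cnt > 0:
        # Earlier hops have already added probe_data headers, so parse them.
        states.append("parse_probe_data")
    # On the first hop there is no probe_data, so we go straight here.
    states.append("parse_probe_fwd")
    # Assumption: hop number N uses the N-th probe_fwd header (0-indexed).
    egress_spec = probe_fwd_ports[hop_cnt]
    return states, egress_spec
```

For a probe carrying the port list `[4, 1, 2]`, the first switch (`hop_cnt == 0`) skips the `probe_data` state and forwards on port 4, while the third switch (`hop_cnt == 2`) parses the existing `probe_data` headers and forwards on port 2.
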
**Ingress Control**
* The ingress control block looks very similar to the `basic` exercise. The only
  difference is that the `apply` block contains another condition to forward probe
  packets using the `egress_spec` field extracted by the parser. It also increments
  the `hdr.probe.hop_cnt` field.

**Egress Control**
* This is where the interesting stateful processing occurs. It uses the
  `byte_cnt_reg` register to count the number of bytes that have passed through each
  port since the last probe packet passed through the port.
* It adds a new `probe_data` header to the packet and fills out the `bos`
  (bottom of stack) field, as well as the `swid` (switch ID) field.
* TODO: your job is to fill out the rest of the probe packet fields in order to
  ensure that you can properly measure link utilization.

**Deparser**
* Simply emits all headers in the correct order.
* Note that emitting a header stack will only emit the headers within the stack
  that are actually marked as valid.

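
The deparser's treatment of header stacks can be pictured with a short Python sketch. The `(valid, data)` tuple representation and the `emit` helper are hypothetical; they only mirror the rule that invalid headers contribute nothing to the outgoing packet.

```python
def emit(headers):
    """Concatenate the bytes of the valid headers, in order, skipping invalid ones.

    headers -- iterable of (valid, data) pairs, e.g. the slots of a header stack
    """
    return b"".join(data for valid, data in headers if valid)
```

A stack with three slots of which only the first and third are valid therefore produces just those two headers on the wire.
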
## Step 3: Run your solution

Follow the instructions from Step 1. This time, the measured link
utilizations should agree with what `iperf` reports.

### Troubleshooting

There are several problems that might manifest as you develop your program:

1. `link_monitor.p4` might fail to compile. In this case, `make run` will
   report the error emitted from the compiler and halt.

2. `link_monitor.p4` might compile but fail to support the control plane
   rules in the `s1-runtime.json` through `s4-runtime.json` files that
   `make run` tries to install using P4Runtime. In this case, `make run` will
   report errors if control plane rules cannot be installed. Use these error
   messages to fix your `link_monitor.p4` implementation.

3. `link_monitor.p4` might compile, and the control plane rules might be
   installed, but the switch might not process packets in the desired
   way. The `logs/sX.log` files contain detailed logs describing
   how each switch processes each packet. The output is detailed and can
   help pinpoint logic errors in your implementation.

#### Cleaning up Mininet

In the latter two cases above, `make run` may leave a Mininet instance
running in the background. Use the following command to clean up
these instances:

```bash
make stop
```

### Food For Thought

Now that you've implemented this basic monitoring framework can you
think of ways to leverage this information about link utilization
within the core of the network? For instance, how might you use this
data, either at the hosts or at the switches, to make real-time
load-balancing decisions?