Radostin Stoyanov ccc5693807
readme: Update logs path (#448)
* readme: remove trailing whitespaces

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>

* readme: update path to log files

Fixes: #447

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
2022-01-19 12:59:20 -05:00

234 lines
9.2 KiB
Markdown

# Implementing Link Monitoring
## Introduction
The objective of this exercise is to write a P4 program that enables
a host to monitor the utilization of all links in the network. This
exercise builds upon the basic IPv4 forwarding exercise so be sure
to complete that one before attempting this one. Specifically, we
will modify the basic P4 program to process a source routed probe
packet such that it is able to pick up the egress link utilization
at each hop and deliver it to a host for monitoring purposes.
Our probe packet will contain the following three header types:
```
// Top-level probe header, indicates how many hops this probe
// packet has traversed so far.
header probe_t {
bit<8> hop_cnt;
}
// The data added to the probe by each switch at each hop.
header probe_data_t {
bit<1> bos;
bit<7> swid;
bit<8> port;
bit<32> byte_cnt;
time_t last_time;
time_t cur_time;
}
// Indicates the egress port the switch should send this probe
// packet out of. There is one of these headers for each hop.
header probe_fwd_t {
bit<8> egress_spec;
}
```
We will use the pod-topology for this exercise, which consists of
four hosts connected to four switches that are wired up as they
would be in a single pod of a fat tree topology.
![topology](./link-monitor-topo.png)
In order to monitor the link utilization our switch will maintain
two register arrays:
* `byte_cnt_reg` - counts the number of bytes transmitted out of
each port since the last probe packet was transmitted out of
the port.
* `last_time_reg` - stores the last time that a probe packet was
transmitted out of each port.
Our P4 program will be written for the V1Model architecture implemented
on P4.org's bmv2 software switch. The architecture file for the V1Model
can be found at: /usr/local/share/p4c/p4include/v1model.p4. This file
desribes the interfaces of the P4 programmable elements in the architecture,
the supported externs, as well as the architecture's standard metadata
fields. We encourage you to take a look at it.
> **Spoiler alert:** There is a reference solution in the `solution`
> sub-directory. Feel free to compare your implementation to the
> reference.
## Step 1: Run the (incomplete) starter code
The directory with this README contains a skeleton P4 program,
`link_monitor.p4`, which implements basic IPv4 forwarding, as well
as source routing of the probe packets. Your job will be to
extend this skeleton program to fill out the fields in the probe
packet.
Before that, let's compile and test the incomplete `link_monitor.p4`
program:
1. In your shell, run:
```bash
make run
```
This will:
* compile `link_monitor.p4`, and
* start the pod-topo in Mininet and configure all switches with
the `link_monitor.p4` program + table entries, and
* configure all hosts with the commands listed in
[pod-topo/topology.json](./pod-topo/topology.json)
2. You should now see a Mininet command prompt. Open two terminals
on `h1`:
```bash
mininet> xterm h1 h1
```
3. In one of the xterms run the `send.py` script to start sending
probe packets every second. Each of these probe packets takes the
path indicated in link-monitor-topo.png.
```bash
./send.py
```
4. In the other terminal run the `receive.py` script to start
receiving and parsing the probe packets. This allows us to monitor
the link utilization within the network.
```bash
./receive.py
```
The reported link utilization and the switch port numbers will
always be 0 because the probe fields have not been filled out yet.
5. Run an iperf flow between h1 and h4:
```bash
mininet> iperf h1 h4
```
6. Type `exit` to leave each xterm and the Mininet command line.
Then, to stop mininet:
```bash
make stop
```
And to delete all pcaps, build files, and logs:
```bash
make clean
```
The measured link utilizations will not agree with what iperf reports
because the probe packet fields have not been populated yet. Your
goal is to fill out the probe packet fields so that the two
measurements agree.
### A note about the control plane
A P4 program defines a packet-processing pipeline, but the rules
within each table are inserted by the control plane. When a rule
matches a packet, its action is invoked with parameters supplied by
the control plane as part of the rule.
In this exercise, we have already implemented the control plane
logic for you. As part of bringing up the Mininet instance, the
`make run` command will install packet-processing rules in the tables of
each switch. These are defined in the `sX-runtime.json` files, where
`X` corresponds to the switch number.
**Important:** We use P4Runtime to install the control plane rules. The
content of files `sX-runtime.json` refer to specific names of tables, keys, and
actions, as defined in the P4Info file produced by the compiler (look for the
file `build/link_monitor.p4.p4info.txt` after executing `make run`). Any
changes in the P4 program that add or rename tables, keys, or actions
will need to be reflected in these `sX-runtime.json` files.
## Step 2: Implement Link Monitoring Logic
The `link_monitor.p4` file contains a skeleton P4 program with key pieces of
logic replaced by `TODO` comments. Your implementation should follow
the structure given in this file---replace each `TODO` with logic
implementing the missing piece.
Here are a few more details about the design:
**Parser**
* The parser has been extended support parsing of the source routed probe packets.
The parser is the most complicated part of the design so spend a bit of time
reading over it. Note that it does not contain any TODO comments so there is
nothing you need to change here.
* To parse the probe packets, we use the `hdr.probe.hop_cnt` to determine how many
hops the packet has traversed prior to reaching the switch. If this is the first
hop then there will not be any `probe_data` in the packet so we skip that state
and transition directly to the `parse_probe_fwd` state. In the `parse_probe_fwd`
state, we use the `hdr.probe.hop_cnt` field to figure out which `egress_spec`
header field to use to perform forwarding and we save that port value into a
metadata field which is subsequently used to perform forwarding.
**Ingress Control**
* The ingress control block looks very similar to the `basic` exercise. The only
difference is that the `apply` block contains another condition to forward probe
packets using the `egress_spec` field extracted by the parser. It also increments
the `hdr.probe.hop_cnt` field.
**Egress Control**
* This is where the interesting stateful processing occurs. It uses the
`byte_cnt_reg` register to count the number of bytes that have passed through each
port since the last probe packet passed through the port.
* It adds a new `probe_data` header to the packet and filld out the `bos`
(bottom of stack) field, as well as the `swid` (switch ID) field.
* TODO: your job is to fill out the rest of the probe packet fields in order to
ensure that you can properly measure link utilization.
**Deparser**
* Simply emits all headers in the correct order.
* Note that emitting a header stack will only emit the headers within the stack
that are actually marked as valid.
## Step 3: Run your solution
Follow the instructions from Step 1. This time, the measured link
utilizations should agree with what `iperf` reports.
### Troubleshooting
There are several problems that might manifest as you develop your program:
1. `link_monitor.p4` might fail to compile. In this case, `make run` will
report the error emitted from the compiler and halt.
2. `link_monitor.p4` might compile but fail to support the control plane
rules in the `s1-runtime.json` through `s4-runtime.json` files that
`make run` tries to install using P4Runtime. In this case, `make run` will
report errors if control plane rules cannot be installed. Use these error
messages to fix your `link_monitor.p4` implementation.
3. `link_monitor.p4` might compile, and the control plane rules might be
installed, but the switch might not process packets in the desired
way. The `logs/sX.log` files contain detailed logs that describing
how each switch processes each packet. The output is detailed and can
help pinpoint logic errors in your implementation.
#### Cleaning up Mininet
In the latter two cases above, `make run` may leave a Mininet instance
running in the background. Use the following command to clean up
these instances:
```bash
make stop
```
### Food For Thought
Now that you've implemented this basic monitoring framework can you
think of ways to leverage this information about link utilization
within the core of the network? For instance, how might you use this
data, either at the hosts or at the switches, to make real-time
load-balancing decisions?
## Relevant Documentation
The documentation for P4_16 and P4Runtime is available [here](https://p4.org/specs/)
All excercises in this repository use the v1model architecture, the documentation for which is available at:
1. The BMv2 Simple Switch target document accessible [here](https://github.com/p4lang/behavioral-model/blob/master/docs/simple_switch.md) talks mainly about the v1model architecture.
2. The include file `v1model.p4` has extensive comments and can be accessed [here](https://github.com/p4lang/p4c/blob/master/p4include/v1model.p4).