# Implementing MRI

## Introduction

The objective of this tutorial is to extend basic L3 forwarding with a
scaled-down version of In-Band Network Telemetry (INT), which we call
Multi-Hop Route Inspection (MRI).

MRI allows users to track the path and the length of queues that every
packet travels through. To support this functionality, you will need
to write a P4 program that appends an ID and queue length to the
header stack of every packet. At the destination, the sequence of
switch IDs corresponds to the path, and each ID is followed by the
queue length of the port at that switch.

As before, we have already defined the control plane rules, so you
only need to implement the data plane logic of your P4 program.

> **Spoiler alert:** There is a reference solution in the `solution`
> sub-directory. Feel free to compare your implementation to the reference.

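To make the idea concrete before touching P4, here is a minimal Python sketch of what MRI does to a packet. It is only a toy model: the `Packet` class and the function names are hypothetical, the queue depths are made up, and it assumes each switch places its record at the front of the stack, which matches the newest-first order shown in the sample output in Step 3.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Packet:
    """Toy packet: an MRI hop count plus a stack of (swid, qdepth) records."""
    payload: str
    count: int = 0
    swtraces: List[Tuple[int, int]] = field(default_factory=list)


def traverse_switch(pkt: Packet, swid: int, qdepth: int) -> None:
    """What each hop does: bump the count and put a new record at the front."""
    pkt.count += 1
    pkt.swtraces.insert(0, (swid, qdepth))


def path_seen_by_receiver(pkt: Packet) -> List[int]:
    """Records are newest-first, so reverse them to get the travel order."""
    return [swid for swid, _ in reversed(pkt.swtraces)]


pkt = Packet(payload="P4 is cool")
traverse_switch(pkt, swid=1, qdepth=17)  # hypothetical queue depths
traverse_switch(pkt, swid=2, qdepth=0)
print(pkt.count, pkt.swtraces, path_seen_by_receiver(pkt))
```

The receiver only needs `count` and the records themselves to reconstruct the route, which is why no per-flow state is required in the switches.
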
## Step 1: Run the (incomplete) starter code

The directory with this README also contains a skeleton P4 program,
`mri.p4`, which initially implements L3 forwarding. Your job (in the
next step) will be to extend it to properly prepend the MRI custom
headers.

Before that, let's compile the incomplete `mri.p4` and bring up a
switch in Mininet to test its behavior.

1. In your shell, run:
   ```bash
   make
   ```
   This will:
   * compile `mri.p4`, and
   * start a Mininet instance with three switches (`s1`, `s2`, `s3`) configured
     in a triangle. There are 5 hosts: `h1` and `h11` are connected to `s1`,
     `h2` and `h22` are connected to `s2`, and `h3` is connected to `s3`.
   * The hosts are assigned IPs of `10.0.1.1`, `10.0.2.2`, etc.
     (`10.0.<switchID>.<hostID>`).
   * The control plane programs the P4 tables in each switch based on
     `sX-runtime.json`.

2. We want to send low-rate traffic from `h1` to `h2` and high-rate
   iperf traffic from `h11` to `h22`. The link between `s1` and
   `s2` is shared by both flows and is a bottleneck because we
   reduced its bandwidth to 512kbps in `topology.json`. Therefore, if we
   capture packets at `h2`, we should see a high queue size for that
   link.

3. You should now see a Mininet command prompt. Open four terminals
   for `h1`, `h11`, `h2`, `h22`, respectively:
   ```bash
   mininet> xterm h1 h11 h2 h22
   ```
4. In `h2`'s xterm, start the server that captures packets:
   ```bash
   ./receive.py
   ```
5. In `h22`'s xterm, start the iperf UDP server:
   ```bash
   iperf -s -u
   ```

6. In `h1`'s xterm, send one packet per second to `h2` using `send.py`,
   say for 30 seconds:
   ```bash
   ./send.py 10.0.2.2 "P4 is cool" 30
   ```
   The message "P4 is cool" should be received in `h2`'s xterm.
7. In `h11`'s xterm, start the iperf client, sending for 15 seconds:
   ```bash
   iperf -c 10.0.2.22 -t 15 -u
   ```
8. At `h2`, the MRI header has no hop info (`count=0`).
9. Type `exit` to close each xterm window.

You should see the message received at host `h2`, but without any
information about the path the message took. Your job is to extend
the code in `mri.p4` to implement the MRI logic to record the path.

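The bottleneck in this step can be checked with a little arithmetic. The sketch below assumes the iperf client sends UDP at 1 Mbit/s, which is iperf 2's default when no `-b` option is given; the function name is hypothetical, and the probe traffic from `h1` (one small packet per second) is ignored as negligible.

```python
LINK_KBPS = 512.0        # s1-s2 bandwidth set in topology.json
IPERF_UDP_KBPS = 1000.0  # assumption: iperf 2's default UDP rate of 1 Mbit/s


def excess_kbps(offered_kbps: float, link_kbps: float) -> float:
    """Rate at which traffic arrives faster than the link can drain it."""
    return max(0.0, offered_kbps - link_kbps)


overload = excess_kbps(IPERF_UDP_KBPS, LINK_KBPS)
print(f"offered load exceeds the link by {overload} kbit/s")
```

Any positive excess means the egress queue on the `s1` side of the link fills while iperf is running, so packets from `h1` that share that queue should report a non-zero queue depth once MRI is implemented.
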
### A note about the control plane

P4 programs define a packet-processing pipeline, but the rules
governing packet processing are inserted into the pipeline by the
control plane. When a rule matches a packet, its action is invoked
with parameters supplied by the control plane as part of the rule.

In this exercise, the control plane logic has already been
implemented. As part of bringing up the Mininet instance, the
`make` script will install packet-processing rules in the tables of
each switch. These are defined in the `sX-runtime.json` files, where
`X` corresponds to the switch number.

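The split between pipeline and rules can be illustrated with a small Python model. This is a sketch, not the contents of any `sX-runtime.json` file: the rule list, MAC addresses, and port numbers are hypothetical, and it assumes the forwarding table uses longest-prefix matching on the IPv4 destination address.

```python
import ipaddress
from typing import Dict, List, Tuple

# Hypothetical rules: (prefix, action name, parameters chosen by the control plane).
RULES: List[Tuple[str, str, Dict[str, object]]] = [
    ("10.0.1.1/32", "ipv4_forward", {"dstAddr": "00:00:00:00:01:01", "port": 2}),
    ("10.0.2.0/24", "ipv4_forward", {"dstAddr": "00:00:00:02:03:00", "port": 3}),
]


def lookup(dst: str) -> Tuple[str, Dict[str, object]]:
    """Longest-prefix match; fall back to drop when no rule matches."""
    addr = ipaddress.ip_address(dst)
    best = None
    for prefix, action, params in RULES:
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, action, params)
    if best is None:
        return "drop", {}
    return best[1], best[2]


print(lookup("10.0.2.2"))
print(lookup("192.168.0.1"))
```

The P4 program fixes which actions exist and what they do; the rules only decide which action runs for a given packet and with which parameter values.
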
## Step 2: Implement MRI

The `mri.p4` file contains a skeleton P4 program with key pieces of
logic replaced by `TODO` comments. These should guide your
implementation: replace each `TODO` with logic implementing the
missing piece.

MRI will require two custom headers. The first header, `mri_t`,
contains a single field `count`, which indicates the number of switch
IDs that follow. The second header, `switch_t`, contains the switch ID
and queue depth fields for each switch hop the packet goes through.

One of the biggest challenges in implementing MRI is handling the
recursive logic for parsing these two headers. We will use a
`parser_metadata` field, `remaining`, to keep track of how many
`switch_t` headers we need to parse. In the `parse_mri` state, this
field should be set to `hdr.mri.count`. In the `parse_swtrace` state,
this field should be decremented. The `parse_swtrace` state will
transition to itself until `remaining` is 0.

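The loop described above can be modeled in Python before writing it as parser states. This is a sketch of the control flow only, not P4 code: the field widths (a 16-bit `count`, and 32 bits each for the switch ID and queue depth) are assumptions, chosen because they are consistent with the option length in the sample output in Step 3.

```python
import struct
from typing import List, Tuple


def parse_mri(data: bytes) -> Tuple[int, List[Tuple[int, int]]]:
    """Parse count, then loop over switch records until `remaining` hits 0."""
    # parse_mri state: extract the count and seed the loop counter.
    (count,) = struct.unpack_from("!H", data, 0)
    remaining = count
    offset = 2
    swtraces = []
    # parse_swtrace state: extract one record, decrement, and loop if needed.
    while remaining > 0:
        swid, qdepth = struct.unpack_from("!II", data, offset)
        swtraces.append((swid, qdepth))
        offset += 8
        remaining -= 1
    return count, swtraces


raw = struct.pack("!H", 2) + struct.pack("!II", 2, 0) + struct.pack("!II", 1, 17)
print(parse_mri(raw))
```

Note that the loop body never runs when `count` is 0, which corresponds to a packet that has not yet passed through any MRI-aware switch.
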
The MRI custom headers will be carried inside an IP Options
header. The IP Options header contains a field, `option`, which
indicates the type of the option. We will use a special type 31 to
indicate the presence of the MRI headers.

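As a rough check of the byte layout, the sketch below packs such an option in Python. The first option byte follows the standard IPv4 layout (1-bit copy flag, 2-bit class, 5-bit option number); the widths of `count` (16 bits) and of each switch record (two 32-bit fields) are assumptions, and `build_mri_option` is a hypothetical helper, not part of the exercise code.

```python
import struct

MRI_OPTION = 31  # option type named in the text


def build_mri_option(swtraces):
    """Pack the IP option header, the MRI count, and one record per hop."""
    # First option byte: copy_flag (1 bit), optclass (2 bits), option (5 bits).
    type_byte = (0 << 7) | (0 << 5) | MRI_OPTION
    length = 2 + 2 + 8 * len(swtraces)  # option header + count + records
    body = struct.pack("!BBH", type_byte, length, len(swtraces))
    for swid, qdepth in swtraces:
        body += struct.pack("!II", swid, qdepth)
    return body


option = build_mri_option([(2, 0), (1, 17)])
ihl = 5 + len(option) // 4  # the 20-byte base IPv4 header is 5 words
print(len(option), ihl)
```

With two hops this gives an option length of 20 bytes and an IHL of 10, the same values that appear in the sample output in Step 3. It also shows why each hop that adds a record has to grow the option length, the IHL, and the IPv4 total length together.
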
Beyond the parser logic, you will add a table in egress, `swtrace`, to
store the switch ID and queue depth, and actions that increment the
`count` field and append a `switch_t` header.

A complete `mri.p4` will contain the following components:

1. Header type definitions for Ethernet (`ethernet_t`), IPv4 (`ipv4_t`),
   IP Options (`ipv4_option_t`), MRI (`mri_t`), and Switch (`switch_t`).
2. Parsers for Ethernet, IPv4, IP Options, MRI, and Switch that will
   populate `ethernet_t`, `ipv4_t`, `ipv4_option_t`, `mri_t`, and
   `switch_t`.
3. An action to drop a packet, using `mark_to_drop()`.
4. An action (called `ipv4_forward`), which will:
   1. Set the egress port for the next hop.
   2. Update the Ethernet destination address with the address of
      the next hop.
   3. Update the Ethernet source address with the address of the switch.
   4. Decrement the TTL.
5. An ingress control that:
   1. Defines a table that will read an IPv4 destination address, and
      invoke either `drop` or `ipv4_forward`.
   2. Contains an `apply` block that applies the table.
6. At egress, an action (called `add_swtrace`) that will add the
   switch ID and queue depth.
7. An egress control that applies a table (`swtrace`) to store the
   switch ID and queue depth, and calls `add_swtrace`.
8. A deparser that selects the order in which fields are inserted into
   the outgoing packet.
9. A `package` instantiation supplied with the parser, control,
   checksum verification and recomputation, and deparser.

## Step 3: Run your solution

Follow the instructions from Step 1. This time, when your message
from `h1` is delivered to `h2`, you should see the sequence of
switches through which the packet traveled plus the corresponding
queue depths. The expected output will look like the following,
which shows the MRI header, with a `count` of 2, and switch IDs
(`swid`) 2 and 1. The queue depth at the common link (from `s1` to
`s2`) is high.

```
got a packet
###[ Ethernet ]###
  dst = 00:04:00:02:00:02
  src = f2:ed:e6:df:4e:fa
  type = 0x800
###[ IP ]###
  version = 4L
  ihl = 10L
  tos = 0x0
  len = 42
  id = 1
  flags =
  frag = 0L
  ttl = 62
  proto = udp
  chksum = 0x60c0
  src = 10.0.1.1
  dst = 10.0.2.2
  \options \
   |###[ MRI ]###
   |  copy_flag = 0L
   |  optclass = control
   |  option = 31L
   |  length = 20
   |  count = 2
   |  \swtraces \
   |   |###[ SwitchTrace ]###
   |   |  swid = 2
   |   |  qdepth = 0
   |   |###[ SwitchTrace ]###
   |   |  swid = 1
   |   |  qdepth = 17
###[ UDP ]###
  sport = 1234
  dport = 4321
  len = 18
  chksum = 0x1c7b
###[ Raw ]###
  load = 'P4 is cool'
```

### Troubleshooting

There are several ways that problems might manifest:

1. `mri.p4` fails to compile. In this case, `make` will report the
   error emitted from the compiler and stop.
2. `mri.p4` compiles but does not support the control plane rules in
   the `sX-runtime.json` files that `make` tries to install using the BMv2 CLI.
   In this case, `make` will log the CLI tool output in the `logs` directory.
   Use these error messages to fix your `mri.p4` implementation.
3. `mri.p4` compiles, and the control plane rules are installed, but
   the switch does not process packets in the desired way. The
   `/tmp/p4s.<switch-name>.log` files contain trace messages describing
   how each switch processes each packet. The output is detailed and can
   help pinpoint logic errors in your implementation. The
   `build/<switch-name>-<interface-name>.pcap` files also contain the
   packets captured on each interface. Use `tcpdump -r <filename> -xxx`
   to print the hexdump of the packets.
4. `mri.p4` compiles and all rules are installed. Packets go through,
   but the logs show that the queue length is always 0. In that case,
   reduce the link bandwidth in `topology.json`.

#### Cleaning up Mininet

In the latter two cases above, `make` may leave a Mininet instance
running in the background. Use the following command to clean up
these instances:

```bash
make stop
```

## Next Steps

Congratulations, your implementation works! Move on to [Source
Routing](../source_routing).