OPTE maintains two sets of benchmarks: userland microbenchmarks and kernel module benchmarks. Userland benchmarks can be run on most development machines, while the kernel module benchmarks require a full Helios install and additional lab setup depending on which benchmarks you want to run.
Benchmark outputs are located in opte/target/criterion, and any flamegraphs built during kmod benchmarks are placed into opte/target/xde-bench.
We use criterion to measure and profile individual packet processing times for slow-/fast-path traffic as well as generated hairpin packets.
These can be called using cargo ubench, or cargo bench --package opte-bench --bench userland -- <options>. This benchmark runner uses the standard criterion CLI. To see a clean list of available benchmarks, use the cargo ubench --list 2> /dev/null | sort | uniq command.
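Because the runner uses the standard criterion CLI, criterion's named baselines can be used to compare results before and after a change. The following is a minimal sketch rather than part of the OPTE tooling: the function name ubench_run, the default baseline name "before", and the UBENCH_CARGO override are all hypothetical, and it assumes cargo ubench forwards trailing arguments to criterion in the same way it does for --list.

```shell
#!/bin/sh
# Sketch: record or compare against a criterion baseline for the userland
# benchmarks. Hypothetical helper, not part of OPTE.
#
#   $1  "save" records a baseline, "compare" measures against it
#   $2  baseline name (hypothetical default: "before")
#   $3  optional criterion filter; "parse" is used only as an example
#
# Set UBENCH_CARGO=echo to preview the command instead of running it.
ubench_run() {
    mode="$1"
    name="${2:-before}"
    filter="${3:-}"

    case "$mode" in
        save)    flag="--save-baseline" ;;
        compare) flag="--baseline" ;;
        *)
            echo "usage: ubench_run save|compare [name] [filter]" >&2
            return 2
            ;;
    esac

    # ${filter:+...} expands to nothing when no filter was given.
    ${UBENCH_CARGO:-cargo} ubench "$flag" "$name" ${filter:+"$filter"}
}
```

A typical session would call ubench_run save on the unmodified tree, apply the change, and then call ubench_run compare; criterion then reports each benchmark relative to the saved baseline.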
Benchmarks are split into several categories:
- Metric: wallclock, alloc_ct, alloc_sz.
- Action: parse, process.
- Packet family.
The kernel module benchmarks can be called using cargo kbench, or cargo bench --package opte-bench --bench xde -- <options>. They require that:
- you are running on an up-to-date Helios instance.
- the XDE kernel module and opteadm are installed, either via IPS or the cargo xtask install command.
- you have installed the IPS packages flamegraph, demangle, iperf and sparse.
They implement zone-to-zone iperf traffic in two scenarios:
- cargo kbench local on one machine. This uses an identical test setup to xde-tests/loopback. Two sparse zones will be created on the current machine, with simnet links being used as an underlay network. This is lower fidelity than the below two-node setup.
- cargo kbench server and cargo kbench remote <SERVER_IP> on two separate machines. One zone will be created on each machine (running an iperf server and client respectively), using the shared lab/home network to exchange link local addresses.
Below you can find a lab setup which suffices for the second option. Currently, link-local addresses must be created with the name syntax <nic>/ll: this can be done using, e.g., pfexec ipadm create-addr igb0/ll -T addrconf. The benchmark defaults to using the NICs igb0 and igb1, which can be overridden to match your setup using the --underlay-nics option. E.g., when testing over a Chelsio NIC, --underlay-nics cxgbe0 cxgbe1 will select these devices and use the link-local addresses cxgbe0/ll and cxgbe1/ll. Additionally, MTUs should be set to 9000 for physical underlay links.
    fe80::a236:9fff:fe0c:2586   fe80::a236:9fff:fe0c:25b6
    fe80::a236:9fff:fe0c:2587   fe80::a236:9fff:fe0c:25b7
         ┌─────────────────────────────────────┐
         │                                     │
         │         ┌─────────────────┐         │
         │         │                 │         │
    igb0┌┴┐       ┌┴┐igb1       igb1┌┴┐       ┌┴┐igb0
      ╔═╩═╩═══════╩═╩═╗           ╔═╩═╩═══════╩═╩═╗
      ║  cargo kbench ║░          ║  cargo kbench ║░
      ║    remote     ║░          ║    server     ║░
      ║  10.0.125.173 ║░          ║               ║░
      ╚══════╦═╦══════╝░          ╚══════╦═╦══════╝░
       ░░░░░░░│░░░░░░░░░           ░░░░░░░│░░░░░░░░░
       10.0.147.187/8              10.0.125.173/8
              │      ┌ ─ ─ ─ ─ ─ ┐        │
                       Lab/Home
              └ ─ ─ ▶│  Network  │◀ ─ ─ ─ ┘
                      ─ ─ ─ ─ ─ ─

Connecting igb0<->igb0, etc., is not a requirement, as NDP tables are inspected for inserting underlay network routes.
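The per-NIC preparation described above (a link-local address named <nic>/ll and a 9000-byte MTU) can be sketched as a small dry-run helper. It only prints commands and executes nothing. The function name underlay_setup_cmds is hypothetical, and using dladm set-linkprop to set the MTU is an assumption; the ipadm invocation is the one quoted above. The MTU is set first on the assumption that a link may refuse an MTU change once an IP interface is attached to it.

```shell
#!/bin/sh
# Dry-run helper (hypothetical, not part of OPTE): prints the commands that
# would prepare each underlay NIC for the two-node benchmark.
# Nothing is executed here, so it is safe to run on any machine.
underlay_setup_cmds() {
    # With no arguments, fall back to the NICs the benchmark defaults to.
    if [ "$#" -eq 0 ]; then
        set -- igb0 igb1
    fi

    for nic in "$@"; do
        # Assumption: dladm set-linkprop is used to raise the MTU to 9000.
        printf 'pfexec dladm set-linkprop -p mtu=9000 %s\n' "$nic"
        # The link-local address must be named <nic>/ll.
        printf 'pfexec ipadm create-addr %s/ll -T addrconf\n' "$nic"
    done
}

# Example: print the commands for a pair of Chelsio NICs.
underlay_setup_cmds cxgbe0 cxgbe1
```

After reviewing the printed commands, they could be run by hand or piped to sh on each Helios host; the same NIC names would then be passed to the benchmark via --underlay-nics.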
In both scenarios, the benchmark harness will run iperf in client-to-server and server-to-client modes, and will record periodic stack information and timings using dtrace. These are converted into flamegraphs and timing data for further analysis by criterion.
The kernel module benchmark harness can be moved onto a gimlet or other development system for measurement. The path to the binary can be found using the command:
    cargo bench --package opte-bench \
        --no-run --message-format json-render-diagnostics \
        | jq -r -s "map( \
            select(.reason==\"compiler-artifact\") \
            | select( \
                .target.kind\
                | map_values(.==\"bench\") \
                | any \
            ) \
            | select(.target.name==\"xde\") \
        ) | map(.executable)"

Once the binary is moved onto the global zone of a target machine, measurements can be taken using xde in-situ. On a gimlet we add the -d flag as we do not have access to flamegraph. This places captured stacks into the xde-bench folder.
    $ ./xde in-situ expt-name -d
    # ... exit
    $ ls -R xde-bench
    xde-bench:
    expt-name

    xde-bench/expt-name:
    histos.out  raw.stacks

Measured data in xde-bench can be moved and processed into flamegraphs and histograms on any development machine using the command ./xde in-situ expt-name -c none.