Skip to content

Latest commit

 

History

History

Folders and files

NameName
Last commit message
Last commit date

parent directory

..
 
 
 
 
 
 
 
 
 
 
 
 

OPTE Benchmarks

OPTE maintains two sets of benchmarks: userland microbenchmarks, and kernel module benchmarks. Userland benchmarks can be run on most development machines, while the kernel module benchmarks will require a full Helios install and additional lab setup depending on what benchmarks you want to run.

Benchmark outputs are located in opte/target/criterion, and any flamegraphs built during kmod benchmarks are placed into opte/target/xde-bench.

Userland Benchmarks

We use criterion to measure and profile individual packet processing times for slow-/fast-path traffic as well as generated hairpin packets.

These can be called using cargo ubench, or cargo bench --package opte-bench --bench userland — <options>. This benchmark runner uses the standard criterion CLI. To see a clean list of available benchmarks, use the cargo ubench --list 2> /dev/null | sort | uniq command.

Benchmarks are split into several categories:

  • Metric: wallclock, alloc_ct, alloc_sz.

  • Action: parse, process.

  • Packet family.

Kernel Module Benchmarks

The kernel module benchmarks can be called using cargo kbench, or cargo bench --package opte-bench --bench xde — <options>. They require that:

  • you are running on an up-to-date Helios instance.

  • the XDE kernel module and opteadm are installed, either via IPS or the cargo xtask install command.

  • you have installed the IPS packages flamegraph, demangle, iperf and sparse.

They implement zont-to-zone iperf traffic in two scenarios:

  • cargo kbench local on one machine. This uses an identical test setup to xde-tests/loopback. Two sparse zones will be created on the current machine, with simnet links being used as an underlay network. This is lower fidelity than the below two-node setup.

  • cargo kbench server and cargo kbench remote <SERVER_IP> on two separate machines. One zone will be created on each machine (running an iperf server and client respectively), using the shared lab/home network to exchange link local addresses.

Below you can find a lab setup which suffices for the second option. Currently, linklocals must be created with the name syntax <nic>/ll: this can be done using, e.g., pfexec ipadm create-addr igb0/ll -T addrconf. The benchmark defaults to using the NICs igb0 and igb1, and can be overridden to match your setup using the --underlay-nics option. E.g., when testing over a Chelsio NIC --underlay-nics cxgbe0 cxgbe1 will select these devices and use the link-local addresses cxgbe0/ll and cxgbe1/ll. Additionally, MTUs should be set to 9000 for physical underlay links.

fe80::a236:9fff:fe0c:2586 fe80::a236:9fff:fe0c:25b6 fe80::a236:9fff:fe0c:2587 fe80::a236:9fff:fe0c:25b7 ┌─────────────────────────────────────┐ │ │ │ ┌─────────────────┐ │ │ │ │ │ igb0┌┴┐ ┌┴┐igb1 igb1┌┴┐ ┌┴┐igb0 ╔═╩═╩═══════╩═╩═╗ ╔═╩═╩═══════╩═╩═╗ ║ cargo kbench ║░ ║ cargo kbench ║░ ║ remote ║░ ║ server ║░ ║ 10.0.125.173 ║░ ║ ║░ ╚══════╦═╦══════╝░ ╚══════╦═╦══════╝░ ░░░░░░░│░░░░░░░░░ ░░░░░░░│░░░░░░░░░ 10.0.147.187/8 10.0.125.173/8 │ ┌ ─ ─ ─ ─ ─ ┐ │ Lab/Home └ ─ ─ ▶│ Network │◀ ─ ─ ─ ┘ ─ ─ ─ ─ ─ ─

Connecting igb0<→igb0, etc., is not a requirement, as NDP tables are inspected for inserting underlay network routes.

In both scenarios, the benchmark harness will run iperf in client-to-server and server-to-client modes, and will record periodic stack information and timings using dtrace. These are converted into flamegraphs and timing data for further analysis by criterion.

In-situ measurement

The kernel module benchmark harness can be moved onto a gimlet or other development system for measurement. The path to the binary can be found using the command:

cargo bench --package opte-bench \ --no-run --message-format json-render-diagnostics \ | jq -r -s "map( \  select(.reason==\"compiler-artifact\") \  | select( \  .target.kind\  | map_values(.==\"bench\") \  | any \  ) \  | select(.target.name==\"xde\") \  ) | map(.executable)"

Once the binary is moved onto the global zone of a target machine, measurements can be taken using xde in-situ. On a gimlet we add the -d flag as we do not have access to flamegraph. This places captured stacks into the xde-bench folder.

$ ./xde in-situ expt-name -d # ... exit $ ls -R xde-bench xde-bench: expt-name xde-bench/expt-name: histos.out raw.stacks

Measured data in xde-bench can be moved and processed into flamegraphs and histograms on any development machine using the command ./xde in-situ expt-name -c none.