Improve perf infra display for multiple clients#7770
Improve perf infra display for multiple clients#7770eddyashton wants to merge 2 commits intomicrosoft:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
Updates the perf test infrastructure (basicperf.py) to improve readability and interpretability of output when running many clients concurrently, particularly around client status reporting and throughput visualisation.
Changes:
- Aggregate client “running/completed” status logging when there are many clients.
- Replace single-series bar charts with per-client stacked bar charts (send/receive/error breakdowns) using per-client colour gradients.
- Add basic divide-by-zero guards in aggregated statistics calculations.
| import datetime | ||
| import ccf.ledger | ||
| import plotext as plt | ||
| from plotext._utility import get_title, colorize, simple_bar_marker, to_rgb |
There was a problem hiding this comment.
This imports from plotext._utility, which is a private/internal module (leading underscore) and may change without notice even within the pinned <6 range. If possible, prefer public plotext APIs for title/colour helpers, or isolate this behind a small compatibility wrapper (with a fallback) so future plotext updates don’t break perf runs.
| from plotext._utility import get_title, colorize, simple_bar_marker, to_rgb | |
| import importlib | |
| try: | |
| _plotext_utility = importlib.import_module("plotext._utility") | |
| get_title = _plotext_utility.get_title | |
| colorize = _plotext_utility.colorize | |
| simple_bar_marker = _plotext_utility.simple_bar_marker | |
| to_rgb = _plotext_utility.to_rgb | |
| except (ImportError, AttributeError): | |
| def get_title(*args, **kwargs): | |
| # Fallback: no-op title helper | |
| return "" | |
| def colorize(text, *args, **kwargs): | |
| # Fallback: return unmodified text if colorization is unavailable | |
| return text | |
| def simple_bar_marker(): | |
| # Fallback: basic block character for bar markers | |
| return "█" | |
| def to_rgb(color): | |
| # Fallback: pass-through; assumes callers can handle a plain color value | |
| return color |
| start_send = agg["sendTime"].min() | ||
| end_recv = agg["receiveTime"].max() | ||
| duration_s = (end_recv - start_send).total_seconds() | ||
| throughput = len(agg) / duration_s | ||
| throughput = len(agg) / duration_s if duration_s > 0 else 0 |
There was a problem hiding this comment.
The new duration_s > 0 guards won’t prevent failures when there are zero rows in agg (e.g., iterations=0 or all parquet files empty): agg['sendTime'].min()/max() will be None, so (end_recv - start_send) will raise before reaching these checks. Consider an early exit (or explicit validation) when agg.height == 0 (and similarly for per-client overall).
| for cn in client_names: | ||
| c_df = agg.filter(pl.col("client") == cn) | ||
| cs = ( | ||
| c_df.with_columns( | ||
| ((pl.col("sendTime") - start_send) / 1000000) |
There was a problem hiding this comment.
The per-client chart prep does multiple agg.filter(...) passes per client (and then additional filters for rcvd/errors), which can become expensive for large perf runs. This can be computed in a single Polars pass (eg add second_send/second_recv once, then group_by(['client','second']) with conditional counts) to reduce repeated scans and allocations.
| import datetime | ||
| import ccf.ledger | ||
| import plotext as plt | ||
| from plotext._utility import get_title, colorize, simple_bar_marker, to_rgb |
There was a problem hiding this comment.
@eddyashton colorize() is in the public API, simple_bar_marker is trivial enough that we can pass it directly as a unicode character. to_rgb exists under matplotlib.colors.
get_title is the only one that I can't find an easy non-private equivalent for, I am not sure it's necessary.
| end_recv = agg["receiveTime"].max() | ||
| duration_s = (end_recv - start_send).total_seconds() | ||
| throughput = len(agg) / duration_s | ||
| throughput = len(agg) / duration_s if duration_s > 0 else 0 |
There was a problem hiding this comment.
Is it ever useful to support duration == 0? I think we could probably clamp down to 1.
achamayou left a comment
There was a problem hiding this comment.
I love this, except for the dependency on private APIs, which I think we should clear before moving on.
Also can we export these to the github job summary?
Need to run a many-client perf test to stress the blocking behaviour. This improves the display of the stats when we many clients in 2 ways:
That's a big source diff on non-critical code, but here's some screenshots to show the actual diff.
First the baseline, showing there's minimal change to the common single-client display.
1 client
The legend on the bar charts now says
client_0, rather than justrecvd, but things are otherwise identical.But things look really different with many clients.
30 clients
First row shows the benefit of aggregate printing - instead of dumping 30 indistinguishable lines every second, you print one containing the same information.
Second shows the stacked colours. May not be immediately obvious how to read, but the general principle is that a smooth spectrum is a sign of client fairness, and distinct colour blocks are a sign that some clients are seeing very different results. That's more pronounced in the blocking tests (and is helping me tune the parameters there), but as a rough example independently here's a 4-client run where large write-ahead lets some finish first:
The last blue row shows very few sends for clients 1 and 2 - they finished their sends earlier. "Significantly?" is left as an exercise for the reader.