Performance Graphs

Bowire ships a built-in micro-benchmarker for any unary method. Configure the request body in the Body tab as usual, then switch to the Performance tab in the response pane to repeat the call N times and visualize the latency distribution.

When you can use it

The Performance tab is only visible for unary methods -- it appears as a third tab in the response pane, next to Response and Response Metadata. Server-streaming, client-streaming, and bidirectional (duplex) methods don't show it because a streaming call has no single-call latency to measure.

Running a benchmark

  1. Pick a unary method
  2. Fill in the request body in the Body tab and any metadata in the Metadata tab
  3. Switch to the Performance tab
  4. Set Calls (1 -- 10000) and Concurrency (1 -- 20)
  5. Click Run benchmark

While the run is in progress:

  • The progress bar shows completed / total calls and a percentage
  • The progress bar shimmers to indicate active work
  • A Stop button appears -- click it to cancel cleanly. Stats compute from whatever calls already completed.
  • Stats and charts update live every ~2 % of progress (capped to keep the UI responsive for large N)
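Under the hood the run behaves roughly like a concurrency-limited worker loop. The sketch below is illustrative only -- invokeOnce and onProgress are stand-in callbacks, not Bowire internals:

```typescript
// Rough sketch of the benchmark loop. invokeOnce and onProgress are
// illustrative callbacks, not Bowire's real internals.
async function runBenchmark(
  calls: number,                      // "Calls" field (1 - 10000)
  concurrency: number,                // "Concurrency" field (1 - 20)
  invokeOnce: () => Promise<number>,  // resolves with latency in ms, rejects on failure
  onProgress: (latencies: number[], failed: number, completed: number) => void,
  signal: AbortSignal,                // wired to the Stop button
): Promise<{ latencies: number[]; failed: number }> {
  const latencies: number[] = [];
  let failed = 0;
  let completed = 0;
  let started = 0;

  // Redraw stats/charts roughly every 2 % of progress (at least every call).
  const updateEvery = Math.max(1, Math.floor(calls * 0.02));

  async function worker(): Promise<void> {
    while (started < calls && !signal.aborted) {
      started++;
      try {
        latencies.push(await invokeOnce());  // successful call: keep its latency
      } catch {
        failed++;                            // failed call: counted, never timed
      }
      completed++;
      if (completed % updateEvery === 0) onProgress(latencies, failed, completed);
    }
  }

  // `concurrency` workers pull from a shared counter until the target is hit
  // or the run is stopped; stats then cover whatever already completed.
  await Promise.all(Array.from({ length: Math.min(concurrency, calls) }, () => worker()));
  onProgress(latencies, failed, completed);
  return { latencies, failed };
}
```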

What gets measured

Each iteration runs through the full Bowire pipeline:

  • ${var} substitution against the active environment
  • ${now}, ${uuid}, and other system variables -- regenerated per call, so timestamps and IDs differ across iterations (see the sketch below)
  • Auth helper from the active environment -- including JWT signing and OAuth token caching
  • Whatever metadata you set in the Metadata tab

Latency is measured as the wall-clock time from the moment the /api/invoke request leaves the browser to the moment the JSON response comes back. This includes Bowire's server-side dispatch overhead, but in practice that's small relative to gRPC round-trip time.

Failed calls (network errors, exceptions, error responses) do not count toward the latency stats -- they're tracked separately in the success / failed counters and the status distribution.
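Putting those points together, a single iteration looks roughly like the sketch below. The exact /api/invoke payload shape and the substituteVars helper are assumptions for illustration, not Bowire's documented API:

```typescript
// Sketch of one benchmark iteration. The /api/invoke payload shape and the
// substituteVars helper are assumptions, not Bowire's documented API.
async function timedCall(bodyTemplate: string, metadata: Record<string, string>): Promise<number> {
  // ${var}, ${now}, ${uuid}, ... are re-substituted on every iteration,
  // so each call gets fresh timestamps and IDs.
  const body = substituteVars(bodyTemplate);

  const start = performance.now();
  const res = await fetch("/api/invoke", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ body, metadata }),
  });
  const result = await res.json();
  const elapsed = performance.now() - start;  // wall clock, includes dispatch overhead

  if (!res.ok || result.status !== "OK") {
    throw new Error(result.status ?? "NetworkError");  // failures are counted, not timed
  }
  return elapsed;
}

// Hypothetical substitution helper covering only the two system variables
// named above; environment ${var} lookup is omitted for brevity.
function substituteVars(template: string): string {
  return template
    .replace(/\$\{uuid\}/g, () => crypto.randomUUID())
    .replace(/\$\{now\}/g, () => new Date().toISOString());
}
```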

Stats

After at least one successful call, the stats grid shows:

  Stat         Meaning
  min          Fastest call
  avg          Arithmetic mean
  p50          Median (50th percentile)
  p90          90th percentile
  p95          95th percentile
  p99          99th percentile
  max          Slowest call
  throughput   success_count / total_seconds (req/s)
  success      Number of OK responses
  failed       Network errors + non-success responses
  total        Wall-clock duration of the whole run
  count        Number of successful calls included in the percentile math

Throughput is computed against the wall-clock duration of the whole run, so it reflects effective concurrency. With concurrency = 1 it's roughly 1000 / avg (avg in ms). With concurrency = 10 it can be much higher if the server can handle parallel requests.
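All of these can be reproduced from the raw latency list with a few lines of math. The nearest-rank percentile in this sketch is an assumption about the exact rounding:

```typescript
// Percentile / throughput math over the collected latencies (ms).
// Nearest-rank percentiles are an assumption about how Bowire rounds.
function computeStats(latencies: number[], failed: number, totalSeconds: number) {
  const sorted = [...latencies].sort((a, b) => a - b);
  const pct = (p: number) =>
    sorted[Math.min(sorted.length - 1, Math.ceil((p / 100) * sorted.length) - 1)];

  return {
    min: sorted[0],
    avg: sorted.reduce((sum, x) => sum + x, 0) / sorted.length,
    p50: pct(50),
    p90: pct(90),
    p95: pct(95),
    p99: pct(99),
    max: sorted[sorted.length - 1],
    success: sorted.length,
    failed,
    count: sorted.length,
    throughput: sorted.length / totalSeconds,  // req/s over the run's wall clock
  };
}
```

With concurrency = 1 the run's wall-clock time is roughly success × avg / 1000 seconds, so the throughput formula collapses to about 1000 / avg req/s, matching the note above.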

Status distribution

Below the stats grid, every distinct status name (OK, NotFound, Unavailable, Error, NetworkError, ...) is listed with its occurrence count. Each entry is color-coded the same way as the action bar status indicators:

  • green -- OK / success states
  • yellow -- recoverable / client errors (NotFound, InvalidArgument, ...)
  • red -- server errors and network failures

This is handy for catching intermittent failures: a 1000-call run that's "mostly fine" might still show 17 × Unavailable here.
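The distribution itself is just a per-call tally of status names, bucketed by severity for coloring. The exact classification list in this sketch is an assumption:

```typescript
// Tally distinct status names across all calls (successful and failed),
// then bucket them the way the action bar colors do. The classification
// beyond the statuses named above is an assumption.
type Severity = "green" | "yellow" | "red";

function statusDistribution(statuses: string[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const s of statuses) counts.set(s, (counts.get(s) ?? 0) + 1);
  return counts;
}

function severity(status: string): Severity {
  if (status === "OK") return "green";
  if (["NotFound", "InvalidArgument"].includes(status)) return "yellow"; // recoverable / client errors
  return "red";                                                          // server errors, network failures
}
```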

Latency histogram

The histogram bins all successful call latencies into 24 equal-width buckets between min and max. Each bar shows how many calls fell into that bucket. Hover any bar (the SVG <title> tooltip) for the exact range and count.
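Equal-width binning is straightforward; a rough sketch (the bucket count matches the 24 above):

```typescript
// Bin successful latencies (ms) into 24 equal-width buckets between min and max.
const BUCKETS = 24;

function histogram(latencies: number[]): number[] {
  const min = Math.min(...latencies);
  const max = Math.max(...latencies);
  const width = (max - min) / BUCKETS || 1;  // avoid zero width when all calls tie
  const counts = new Array<number>(BUCKETS).fill(0);
  for (const ms of latencies) {
    const i = Math.min(BUCKETS - 1, Math.floor((ms - min) / width));  // clamp max into last bucket
    counts[i]++;
  }
  return counts;
}
```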

Two dashed vertical lines mark the p50 (green) and p95 (yellow) boundaries so you can read the shape of the distribution at a glance:

  • Tight cluster around p50, short tail -- well-behaved service
  • Long right tail past p95 -- some calls are way slower than typical (GC pause? cold cache?)
  • Bimodal (two peaks) -- two distinct paths through the code, e.g. cache hit vs miss

Latency over time

The second chart plots latency against call index as a polyline. Same p50 / p95 markers, but as horizontal lines this time, so you can see how individual calls compare to the percentile boundaries.
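Rendering-wise, the chart is just the latency array mapped to SVG coordinates. A rough sketch, with sizes and scaling chosen arbitrarily here rather than taken from Bowire:

```typescript
// Map (call index, latency) pairs onto an SVG polyline "points" string.
// Width, height, and scaling are illustrative, not Bowire's exact rendering.
function polylinePoints(latencies: number[], width = 600, height = 120): string {
  const max = Math.max(...latencies) || 1;
  return latencies
    .map((ms, i) => {
      const x = (i / Math.max(1, latencies.length - 1)) * width;
      const y = height - (ms / max) * height;  // slower calls plot higher (SVG y grows downward)
      return `${x.toFixed(1)},${y.toFixed(1)}`;
    })
    .join(" ");
}
```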

Patterns to look for:

  • Flat line near p50 -- consistent performance
  • First few calls slow, then drops -- warm-up effect (JIT, connection pool, cache priming)
  • Periodic spikes -- background work, GC, or downstream throttling
  • Increasing trend -- memory leak, connection exhaustion, or backpressure building up

For sequential runs (concurrency = 1) the X axis is also a time axis. With concurrency > 1 the indices reflect completion order, not start order, so the picture is fuzzier.

Memory

Bowire keeps only the per-call latency numbers, never the response bodies, so memory stays bounded even for very large N. A 10 000-call run uses roughly 80 KB of latency data plus the SVG render.

The benchmark state is in memory only and is reset whenever you start a new run. There's no export today -- copy the stats values manually if you need to log them somewhere.

Tips

  • Use small N first -- start with 100 calls to spot configuration mistakes before running 5000.
  • Watch the Console -- the Console / Log View gets a single REQ entry at the start of the benchmark and a RES entry at the end with the totals. Individual calls are not logged (they would flood the buffer).
  • Bump concurrency carefully -- the browser fetches go out simultaneously up to the concurrency limit. With high concurrency you're benchmarking your network stack and the server's ability to handle parallel requests, not single-call latency.
  • Combine with Environments -- benchmark the same call against Dev, then Staging, then Prod by switching environments. The results are not retained across switches, so screenshot or copy the stats first.
  • Compare percentiles, not averages -- the average can hide a long tail. p99 is the number that matters for user-facing latency.