Ray Serve Custom Router API Benchmarks

A benchmarking suite for comparing request router implementations in Ray's LLM serving system. It compares Power-of-2 routers against Prefix-Aware routers across a range of replica counts.
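
To make the two strategies concrete, here is a toy sketch of how each might pick a replica. It is an illustration only, not Ray Serve's implementation: the `Replica` class, the `min_match` threshold, and the character-level prefix comparison are all assumptions made for this example.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical stand-in for a model replica; not a Ray Serve class."""
    name: str
    queue_len: int = 0
    seen_prompts: list = field(default_factory=list)


def common_prefix_len(a: str, b: str) -> int:
    """Length of the shared leading substring of a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def pow2_choice(replicas, rng=random):
    """Sample two distinct replicas and keep the less loaded one."""
    a, b = rng.sample(replicas, 2)
    return a if a.queue_len <= b.queue_len else b


def prefix_aware_choice(replicas, prompt, min_match=8, rng=random):
    """Prefer the replica whose past prompts share the longest prefix with
    `prompt`; fall back to power-of-two when no match reaches `min_match`."""
    best, best_len = None, 0
    for r in replicas:
        match = max(
            (common_prefix_len(prompt, p) for p in r.seen_prompts), default=0
        )
        if match > best_len:
            best, best_len = r, match
    if best is None or best_len < min_match:
        return pow2_choice(replicas, rng)
    return best
```

The trade-off this sketch hints at: power-of-two selection balances load cheaply, while prefix-aware selection trades some balance for a better chance of reusing cached prefixes on a replica that has already seen a similar prompt.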

Project Structure

    custom-router-api-benchmarks/
    ├── scripts/
    │   ├── engine_metrics.py            # Metrics collection and parsing from Ray /metrics endpoint
    │   ├── sweep_replicas.py            # Main replica scaling benchmark script
    │   └── visualize_replica_sweep.py   # Visualization and analysis of results
    └── results/
        ├── replica_sweep/               # Current benchmark results (JSON + raw metrics)
        └── visualizations/              # Generated plots and charts
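
For readers unfamiliar with what parsing a `/metrics` endpoint involves, the sketch below parses Prometheus-style text exposition using only the standard library. It is not the code in `engine_metrics.py`; the function name `parse_metrics` and the `example_*` metric names are hypothetical, and the parser is deliberately simplified (it ignores timestamps and does not unescape label values).

```python
import re

# Matches sample lines such as: name{label="v",other="w"} 12.5
_SAMPLE_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
_LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Parse metrics text into a list of (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        m = _SAMPLE_RE.match(line)
        if m is None:
            continue
        name, raw_labels, value = m.groups()
        labels = dict(_LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


# Hypothetical payload; real metric names will differ.
SAMPLE_TEXT = """\
# HELP example_requests_total Hypothetical counter.
# TYPE example_requests_total counter
example_requests_total{replica="r1"} 12
example_requests_total{replica="r2"} 30
example_queue_len 4
"""

samples = parse_metrics(SAMPLE_TEXT)
```

In a real run, the text would come from an HTTP GET against the cluster's metrics endpoint rather than a literal string, and the resulting tuples could be grouped by label to compare replicas.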

Usage

Run Benchmark

    cd scripts/
    python sweep_replicas.py

Generate Visualizations

    cd scripts/
    python visualize_replica_sweep.py

Requirements

Ray cluster, see k8s install steps here
Docker image: rayproject/ray-llm:nightly-py311-cu128
Ray nightly wheel

Set the following environment variables in an Anyscale Service `runtime_env` for optimal performance:

ANYSCALE_RAY_SERVE_THROUGHPUT_OPT=1
RAYLLM_ROUTER_TO_MODEL_REPLICA_RATIO=8
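
As a minimal sketch, these variables can be expressed through the `env_vars` field of a Ray `runtime_env`, shown here as a Python dictionary. Where this dictionary is attached (for example, in a service or application config) depends on your deployment and is not specified by this README; note that `env_vars` values must be strings.

```python
# Environment variables from this README, expressed as a runtime_env mapping.
# Values are strings because runtime_env "env_vars" maps str -> str.
runtime_env = {
    "env_vars": {
        "ANYSCALE_RAY_SERVE_THROUGHPUT_OPT": "1",
        "RAYLLM_ROUTER_TO_MODEL_REPLICA_RATIO": "8",
    }
}
```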
