anyscale/custom-router-api-benchmarks

Ray Serve Custom Router API Benchmarks

A benchmarking suite for comparing request router implementations in Ray's LLM serving system. It compares Power-of-2 routers against Prefix-Aware routers across a range of replica counts.
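For readers new to the two strategies, here is a toy sketch of the general ideas, not Ray Serve's actual router code. Power-of-two-choices samples two replicas at random and picks the less loaded one. A prefix-aware policy prefers the replica that has already seen a prompt sharing the longest prefix, so cached prefix work can be reused. The function names, the `queue_lengths` mapping, and the `seen_prompts` mapping are all hypothetical and exist only for illustration.

```python
import random
from os.path import commonprefix


def power_of_two_choice(queue_lengths, rng=random):
    """Sample two distinct replicas at random and return the less loaded one.

    `queue_lengths` is a hypothetical mapping of replica id -> queued requests.
    """
    if len(queue_lengths) < 2:
        return next(iter(queue_lengths))
    first, second = rng.sample(sorted(queue_lengths), 2)
    return first if queue_lengths[first] <= queue_lengths[second] else second


def prefix_aware_choice(prompt, seen_prompts, queue_lengths, rng=random):
    """Prefer the replica whose earlier prompts share the longest prefix.

    `seen_prompts` is a hypothetical mapping of replica id -> list of prompts
    that replica has already served.
    """
    best_replica, best_len = None, 0
    for replica, prompts in seen_prompts.items():
        for earlier in prompts:
            match_len = len(commonprefix([prompt, earlier]))
            if match_len > best_len:
                best_replica, best_len = replica, match_len
    if best_replica is None:
        # No shared prefix anywhere: fall back to load-based routing.
        return power_of_two_choice(queue_lengths, rng)
    return best_replica


if __name__ == "__main__":
    load = {"replica-a": 0, "replica-b": 5}
    history = {"replica-a": [], "replica-b": ["You are a helpful assistant. Hello"]}
    print(power_of_two_choice(load))
    print(prefix_aware_choice("You are a helpful assistant. Hi", history, load))
```

A real prefix-aware router would typically use a prefix tree and also weigh replica load; the linear scan here only keeps the illustration short.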

Project Structure

custom-router-api-benchmarks/
├── scripts/
│   ├── engine_metrics.py           # Metrics collection and parsing from Ray /metrics endpoint
│   ├── sweep_replicas.py           # Main replica scaling benchmark script
│   └── visualize_replica_sweep.py  # Visualization and analysis of results
└── results/
    ├── replica_sweep/              # Current benchmark results (JSON + raw metrics)
    └── visualizations/             # Generated plots and charts
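As a rough illustration of what parsing a /metrics endpoint can involve, the sketch below reads Prometheus-style exposition text into `(name, labels, value)` tuples. It is not the contents of engine_metrics.py. The function name `parse_metrics_text` and the sample metric names are hypothetical, and it assumes the endpoint serves the standard Prometheus text format.

```python
import re

# One sample line: metric name, optional {labels}, value, optional timestamp.
_SAMPLE = re.compile(
    r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+-?\d+)?$'
)
# One label pair inside the braces: key="value" (with backslash escapes).
_LABEL = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics_text(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _SAMPLE.match(line)
        if match is None:
            continue  # ignore lines this simple pattern cannot read
        name, raw_labels, value = match.groups()
        labels = dict(_LABEL.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


if __name__ == "__main__":
    # Hypothetical metric names, for illustration only.
    demo = (
        "# TYPE demo_requests_total counter\n"
        'demo_requests_total{replica="r1",route="/v1"} 12\n'
        "demo_queue_len 3.5\n"
    )
    for sample in parse_metrics_text(demo):
        print(sample)
```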

Usage

Run Benchmark

cd scripts/
python sweep_replicas.py

Generate Visualizations

cd scripts/
python visualize_replica_sweep.py

Requirements

  • Ray cluster, see k8s install steps here
  • Docker image: rayproject/ray-llm:nightly-py311-cu128
  • Ray nightly wheel

Set the following environment variables in an Anyscale Service runtime_env for optimal performance:

  • ANYSCALE_RAY_SERVE_THROUGHPUT_OPT=1
  • RAYLLM_ROUTER_TO_MODEL_REPLICA_RATIO=8
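A minimal sketch of how the two variables above could be expressed as a Ray `runtime_env` dictionary, using its `env_vars` field. How this dictionary is attached to your Anyscale Service config is not shown here and depends on your deployment setup. Values are written as strings because environment variables are always strings.

```python
# runtime_env fragment carrying the two variables listed above.
runtime_env = {
    "env_vars": {
        "ANYSCALE_RAY_SERVE_THROUGHPUT_OPT": "1",
        "RAYLLM_ROUTER_TO_MODEL_REPLICA_RATIO": "8",
    }
}

# Sanity check: every environment variable value must be a string.
for key, value in runtime_env["env_vars"].items():
    if not isinstance(value, str):
        raise TypeError(f"{key} must be a string, got {type(value).__name__}")
```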
