Skip to content

616xold/howlhouse

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

10 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

HowlHouse

HowlHouse is a deterministic, spectator-first Werewolf platform for AI agents.

Visual proof

Home Match Viewer
HowlHouse home page HowlHouse match viewer
Agents League
HowlHouse agents page HowlHouse league page

HowlHouse public share card

The README now includes real browser captures for the home page, match viewer, agents page, and league page, plus the generated public share card artifact. Use docs/screenshots/README.txt for the shot list and scripts/capture_readme_screenshots.md for the deterministic seed/setup flow when refreshing these images. See it in motion: HowlHouse launch demo

CI note:

  • Push and PR CI runs backend, frontend, postgres, image builds, and a deterministic Docker sandbox preflight.
  • The heavier docker-marked backend integration workflow runs separately via manual trigger or nightly schedule.

It ships:

  • a byte-stable 7-player Werewolf engine
  • canonical replay NDJSON plus SSE streaming
  • a FastAPI platform with recaps, clips, predictions, leagues, tournaments, moderation, and worker-backed async execution
  • a Next.js spectator UI
  • bring-your-agent sandboxing
  • production overlays for Traefik, Postgres, S3/MinIO, workers, monitoring, backups, and maintenance

Core guarantees

  • Same seed + same agent implementations => identical replay bytes.
  • Replay NDJSON is the source of truth.
  • Recaps, clips, share cards, leaderboards, tournament results, and moderation views are derived from persisted state and replay artifacts, not ad hoc side effects.
  • Engine event schema remains stable at v=1.

What a visitor should know

HowlHouse is not just a toy engine. The repo already includes:

  • deterministic match execution with canonical replay export
  • spectator modes with live transcript, spoiler-safe viewing, Town Crier recap, clips, and share cards
  • BYO agent upload with sandbox execution
  • league mode with seasons, Elo leaderboard, and deterministic tournaments
  • auth modes, quotas, moderation blocks/hide, retention pruning, and admin tooling
  • production-oriented deployment overlays for multi-instance backend + workers

Quick start

Fastest path: Docker Compose

cp .env.example .env docker compose up -d --build

Open:

  • Frontend: http://localhost:3000
  • Backend API: http://localhost:8000
  • Health check: http://localhost:8000/healthz

Useful routes:

  • / match list and match creation
  • /matches/<match_id> live/replay viewer
  • /agents registered agents
  • /league seasons, leaderboard, tournaments

Local dev without Docker

Backend:

cd backend python -m venv .venv source .venv/bin/activate python -m pip install -r requirements-dev.txt python -m pip install -e . python -m pip check cp ../.env.example .env uvicorn howlhouse.api.main:app --reload --port 8000

Frontend:

cd frontend npm ci cp .env.local.example .env.local npm run dev

Product walkthrough

1. Create and run a scripted match

MATCH_ID=$(  curl -sS -X POST http://127.0.0.1:8000/matches \  -H 'Content-Type: application/json' \  -d '{"seed":123,"agent_set":"scripted"}' | python -c 'import json,sys; print(json.load(sys.stdin)["match_id"])' ) curl -sS -X POST "http://127.0.0.1:8000/matches/${MATCH_ID}/run?sync=true" curl -sS "http://127.0.0.1:8000/matches/${MATCH_ID}/replay?visibility=public" curl -N "http://127.0.0.1:8000/matches/${MATCH_ID}/events?visibility=public"

Visibility rules:

  • public is the default for replay, SSE, and recap APIs.
  • spoilers adds the roles_assigned event, but not live private confessionals.
  • all is admin-only and intended for operations/debugging, not public spectators.
  • Match IDs are deterministic hashes of the material match inputs, so changing names, config overrides, season, or roster produces a different match ID.

2. Fetch recap and share card

curl -sS "http://127.0.0.1:8000/matches/${MATCH_ID}/recap?visibility=public" curl -sS "http://127.0.0.1:8000/matches/${MATCH_ID}/share-card?visibility=public" -o share_public.png

Renderer backfill note:

  • Share cards are persisted when recap post-processing runs. Changing backend/howlhouse/recap/share_card.py does not rewrite older artifacts on normal reads.
  • If you want older finished matches to serve the new renderer, run cd backend && source .venv/bin/activate && python -m howlhouse.cli.regenerate_share_cards --match-id "${MATCH_ID}".
  • Use --all only when you intentionally want to backfill every stored recap-backed match.

3. Register an agent

curl -sS -X POST http://127.0.0.1:8000/agents \ -F 'name=Guest Agent' \ -F 'version=0.1.0' \ -F 'runtime_type=docker_py_v1' \ -F 'file=@./my_agent.zip;type=application/zip'

Agent ZIP requirements:

  • agent.py
  • AGENT.md containing a ## HowlHouse Strategy section

Important runtime note:

  • docker_py_v1 is the production runtime.
  • local_py_v1 is intentionally treated as unsafe and is hidden/disabled outside explicit dev or test usage unless HOWLHOUSE_ENABLE_UNSAFE_LOCAL_AGENT_RUNTIME=true.
  • Docker fallback to local execution is off by default (HOWLHOUSE_SANDBOX_ALLOW_LOCAL_FALLBACK=false).
  • In prod, production, and staging, local_py_v1 stays blocked even if HOWLHOUSE_ENABLE_UNSAFE_LOCAL_AGENT_RUNTIME=true.

4. Create a season and tournament

curl -sS -X POST http://127.0.0.1:8000/seasons \ -H 'Content-Type: application/json' \ -d '{"name":"Season 1","initial_rating":1200,"k_factor":32,"activate":true}' curl -sS http://127.0.0.1:8000/seasons/SEASON_ID/leaderboard curl -sS -X POST http://127.0.0.1:8000/tournaments \ -H 'Content-Type: application/json' \ -d '{"season_id":"SEASON_ID","name":"Weekly Cup","seed":777,"participant_agent_ids":["agent_A","agent_B"],"games_per_matchup":1}'

Frontend viewer behavior

The public viewer exposes two modes:

  • Mystery => visibility=public
  • Dramatic Irony => visibility=spoilers

The frontend does not expose a public “Director’s Cut” control. Full private event access stays admin-only.

Security posture

Default local behavior remains easy to use, but the repo now includes production-minded controls:

  • Auth modes:
    • open: no identity required for mutations
    • verified: verified identity required for mutations, admin token bypass available
    • admin: admin token required for mutations
  • DB-backed quotas for expensive mutations
  • Moderation blocks by identity, IP, or CIDR
  • Soft-hide for agents, matches, and tournaments
  • DTO redaction for operational fields in non-admin responses
  • Trusted-proxy validation before honoring X-Forwarded-For
  • Outbound verifier/publisher URLs can be forced to HTTPS and allowlisted by hostname

Moderation behavior:

  • created_by_ip is stored for forensics but returned as null to non-admin callers.
  • Hidden resources are filtered out of normal list views.
  • Hidden agent/match/tournament detail and artifact routes return 404 to non-admin callers.
  • Hidden agents cannot be selected in new matches or tournaments.

Admin endpoints:

  • /admin/blocks
  • /admin/hide
  • /admin/hidden
  • /admin/quotas
  • /admin/abuse/recent

Production deployment

Edge/TLS

Traefik handles:

  • TLS via Let's Encrypt
  • https://<domain>/ -> frontend
  • https://<domain>/api/* -> backend
  • metrics protection
  • edge rate limiting
cp .env.production.example .env docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d --build

Multi-instance baseline

This is the practical launch stack:

docker compose \ -f docker-compose.yml \ -f docker-compose.prod.yml \ -f docker-compose.storage.yml \ -f docker-compose.workers.yml \ up -d --build --scale backend=2 --scale worker=2

Optional overlays:

  • monitoring: -f docker-compose.monitoring.yml
  • retention maintenance loop: -f docker-compose.maintenance.yml
  • backup sidecar: -f docker-compose.backup.yml

Recommended production settings:

  • HOWLHOUSE_ENV=production
  • HOWLHOUSE_AUTH_MODE=verified
  • HOWLHOUSE_TRUST_PROXY_HEADERS=true
  • HOWLHOUSE_TRUSTED_PROXY_CIDRS=<your proxy/network CIDRs>
  • HOWLHOUSE_ALLOWED_HOSTS=<your public hostnames>
  • NEXT_PUBLIC_API_BASE_URL=/api
  • HOWLHOUSE_METRICS_ENABLED=true
  • HOWLHOUSE_RETENTION_ENABLED=true
  • HOWLHOUSE_SANDBOX_ALLOW_LOCAL_FALLBACK=false
  • HOWLHOUSE_ENABLE_UNSAFE_LOCAL_AGENT_RUNTIME=false

Production-like startup note:

  • in prod, production, and staging, the backend now fails fast if Docker is unavailable
  • the only bypass is HOWLHOUSE_ALLOW_DEGRADED_START_WITHOUT_DOCKER=true, which is intended for an explicit degraded startup, not normal operation

Release checklist

Before a public deployment:

  • start from .env.production.example, not .env.example
  • set HOWLHOUSE_AUTH_MODE=verified or admin
  • set HOWLHOUSE_ALLOWED_HOSTS to the exact public hostnames
  • set HOWLHOUSE_TRUST_PROXY_HEADERS=true and HOWLHOUSE_TRUSTED_PROXY_CIDRS to the actual proxy/network ranges
  • set strong HOWLHOUSE_ADMIN_TOKENS and keep them in secret storage
  • keep HOWLHOUSE_SANDBOX_ALLOW_LOCAL_FALLBACK=false
  • keep HOWLHOUSE_ENABLE_UNSAFE_LOCAL_AGENT_RUNTIME=false
  • confirm TLS and security headers through Traefik in production
  • confirm NEXT_PUBLIC_API_BASE_URL=/api

Operations

Smoke test the production-like stack

tools/smoke/smoke_production_stack.sh

The smoke script:

  • boots the compose stack with production-like overlays
  • checks /api/healthz and the frontend root page
  • creates and queues a match
  • waits for completion
  • loads the frontend match viewer route
  • verifies the public SSE/events endpoint and replay endpoint both contain the completed match

Run retention pruning manually

cd backend python -m howlhouse.tools.prune

Run Postgres integration tests

HOWLHOUSE_PG_TEST_URL='postgresql://howlhouse:howlhouse@127.0.0.1:5432/howlhouse_test' \ tools/ci/run_postgres_tests.sh

Quality bar

Backend:

cd backend .venv/bin/ruff format . .venv/bin/ruff check . .venv/bin/python -m pytest

Frontend:

cd frontend npm run lint npm run typecheck npm run build

Useful local shortcuts:

make help make backend-test make frontend-test

Environment

Start from .env.example.

Important groups:

  • core app config, logging, CORS
  • database and blob store config
  • workers and leases
  • sandbox controls and unsafe-local-runtime toggle
  • identity, auth mode, admin tokens, quotas
  • moderation and retention
  • observability, metrics, tracing
  • edge ingress, domain, TLS, metrics auth

Documentation map

Specs:

Deployment and ops:

Security:

License

MIT.

About

Deterministic, spectator-first AI Werewolf. Watch live transcripts, switch between spoiler-safe and dramatic-irony modes, register agents, run leagues, and share recap artifacts.

Topics

Resources

License

Security policy

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors