A FastAPI-based backend demonstrating the OpenAI Agents SDK with streaming endpoints for multiple AI agents. This project features dedicated routers per agent with real-time streaming events including agent updates, raw responses, and run items, plus persistent conversation memory for multi-turn interactions.
This project is structured with separate packages for each agent:
src/ βββ api/ # FastAPI application β βββ routers/ # Agent-specific endpoints β βββ utils/ # Shared utilities βββ chat_agent/ # General chat agent package βββ research_bot/ # Basic research agent package β βββ agents/ # Planner, Search, and Writer agents β βββ manager.py # Research orchestration βββ deep_research_agent/ # Advanced multi-agent research system βββ agents/ # Hierarchical specialized agents βββ models.py # Comprehensive data models βββ orchestrator.py # Multi-agent coordination βββ tools.py # Research function tools βββ config.py # Advanced configuration system Each agent package can be imported and used independently, making the system modular and scalable.
- π Per-agent dedicated endpoints with standardized patterns
- π‘ Real-time streaming with Server-Sent Events (SSE)
- π Event types: Raw LLM responses, semantic agent events, handoffs
- πΎ Session Memory & Conversation History - Persistent multi-turn conversations
- π§© Modular architecture - each agent as separate package
- π Auto-generated OpenAPI docs at
/docs - π§ Development-ready with hot reload and comprehensive logging
Built-in conversational memory using OpenAI Agents SDK's SQLiteSession:
- Multi-turn conversations: Agents remember context across requests
- Session isolation: Each user/conversation maintains separate history
- Persistent storage: Conversation history survives server restarts
- Environment-based config: Enable with simple
ENABLE_SESSIONS=true - Zero code changes: Existing endpoints automatically support sessions
- Production-ready: SQLite-based storage with proper error handling
Perfect for: Chatbots, virtual assistants, customer support, educational apps, and any conversational AI that needs context awareness.
- uv - Fast Python package manager
- OpenAI API Key - Set in environment variables
# On macOS/Linux curl -LsSf https://astral.sh/uv/install.sh | sh # On Windows (PowerShell) powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex" # Via pip pip install uvgit clone https://github.com/ahmad2b/openai-agents-streaming-api.git cd openai-agents-streaming-api # Create virtual environment with correct Python version uv venv # Activate virtual environment # On Unix/macOS: source .venv/bin/activate # On Windows: .venv\Scripts\activate # Install project in development mode with all dependencies uv pip install -e .Create a .env file in the project root:
# .env OPENAI_API_KEY=your_openai_api_key_here # Session Memory Configuration (Optional) ENABLE_SESSIONS=true # Enable conversation memory SESSION_DB_PATH=./conversations.db # SQLite database path # Optional: Logging level LOG_LEVEL=INFO # Optional: Custom port (default is 8000) PORT=8000# Using uvicorn directly (with hot reload) uvicorn src.api.main:app --reload --host 0.0.0.0 --port 8000 # Or using the module directly python -m src.api.main# Install production dependencies (if different) uv pip install -e . # Run with production settings uvicorn src.api.main:app --host 0.0.0.0 --port 8000 --workers 4# Run directly with uv (manages virtual environment automatically) uv run uvicorn src.api.main:app --reload- Application:
http://127.0.0.1:8000 - Interactive Docs:
http://127.0.0.1:8000/docs - ReDoc:
http://127.0.0.1:8000/redoc
Each agent has standardized endpoints with automatic session support:
POST /assistant/run # Synchronous execution POST /assistant/stream # Real-time streaming GET /assistant/info # Agent information & session config GET /assistant/session/{session_id} # Get all messages for session DELETE /assistant/session/{session_id} # Clear conversation historyPOST /chat/run # Synchronous execution POST /chat/stream # Real-time streaming GET /chat/info # Agent information & session config GET /chat/session/{session_id} # Get all messages for session DELETE /chat/session/{session_id} # Clear conversation historyPOST /research # Full research pipeline# Test the chat agent without session memory curl -X POST "http://127.0.0.1:8000/chat/run" \ -H "Content-Type: application/json" \ -d '{"input": "Hello, how can you help me?"}'# First message with session_id - agent introduces context curl -X POST "http://127.0.0.1:8000/chat/run" \ -H "Content-Type: application/json" \ -d '{"input": "Hi, my name is Sarah and I work as a software engineer", "session_id": "user_sarah_123"}' # Second message - agent remembers Sarah and her profession curl -X POST "http://127.0.0.1:8000/chat/run" \ -H "Content-Type: application/json" \ -d '{"input": "What kind of work do I do?", "session_id": "user_sarah_123"}' # Stream responses with conversation context curl -X POST "http://127.0.0.1:8000/chat/stream" \ -H "Content-Type: application/json" \ -d '{"input": "Give me some programming tips for my field", "session_id": "user_sarah_123"}' \ --no-buffer# Get all messages for a session curl -X GET "http://127.0.0.1:8000/chat/session/user_sarah_123" # Get limited number of recent messages (e.g., last 10) curl -X GET "http://127.0.0.1:8000/chat/session/user_sarah_123?limit=10" # Clear conversation history for a user curl -X DELETE "http://127.0.0.1:8000/chat/session/user_sarah_123" # Check agent info and session configuration curl -X GET "http://127.0.0.1:8000/chat/info"Environment Variables:
ENABLE_SESSIONS=true- Enable conversation memory globallySESSION_DB_PATH=./conversations.db- SQLite database location (optional)
Automatic Session Handling:
- No session_id: Traditional stateless interaction
- With session_id: Automatic conversation history using OpenAI Agents SDK's SQLiteSession
- Session isolation: Each session_id maintains separate conversation memory
- Persistent storage: History survives server restarts and deployments
Built on OpenAI Agents SDK patterns:
- Uses
SQLiteSessionfor reliable conversation storage - Integrates with
Runner.run()andRunner.run_streamed()seamlessly - Maintains conversation context across agent handoffs
- Supports both synchronous and streaming interactions with memory
- Environment-driven: 12-factor app configuration
- Zero code changes: Works with existing agent implementations
- Scalable storage: SQLite for single instance, easily extensible to PostgreSQL
- Error handling: Graceful degradation when sessions unavailable
- Security: Session isolation prevents cross-user data leakage
# Add new dependencies uv add package-name # Add development dependencies uv add --dev pytest black isort # Update all dependencies uv lock --upgrade # Install from lock file (for consistent environments) uv pip install -r uv.lock# Install development tools uv add --dev black isort flake8 pytest # Format code black src/ isort src/ # Run linting flake8 src/ # Run tests pytestEach agent can be imported and used independently, with or without session memory:
# Using the chat agent directly (no session) from src.chat_agent.main import chat_agent from agents import Runner result = await Runner.run(chat_agent, "Hello!") # Using with session memory from agents import SQLiteSession session = SQLiteSession("user_123", db_path="conversations.db") result = await Runner.run(chat_agent, "Hello!", session=session) # Using the research bot from src.research_bot.manager import ResearchManager manager = ResearchManager() report = await manager.run("AI trends 2024")Each agent uses the standardized create_agent_router() utility that provides:
- POST
/run- Synchronous execution with complete response - POST
/stream- Real-time streaming with formatted events - DELETE
/session/{session_id}- Clear conversation history for specific session - GET
/info- Agent metadata, configuration, and session status
The streaming endpoints emit structured events:
raw_response- Direct from OpenAI (text deltas, function calls, etc.)run_item- Semantic agent events (tool usage, handoffs, reasoning)agent_updated- Agent handoff notificationsstream_complete- Final results with usage statistics and session infoerror- Error handling with details
To add a new agent with automatic session support:
- Create
src/your_agent/main.pywith agent definition - Create
src/api/routers/your_agent.pyusingcreate_agent_router() - Include the router in
src/api/main.py - Sessions work automatically - no additional code needed!
- Import Errors: Ensure you're running from the project root and the virtual environment is activated
- OpenAI API Key: Check that
OPENAI_API_KEYis set in your.envfile - Port Conflicts: Change the port in the uvicorn command if 8000 is occupied
- Python Version: Ensure Python 3.13+ is installed and selected
- Session Memory: Set
ENABLE_SESSIONS=trueand restart server to enable conversation memory
# Run with debug logging LOG_LEVEL=DEBUG uvicorn src.api.main:app --reload # Check agent information and session configuration curl http://127.0.0.1:8000/chat/info # Test session functionality curl -X POST http://127.0.0.1:8000/chat/run \ -H "Content-Type: application/json" \ -d '{"input": "Test message", "session_id": "debug_session"}'- Use
--workers Nfor production deployment - Configure appropriate
--timeout-keep-alivefor long streaming sessions - Monitor memory usage with longer conversations
- Session cleanup: Implement periodic cleanup of old conversation sessions
- Database maintenance: Regular SQLite VACUUM for optimal performance
- Customer support chatbots with conversation context
- Virtual assistants that remember user preferences
- Educational tutors tracking learning progress
- Healthcare assistants maintaining patient interaction history
- Employee helpdesk with persistent conversation threads
- Sales assistants remembering customer interactions
- Internal knowledge bots with user-specific context
- Multi-turn research assistants building on previous queries
- Code assistants with project context memory
- Documentation bots maintaining conversation flow
- API testing tools with session-based request history
- Interactive debugging assistants with state persistence
- Follow the established package structure
- Use the standardized agent router pattern
- Session memory works automatically - no special handling needed
- Add comprehensive logging
- Update documentation for new agents
- Test both sync and streaming endpoints with and without sessions
Built with FastAPI, OpenAI Agents SDK (with SQLiteSession), and uv for modern Python development.