An AI-powered medical receptionist system that handles patient verification calls through both text and voice interactions. Built with Flask, ElevenLabs TTS, and Groq AI.
- Voice-to-Voice Interaction: Full conversational AI with speech recognition and text-to-speech
- Patient Verification: Automated verification against patient database using name, phone, and date of birth
- Multiple Voice Options: 8 different AI voices (male/female, various tones)
- Session Management: Maintains conversation context across interactions
- Real-time Audio: Instant speech synthesis and playback
- RESTful API: Clean endpoints for integration with other systems
- Backend: Flask (Python)
- AI Model: Groq LLaMA 3 8B
- Text-to-Speech: ElevenLabs API
- Speech Recognition: Google Speech Recognition
- Frontend: HTML/JavaScript (template-based)
- Aria: Female, Professional
- Rachel: Female, Calm
- Adam: Male, Professional
- Josh: Male, Friendly
- Arnold: Male, Deep
- Bella: Female, Warm
- Elli: Female, Young
- James: Male, Mature
- Python 3.8+
- Microphone access for voice input
- Internet connection for API calls
- Clone the repository:
    git clone <repository-url>
    cd medical-receptionist-ai
- Install dependencies:
    pip install flask flask-cors requests python-dotenv elevenlabs speechrecognition pyaudio
- Create the environment file:
    touch .env
- Add API keys to .env:
    GROQ_API_KEY=your_groq_api_key_here
    ELEVENLABS_API_KEY=your_elevenlabs_api_key_here
- Create the patient data file data.json in the project root:
    {
      "patients": [
        {
          "name": "John Smith",
          "phone": "555-0123",
          "date_of_birth": "1985-06-15",
          "appointment_date": "2024-02-15",
          "appointment_time": "10:30 AM"
        }
      ]
    }
- Run the application:
    python app.py
- POST /chat - Text-based conversation
- POST /voice_chat - Voice-to-voice interaction
- POST /speak - Generate speech from text
- GET /get_audio - Retrieve generated audio file
- GET / - Web interface
- GET /voices - List available voices
- GET /status - System health check
- POST /reset - Clear conversation history
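
As a rough illustration of calling the text endpoint from another program, the sketch below builds and sends a `POST /chat` request using only the standard library. The field names (`message`, `session_id`, `voice_id`) follow the example request in this README; the helper names, the `BASE_URL` value, and treating `voice_id` as optional are assumptions made for the example, not part of the project.

```python
import json
import urllib.request

# Assumption: the app is running locally on Flask's default port.
BASE_URL = "http://localhost:5000"


def build_chat_request(message, session_id, voice_id=None):
    """Build (but do not send) a POST /chat request with a JSON body."""
    payload = {"message": message, "session_id": session_id}
    if voice_id is not None:
        # Assumption: voice_id may be omitted; the server may require it.
        payload["voice_id"] = voice_id
    return urllib.request.Request(
        url=f"{BASE_URL}/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat(message, session_id, voice_id=None, timeout=30):
    """Send the request and return the decoded JSON response."""
    request = build_chat_request(message, session_id, voice_id)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `send_chat("Hi, I'd like to verify my appointment", "user123")` should return a dictionary shaped like the sample response shown in this README.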
Text Chat:
POST /chat
{
  "message": "Hi, I'd like to verify my appointment",
  "session_id": "user123",
  "voice_id": "pFZP5JQG7iQjIQuC4Bku"
}
Response:
{
  "response": "Hello! I can help verify your appointment. Could you please tell me your full name?",
  "session_id": "user123",
  "message_count": 2,
  "speech_enabled": true,
  "audio_available": true
}

Voice Chat:
POST /voice_chat
{
  "session_id": "user123",
  "voice_id": "21m00Tcm4TlvDq8ikWAM"
}
Response:
{
  "user_input": "My name is John Smith",
  "response": "Thank you, John. Now could you please provide your phone number?",
  "session_id": "user123",
  "message_count": 4,
  "speech_enabled": true,
  "audio_available": true
}

- Open browser to http://localhost:5000
- Choose between text or voice interaction
- Select preferred AI voice
- Follow the verification prompts
- System greets caller
- Requests patient name
- Requests phone number
- Requests date of birth
- Verifies against database
- Confirms appointment or indicates no match
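
The matching step in the flow above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names are hypothetical, and the normalization rules (digits-only phone comparison, case-insensitive names) are assumptions based on the "flexible formatting" note for phone numbers.

```python
import re


def normalize_phone(phone):
    """Keep digits only so '555-0123' and '(555) 0123' compare equal."""
    return re.sub(r"\D", "", phone)


def normalize_name(name):
    """Lowercase and collapse whitespace for a forgiving name comparison."""
    return " ".join(name.lower().split())


def find_patient(patients, name, phone, date_of_birth):
    """Return the first record matching all three fields, or None."""
    for patient in patients:
        if (
            normalize_name(patient["name"]) == normalize_name(name)
            and normalize_phone(patient["phone"]) == normalize_phone(phone)
            and patient["date_of_birth"] == date_of_birth
        ):
            return patient
    return None
```

Requiring all three fields to match keeps a caller who knows only a name from confirming someone else's appointment; a `None` result corresponds to the "no match" outcome.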
- GROQ_API_KEY: Required for AI responses
- ELEVENLABS_API_KEY: Required for text-to-speech
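
One way to catch a missing key early is a small startup check like the sketch below. The helper name `missing_api_keys` is hypothetical; the two variable names are the ones listed above. It assumes the `.env` file has already been loaded into the process environment (for example with `load_dotenv()` from python-dotenv, which is in the dependency list).

```python
import os

REQUIRED_KEYS = ("GROQ_API_KEY", "ELEVENLABS_API_KEY")


def missing_api_keys(environ=os.environ):
    """Return the names of required keys that are unset or blank."""
    return [key for key in REQUIRED_KEYS if not environ.get(key, "").strip()]


if __name__ == "__main__":
    missing = missing_api_keys()
    if missing:
        # Fail fast with a readable message instead of a later API error.
        raise SystemExit(f"Missing environment variables: {', '.join(missing)}")
```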
The system expects a JSON file with patient records containing:
- name: Full patient name
- phone: Phone number (flexible formatting)
- date_of_birth: YYYY-MM-DD format
- appointment_date: Appointment date
- appointment_time: Appointment time
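
A loader along these lines could check the record format described above before the app starts taking calls. This is a sketch under assumptions: the function names are hypothetical, the top-level `patients` key follows the sample `data.json` in this README, and only `date_of_birth` is format-checked because it is the only field with a stated format.

```python
import json
from datetime import datetime

REQUIRED_FIELDS = ("name", "phone", "date_of_birth", "appointment_date", "appointment_time")


def validate_patient(record):
    """Return a list of problems found in one patient record (empty if none)."""
    problems = [f"missing field: {field}" for field in REQUIRED_FIELDS if field not in record]
    dob = record.get("date_of_birth")
    if dob is not None:
        try:
            datetime.strptime(dob, "%Y-%m-%d")
        except (TypeError, ValueError):
            problems.append(f"date_of_birth is not YYYY-MM-DD: {dob!r}")
    return problems


def load_patients(path="data.json"):
    """Load the file and return its 'patients' list; raise ValueError on a bad record."""
    with open(path, encoding="utf-8") as handle:
        patients = json.load(handle)["patients"]
    for index, record in enumerate(patients):
        problems = validate_patient(record)
        if problems:
            raise ValueError(f"patient #{index}: {'; '.join(problems)}")
    return patients
```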
The system handles the following error conditions:
- Missing API keys
- Network connectivity issues
- Speech recognition failures
- Audio generation problems
- Invalid patient data
- Session management errors
- API keys are loaded from environment variables
- No sensitive patient data is logged
- Session data is stored in memory (not persistent)
- CORS is enabled for web interface integration
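
To make the in-memory session note concrete, here is a minimal sketch of what such a store might look like. The class and method names are hypothetical and the project's real implementation may differ; the point is that history lives in a process-local dictionary, so it is lost on restart and is not shared between worker processes.

```python
from collections import defaultdict


class SessionStore:
    """Keep per-session message history in process memory only."""

    def __init__(self):
        self._history = defaultdict(list)

    def append(self, session_id, role, content):
        """Record one message and return the session's message count."""
        self._history[session_id].append({"role": role, "content": content})
        return len(self._history[session_id])

    def history(self, session_id):
        """Return a copy of the messages for a session (empty if unknown)."""
        return list(self._history.get(session_id, []))

    def reset(self, session_id):
        """Drop a session's history, as a /reset handler might."""
        self._history.pop(session_id, None)
```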
- "No module named 'pyaudio'"
    # On Windows
    pip install pyaudio
    # On macOS
    brew install portaudio
    pip install pyaudio
    # On Linux
    sudo apt-get install python3-pyaudio
- Microphone not working
  - Check microphone permissions
  - Verify the microphone is not in use by other applications
  - Test with the speech_recognition library directly
- API key errors
  - Verify keys are correctly set in the .env file
  - Check API key permissions and quotas
  - Ensure the environment file is in the correct location
- Audio playback issues
  - Check system audio settings
  - Verify browser audio permissions
  - Test with different audio formats
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
For issues and questions:
- Check the troubleshooting section
- Review API documentation for Groq and ElevenLabs
- Create an issue in the repository