An AI-powered recruitment system that uses semantic matching, a FAISS vector index, and Gemini AI to find best-fit candidates.
- **Semantic Resume Matching** - goes beyond keyword matching using AI embeddings
- **Fast Vector Search** - lightning-fast similarity search with FAISS
- **AI-Powered Explanations** - Gemini AI generates detailed match explanations
- **Multi-Format Support** - process PDF, DOCX, and TXT resume files
- **Clean Architecture** - modular microservices design with FastAPI + Streamlit
- **Analytics Dashboard** - comprehensive insights and matching analytics
- **Real-Time Processing** - instant resume processing and matching
- **Advanced Filtering** - filter by skills, experience, location, and more
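The semantic matching above boils down to comparing embedding vectors rather than keywords. A minimal NumPy sketch of the idea, with random vectors standing in for real mxbai embeddings:

```python
import numpy as np

# Random stand-ins for real 1024-dim mxbai embeddings (illustrative only).
rng = np.random.default_rng(42)
job_vec = rng.normal(size=1024)          # embedding of the job description
resume_vecs = rng.normal(size=(3, 1024)) # embeddings of three resumes

def cosine_sim(query, batch):
    """Cosine similarity between one vector and a batch of vectors."""
    query = query / np.linalg.norm(query)
    batch = batch / np.linalg.norm(batch, axis=1, keepdims=True)
    return batch @ query

scores = cosine_sim(job_vec, resume_vecs)
ranking = np.argsort(scores)[::-1]  # indices of resumes, best match first
```

FAISS performs the same similarity search, but over millions of vectors with an optimized index instead of a brute-force matrix product.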
```
ai_recruitr/
├── backend/         # FastAPI microservices
│   ├── services/    # Core business logic
│   ├── api/         # REST API endpoints
│   └── models/      # Pydantic schemas
├── frontend/        # Streamlit UI
│   ├── pages/       # UI pages
│   └── components/  # Reusable components
├── config/          # Configuration
├── data/            # Data storage
└── utils/           # Utilities
```

| Component | Technology |
|---|---|
| Backend | FastAPI + Python 3.9+ |
| Frontend | Streamlit |
| Embeddings | mxbai-embed-large-v1 (Hugging Face) |
| Vector DB | FAISS |
| LLM | Google Gemini |
| Resume Parsing | PyMuPDF, python-docx |
| Data Processing | Pandas, NumPy |
- Python 3.9 or higher
- Git
- Google Gemini API key
```bash
git clone https://github.com/yourusername/ai-recruitr.git
cd ai-recruitr
```

Create and activate a virtual environment:

```bash
python -m venv venv

# On Windows:
venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate
```

Install dependencies:

```bash
pip install -r requirements.txt
```

Create a `.env` file in the project root:

```bash
cp .env.example .env
```

Edit `.env` and add your API keys:

```env
# Required: Google Gemini API Key
GEMINI_API_KEY=your_gemini_api_key_here

# Optional: Customize settings
API_HOST=localhost
API_PORT=8000
STREAMLIT_HOST=localhost
STREAMLIT_PORT=8501
LOG_LEVEL=INFO
```

To get a Gemini API key:

- Go to Google AI Studio
- Create a new API key
- Copy and paste it into your `.env` file
```bash
# Terminal 1: Start FastAPI backend
python -m backend.main

# Terminal 2: Start Streamlit frontend
streamlit run frontend/app.py
```

On Windows, use the helper scripts:

```bat
start_backend.bat
start_frontend.bat
```

On macOS/Linux:

```bash
./start_backend.sh
./start_frontend.sh
```

- Streamlit UI: http://localhost:8501
- FastAPI Docs: http://localhost:8000/docs
- API Health: http://localhost:8000/health
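Beyond the browser UI, the backend can be driven programmatically. A minimal stdlib client sketch (assumes the backend is running on the default port; the payload shape mirrors the `match-job` example in the API section):

```python
import json
from urllib import request

# Hypothetical payload for the /api/v1/match-job endpoint.
payload = {
    "job_description": {
        "title": "Senior Python Developer",
        "description": "We are looking for...",
        "skills_required": ["Python", "Django", "PostgreSQL"],
    },
    "top_k": 10,
    "similarity_threshold": 0.7,
}
body = json.dumps(payload).encode()

req = request.Request(
    "http://localhost:8000/api/v1/match-job",
    data=body,
    headers={"Content-Type": "application/json"},
)

# Uncomment once the backend is running:
# with request.urlopen(req) as resp:
#     matches = json.load(resp)
```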
- Navigate to the "Upload Resumes" page
- Drag and drop PDF/DOCX resume files
- Click "Process All Files"
- Wait for processing to complete
- Go to the "Job Matching" page
- Fill in the job description form:
- Job title
- Detailed job description
- Required skills
- Experience level
- Click "Find Matching Resumes"
- Review the matching results
- Visit the "Results & Analytics" page
- View current matching results
- Explore analytics and insights
- Export data in JSON/CSV format
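The JSON/CSV export in the last step can be reproduced with the standard library alone. A sketch with hypothetical match results (field names are illustrative, not the app's exact schema):

```python
import csv
import json

# Hypothetical match results as shown on the analytics page.
matches = [
    {"resume": "alice.pdf", "score": 0.91},
    {"resume": "bob.docx", "score": 0.78},
]

# JSON export
with open("matches.json", "w") as f:
    json.dump(matches, f, indent=2)

# CSV export
with open("matches.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["resume", "score"])
    writer.writeheader()
    writer.writerows(matches)
```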
| Variable | Description | Default |
|---|---|---|
| `GEMINI_API_KEY` | Google Gemini API key | Required |
| `API_HOST` | FastAPI host | `localhost` |
| `API_PORT` | FastAPI port | `8000` |
| `STREAMLIT_HOST` | Streamlit host | `localhost` |
| `STREAMLIT_PORT` | Streamlit port | `8501` |
| `LOG_LEVEL` | Logging level | `INFO` |
| `MAX_FILE_SIZE` | Max upload size (bytes) | `10485760` (10 MB) |
| `TOP_K_MATCHES` | Default max matches | `10` |
| `SIMILARITY_THRESHOLD` | Default similarity threshold | `0.7` |
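These variables are typically read once at startup with the table's defaults as fallbacks. A minimal sketch of that pattern (the actual settings live in `config/settings.py`; this is illustrative, not the project's exact code):

```python
import os

# Read environment variables, falling back to the documented defaults.
API_PORT = int(os.getenv("API_PORT", "8000"))
MAX_FILE_SIZE = int(os.getenv("MAX_FILE_SIZE", str(10 * 1024 * 1024)))  # 10 MB
TOP_K_MATCHES = int(os.getenv("TOP_K_MATCHES", "10"))
SIMILARITY_THRESHOLD = float(os.getenv("SIMILARITY_THRESHOLD", "0.7"))
```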
The system uses:

- Embedding model: `mixedbread-ai/mxbai-embed-large-v1`
- LLM: `gemini-pro`
- Vector dimension: 1024
- Max sequence length: 512 tokens
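The `TOP_K_MATCHES` and `SIMILARITY_THRESHOLD` settings combine as a filter-then-rank step over the similarity scores. A small sketch with made-up scores for six resumes:

```python
import numpy as np

# Hypothetical similarity scores for six resumes against one job posting.
scores = np.array([0.92, 0.65, 0.81, 0.73, 0.40, 0.88])

TOP_K = 3
THRESHOLD = 0.7

# Rank best-first, drop anything below the threshold, keep at most TOP_K.
order = np.argsort(scores)[::-1]
selected = [int(i) for i in order if scores[i] >= THRESHOLD][:TOP_K]
# Resume 3 (0.73) clears the threshold but is cut by the top-k limit.
```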
Upload a resume:

```bash
curl -X POST "http://localhost:8000/api/v1/upload-resume" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@resume.pdf"
```

Match a job description:

```bash
curl -X POST "http://localhost:8000/api/v1/match-job" \
  -H "Content-Type: application/json" \
  -d '{
    "job_description": {
      "title": "Senior Python Developer",
      "description": "We are looking for...",
      "skills_required": ["Python", "Django", "PostgreSQL"]
    },
    "top_k": 10,
    "similarity_threshold": 0.7
  }'
```

Count stored resumes:

```bash
curl "http://localhost:8000/api/v1/resumes/count"
```

```
ai_recruitr/
├── backend/
│   ├── __init__.py
│   ├── main.py                  # FastAPI application
│   ├── api/
│   │   ├── __init__.py
│   │   └── routes.py            # API endpoints
│   ├── models/
│   │   ├── __init__.py
│   │   └── schemas.py           # Pydantic models
│   └── services/
│       ├── __init__.py
│       ├── embedding_service.py # mxbai embeddings
│       ├── faiss_service.py     # Vector database
│       ├── gemini_service.py    # Gemini LLM
│       └── resume_parser.py     # Resume processing
├── frontend/
│   ├── __init__.py
│   ├── app.py                   # Streamlit main app
│   ├── pages/
│   │   ├── __init__.py
│   │   ├── upload_resume.py     # Upload interface
│   │   ├── job_matching.py      # Matching interface
│   │   └── results.py           # Analytics dashboard
│   └── components/
│       ├── __init__.py
│       └── ui_components.py     # Reusable UI components
├── config/
│   ├── __init__.py
│   └── settings.py              # Configuration
├── data/
│   ├── resumes/                 # Uploaded resumes
│   ├── faiss_index/             # FAISS index files
│   └── processed/               # Processed data
├── utils/
│   ├── __init__.py
│   └── helpers.py               # Utility functions
├── requirements.txt             # Python dependencies
├── .env.example                 # Environment template
├── .gitignore                   # Git ignore rules
└── README.md                    # This file
```

Problem: Missing or invalid Gemini API key.
Solution:

```bash
# Check your .env file
cat .env

# Ensure GEMINI_API_KEY is set
echo $GEMINI_API_KEY
```

Problem: FAISS installation fails on some systems.
Solution:

```bash
# Try installing the CPU version explicitly
pip install faiss-cpu==1.7.4

# On macOS with Apple Silicon:
conda install -c pytorch faiss-cpu
```

Problem: PDF text extraction returns empty content.
Solution:
- Ensure PDFs are text-based, not scanned images
- Try converting PDFs to text format first
- Check file permissions
Problem: Frontend can't connect to FastAPI backend.
Solution:
```bash
# Check if the backend is running
curl http://localhost:8000/health

# Verify ports in the .env file
grep -E "(API_PORT|STREAMLIT_PORT)" .env
```

Problem: Embedding generation takes too long.
Solution:
- Check whether a GPU is available
- Reduce the batch size during processing
- Consider a smaller embedding model for testing
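Batching keeps memory bounded and lets the embedding model amortize per-call overhead. A sketch of the batching pattern with a stand-in embedder (`embed_fn` is hypothetical; in the real service it would wrap the mxbai model):

```python
import numpy as np

def embed_in_batches(texts, embed_fn, batch_size=16):
    """Embed texts in fixed-size batches to bound memory use."""
    chunks = []
    for i in range(0, len(texts), batch_size):
        chunks.append(embed_fn(texts[i:i + batch_size]))
    return np.vstack(chunks)

# Stand-in embedder returning fixed 1024-dim vectors (illustrative only).
fake_embed = lambda batch: np.ones((len(batch), 1024))

vecs = embed_in_batches([f"resume {i}" for i in range(40)], fake_embed, batch_size=16)
```

Tuning `batch_size` down trades throughput for a smaller memory footprint, which helps on CPU-only machines.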
Enable debug logging:
```bash
# Set in .env
LOG_LEVEL=DEBUG

# Or run with debug logging
python -m backend.main --log-level DEBUG
```

- Change default ports
- Set up proper CORS origins
- Use environment-specific API keys
- Enable HTTPS
- Implement rate limiting
- Add authentication
- Secure file uploads
- Monitor API usage
- Implement data retention policies
- Add resume deletion functionality
- Encrypt sensitive data
- Audit API access
- Comply with GDPR/privacy laws
- Database: Replace FAISS with Pinecone/Weaviate for production
- Caching: Add Redis for embedding caching
- Queue: Use Celery for async processing
- Load Balancing: Deploy with multiple API instances
- Multi-language Support: Add language detection
- Resume Scoring: Implement comprehensive scoring
- Bias Detection: Add fairness checking
- Integration: Connect with LinkedIn, ATS systems
- Real-time Updates: WebSocket for live updates
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
```bash
# Install development dependencies
pip install -r requirements-dev.txt

# Run tests
pytest tests/

# Format code
black .
isort .

# Lint code
flake8 .
```

This project is licensed under the MIT License - see the LICENSE file for details.
- Hugging Face for mxbai embeddings
- Google for Gemini LLM
- Facebook Research for FAISS
- FastAPI team
- Streamlit team
- Email: support@ai-recruitr.com
- Discord: AI Recruitr Community
- Issues: GitHub Issues
- Documentation: Full Docs

Made with ❤️ for smarter recruiting