
🎯 AI Recruitr - Smart Resume Matcher

An AI-powered recruitment system that semantically matches resumes to job descriptions, using a FAISS vector database for similarity search and Gemini AI for match explanations.


✨ Features

  • 🎯 Semantic Resume Matching - Goes beyond keyword matching using AI embeddings
  • 🚀 Fast Vector Search - Lightning-fast similarity search with FAISS
  • 🤖 AI-Powered Explanations - Gemini AI generates detailed match explanations
  • 📄 Multi-Format Support - Process PDF, DOCX, and TXT resume files
  • 💡 Clean Architecture - Modular microservices design with FastAPI + Streamlit
  • 📊 Analytics Dashboard - Comprehensive insights and matching analytics
  • ⚡ Real-Time Processing - Instant resume processing and matching
  • 🔍 Advanced Filtering - Filter by skills, experience, location, and more

πŸ—οΈ Architecture

ai_recruitr/ β”œβ”€β”€ backend/ # FastAPI microservices β”‚ β”œβ”€β”€ services/ # Core business logic β”‚ β”œβ”€β”€ api/ # REST API endpoints β”‚ └── models/ # Pydantic schemas β”œβ”€β”€ frontend/ # Streamlit UI β”‚ β”œβ”€β”€ pages/ # UI pages β”‚ └── components/ # Reusable components β”œβ”€β”€ config/ # Configuration β”œβ”€β”€ data/ # Data storage └── utils/ # Utilities 

🛠️ Tech Stack

Component         Technology
Backend           FastAPI + Python 3.9+
Frontend          Streamlit
Embeddings        mxbai-embed-large-v1 (Hugging Face)
Vector DB         FAISS
LLM               Google Gemini
Resume Parsing    PyMuPDF, python-docx
Data Processing   Pandas, NumPy

🚀 Quick Start

Prerequisites

  • Python 3.9 or higher
  • Git
  • Google Gemini API key

1. Clone Repository

git clone https://github.com/yourusername/ai-recruitr.git
cd ai-recruitr

2. Create Virtual Environment

# Create virtual environment
python -m venv venv

# Activate virtual environment
# On Windows:
venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate

3. Install Dependencies

pip install -r requirements.txt

4. Set Up Environment Variables

Create a .env file in the project root:

cp .env.example .env

Edit .env and add your API keys:

# Required: Google Gemini API key
GEMINI_API_KEY=your_gemini_api_key_here

# Optional: customize settings
API_HOST=localhost
API_PORT=8000
STREAMLIT_HOST=localhost
STREAMLIT_PORT=8501
LOG_LEVEL=INFO

5. Get Your Gemini API Key

  1. Go to Google AI Studio (https://aistudio.google.com)
  2. Create a new API key
  3. Copy and paste it into your .env file

6. Run the Application

Option A: Run Both Services (Recommended)

# Terminal 1: Start FastAPI backend
python -m backend.main

# Terminal 2: Start Streamlit frontend
streamlit run frontend/app.py

Option B: Using Scripts (Windows)

# Start backend
start_backend.bat

# Start frontend
start_frontend.bat

Option C: Using Scripts (macOS/Linux)

# Start backend
./start_backend.sh

# Start frontend
./start_frontend.sh

7. Access the Application

  • Frontend (Streamlit): http://localhost:8501
  • Backend API (FastAPI): http://localhost:8000
  • Interactive API docs: http://localhost:8000/docs

📖 Usage Guide

1. Upload Resumes

  1. Navigate to the "📄 Upload Resumes" page
  2. Drag and drop PDF/DOCX resume files
  3. Click "🚀 Process All Files"
  4. Wait for processing to complete

2. Match Job Descriptions

  1. Go to the "🔍 Job Matching" page
  2. Fill in the job description form:
    • Job title
    • Detailed job description
    • Required skills
    • Experience level
  3. Click "🔍 Find Matching Resumes"
  4. Review the matching results

3. Analyze Results

  1. Visit the "📊 Results & Analytics" page
  2. View current matching results
  3. Explore analytics and insights
  4. Export data in JSON/CSV format
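
The JSON/CSV export step can also be reproduced outside the UI with the standard library. A sketch of both formats — the match records and field names here are hypothetical stand-ins, not the app's actual result schema:

```python
import csv
import json

# Hypothetical match records, shaped like the rows the dashboard displays.
matches = [
    {"resume": "alice.pdf", "score": 0.91, "top_skills": "Python;FastAPI"},
    {"resume": "bob.docx", "score": 0.78, "top_skills": "Python;Pandas"},
]

# JSON export: one self-describing file.
with open("matches.json", "w", encoding="utf-8") as f:
    json.dump(matches, f, indent=2)

# CSV export: header row derived from the record keys.
with open("matches.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=list(matches[0].keys()))
    writer.writeheader()
    writer.writerows(matches)
```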

🔧 Configuration

Environment Variables

Variable              Description              Default
GEMINI_API_KEY        Google Gemini API key    (required)
API_HOST              FastAPI host             localhost
API_PORT              FastAPI port             8000
STREAMLIT_HOST        Streamlit host           localhost
STREAMLIT_PORT        Streamlit port           8501
LOG_LEVEL             Logging level            INFO
MAX_FILE_SIZE         Max upload size (bytes)  10485760 (10 MB)
TOP_K_MATCHES         Default max matches      10
SIMILARITY_THRESHOLD  Default threshold        0.7
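
These variables can be read with the standard library alone. A minimal sketch of a loader that falls back to the documented defaults — an illustration, not the project's actual config/settings.py:

```python
import os


def load_settings() -> dict:
    """Read AI Recruitr settings from the environment, using the documented defaults."""
    api_key = os.getenv("GEMINI_API_KEY")
    if not api_key:
        # Mirrors the "GEMINI_API_KEY is required" startup error.
        raise RuntimeError("GEMINI_API_KEY is required")
    return {
        "gemini_api_key": api_key,
        "api_host": os.getenv("API_HOST", "localhost"),
        "api_port": int(os.getenv("API_PORT", "8000")),
        "streamlit_host": os.getenv("STREAMLIT_HOST", "localhost"),
        "streamlit_port": int(os.getenv("STREAMLIT_PORT", "8501")),
        "log_level": os.getenv("LOG_LEVEL", "INFO"),
        "max_file_size": int(os.getenv("MAX_FILE_SIZE", "10485760")),  # 10 MB
        "top_k_matches": int(os.getenv("TOP_K_MATCHES", "10")),
        "similarity_threshold": float(os.getenv("SIMILARITY_THRESHOLD", "0.7")),
    }
```

Numeric values arrive as strings from the environment, so the loader casts them explicitly; a missing GEMINI_API_KEY fails fast rather than at the first Gemini call.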

Model Configuration

The system uses:

  • Embedding Model: mixedbread-ai/mxbai-embed-large-v1
  • LLM: gemini-pro
  • Vector Dimension: 1024
  • Max Sequence Length: 512 tokens
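
Under the hood, matching reduces to nearest-neighbour search over these 1024-dimensional embeddings. A pure-Python sketch of the cosine-similarity ranking that FAISS accelerates, applying the default top_k and similarity threshold (the 3-dimensional vectors are toy stand-ins for real embeddings):

```python
import math


def cosine(a, b):
    """Cosine similarity: dot product over the product of vector magnitudes."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_matches(job_vec, resume_vecs, top_k=10, threshold=0.7):
    """Rank resume vectors by similarity to the job vector, keeping scores >= threshold."""
    scored = [(name, cosine(job_vec, vec)) for name, vec in resume_vecs.items()]
    kept = [(name, score) for name, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)[:top_k]


resumes = {"alice": [0.9, 0.1, 0.0], "bob": [0.1, 0.9, 0.0]}
results = top_matches([1.0, 0.0, 0.0], resumes)  # alice scores high; bob falls below 0.7
```

FAISS performs the same ranking over the whole index in one call; the threshold filter is what SIMILARITY_THRESHOLD controls.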

🧪 API Documentation

Upload Resume

curl -X POST "http://localhost:8000/api/v1/upload-resume" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@resume.pdf"

Match Job Description

curl -X POST "http://localhost:8000/api/v1/match-job" \
  -H "Content-Type: application/json" \
  -d '{
    "job_description": {
      "title": "Senior Python Developer",
      "description": "We are looking for...",
      "skills_required": ["Python", "Django", "PostgreSQL"]
    },
    "top_k": 10,
    "similarity_threshold": 0.7
  }'

Get Resume Count

curl "http://localhost:8000/api/v1/resumes/count"
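
The match request body can also be assembled programmatically. A stdlib sketch that builds the same JSON payload the curl example sends (actually posting it assumes the backend is running on port 8000):

```python
import json

# Same payload as the match-job curl example above.
payload = {
    "job_description": {
        "title": "Senior Python Developer",
        "description": "We are looking for...",
        "skills_required": ["Python", "Django", "PostgreSQL"],
    },
    "top_k": 10,
    "similarity_threshold": 0.7,
}

# `body` is what curl passes with -d; POST it to
# http://localhost:8000/api/v1/match-job with Content-Type: application/json.
body = json.dumps(payload)
```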

πŸ“ Project Structure

ai_recruitr/ β”œβ”€β”€ πŸ“ backend/ β”‚ β”œβ”€β”€ __init__.py β”‚ β”œβ”€β”€ main.py # FastAPI application β”‚ β”œβ”€β”€ πŸ“ api/ β”‚ β”‚ β”œβ”€β”€ __init__.py β”‚ β”‚ └── routes.py # API endpoints β”‚ β”œβ”€β”€ πŸ“ models/ β”‚ β”‚ β”œβ”€β”€ __init__.py β”‚ β”‚ └── schemas.py # Pydantic models β”‚ └── πŸ“ services/ β”‚ β”œβ”€β”€ __init__.py β”‚ β”œβ”€β”€ embedding_service.py # mxbai embeddings β”‚ β”œβ”€β”€ faiss_service.py # Vector database β”‚ β”œβ”€β”€ gemini_service.py # Gemini LLM β”‚ └── resume_parser.py # Resume processing β”œβ”€β”€ πŸ“ frontend/ β”‚ β”œβ”€β”€ __init__.py β”‚ β”œβ”€β”€ app.py # Streamlit main app β”‚ β”œβ”€β”€ πŸ“ pages/ β”‚ β”‚ β”œβ”€β”€ __init__.py β”‚ β”‚ β”œβ”€β”€ upload_resume.py # Upload interface β”‚ β”‚ β”œβ”€β”€ job_matching.py # Matching interface β”‚ β”‚ └── results.py # Analytics dashboard β”‚ └── πŸ“ components/ β”‚ β”œβ”€β”€ __init__.py β”‚ └── ui_components.py # Reusable UI components β”œβ”€β”€ πŸ“ config/ β”‚ β”œβ”€β”€ __init__.py β”‚ └── settings.py # Configuration β”œβ”€β”€ πŸ“ data/ β”‚ β”œβ”€β”€ πŸ“ resumes/ # Uploaded resumes β”‚ β”œβ”€β”€ πŸ“ faiss_index/ # FAISS index files β”‚ └── πŸ“ processed/ # Processed data β”œβ”€β”€ πŸ“ utils/ β”‚ β”œβ”€β”€ __init__.py β”‚ └── helpers.py # Utility functions β”œβ”€β”€ requirements.txt # Python dependencies β”œβ”€β”€ .env.example # Environment template β”œβ”€β”€ .gitignore # Git ignore rules └── README.md # This file 

🚨 Troubleshooting

Common Issues

1. "GEMINI_API_KEY is required" Error

Problem: Missing or invalid Gemini API key.

Solution:

# Check your .env file
cat .env

# Ensure GEMINI_API_KEY is set
echo $GEMINI_API_KEY

2. FAISS Installation Issues

Problem: FAISS installation fails on some systems.

Solution:

# Try installing the CPU version specifically
pip install faiss-cpu==1.7.4

# On macOS with Apple Silicon:
conda install -c pytorch faiss-cpu

3. Resume Text Extraction Fails

Problem: PDF text extraction returns empty content.

Solution:

  • Ensure PDFs are text-based, not scanned images
  • Try converting PDFs to text format first
  • Check file permissions

4. Streamlit Connection Error

Problem: Frontend can't connect to FastAPI backend.

Solution:

# Check if the backend is running
curl http://localhost:8000/health

# Verify the ports in your .env file
grep -E "(API_PORT|STREAMLIT_PORT)" .env

5. Slow Embedding Generation

Problem: Embedding generation takes too long.

Solution:

  • Check whether a GPU is available
  • Reduce the batch size used for processing
  • Consider a smaller embedding model for testing

Debug Mode

Enable debug logging:

# Set in .env
LOG_LEVEL=DEBUG

# Or run with debug logging
python -m backend.main --log-level DEBUG

🔒 Security Considerations

Production Deployment

  • Change default ports
  • Set up proper CORS origins
  • Use environment-specific API keys
  • Enable HTTPS
  • Implement rate limiting
  • Add authentication
  • Secure file uploads
  • Monitor API usage

Data Privacy

  • Implement data retention policies
  • Add resume deletion functionality
  • Encrypt sensitive data
  • Audit API access
  • Comply with GDPR/privacy laws

🚀 Advanced Features

Scaling

  • Database: Replace FAISS with Pinecone/Weaviate for production
  • Caching: Add Redis for embedding caching
  • Queue: Use Celery for async processing
  • Load Balancing: Deploy with multiple API instances

Enhancements

  • Multi-language Support: Add language detection
  • Resume Scoring: Implement comprehensive scoring
  • Bias Detection: Add fairness checking
  • Integration: Connect with LinkedIn, ATS systems
  • Real-time Updates: WebSocket for live updates

🤝 Contributing

  1. Fork the repository
  2. Create feature branch (git checkout -b feature/amazing-feature)
  3. Commit changes (git commit -m 'Add amazing feature')
  4. Push to branch (git push origin feature/amazing-feature)
  5. Open Pull Request

Development Setup

# Install development dependencies
pip install -r requirements-dev.txt

# Run tests
pytest tests/

# Format code
black .
isort .

# Lint code
flake8 .

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ™ Acknowledgments

📞 Support


Made with ❤️ for smarter recruiting

