Developed by: David Espinosa, September 2025
This repository contains the Natural Language Processing Workshop with:
- Active learning notebook (IR_InvertedMatrix_Workshop.ipynb)
- Automated submission script (submit_assignment.py)
- Validation + scoring (utils/validate_notebook.py)
- Configurable requirements (config/required_items.json)
- Auto-generated gradebook (submissions_log.csv)
- Fork & Clone the Repo
  git clone https://github.com/<your_username>/IR_Inverted_Matrix_Workshop.git
  cd IR_Inverted_Matrix_Workshop
- Install Dependencies
  pip install -r requirements.txt
- Work on the Notebook
  - Open: IR_InvertedMatrix_Workshop.ipynb
  - Fill all Markdown placeholders (no TODO left)
  - Implement required functions (e.g., build_inverted_index, query_processor)
  - Run all code cells
- Submit Your Work
  Run in the last cell:
  !python submit_assignment.py --notebook IR_InvertedMatrix_Workshop.ipynb --student_id "team1"
  ✅ If valid → notebook is committed & pushed to submissions/team1
  ❌ If errors → fix and resubmit
- Checklist Before Submitting
  - No TODOs remain in Markdown
  - All required functions implemented
  - Notebook runs top-to-bottom without errors
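As a starting point for the two required functions named above, here is a minimal sketch. The index layout (term mapped to a sorted list of document positions), the `tokenize` helper, and the AND semantics for multi-term queries are all assumptions made for illustration; the notebook may ask for a different layout or signature, so follow its instructions where they differ.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (hypothetical helper)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Map each term to the sorted list of document positions that contain it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for term in tokenize(text):
            index[term].add(doc_id)
    # Convert to a plain dict of lists so the result is easy to inspect and serialize.
    return {term: sorted(ids) for term, ids in index.items()}


def query_processor(query, index):
    """Return the documents that contain every term of the query (assumed AND semantics)."""
    terms = tokenize(query)
    if not terms:
        return []
    postings = [set(index.get(term, [])) for term in terms]
    return sorted(set.intersection(*postings))


docs = ["this is a doc", "this doc is about nlp"]
index = build_inverted_index(docs)
print(index["doc"])                        # [0, 1]
print(query_processor("doc nlp", index))   # [1]
```

Both functions return the types the checklist implies (a dict and a list), and a query term missing from the index yields an empty list instead of an error.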
- Configure Requirements
  Edit config/required_items.json:
  {
    "required_functions": {
      "build_inverted_index": {
        "test_input": [["this is a doc", "this doc is about nlp"]],
        "expected_type": "dict",
        "points": 5
      },
      "query_processor": {
        "test_input": ["doc", {"doc1": [0], "doc2": [1]}],
        "expected_type": "list",
        "points": 5
      }
    },
    "required_markdown": {
      "Introduction": 2,
      "Reflection": 3
    }
  }
- Scaffold Students
  - Students fork the repo & complete the notebook
  - Dependencies: nbformat, gitpython
- Collect Submissions
  - Fetch all branches: git fetch --all
  - Review gradebook: cat submissions_log.csv
  Example log:
  timestamp,student_id,notebook,score,max_score,status
  2025-09-08T14:22:01,team1,IR_InvertedMatrix_Workshop.ipynb,8,10,✅ Passed
  2025-09-08T14:25:12,team2,IR_InvertedMatrix_Workshop.ipynb,6,10,❌ Failed
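For more than a quick `cat`, the gradebook can be summarized with Python's standard `csv` module. The sketch below uses the column names from the example log and an inline copy of its two rows; it assumes that resubmissions are appended as new rows, which the repository does not state, and `latest_scores` is a hypothetical helper name.

```python
import csv
import io

# Inline copy of the example log so the sketch runs without the real file.
SAMPLE_LOG = """timestamp,student_id,notebook,score,max_score,status
2025-09-08T14:22:01,team1,IR_InvertedMatrix_Workshop.ipynb,8,10,✅ Passed
2025-09-08T14:25:12,team2,IR_InvertedMatrix_Workshop.ipynb,6,10,❌ Failed
"""


def latest_scores(log_file):
    """Return {student_id: row}, keeping only each student's most recent row."""
    latest = {}
    for row in csv.DictReader(log_file):
        previous = latest.get(row["student_id"])
        # Timestamps in this ISO 8601 layout compare correctly as plain strings.
        if previous is None or row["timestamp"] >= previous["timestamp"]:
            latest[row["student_id"]] = row
    return latest


scores = latest_scores(io.StringIO(SAMPLE_LOG))
for student_id, row in sorted(scores.items()):
    pct = 100 * int(row["score"]) / int(row["max_score"])
    print(f"{student_id}: {row['score']}/{row['max_score']} ({pct:.0f}%) {row['status']}")
```

To read the real gradebook, pass an open file instead of the `StringIO` object, for example `open("submissions_log.csv", newline="", encoding="utf-8")`.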
- Automatic validation of Markdown + code answers
- Quick tests for required functions
- Point-based scoring with feedback
- Submissions logged in submissions_log.csv
- Final work pushed to submissions/<team_id> branches
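To make the quick tests and point-based scoring concrete, here is a rough sketch of how a config in the format of config/required_items.json could drive them. It is not the code in utils/validate_notebook.py; `score_functions`, `TYPE_NAMES`, and the two placeholder student functions are hypothetical, and the sketch assumes each `test_input` is the list of positional arguments for the function.

```python
import json

# Inline copy of the "required_functions" part of the example config.
CONFIG_TEXT = """
{
  "required_functions": {
    "build_inverted_index": {
      "test_input": [["this is a doc", "this doc is about nlp"]],
      "expected_type": "dict",
      "points": 5
    },
    "query_processor": {
      "test_input": ["doc", {"doc1": [0], "doc2": [1]}],
      "expected_type": "list",
      "points": 5
    }
  }
}
"""

# Explicit whitelist of type names the config may refer to (an assumption).
TYPE_NAMES = {"dict": dict, "list": list, "str": str, "int": int, "set": set}


def score_functions(config, namespace):
    """Call each required function on its test_input; award points on a type match."""
    earned, feedback = 0, []
    for name, spec in config["required_functions"].items():
        func = namespace.get(name)
        if not callable(func):
            feedback.append(f"{name}: not defined")
            continue
        try:
            result = func(*spec["test_input"])
        except Exception as exc:
            feedback.append(f"{name}: raised {type(exc).__name__}")
            continue
        if isinstance(result, TYPE_NAMES[spec["expected_type"]]):
            earned += spec["points"]
            feedback.append(f"{name}: ok (+{spec['points']})")
        else:
            feedback.append(
                f"{name}: expected {spec['expected_type']}, got {type(result).__name__}"
            )
    return earned, feedback


# Placeholder student functions, for demonstration only.
def build_inverted_index(docs):
    return {i: doc.split() for i, doc in enumerate(docs)}


def query_processor(query, index):
    return "not a list"  # deliberately returns the wrong type


config = json.loads(CONFIG_TEXT)
earned, feedback = score_functions(
    config,
    {"build_inverted_index": build_inverted_index, "query_processor": query_processor},
)
print(earned)  # 5
for line in feedback:
    print(line)
```

Looking type names up in a fixed table, instead of evaluating the string from the config, keeps a mistyped or malicious `expected_type` from running arbitrary code.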
📌 For detailed step-by-step guides, see: