A standalone demo combining object detection (YOLOv8) and face recognition (the face_recognition library), for testing and experimentation.
- 🎯 Object Detection: Detects 80 object classes (people, animals, everyday objects)
- 🤖 Face Recognition: Identifies specific people by name
- 📹 Live Webcam: Real-time detection and recognition
- 🖼️ Image Testing: Test on static images
- 📊 Detailed Analysis: Scene summaries with confidence scores
- 🎨 Visual Feedback: Color-coded bounding boxes
```bash
chmod +x install_vision.sh
./install_vision.sh
```

Or install the dependencies manually:

```bash
# Update pip
pip install --upgrade pip

# Install dependencies
pip install opencv-python
pip install cmake
pip install dlib
pip install face_recognition
pip install ultralytics
pip install numpy
```

Note: On first run, YOLOv8 will automatically download the model weights (~6 MB).
```bash
python vision_demo.py
```

Select option 1 for webcam or option 2 to test an image.
1. Create a directory for known faces:

   ```bash
   mkdir known_faces
   ```

2. Add photos of people you want to recognize:

   ```bash
   # Photo filenames become the person's name
   cp /path/to/victor_photo.jpg known_faces/Victor.jpg
   cp /path/to/jane_photo.jpg known_faces/Jane.jpg
   ```

3. Run the demo:

   ```bash
   python vision_demo.py
   ```
Tips for face photos:
- Front-facing, well-lit photos work best
- One person per photo
- Support for multiple photos per person (to improve accuracy) is coming soon
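The filename-to-name convention from step 2 can be sketched in plain Python. Note that `names_from_photos` is a hypothetical helper for illustration, not part of the demo's actual API:

```python
from pathlib import Path

def names_from_photos(folder):
    """Return {person_name: photo_path}, using each filename stem as the name."""
    exts = {'.jpg', '.jpeg', '.png'}
    return {p.stem: p for p in Path(folder).iterdir()
            if p.suffix.lower() in exts}
```

For example, `names_from_photos('known_faces')` would map `Victor` to `known_faces/Victor.jpg`, which is then the name shown on screen.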
```bash
python vision_demo.py
# Choose option 1
```

Controls:
- SPACE - Analyze current frame and show detailed results
- s - Save current frame to disk
- q - Quit
What you'll see:
- Live webcam feed with FPS counter
- When you press SPACE:
- Full scene analysis in terminal
- Detection results with bounding boxes
- Natural language description
```bash
python vision_demo.py
# Choose option 2
# Enter path to image
```

Analyzes a static image and displays the results.
Color-coded bounding boxes:
- 🟢 GREEN - Identified person (with name and confidence)
- 🟡 YELLOW - Unidentified person
- 🔵 CYAN - Animals (cat, dog, bird, etc.)
- 🔷 BLUE - Objects (chair, laptop, ball, etc.)
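The color scheme above can be expressed as a small lookup. This is an illustrative sketch, not the demo's exact code; the BGR tuples (OpenCV channel order) are assumptions:

```python
# BGR tuples (OpenCV channel order); exact values are assumptions
GREEN = (0, 255, 0)
YELLOW = (0, 255, 255)
CYAN = (255, 255, 0)
BLUE = (255, 0, 0)

ANIMALS = {'cat', 'dog', 'bird', 'horse', 'cow', 'sheep',
           'bear', 'zebra', 'giraffe', 'elephant'}

def box_color(class_name, identified=False):
    """Pick a bounding-box color following the scheme above."""
    if class_name == 'person':
        return GREEN if identified else YELLOW
    if class_name in ANIMALS:
        return CYAN
    return BLUE
```

The returned tuple can be passed straight to `cv2.rectangle` as its `color` argument.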
Example:
```
============================================================
                      SCENE ANALYSIS
============================================================
I can see: Victor, a cat, a laptop, 2 cups

🟢 Identified People:
  • Victor (face: 94.3%, detection: 0.95)

🔵 Animals:
  • Cat (confidence: 0.89)

🔷 Objects:
  • Laptop (confidence: 0.92)
  • Cup (confidence: 0.87)
  • Cup (confidence: 0.85)
```

- person, cat, dog, bird, horse, cow, sheep, bear, zebra, giraffe, elephant
- chair, couch, table, bed, tv, laptop, mouse, keyboard
- cell phone, book, clock, vase, scissors
- bottle, cup, fork, knife, spoon
- car, bicycle, motorcycle, airplane, bus, train, truck
- traffic light, fire hydrant, stop sign, parking meter, bench
- backpack, umbrella, handbag, tie, suitcase
- frisbee, skis, snowboard, sports ball, kite
- baseball bat, baseball glove, skateboard, surfboard
- tennis racket, wine glass
- ...and more, for 80 classes in total
Typical frame rates on CPU:
- Object detection only: ~20-30 FPS
- Object detection + face recognition: ~5-10 FPS
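To check throughput on your own machine, you can time the pipeline directly. Here `measure_fps` is a hypothetical helper that wraps any per-frame callable; it is not part of the demo:

```python
import time

def measure_fps(process_frame, frames=30):
    """Average frames-per-second over a fixed number of calls."""
    t0 = time.perf_counter()
    for _ in range(frames):
        process_frame()
    return frames / (time.perf_counter() - t0)
```

For example, `measure_fps(lambda: demo.process_frame(frame))` (with a captured test frame) gives a rough FPS figure without the webcam loop's display overhead.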
GPU acceleration:
- YOLOv8 automatically uses CUDA if available
- For face_recognition GPU support, dlib must be compiled with CUDA
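A quick way to verify both points, assuming YOLOv8 runs on PyTorch (which ultralytics uses under the hood); this sketch degrades gracefully when either library is missing:

```python
def gpu_support():
    """Report whether CUDA is usable by YOLOv8 (via PyTorch) and by dlib."""
    info = {}
    try:
        import torch  # ultralytics runs YOLOv8 on top of PyTorch
        info['yolo_cuda'] = torch.cuda.is_available()
    except ImportError:
        info['yolo_cuda'] = None  # PyTorch not installed
    try:
        import dlib  # DLIB_USE_CUDA is True only for a CUDA-enabled build
        info['dlib_cuda'] = bool(getattr(dlib, 'DLIB_USE_CUDA', False))
    except ImportError:
        info['dlib_cuda'] = None  # dlib not installed
    return info
```

If `dlib_cuda` is `False`, face recognition falls back to the CPU even on a CUDA machine, which explains the FPS gap above.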
- Check webcam is connected and not in use by another program
- Try a different camera index: edit `cap = cv2.VideoCapture(1)` in the code
- Ensure face is clearly visible and well-lit
- Try a different photo with better face visibility
- Face should be reasonably large in frame
Ubuntu/Debian:
```bash
sudo apt-get install build-essential cmake
sudo apt-get install libopenblas-dev liblapack-dev
sudo apt-get install libx11-dev libgtk-3-dev
pip install dlib
```

If the YOLOv8 weights fail to download automatically, download them manually from https://github.com/ultralytics/assets/releases and place them in `~/.cache/ultralytics/`.
```
VisionDemo
├── __init__()          # Initialize models
├── add_known_face()    # Add person to database
├── detect_objects()    # Run YOLO detection
├── identify_face()     # Run face recognition
├── process_frame()     # Complete pipeline
├── draw_detections()   # Visualize results
├── describe_scene()    # Natural language output
├── run_webcam()        # Live demo
└── test_image()        # Static image demo
```

```python
# In detect_objects(), filter by confidence
if confidence < 0.5:  # Adjust threshold
    continue
```

```python
# Faster but less accurate
self.yolo = YOLO('yolov8n.pt')  # nano (current)

# More accurate but slower
self.yolo = YOLO('yolov8s.pt')  # small
self.yolo = YOLO('yolov8m.pt')  # medium
self.yolo = YOLO('yolov8l.pt')  # large
self.yolo = YOLO('yolov8x.pt')  # extra large
```

```python
# In identify_face(), change tolerance
matches = face_recognition.compare_faces(
    [known_encoding], face_encodings[0],
    tolerance=0.6  # Lower = stricter (default 0.6)
)
```

```python
# Add to process_frame() to ignore certain objects
if det['class'] in ['chair', 'table']:  # Objects to ignore
    continue
```

Once you're comfortable with how this works, these are the key components to integrate:
- VisionDemo.process_frame() - Main processing pipeline
- VisionDemo.describe_scene() - Natural language generation
- VisionDemo.add_known_face() - Building the face database
The structured scene dictionary can be passed directly into Iris's context.
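As a sketch, that dictionary might look like the following. The field names here are illustrative, mirroring the example output above; the real `process_frame()` schema may differ:

```python
# Hypothetical shape of the structured scene dictionary; the actual
# process_frame() output may use different field names.
scene = {
    'description': 'I can see: Victor, a cat, a laptop, 2 cups',
    'people': [
        {'name': 'Victor', 'face_confidence': 0.943, 'detection_confidence': 0.95},
    ],
    'animals': [{'class': 'cat', 'confidence': 0.89}],
    'objects': [
        {'class': 'laptop', 'confidence': 0.92},
        {'class': 'cup', 'confidence': 0.87},
        {'class': 'cup', 'confidence': 0.85},
    ],
}
```

Because it is plain data, it can be serialized (e.g. with `json.dumps`) or embedded directly into a prompt or context window.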
- Test with your webcam: See what objects are detected
- Add your face: Create known_faces/Victor.jpg
- Experiment with different scenes: Try different objects, lighting
- Understand the output: See how confidence scores work
- Customize: Adjust thresholds and filters for your needs
- Integrate: Once comfortable, add to Iris's vision system
This demo uses:
- YOLOv8: AGPL-3.0
- face_recognition: MIT
- dlib: Boost Software License
For commercial use, review each library's licensing requirements.