Owlyn

Screenshots: System Architecture I, System Architecture II, Monitoring Dashboard, Assistant Mode, Interview Session, Analysis

Inspiration

Technical hiring is fundamentally constrained by scale.

Organizations often receive hundreds of applications for a single role, making it impossible to conduct meaningful live interviews for every candidate. As a result, many companies rely on static coding tests, asynchronous recordings, or shallow AI interview tools.

These approaches remove the most important element of technical evaluation: context.

They cannot see a candidate’s workspace, follow their reasoning in real time, or react to how a solution evolves while it is being written.

We built Owlyn to solve this problem.

Owlyn is an autonomous multimodal agent ecosystem that conducts live interviews, provides real-time assistance, and generates structured technical evaluation reports.

Instead of static prompts or delayed AI responses, Owlyn can see, hear, and reason about a candidate’s workspace in real time. Using Gemini Live and multimodal AI agents, the system conducts natural voice interviews, observes coding behavior, and generates structured technical insights.

In short, Owlyn covers three use cases: live technical interviews, interview preparation, and real-time assistance.

What it does

Owlyn enables organizations and candidates to interact with AI in a fully multimodal technical environment.

The system analyzes:

voice conversations
screen activity
coding behavior
whiteboard interactions
webcam signals

From these inputs, Owlyn can conduct interviews, provide assistance, monitor sessions, and generate evaluation reports.

Live Technical Interviews

Owlyn conducts real-time technical interviews using Gemini Live with sub-second conversational latency.

Candidates interact with the AI interviewer in a professional workspace that includes:

Monaco code editor
whiteboard canvas
notes panel

As the candidate writes code or explains their reasoning, the system observes their implementation and asks relevant follow-up questions.

The interviewer reacts dynamically to the candidate’s logic and reasoning process rather than relying on static questions.

Practice Mode

Owlyn includes a Practice Mode designed for candidates preparing for real interviews.

Users can configure their own session by selecting:

topic or skill area
difficulty level
session duration
preferred spoken language

The system then generates a technical session that mirrors the real interview environment.

Practice Mode uses the same multimodal infrastructure as enterprise interviews, allowing candidates to interact with the AI interviewer while coding, drawing diagrams, or explaining solutions.

This gives users the opportunity to practice in a realistic interview environment before entering a real session.
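The four settings above map naturally onto a small configuration record. The following is a minimal sketch, not Owlyn's actual schema: the class name, field names, and the difficulty labels are all assumptions made for illustration, and Python is used here only for brevity.

```python
from dataclasses import dataclass

# Hypothetical difficulty labels; the write-up does not list the real options.
DIFFICULTIES = ("easy", "medium", "hard")


@dataclass(frozen=True)
class PracticeSessionConfig:
    """The four settings a user picks before a practice session (names assumed)."""
    topic: str
    difficulty: str
    duration_minutes: int
    spoken_language: str

    def __post_init__(self) -> None:
        # Reject obviously invalid settings before a session is generated.
        if not self.topic.strip():
            raise ValueError("topic must not be empty")
        if self.difficulty not in DIFFICULTIES:
            raise ValueError(f"difficulty must be one of {DIFFICULTIES}")
        if self.duration_minutes <= 0:
            raise ValueError("duration_minutes must be positive")


config = PracticeSessionConfig(
    topic="graph algorithms",
    difficulty="medium",
    duration_minutes=45,
    spoken_language="en",
)
```

Validating at construction time means a malformed request fails before any question generation is attempted.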

Assistant Mode

Owlyn also functions as a persistent multimodal assistant.

In Assistant Mode, the AI runs as a floating widget that stays alongside the user’s IDE and terminal.

The assistant can:

observe the screen workspace
listen to voice prompts
analyze code context
provide debugging insights
explain architecture or implementation decisions

This creates an experience similar to pair programming with an AI collaborator that understands your working environment.

Monitoring Dashboard

Organizations can observe active interviews through a Monitoring Dashboard.

Recruiters or hiring managers can join a session as silent observers, gaining visibility into:

the candidate’s coding workspace
live transcripts
conversation flow
security alerts

When the Sentinel agents detect suspicious activity, the monitoring interface surfaces an alert immediately so the team can intervene.

This gives hiring teams real-time visibility without disrupting the interview process.

Recruitment Dashboard

Owlyn also includes a Recruitment Management Dashboard for managing the hiring pipeline.

Teams can:

track candidates across interview stages
configure interview parameters
review AI-generated technical reports
collaborate internally on candidate evaluations

While Owlyn performs automated evaluation, human recruiters remain responsible for the final hiring decision.

Talent Pool

All completed interviews are stored in a centralized Talent Pool.

This system allows organizations to:

compare candidates side-by-side
filter candidates by skill score or role
review transcripts and code evolution
analyze hiring pipeline performance

Because every candidate is evaluated using the same criteria, the Talent Pool creates a more objective way to identify strong technical talent across large candidate pools.
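Filtering by role and skill score, as described above, could look roughly like the sketch below. The record fields, the 0 to 100 score scale, and the function name are assumptions for illustration; the write-up does not describe how the Talent Pool is actually queried.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateRecord:
    name: str
    role: str
    skill_score: float  # assumed 0-100 scale


def filter_candidates(pool, role=None, min_score=0.0):
    """Return candidates matching the role (if given) with at least min_score, best first."""
    matches = [
        c for c in pool
        if (role is None or c.role == role) and c.skill_score >= min_score
    ]
    return sorted(matches, key=lambda c: c.skill_score, reverse=True)


# Made-up sample data, purely to exercise the function.
pool = [
    CandidateRecord("A", "backend", 82.0),
    CandidateRecord("B", "backend", 67.5),
    CandidateRecord("C", "frontend", 91.0),
]
shortlist = filter_candidates(pool, role="backend", min_score=70.0)
```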

Automated Technical Evaluation

After each session, Owlyn generates a structured technical report based on:

transcripts
coding behavior
reasoning patterns
interaction signals

The report includes structured insights into:

problem solving ability
code quality
communication clarity
algorithmic reasoning

These reports help recruiters quickly evaluate technical ability at scale.
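As an illustration of what "structured" might mean here, the sketch below models a report with one score per dimension listed above. The 1 to 5 scale, the unweighted average, and every field name are assumptions; the actual report format is not described in this write-up.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class TechnicalReport:
    """One score per evaluation dimension (hypothetical 1-5 scale)."""
    candidate_id: str
    problem_solving: int
    code_quality: int
    communication_clarity: int
    algorithmic_reasoning: int
    summary: str = ""

    def overall(self) -> float:
        # Unweighted mean of the four dimensions; a real system might weight them.
        scores = (
            self.problem_solving,
            self.code_quality,
            self.communication_clarity,
            self.algorithmic_reasoning,
        )
        return sum(scores) / len(scores)

    def to_json(self) -> str:
        # Serialize all fields so a dashboard can render the report.
        return json.dumps(asdict(self), indent=2)


report = TechnicalReport("cand-001", 4, 3, 5, 4, summary="Clear reasoning.")
```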

How we built it

Owlyn is built as a distributed multimodal AI system powered by specialized agents.

Client Layer

The desktop application is built using:

Electron
React
Monaco Editor

The client captures:

microphone audio
webcam video
screen workspace
coding activity

These signals reach the backend over WebRTC (for streamed media) and REST APIs.

Real-Time Media Infrastructure

We use LiveKit to handle real-time audio and video streaming between the client and AI agents.

This allows Owlyn to maintain natural voice conversations with Gemini Live while keeping conversational latency below one second.

Backend Orchestration Layer

The backend is built using Spring Boot and acts as the central orchestrator.

It manages:

session lifecycle
candidate data
transcript synchronization
agent communication

All multimodal signals pass through this layer to ensure a unified system state.
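Session lifecycle management of this kind is often modeled as a small state machine. The sketch below is one way to express it; the state names and allowed transitions are assumptions, and it is written in Python for brevity even though the real backend is Spring Boot.

```python
from enum import Enum


class SessionState(Enum):
    CREATED = "created"
    ACTIVE = "active"
    COMPLETED = "completed"
    CANCELLED = "cancelled"


# Hypothetical transition table: which states may follow which.
ALLOWED = {
    SessionState.CREATED: {SessionState.ACTIVE, SessionState.CANCELLED},
    SessionState.ACTIVE: {SessionState.COMPLETED, SessionState.CANCELLED},
    SessionState.COMPLETED: set(),
    SessionState.CANCELLED: set(),
}


class InterviewSession:
    def __init__(self, session_id: str) -> None:
        self.session_id = session_id
        self.state = SessionState.CREATED

    def transition(self, new_state: SessionState) -> None:
        # Refuse moves the table does not allow, e.g. restarting a finished session.
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"cannot go from {self.state.name} to {new_state.name}")
        self.state = new_state


session = InterviewSession("demo-session")
session.transition(SessionState.ACTIVE)
```

Keeping the transitions in one table gives the orchestrator a single place to decide whether an incoming event is valid for the session's current state.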

Multi-Agent Architecture

Owlyn operates through several specialized Gemini-powered agents, built with the GenAI SDK:

Question Generator (Gemini Flash)
Generates technical challenges based on job role and requirements.

Interviewer Agent (Gemini Live)
Conducts the real-time voice interview.

Workspace Sentinel (Gemini Vision)
Observes the coding workspace and analyzes implementation logic.

Integrity Sentinel
Monitors webcam signals for suspicious activity.

Technical Assessor (Gemini Pro)
Analyzes session logs and transcripts to generate structured evaluation reports.

These agents operate independently while sharing context through the orchestration layer.

Infrastructure

Owlyn runs on:

Google Cloud Virtual Machines
Docker
PostgreSQL
Redis
LiveKit
Gemini APIs

Redis stores live session state while PostgreSQL stores persistent interview records and reports.
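The split between live state and persistent records can be sketched with two in-memory stand-ins. Neither class below talks to Redis or PostgreSQL; they only mimic the roles described above, and all names are hypothetical.

```python
class LiveStateStore:
    """In-memory stand-in for the role Redis plays: fast, short-lived session state."""

    def __init__(self) -> None:
        self._data = {}

    def set(self, session_id: str, state: dict) -> None:
        self._data[session_id] = dict(state)

    def get(self, session_id: str):
        return self._data.get(session_id)

    def delete(self, session_id: str) -> None:
        self._data.pop(session_id, None)


class RecordStore:
    """In-memory stand-in for the role PostgreSQL plays: durable interview records."""

    def __init__(self) -> None:
        self.records = {}

    def save(self, session_id: str, record: dict) -> None:
        self.records[session_id] = dict(record)


def finalize_session(session_id: str, live: LiveStateStore, records: RecordStore) -> None:
    """Copy the final live state into the persistent store, then clear the live entry."""
    state = live.get(session_id)
    if state is None:
        raise KeyError(f"no live state for session {session_id}")
    records.save(session_id, state)
    live.delete(session_id)


live, records = LiveStateStore(), RecordStore()
live.set("s1", {"question_index": 2, "transcript_lines": 40})
finalize_session("s1", live, records)
```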

Challenges we ran into

Real-Time Latency

Traditional LLM systems often introduce multi-second delays.

Using Gemini Live with WebRTC streaming allowed us to maintain the sub-second response times that natural conversation requires.

Multimodal Synchronization

Owlyn processes several signals simultaneously:

audio
video
workspace activity
code evolution

Coordinating these streams across multiple AI agents while keeping latency low required careful system orchestration.
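One common way to line up several timestamped streams is to merge them into a single ordered timeline. The sketch below does this with the standard library's `heapq.merge`; the event shape and the sample events are invented for illustration, and the write-up does not say how Owlyn actually synchronizes its streams.

```python
import heapq
from typing import NamedTuple


class Event(NamedTuple):
    timestamp_ms: int
    source: str
    payload: str


def merge_streams(*streams):
    """Merge per-source event lists (each already sorted by time) into one timeline."""
    return list(heapq.merge(*streams, key=lambda e: e.timestamp_ms))


# Made-up sample events from three of the signal types listed above.
audio = [Event(100, "audio", "starts explaining"), Event(900, "audio", "mentions hash map")]
code = [Event(450, "code", "adds dict lookup")]
video = [Event(0, "video", "frame 0"), Event(1000, "video", "frame 1")]

timeline = merge_streams(audio, code, video)
```

Because each input is already in time order, the merge is lazy and linear in the total number of events, which suits streams that keep growing during a session.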

Bandwidth Optimization

Continuous video analysis would consume significant bandwidth.

We reduced bandwidth by streaming one frame per second, which provides enough visual context for Gemini Vision while keeping the app usable on typical internet connections.
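Limiting the stream to one frame per second amounts to dropping any frame that arrives less than a second after the last one sent. A minimal sketch follows; the class name and the 30 fps capture rate in the example are assumptions, since the write-up does not state the capture rate.

```python
class FrameThrottle:
    """Let a frame through only if enough time has passed since the last one sent."""

    def __init__(self, min_interval_ms: int = 1000) -> None:
        self.min_interval_ms = min_interval_ms
        self._last_sent_ms = None

    def should_send(self, timestamp_ms: int) -> bool:
        if (
            self._last_sent_ms is None
            or timestamp_ms - self._last_sent_ms >= self.min_interval_ms
        ):
            self._last_sent_ms = timestamp_ms
            return True
        return False


throttle = FrameThrottle(min_interval_ms=1000)

# Three seconds of capture at an assumed 30 fps, as integer millisecond timestamps.
capture_times_ms = [round(i * 1000 / 30) for i in range(90)]
sent_ms = [t for t in capture_times_ms if throttle.should_send(t)]

# Ratio of captured frames to frames actually sent.
reduction = len(capture_times_ms) / len(sent_ms)
```

Under that 30 fps assumption, the throttle forwards one frame in thirty, so the number of frames uploaded drops by the same factor.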

Multi-Agent Coordination

Gemini agents cannot directly communicate with each other.

To solve this, we built a central orchestration layer that routes context between agents and synchronizes their reasoning.
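A central layer that routes context between agents can be approximated with a publish/subscribe hub. The sketch below is an assumption about the general shape, not the team's implementation: the topic name, the handler signature, and the sample message are all hypothetical.

```python
from collections import defaultdict


class Orchestrator:
    """Routes messages between agents that never talk to each other directly."""

    def __init__(self) -> None:
        self._subscribers = defaultdict(list)

    def subscribe(self, topic: str, handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: str) -> int:
        """Deliver the message to every handler on the topic; return how many received it."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)


orchestrator = Orchestrator()

# The interviewer agent's context is modeled here as a plain list of observations.
interviewer_context = []
orchestrator.subscribe("workspace.observation", interviewer_context.append)

# The workspace sentinel reports something it saw; the orchestrator forwards it.
delivered = orchestrator.publish(
    "workspace.observation",
    "candidate switched from recursion to iteration",
)
```

With this shape, adding another consumer of workspace observations means adding a subscriber, without changing the agent that produces them.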

Accomplishments that we're proud of

Building a real-time multimodal interview system
Designing a multi-agent AI architecture powered by Gemini
Creating a practice environment that mirrors real interviews
Developing a persistent multimodal assistant
Building a monitoring system for live interview visibility
Implementing a Talent Pool for large-scale candidate evaluation

Most importantly, we built a system that feels less like a chatbot and more like an intelligent technical collaborator.

What we learned

Real-time AI systems call for different design decisions than traditional LLM applications.

Latency becomes the most important factor, especially when voice conversations and coding activities occur simultaneously.

We also found that, in our system, specialized agents handled multimodal reasoning better than a single monolithic model.

What's next for Owlyn

We plan to expand Owlyn with:

Hybrid AI + Human Interviews
Recruiters will be able to jump into an interview and seamlessly take over from the AI interviewer.

ATS Integrations
Direct integrations with hiring tools like Greenhouse and Lever.

Collaborative Whiteboarding
Candidates and interviewers will be able to work on the same whiteboard in real time.

Updates

Rahman Nugar started this project — Mar 16, 2026 06:13 PM EDT
