Akshay Bhat, Cornell University. Website & Contact
Deep Video Analytics provides a platform for indexing and extracting information from videos and images. Deep learning detection and recognition algorithms are used to index individual frames and images, along with the objects detected in them. The goal of Deep Video Analytics is to become a quickly customizable platform for developing visual & video analytics applications, while benefiting from seamless integration with state-of-the-art models released by the vision research community.
Advertisement: If you are interested in Healthcare & Machine Learning please take a look at Computational Healthcare
```bash
git clone https://github.com/AKSHAYUBHAT/DeepVideoAnalytics && cd DeepVideoAnalytics/docker && docker-compose up
```
Replace `docker-compose` with `nvidia-docker-compose` for GPU machines: the Dockerfile uses the TensorFlow GPU base image and the appropriate version of PyTorch, and the Makefile for Darknet is modified accordingly. This code was tested using an older NVIDIA Titan GPU and nvidia-docker.
```bash
git clone https://github.com/AKSHAYUBHAT/DeepVideoAnalytics
cd DeepVideoAnalytics/docker_GPU
pip install --upgrade nvidia-docker-compose
nvidia-docker-compose up
```
Alternatively, start a P2 instance with ami-b3cc1fa5 (N. Virginia), with ports 8000, 6006, and 8888 open (preferably only to your IP), and run the following command after SSH'ing into the machine.
```bash
cd deepvideoanalytics && git pull && cd docker_GPU && ./rebuild.sh && nvidia-docker-compose up
```
You can optionally append "-d" to detach it, but for the very first run it is useful to read how each container is started. After approximately 3 to 5 minutes the user interface will appear on port 8000 of the instance IP.
Security warning! The current GPU container uses an nginx <-> uwsgi <-> django setup to ensure smooth playback of videos. However, it runs nginx as root (though within the container). Since you can modify AWS security rules on-the-fly, I highly recommend only allowing inbound traffic from your own IP address.
The process used for AMI creation is documented here.
## Alpha version To Do list
Deep Video Analytics is currently under active development.
- Django App
- Tasks using Celery & RabbitMQ
- Postgres database
- Deployment using docker-compose
- Minimal user interface for uploading and browsing uploaded videos/images
- Task for frame extraction from videos
- Simple detection models using Darknet YOLO
- Working visual search & indexer tasks using PyTorch
- Simple set of tests (E.g. upload a video, perform processing, indexing, detection)
- Deployment using nvidia-docker-compose for machines with GPU
- Continuous integration test suite
- Improved user interface for browsing past queries
- Improve TEvent model to track state of tasks
- Improved frame extraction using PySceneDetect (every 100th frame plus frames selected by content change)
- Integrate Tensorflow 1.0
- Improved models by adding information about the user uploading the video/dataset
- Automated docker based testing
- Implement a method to backup postgres db & media folder to S3 via a single command
- Integrate youtube-dl for downloading videos
- Test Deployment on AWS P2 machines running nvidia-docker
- Implemented nginx <-> uwsgi <-> django on GPU container for optimized serving of videos and static assets.
- Index detected objects / create a separate query indexer using NMSLIB or Annoy
- Alexnet indexing using Pytorch
- YOLO 9000 (naive implementation, gets reloaded in memory for every video)
- Google inception using Tensorflow
- Pytorch Squeezenet
- Facenet or Openface (via a connected container)
- Soundnet
- Mapnet (requires converting models from Marvin)
- Keras-js which uses Keras inception for client side indexing
- Metadata stored in Postgres.
- Operations (Querying, Frame extraction & Indexing) performed using celery tasks and RabbitMQ.
- Separate queues and workers to allow selection of machines with GPU & RAM for specific tasks such as indexing / computing features.
- Videos, frames, indexes, numpy vectors stored in media directory.
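The queue separation described above can be sketched with Celery task routing. This is a minimal illustration only; the task and queue names below are hypothetical, not the ones actually used by Deep Video Analytics:

```python
# Sketch of per-queue Celery routing (hypothetical task/queue names).
# GPU-bound tasks go to queues served only by workers on GPU machines.
TASK_ROUTES = {
    'dvaapp.tasks.extract_frames': {'queue': 'q_extractor'},  # CPU / RAM heavy
    'dvaapp.tasks.index_frames': {'queue': 'q_indexer'},      # GPU
    'dvaapp.tasks.detect_objects': {'queue': 'q_detector'},   # GPU
}

def queue_for(task_name, routes=TASK_ROUTES):
    """Return the queue a task is routed to; Celery's default queue is 'celery'."""
    return routes.get(task_name, {}).get('queue', 'celery')
```

A worker pinned to a GPU machine would then be started with `celery worker -Q q_indexer,q_detector`, while CPU-only machines consume the remaining queues.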
You can use the jupyter notebook explore.ipynb to manually run tasks & code against the databases.
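For example, querying a stored index from the notebook amounts to a brute-force nearest-neighbor search over the saved feature matrix. A minimal sketch, assuming a feature matrix loaded from a file such as `alexnet.npy` (the function below is illustrative, not code from the project):

```python
import numpy as np

def nearest_frames(index, query_vector, k=5):
    """Return row indices of the k index entries closest to the query.

    index: (n_frames, dim) float array, e.g. np.load('.../indexes/alexnet.npy')
    query_vector: (dim,) feature vector computed for the query image
    """
    distances = np.linalg.norm(index - query_vector, axis=1)  # Euclidean distance
    return np.argsort(distances)[:k].tolist()
```

The returned row numbers can then be mapped back to frames via the corresponding `.framelist` file.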
- One directory per video or dataset (a set of images)
- Extracted frames and detections are stored in detections/ & frames/ under the video directory
- Indexes (numpy arrays) and list of corresponding frames & detections are stored
- Query images are also stored inside media/queries/ named using primary key of the query object.
- Designed to enable rapid sync with S3 or processing via a third-party program.
### Media directory organization example:
```
media/
├── 1
│   ├── audio
│   ├── detections
│   ├── frames
│   │   ├── 0.jpg
│   │   ├── 10.jpg
│   │   ...
│   │   └── 98.jpg
│   ├── indexes
│   │   ├── alexnet.framelist
│   │   └── alexnet.npy
│   └── video
│       └── 1.mp4
├── 2
│   ├── detections
│   ├── frames
│   │   ├── 0.jpg
│   │   ├── 10.jpg
│   │   ...
│   └── video
│       └── 2.mp4
└── queries
    ├── 1.png
    ├── 10.png
    ....
    ├── 8.png
    └── 9.png

19 directories, 257 files
```
- Pytorch License
- Darknet License
- AdminLTE2 License
- FabricJS License
- Modified PySceneDetect License
- Docker
- Nvidia-docker
- OpenCV
- Numpy
- FFMPEG
- Tensorflow
Copyright 2016-2017, Akshay Bhat, Cornell University, All rights reserved.





