The document discusses object detection and instance segmentation models like YOLOv5, Faster R-CNN, EfficientDet, Mask R-CNN, and TensorFlow's object detection API. It provides information on labeling images with bounding boxes for training these models, including open-source and commercial annotation tools. The document also covers evaluating object detection models using metrics like mean average precision (mAP) and intersection over union (IoU). It includes an example of training YOLOv5 on a custom dataset.
Introduction to object detection, algorithms such as YOLO and Faster R-CNN, and the need for labeled data. Discussion of Mean Squared Error (MSE) as a training loss, and of evaluation metrics such as Intersection over Union (IoU) and mean Average Precision (mAP) for object detection models.
Details on the output format for object detection models and the implementation process with examples of labeling and training.
Description of methods for detecting multiple objects in images and the architecture of object detection models.
Explanation of YOLO algorithm architecture, its efficiency in real-time detection, and output interpretation.
Introduction to YOLOv5, its advantages in production, and implementation details for inference.
Instructions for running inference using YOLOv5, including code snippets and examples for different media types.
Introduction to Faster R-CNN, its history, development, and how it improves upon previous models like R-CNN.
EfficientDet's structure, effectiveness, and architectural advantages over other detectors.
Understanding instance segmentation and the use of models like Mask R-CNN for object detection.
Overview of Detectron2 for object detection, its implementation in PyTorch, and model management.
Guidelines on preparing data for Detectron in COCO format and training custom models.
Steps involved in training Mask R-CNN on custom datasets with instructions on visualization.
Introduction to TensorFlow Object Detection API including model building, training custom datasets, and inference.
Essential steps for gathering data, labeling images, and creating TFRecord files for model training.
Object Detection
Hichem Felouat - Algeria - hichemfel@gmail.com
• We all know the image classification problem: given an image, the network predicts the class to which the image belongs.
• Localizing an object in a picture means predicting a bounding box around the object, which can be expressed as a regression task.
Object Detection
Problem: the dataset does not have bounding boxes around the objects, so how can we train our model?
• We need to add them ourselves. Getting the labels is often one of the hardest and most costly parts of a Machine Learning project.
• It is a good idea to spend time looking for the right tools.
Image Labeling Tool
An image labeling (annotation) tool is used to label images for bounding-box object detection and segmentation.
Open-source image labeling tools:
• VGG Image Annotator
• LabelImg
• OpenLabeler
• ImgLab
Commercial tools:
• LabelBox
• Supervisely
Crowdsourcing platforms:
• Amazon Mechanical Turk
Object Detection - Notes
• The bounding boxes should be normalized so that the horizontal and vertical coordinates, as well as the height and width, all range from 0 to 1.
• It is common to predict the square root of the height and width rather than the height and width directly: this way, a 10-pixel error for a large bounding box will not be penalized as much as a 10-pixel error for a small bounding box.
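The two notes above can be illustrated with a minimal sketch. The function names `encode_box` and `decode_box` are hypothetical (they are not part of any library), and the box is assumed to be given as center coordinates plus width and height in pixels.

```python
import math


def encode_box(x_center, y_center, width, height, img_w, img_h):
    """Normalize a pixel-space box to [0, 1] and take the square root of its size.

    Assumption: the box is (x_center, y_center, width, height) in pixels.
    """
    return (
        x_center / img_w,
        y_center / img_h,
        math.sqrt(width / img_w),
        math.sqrt(height / img_h),
    )


def decode_box(nx, ny, sqrt_w, sqrt_h, img_w, img_h):
    """Invert encode_box: square the size terms and scale back to pixels."""
    return (nx * img_w, ny * img_h, (sqrt_w ** 2) * img_w, (sqrt_h ** 2) * img_h)


# A 104 x 26 pixel box centered at (208, 104) in a 416 x 416 image.
encoded = encode_box(208, 104, 104, 26, 416, 416)
decoded = decode_box(*encoded, 416, 416)
```

Because the regression target is the square root of the size, the same absolute error in the target corresponds to a larger pixel error on big boxes than on small ones, which is the effect described in the second note.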
How to Evaluate an Object Detection Model?
• The MSE often works fairly well as a cost function to train the model, but it is not a great metric to evaluate how well the model can predict bounding boxes.
• The most common metric for this is the Intersection over Union (IoU).
• tf.keras.metrics.MeanIoU
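As a rough illustration of the IoU metric, here is a minimal pure-Python sketch. The function name `iou` and the corner format (x_min, y_min, x_max, y_max) are assumptions chosen for the example, not something the slides prescribe.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x_min, y_min, x_max, y_max)."""
    # Corners of the intersection rectangle.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Width and height are clamped at zero when the boxes do not overlap.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Two 2 x 2 boxes that share a 1 x 1 corner: IoU = 1 / (4 + 4 - 1).
overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))
```

An IoU of 1 means the predicted and ground-truth boxes coincide, and 0 means they do not overlap at all.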
Evaluate Object Detection Model: mean Average Precision (mAP)
To calculate mAP, we draw a series of precision-recall curves with the IoU threshold set at varying levels of difficulty. In COCO evaluation, the IoU threshold ranges from 0.5 to 0.95 with a step size of 0.05, written AP@[.5:.05:.95].
Evaluate Object Detection Model: mAP
We draw these precision-recall curves for the dataset split out by class type (for example, 3 classes and 6 threshold ranges).
Evaluate Object Detection Model: mAP
These precision and recall values are then plotted to get a PR (precision-recall) curve. The area under the PR curve is called Average Precision (AP).
• For each class, calculate AP at different IoU thresholds and take their average to get the AP of that class.
Example: in the PASCAL VOC 2007 challenge, AP is defined as the mean of the interpolated precision values at a set of 11 equally spaced recall levels [0, 0.1, …, 1] (0 to 1 with a step size of 0.1).
Object detection metrics: https://github.com/rafaelpadilla/Object-Detection-Metrics
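The PASCAL VOC 2007 definition can be sketched in a few lines. This is a simplified illustration, assuming the precision-recall points for one class have already been computed; the function name `average_precision_11pt` is hypothetical.

```python
def average_precision_11pt(recalls, precisions):
    """11-point interpolated AP (PASCAL VOC 2007 style).

    recalls and precisions are parallel lists of points on the PR curve.
    At each recall level r in {0, 0.1, ..., 1}, the interpolated precision is
    the maximum precision among points whose recall is at least r.
    """
    total = 0.0
    for i in range(11):
        level = i / 10
        candidates = [p for r, p in zip(recalls, precisions) if r >= level]
        total += max(candidates) if candidates else 0.0
    return total / 11


# Toy PR curve: precision 1.0 up to recall 0.5, then 0.5 at recall 1.0.
ap = average_precision_11pt([0.5, 1.0], [1.0, 0.5])
```

In the toy curve, six recall levels (0 to 0.5) see precision 1.0 and five levels (0.6 to 1.0) see precision 0.5, so the AP is 8.5 / 11.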
Evaluate Object Detection Model: mAP
• Calculate the final value (the mAP) by averaging the AP over the different classes.
[Figures: mAP at different IoU thresholds; mAP example on the COCO dataset]
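The averaging steps can be sketched as follows, assuming the per-class AP at each IoU threshold is already available. The helper names and the toy AP values are hypothetical and only illustrate the arithmetic.

```python
def coco_iou_thresholds():
    """IoU thresholds 0.50, 0.55, ..., 0.95 used in AP@[.5:.05:.95]."""
    return [round(0.5 + 0.05 * i, 2) for i in range(10)]


def mean_average_precision(ap_table):
    """ap_table maps class name -> {iou_threshold: AP}.

    Returns (mAP, per-class AP), where each class AP is the mean over its
    IoU thresholds and mAP is the mean over classes.
    """
    per_class = {
        name: sum(aps.values()) / len(aps) for name, aps in ap_table.items()
    }
    return sum(per_class.values()) / len(per_class), per_class


# Toy values for two classes at two thresholds (made-up numbers).
toy_table = {
    "cat": {0.5: 0.8, 0.75: 0.6},
    "dog": {0.5: 0.6, 0.75: 0.4},
}
m_ap, class_ap = mean_average_precision(toy_table)
```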
Object Detection - Example
[Figure: raw image (416*416) -> image labeling (C, X, Y, W, H) -> model -> image result (C, X, Y, W, H)]
• Each item should be a tuple of the form: (images, (class labels, bounding boxes))
Object Detection - Example
Full code: https://www.kaggle.com/hichemfelouat/braintumorlocalization
Multiple Objects Detection
The task of classifying and localizing multiple objects in an image is called object detection. Multiple objects can be detected by sliding a CNN across the image.
Multiple Objects Detection
1. Add an extra objectness output to your CNN, to estimate the probability that an object is indeed present in the image.
2. Find the bounding box with the highest objectness score, and get rid of all the other bounding boxes that overlap a lot with it (high IoU).
3. Repeat step two until there are no more bounding boxes to get rid of.
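Steps 2 and 3 describe non-max suppression. A minimal pure-Python sketch follows; the function names, the corner box format (x_min, y_min, x_max, y_max) and the default threshold of 0.5 are assumptions made for the example.

```python
def _iou(a, b):
    """IoU of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Return the indices of the boxes kept after greedy non-max suppression."""
    # Candidate indices, highest objectness score first.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Drop every remaining box that overlaps the kept box too much.
        order = [i for i in order if _iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = non_max_suppression(boxes, scores)
```

In this toy input the second box overlaps the first with an IoU of about 0.68, so it is suppressed, while the third box does not overlap and is kept.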
Multiple Objects Detection
In general, object detectors have three main components:
1) The backbone, which extracts features from the given image.
2) The feature network, which takes multiple levels of features from the backbone as input and outputs a list of fused features that represent salient characteristics of the image.
3) The final class/box network, which uses the fused features to predict the class and location of each object.
What is YOLO?
• You Only Look Once (YOLO) is an algorithm that uses convolutional neural networks for object detection.
• It is one of the faster object detection algorithms available.
• It is a very good choice when we need real-time detection without losing too much accuracy.
You Only Look Once: Unified, Real-Time Object Detection: https://arxiv.org/abs/1506.02640
YOLOv3: https://arxiv.org/abs/1804.02767
YOLOv4: https://arxiv.org/abs/2004.10934
What is YOLO?
[Figure: YOLOv4 architecture]
What is YOLO?
Backbone: the feature-extraction architecture.
• Tiny YOLO has only 9 convolutional layers, so it is less accurate but faster and better suited for mobile and embedded projects.
• Darknet53 (the backbone used in YOLOv3) has 53 convolutional layers, so it is more accurate but slower.
• In YOLOv4, backbones can be VGG, ResNet, SpineNet, EfficientNet, ResNeXt, or Darknet53.
What is YOLO?
Neck: additional layers between the backbone and the head (dense prediction block). Its purpose is to add extra information in the layers (Feature Pyramid Networks, Path Aggregation Network, ...).
Head: the part used to locate bounding boxes and classify what is inside each box.
Sparse prediction: used in two-stage detection algorithms.
Interpreting the YOLO Output
• The input is a batch of images of shape (m, 608, 608, 3).
• The output is a list of bounding boxes along with the recognized classes. Each bounding box is represented by 6 numbers (pc, bx, by, bh, bw, c). If we expand c into an 80-dimensional vector (the number of classes), each bounding box is then represented by 85 numbers.
• To do that, we divide the input image into a grid with the same dimensions as the final feature map. For example, if the input image is 608 x 608 and the feature map is 19 x 19, the stride of the network is 32 (608/19 = 32).
Interpreting the YOLO Output
• IMAGE (m, 608, 608, 3) -> DEEP CNN -> ENCODING (m, 19, 19, 5, 85).
• We are using 5 anchor boxes, so each of the 19 x 19 cells encodes information about 5 boxes (YOLOV3).
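The shape arithmetic in the slides can be checked with a short sketch. The function name `yolo_encoding_shape` is hypothetical; the numbers (608 x 608 input, stride 32, 5 anchors, 80 classes) are the ones quoted above.

```python
def yolo_encoding_shape(batch, image_size, stride, num_anchors, num_classes):
    """Shape of the YOLO output encoding for square inputs.

    Each anchor box carries 5 numbers (pc, bx, by, bh, bw) plus one
    probability per class.
    """
    grid = image_size // stride
    return (batch, grid, grid, num_anchors, 5 + num_classes)


# m = 8 images of 608 x 608 pixels, stride 32, 5 anchors, 80 classes.
shape = yolo_encoding_shape(8, 608, 32, 5, 80)
```

With these values the grid is 19 x 19 and every box is described by 85 numbers, matching the encoding (m, 19, 19, 5, 85).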
Interpreting the YOLO Output
• Now, for each box (of each cell), we compute the elementwise product of pc and the class probabilities c, and extract a probability that the box contains a certain class.
What Does YOLO Predict on an Image?
Output Processing
We will carry out these steps:
• Get rid of boxes with a low score, meaning the box is not very confident about detecting a class (for example, pc < 0.5).
• Select only one box when several boxes overlap with each other and detect the same object (non-max suppression).
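The score-filtering step can be sketched as below. This is an illustration only: the function name `filter_boxes`, the input layout (pc, box, class probabilities) and the choice to threshold the best class score rather than pc alone are assumptions made for the example.

```python
def filter_boxes(predictions, score_threshold=0.5):
    """Keep boxes whose best class score reaches the threshold.

    predictions is a list of (pc, box, class_probs) tuples, where pc is the
    objectness and class_probs is a list with one probability per class.
    """
    kept = []
    for pc, box, class_probs in predictions:
        # Class scores are the elementwise product pc * c.
        scores = [pc * p for p in class_probs]
        best_class = max(range(len(scores)), key=scores.__getitem__)
        if scores[best_class] >= score_threshold:
            kept.append((box, best_class, scores[best_class]))
    return kept


predictions = [
    (0.9, (10, 10, 50, 50), [0.1, 0.8, 0.1]),    # best score 0.72: kept
    (0.4, (12, 12, 52, 52), [0.9, 0.05, 0.05]),  # best score 0.36: dropped
]
confident = filter_boxes(predictions)
```

The boxes that survive this filter would then go through non-max suppression to remove duplicates of the same object.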
YOLOv5
• YOLOv5 was released by Glenn Jocher on June 9, 2020. It follows the recent release of YOLOv4 (April 23, 2020).
• YOLOv5 is smaller and generally easier to use in production. Because it is natively implemented in PyTorch (rather than Darknet), modifying the architecture and exporting to many deployment environments is straightforward.
• YOLOv5s is about 88% smaller than big-YOLOv4 (27 MB vs 244 MB).
ultralytics/yolov5: https://github.com/ultralytics/yolov5
Inference - Google Colab
Inference can be run on most common media formats. Model checkpoints are downloaded automatically if available. Results are saved to ./inference/output.
!python detect.py --source 0  # webcam
                           file.jpg  # image
                           file.mp4  # video
                           path/  # directory
                           path/*.jpg  # glob
                           rtsp://170.93.143.139/rtplive/470011e600ef003a004ee33696235daa  # rtsp stream
                           http://112.50.243.8/PLTV/88888888/224/3221225900/1.m3u8  # http stream
ultralytics/yolov5: https://github.com/ultralytics/yolov5
Inference - Example
Train YOLOv5 on a Custom Dataset
Custom object detection data in YOLOv5 format - image
[Figure: sample image from a public blood cell detection dataset]
Train YOLOv5 on a Custom Dataset
Custom object detection data in YOLOv5 format - label
Label (.txt file, one row per object): class x_center y_center width height, with coordinates normalized to the range 0 to 1.
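As an illustration of the label format, the sketch below converts a pixel-space box into one row of a YOLOv5 label file (class index, then normalized center x, center y, width, height). The function name `to_yolo_label` and the corner-based input format are assumptions made for the example.

```python
def to_yolo_label(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Format one object as a YOLO label row with values normalized to [0, 1]."""
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# A box from (104, 104) to (312, 208) in a 416 x 416 image, class index 1.
row = to_yolo_label(1, 104, 104, 312, 208, 416, 416)
```

Each image gets one .txt file with the same base name, containing one such row per labeled object.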
Train YOLOv5 on a Custom Dataset
Download custom object detection data in YOLOv5 format from Roboflow.
# Download Custom Dataset
%mkdir /content/my_dataset/
%cd /content/my_dataset
!curl -L "YOUR LINK HERE" > roboflow.zip; unzip roboflow.zip; rm roboflow.zip
Roboflow simplifies your computer vision workflow, from data organization, annotation verification, preprocessing, and augmentation to exporting in your required model format.
https://app.roboflow.ai/login
Getting Started with Roboflow: https://blog.roboflow.com/getting-started-with-roboflow/
Train YOLOv5 on a Custom Dataset
Use the "YOLOv5 PyTorch" export format. Note that the Ultralytics implementation calls for a YAML file defining where your training and test data are. The Roboflow export also writes this file for us.
Define Model Configuration and Architecture
The data.yaml file specifies the location of a YOLOv5 images folder, a YOLOv5 labels folder, and information on our custom classes.
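To give an idea of what such a file contains, the sketch below builds the text of a small data.yaml in Python. The keys `train`, `val`, `nc` and `names` follow the layout used by the Ultralytics repository at the time; the folder paths and the three class names are assumptions for a blood cell dataset and should be replaced with your own.

```python
def make_data_yaml(train_dir, val_dir, class_names):
    """Return the text of a minimal YOLOv5 data.yaml file."""
    lines = [
        f"train: {train_dir}",         # folder with training images
        f"val: {val_dir}",             # folder with validation images
        "",
        f"nc: {len(class_names)}",     # number of classes
        f"names: {class_names!r}",     # class names, in label-index order
    ]
    return "\n".join(lines) + "\n"


# Hypothetical paths and class names, for illustration only.
yaml_text = make_data_yaml(
    "../train/images", "../valid/images", ["Platelets", "RBC", "WBC"]
)
```

The label folders are not listed explicitly in this layout; the assumption here is that labels sit in a `labels` folder next to each `images` folder, as in the Roboflow export.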
Define Model Configuration and Architecture
[Figure: an example of how to organize a dataset: my_dataset / test set / test images, test labels / data.yaml]
Define Model Configuration and Architecture
1) Write a new YAML file (for example, my_model_l.yaml) that defines the parameters for our model, such as the number of classes, the anchors, and each layer.
2) Copy the contents of one of the provided files (yolov5s, m, l, x) and paste them into your YAML file (for example, my_model_l.yaml).
3) Update the parameters.
Train YOLOv5 on a Custom Dataset
• img: define the input image size
• batch: determine the batch size
• epochs: define the number of training epochs (note: 3000+ is common here!)
• data: set the path to our YAML file
• cfg: specify our model configuration
• weights: specify a custom path to weights (note: you can download weights from the Ultralytics Google Drive folder)
• name: result names
• nosave: only save the final checkpoint
• cache: cache images for faster training
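The options above map onto a train.py command line. The sketch below only assembles the argument list in Python; the flag names are the ones listed in the slide, while the helper name `build_train_command` and all example values (image size, batch size, file names) are assumptions.

```python
def build_train_command(img, batch, epochs, data, cfg, weights, name,
                        nosave=False, cache=True):
    """Assemble the argument list for YOLOv5's train.py from the listed options."""
    cmd = [
        "python", "train.py",
        "--img", str(img),
        "--batch", str(batch),
        "--epochs", str(epochs),
        "--data", data,
        "--cfg", cfg,
        "--weights", weights,   # an empty string means no pretrained weights
        "--name", name,
    ]
    if nosave:
        cmd.append("--nosave")  # only save the final checkpoint
    if cache:
        cmd.append("--cache")   # cache images for faster training
    return cmd


# Hypothetical values; adjust paths and sizes to your own setup.
command = build_train_command(
    416, 16, 100, "../my_dataset/data.yaml", "./models/my_model_l.yaml",
    "", "my_model_l_results",
)
```

In a Colab cell the same arguments would typically be typed directly after `!python train.py`; the list form is convenient when launching the run with `subprocess.run(command)`.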
Evaluate Custom YOLOv5 Detector Performance
# Evaluate Custom YOLOv5 Detector Performance
%load_ext tensorboard
%tensorboard --logdir runs
# We can also display the older-style result plots if TensorBoard isn't working for whatever reason.
from IPython.display import Image
Image(filename='/content/yolov5/runs/exp0_my_model_l_results/results.png', width=1000)  # view results.png
Run Inference With Trained Weights
# Run inference on all test images:
%cd /content/yolov5/
!python detect.py --weights /content/yolov5/runs/exp0_my_model_l_results/weights/best.pt --img 416 --conf 0.5 --source /content/my_dataset/test/images
# Display inference results for ALL test images
import glob
from IPython.display import Image, display
for imageName in glob.glob('/content/yolov5/inference/output/*.jpg'):
    display(Image(filename=imageName))
    print("\n")
Export Saved YOLOv5 Weights for Future Inference
# Export Saved YOLOv5 Weights for Future Inference
from google.colab import drive
drive.mount('/content/gdrive')
%cp /content/yolov5/runs/exp2_my_model_results/weights/best_my_model_results.pt /content/gdrive/My\ Drive
We recommend following the full code in this YOLOv5 Colab Notebook: https://colab.research.google.com/drive/1gDZ2xcTOgR39tGGs-EZ6i3RTs16wmzZQ
How to Train YOLOv5 On a Custom Dataset: https://blog.roboflow.com/how-to-train-yolov5-on-a-custom-dataset/
Faster R-CNN
• Faster R-CNN (Faster Region-based Convolutional Neural Network) is now a canonical model for deep learning-based object detection. It helped inspire many detection and segmentation models that came after it.
• Faster R-CNN was originally published at NIPS 2015.
• We cannot understand Faster R-CNN without understanding its predecessors, R-CNN and Fast R-CNN.
R-CNN
R-CNN (Region-based Convolutional Neural Network) (2014):
• Scan the input image for possible objects using an algorithm called Selective Search, generating ~2000 region proposals.
• Run a convolutional neural net (CNN) on top of each of these region proposals.
• Take the output of each CNN and feed it into a) an SVM to classify the region and b) a linear regressor to tighten the bounding box of the object, if such an object exists.
Selective Search for Object Recognition: https://link.springer.com/article/10.1007/s11263-013-0620-5
R-CNN: https://arxiv.org/abs/1311.2524
Fast R-CNN
The purpose of the Fast Region-based Convolutional Network (Fast R-CNN) is to reduce the time consumption related to the high number of models necessary to analyze all region proposals.
• Fast R-CNN was developed by Ross Girshick in 2015.
• It performs feature extraction over the image before proposing regions, thus running only one CNN over the entire image instead of 2000 CNNs over 2000 overlapping regions.
• It replaces the SVM with a softmax layer, thus extending the neural network for predictions instead of creating a new model.
Fast R-CNN: https://arxiv.org/abs/1504.08083
Faster R-CNN
• The input image is passed through a pre-trained CNN (the original Faster R-CNN used VGG), and the resulting convolutional feature map is used for the next part.
• Next, the Region Proposal Network (RPN) uses the features that the CNN computed to find up to a predefined number of regions (bounding boxes) which may contain objects. It generates multiple possible regions based on k fixed-ratio anchor boxes (default bounding boxes). The RPN accelerates the training and testing processes.
• Finally comes the R-CNN module, which uses that information to:
  - Classify the content in the bounding box (or discard it, using “background” as a label).
  - Adjust the bounding box coordinates (so it better fits the object).
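The k anchor boxes used by the RPN can be sketched as follows. The defaults of 3 scales (128, 256, 512 pixels) and 3 aspect ratios (1:2, 1:1, 2:1), giving k = 9, are the ones reported in the Faster R-CNN paper; the function name `generate_anchors` and the convention that the ratio is height divided by width are assumptions made for the example.

```python
import math


def generate_anchors(center_x, center_y,
                     scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return k = len(scales) * len(ratios) anchors centered on one location.

    Each anchor is (x_min, y_min, x_max, y_max). For a given scale s and
    ratio r = height / width, the anchor keeps an area of s * s.
    """
    anchors = []
    for scale in scales:
        for ratio in ratios:
            width = scale / math.sqrt(ratio)
            height = scale * math.sqrt(ratio)
            anchors.append((
                center_x - width / 2, center_y - height / 2,
                center_x + width / 2, center_y + height / 2,
            ))
    return anchors


# Nine anchors centered on pixel (300, 300).
anchors = generate_anchors(300, 300)
```

The RPN evaluates such a set of anchors at every position of the feature map and predicts, for each one, an objectness score and a box refinement.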
Faster R-CNN
Faster R-CNN: Down the rabbit hole of modern object detection: https://tryolabs.com/blog/2018/01/18/faster-r-cnn-down-the-rabbit-hole-of-modern-object-detection/
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks: https://arxiv.org/abs/1506.01497
EfficientDet
• EfficientDets are a family of object detection models that are based on EfficientNet and are reportedly much more efficient than other state-of-the-art models.
• In object detection, balancing the trade-off between accuracy and efficiency is a major challenge; EfficientDet attempts to minimize this trade-off and give the best detector in terms of both accuracy and efficiency.
• EfficientDet-D7 achieves a state-of-the-art 51.0 mAP on the COCO dataset with 52M parameters and 326B FLOPS, being 4x smaller and using 9.3x fewer FLOPS than the best previous detector.
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks: https://arxiv.org/abs/1905.11946
EfficientDet: Scalable and Efficient Object Detection: https://arxiv.org/abs/1911.09070
GitHub: https://github.com/google/automl/tree/master/efficientdet
EfficientDet architecture
EfficientDet uses EfficientNet as the backbone network and a newly proposed bi-directional feature pyramid network (BiFPN) as the feature network.
Instance Segmentation
Instance segmentation aims to predict the class label and a pixel-level mask for each object instance; it localizes the different object instances present in an image. It is largely useful for robotics, autonomous driving, surveillance, etc.
• Classification: There is a balloon in this image.
• Semantic Segmentation: These are all the balloon pixels.
• Object Detection: There are 7 balloons in this image at these locations. We’re starting to account for objects that overlap.
• Instance Segmentation: There are 7 balloons at these locations, and these are the pixels that belong to each one.
A Survey on Instance Segmentation: State of the art: https://arxiv.org/abs/2007.00047
Mask R-CNN
• Mask R-CNN (Mask Region-based Convolutional Neural Network) is a state-of-the-art model for instance segmentation, developed on top of Faster R-CNN.
• For a given image, Mask R-CNN returns the object mask in addition to the class label and bounding box coordinates of each object.
• The Mask R-CNN model was introduced in the 2017 paper titled “Mask R-CNN”.
Mask R-CNN: https://arxiv.org/abs/1703.06870
Detectron
• Detectron2 was built by Facebook AI Research (FAIR) to support rapid implementation and evaluation of novel computer vision research.
• Detectron2 is now implemented in PyTorch.
• Detectron2 is flexible and extensible, and able to provide fast training on single or multiple GPU servers.
• Detectron2 can be used as a library to support different projects on top of it.
Detectron2: A PyTorch-based modular object detection library: https://ai.facebook.com/blog/-detectron2-a-pytorch-based-modular-object-detection-library-/
Detectron2: https://github.com/facebookresearch/detectron2
detectron2’s documentation: https://detectron2.readthedocs.io/
Detectron Model Zoo
Detectron2 Model Zoo: https://github.com/facebookresearch/detectron2/blob/master/MODEL_ZOO.md
Detectron - Google Colab
# install dependencies:
!pip install pyyaml==5.1 'pycocotools>=2.0.1'
import torch, torchvision
print(torch.__version__, torch.cuda.is_available())
!gcc --version
# install detectron2: (Colab has CUDA 10.1 + torch 1.6)
# See https://detectron2.readthedocs.io/tutorials/install.html for instructions
assert torch.__version__.startswith("1.6")
!pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu101/torch1.6/index.html
Detectron - Inference
# Some basic setup:
# Setup detectron2 logger
import detectron2
from detectron2.utils.logger import setup_logger
setup_logger()
# import some common libraries
import numpy as np
import os, json, cv2, random
from google.colab.patches import cv2_imshow
# import some common detectron2 utilities
from detectron2 import model_zoo
from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
from detectron2.utils.visualizer import Visualizer
from detectron2.data import MetadataCatalog, DatasetCatalog
# Load an image:
img = cv2.imread("/content/img.png")
cv2_imshow(img)
Detectron2 Beginner's Tutorial (Colab): https://colab.research.google.com/drive/16jcaJoc6bCFAQ96jDe2HwtXj7BMD_-m5
Detectron - Inference
# MODEL_ZOO: choose the config of the algorithm from
# https://github.com/facebookresearch/detectron2/blob/master/MODEL_ZOO.md
# faster_rcnn
model_link1 = "COCO-Detection/faster_rcnn_X_101_32x8d_FPN_3x.yaml"
# mask_rcnn
model_link2 = "COCO-InstanceSegmentation/mask_rcnn_X_101_32x8d_FPN_3x.yaml"
# keypoint_rcnn
model_link3 = "COCO-Keypoints/keypoint_rcnn_X_101_32x8d_FPN_3x.yaml"
# Run a pre-trained detectron2 model:
# we create a detectron2 config and a detectron2 DefaultPredictor to run inference on this image.
cfg = get_cfg()
# add project-specific config (e.g., TensorMask) here if you're not running a model in detectron2's core library
cfg.merge_from_file(model_zoo.get_config_file(model_link2))
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5  # set threshold for this model
# Find a model from detectron2's model zoo. You can use the https://dl.fbaipublicfiles... url as well
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(model_link2)
predictor = DefaultPredictor(cfg)
outputs = predictor(img)
Detectron - Inference
# Show results
print("-----------------------")
print("outputs : \n", outputs)
print("-----------------------")
print("pred_classes : \n", outputs["instances"].pred_classes)
print("pred_boxes : \n", outputs["instances"].pred_boxes)
# pred_masks for mask_rcnn: outputs["instances"].pred_masks
# We can use `Visualizer` to draw the predictions on the image.
vis = Visualizer(img[:, :, ::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]), scale=1.2)
out = vis.draw_instance_predictions(outputs["instances"].to("cpu"))
cv2_imshow(out.get_image()[:, :, ::-1])
"""
# PanopticSegmentation
model_link = "COCO-PanopticSegmentation/panoptic_fpn_R_101_3x.yaml"
.
.
predictor = DefaultPredictor(cfg)
panoptic_seg, segments_info = predictor(im)["panoptic_seg"]
v = Visualizer(im[:, :, ::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]), scale=1.2)
out = v.draw_panoptic_seg_predictions(panoptic_seg.to("cpu"), segments_info)
cv2_imshow(out.get_image()[:, :, ::-1])
"""
Train Detectron on a custom dataset
Convert your dataset to COCO format:
The COCO dataset is formatted in JSON and has five annotation types: object detection, keypoint detection, stuff segmentation, panoptic segmentation, and image captioning. It is a collection of “info”, “licenses”, “images”, “annotations”, “categories” (in most cases), and “segment info” (in one case).
• If your dataset is not in COCO format, you have to convert it.
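For object detection, a COCO-style annotation file can be as small as the sketch below. The file name, image size, category name and box values are made up for illustration; the field names follow the COCO object detection format, where `bbox` is [x, y, width, height] in pixels with (x, y) the top-left corner.

```python
import json


def make_coco_dataset():
    """Build a minimal COCO-style object detection dataset as a dict."""
    return {
        "info": {"description": "toy dataset (illustrative only)"},
        "licenses": [],
        "images": [
            {"id": 1, "file_name": "img_0001.jpg", "width": 416, "height": 416},
        ],
        "annotations": [
            {
                "id": 1,
                "image_id": 1,        # refers to images[].id
                "category_id": 1,     # refers to categories[].id
                "bbox": [100.0, 120.0, 50.0, 80.0],  # x, y, width, height
                "area": 4000.0,
                "iscrowd": 0,
            },
        ],
        "categories": [{"id": 1, "name": "cell"}],
    }


# Serialize to the JSON text that would be saved as the annotation file.
coco_json = json.dumps(make_coco_dataset(), indent=2)
```

A converter for another label format would fill the same three lists (`images`, `annotations`, `categories`) from its own files.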
Train Detectron on a custom dataset
Import and register custom Detectron2 data in COCO JSON format from Roboflow:
# Download Custom Dataset
# COCO JSON
%mkdir /content/my_dataset/
%cd /content/my_dataset
!curl -L "{YOUR LINK HERE}" > roboflow.zip; unzip roboflow.zip; rm roboflow.zip
Full code: https://colab.research.google.com/drive/1-TNOcPm3Jr3fOJG8rnGT9gh60mHUsvaW#scrollTo=zVBjf0DE7HEW
Train Detectron on a custom dataset
[Figure: public blood cell dataset]
Train Detectron on a custom dataset
• Detectron2 keeps track of a list of available datasets in a registry, so we must register our custom data with Detectron2 so it can be invoked for training.
# Register Custom Detectron2 Data
from detectron2.data.datasets import register_coco_instances
register_coco_instances("my_dataset_train", {}, "/content/my_dataset/train/_annotations.coco.json", "/content/my_dataset/train")
register_coco_instances("my_dataset_val", {}, "/content/my_dataset/valid/_annotations.coco.json", "/content/my_dataset/valid")
register_coco_instances("my_dataset_test", {}, "/content/my_dataset/test/_annotations.coco.json", "/content/my_dataset/test")
Train Detectron on a custom dataset
• If you want to use a custom dataset while also reusing detectron2’s data loaders, you will need to:
  1. Register your dataset (i.e., tell detectron2 how to obtain your dataset).
  2. Optionally, register metadata for your dataset.
• To let detectron2 know how to obtain a dataset named “my_dataset”, you need to implement a function that returns the items in your dataset and then tell detectron2 about this function. Alternatively, you can convert your annotations to COCO format.
Detectron2 custom dataset tutorial: https://detectron2.readthedocs.io/tutorials/datasets.html
Train Detectron on a custom dataset
# visualize training data
my_dataset_train_metadata = MetadataCatalog.get("my_dataset_train")
dataset_dicts = DatasetCatalog.get("my_dataset_train")
import random
from detectron2.utils.visualizer import Visualizer
for d in random.sample(dataset_dicts, 3):
    img = cv2.imread(d["file_name"])
    visualizer = Visualizer(img[:, :, ::-1], metadata=my_dataset_train_metadata, scale=0.5)
    out = visualizer.draw_dataset_dict(d)
    cv2_imshow(out.get_image()[:, :, ::-1])
    print("\n")
Train Detectron on a custom dataset
# We define our own Trainer class here to run the COCO validation
# evaluation during training. Otherwise no validation eval occurs.
from detectron2.engine import DefaultTrainer
from detectron2.evaluation import COCOEvaluator

class CocoTrainer(DefaultTrainer):
    @classmethod
    def build_evaluator(cls, cfg, dataset_name, output_folder=None):
        if output_folder is None:
            os.makedirs("coco_eval", exist_ok=True)
            output_folder = "coco_eval"
        return COCOEvaluator(dataset_name, cfg, False, output_folder)
Train Detectron on a custom dataset
model_link = "COCO-Detection/faster_rcnn_X_101_32x8d_FPN_3x.yaml"
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(model_link))
cfg.DATASETS.TRAIN = ("my_dataset_train",)
cfg.DATASETS.TEST = ("my_dataset_val",)
cfg.DATALOADER.NUM_WORKERS = 4  # Number of data loading threads.
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(model_link)  # Let training initialize from model zoo.
cfg.SOLVER.IMS_PER_BATCH = 4  # Number of images per batch across all machines.
cfg.SOLVER.BASE_LR = 0.001
cfg.SOLVER.WARMUP_ITERS = 1000
cfg.SOLVER.MAX_ITER = 1500  # Adjust up if val mAP is still rising, adjust down if overfit.
cfg.SOLVER.STEPS = (1000, 1500)  # The iteration numbers at which to decrease the learning rate by GAMMA.
cfg.SOLVER.GAMMA = 0.05
cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 64
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 4  # Your number of classes
cfg.TEST.EVAL_PERIOD = 100  # The period (in terms of steps) to evaluate the model during training.
os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
trainer = CocoTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()
Detectron2.config package: https://detectron2.readthedocs.io/modules/config.html
Train Detectron on a custom dataset
# Test evaluation
from detectron2.data import DatasetCatalog, MetadataCatalog, build_detection_test_loader
from detectron2.evaluation import COCOEvaluator, inference_on_dataset
cfg.MODEL.WEIGHTS = os.path.join(cfg.OUTPUT_DIR, "model_final.pth")
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.85
predictor = DefaultPredictor(cfg)
evaluator = COCOEvaluator("my_dataset_test", cfg, False, output_dir="./output/")
val_loader = build_detection_test_loader(cfg, "my_dataset_test")
inference_on_dataset(trainer.model, val_loader, evaluator)
# Look at training curves in tensorboard
%load_ext tensorboard
%tensorboard --logdir output
Train Detectron on a custom dataset
# Inference & evaluation using the trained model
# cfg already contains everything we've set previously. Now we change it a little bit for inference:
cfg.MODEL.WEIGHTS = os.path.join(cfg.OUTPUT_DIR, "model_final.pth")
cfg.DATASETS.TEST = ("my_dataset_test", )
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.7  # set the testing threshold for this model
predictor = DefaultPredictor(cfg)
test_metadata = MetadataCatalog.get("my_dataset_test")
from detectron2.utils.visualizer import ColorMode
import glob
for imageName in glob.glob("/content/my_dataset/test/*jpg"):
    img = cv2.imread(imageName)
    outputs = predictor(img)
    v = Visualizer(img[:, :, ::-1], metadata=test_metadata, scale=0.8)
    out = v.draw_instance_predictions(outputs["instances"].to("cpu"))
    cv2_imshow(out.get_image()[:, :, ::-1])
    print("\n")
Train Detectron - Save/Load Trained Model
# Save model_final.pth
%cp /content/my_dataset/output/model_final.pth /content/gdrive/My\ Drive
# Load the saved model for inference
my_cfg = get_cfg()
my_cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_X_101_32x8d_FPN_3x.yaml"))
my_cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5
my_cfg.MODEL.WEIGHTS = "/content/gdrive/My Drive/model_final.pth"
my_cfg.MODEL.ROI_HEADS.NUM_CLASSES = 4
my_cfg.DATASETS.TEST = ("my_dataset_test", )
my_model_detectron2 = DefaultPredictor(my_cfg)
test_metadata = MetadataCatalog.get("my_dataset_test")
from detectron2.utils.visualizer import ColorMode
import glob
for imageName in glob.glob("/content/my_dataset/test/*jpg"):
    im = cv2.imread(imageName)
    outputs = my_model_detectron2(im)
    print("Classes : \n", outputs["instances"].pred_classes)
    print("Scores : \n", outputs["instances"].scores)
    v = Visualizer(im[:, :, ::-1], metadata=test_metadata, scale=0.8)
    out = v.draw_instance_predictions(outputs["instances"].to("cpu"))
    cv2_imshow(out.get_image()[:, :, ::-1])
    print("\n")
Instance Segmentation on a custom dataset

# Download, decompress the data
%mkdir /content/my_dataset/
%cd /content/my_dataset
!wget https://github.com/TannerGilbert/Detectron2-Train-a-Instance-Segmentation-Model/raw/master/microcontroller_segmentation_data.zip
!unzip microcontroller_segmentation_data.zip

Full code : https://github.com/TannerGilbert/Detectron2-Train-a-Instance-Segmentation-Model
Instance Segmentation on a custom dataset

Here the dataset is in its own custom format, so we write a function (get_my_dataset_dicts) that parses it and converts it into detectron2's standard format. Users should write such a function whenever they use a dataset in a custom format. See the tutorial for more details: https://detectron2.readthedocs.io/tutorials/datasets.html
Instance Segmentation on a custom dataset

import json
import os
import numpy as np
from detectron2.structures import BoxMode

def get_my_dataset_dicts(directory):
    classes = ['Raspberry_Pi_3', 'Arduino_Nano', 'ESP8266', 'Heltec_ESP32_Lora']
    dataset_dicts = []
    # One labelme JSON file per image
    for filename in [file for file in os.listdir(directory) if file.endswith('.json')]:
        json_file = os.path.join(directory, filename)
        with open(json_file) as f:
            img_anns = json.load(f)

        record = {}
        filename = os.path.join(directory, img_anns["imagePath"])
        record["file_name"] = filename
        # Image size is hardcoded for this dataset
        record["height"] = 600
        record["width"] = 800

        annos = img_anns["shapes"]
        objs = []
        for anno in annos:
            px = [a[0] for a in anno['points']]
            py = [a[1] for a in anno['points']]
            # Flatten the polygon to [x1, y1, x2, y2, ...]
            poly = [(x, y) for x, y in zip(px, py)]
            poly = [p for x in poly for p in x]
            obj = {
                "bbox": [np.min(px), np.min(py), np.max(px), np.max(py)],
                "bbox_mode": BoxMode.XYXY_ABS,
                "segmentation": [poly],
                "category_id": classes.index(anno['label']),
                "iscrowd": 0
            }
            objs.append(obj)
        record["annotations"] = objs
        dataset_dicts.append(record)
    return dataset_dicts
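The function above hardcodes the image size as 600 x 800. labelme annotation files normally also carry imageHeight and imageWidth fields, so for datasets with mixed image sizes the values could be read from the JSON instead. The following is a minimal sketch under that assumption; read_image_size is a hypothetical helper, not part of the original code, and the sample annotation is invented.

```python
import json


def read_image_size(img_anns, default=(600, 800)):
    """Return (height, width) from a labelme annotation dict.

    Falls back to `default` when the size fields are missing or null.
    """
    height = img_anns.get("imageHeight") or default[0]
    width = img_anns.get("imageWidth") or default[1]
    return int(height), int(width)


# A made-up, minimal labelme-style annotation for illustration.
sample = json.loads(
    '{"imagePath": "board.jpg", "imageHeight": 480, "imageWidth": 640, "shapes": []}'
)
print(read_image_size(sample))
print(read_image_size({"imagePath": "old.jpg", "shapes": []}))
```

Inside get_my_dataset_dicts, the two hardcoded assignments could then become record["height"], record["width"] = read_image_size(img_anns).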
Instance Segmentation on a custom dataset

from detectron2.data import DatasetCatalog, MetadataCatalog

# Register the train and test splits
for d in ["train", "test"]:
    DatasetCatalog.register("my_dataset_" + d, lambda d=d: get_my_dataset_dicts("/content/my_dataset/Microcontroller Segmentation/" + d))
    MetadataCatalog.get("my_dataset_" + d).set(thing_classes=["Raspberry_Pi_3", "Arduino_Nano", "ESP8266", "Heltec_ESP32_Lora"])
my_dataset_metadata = MetadataCatalog.get("my_dataset_train")

# visualize training data
import random

dataset_dicts = get_my_dataset_dicts("/content/my_dataset/Microcontroller Segmentation/train")
for d in random.sample(dataset_dicts, 3):
    img = cv2.imread(d["file_name"])
    visualizer = Visualizer(img[:, :, ::-1], metadata=my_dataset_metadata, scale=0.5)
    out = visualizer.draw_dataset_dict(d)
    cv2_imshow(out.get_image()[:, :, ::-1])
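The registration loop passes lambda d=d: ... rather than a plain lambda. The default argument freezes the current value of d for each registered loader; without it, every loader would see the last value of the loop variable. A small stand-alone sketch of the difference (build_loaders is a hypothetical name used only for this illustration):

```python
def build_loaders(splits):
    """Create one loader per split, with and without the `d=d` default."""
    late, bound = {}, {}
    for d in splits:
        # Looks up `d` only when called, i.e. after the loop has finished.
        late[d] = lambda: "my_dataset_" + d
        # Captures the value `d` has right now.
        bound[d] = lambda d=d: "my_dataset_" + d
    return late, bound


late, bound = build_loaders(["train", "test"])
print(late["train"]())   # resolves to the last loop value
print(bound["train"]())  # resolves to the value at registration time
```

With the plain lambda, the "train" entry would load the test split once it is called.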
Instance Segmentation on a custom dataset

# Inference & evaluation using the trained model
# cfg already contains everything we've set previously. Now we change it a little bit for inference:
cfg.MODEL.WEIGHTS = os.path.join(cfg.OUTPUT_DIR, "model_final.pth")
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5  # set a custom testing threshold
cfg.DATASETS.TEST = ("my_dataset_test", )
predictor = DefaultPredictor(cfg)

from detectron2.utils.visualizer import ColorMode

dataset_dicts = get_my_dataset_dicts("/content/my_dataset/Microcontroller Segmentation/test")
for d in random.sample(dataset_dicts, 5):
    img = cv2.imread(d["file_name"])
    outputs = predictor(img)
    # IMAGE_BW greys out everything outside the predicted instances
    v = Visualizer(img[:, :, ::-1], metadata=my_dataset_metadata, scale=0.5, instance_mode=ColorMode.IMAGE_BW)
    out = v.draw_instance_predictions(outputs["instances"].to("cpu"))
    cv2_imshow(out.get_image()[:, :, ::-1])
    print("\n")
TensorFlow Object Detection API

• The TensorFlow Object Detection API is an open-source framework built on top of TensorFlow that makes it easy to construct, train, and deploy object detection models.
• The TensorFlow Object Detection API allows you to train a collection of state-of-the-art object detection models under a unified framework.

TensorFlow Object Detection API : https://github.com/tensorflow/models/tree/master/research/object_detection
TensorFlow Object Detection API - Model Zoo

Model Zoo : https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md
TensorFlow Object Detection API - Inference

import os
import pathlib

# Clone the tensorflow models repository if it doesn't already exist
if "models" in pathlib.Path.cwd().parts:
    while "models" in pathlib.Path.cwd().parts:
        os.chdir("..")
elif not pathlib.Path("models").exists():
    !git clone --depth 1 https://github.com/tensorflow/models

# Install the Object Detection API (run in its own cell, with %%bash as the first line)
%%bash
cd models/research/
protoc object_detection/protos/*.proto --python_out=.
cp object_detection/packages/tf2/setup.py .
python -m pip install .

# Run model builder test (run in its own cell)
!python /content/models/research/object_detection/builders/model_builder_tf2_test.py
TensorFlow Object Detection API - Inference

Download the model: more models can be found in the TensorFlow 2 Detection Model Zoo. To use a different model, you need the name of that model as it appears in its download URL. This can be done as follows:
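A minimal sketch of building that URL, assuming the download location follows the pattern used for the pretrained checkpoints in the training part of these slides (base URL, a release-date folder, then the model name plus .tar.gz). The helper name build_model_url and the default date are illustrative assumptions; the model name is one of those listed in the training configuration.

```python
def build_model_url(model_name, model_date="20200711"):
    """Compose the download URL of a TF2 Detection Model Zoo archive."""
    base_url = "http://download.tensorflow.org/models/object_detection/tf2/"
    return base_url + model_date + "/" + model_name + ".tar.gz"


MODEL_NAME = "faster_rcnn_resnet50_v1_640x640_coco17_tpu-8"
MODEL_URL = build_model_url(MODEL_NAME)
print(MODEL_URL)
```

The resulting URL could then be handed to tf.keras.utils.get_file (the same function the label download uses) with untar=True to fetch and unpack the archive; whether a given model was published under this date folder should be checked against the Model Zoo page.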
TensorFlow Object Detection API - Inference

# Download labels file
"""
Since the pre-trained model we will use has been trained on the COCO dataset,
we will need to download the labels file corresponding to this dataset,
named mscoco_label_map.pbtxt.
"""
import pathlib
import tensorflow as tf

def download_labels(filename):
    base_url = 'https://raw.githubusercontent.com/tensorflow/models/master/research/object_detection/data/'
    label_dir = tf.keras.utils.get_file(fname=filename, origin=base_url + filename, untar=False)
    label_dir = pathlib.Path(label_dir)
    return str(label_dir)

LABEL_FILENAME = "mscoco_label_map.pbtxt"
PATH_TO_LABELS = download_labels(LABEL_FILENAME)
TensorFlow Object Detection API - Inference

# Load The Downloaded Model
import time
from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as viz_utils

PATH_TO_SAVED_MODEL = PATH_TO_MODEL_DIR + "/saved_model"
print("Loading model...", end="")
start_time = time.time()

# Load saved model and build the detection model
my_detection_model = tf.saved_model.load(PATH_TO_SAVED_MODEL)
end_time = time.time()
elapsed_time = end_time - start_time
print("Done! Took {} seconds".format(elapsed_time))

# Load label map data (for plotting)
category_index = label_map_util.create_category_index_from_labelmap(PATH_TO_LABELS, use_display_name=True)
TensorFlow Object Detection API - Inference

import numpy as np
from PIL import Image

def load_image_into_numpy_array(path):
    """
    Load an image from file into a numpy array.

    Puts image into numpy array to feed into tensorflow graph.
    Note that by convention we put it into a numpy array with shape
    (height, width, channels), where channels=3 for RGB.

    Args:
        path: the file path to the image
    Returns:
        uint8 numpy array with shape (img_height, img_width, 3)
    """
    return np.array(Image.open(path))
TensorFlow Object Detection API - Inference

# Mask-RCNN
from object_detection.utils import ops as utils_ops

def run_inference_for_single_image(image_path):
    print("Running inference for : ", image_path)
    image_np = load_image_into_numpy_array(image_path)
    # The input needs to be a tensor, convert it using `tf.convert_to_tensor`.
    input_tensor = tf.convert_to_tensor(image_np)
    # The model expects a batch of images, so add an axis with `tf.newaxis`.
    input_tensor = input_tensor[tf.newaxis, ...]
    # input_tensor = np.expand_dims(image_np, 0)
    detections = my_detection_model(input_tensor)

    # All outputs are batched tensors; index [0] below removes the batch dimension.
    num_detections = int(detections.pop("num_detections"))
    detections["num_detections"] = num_detections
    image_np_with_detections = image_np.copy()

    # Handle models with masks:
    mask = None
    if "detection_masks" in detections:
        # Reframe the bbox masks to the image size.
        detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
            detections["detection_masks"][0], detections["detection_boxes"][0],
            image_np.shape[0], image_np.shape[1])
        detection_masks_reframed = tf.cast(detection_masks_reframed > 0.5, tf.uint8)
        detections["detection_masks_reframed"] = detection_masks_reframed.numpy()
        mask = np.asarray(detections["detection_masks_reframed"])

    boxes = np.asarray(detections["detection_boxes"][0])
    classes = np.asarray(detections["detection_classes"][0]).astype(np.int64)
    scores = np.asarray(detections["detection_scores"][0])

    # Visualizing the results
    viz_utils.visualize_boxes_and_labels_on_image_array(
        image_np_with_detections, boxes, classes, scores, category_index,
        instance_masks=mask, use_normalized_coordinates=True, line_thickness=3)
    cv2_imshow(image_np_with_detections[:, :, ::-1])  # cv2_imshow expects BGR
    display(Image.fromarray(image_np_with_detections))
    print("Done")

TF2 ObjectDetectionAPI Inference.ipynb : https://colab.research.google.com/drive/1ossp9IRFgjIQ1-PbWqaChcgMfjDPGDyL?usp=sharing
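The handling of the batch dimension can be illustrated without TensorFlow. In the sketch below, plain Python lists stand in for the output tensors, the values are invented, and unbatch_detections is a hypothetical helper: every output has a leading batch axis of size 1 and is padded to a fixed number of slots, so the batch axis is dropped and only the first num_detections entries are kept.

```python
def unbatch_detections(detections):
    """Drop the batch axis and keep only the first `num_detections` entries."""
    num = int(detections["num_detections"][0])
    trimmed = {
        key: value[0][:num]
        for key, value in detections.items()
        if key != "num_detections"
    }
    trimmed["num_detections"] = num
    return trimmed


# Made-up model output for a batch of one image, padded to three slots.
raw = {
    "num_detections": [2.0],
    "detection_scores": [[0.9, 0.6, 0.0]],
    "detection_classes": [[1.0, 18.0, 1.0]],
    "detection_boxes": [[[0.1, 0.1, 0.5, 0.5],
                         [0.2, 0.3, 0.9, 0.8],
                         [0.0, 0.0, 0.0, 0.0]]],
}

result = unbatch_detections(raw)
print(result["detection_scores"])
print(len(result["detection_boxes"]), "boxes kept")
```

With real tensors the same idea is usually written as value[0, :num].numpy() for each output.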
Train TF OD API on a custom dataset

• The TF2 Object Detection API models expect the input dataset in TFRecord format.
• The TFRecord format is a simple format for storing a sequence of binary records.
• For a guide on creating a TFRecord file for computer vision and object detection, see: https://blog.roboflow.com/how-to-create-to-a-tfrecord-file-for-computer-vision/
Train TF OD API on a custom dataset

# Downloading data from Roboflow
# UPDATE THIS LINK - get our data from Roboflow
%mkdir /content/my_dataset/
%cd /content/my_dataset/
!curl -L "[YOUR LINK HERE]" > roboflow.zip; unzip roboflow.zip; rm roboflow.zip

• After annotating the dataset, we can convert it from XML, CSV, or any other format to TFRecord format.
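To make the conversion step more concrete, the sketch below shows, in plain Python, the kind of per-image values such a record usually holds. The key names follow the convention used by the TF Object Detection API dataset tools, with box coordinates normalized to [0, 1]; the helper to_tf_example_fields, the file name, and the sample box are illustrative assumptions. A real converter would additionally wrap each value in a tf.train.Feature, add the encoded image bytes, and serialize the result.

```python
def to_tf_example_fields(filename, width, height, boxes):
    """Build the per-image values a detection record typically stores.

    `boxes` is a list of (label_text, label_id, xmin, ymin, xmax, ymax),
    with coordinates given in pixels.
    """
    return {
        "image/filename": filename,
        "image/width": width,
        "image/height": height,
        # Coordinates are stored normalized by the image size.
        "image/object/bbox/xmin": [b[2] / width for b in boxes],
        "image/object/bbox/ymin": [b[3] / height for b in boxes],
        "image/object/bbox/xmax": [b[4] / width for b in boxes],
        "image/object/bbox/ymax": [b[5] / height for b in boxes],
        "image/object/class/text": [b[0] for b in boxes],
        "image/object/class/label": [b[1] for b in boxes],
    }


# One invented annotation on an 800 x 600 image.
fields = to_tf_example_fields(
    "board_001.jpg", 800, 600, [("ESP8266", 3, 200, 150, 600, 450)]
)
for key, value in fields.items():
    print(key, value)
```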
Train TF OD API on a custom dataset

# change chosen model to deploy different models available in the TF2 object detection zoo
# pretrained checkpoint :
# https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md
# base pipeline file:
# https://github.com/tensorflow/models/tree/master/research/object_detection/configs/tf2
MODELS_CONFIG = {
    'efficientdet_d0': {
        'model_name': 'efficientdet_d0_coco17_tpu-32',
        'base_pipeline_file': 'ssd_efficientdet_d0_512x512_coco17_tpu-8.config',
        'pretrained_checkpoint': 'efficientdet_d0_coco17_tpu-32.tar.gz',
        'batch_size': 8
    },
    'efficientdet_d7': {
        'model_name': 'efficientdet_d7_coco17_tpu-32',
        'base_pipeline_file': 'ssd_efficientdet_d7_1536x1536_coco17_tpu-32.config',
        'pretrained_checkpoint': 'efficientdet_d7_coco17_tpu-32.tar.gz',
        'batch_size': 4
    },
    'Faster_RCNN': {
        'model_name': 'faster_rcnn_resnet50_v1_640x640_coco17_tpu-8',
        'base_pipeline_file': 'faster_rcnn_resnet50_v1_640x640_coco17_tpu-8.config',
        'pretrained_checkpoint': 'faster_rcnn_resnet50_v1_640x640_coco17_tpu-8.tar.gz',
        'batch_size': 4
    }
}
Train TF OD API on a custom dataset

chosen_model = "Faster_RCNN"

# The more steps, the longer the training. Increase if your loss function is
# still decreasing and validation metrics are increasing.
num_steps = 2000
# Perform evaluation after so many steps
num_eval_steps = 100

model_name = MODELS_CONFIG[chosen_model]['model_name']
pretrained_checkpoint = MODELS_CONFIG[chosen_model]['pretrained_checkpoint']
base_pipeline_file = MODELS_CONFIG[chosen_model]['base_pipeline_file']
batch_size = MODELS_CONFIG[chosen_model]['batch_size']
Train TF OD API on a custom dataset

# Download pretrained weights
%mkdir /content/models/research/deploy/
%cd /content/models/research/deploy/
import tarfile
download_tar = 'http://download.tensorflow.org/models/object_detection/tf2/20200711/' + pretrained_checkpoint
!wget {download_tar}
tar = tarfile.open(pretrained_checkpoint)
tar.extractall()
tar.close()

# Download base training configuration file
%cd /content/models/research/deploy
download_config = 'https://raw.githubusercontent.com/tensorflow/models/master/research/object_detection/configs/tf2/' + base_pipeline_file
!wget {download_config}
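The base configuration downloaded above still contains values for the original COCO training run, so it has to be adapted before training on a custom dataset. One common approach is plain text substitution on the .config file. The sketch below is an assumption about how this could be done, not the author's code: customize_pipeline, the shortened sample config, and the checkpoint path are hypothetical.

```python
import re


def customize_pipeline(config_text, checkpoint, batch_size, num_steps, num_classes):
    """Return a copy of a pipeline config with the training values replaced."""
    s = config_text
    # A function replacement avoids backslash interpretation in the path.
    s = re.sub(r'fine_tune_checkpoint: ".*?"',
               lambda m: 'fine_tune_checkpoint: "{}"'.format(checkpoint), s)
    s = re.sub(r'batch_size: \d+', 'batch_size: {}'.format(batch_size), s)
    s = re.sub(r'num_steps: \d+', 'num_steps: {}'.format(num_steps), s)
    s = re.sub(r'num_classes: \d+', 'num_classes: {}'.format(num_classes), s)
    # Fine-tune from a detection checkpoint rather than a classification one.
    s = s.replace('fine_tune_checkpoint_type: "classification"',
                  'fine_tune_checkpoint_type: "detection"')
    return s


# Shortened, made-up excerpt of a pipeline config.
base_config = '''model { faster_rcnn { num_classes: 90 } }
train_config {
  batch_size: 64
  num_steps: 25000
  fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED"
  fine_tune_checkpoint_type: "classification"
}'''

custom_config = customize_pipeline(
    base_config,
    checkpoint="/content/models/research/deploy/my_model/checkpoint/ckpt-0",  # hypothetical path
    batch_size=4, num_steps=2000, num_classes=4)
print(custom_config)
```

A complete version would also set the TFRecord input_path and label_map_path entries for the train and eval readers, then write the result to the file passed as the pipeline config when training.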
Train TF OD API on a custom dataset

!python /content/models/research/object_detection/model_main_tf2.py \
    --pipeline_config_path={pipeline_file} \
    --model_dir={model_dir} \
    --alsologtostderr \
    --num_train_steps={num_steps} \
    --sample_1_of_n_eval_examples=1 \
    --num_eval_steps={num_eval_steps}

Training Custom Object Detector: https://tensorflow-object-detection-api-tutorial.readthedocs.io/en/latest/training.html
Train Custom TensorFlow2 Object Detection Model: https://colab.research.google.com/drive/1sLqFKVV94wm-lglFq_0kGo2ciM0kecWD#scrollTo=fF8ysCfYKgTP&uniqifier=1
TF OD API - Mask-RCNN on a custom dataset

Dataset Gathering : As always at the beginning of an AI project, once the problem statement has been identified, we move on to gathering the data, in this case the images for training.

Data Labeling : Once the dataset has been gathered, we need to label the images so that the model learns which objects in each picture are of interest. Labeling requires an annotation tool, for example labelme (instance segmentation).
TF OD API - Mask-RCNN on a custom dataset

[Figure: labelme]
TF OD API - Mask-RCNN on a custom dataset

Create TFRecords :
• After labeling the entire dataset, we have to generate the TFRecords that serve as input for model training. labelme produces JSON files, so we first need to convert the labelme JSON labels into COCO format.
• You can use the labelme2coco package to convert labelme annotations to COCO format. https://github.com/fcakyon/labelme2coco
• Once the data is in COCO format, we can create the TFRecord files. For this, we use the create_coco_tf_record.py file.
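What the labelme-to-COCO step has to do for each polygon can be sketched in a few lines. This is an illustration only, not the labelme2coco implementation: labelme_shape_to_coco and the sample shape are hypothetical. COCO stores boxes as [x, y, width, height], the polygon as a flat coordinate list, and the region area, computed here with the shoelace formula.

```python
def labelme_shape_to_coco(shape, image_id, annotation_id, category_ids):
    """Convert one labelme polygon shape into a COCO-style annotation dict."""
    xs = [p[0] for p in shape["points"]]
    ys = [p[1] for p in shape["points"]]
    x_min, y_min = min(xs), min(ys)
    n = len(xs)
    # Shoelace formula for the area of a simple polygon.
    area = abs(sum(xs[i] * ys[(i + 1) % n] - xs[(i + 1) % n] * ys[i]
                   for i in range(n))) / 2.0
    return {
        "id": annotation_id,
        "image_id": image_id,
        "category_id": category_ids[shape["label"]],
        "bbox": [x_min, y_min, max(xs) - x_min, max(ys) - y_min],
        "segmentation": [[c for p in shape["points"] for c in p]],
        "area": area,
        "iscrowd": 0,
    }


# Invented rectangular polygon, 100 x 50 pixels.
shape = {"label": "ESP8266",
         "points": [[10, 20], [110, 20], [110, 70], [10, 70]]}
annotation = labelme_shape_to_coco(shape, image_id=1, annotation_id=1,
                                   category_ids={"ESP8266": 3})
print(annotation)
```

A full COCO file additionally needs the images and categories lists, which the labelme2coco package generates for the whole dataset.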
TF OD API - Mask-RCNN on a custom dataset

Follow the full code of Mask RCNN in my Colab Notebook: https://colab.research.google.com/drive/1mBw-dLLyM96N2KePnh9fE6Wwm4RWw9H9?usp=sharing
Hichem Felouat -Algeria - hichemfel@gmail.com 109 Thanks For Your Attention Hichem Felouat ...