This is a multiclass image classification and localization project for a SINGLE object per image, using CNNs and the TensorFlow API (no Keras) on Python 3.
- Collecting images via Google Image Download. Each image must contain only one object.
- Labeling images via LabelImg.
- Augmenting the data. After augmentation, the create_training_data.py script creates matching XML files for the augmented images.
- Making the data tabular in create_training_data.py. The input is the image fed into the CNN. Output 1 is the one-hot encoded class label (classification). Output 2 is the location of the bounding box (regression).
- Setting hyperparameters in train.py.
- Splitting the labeled data into train and cross-validation (CV) sets in train.py.
- Defining the architecture in train.py. I used AlexNet as the model architecture.
- Creating 2 heads for calculating the loss in train.py. One head computes the classification loss; the other computes the regression loss.
- Training the CNN on a GPU (GTX 1050; one epoch took approximately 10 seconds).
- Testing on unseen data collected from the Internet.
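
The "making the data tabular" and train/CV split steps above can be sketched roughly as follows. This is only an illustration, not the code in create_training_data.py or train.py: the class list, the sample annotation, the function names, the 80/20 split, and the choice to scale box corners to [0, 1] are all assumptions. The XML layout is the Pascal VOC style that LabelImg writes by default.

```python
import random
import xml.etree.ElementTree as ET

# Hypothetical class list; the project's real labels are not listed in this README.
CLASSES = ["cat", "dog", "bird"]

# A made-up Pascal VOC style annotation for one single-object image.
SAMPLE_XML = """
<annotation>
  <filename>dog_001.jpg</filename>
  <size><width>400</width><height>200</height><depth>3</depth></size>
  <object>
    <name>dog</name>
    <bndbox><xmin>100</xmin><ymin>50</ymin><xmax>300</xmax><ymax>150</ymax></bndbox>
  </object>
</annotation>
"""


def annotation_to_row(xml_text, classes=CLASSES):
    """Turn one single-object annotation into (filename, one_hot, box)."""
    root = ET.fromstring(xml_text.strip())
    width = float(root.find("size/width").text)
    height = float(root.find("size/height").text)

    objects = root.findall("object")
    if len(objects) != 1:
        raise ValueError("expected exactly one object per image")
    obj = objects[0]

    # Output 1: one-hot encoded class label.
    label = obj.find("name").text
    one_hot = [1.0 if label == c else 0.0 for c in classes]

    # Output 2: box corners, scaled to [0, 1] (an assumption) so the
    # regression target does not depend on the image size.
    box = obj.find("bndbox")
    normalized_box = [
        float(box.find("xmin").text) / width,
        float(box.find("ymin").text) / height,
        float(box.find("xmax").text) / width,
        float(box.find("ymax").text) / height,
    ]
    return root.find("filename").text, one_hot, normalized_box


def split_train_cv(rows, cv_fraction=0.2, seed=0):
    """Shuffle the rows and hold out cv_fraction of them as the CV set."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_cv = int(len(rows) * cv_fraction)
    return rows[n_cv:], rows[:n_cv]


print(annotation_to_row(SAMPLE_XML))
```

Each row pairs an image file with its two targets, so the classification head and the regression head can be trained from the same table.
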

AlexNet is used as the architecture: 5 convolutional layers and 3 fully connected layers with a dropout ratio of 0.5, about 60 million parameters in total.
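
As a sanity check on the "60 million parameters" figure, the count can be reproduced with a few lines of arithmetic. The layer shapes below are those of the original AlexNet (two-group convolutions, 227x227 input, 1000-way output layer); they are assumptions here, not values read from train.py, whose output heads will differ.

```python
# Layer shapes of the original AlexNet (assumed, not taken from train.py).
# Conv entries: (name, kernel_h, kernel_w, input_channels_per_filter, filters).
# conv2, conv4 and conv5 see only half the previous channels (two groups).
CONV_LAYERS = [
    ("conv1", 11, 11, 3, 96),
    ("conv2", 5, 5, 48, 256),
    ("conv3", 3, 3, 256, 384),
    ("conv4", 3, 3, 192, 384),
    ("conv5", 3, 3, 192, 256),
]

# Fully connected entries: (name, inputs, outputs).
FC_LAYERS = [
    ("fc6", 6 * 6 * 256, 4096),
    ("fc7", 4096, 4096),
    ("fc8", 4096, 1000),
]


def count_parameters(conv_layers=CONV_LAYERS, fc_layers=FC_LAYERS):
    """Sum weights and biases over all convolutional and dense layers."""
    total = 0
    for _, kh, kw, c_in, c_out in conv_layers:
        total += kh * kw * c_in * c_out + c_out  # weights + biases
    for _, n_in, n_out in fc_layers:
        total += n_in * n_out + n_out  # weights + biases
    return total


total = count_parameters()
print(f"{total:,} parameters")
```

Dropout adds no parameters, and the three fully connected layers account for most of the total. Replacing the 1000-way output with a small classification head plus a 4-value box head would lower the count somewhat.
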
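
The two-head loss can be illustrated without any framework. This is a minimal sketch, assuming softmax cross-entropy for the classification head, mean squared error for the regression head, and a plain weighted sum of the two; the README does not say which loss functions or weighting train.py actually uses, and `box_weight` is a hypothetical parameter. In the project itself these would be TensorFlow ops.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]


def classification_loss(logits, one_hot):
    """Cross-entropy between softmax(logits) and a one-hot target."""
    probs = softmax(logits)
    # Only the true class contributes; clamp to avoid log(0).
    return -sum(
        t * math.log(max(p, 1e-12))
        for t, p in zip(one_hot, probs)
        if t > 0
    )


def regression_loss(pred_box, true_box):
    """Mean squared error over the 4 box coordinates."""
    return sum((p - t) ** 2 for p, t in zip(pred_box, true_box)) / len(true_box)


def total_loss(logits, one_hot, pred_box, true_box, box_weight=1.0):
    """Weighted sum of the two head losses (the weighting is an assumption)."""
    return classification_loss(logits, one_hot) + box_weight * regression_loss(
        pred_box, true_box
    )


# Example: 3 classes, true class is index 1, box given as normalized corners.
loss = total_loss(
    logits=[0.5, 2.0, -1.0],
    one_hot=[0.0, 1.0, 0.0],
    pred_box=[0.20, 0.30, 0.70, 0.80],
    true_box=[0.25, 0.25, 0.75, 0.75],
)
print(loss)
```

Summing the two terms lets a single backward pass update the shared convolutional layers from both tasks; if one head dominates training, adjusting the weight on the box term is one possible remedy.
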
