A pipeline is built to process real-world, user-supplied images. Given an image of a dog, the algorithm estimates the dog's breed. Given an image of a human, it identifies the dog breed that the person most resembles. The target test accuracy for the CNN is 90%, i.e., the model correctly identifies the dog breed 9 times out of 10. Accuracy on the test dataset is the metric used to measure the performance of the models.
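As a minimal sketch of the accuracy metric described above, the fraction of correct predictions can be computed as follows. The `accuracy` helper and the ten labels are hypothetical and are not taken from the project notebook or its data.

```python
def accuracy(predicted, actual):
    """Return the fraction of predictions that match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical labels for ten test images (illustration only).
actual = ["beagle", "poodle", "pug", "beagle", "boxer",
          "pug", "poodle", "boxer", "beagle", "pug"]
predicted = ["beagle", "poodle", "pug", "beagle", "boxer",
             "pug", "poodle", "boxer", "beagle", "poodle"]

# 9 of the 10 predictions match, which corresponds to the 90% target.
print(f"test accuracy: {accuracy(predicted, actual):.0%}")
```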
- Utility libraries - `random` (for random seeding), `timeit` (to calculate execution time), `os`, `pathlib`, `glob` (for folder and path operations), `tqdm` (for execution progress), `sklearn` (for loading datasets), `requests` and `BytesIO` (to load files from the web)
- Image processing - OpenCV (`cv2`), `PIL`
- `Keras` and `fastai` for creating CNNs
- `matplotlib` for viewing plots/images and `numpy` for tensor processing
- Dog Images - The dog images are available in the repository in the Images directory, organized into train, valid and test subfolders
- Human Faces - An extensive dataset of celebrity faces has also been added to the repository in the lfw folder
- Haarcascades - The algorithm uses the Haar frontal-face cascade to detect humans, so an input image needs clearly defined frontal facial features
- Test Images to check algorithm - A folder of test images has been added to check the effectiveness of the algorithm
- Pre-computed features for networks currently available in Keras (i.e. `VGG19`, `InceptionV3` and `Xception`) will be made available from S3
- The folders in the repository are organized as lfw (containing human images), images (containing dog images organized into train, valid and test subfolders), Haarcascades (containing Haarcascade files), and test_images (containing 8 images to check the algorithm).
- The files are Readme.md, dog_breed_classifier.ipynb (the main IPython notebook), extract_bottleneck_features.py (a file to extract the predictions from the Keras transfer learning models), and sample_cnn.png (an illustrative CNN model)
- The `face_detector` function takes a string-valued file path to an image as input and returns True or False depending on whether a human face is detected in the image
- The `dog_detector` function returns True if a dog is detected in an image (and False if not)
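The two detectors described above could be sketched roughly as follows. This is an illustration under assumptions, not the notebook's code: the `cascade_path` default (the frontal-face cascade bundled with the `opencv-python` package) stands in for the repository's Haarcascades folder, and the dog check assumes a pretrained ImageNet classifier is used, relying on the fact that ImageNet class indices 151 to 268 are dog categories. The `predict_class` callable is hypothetical and must be supplied by the caller.

```python
# ImageNet (1000-class) indices 151..268 correspond to dog categories.
IMAGENET_DOG_CLASSES = range(151, 269)


def face_detector(img_path, cascade_path=None):
    """Return True if at least one frontal human face is found in the image."""
    import cv2  # imported lazily so this module loads without OpenCV installed

    # Assumption: fall back to the cascade file shipped with opencv-python.
    if cascade_path is None:
        cascade_path = cv2.data.haarcascades + "haarcascade_frontalface_alt.xml"
    cascade = cv2.CascadeClassifier(cascade_path)

    img = cv2.imread(img_path)
    if img is None:
        raise FileNotFoundError(f"could not read image: {img_path}")
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # Haar cascades run on grayscale
    faces = cascade.detectMultiScale(gray)
    return len(faces) > 0


def dog_detector(img_path, predict_class):
    """Return True if the predicted ImageNet class index is a dog category.

    predict_class is a hypothetical callable mapping an image path to an
    ImageNet class index, e.g. a wrapper around a pretrained classifier.
    """
    return predict_class(img_path) in IMAGENET_DOG_CLASSES
```

Passing the classifier in as a callable keeps the index check independent of any particular deep learning framework.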
- A CNN model with 4 convolutional layers alternating with max-pooling layers, plus dropout and batch normalization, was built in Keras and fit for 10 epochs, reaching a test accuracy of 6.7%.
- Bottleneck features from `VGG16` were used to build a transfer learning model, which reached a test accuracy of 48%.
- Other Keras models such as `VGG19`, `ResNet50`, `InceptionV3` and `Xception` were also used for transfer learning. These models brought the accuracy up to 80+%.
- `fastai` was also used to create a CNN model. This resulted in a test accuracy of up to 89%.
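For illustration, the from-scratch architecture mentioned above (four convolutional layers alternating with max-pooling, with batch normalization and dropout) might look like the sketch below. The filter counts, kernel size, input shape, dropout rate and optimizer are assumptions, since the text does not state them; only the layer pattern and the 133 output classes come from this document. `build_scratch_cnn` is a hypothetical name.

```python
def build_scratch_cnn(input_shape=(224, 224, 3), num_classes=133):
    """Build a small CNN: 4 x (Conv2D -> BatchNorm -> MaxPooling), then a classifier."""
    # Lazy import so the function can be defined without TensorFlow installed.
    from tensorflow.keras import layers, models

    model = models.Sequential()
    model.add(layers.Input(shape=input_shape))

    # Assumed filter progression; each block halves the spatial resolution.
    for filters in (16, 32, 64, 128):
        model.add(layers.Conv2D(filters, kernel_size=3, padding="same",
                                activation="relu"))
        model.add(layers.BatchNormalization())
        model.add(layers.MaxPooling2D(pool_size=2))

    model.add(layers.GlobalAveragePooling2D())
    model.add(layers.Dropout(0.3))  # assumed dropout rate
    model.add(layers.Dense(num_classes, activation="softmax"))

    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

Calling `build_scratch_cnn().summary()` prints the layer shapes. A transfer learning variant would typically replace the convolutional blocks with pre-computed bottleneck features and train only the final pooling and dense layers.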
- Given the higher accuracy of the `fastai` model, it was chosen to generate the final predictions
- The `predict_breed` function takes a file_path as input and outputs the breed of the dog
- The `algo` function determines whether the image at the provided file_path contains a dog, a human or neither
- The `provide_output` function outputs a message based on the predicted species and dog breed
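How these three functions fit together can be sketched as below. The signatures are assumptions made so the example is self-contained: the detectors and the breed predictor are passed in as callables, the dog check is run before the face check, and the message wording and the `run` helper are hypothetical. The notebook's actual functions may differ.

```python
def algo(file_path, dog_detector, face_detector):
    """Return 'dog', 'human' or 'neither' for the image at file_path."""
    if dog_detector(file_path):  # assumed order: check for a dog first
        return "dog"
    if face_detector(file_path):
        return "human"
    return "neither"


def provide_output(species, breed=None):
    """Build a message from the predicted species and dog breed."""
    if species == "dog":
        return f"This looks like a dog. Predicted breed: {breed}."
    if species == "human":
        return f"This looks like a human who resembles a {breed}."
    return "Neither a dog nor a human was detected."


def run(file_path, dog_detector, face_detector, predict_breed):
    """Hypothetical glue: detect the species, then predict a breed if relevant."""
    species = algo(file_path, dog_detector, face_detector)
    breed = predict_breed(file_path) if species != "neither" else None
    return provide_output(species, breed)
```

Because a breed is predicted for both dogs and humans, `predict_breed` is only skipped when neither is detected.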
- 6/6 dogs were correctly identified as dogs, and the predicted breed was correct for 5 of the 6. The one miss was a Rajapalayam (a native breed), which was identified as a Great Dane, possibly because Rajapalayam is not one of the 133 breeds in the ImageNet dataset.
- 2/2 humans were correctly identified as humans, and a dog breed was predicted for each of them
- The final model obtained 89.8% test accuracy, close to the targeted 90%.
- A few breeds are virtually identical or are sub-breeds, which makes them hard to tell apart.
- Some images may also be blurred or contain too much noise.
- Additional image manipulation could also improve image quality.
A simple web application in Flask could be built to leverage the model to predict breeds through user-input images.
StackOverflow, various Kaggle kernels