Detecting objects in an image can be accomplished with a deep learning model. There are a lot of pre-trained models on the Internet that you can use. Sometimes you might want to train your own model to detect a specific object (sharks, squirrels, a mask on a person’s face…). There are multiple tutorials about how to train these models on custom datasets. Don’t.
Before you even think about creating your own dataset, remember that downloading hundreds of images and labeling them with labelImg can take a lot of time, and in some cases it is unnecessary. Test some out-of-the-box models before taking the long route. This step won’t take long and can save you a ton of time.
I am going to use a couple of examples. I want to detect certain animals in pictures: sharks and squirrels. Let’s say we have a research purpose for doing this. Before taking the long route, let’s try the other approach.
We are going to use the official Google Colab notebook to test the different models. The notebook is pretty straightforward: if you run all the cells, you will use a common model trained on a dataset called COCO. The full name of the model is ssd_mobilenet_v1_coco_2017_11_17. This is a fast model, but it looks like it won’t work for our purposes. I uploaded an image to /content/models/research/object_detection/test_images, and this is the result we got:
Not the result we were expecting: the confidence is low and the prediction is not very good, if we assume that’s a squirrel.
Before changing the code, note that you can find pre-trained, ready-to-use models in the TensorFlow detection model zoo. There are models trained on different datasets and with different performance.
Let’s change a couple of lines of code and test again. First we are going to use a model trained on the iNaturalist dataset:
- In the “Loading Label Map” section, set: PATH_TO_LABELS = 'models/research/object_detection/data/fgvc_2854_classes_label_map.pbtxt' The label map lets us interpret the output of the new model: it ties each category number to a name.
- In the “Detection” section, set: model_name = 'faster_rcnn_resnet101_fgvc_2018_07_19' This tells the code which model to download.
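To keep the two edits together, here is a minimal sketch that stores them in one place and shows what the model name turns into. The ModelChoice class is a hypothetical helper of mine, not part of the notebook, and the base URL and ".tar.gz" suffix are my assumption about how the notebook downloads models, so check them against your copy.

```python
from dataclasses import dataclass

# Assumption: the notebook fetches one archive per model from this base URL.
DOWNLOAD_BASE = "http://download.tensorflow.org/models/object_detection/"


@dataclass(frozen=True)
class ModelChoice:
    """Hypothetical helper: the two values that change when swapping models."""

    model_name: str
    path_to_labels: str

    @property
    def download_url(self) -> str:
        # Assumed layout: the archive is named after the model.
        return DOWNLOAD_BASE + self.model_name + ".tar.gz"


# Values taken from the two bullet points above.
INATURALIST = ModelChoice(
    model_name="faster_rcnn_resnet101_fgvc_2018_07_19",
    path_to_labels=(
        "models/research/object_detection/data/"
        "fgvc_2854_classes_label_map.pbtxt"
    ),
)

print(INATURALIST.download_url)
print(INATURALIST.path_to_labels)
```

The model and its label map have to change together: a label map from a different dataset would attach the wrong names to the category numbers.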
With the new model, these are the results:
Higher confidence and a weird Latin name (Sciurus carolinensis). Wikipedia tells us that its common name is eastern gray squirrel. Not bad. The same happens when we test a shark image:
Using Wikipedia we can find that it’s a whale shark.
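The Latin names come from the label map. To show how the number-to-name lookup works, here is a rough sketch that parses label map text with regular expressions. It assumes the usual item layout with id and display_name fields. The sample entries and their ids are made up for illustration (Rhincodon typus is the scientific name of the whale shark), and the notebook itself relies on the Object Detection API's label_map_util helpers rather than on code like this.

```python
import re
from typing import Dict

# Illustrative text only: the ids and "name" values here are invented.
SAMPLE_LABEL_MAP = """
item {
  name: "hypothetical_a"
  id: 1
  display_name: "Sciurus carolinensis"
}
item {
  name: "hypothetical_b"
  id: 2
  display_name: "Rhincodon typus"
}
"""

ITEM_RE = re.compile(r"item\s*\{(.*?)\}", re.DOTALL)
ID_RE = re.compile(r"\bid:\s*(\d+)")
DISPLAY_RE = re.compile(r'\bdisplay_name:\s*"([^"]*)"')
NAME_RE = re.compile(r'\bname:\s*"([^"]*)"')


def parse_label_map(text: str) -> Dict[int, str]:
    """Map each category id to its display name (or name as a fallback)."""
    names: Dict[int, str] = {}
    for body in ITEM_RE.findall(text):
        id_match = ID_RE.search(body)
        label = DISPLAY_RE.search(body) or NAME_RE.search(body)
        if id_match is None or label is None:
            continue  # skip entries this simple parser cannot read
        names[int(id_match.group(1))] = label.group(1)
    return names


category_names = parse_label_map(SAMPLE_LABEL_MAP)
print(category_names[1])
```

When the model reports category 1 for a box, this table is what turns the number into the name drawn on the image.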
Just for the sake of experimenting, I used a model trained on the Open Images dataset. I used the following label map and model:
- model_name = 'faster_rcnn_inception_resnet_v2_atrous_oid_2018_01_28'
Here we got a higher confidence but a less accurate detection. Depending on your use case, one model or the other could be better.
In some cases you’ll still need a custom dataset and have to go the long route. But checking this avenue first won’t hurt and might save you a lot of precious time.
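If you would rather compare models by numbers than by eyeballing the images, you could list each model's detections above a score threshold. The sketch below assumes the model output is a dictionary with "detection_classes" and "detection_scores" entries, which is how I understand the notebook to expose it. The top_detections function is a hypothetical helper, and the scores are invented for the example, not results from my runs.

```python
from typing import Dict, List, Sequence, Tuple


def top_detections(
    output_dict: Dict[str, Sequence[float]],
    category_names: Dict[int, str],
    min_score: float = 0.5,
) -> List[Tuple[str, float]]:
    """Return (name, score) pairs at or above min_score, best first."""
    pairs = zip(output_dict["detection_classes"],
                output_dict["detection_scores"])
    kept = [
        (category_names.get(int(cls), "id %d" % int(cls)), float(score))
        for cls, score in pairs
        if score >= min_score
    ]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Invented values, only to show the shape of the data.
names = {1: "Sciurus carolinensis", 2: "Rhincodon typus"}
fake_output = {
    "detection_classes": [2, 1, 1],
    "detection_scores": [0.12, 0.91, 0.40],
}

for name, score in top_detections(fake_output, names):
    print(name, score)
```

Running the same helper on the output of each model gives you a side-by-side list of names and scores, which makes the trade-off between confidence and accuracy easier to see.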