Tech Talk with HyperLabel: “Creating an Object Detector Using Your Own Synthetic Data”

This webinar was given by Sage Elliot on May 7, 2020. Sage is a technical evangelist and video AI expert with experience in manufacturing quality assurance, architecture design generation, and wildlife monitoring.

You can find the video of this Tech Talk here or below. The summary script, with reference links, follows.

Summary

Learn how to build and label your own dataset, train a model, and then augment your dataset with synthetic data to improve your model’s accuracy.

There are plenty of options for taking pre-made datasets and using them to train a model. But what if you want to collect and use your own data? For our example, we used a smartphone to capture our data and built an object detector for Canadian geese.

Data Collection

Key factors to consider when collecting your data:

  • Object angles
    • side
    • top
    • back
    • front
  • Object positions
    • sitting
    • swimming
    • eating
    • flying
  • Object variations
    • age
    • color
    • type
  • Object environment
    • backgrounds
    • lighting
    • weather

In total, we took only 87 photos. Many of the images were very similar, and some contained multiple geese.

Synthetic Data

Because we were able to capture only a limited number of images in a limited number of settings, we used synthetic data to enlarge our dataset. This way, we have data for our objects (geese) in varied settings where we were not able to photograph them (e.g., water, grass, rocks, indoors, a dock, etc.).

We used Photoshop to create the synthetic data.

Photoshop process:

  • Select the object
  • Crop to content
  • Save as a PNG (for a transparent background)
  • Open a background image in Photoshop
  • Drag your object in
  • Press Ctrl + T for Free Transform to resize and position the object
  • Import multiple backgrounds to make new images quickly
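The same cut-and-paste idea can also be scripted. The following is a minimal sketch, not the process used in the talk: it assumes the cutout is already loaded as an RGBA NumPy array and the background as an RGB array (reading and writing image files would need an image library, which is not shown). The function names `paste_cutout` and `random_position` are hypothetical. Because the cutout was cropped to content, its pasted rectangle can double as a bounding box, here with an exclusive bottom-right corner (an assumption about the convention).

```python
import numpy as np


def paste_cutout(background, cutout, x, y):
    """Alpha-blend an RGBA cutout onto an RGB background at (x, y).

    Returns the new image and the pasted object's bounding box as
    (xmin, ymin, xmax, ymax), with the origin at the top-left corner.
    """
    h, w = cutout.shape[:2]
    bg_h, bg_w = background.shape[:2]
    if x < 0 or y < 0 or x + w > bg_w or y + h > bg_h:
        raise ValueError("cutout must fit entirely inside the background")

    out = background.astype(np.float32)  # astype returns a copy
    rgb = cutout[..., :3].astype(np.float32)
    alpha = cutout[..., 3:4].astype(np.float32) / 255.0  # shape (h, w, 1)

    # Transparent pixels (alpha 0) keep the background; opaque ones replace it.
    region = out[y:y + h, x:x + w]
    out[y:y + h, x:x + w] = alpha * rgb + (1.0 - alpha) * region
    return out.round().astype(np.uint8), (x, y, x + w, y + h)


def random_position(bg_size, cutout_size, rng):
    """Pick a top-left corner so the cutout stays fully inside the background.

    Sizes are (width, height); the cutout must not be larger than the background.
    """
    bg_w, bg_h = bg_size
    w, h = cutout_size
    x = int(rng.integers(0, bg_w - w + 1))
    y = int(rng.integers(0, bg_h - h + 1))
    return x, y
```

Calling `random_position` with `np.random.default_rng()` for each background, then `paste_cutout`, would give one composite and one box per call. Scaling or rotating the cutout first, as Free Transform does in Photoshop, is left out of this sketch.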

Data Augmentation

You may already be familiar with the more widely used concept of data augmentation.

Augmentation applies adjustments to your images, such as flipping, skewing, or changing the lightness. It does not create a different environment the way our synthetic dataset does.

This is usually done while training the model.
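As an illustration of what such adjustments involve for an object detector, here is a minimal sketch of two of them, a horizontal flip and a lightness change. It assumes images are NumPy arrays of shape (height, width, channels) and boxes are (xmin, ymin, xmax, ymax) tuples with an exclusive xmax; the function names are hypothetical, and training frameworks usually ship their own augmentation options. The point to notice is that a flip must move the bounding boxes along with the pixels.

```python
import numpy as np


def flip_horizontal(image, boxes):
    """Mirror an image left-to-right and move its boxes to match."""
    width = image.shape[1]
    flipped = image[:, ::-1].copy()
    # A box spanning columns [xmin, xmax) lands on [width - xmax, width - xmin).
    new_boxes = [(width - xmax, ymin, width - xmin, ymax)
                 for xmin, ymin, xmax, ymax in boxes]
    return flipped, new_boxes


def adjust_brightness(image, factor):
    """Scale pixel values by `factor`, clipping to the valid 0..255 range."""
    scaled = image.astype(np.float32) * factor
    return np.clip(scaled, 0, 255).astype(np.uint8)
```

A lightness change leaves the boxes untouched, so only geometric adjustments such as flipping or skewing need this extra bookkeeping.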

Labeling Your Data

There are different types of labels, depending on the results you want:

  • Bounding boxes
  • Polygons
  • Classification
  • Feature Points

We used HyperLabel to create the labels.

The process was very intuitive and fast:

  1. Download HyperLabel from the app store
  2. Create a project
  3. Add a data source
  4. Create your label definition
  5. Label images (we used bounding boxes in this case)
  6. Export your labels (we used Pascal VOC format for this project)
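To give a feel for what a Pascal VOC export contains, here is a minimal sketch that reads one annotation with Python's standard library. The sample XML, the file name `goose_001.jpg`, and the function name `parse_voc` are made up for illustration; the element names (`size`, `object`, `name`, `bndbox`, `xmin`, and so on) follow the usual Pascal VOC layout, but an actual export may carry extra fields.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation for one photo containing two geese.
SAMPLE_XML = """
<annotation>
  <filename>goose_001.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>goose</name>
    <bndbox><xmin>50</xmin><ymin>120</ymin><xmax>210</xmax><ymax>300</ymax></bndbox>
  </object>
  <object>
    <name>goose</name>
    <bndbox><xmin>330</xmin><ymin>90</ymin><xmax>470</xmax><ymax>260</ymax></bndbox>
  </object>
</annotation>
"""


def parse_voc(xml_text):
    """Return the file name, image size, and labeled boxes from one annotation."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    objects = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        coords = tuple(int(float(box.findtext(tag)))
                       for tag in ("xmin", "ymin", "xmax", "ymax"))
        objects.append({"label": obj.findtext("name"), "box": coords})
    return {
        "filename": root.findtext("filename"),
        "width": int(size.findtext("width")),
        "height": int(size.findtext("height")),
        "objects": objects,
    }


annotation = parse_voc(SAMPLE_XML)
print(annotation["filename"], len(annotation["objects"]), "objects")
```

Summing `len(annotation["objects"])` over all annotation files gives the number of labeled instances, which can be larger than the number of images when a photo contains several objects.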

Review Your Annotations

Make sure that your annotations appear the way you expect (e.g., your origin is set up correctly) by viewing them with OpenCV.
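A visual check can be complemented by a quick programmatic one. The sketch below is an assumption about what such a check might look like, not part of the original workflow: it takes an image size and a list of (xmin, ymin, xmax, ymax) boxes, assumes pixel coordinates with the origin at the top-left corner, and reports boxes that are inverted or fall outside the image. The function name `find_bad_boxes` is hypothetical.

```python
def find_bad_boxes(width, height, boxes):
    """Return (index, reason) pairs for boxes that look wrong."""
    problems = []
    for i, (xmin, ymin, xmax, ymax) in enumerate(boxes):
        if xmin >= xmax or ymin >= ymax:
            problems.append((i, "empty or inverted box"))
        elif xmin < 0 or ymin < 0 or xmax > width or ymax > height:
            problems.append((i, "box extends outside the image"))
    return problems


# Hypothetical boxes for a 640x480 image: one fine, one inverted, one too wide.
boxes = [(50, 120, 210, 300), (300, 200, 250, 260), (600, 10, 700, 90)]
for index, reason in find_bad_boxes(640, 480, boxes):
    print(f"box {index}: {reason}")
```

A check like this will not catch a flipped origin when the flipped box still fits inside the image, which is why looking at the boxes drawn over the photos remains worthwhile.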

Training

Select a model type:

Transfer learning

With most popular deep learning frameworks, you can load pre-trained weights into your network. These have been trained extensively on quite a few objects and animals.

You can then adjust those weights during training to work for your specific dataset.

A good rule of thumb is to start with a minimum of 200 images for transfer learning. But this can vary a lot depending on your data and the results you want.

Our goose dataset has fewer than 200 images, but more than 200 instances of geese.

Without transfer learning, you will probably need thousands of images and a lot more time. Keep in mind that synthetic data may be a way to turn hundreds into thousands. This really depends on your data and what you’re doing with it.
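To see how hundreds can become thousands, consider pairing every object cutout with every background. The numbers and file names below are hypothetical (they are not the counts from this project), and `plan_composites` is an illustrative helper, not an existing tool.

```python
from itertools import product


def plan_composites(cutouts, backgrounds, placements_per_pair=1):
    """List every (cutout, background, placement index) combination."""
    return [(cutout, background, k)
            for cutout, background in product(cutouts, backgrounds)
            for k in range(placements_per_pair)]


# Hypothetical inventory: 50 cutouts and 20 backgrounds.
cutouts = [f"goose_{i:03d}.png" for i in range(50)]
backgrounds = [f"background_{i:02d}.jpg" for i in range(20)]

plan = plan_composites(cutouts, backgrounds)
print(len(plan))  # 50 * 20 = 1000 synthetic images
```

The count grows quickly, but near-identical composites probably add less information than the raw number suggests, so variety in cutouts and backgrounds likely matters more than volume.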

We trained the model with TensorFlow; it took around 90 minutes.

Results

The model was pretty good at recognizing geese in the environments from the data but really bad at recognizing things that are not geese (humans, chairs, shadows, etc.).

We corrected this by generating more synthetic data with people in it, indoor scenes, etc., so that the model could better determine what is not a goose.

Resources
