In one of our previous blog posts and webinars, we wrote about the power of synthetic data and Neurolabs’ synthetic data engine, which transforms 3D assets into robust synthetic datasets for computer vision. At Neurolabs, we believe synthetic data holds the key to making computer vision and object detection more accurate, more affordable and more scalable. We provide an end-to-end platform for generating rich, varied, annotated synthetic data and training computer vision models on it, removing the cost of labelling and considerably shortening the time-to-value of object detection solutions.
This is the second in a series of blog posts in which we explore the power of synthetically generated datasets as a basis for object detection in real environments. The first post illustrated how randomising the parameters of a synthetic scene yields robust synthetic datasets, and validated their representational power on an object recognition task in a simple environment.
In this blog post, we incrementally increase the complexity of our problem and explore how well our synthetically generated datasets scale to more challenging object detection environments.
We analyse the impact of synthetic data in a new scenario. We collect a real dataset similar to the one from the previous blog post, but with added complexity. On top of the previous dataset, we add the following features:
- Occlusions between objects, where each occlusion covers at most 25% of an object (a sketch of how such a cap could be enforced follows this list)
- Three different instances of each class (e.g. three different bananas) instead of a single instance per class
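To make the 25% occlusion cap concrete, here is a minimal sketch of how such a constraint could be checked when placing objects in a scene. It assumes axis-aligned bounding boxes; the function names and box representation are illustrative only, not Neurolabs’ actual pipeline, which could equally rely on segmentation masks for a tighter estimate.

```python
# Minimal sketch of enforcing a 25% occlusion cap during object placement.
# Boxes are axis-aligned tuples (x_min, y_min, x_max, y_max); a real pipeline
# would typically use segmentation masks for a more accurate estimate.

def occluded_fraction(box, other):
    """Fraction of `box` covered by `other` (0.0 when they do not overlap)."""
    ix = max(0.0, min(box[2], other[2]) - max(box[0], other[0]))
    iy = max(0.0, min(box[3], other[3]) - max(box[1], other[1]))
    area = (box[2] - box[0]) * (box[3] - box[1])
    return (ix * iy) / area if area > 0 else 0.0

def placement_is_valid(new_box, placed_boxes, max_occlusion=0.25):
    """Reject a placement if any object would end up more than 25% occluded."""
    for existing in placed_boxes:
        if (occluded_fraction(new_box, existing) > max_occlusion
                or occluded_fraction(existing, new_box) > max_occlusion):
            return False
    return True
```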
The images have the same camera view and the same background as the first dataset. The classes remain the same: Orange, Banana, Red Apples, Green Apples, Bun Plain, Bun Cereal, Croissant, Broccoli, Snickers, Bounty.
For synthetic data generation, the main changes relative to the original dataset are:
- Using three 3D assets for each class
- Increasing the allowed overlap between objects in an image (at most 25% of an object occluded)
- Extending the number of objects per image from [2, 4] to [2, 6]
- Adjusting the scaling range from [1x, 2x] to [0.75x, 1.75x] to accommodate the larger number of objects in an image
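As an illustration of what these changes might look like in practice, the snippet below sketches a hypothetical scene-randomisation configuration with the updated parameter ranges. The constant and function names are ours, for illustration only, and do not reflect Neurolabs’ actual API.

```python
import random

# Hypothetical parameter ranges mirroring the changes listed above; the names
# are illustrative, not Neurolabs' actual configuration keys.
ASSETS_PER_CLASS = 3            # three 3D assets per class instead of one
MAX_OCCLUSION = 0.25            # at most 25% of an object may be occluded
OBJECTS_PER_IMAGE = (2, 6)      # extended from (2, 4)
SCALE_RANGE = (0.75, 1.75)      # shifted down from (1.0, 2.0) to fit more objects

def sample_scene_params(class_names):
    """Draw one set of randomised parameters for a synthetic scene."""
    n_objects = random.randint(*OBJECTS_PER_IMAGE)
    return [
        {
            "class": random.choice(class_names),
            "asset_id": random.randrange(ASSETS_PER_CLASS),
            "scale": random.uniform(*SCALE_RANGE),
        }
        for _ in range(n_objects)
    ]
```

Each sampled scene would then be rendered and checked against the occlusion cap, for instance with a validity test like the one sketched earlier.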
Below, we show an example of a real image compared to a synthetic one. The overlap between objects in our simulated environment mimics that of the real data, but there is a clear difference in texture and lighting.