From Products to Waste: Synthetic Waste Data Generation with NVIDIA Omniverse
by Rémi Pyronnet, Computer Vision Engineer

AIRECO already possesses a large dataset collected using traditional procedures, i.e. capturing images of waste and labeling them. While it is of good quality and useful for training our AI models, it has several limitations:
- imbalance: some products are more commonly used than others, resulting in imbalanced waste data
- only a subset of an object's variations is shown: orientation, deformation, dirtiness and lighting could all be different, but our model won't get to know that
- need for labeling: while some automation procedures exist, labeling real data usually still involves some manual labor, which limits the production rate of labeled images
- inaccurate labeling: waste data is hard to label because of its variety and possible degradation, leading to more human labeling mistakes
This means we need to augment our dataset in other ways, both to increase its quality and to reduce human effort.
Synthetic data generation can remedy these issues. Because we choose which objects are generated, we can create balanced data. Modifications can be applied to each object to cover the variations we want the model to see. And the labels are generated automatically alongside the images, so they are free of human labeling errors.
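To illustrate the balancing idea, here is a minimal sketch that computes how many synthetic images each class would need to catch up with the most common one. The class names and counts are made up for the example, and `synthetic_quota` is a hypothetical helper, not part of our pipeline or of any NVIDIA tool.

```python
from collections import Counter


def synthetic_quota(real_labels, target_per_class=None):
    """Return the number of synthetic images to generate per class.

    Each class is topped up to `target_per_class`; by default the target
    is the size of the largest class in the real dataset.
    """
    counts = Counter(real_labels)
    target = target_per_class if target_per_class is not None else max(counts.values())
    # A class already at or above the target needs no synthetic images.
    return {label: max(0, target - n) for label, n in counts.items()}


# Made-up label distribution of a real dataset (illustrative only).
real_labels = ["pet_bottle"] * 500 + ["aluminium_can"] * 120 + ["milk_carton"] * 30
quota = synthetic_quota(real_labels)
print(quota)
```

With these made-up counts, the rare classes receive most of the synthetic budget while the most common class receives none.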
We designed a "Products to Waste" backward approach to getting waste data. While usually we get images of waste and try to guess what the original product is for labeling, here we start from 3D models of unaltered products and alter their condition to make them look like waste. This process can be described in four steps:
- get the objects' 3D models: this can be done in different ways. Models can be captured using dedicated cameras, obtained from a third party or created from scratch
- object alteration: damage and dirtiness are simulated on the object
- scene creation: multiple objects are loaded together in NVIDIA Isaac Sim and physics is applied to let them fall together on a textured background plane
- data generation: images and labels of the scene are generated using NVIDIA Replicator
From a fixed number of 3D models, we can potentially generate a practically unlimited number of different images, since every scene combines a different selection of objects, alterations and poses.
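The randomization behind these steps can be sketched in plain Python. The snippet below only samples scene descriptions; it does not call Isaac Sim or Replicator, and every name in it (`ObjectSpec`, `sample_scene`, the parameter ranges, the file names) is an assumption made for illustration rather than our actual implementation.

```python
import random
from dataclasses import dataclass


@dataclass
class ObjectSpec:
    model_path: str       # unaltered product 3D model (hypothetical file name)
    dent_strength: float  # 0 = intact, 1 = heavily crushed
    dirt_level: float     # 0 = clean, 1 = fully soiled
    yaw_degrees: float    # initial rotation before the physics drop
    drop_height_m: float  # release height above the background plane


def sample_scene(model_paths, rng, min_objects=3, max_objects=8):
    """Sample one randomized scene description from a fixed pool of models."""
    n_objects = rng.randint(min_objects, max_objects)
    return [
        ObjectSpec(
            model_path=rng.choice(model_paths),
            dent_strength=rng.uniform(0.0, 1.0),
            dirt_level=rng.uniform(0.0, 1.0),
            yaw_degrees=rng.uniform(0.0, 360.0),
            drop_height_m=rng.uniform(0.2, 1.0),
        )
        for _ in range(n_objects)
    ]


rng = random.Random(0)  # fixed seed so a run can be reproduced
models = ["bottle.usd", "can.usd", "carton.usd"]
scenes = [sample_scene(models, rng) for _ in range(100)]
print(len(scenes), "scenes sampled from", len(models), "models")
```

Each sampled description would then be handed to the simulation, which applies the alterations, drops the objects and renders the images with their labels. Because the alteration and pose parameters are continuous, three models already yield scenes that almost never repeat.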
We validated this procedure by recording 3D models of objects not yet seen in the training data and generating synthetic images from them. These are then used to train the model, while real images of the same objects are captured and labeled for testing purposes only. If the model trained on synthetic-only images performs well on the real test images, it means it could learn from synthetic data. Once we have proved its effectiveness, we can add these images to our existing dataset of real images.
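The validation protocol boils down to a strict split: synthetic images of the new objects go to training, real images of the same objects go to testing. A minimal sketch follows, assuming each sample is a dictionary with `path`, `label` and `source` keys; this layout and the `split_for_validation` helper are hypothetical.

```python
def split_for_validation(samples, new_classes):
    """Split samples of previously unseen classes by their origin.

    Synthetic images are used for training, real images for testing only.
    Samples of other classes are left out of both sets.
    """
    train = [s for s in samples
             if s["label"] in new_classes and s["source"] == "synthetic"]
    test = [s for s in samples
            if s["label"] in new_classes and s["source"] == "real"]
    return train, test


# Made-up samples for illustration.
samples = [
    {"path": "syn_0001.png", "label": "yogurt_cup", "source": "synthetic"},
    {"path": "syn_0002.png", "label": "yogurt_cup", "source": "synthetic"},
    {"path": "real_0001.jpg", "label": "yogurt_cup", "source": "real"},
    {"path": "real_0002.jpg", "label": "pet_bottle", "source": "real"},
]
train, test = split_for_validation(samples, {"yogurt_cup"})

# No real image may leak into training, otherwise the test proves nothing.
assert all(s["source"] == "synthetic" for s in train)
print(len(train), "training images,", len(test), "test images")
```

Keeping the real images out of training is what makes the result meaningful: any accuracy on the test set can only come from what the model learned on synthetic data.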
Overall, synthetic data generation seems like the way to scale up our training data, both in quantity and quality, while giving us more control over its content. The bottleneck of this method is the initial acquisition of the objects' 3D models. The easiest way is to capture the models with a 3D-aware camera, but this requires continuous human labor and the quality can vary. That's why we hope to collaborate with producers to get 3D models from the source, while we could provide insightful data analytics back to them. Another option is to create the models from scratch, which would allow for customization of the objects and a truly automated synthetic data generation pipeline.

