Harnessing the Power of Synthetic Data for Deep Learning Image Analysis

The Synavis framework provides a solution to overcome data scarcity for training deep learning biological image analysis models.

Deep learning models have revolutionized plant modeling by automating the extraction of plant features and characteristics from images. This high-throughput data enables researchers to analyze complex plant traits, such as growth patterns and disease susceptibility, more efficiently.

Deep learning models must be trained using diverse images to develop robust and generalized representations. However, obtaining this type of data is a time-consuming and resource-intensive process. Apart from conducting experiments, it involves the meticulous collection of substantial volumes of high-quality images, which then need to be segmented and stored appropriately. Additionally, the images must be annotated: specific information about the objects, regions, or attributes depicted within them is added to each file. This step is crucial in enabling the algorithms to learn from the data effectively.

To overcome the scarcity of training data, researchers have explored the use of synthetic data generation, which involves creating artificial plant images that mimic real-world data. Synthetic data can help in training deep learning models more effectively by providing large and diverse datasets.

A new article published in in silico Plants by Dirk Helmrich, PhD student at Forschungszentrum Jülich and the University of Iceland, and colleagues introduces a framework called Synavis, which generates synthetic plant data and communicates directly with deep learning training frameworks.

A figure with an explanation of how plants are simulated in CPlantBox at the top. First, model parametrization uses parameters from direct measurements. Then, the model simulates a 2D image of a plant. Last, the plant is reconstructed with geometry to create a 3D image.
At the bottom are examples of photorealistic environments rendered using Unreal Engine. These are images of a field with a rainy, morning, foggy or sunny environment.
Individual plants are simulated in CPlantBox using measured data. Their architecture is defined by topological and geometric information in CPlantBox. Unreal Engine uses this data to produce photorealistic renderings of the plants within a virtual environment and is capable of augmenting scene data.

Synavis is composed of two components: a Functional–Structural Plant Model (FSPM) and Unreal Engine.

FSPMs simulate realistic plant morphology, mimicking various plant development dynamics under specific environmental conditions. The FSPM CPlantBox is used to generate graph-like plant structural data using algorithms. A visualization module is then used to produce 3D plants from the CPlantBox data.
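To make "graph-like plant structural data" concrete, here is a minimal sketch of how a plant skeleton could be stored as nodes and labelled segments. The `Node` and `PlantGraph` classes and their methods are hypothetical names chosen for illustration; they are not the CPlantBox API, and the coordinates are arbitrary.

```python
import math
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """A point on the plant skeleton (hypothetical, not a CPlantBox type)."""
    x: float
    y: float
    z: float


@dataclass
class PlantGraph:
    """Plant topology (segments) plus geometry (node positions, radii)."""
    nodes: List[Node] = field(default_factory=list)
    # Each segment: (start node index, end node index, organ label, radius)
    segments: List[Tuple[int, int, str, float]] = field(default_factory=list)

    def add_node(self, x: float, y: float, z: float) -> int:
        """Append a node and return its index."""
        self.nodes.append(Node(x, y, z))
        return len(self.nodes) - 1

    def add_segment(self, a: int, b: int, organ: str, radius: float) -> None:
        """Connect two existing nodes with a labelled segment."""
        self.segments.append((a, b, organ, radius))

    def organ_length(self, organ: str) -> float:
        """Sum the lengths of all segments carrying the given organ label."""
        total = 0.0
        for a, b, label, _radius in self.segments:
            if label == organ:
                na, nb = self.nodes[a], self.nodes[b]
                total += math.dist((na.x, na.y, na.z), (nb.x, nb.y, nb.z))
        return total


# Arbitrary toy plant: one stem segment and one leaf segment.
plant = PlantGraph()
base = plant.add_node(0.0, 0.0, 0.0)
stem_tip = plant.add_node(0.0, 0.0, 10.0)
leaf_tip = plant.add_node(3.0, 0.0, 14.0)
plant.add_segment(base, stem_tip, "stem", 0.4)
plant.add_segment(stem_tip, leaf_tip, "leaf", 0.1)

print(plant.organ_length("leaf"))  # 5.0
```

A visualization step would then turn each segment into 3D geometry, for example a tube or leaf surface around the segment axis.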

Then, Unreal Engine, a graphics engine capable of photorealistic rendering, is used to generate visual representations of the plants within a virtual environment. Unreal Engine possesses the capability to augment scene data, including plant position, density, age, and lighting, thereby generating a variety of image variations.
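The idea of augmenting scene data can be sketched as random sampling over scene parameters, with each sampled configuration yielding a different rendered image. The `SceneConfig` fields, the `sample_scene` function, and all value ranges below are assumptions made for illustration; they do not reflect the actual Synavis or Unreal Engine interfaces. The weather options are taken from the example environments in the figure.

```python
import random
from dataclasses import dataclass


@dataclass
class SceneConfig:
    """One hypothetical set of scene parameters to hand to a renderer."""
    plant_density: float      # plants per square metre (assumed unit)
    plant_age_days: int       # age of the simulated plants
    sun_elevation_deg: float  # lighting angle
    weather: str              # environment preset
    row_offset_m: float       # small shift in plant position


WEATHER = ("rainy", "morning", "foggy", "sunny")


def sample_scene(rng: random.Random) -> SceneConfig:
    """Draw one random scene variation; all ranges are illustrative only."""
    return SceneConfig(
        plant_density=rng.uniform(5.0, 12.0),
        plant_age_days=rng.randint(10, 60),
        sun_elevation_deg=rng.uniform(10.0, 80.0),
        weather=rng.choice(WEATHER),
        row_offset_m=rng.uniform(-0.1, 0.1),
    )


rng = random.Random(42)  # fixed seed so the variations are reproducible
scenes = [sample_scene(rng) for _ in range(4)]
for scene in scenes:
    print(scene)
```

Seeding the generator keeps a dataset reproducible while still covering a broad range of conditions.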

A video overview of Synavis created by Dirk Helmrich.

The authors tested the validity of the data rendered using Synavis by comparing it with real-world data from a previously conducted experiment. To create simulated data, CPlantBox was configured to virtually replicate the experiment. Images from the experiment were input into CPlantBox, and the simulated individual plant geometries were inserted into Unreal Engine and scaled up to field scale. The authors then compared measurements of leaf blade length from the actual plants in the experiment with those from the simulated plants. The measurements obtained from the synthetic images were closely related to those from the actual experiment.
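A comparison of this kind can be sketched with two simple statistics: the mean offset between synthetic and real measurements, and their Pearson correlation. This is not the authors' analysis; the `compare_lengths` function is a hypothetical helper, it assumes the measurements are paired (for example, one value per time point), and the numbers are placeholders rather than values from the study.

```python
import math
from statistics import mean
from typing import Dict, Sequence


def compare_lengths(real: Sequence[float],
                    synthetic: Sequence[float]) -> Dict[str, float]:
    """Return mean bias (synthetic minus real) and Pearson correlation."""
    if len(real) != len(synthetic) or len(real) < 2:
        raise ValueError("need two paired sequences with at least 2 values")
    mean_real, mean_syn = mean(real), mean(synthetic)
    cov = sum((r - mean_real) * (s - mean_syn)
              for r, s in zip(real, synthetic))
    var_real = sum((r - mean_real) ** 2 for r in real)
    var_syn = sum((s - mean_syn) ** 2 for s in synthetic)
    return {
        "mean_bias": mean_syn - mean_real,
        "pearson_r": cov / math.sqrt(var_real * var_syn),
    }


# Placeholder leaf blade lengths in cm, NOT data from the study.
real_cm = [10.0, 20.0, 30.0]
synthetic_cm = [9.0, 19.0, 29.0]

result = compare_lengths(real_cm, synthetic_cm)
print(result)  # mean_bias is -1.0, pearson_r is 1.0 for these placeholders
```

A negative bias together with a high correlation would correspond to synthetic values that follow the same trend but sit slightly lower.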

The resulting images can be directly integrated with a deep learning model for training purposes using Synavis. During training, the model learns to recognize patterns, features, and relationships within the images. Exposed to a wide range of image variations, the model becomes capable of generalizing and understanding the underlying structures and characteristics of the visual data.

“We believe that synthetic data can be extremely helpful to combat data scarcity. With Synavis, we have developed a toolset that connects individually very powerful frameworks. Most importantly, we wanted to devise a way to check how well we can actually replicate the data, in a way that is more practical – by subjecting our virtual images to a typical data analysis pipeline and checking if we successfully end up where we started,” explained Helmrich.

A figure with three panels. On the left is a 3D image of a synthetic plant. In the center is a similar image of a real-world plant. On the right is a comparison of leaf lengths between the synthetic images and the experimental data. The figure shows that the data follow the same trends, but that the synthetic data values are a bit lower.
Comparison of the parameter extraction pipeline between synthetic and real-world data and resulting data.

This is not the first such framework, but it has several advantages over other approaches. “Synavis connects frameworks by providing a platform to talk to each other. The coupling is very straightforward, standardized and also does not require storing of data. The simulation, the virtual world being rendered, and the deep learning toolsets exist concurrently. If the model predicts correctly once and gets it wrong another time, you can interpolate between those states, always with a cohesive virtual environment in between,” Helmrich concluded.
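The point about not storing data can be illustrated with a generator that produces frames on demand and a training loop that consumes them immediately, so no image is ever written to disk. This is a conceptual sketch only: `render_frame`, `frame_stream`, and `train` are hypothetical stand-ins, the "image" is a small grid of random numbers, and the actual coupling mechanism in Synavis is not shown here.

```python
import random
from typing import Iterator, List, Tuple

Image = List[List[float]]
Sample = Tuple[Image, float]


def render_frame(rng: random.Random, size: int = 4) -> Sample:
    """Stand-in for a renderer: returns a fake image and a fake label."""
    label = rng.uniform(5.0, 30.0)  # e.g. a leaf length used as ground truth
    image = [[rng.random() for _ in range(size)] for _ in range(size)]
    return image, label


def frame_stream(seed: int, n_frames: int) -> Iterator[Sample]:
    """Yield frames one at a time; nothing is kept after it is consumed."""
    rng = random.Random(seed)
    for _ in range(n_frames):
        yield render_frame(rng)


def train(stream: Iterator[Sample], batch_size: int) -> int:
    """Consume the stream in batches and return the number of full batches."""
    batch: List[Sample] = []
    steps = 0
    for sample in stream:
        batch.append(sample)
        if len(batch) == batch_size:
            # A real loop would run the forward and backward pass here.
            steps += 1
            batch.clear()
    return steps  # an incomplete final batch is dropped


steps = train(frame_stream(seed=0, n_frames=10), batch_size=4)
print(steps)  # 2
```

Because the renderer and the training loop run side by side, scene parameters could in principle be adjusted between frames in response to how the model is performing.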

READ THE ARTICLE:

Dirk Norbert Helmrich, Felix Maximilian Bauer, Mona Giraud, Andrea Schnepf, Jens Henrik Göbbert, Hanno Scharr, Ebba Þora Hvannberg, Morris Riedel, A scalable pipeline to create synthetic datasets from functional–structural plant models for deep learning, in silico Plants, Volume 6, Issue 1, 2024, diad022, https://doi.org/10.1093/insilicoplants/diad022


The code used in Helmrich et al. (2023) is open source and available under the Synavis and SynavisUE repositories, with an example available under SynavisUEexample. The official CPlantBox code can be found on the institute’s GitHub page. The branch associated with this article has been forked to this page.

German translation by Dirk Helmrich.
