

The GMAI team is committed to delivering the most valuable synthetic data for computer vision tasks. As a team of high-profile scientists, we conduct research toward enabling synthetic data and generative AI.

Augmenting Datasets with GenAI

We introduce a specialized procedural model for generating synthetic agricultural scenes, focusing on soybean crops, along with various weeds. The model simulates distinct growth stages of these plants, diverse soil conditions, and randomized field arrangements under varying lighting conditions. The integration of real-world textures and environmental factors into the procedural generation process enhances the photorealism and applicability of the synthetic data. We validate our model's effectiveness by comparing the synthetic data against real agricultural images, demonstrating its potential to significantly augment training data for machine learning models in agriculture. This approach not only provides a cost-effective solution for generating high-quality, diverse data but also addresses specific needs in agricultural vision tasks that are not fully covered by general-purpose models.


M. Cieslak, U. Govindarajan, A. Garcia, A. Chandrashekar, T. Hädrich, A. Mendoza-Drosik, D. Michels, S. Pirk, C.-C. Fu, W. Palubicki

Generating Diverse Agricultural Data for Vision-Based Farming Applications

IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshop: Vision for Agriculture, 2024 

AI-Powered Growth Parameter Estimation of Plants

We introduce LAESI, a Synthetic Leaf Dataset of 100K synthetic leaf images on millimeter paper, each with semantic masks and surface area labels. This dataset provides a resource for leaf morphology analysis aimed at beech and oak leaves. We evaluate the applicability of the dataset by training machine learning models for leaf surface area prediction and semantic segmentation, using real images for validation. Our validation shows that these models can be trained to predict leaf surface area with a relative error not greater than that of an average human annotator. LAESI also provides an efficient framework based on 3D procedural models and generative AI for the large-scale, controllable generation of data with potential further applications in agriculture and biology. We evaluate the inclusion of generative AI in our procedural data generation pipeline and show how data filtering based on annotation consistency results in datasets that allow training the highest-performing vision models.


J. Kaluzny, Y. Schreckenberg, K. Cyganik, P. Annighöfer, S. Pirk, D. L. Michels, M. Cieslak, F. Assaad, B. Benes, W. Palubicki

LAESI: Leaf Area Estimation with Synthetic Imagery

IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshop: Synthetic Data for Computer Vision, 2024
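
The LAESI abstract above compares models and human annotators by the relative error of predicted leaf surface area. As a minimal sketch of how such a comparison could be computed, assuming a set of reference areas is available: the function name and all numbers below are made up for illustration and are not taken from the paper.

```python
def mean_relative_error(predicted, reference):
    """Mean of |predicted - reference| / reference over paired measurements."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [abs(p - r) / r for p, r in zip(predicted, reference)]
    return sum(errors) / len(errors)


# Hypothetical leaf areas in cm^2 (illustrative values, not from the dataset).
reference_areas = [20.0, 40.0, 50.0]
model_areas = [21.0, 38.0, 50.0]   # assumed model predictions
human_areas = [22.0, 44.0, 45.0]   # assumed human annotations

model_error = mean_relative_error(model_areas, reference_areas)
human_error = mean_relative_error(human_areas, reference_areas)

# The claim "not greater than a human annotator" corresponds to this check.
model_matches_human = model_error <= human_error
```

Relative rather than absolute error keeps small and large leaves comparable, which is presumably why the abstract reports it; the exact protocol used in the paper may differ.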

Paradigms for Designing Synthetic Data

The rise of artificial intelligence (AI), and in particular modern machine learning (ML) algorithms, during the last decade has been met with great interest in the agricultural industry. While undisputedly powerful, their main drawback remains the need for sufficient and diverse training data. The collection of real datasets and their annotation are the main cost drivers of ML developments, and while promising results on synthetically generated training data have been shown, generating such data is not without difficulties of its own. In this paper, we present a paradigm for the iterative, cost-efficient generation of synthetic training data. Its application is demonstrated by developing a low-cost early disease detector for tomato plants (Solanum lycopersicum) using synthetic training data. A neural classifier is trained by exclusively using synthetic images, whose generation process is iteratively refined to obtain optimal performance. In contrast to other approaches that rely on a human assessment of similarity between real and synthetic data, we instead introduce a structured, quantitative approach. Our evaluation shows superior generalization results when compared to using non-task-specific real training data and a higher cost efficiency of development compared to traditional synthetic training data.


J. Klein, R. E. Waller, S. Pirk, W. Pałubicki, M. Tester, and D. L. Michels

Synthetic Data at Scale: A Paradigm to Efficiently Leverage Machine Learning in Agriculture

Social Science Research Network, 2023

Synthetic-Data Enabled Robot-Plant-Interactions

Robotic harvesting has the potential to positively impact agricultural productivity, reduce costs, improve food quality, enhance sustainability, and address labor shortages. In the rapidly advancing field of agricultural robotics, training robots in a virtual environment has become essential. Generating training data to automate the underlying computer vision tasks, such as image segmentation, object detection, and classification, also heavily relies on such virtual environments, as synthetic data is often required to overcome the shortage and lack of variety of real data sets. However, physics engines commonly employed within the robotics community, such as ODE, Simbody, Bullet, and DART, primarily support motion and collision interaction of rigid bodies. This inherent limitation hinders experimentation and progress in handling non-rigid objects such as plants and crops. In this contribution, we present a plugin for the Gazebo simulation platform based on Cosserat rods to model plant motion. It enables the simulation of plants and their interaction with the environment. We demonstrate that, using our plugin, users can conduct harvesting simulations in Gazebo by simulating a robotic arm picking fruits and achieve results comparable to real-world experiments.


J. Deng, S. Marri, J. Klein, W. Palubicki, S. Pirk, G. Chowdhary, D. L. Michels

Gazebo Plants: Simulating Plant-Robot Interaction with Cosserat Rods

International Conference on Robotics and Automation Workshop - Robotics And Sustainability, 2024

Simulations of Weather

Due to the complex interplay of various meteorological phenomena, simulating weather is a challenging and open research problem. In this contribution, we propose a novel physics-based model that enables simulating weather at interactive rates. By considering atmosphere and pedosphere we can define the hydrologic cycle – and consequently weather – in unprecedented detail. Specifically, our model captures different warm and cold clouds, such as mammatus, hole-punch, multi-layer, and cumulonimbus clouds as well as their dynamic transitions. We also model different precipitation types, such as rain, snow, and graupel by introducing a comprehensive microphysics scheme. The Wegener-Bergeron-Findeisen process is incorporated into our Kessler-type microphysics formulation covering ice crystal growth occurring in mixed-phase clouds. Moreover, we model the water run-off from the ground surface, the infiltration into the soil, and its subsequent evaporation back to the atmosphere. We account for daily temperature changes, as well as heat transfer between pedosphere and atmosphere leading to a complex feedback loop. Our framework enables us to interactively explore various complex weather phenomena. Our results are assessed visually and validated by simulating weatherscapes for various setups covering different precipitation events and environments, by showcasing the hydrologic cycle, and by reproducing common effects such as Foehn winds. We also provide quantitative evaluations creating high-precipitation cumulonimbus clouds by prescribing atmospheric conditions based on infrared satellite observations. With our model we can generate dynamic 3D scenes of weatherscapes with high visual fidelity and even nowcast real weather conditions as simulations by streaming weather data into our framework.


J. A. Amador Herrera, T. Hädrich, W. Pałubicki, D. T. Banuti, S. Pirk, D. L. Michels

Weatherscapes: Nowcasting Heat Transfer and Water Continuity

ACM Transactions on Graphics (SIGGRAPH Asia), 2021
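
The Weatherscapes abstract above refers to a Kessler-type microphysics formulation. As a rough illustration of what that term generally means, the sketch below advances the warm-rain part of a textbook Kessler scheme by one explicit time step: cloud water turns into rain through autoconversion (above a threshold) and accretion (collection by existing rain). The function name, the rate constants, and the threshold are assumptions based on commonly quoted textbook values, not the parameters of the paper, and the ice-phase processes mentioned in the abstract are not covered.

```python
def kessler_warm_rain_step(q_cloud, q_rain, dt,
                           k_auto=1e-3, q_threshold=1e-3, k_accr=2.2):
    """One explicit Euler step of a textbook Kessler warm-rain scheme.

    q_cloud, q_rain: cloud-water and rain-water mixing ratios (kg/kg).
    dt: time step in seconds.
    k_auto, q_threshold, k_accr: assumed illustrative constants.
    """
    # Autoconversion: cloud water above the threshold converts to rain.
    autoconversion = k_auto * max(q_cloud - q_threshold, 0.0)
    # Accretion: falling rain collects cloud water.
    accretion = k_accr * q_cloud * max(q_rain, 0.0) ** 0.875
    # Never convert more cloud water than is available.
    transfer = min((autoconversion + accretion) * dt, q_cloud)
    return q_cloud - transfer, q_rain + transfer


# Hypothetical air parcel: 2 g/kg of cloud water, no rain yet, 10 s step.
new_cloud, new_rain = kessler_warm_rain_step(2e-3, 0.0, dt=10.0)

# Below the threshold and without rain, nothing is converted.
idle_cloud, idle_rain = kessler_warm_rain_step(5e-4, 0.0, dt=10.0)
```

Moving the same amount out of cloud water and into rain keeps total water conserved within the step, which matters for a model that tracks the full hydrologic cycle.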
