Machine Learning for Earth System Modelling Hackathon

Machine Learning
HPCs
AI
This hackathon at the University of Bonn explored machine learning applications for Earth System Modeling. Participants worked with satellite and observational data to develop predictive models for environmental variables such as precipitation. Using AI frameworks and high-performance computing, our team fine-tuned models, evaluated performance, and implemented innovative solutions. The event fostered collaboration and hands-on experience, bridging data science, remote sensing, and climate modeling for actionable environmental insights.

Introduction

From 27–29 August 2025, I had the opportunity to participate in the CESOC Hackathon on Machine Learning for the Earth System in Bonn, Germany. The event brought together teams interested in applying machine learning and AI techniques to Earth system challenges.

I was part of the Observation / Satellite Data Challenge (Challenge 3), where our task was to improve precipitation predictions by working with satellite data. The original AIFS model, trained on ERA5 reanalysis data, struggled to capture realistic precipitation features. Our goal was to enhance its performance.

Our Approach: Fine-Tuning with IMERG Data
Our solution involved fine-tuning the AIFS model using IMERG (Integrated Multi-satellitE Retrievals for GPM), a high-quality satellite-based observational dataset. Unlike ERA5, IMERG is known for capturing more realistic precipitation patterns and better handling extreme events.

This creative approach led to the development of our model, AIFS-Indra.

Implementation and Results
To fine-tune the model, we made a few key changes. We lowered the learning rate and added warm-up steps so that the first updates stayed small and training remained stable while the pretrained weights adapted to the new data. Because of limited GPU availability, we could train for only ten epochs, yet the results still showed clear improvements.
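The warm-up idea can be sketched in a few lines. This is only an illustration, not the configuration we used at the hackathon: the function name `warmup_lr` and the values of `base_lr` and `warmup_steps` are placeholders chosen for the example.

```python
def warmup_lr(step, base_lr=1e-5, warmup_steps=500):
    """Linear warm-up to a (reduced) base learning rate, then hold it constant.

    All values here are illustrative assumptions, not the hackathon settings.
    """
    if warmup_steps <= 0:
        # No warm-up requested: use the base learning rate from the start.
        return base_lr
    # Scale grows linearly from 1/warmup_steps to 1.0, then stays at 1.0.
    scale = min(1.0, (step + 1) / warmup_steps)
    return base_lr * scale


if __name__ == "__main__":
    for step in (0, 99, 249, 499, 5000):
        print(step, warmup_lr(step))
```

In a training loop, a function like this would be called once per optimizer step to set the current learning rate. Most deep learning frameworks provide scheduler classes that do the same job.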

Our results were promising: the precipitation distribution from AIFS-Indra began to align closely with the IMERG data. We also measured a notable decrease in global RMSE when evaluating against an independent year of IMERG data, confirming the positive impact of our fine-tuning.
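For readers unfamiliar with the metric, here is a minimal sketch of a global RMSE on a regular latitude-longitude grid. It assumes each grid row is weighted by the cosine of its latitude, which is a common convention for global scores. The function name `global_rmse` and the toy data are hypothetical, and this is not our exact evaluation code.

```python
import math


def global_rmse(pred, obs, lats_deg):
    """Area-weighted RMSE over a regular lat-lon grid.

    pred, obs : nested lists shaped [n_lat][n_lon] (e.g. precipitation fields)
    lats_deg  : latitude of each grid row, in degrees
    The cos(latitude) weighting is an assumption made for this sketch.
    """
    weighted_sq_err = 0.0
    weight_sum = 0.0
    for row_pred, row_obs, lat in zip(pred, obs, lats_deg):
        # Grid cells cover less area toward the poles, so they get less weight.
        w = math.cos(math.radians(lat))
        for p, o in zip(row_pred, row_obs):
            weighted_sq_err += w * (p - o) ** 2
            weight_sum += w
    return math.sqrt(weighted_sq_err / weight_sum)


if __name__ == "__main__":
    lats = [0.0, 60.0]
    observed = [[1.0, 2.0], [3.0, 4.0]]
    predicted = [[1.0, 2.0], [5.0, 6.0]]  # error only in the 60-degree row
    print(global_rmse(predicted, observed, lats))
```

Because the high-latitude row carries half the weight of the equatorial row, errors there contribute less to the score than the same errors in the tropics.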

Ultimately, despite constraints like resolution and limited compute, our project successfully demonstrated the benefits of integrating observational data into large-scale climate models.

Here’s our full presentation for more insights on the challenge:

The creativity of our approach was recognized by the judges, and our team was honored to be named the winner of the challenge!