Summer School in Ostrava: AI Unlocks New Opportunities for Data Stewards

From June 3 to 5, Ostrava hosted a summer school for data stewards, focused on modern approaches to managing and sharing scientific data. Participants expanded both their expertise and their network within the Czech community, which now numbers over 350 professionals.

13 Jun 2025 Lucie Sobková Lucie Skřičková


Electronic Lab Notebooks & Python

On the first day, Marek Cebecauer and Michal Tarana demonstrated electronic lab notebooks (ELNs), software tools that replicate, in digital form, the familiar layout of a paper lab notebook. In the afternoon, Tomáš Martinovič presented the use of Python and the Rye project-management tool in data workflows.

Data Cleaning & Artificial Intelligence

The second day featured an online presentation by Christopher Steiner from the University of Graz on data cleaning with LLMs and OpenRefine. He emphasized that modern data stewards don’t need to master specific tools in depth; instead, they should have a broad understanding of available technologies and be able to apply AI strategically. Crucial skills include crafting effective prompts and understanding how language models operate.

In a hands-on session, he showcased OpenRefine, a local open-source tool for cleaning and enriching datasets. He demonstrated how to remove inconsistencies, fill in missing data, and link records to external sources such as Wikidata, enabling secure and efficient data enhancement.
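For readers who prefer to see the idea in code, the two cleaning steps mentioned above (removing inconsistencies and filling in missing data) can be sketched in plain Python. This is not OpenRefine's implementation and not the material shown in the session; it is a minimal illustration under assumed conditions. The sample records, the field names, and the helper functions `fingerprint` and `clean` are all hypothetical.

```python
# Hypothetical sample records; names and values are illustrative only.
records = [
    {"institution": "Univ. of Graz ", "country": "Austria"},
    {"institution": "univ. of graz", "country": ""},
    {"institution": "IT4Innovations", "country": "Czechia"},
    {"institution": "it4innovations", "country": None},
]


def fingerprint(value):
    """Collapse case and whitespace so spelling variants share one key."""
    return " ".join(value.lower().split())


def clean(rows):
    """Unify spelling variants and fill missing countries within each group."""
    canonical = {}  # key -> first spelling seen, used as the canonical form
    countries = {}  # key -> first non-empty country seen for that key

    # Pass 1: collect one canonical spelling and one known country per group.
    for row in rows:
        key = fingerprint(row["institution"])
        canonical.setdefault(key, row["institution"].strip())
        if row["country"]:
            countries.setdefault(key, row["country"])

    # Pass 2: rewrite every row with the canonical spelling and fill gaps.
    cleaned = []
    for row in rows:
        key = fingerprint(row["institution"])
        cleaned.append({
            "institution": canonical[key],
            "country": row["country"] or countries.get(key),
        })
    return cleaned


cleaned = clean(records)
for row in cleaned:
    print(row)
```

Taking the first spelling seen as the canonical form keeps the sketch deterministic; in practice a data steward would review each group before merging, much as a clustering dialog asks for confirmation.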

This was followed by a discussion led by Jan Vališ on the Zenodo platform and the "Zenodo Community" concept. Participants explored whether to create project-based or institutional communities and how to address practical challenges in research-data management.

The day concluded with an excursion to the IT4Innovations supercomputing center and the industrial heritage site of Dolní Vítkovice, where participants had the opportunity to network informally.

Statistics & Team Collaboration

On the final day, Lucie Hošková guided participants through practical data analysis in R using RStudio. The program closed with Ondřej Mottl’s presentation on GitHub, which built on last year’s session on version control and Git. He charmed the audience with the story of his own transformation from a tentative scientist into an enthusiastic open science advocate. He also showed attendees in detail how tasks can be linked to specific code changes (see "Linking a pull request to an issue" in GitHub Docs). Managing tasks this way makes team collaboration clearer and more efficient and keeps the whole project transparent.
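One common way to link a pull request to an issue on GitHub is to put a closing keyword (such as "Fixes" or "Closes") followed by the issue number in the pull request description. The short Python sketch below only illustrates that convention by extracting such references from a text. It is a simplified assumption of the matching rules, not GitHub's own parser and not code from the presentation; the function name `linked_issues` and the issue numbers are hypothetical.

```python
import re

# Closing keywords: close/closes/closed, fix/fixes/fixed,
# resolve/resolves/resolved.
CLOSING_KEYWORDS = r"(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)"

# Simplified pattern: a keyword, whitespace, then "#<number>".
PATTERN = re.compile(rf"\b{CLOSING_KEYWORDS}\s+#(\d+)", re.IGNORECASE)


def linked_issues(pr_description):
    """Return the issue numbers that follow a closing keyword."""
    return [int(number) for number in PATTERN.findall(pr_description)]


# Hypothetical pull request description.
description = "Fixes #12 and closes #7. See also #3 for background."
print(linked_issues(description))
```

In this example only issues 12 and 7 are picked up; the plain mention of #3 has no closing keyword in front of it, so it is treated as a reference rather than a link.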

Thank you to all participants, speakers, and organizers. We look forward to the next Data Steward community meeting on September 25, 2025, in Prague.


Photogallery

Photo: Training Centre EOSC CZ

