Open Science II in the Context of EOSC CZ and the National Data Infrastructure
The opening session of the conference focused on the broader context of the Open Science II project and the development of the National Data Infrastructure (NDI), presented by Martin Nečaský, Matej Antol, Luděk Matyska, and Jan Tuček. Alongside technological development, the discussion highlighted the importance of long-term research data stewardship and data sharing across disciplines, as well as strengthening expert support in the areas of FAIR principles and research data management. The speakers also emphasized that the National Data Infrastructure is not an isolated national initiative, but part of the broader European EOSC Federation ecosystem, which aims to improve the interoperability of services, repositories, and research communities across Europe.
“For researchers and academic staff, the National Data Infrastructure must provide a robust environment capable of supporting them throughout the entire data lifecycle, from the initial creation of research data to its long-term preservation and reuse in repositories,” said Matej Antol, Lead Project Manager of the IPs EOSC-CZ project, during the opening session of the conference.
Science in the Age of AI
One of the highlights of the first day was the panel discussion “Open Science Without Borders,” moderated by Matej Antol and featuring Jana Klánová, Jan Hajič, and Jiří Vondrášek. The discussion focused on how to conduct high-quality research in an era shaped by artificial intelligence and the growing importance of research data, and whether current research evaluation and funding systems are capable of adapting to these changes.
The panel also highlighted that working with large datasets is not a recent phenomenon but has long been an integral part of research across many scientific disciplines. Participants reflected on the increasing administrative burden associated with research funding and questioned whether current evaluation systems adequately recognise the work involved in producing and maintaining high-quality datasets, software, and documentation.
Another key topic was the future role of automated tools in environments where funding decisions are often made among dozens of highly competitive proposals, and where even minor differences can significantly influence the direction of research. The broader implications of automation for the research ecosystem were also discussed. “If AI begins to take over some of the tasks through which junior researchers traditionally gain experience, an important question arises: where will future senior experts acquire the skills and expertise essential for their professional development?” noted linguist and computer scientist Jan Hajič.
Finding the Balance Between Data Sharing and Data Protection
The afternoon programme of the first day focused on the management of sensitive research data, repository governance, and the use of artificial intelligence in data-related workflows. Lucie Houdová presented the first analysis of needs and current practices in collaborative research, while Věra Franková discussed the management of sensitive data within the NDI and the emerging policies being developed for repositories. A significant part of her presentation focused on how these policies can be designed to enable future research use while safeguarding participant privacy and maintaining institutional trust.
“The governance of repositories containing sensitive data should be based on the principle of proportionality, balancing benefits and risks, leaving data stewardship in the hands of data producers, and ensuring transparency and support through Data Access Committees,” said Věra Franková, Associate Professor of Bioethics at Charles University.
The topic of secure handling of sensitive data was further explored by Vojtěch Bystrý and Michal Růžička, who presented a federated data analysis approach based on the principle of “sending the question instead of the data.” Martin Žádník and colleagues then demonstrated applications of AI within the National Repository Platform (NRP), including provenance management and the anonymisation of sociological survey data. Jan Martinovič also introduced a complementary AI application under development, connected to the repository of the Data Management for Artificial Intelligence and Machine Learning (DM4AI) group; the application is intended to support data-related workflows in other repository environments within the NDI in the future.
Repositories Spanning Disciplines from Archaeology to Biological Data
The second day of the conference focused primarily on repositories being developed within the Open Science II project and the NRP for research data. Presentations showcased a broad spectrum of disciplines, ranging from archaeology and social sciences to biological imaging data, herbarium collections, and repositories supporting machine learning research. Beyond the repositories themselves, speakers shared practical experiences related to data structures, nomenclature, user interfaces, and the individual stages involved in developing repository services.
A significant part of the Tuesday programme was dedicated to hands-on workshops exploring practical aspects of open science. The session for repository developers focused on lessons learned from repository design and implementation, common technical challenges, and practical questions related to data storage, nomenclature, and user interface design. In parallel, a workshop led by Petra Černohlávková and Ilona Trtíková for data curators explored dataset FAIRification, common research data management challenges, and practical support for data stewards and curators.
The two-day programme demonstrated that building the National Data Infrastructure is no longer just a matter of strategies and plans. Alongside new repositories and technical solutions, increasing attention is being paid to the practical experience of research teams, data stewards, and data curators, who will play a key role in shaping the future of research data management through their day-to-day work.