Open Science II (OS II)

Project background information

Title of Operational Programme: Programme Johannes Amos Comenius
Call title and number: Open Science II, 02_24_030
Project name: Open Science II
Project registration number: CZ.02.01.01/00/24_030/0015041
Recipient: Charles University
Project implementation period: 1 October 2025 – 31 December 2028

Project annotation

The Open Science II project supports the construction of the National Data Infrastructure (NDI) and the implementation of the EOSC initiative in the Czech Republic. It focuses on the creation and development of domain repositories, the FAIRification of data, interoperability, the development of tools, education and communication. It also engages the wider research community through an open competition for so-called mini-projects, and in this way contributes to systematic change in the handling of research data in accordance with FAIR principles.

Contact

Principal Project Manager
Michaela Hynková
michaela.hynkova@ruk.cuni.cz

Expert Project Manager
Jan Tuček
jan.tucek@fhs.cuni.cz

Project Objective

The objective of the project is to ensure a unified and coordinated infrastructure that respects domain specifics and enables the effective management, sharing and reuse of research data. The infrastructure will operate in accordance with FAIR principles and with the aims of implementing the EOSC initiative in the Czech Republic, in the following areas:

    1. Development and consolidation of existing domain and interdisciplinary research data repositories and creation of new ones, their connection to the NRP and integration into the NDI, including support for and expansion of user communities. 16 new domain repositories will be created and 6 existing domain repositories will be upgraded.
    2. Ensuring a secure, standardized and sustainable long-term environment for data storage and sharing, support for interoperability, creation and standardization of metadata models, standardization and reorganization of data, consistency checking and data cleaning. FAIRification and domain and cross-domain interoperability complement Sub-Objective 1 above.
    3. Development and piloting of tools and services important for the provision and development of NDI services, in particular specific services for working with sensitive data, for facilitating access to and publication of data, for strengthening cybersecurity and for generating and managing trusted provenance. New services and tools will be developed to expand the range of NDI services. A secure entry portal providing technical and user support using AI technology will be built.
    4. Dissemination of know-how about repositories, tools and NDI services among research communities, data support staff and repository users in general through communication, awareness-raising and educational activities.
    5. Broader involvement of target groups in the activities of the OS II project through the mini-project grant scheme, especially in the FAIRification of research data and in the standardization, compatibility and interconnection of systems, in order to develop the NDI and ensure its functionality.

Key Activities of the Project

Key Activity 1 (KA1) Project Management

The project is implemented by the applicant (Charles University) in cooperation with 11 partners. The implementation team consists of an administrative team and an expert team.

The administrative team is led by the Chief Project Manager, who is responsible for the overall management of the project, coordination of the administrative and expert teams, compliance with P JAC rules, achievement of the project objectives and adherence to the budget. The Project Manager and the Financial Manager oversee administrative and financial coordination at the level of the applicant and partners. Administrative staff handle day-to-day operations.

The expert team is led by the Expert Project Manager and consists of guarantors of thematic and cross-cutting key activities, expert guarantors of outputs, expert researchers and other specialists according to the needs of individual activities.

The highest decision-making body is the Project Internal Management Authority. The Project Executive Committee coordinates and supervises the implementation of the project.

This structure supports effective management, flexible response to changes and the quality assurance of outputs in line with project objectives.

Key Activity 2 (KA2) Thematic Cluster Bio/Health/Food

The aim of TKA B/H/F is to build repository infrastructure for the management, sharing and reuse of diverse types of biological, chemical and clinical data generated in basic and applied research in medicine, biology, chemistry and related sciences. The repository sub-activities share a common framework that is firmly anchored in FAIR principles, both in technical terms (standardized metadata, access interfaces, formats) and in procedural terms (curation, versioning, permission management, citation).

Essential features of all repository systems created are interoperability with the National Repository Platform (NRP), unified authentication and authorization through Life Science Login, and the ability to export metadata to national and European catalogues and infrastructures. FAIRification is not perceived as a one-off output, but as an ongoing, living framework that enables effective and sustainable data sharing and reuse across disciplines and institutions. For selected repositories, FAIRification is further extended to include direct support for advanced data processing tools (e.g. AI models in chemical biology, data annotation using ontologies).

The key activity does not represent an isolated effort, but a systematic change in access to data in the field of health, biology and chemistry in the Czech Republic, with an emphasis on international compatibility, repeatability of research, openness and sustainability. The goal is to create an infrastructure that will allow not only the storage, but also the full use of data for advanced research (meta)analysis, education, cross-domain collaboration and knowledge transfer.

Key Activity 3 (KA3) Thematic Cluster Materials Science and Technology

TKA MATECH will create an environment for the high-quality management of FAIR research data in materials science and technology in the Czech Republic. The central activity is the implementation of a new domain repository, DANTEc, with an adequate metadata profile, a selection of relevant licenses and a user interface that will improve not only the storage but also the searching and reuse of research data. The repository will be connected to data management tools used by individual users and within research infrastructures, and to a tool for subsequent work with data in the repository. This interconnection will contribute to the improvement of stored domain-specific data and will also facilitate broad cross-domain use of the repository with modern ML and AI-based tools.

Key Activity 4 (KA4) Thematic Cluster Data Management for Artificial Intelligence and Machine Learning

The thematic cluster Data Management for Artificial Intelligence and Machine Learning will create a new, domain-focused AI/ML (Artificial Intelligence/Machine Learning) repository of the same name, based on the Clarin DSpace repository system. The repository aims to offer a unified platform, inspired by the global Hugging Face application, that will enable efficient sharing and management of AI/ML models, datasets and workflows. It will also provide advanced tools for working with data and the possibility of connecting to computing infrastructures through the LEXIS Platform.

Key Activity 5 (KA5) Thematic Cluster Social Sciences

TKA SOC will create a domain repository platform for the social sciences by upgrading two existing repositories, CSDA and DataHub, and supplementing them with a newly built repository for sensitive data. The goals will be achieved by connecting the existing systems to the NDI EOSC in the Czech Republic and adding the necessary, previously missing element of the domain infrastructure. The resulting comprehensive environment for implementing the Open Science policy in the social sciences is aimed at both the storage of data and their reuse in social science and cross-domain research. The existing systems are integrated into the international data services ecosystem; at the same time, the NDI will be connected to the European level of the domain data infrastructure. The KA will also focus on the systematic dissemination of project outputs and the development of educational capacities in data management, and not only for the professional community in the social sciences. The goal is to raise awareness of new services, analyses and educational materials created in the project through a targeted communication strategy and a newly created web platform.

Key Activity 6 (KA6) Thematic Cluster Physical Sciences

The thematic cluster Physical Sciences is building a specialized domain repository known as "Physics", based on an implementation of the Invenio system in the National Repository Platform (NRP). The repository will offer robust storage, clearly defined metadata models and automated tools for the bulk transfer and verification of large data packages from the ATLAS ITk, DUNE, CTAO SST-1M and Auger experiments and from crystallographic structural analysis. This ensures that data generated today will be stored in accordance with FAIR principles from the very beginning and will remain traceable and usable over the long term for future generations of researchers.

The technical part is complemented by innovative services: a tool for automated metadata creation connected to electronic laboratory notebooks (ELN), a web component for the direct visualization of multidimensional data (HDF5/NeXus, FITS, etc.) in a browser, and e-learning modules and workshop materials that will quickly spread FAIR data management skills among both Czech and international teams. These open-source tools will also be immediately usable in other NDI clusters and will strengthen the interoperability of the entire Czech infrastructure.

Key Activity 7 (KA7) Thematic Cluster Humanities and Arts

Key Activity 7 will focus on upgrading existing repositories and developing new ones in the humanities cluster, as well as on tools and services for the development of the NDI.

The repositories are concentrated around four existing large domain infrastructures for the humanities, which act as central producers of research data and as competence centres for open science within the cluster (Archaeological Information System of the Czech Republic – AIS CR, Czech Literary Bibliography – CLB, Czech National Corpus – CNC and LINDAT/CLARIAH-CZ). The activity will utilise existing repository solutions, which will be supplemented by new repositories or repository communities in selected fields with significant research data production. The following repositories will be upgraded:

    1. LINDAT/CLARIAH-CZ, which will extend its user community to the field of corpus linguistics by integrating data of the CNC research infrastructure, on the basis of which a separate collection/community of the repository will be created.
    2. The Digitalia Muni ARTS institutional repository.
    3. The ArchaeoVault archaeology repository.

A new repository will be created for the Czech Literary Bibliography research infrastructure; depending on the results of internal analysis, it will be operated either as a separate collection/community within one of the existing repositories or as a separate instance.

Key Activity 8 (KA8) Thematic Cluster Environmental Science

The Environmental Science thematic cluster will focus on the following:

  • the creation of metadata models, standards and methodologies enabling the processing of mass spectroscopic data, photographic records for the purpose of studying biodiversity, the genetic bank of wildlife, zoological collections, toxicological and ecotoxicological data and GIS data. It will further focus on the introduction of metadata standards to ensure semantic interoperability (interconnection with controlled vocabularies, thesauri and ontologies) and the assignment of citable persistent identifiers to data objects or extracts;
  • preparation of an application interface for the metadata exchange of managed datasets (DCAT) and an interface for data exchange;
  • preparation of supporting materials, including instructions for the creation and validation of FAIRified datasets;
  • analysis of other needs of the research community.

In all cases, the activity will address the quality and interoperability of repositories and data in an international context, licensing and legal models, and connections to external data sources.

Key Activity 9 (KA9) Thematic Cluster Sensitive Data

The thematic cluster Sensitive Data will add a layer of sensitive data management across the entire NRP and NDI space. It will continue to support activities necessary for the FAIRification of data in repositories and to work on licensing and other legal regimes and models. It will expand current capabilities with the tools and functionalities necessary for the management of sensitive data. It will create detailed guidelines for the FAIRification of sensitive data and tools or services applicable within the life cycle of sensitive data management. Last but not least, it will focus on good practice in sharing and processing sensitive data generated from, among other things, cooperation between academia and the private sector. The TKA aims to create clear and harmonised rules for accessing sensitive data that cannot be shared under current conditions.

Key Activity 10 (KA10) Cross-cutting themes

This key activity responds to current fundamental challenges in Open Science whose solutions, given the very rapid pace of development, were not or could not be included in the planned architecture of the NRP project. The Open Science II project cannot by its nature ignore them, and by addressing them it will contribute significantly to the development of the NDI. In the context of the OS II project as a whole (in relation to the thematic KAs), the term "cross-cutting themes" must be understood in the sense of "complementary". If the NDI is to form a unified, standardized and evolving structure, a secure entry portal providing the broadest and most effective technical and user support must be built. Given the expected number and breadth of cases to be processed, such a portal cannot be built and operated without AI technology. The emphasis on cybersecurity must also be understood in its legal context, particularly in relation to all forms and modes of handling sensitive data. Attention will also be paid to the generation and management of trusted provenance, which supports tracing the origin of objects stored in repositories or coming from laboratory journals, other software and elsewhere.

A further integral component of the KA is targeted communication with research communities and institutions. The effective sharing of information on the procedures and results of the project is key to promoting cross-domain collaboration, involving professionals and increasing awareness of emerging services. The project focuses on topics such as the interoperability of research data, the benefits of FAIR principles, the usability of OS II services, vocational training and the wider context of open science within the European EOSC initiative. These topics are communicated with regard to the needs of individual scientific communities and current challenges in data management.

Key Activity 11 (KA11) Mini-projects

In this key activity, the implementation of the EOSC initiative will be supported through calls for cooperation on OS II Mini-Projects. A complete grant scheme of calls for cooperation will be implemented.


Project Target Group

The OS II project is primarily aimed at employees of research organizations, who represent not only the main creators of the project outputs, but also their main users. These experts are actively involved in the development, testing and implementation of tools, methodologies and repository services. The project thus directly supports the development of their competences, strengthens the capacities of research organizations in the field of research data management and contributes to increasing the quality and openness of research work in the Czech Republic.

The project addresses a wide range of users across research institutions, public administration, higher and vocational education, and the professional community. These groups differ not only in their field of focus, but also in how they are involved and in the degree to which they use individual project outputs.

The definition of all target groups of the project is based on a qualitative description, the manner of involvement and the expected benefits, while their quantification will be possible only at a later stage of the project implementation based on feedback, statistics on the use of repositories or participation in educational activities.


Overview of institutions participating in the project

[Figure: diagram of the consortium structure showing the individual teams and their mutual relationships.]
