This is something I deal with every day, and it is often underestimated. Our simulations generate large, multidimensional datasets. Each of them is linked to specific physical parameters, computational settings, and convergence criteria. The first challenge is simply keeping track of what has already been calculated.
Without proper infrastructure, duplicate calculations, missing metadata, and results that cannot be reproduced after six months quickly become problems. That is why we created OSCARpes, a structured database designed to index, deduplicate, and provide one-step photoemission results with full provenance. It may sound more like engineering than physics, but without this layer, science cannot scale.
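To illustrate the kind of bookkeeping this involves, here is a minimal sketch of how simulation runs can be indexed and deduplicated by hashing their canonical parameter sets, so a repeated calculation is caught before any compute time is spent. This is not the actual OSCARpes implementation; the table layout, field names, and the use of SQLite here are purely illustrative assumptions.

    # Minimal sketch (not the actual OSCARpes code): each calculation is keyed
    # by a hash of its canonicalized input parameters, and basic provenance
    # (parameters, code version, timestamp, result location) is stored with it.
    import hashlib
    import json
    import sqlite3
    from datetime import datetime, timezone

    def parameter_key(params: dict) -> str:
        """Hash of the canonical (sorted, compact) parameter set."""
        canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def open_index(path: str = "runs.sqlite") -> sqlite3.Connection:
        con = sqlite3.connect(path)
        con.execute(
            """CREATE TABLE IF NOT EXISTS runs (
                   key TEXT PRIMARY KEY,       -- hash of the input parameters
                   params TEXT NOT NULL,       -- full parameter set (provenance)
                   code_version TEXT NOT NULL, -- which code/version produced it
                   created_utc TEXT NOT NULL,  -- when the run was registered
                   result_path TEXT            -- where the spectrum is stored
               )"""
        )
        return con

    def register_run(con, params: dict, code_version: str, result_path: str) -> bool:
        """Insert a run; return False if an identical parameter set is already indexed."""
        key = parameter_key(params)
        try:
            con.execute(
                "INSERT INTO runs VALUES (?, ?, ?, ?, ?)",
                (key, json.dumps(params, sort_keys=True), code_version,
                 datetime.now(timezone.utc).isoformat(), result_path),
            )
            con.commit()
            return True
        except sqlite3.IntegrityError:
            return False  # duplicate calculation detected

    if __name__ == "__main__":
        con = open_index(":memory:")
        # Hypothetical parameter names for a one-step photoemission run.
        params = {"material": "example", "photon_energy_eV": 21.2,
                  "polarization": "p", "k_grid": [64, 64]}
        print(register_run(con, params, "one-step-code-vX", "spectra/run001.h5"))  # True
        print(register_run(con, params, "one-step-code-vX", "spectra/run002.h5"))  # False

The point of such a layer is simply that identical parameter sets map to identical keys, so duplicates are rejected at registration time and every stored result keeps a machine-readable record of how it was produced.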
Long-term storage is another challenge. Academic groups often rely on local servers or institutional clusters, where long-term availability is not always guaranteed. When a PhD student leaves, important data can effectively disappear.
The field is gradually moving towards the FAIR principles (making data findable, accessible, interoperable, and reusable), but in practice adoption remains slow, partly because FAIR data require additional work upfront and do not always lead directly to publications.
Sharing is perhaps the most cultural of these challenges. Condensed matter physics still largely operates in a mode where data are shared only when an article is published, if at all. But if we want AI-based approaches to truly work, we need large, well-curated, and accessible datasets. Simulations from one group can save another group months of computation, but only if the data are structured and reusable.
These issues may not sound particularly attractive, but without a high-quality data infrastructure, such approaches cannot be developed in the long term. Physics and algorithms are advancing rapidly, and the way we work with data must keep pace.
Do you see differences in approaches to research data and Open Science between countries, for example, between the Czech Republic, Europe more broadly, and Tunisia?
Yes, the differences are quite visible. At the European level, EOSC is becoming a key environment for publishing, finding, and reusing research data across countries and disciplines. For our field, projects such as PaNOSC are also important, as they bring major synchrotron and neutron facilities into this ecosystem.
In materials science, Germany is very advanced. FAIRmat, one of the NFDI consortia, represents the condensed matter physics and chemical physics communities and builds on NOMAD, one of the largest data infrastructures for computational materials science. In my research environment, this is exactly the type of infrastructure that makes FAIR data practically usable.
In the Czech Republic, things are beginning to move significantly. EOSC CZ is helping to build a national infrastructure for FAIR data, and within Open Science II, a specialized repository called DANTEc is being developed for materials science and technology. Discipline-specific solutions like this can greatly help researchers put FAIR principles into practice.
In Tunisia, connections to global research infrastructure are gradually being strengthened, for example, through the adoption of persistent identifiers and cooperation with DataCite. The country has a strong expert community, and linking it with European platforms such as NOMAD or EOSC could significantly accelerate further development.
Even where infrastructure already exists, however, adoption is still slow. Established workflows do not change overnight. Platforms are being built, but the real change is cultural — and that takes time in every country.
You are a member of the EOSC CZ working group focused on metadata and physical sciences. How important are high-quality metadata for data reusability and interdisciplinary collaboration?