Molecular Biophysics Database: How Does a Database for Biophysicists Work and What Is Its Benefit for Research?

Measurements in molecular biophysics help scientists understand how biomolecules behave, how their properties are reflected in biological processes, and how they interact with other biomolecules or small molecules. This knowledge can, for example, support the development of drugs targeting specific biological mechanisms.

24 Jun 2026 Lucie Skřičková

The outcomes of these experiments are not only conclusions published in scientific articles, but also extensive sets of raw data that may retain value long after the original research has been completed.

However, for these data to be reused, it is not enough to store them. They must be clearly described, findable, citable, and available in a form that enables further interpretation. This is precisely the purpose of the Molecular Biophysics Database (MBDB), a database for storing, describing, sharing, and reusing experimental biophysical data.

Why a Database for Raw Biophysical Data Is Needed

Raw experimental data obtained using molecular biophysics techniques often contain information that may be valuable to other research teams, to new analytical approaches, or for later verification of published results. In practice, however, such data have often been stored only locally or in general-purpose repositories without clear rules or the ability to perform systematic searches. This has reduced their findability, reusability, and the possibility of independent verification.

At the same time, the field lacked specialized infrastructure to store and provide access to raw data from different experimental methods in a standardized, reusable form. The need for such a solution began to be discussed more intensively within the European research infrastructure project MOSBRI (Molecular-Scale Biophysics Research Infrastructure), in which the Institute of Biotechnology of the Czech Academy of Sciences also participated. This is where the first proposals for a database for experimental biophysical data emerged.

The development of MBDB involved a team from the Institute of Biotechnology of the Czech Academy of Sciences led by Jan Dohnálek, who contributed to the design of the pilot version and the subsequent development of the system. The path to the current form of the database took several years. Its development was not only a technical challenge. It was necessary to find a way to meaningfully describe data from different experimental methods within a single environment, ensure their mutual compatibility, and, at the same time, preserve sufficient detail for specific measurement types.

One of the greatest challenges was designing a data model that was sufficiently detailed yet practical to use. The result is a database that creates a unified framework for storing data from multiple biophysical methods, without the need to build a separate repository for each method. In this way, the database connects the principles of open science with the practical needs of the biophysics community.

Data, Metadata, and the MBDB Metadata Model

For experimental data to be truly reusable, it is not enough to store only the file produced by an instrument. It is necessary to know what was measured, under what conditions, using which instrument, how the data were processed, and who produced them. MBDB therefore works with three basic types of information: metadata about the origin of samples, their preparation, the measurement method, and final results; raw data from instruments; and derived data.

Raw data are numerical data obtained directly from instruments, such as the amount of heat as a function of the added amount of sample, or the change in fluorescence intensity as a function of the concentration of one component of a complex. Derived data are created by processing the raw data, for example, by subtracting the background or transforming them into another form.
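The step from raw to derived data can be illustrated with a minimal Python sketch of background subtraction. The function name, the variable names, and the numbers are assumptions made up for illustration; they are not taken from MBDB or from any instrument software.

```python
from typing import List


def subtract_background(raw: List[float], background: List[float]) -> List[float]:
    """Return derived data: the raw signal minus a background signal, point by point."""
    if len(raw) != len(background):
        raise ValueError("raw and background must have the same number of points")
    return [r - b for r, b in zip(raw, background)]


# Hypothetical values: heat per injection from a titration (raw data)
# and from a buffer-only control run (background).
raw_heat = [-10.0, -8.0, -5.0, -2.0]
control_heat = [-1.0, -1.0, -1.0, -1.0]

# The derived series is what would be analyzed further.
derived_heat = subtract_background(raw_heat, control_heat)
print(derived_heat)  # [-9.0, -7.0, -4.0, -1.0]
```

Keeping the raw series next to the derived one is what allows another team to repeat or change the processing step later.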

Metadata play a key role. They provide context for the data and describe the molecular system studied, the samples used, the chemical environment, instrument settings, the method of analysis, the results obtained, and the authors, their affiliations, related publications, or funding sources. The results represent biophysical parameters derived from the measurements, such as dissociation constants or the stoichiometry of binding partners. It is metadata that make data more than just “a file somewhere in a database”; they make the data findable, understandable, and reusable.

Another important concept is the metadata model, which determines how metadata should be structured. In simple terms, it can be imagined as a carefully designed table with named fields that precisely define what information should be added to a record and in what form. MBDB uses a general part of the metadata model common to all experimental methods, along with a specific part that differs for each method. This makes it possible to search the database across measurement techniques while also capturing the detailed information needed for each specific measurement type.
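The idea of a general part shared by all methods plus a method-specific part can be sketched as follows. This is only an illustration under stated assumptions: the class names, field names, and example values are hypothetical and do not reproduce the actual MBDB metadata model.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class GeneralMetadata:
    """Fields shared by every record, whatever the method (hypothetical names)."""
    title: str
    technique: str          # e.g. "ITC", "SPR", "MST", "BLI"
    depositors: List[str]
    molecules: List[str]


@dataclass
class Record:
    """A record combines the general part with a method-specific part."""
    general: GeneralMetadata
    method_specific: Dict[str, object] = field(default_factory=dict)


def find_by_molecule(records: List[Record], molecule: str) -> List[Record]:
    """Search across techniques using only the general part of the model."""
    return [r for r in records if molecule in r.general.molecules]


# Two made-up records from different techniques that share one molecule.
records = [
    Record(
        GeneralMetadata("Binding of ligand X to protein A", "ITC",
                        ["A. Researcher"], ["protein A", "ligand X"]),
        {"cell_temperature_celsius": 25.0, "number_of_injections": 19},
    ),
    Record(
        GeneralMetadata("Kinetics of protein A with protein B", "SPR",
                        ["B. Researcher"], ["protein A", "protein B"]),
        {"flow_rate_ul_per_min": 30.0},
    ),
]

hits = find_by_molecule(records, "protein A")
print([r.general.technique for r in hits])  # ['ITC', 'SPR']
```

Because the search touches only the shared fields, it works the same way for every technique, while the method-specific part remains free to carry whatever detail a given measurement type needs.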

Which Methods Does the Database Support?

In its first phase, MBDB focuses on methods for which no suitable infrastructure for storing raw data had previously existed and for which the research community most strongly perceived the need for standardization. MBDB currently supports BLI (Bio-Layer Interferometry), ITC (Isothermal Titration Calorimetry), MST (Microscale Thermophoresis), and SPR (Surface Plasmon Resonance).

Development of the database continues. The newest public version will also make it possible to store data obtained using mass photometry, and the database is being prepared for additional experimental techniques, such as DSF (Differential Scanning Fluorimetry).

Importantly, MBDB is not intended to be a closed technical solution for only a few selected methods. It is designed as an extensible database that can grow gradually to meet the needs of research communities.

Practical Use of MBDB

Let us imagine a researcher studying the interaction between a protein and a small molecule. She performs a measurement, obtains raw data, processes them, and publishes the results in a scientific article. Without a suitable repository, the raw data would often remain stored only in the laboratory, on a personal drive, or in internal storage.

If she deposits them in MBDB, however, she adds the necessary metadata: what exactly was measured, which molecules were used, in what chemical environment the measurement took place, which instrument and settings were selected, and how the data were analyzed. The record receives a DOI, allowing it to be cited similarly to a scientific article.

Other researchers can return to the data, verify the interpretation of the results, compare them with their own measurements, or use them for new analytical approaches. The benefit, however, is not only for others. The authors of the data also benefit from depositing well-described data. They gain a secure place where they can find their data even after many years, compare them with other experiments, and present them as an independent, citable output of their research.

This is also important for research infrastructures that provide researchers with service access to instruments. Service laboratories need to register newly generated datasets, preserve them over the long term, and, when needed, document their origin, content, and connection to a specific measurement.

Benefits for the Research Community

MBDB helps researchers not only store data but also gradually harmonize the way experimental information in molecular biophysics is described. This is important for data reuse, comparison, and future automated processing.

In the future, structured data may support, for example, the systematic comparison of measurement protocols, the development of better analytical tools, or the use of machine learning methods. The database therefore not only helps individual laboratories but also contributes to better agreement within the biophysics community on standards, data quality, and data sharing.

How Data Can Be Deposited in MBDB

The process of depositing data is similar to publishing a scientific article. The user creates an account (which currently requires an active ORCID account), prepares a draft of the record, and submits it for review once it is complete. In the upcoming version of the database, it will also be possible to log in through the NRP AAI.

Records can be created not only through the web interface, but also using an API. This is particularly useful when depositing larger amounts of data or extensive metadata. The database does not assess the scientific significance of the measurement, but checks the completeness, consistency, and quality of the data description. After approval, the user can decide when to publish the record. Once published, the record becomes publicly available and receives a DOI through DataCite. This allows the data to be cited in a similar way to scientific publications.
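The deposition workflow described above (draft, submission for review, approval, publication with a DOI) can be pictured as a small state machine. The sketch below is an illustration only: it does not call the MBDB API, the class and state names are hypothetical, the DOI is a placeholder, and the step that returns a submitted record to draft for corrections is an assumption.

```python
from enum import Enum
from typing import Optional


class State(Enum):
    DRAFT = "draft"
    SUBMITTED = "submitted"
    APPROVED = "approved"
    PUBLISHED = "published"


# Allowed transitions between states.
TRANSITIONS = {
    State.DRAFT: {State.SUBMITTED},
    # Assumption: a submitted record may be returned to draft for corrections.
    State.SUBMITTED: {State.APPROVED, State.DRAFT},
    State.APPROVED: {State.PUBLISHED},
    State.PUBLISHED: set(),
}


class Deposition:
    """Hypothetical model of one record moving through the workflow."""

    def __init__(self, title: str) -> None:
        self.title = title
        self.state = State.DRAFT
        self.doi: Optional[str] = None

    def move_to(self, new_state: State, doi: Optional[str] = None) -> None:
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(
                f"cannot go from {self.state.value} to {new_state.value}")
        if new_state is State.PUBLISHED:
            # A DOI is attached only at publication.
            if doi is None:
                raise ValueError("a published record needs a DOI")
            self.doi = doi
        self.state = new_state


record = Deposition("Hypothetical ITC measurement")
record.move_to(State.SUBMITTED)   # author submits the completed draft
record.move_to(State.APPROVED)    # review confirms the description is complete
record.move_to(State.PUBLISHED, doi="10.0000/placeholder")  # author publishes
print(record.state.value, record.doi)
```

Modeling publication as a separate step after approval mirrors the point that the user decides when an approved record becomes public.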

MBDB emphasizes adherence to the FAIR principles and provides detailed documentation describing the process for depositing data and metadata, the data model, and the curation workflow. This facilitates both the deposition of data itself and its subsequent searching and reuse.

Data can be deposited and searched directly on the official MBDB website. The database is also built on open technologies, and its code is available as open source. The Molecular Biophysics Database shows what practical infrastructure for open science in a specific field can look like. It is not merely a technical storage solution but an environment that helps provide experimental data with context, lasting value, and the possibility of further use.

Part of the National Data Infrastructure

MBDB is built on the CESNET Invenio repository system and represents a pilot repository developed within the National Repository Platform for Research Data (NRP) project. At the same time, it is part of the National Data Infrastructure (NDI), whose aim is to provide research organizations with reliable services for the management, sharing, and long-term preservation of research data. MBDB thus connects the field-specific needs of molecular biophysics with the development of a national infrastructure for open science and research data management.

If you would like to learn more about the MBDB repository, read the original scientific article published in the European Biophysics Journal.
