DATA CHAMPION: RESEARCH DATA MANAGEMENT IN THE RESOLV CLUSTER OF EXCELLENCE

"We want to get a scientific and societal value out of RDM"

Prof. Stefan Kast © Aliona Kardash/TU Dortmund

Prof. Stefan M. Kast from the Faculty of Chemistry and Chemical Biology coordinates the Task Force Research Data Management (RDM) in RESOLV ("Ruhr Explores Solvation"), the joint Cluster of Excellence of TU Dortmund University and Ruhr University Bochum. He is also a member of "NFDI4Chem", a consortium within the National Research Data Infrastructure (NFDI) initiative, which aims to coordinate the management of research data for a large part of the chemical disciplines. In the interview, Prof. Kast reports, among other things, on the challenges that RDM poses at the technical, personnel and conceptual levels, and on the solutions that have been developed at RESOLV, for example for electronic lab notebooks and repositories.

Prof. Kast, why is an RDM task force necessary in a cluster of excellence, and what specifically distinguishes its work?

Research data management is part of good scientific practice according to modern standards, which was recognized early on in the Cluster of Excellence RESOLV. But initially, we in the cluster lacked the technical and conceptual framework to deal sustainably with the heterogeneity of research data. In addition, we faced very diverse requirements due to the different research directions. Therefore, we established the Task Force with members from RESOLV representing the respective methodological areas. The work in the task force is not only a technical challenge. Rather, we want to prepare and manage research data in a way that adds scientific and societal value. Through our weekly meetings and coordination with the RESOLV Board, we have gradually been able to flesh out our requirements and identify the systems that are relevant to us. However, one remaining problem is that the relevance of RDM is not clear to everyone. As a task force, we therefore also have to raise awareness and show the working groups that RDM is not just a necessary evil, but offers real added value by keeping data usable and understandable in the long term.

What approaches to solutions for needs-based RDM have you already developed in the task force?

In order to meet the identified needs, we evaluated various systems and finally selected electronic lab notebooks (ELNs) with extended lab management functions and a central data repository with metadata functionality. To put it simply, the ELN is our prototypical documentation notebook. ELNs are always useful as soon as processes are standardized to some extent. This is often the case in synthetic chemistry, especially inorganic and organic chemistry. Such documentation standards are essential to prevent reproducibility crises. If something is maximally reproducible and, in the case of digitally controlled processes or computational methods, automatable, it can be used to generate qualified - we also say curated - data for training machine learning models. Finally, we have chosen Dataverse as our repository. Our central task remains to ensure that electronic lab notebooks and repositories are routinely incorporated into research practice without a major initial hurdle. This also gives rise to the requirements for technical interfaces and metadata. Above all, it is of central importance to establish a uniform definition and use of metadata (such as time of measurement, authorship, methods used). Meaningful "vocabularies" are currently being developed within the NFDI, but are still at an early stage and are not yet readily usable for RESOLV with its particularly interdisciplinary projects. The goal is not to be forced to perform processes such as documentation and archiving of data manually, but rather to automatically link experiment, theory and method, for example. This reduces the susceptibility to errors and increases the traceability and reusability of research results. However, setting up these standards and maintaining the system is very personnel-intensive.
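As an editorial illustration of the uniform metadata mentioned above (time of measurement, authorship, methods used), the following is a minimal Python sketch of a record that is checked before it is archived. The class name, field names and checks are hypothetical and chosen only for this example; they do not reproduce RESOLV's actual schema, an NFDI vocabulary or the Dataverse API.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import List


@dataclass
class MeasurementMetadata:
    """Hypothetical metadata record; field names are illustrative only."""
    measured_at: datetime   # time of measurement
    authors: List[str]      # authorship
    methods: List[str]      # methods used
    notes: str = ""

    def problems(self) -> List[str]:
        """Return human-readable issues; an empty list means the record is usable."""
        issues = []
        if self.measured_at.tzinfo is None:
            issues.append("measured_at needs a time zone")
        if not self.authors:
            issues.append("at least one author is required")
        if not self.methods:
            issues.append("at least one method is required")
        return issues

    def to_json(self) -> str:
        """Serialize to JSON with an ISO 8601 timestamp, e.g. for upload to a repository."""
        record = asdict(self)
        record["measured_at"] = self.measured_at.isoformat()
        return json.dumps(record, sort_keys=True)


# Example with made-up placeholder values (not real data).
example = MeasurementMetadata(
    measured_at=datetime(2023, 5, 4, 9, 30, tzinfo=timezone.utc),
    authors=["A. Researcher"],
    methods=["IR spectroscopy"],
)
if not example.problems():
    print(example.to_json())
```

Generating such a record automatically at the instrument or in the ELN, rather than typing it by hand, is one way to reduce the susceptibility to errors described in the answer above.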

How can the RDM tools currently being developed by the subject consortia of the National Research Data Infrastructure fit into RESOLV's RDM strategy?

The NFDI consortia are currently still developing and evaluating RDM solutions that meet the very different requirements of the individual disciplines. The fact that the heterogeneity of the research disciplines has been recognized and considered is positive. Currently, therefore, national and also local efforts are being made at all levels to find universally valid solutions, as we are doing in RESOLV. Science is fundamentally a democratic process that is not controlled by an authoritarian force. Nevertheless, how the process - in this case RDM - is carried out is a governance issue. We will need to expect RESOLV members to adhere to standards. Therefore, it is important that those with technical responsibility also consistently implement the overarching concepts themselves: Data must be findable, accessible, interoperable, and reusable (FAIR). The scientific actors' requests for support or direction are reflected on jointly within RESOLV and the NFDI. In this way, RESOLV and the NFDI mutually enrich each other.

About the person:

  • Since 2009: Professor of Theoretical Physical Chemistry at TU Dortmund University
  • 2012: Founding member and since then Vice Chairman of the Dortmund Center for Scientific Computing (DoWiR)
  • Since 2018: Dean of the "integrated Graduate School Solvation Science" (iGSS) of the Cluster of Excellence RESOLV
  • Since 2018: Dean of the Faculty of Chemistry and Chemical Biology at TU Dortmund University

Prof. Kast is portrayed as a Data Champion because he is testing innovative strategies of data handling with his task force and wants to make RDM visible as an independent scientific achievement.

Further information: