Article by Dr. Katrin Crameri in MIRACUM Journal #3, March 2020 (translation of the original German text)
In tomorrow's digital, learning healthcare system, healthcare and research must go hand in hand. In order to meet this aspiration, current national projects are also striving for international harmonization and interoperability.
The efforts are enormous, but there is no alternative if the collection and utilization of health data is to be carried out in accordance with European values. It is not surprising that European medical informatics is currently (finally) experiencing a jolt. There is still time, even if it is becoming scarcer, to position and equip national research repositories in such a way that citizens do not have to fear a sell-out.
Enabling data-driven research
In an environment where large companies and entire countries are taking up positions, the population's demand that its own data are, or will be, put to use is also growing. Given the vast amount of health data available today, healthcare decisions should no longer be based on population averages, but should take into account individual patient characteristics. However, only the analysis of a large amount of patient data from routine care and other sources has the potential to drive relevant changes in medicine. We are talking here, for example, about more effective prevention, more efficient care and the possibility of new, personalized therapies through data-driven research.
In Switzerland, the Parliament has allocated a total of CHF 68 million to make health data more usable for research. Within the framework of the "Swiss Personalized Health Network", a national research infrastructure initiative (duration: 2017-2020) - similar to the medical informatics initiative in Germany - we have now been working for three years, among other things, on making clinical routine data, image data and data from molecular and genomic examinations of our five university hospitals interoperable. A second funding period until 2024 will follow.
Even though the consolidation of data in federal Switzerland is in itself challenging, we are also pursuing a uniform semantic interoperability strategy with the SPHN initiative. This is because data points can be understood, not only by humans (i.e. in our case by researchers from a wide range of disciplines) but also by machines, only if they are precisely formulated. With regard to the structuring and comparability (and thus combinability) of data, terminology and agreement on standards play an important role, especially for research - and in today's world this is ideally done in line with international harmonization efforts.
«Digitization is more than just the conversion from analog to digital. Its goal should be to achieve added value and benefits for health care, in coordination with national and international standardization initiatives.»
The future belongs to sustainable interoperability
For data transport and data storage, a flexible technical solution is needed that allows data to be transported without affecting its semantic meaning and that allows maximum flexibility in data reuse. Ideally, the transport format is data model independent, since the choice of such a model (such as OMOP, i2b2 or CDISC) depends strongly on the requirements and needs of data usage and should therefore be determined by the recipient rather than the data provider. To achieve this, SPHN is currently evaluating the Resource Description Framework (RDF), a commonly used approach developed to address a similar challenge on the World Wide Web.
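The idea of a model-independent transport format can be illustrated with a minimal sketch. It assumes nothing about SPHN's actual implementation: the `http://example.org/` namespace, the property names and the sample values are all hypothetical, and plain Python tuples stand in for a real RDF library. The provider ships subject-predicate-object statements; the recipient alone decides which columns its own data model needs.

```python
from collections import defaultdict

# Hypothetical namespace and property names, for illustration only.
EX = "http://example.org/"

# The provider describes each observation as subject-predicate-object triples.
triples = [
    (EX + "obs/1", EX + "patient", EX + "patient/42"),
    (EX + "obs/1", EX + "code", "heart-rate"),
    (EX + "obs/1", EX + "value", "72"),
    (EX + "obs/2", EX + "patient", EX + "patient/42"),
    (EX + "obs/2", EX + "code", "body-weight"),
    (EX + "obs/2", EX + "value", "81"),
]


def to_ntriples(triples):
    """Serialize triples as N-Triples lines (IRIs in <>, literals in quotes)."""
    lines = []
    for s, p, o in triples:
        if o.startswith("http://"):
            obj = f"<{o}>"
        else:
            escaped = o.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


def to_rows(triples, columns):
    """Recipient-side projection: pivot triples into flat rows.

    The recipient chooses `columns`; the transported triples stay unchanged.
    """
    by_subject = defaultdict(dict)
    for s, p, o in triples:
        # Use the last path segment of the predicate IRI as the column name.
        by_subject[s][p.rsplit("/", 1)[-1]] = o
    return [
        {c: props.get(c) for c in columns}
        for _, props in sorted(by_subject.items())
    ]


print(to_ntriples(triples))
print(to_rows(triples, ["code", "value"]))
```

A second recipient could call `to_rows` with a different column list, or load the same triples into another target model, without the provider changing anything on its side.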
A collective Herculean task
The definition of semantic guidelines is a (collective!) Herculean task, as it should ideally cover all fields and indications. For the researchers who are waiting for the data for their research projects to be delivered from the hospitals - the data exchange within SPHN always takes place in compliance with all data protection regulations and ethical standards - it would in principle suffice for the data to be interoperable within their project. However, if we are striving for sustainable interoperability - across projects, across systems, across national borders and over time - compromise solutions or quick wins won't get us very far.
At the beginning of the Swiss initiative, we assumed that our driver projects, which help us to develop and validate the infrastructures, would primarily use data already available in the hospitals for their research projects. However, we soon found that most of the projects took a prospective approach and defined on a project-specific basis which data from which patient groups they would like to include in order to answer their questions. This presents the systems with additional challenges, because in some places the desired data is not collected at all in routine care, or at least not in the desired way, which for research purposes is equivalent to the data not being available.
Late mapping holds potential for errors
A further challenge is that the standardization pursued is only brought into play at the very end of the data processing chain. This means that data is recorded in the various source systems of the clinics not in the form that is optimal for research (as structured and standardized as possible), but in the way that is customary for care, namely largely in the form of free text and without the use of uniform standards. Within university hospitals, structured and unstructured data usually flow into a so-called data lake, i.e. a large, common repository. Only when the data requested for research is fished out of this lake, or directly from the primary systems, and processed do the interoperability specifications recommended by SPHN come into play. If the data is not available at the respective locations in the agreed standards or terminology, additional mapping is required. This is possible if the primary data is available in sufficiently fine granularity and data dictionaries are provided. Since these prerequisites are often not met, "late" mapping can lead to a lot of work and has a certain potential for error.
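The late-mapping step can be sketched as follows. This is an illustration under stated assumptions rather than a description of any hospital's pipeline: the local codes, the `STD:` target codes and the `map_late` helper are all hypothetical. The point is that a record can only be mapped when it carries a structured local code and the data dictionary contains an entry for it; everything else is left over for manual work.

```python
# Hypothetical site-specific data dictionary: local code -> agreed standard code.
DATA_DICTIONARY = {
    "LAB_HB": "STD:hemoglobin",
    "LAB_CREA": "STD:creatinine",
}


def map_late(records, dictionary):
    """Split records into those that can be mapped and those that cannot."""
    mapped, unmapped = [], []
    for rec in records:
        standard_code = dictionary.get(rec.get("local_code"))
        if standard_code is None:
            # No structured code, or no dictionary entry: needs manual review.
            unmapped.append(rec)
        else:
            mapped.append({**rec, "standard_code": standard_code})
    return mapped, unmapped


# Hypothetical records as they might be pulled from a data lake.
records = [
    {"local_code": "LAB_HB", "value": 13.5},
    {"local_code": "LAB_CREA", "value": 80},
    {"local_code": None, "free_text": "Hb slightly low"},  # free text only
    {"local_code": "LAB_XYZ", "value": 1.0},  # missing from the dictionary
]

mapped, unmapped = map_late(records, DATA_DICTIONARY)
print(len(mapped), "mapped;", len(unmapped), "need manual review")
```

Keeping the unmapped records in a separate list, instead of dropping them silently, makes the remaining manual effort and the potential for error visible.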
The requirements that we place on data from a research point of view, in terms of structuring and standardization, but also in terms of quality, i.e. completeness, consistency, validity and accuracy, are also relevant for the use of big data analysis and artificial intelligence for direct knowledge acquisition in patient care. The system therefore has a definite interest, beyond research, in optimizing the processes surrounding data acquisition right at the source.
Digitization is more than just the conversion from analog to digital. It should not be pursued for its own sake, but with the aim of achieving added value and benefits for healthcare, in coordination with national and international standardization initiatives. In this respect, we have a lot of catching up to do, both in Switzerland and in Germany.