The SPHN Semantic Interoperability Framework
Semantic interoperability ensures that information is consistently interpretable by both machines and humans across projects, systems, and countries, and over time. To enable the use of health data from clinical routine and other sources for research, SPHN has developed a semantic interoperability framework.
How it is implemented
Semantic representation (Pillar 1)
SPHN concepts are generalizable building blocks that can be used in different contexts. Each concept contains all the information necessary to understand it. Concepts can be combined into composed concepts, which in turn can be combined into more complex compositions. It is important to find the right balance between abstraction and granularity to optimize expressive power. The approach can be illustrated with the example of “substance”. A substance can be an active or inactive ingredient of a drug, or it can be the substance someone is allergic to. Therefore, we can abstract “substance” as a concept in its own right. The concept of substance is composed of two concepts: “code” and “generic name”. These concepts describe a substance regardless of whether it is the active ingredient of a drug or the substance to which someone is allergic.
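The composition described above can be sketched in Python. This is only an illustration, not the SPHN data model: the class names, the split of “code” into a coding system and an identifier, and the `Drug` and `Allergy` wrappers are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Code:
    # Assumption: a code is modelled as a coding system plus an identifier.
    system: str
    identifier: str


@dataclass(frozen=True)
class Substance:
    # Per the text, a substance is composed of "code" and "generic name".
    code: Code
    generic_name: str


@dataclass(frozen=True)
class Allergy:
    # Hypothetical composed concept that reuses Substance.
    substance: Substance


@dataclass(frozen=True)
class Drug:
    # Hypothetical composed concept that reuses the same Substance concept.
    active_ingredient: Substance


# J01CA04 is the ATC code for amoxicillin; it is used here purely as sample data.
amoxicillin = Substance(code=Code("ATC", "J01CA04"), generic_name="amoxicillin")
allergy = Allergy(substance=amoxicillin)
drug = Drug(active_ingredient=amoxicillin)
```

The same `Substance` value is reused unchanged in both contexts, which is the point of keeping the concept abstract.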
To make the SPHN concepts comparable nationally and internationally, we express the meaning of an SPHN concept using existing semantic standards (controlled vocabularies) by creating, wherever possible, a meaning binding to SNOMED CT and/or LOINC. The data element of a concept can be expressed using one or several recommended standards (e.g. LOINC, SNOMED CT, ICD-10, ICD-O-3, CHOP, ATC). For example, the instance of substance code under the concept “allergy” can be an ATC code, a SNOMED CT code, or a code from another semantic standard. If needed, value sets are defined and, if possible, a value set binding to SNOMED CT is added. Descriptions for concepts and value sets, as well as standards, are aligned with national and international sources.
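As a rough sketch of how recommended standards could be recorded per data element, the following Python snippet keeps a lookup table and reports whether a given coding system is among the recommended ones. The element key, the table contents, and the function name are hypothetical; the actual recommendations are defined in the SPHN Dataset.

```python
# Hypothetical registry: data element -> recommended coding systems.
RECOMMENDED_STANDARDS = {
    "allergy.substance.code": {"ATC", "SNOMED CT"},
}


def is_recommended(element: str, system: str) -> bool:
    """Return True if `system` is a recommended standard for `element`."""
    allowed = RECOMMENDED_STANDARDS.get(element)
    if allowed is None:
        raise KeyError(f"no recommended standards registered for {element!r}")
    return system in allowed
```

A code from a system outside the table is not rejected here; the function only reports whether the system is one of the recommended ones, since the text allows codes from other semantic standards.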
Data transport and storage (Pillar 2)
SPHN concepts (blue) and their instances (i.e., the data in red) can easily be mapped from/to other data representations or merged with other RDF data sets without losing their semantics.
The relations between different concepts and the data are expressed in the form of triples, each composed of a “subject”, a “predicate”, and an “object”. In the example, the RDF triples indicate that the “Patient” (subject) “has” (predicate) a “Birth Date” (object), and that the “Birth Date” (subject) “has a datetime” (predicate), which is “01.01.2020” (object). Since RDF does not depend on a specific semantic standard, it allows the use of different semantic standards and value sets, as defined in the SPHN Dataset.
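The two triples from the example can be written down as plain Python tuples to show how a chain of triples is followed. This is a minimal sketch without an RDF library; the `https://example.org/` namespace, the predicate names, and the `match` helper are assumptions for illustration, not SPHN identifiers.

```python
EX = "https://example.org/"  # hypothetical namespace, not an SPHN URI

# Each triple is (subject, predicate, object).
triples = [
    (EX + "Patient", EX + "hasBirthDate", EX + "BirthDate"),
    (EX + "BirthDate", EX + "hasDateTime", "01.01.2020"),
]


def match(triples, s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Follow Patient -> Birth Date -> datetime value.
birth_date_nodes = [
    o for _, _, o in match(triples, s=EX + "Patient", p=EX + "hasBirthDate")
]
dates = [
    o
    for node in birth_date_nodes
    for _, _, o in match(triples, s=node, p=EX + "hasDateTime")
]
```

Pattern matching with wildcards is also the basic idea behind RDF query languages, which match triple patterns against a graph.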
In RDF, subjects, predicates, and objects are identified by a Uniform Resource Identifier (URI); an object can also be a literal value, such as the date in the example above. In our example, the concept Allergy has the URI https://biomedit.ch/rdf/sphn-ontology/sphn/allergy, which uniquely and unambiguously identifies it in the context of SPHN. The SNOMED CT meaning binding is introduced in the SPHN RDF Schema by linking this URI to the corresponding URI in SNOMED CT.
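A small sketch of how such a URI and its meaning binding could be put together is shown below. The base namespace is taken from the Allergy example; the lowercasing rule is inferred from that single example, and the predicate URI and the SNOMED CT identifier are deliberate placeholders, since the text does not state them.

```python
SPHN_BASE = "https://biomedit.ch/rdf/sphn-ontology/sphn/"
SNOMED_BASE = "http://snomed.info/id/"  # namespace of SNOMED CT URIs

# Hypothetical predicate; the property actually used in the SPHN RDF Schema
# is not given in the text.
MEANING_BINDING = "https://example.org/hasMeaningBinding"


def sphn_uri(concept_name: str) -> str:
    """Build a concept URI (assumption: lowercase name appended to the base)."""
    return SPHN_BASE + concept_name.strip().lower()


# "<sctid>" is a placeholder, not a real SNOMED CT identifier.
binding = (sphn_uri("Allergy"), MEANING_BINDING, SNOMED_BASE + "<sctid>")
```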
To facilitate the use of external standards in SPHN RDF Schema, the DCC provides RDF files of the following coding systems: ATC, CHOP, LOINC, SNOMED CT, ICD-10, UCUM.
How this strategy helps to address the following FAIR criteria:
- F1. (Meta)data are assigned a globally unique and persistent identifier
--> in RDF, each concept, instance, and relation has a URI
- F2. Data are described with rich metadata
--> each SPHN concept contains a set of information to describe the data element
- A1. (Meta)data are retrievable by their identifier using a standardized communications protocol
--> SPARQL is the query language and protocol for RDF standardized by the World Wide Web Consortium (W3C). It facilitates the exploration of data in RDF
- I1. (Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation
--> RDF is a W3C standard language for knowledge representation
- I2. (Meta)data use vocabularies that follow FAIR principles
--> SNOMED CT and LOINC, along with other standards, are used as vocabularies in the SPHN RDF Schema
- I3. (Meta)data include qualified references to other (meta)data
--> when possible, the SPHN URIs of concepts and instances are linked to the URIs of external resources such as SNOMED CT, LOINC, ATC, and ICD-10.