Guidance for de-identification of health-related data in compliance with Swiss legal and data protection regulations
Current data governance practice in hospitals allows data sharing if certain conditions and criteria are fulfilled. For most Swiss research projects, this includes verification of the following:
- Project plan
- Patients’ consent
- Ethical approval
- Legal agreement among project parties
- Technical security measures
- De-identification of data
The de-identification of health-related data (together with other conditions) is an essential measure to protect patient privacy and a prerequisite for data sharing among a broader research community. Although international guidelines on the de-identification of data are available, there is no guidance on the de-identification of health-related data that specifically takes Swiss law and data protection regulations into account. The U.S. Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule establishes national standards to protect individuals' medical records and other personal health information, and Swiss research projects often refer to HIPAA when documenting the de-identification process. However, the HIPAA Privacy Rule cannot be applied as is in Switzerland.
The PHI Group has therefore launched the de-identification project to develop Swiss recommendations for de-identifying health-related data so that data can be shared in compliance with Swiss legal requirements and data protection regulations. The recommendations are being developed with representatives of Swiss university hospitals and legal opinion leaders, pooling experience and knowledge regarding responsible data sharing. The feasibility of implementing the recommended de-identification approaches is considered, but the implementation itself is the responsibility of each institution.
The project comprises the following elements:
- Obtaining a legal opinion on the data de-identification approach with respect to legal requirements in Switzerland.
- Development of hands-on guidance for de-identifying data considering i) the overall project specifications following a quantitative and qualitative risk-based approach and ii) the pragmatic mitigation of re-identification risk by producing a set of de-identification rules.
- Development of a template to document and justify the de-identification approach.
Swiss legal framework for de-identification of health-related data
To ensure that the de-identification recommendations are in accordance with Swiss legal requirements, the PHI Group has requested an independent legal opinion. In its memorandum*, Homburger AG presents the process of de-identifying personal data and its key elements under the Swiss Data Protection Act (DPA) and the Human Research Act (HRA), and sets out the de-identification requirements to be met. Moreover, it discusses to what extent the two methods provided by HIPAA, the “Expert Determination” method and the “Safe Harbor” method, meet Swiss legal requirements. The “Expert Determination” method is a formal determination by a qualified expert. The “Safe Harbor” method requires the removal of specified identifiers as well as the absence of actual knowledge by the medical professionals that the remaining information could be used alone or in combination with other information to identify the individual.
In summary, the memorandum implies that simply removing the direct identifiers does not necessarily ensure that the data can be re-identified only with disproportionate effort, since it does not account for the risk arising from the combination of remaining variables or for other residual risk. Therefore, any rule-based approach has to be combined with a risk assessment in order to satisfy Swiss legal requirements.
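The combination risk mentioned above can be illustrated with a small sketch. It is not taken from the memorandum: the records, the field names and the choice of `birth_year`, `postal_code` and `sex` as quasi-identifiers are assumptions made for illustration only. The sketch counts how many records share the same combination of remaining variables after direct identifiers have been removed; a record that is alone in its group could potentially be singled out.

```python
from collections import Counter

# Hypothetical quasi-identifiers: variables that are not direct identifiers
# but may identify a person when combined.
QUASI_IDENTIFIERS = ("birth_year", "postal_code", "sex")

# Hypothetical records from which names and patient numbers were already removed.
records = [
    {"birth_year": 1950, "postal_code": "8001", "sex": "F", "diagnosis": "I10"},
    {"birth_year": 1950, "postal_code": "8001", "sex": "F", "diagnosis": "E11"},
    {"birth_year": 1932, "postal_code": "3920", "sex": "M", "diagnosis": "C61"},
]


def group_sizes(rows, quasi_identifiers):
    """Count how many rows share each combination of quasi-identifier values."""
    return Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)


def singled_out(rows, quasi_identifiers):
    """Return the rows whose quasi-identifier combination occurs exactly once."""
    sizes = group_sizes(rows, quasi_identifiers)
    return [
        row for row in rows
        if sizes[tuple(row[q] for q in quasi_identifiers)] == 1
    ]


unique_rows = singled_out(records, QUASI_IDENTIFIERS)
print(len(unique_rows), "record(s) are unique on the quasi-identifiers")
```

In this made-up example the third record remains unique even though no direct identifier is left, which is the kind of situation a risk assessment is meant to detect.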
Terms used in the Swiss and international research context
The de-identification process results in coded (pseudonymized) or anonymized data, depending on the method chosen to meet the specifications of the research project (see Figure 1).
Figure 1: De-Identification process results in coded or anonymized data.
For the further use of data, it is crucial to consider the legal framework of the country where the data is processed (Table 1). Terms may be named differently in the applicable legal regulations while describing similar categories of data, such as coded and pseudonymized data. Conversely, research parties may use terms such as coded or anonymized data in a different sense even though the definition in the applicable law appears to be clear. Moreover, a distinction must be made between terms describing already de-identified data (coded or anonymized data) and terms describing the process of coding (pseudonymization) or anonymization itself, which might be legally defined.
In this context, the European Data Protection Supervisor has recently published a list of the most important misunderstandings related to anonymization.
Table 1: Terms & legal regulations concerning further use of data effective in Switzerland, the European Union & the United States of America.
Data are considered truly anonymized if re-identification of a person is only possible with disproportionate effort. Anonymization can include irreversible masking or deletion. For example, the concordance table depicted in Figure 1 should be deleted.
Coded or pseudonymized data are de-identified data that are still considered personal data. The process of coding or pseudonymization is reversible: re-identification of the data subject is possible with the corresponding key (concordance table) but is restricted to duly authorized users. Although coded/pseudonymized data may carry a higher risk of re-identification, it offers researchers some advantages that justify its use, such as:
- Facilitating follow up research
- Avoiding the loss of data value that anonymization would entail
- Informing data subjects or their care providers of reportable, incidental findings
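The relationship between coded data and the concordance table can be sketched as follows. This is a minimal illustration, not a recommended implementation: the field name `patient_id`, the helper names `pseudonymize` and `re_identify`, and the use of random hexadecimal codes are all assumptions.

```python
import secrets


def pseudonymize(records, id_field="patient_id"):
    """Replace the identifier in each record with a random code.

    Returns the coded records and the concordance table (code -> original id).
    The input records are not modified.
    """
    code_by_id = {}   # original id -> code, so one person keeps one code
    concordance = {}  # code -> original id, the key needed for re-identification
    coded = []
    for record in records:
        original = record[id_field]
        if original not in code_by_id:
            code = secrets.token_hex(8)
            while code in concordance:  # regenerate on the unlikely collision
                code = secrets.token_hex(8)
            code_by_id[original] = code
            concordance[code] = original
        coded.append({**record, id_field: code_by_id[original]})
    return coded, concordance


def re_identify(code, concordance):
    """Look up the original identifier; returns None if the key is unavailable."""
    return concordance.get(code)


# Hypothetical example data.
records = [
    {"patient_id": "P-001", "lab_value": 5.4},
    {"patient_id": "P-002", "lab_value": 6.1},
    {"patient_id": "P-001", "lab_value": 5.9},
]
coded, concordance = pseudonymize(records)
```

Because the same person always receives the same code, follow-up data can be linked, and an authorized holder of the concordance table can trace a reportable finding back to the data subject. Without the table, `re_identify` returns `None`; whether the remaining data then count as anonymized still depends on the risk assessment.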
A phased approach concept for the de-identification of data
The description and documentation of the applicable de-identification process for coding/pseudonymization or anonymization is currently becoming part of the data governance framework of Swiss hospitals, in order to conform to institutional data protection regulations and safeguard patients’ privacy.
The general approach to de-identification focuses on risk reduction; it does not aim to eliminate absolutely all risk of re-identifying a data subject. The goal is to reduce the risk of re-identification to a level consistent with the law while still achieving the objectives of the intended use.
According to the legal opinion provided by Homburger AG*, assessing whether data is de-identified requires considering, in light of all relevant circumstances, whether there is a reasonable risk that a person with access to the data could and would re-identify it.
Phase 1 – De-identification use case evaluation
The purpose of phase 1 is to describe the project profile and the infrastructure used to collect, transfer and process health-related data for the project. The output of this phase is a completed questionnaire, which forms the basis for assessing the risk of re-identification. It contains information on the project context; project data controllership; information security; data transfer, use and processing; and the type of data and variables.
Phase 2 – Re-identification risk assessment
The completed questionnaire serves as a risk evaluation sheet in which each answer is associated with a defined risk level and weight. The risk of re-identification is evaluated on the basis of the project profile. The output of phase 2 is the risk assessment report.
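One way to combine risk levels and weights into a single figure is a weighted average, sketched below. The scoring formula, the 1 to 3 risk scale, the questions and the weights are all assumptions for illustration; the text above does not specify how the answers are aggregated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Answer:
    question: str
    risk_level: int  # hypothetical scale: 1 = low, 2 = medium, 3 = high
    weight: float    # hypothetical relative importance of the question


def weighted_risk(answers):
    """Return the weighted average risk level over all answers."""
    total_weight = sum(a.weight for a in answers)
    if total_weight == 0:
        raise ValueError("at least one answer must have a non-zero weight")
    return sum(a.risk_level * a.weight for a in answers) / total_weight


# Hypothetical questionnaire answers for one project.
answers = [
    Answer("Data leave the institution", risk_level=3, weight=2.0),
    Answer("Dataset contains free text", risk_level=2, weight=1.0),
    Answer("Access is restricted to named users", risk_level=1, weight=1.0),
]

score = weighted_risk(answers)  # (3*2.0 + 2*1.0 + 1*1.0) / 4.0
print(f"Weighted risk score: {score:.2f}")
```

A qualitative review would normally accompany such a number, since a single high-risk answer can matter more than the average suggests.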
Phase 3 – Re-identification risk mitigation
Based on the risk assessment report, the identified risks are evaluated and mitigated according to the applicable de-identification rules. The output of phase 3 is the documentation of the applied risk mitigation methods.
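The sketch below shows what a small set of de-identification rules could look like in code. The specific rules (age bands, postal code truncation, date shifting, removal of direct identifiers) are common techniques chosen for illustration and are not the rule set of the project; all function names, field names and parameters are hypothetical.

```python
from datetime import date, timedelta

# Hypothetical list of fields treated as direct identifiers.
DIRECT_IDENTIFIERS = ("name", "phone")


def generalize_age(age, band=10):
    """Replace an exact age with a band, e.g. 47 -> '40-49'."""
    lower = (age // band) * band
    return f"{lower}-{lower + band - 1}"


def truncate_postal_code(code, keep=2):
    """Keep the first `keep` characters and mask the rest, e.g. '8001' -> '80XX'."""
    return code[:keep] + "X" * (len(code) - keep)


def shift_date(d, offset_days):
    """Shift a date by a fixed offset so that intervals between dates are preserved."""
    return d + timedelta(days=offset_days)


def apply_rules(record, offset_days):
    """Return a new record with direct identifiers dropped and other fields generalized."""
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    out["age"] = generalize_age(record["age"])
    out["postal_code"] = truncate_postal_code(record["postal_code"])
    out["admission"] = shift_date(record["admission"], offset_days)
    return out


# Hypothetical record and offset.
record = {
    "name": "Jane Doe",
    "phone": "+41 00 000 00 00",
    "age": 47,
    "postal_code": "8001",
    "admission": date(2021, 3, 15),
}
mitigated = apply_rules(record, offset_days=-10)
```

Recording which rules were applied, with which parameters, provides the documentation that phase 3 produces as its output.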
Phase 4 – De-identification implementation
The implementation of the de-identification methods selected in the case-by-case assessment is planned at the institutional level.
Phase 5 – Re-identification residual risk evaluation
The goal of phase 5 is to identify residual risks. If residual risks are found, the process returns to phase 3 so that additional or alternative rules can be applied to mitigate them further. If no residual risk remains, the de-identification process results in a de-identified dataset (note that a periodic review, as described in phase 6, is recommended).
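The loop between phase 3 and phase 5 can be sketched as an iteration that applies one rule at a time and re-evaluates the remaining risk. The risk measure used here (the share of records that are unique on the chosen variables), the threshold, the rules and the data are assumptions for illustration, not the evaluation method of the phased approach.

```python
from collections import Counter


def unique_fraction(rows, quasi_identifiers):
    """Share of rows whose quasi-identifier combination occurs only once."""
    if not rows:
        return 0.0
    keys = [tuple(r[q] for q in quasi_identifiers) for r in rows]
    counts = Counter(keys)
    return sum(1 for k in keys if counts[k] == 1) / len(rows)


def mitigate_until_acceptable(rows, rules, quasi_identifiers, threshold):
    """Apply rules in order (phase 3) and re-evaluate after each one (phase 5)."""
    applied = []
    risk = unique_fraction(rows, quasi_identifiers)
    for name, rule in rules:
        if risk <= threshold:
            break  # residual risk acceptable, no further rule needed
        rows = [rule(r) for r in rows]
        applied.append(name)
        risk = unique_fraction(rows, quasi_identifiers)
    return rows, applied, risk


# Hypothetical data and rules.
rows = [
    {"birth_year": 1950, "postal_code": "8001"},
    {"birth_year": 1951, "postal_code": "8002"},
    {"birth_year": 1958, "postal_code": "8004"},
    {"birth_year": 1932, "postal_code": "3920"},
]
rules = [
    ("birth decade", lambda r: {**r, "birth_year": r["birth_year"] // 10 * 10}),
    ("postal code to 2 digits", lambda r: {**r, "postal_code": r["postal_code"][:2]}),
]

result, applied, risk = mitigate_until_acceptable(
    rows, rules, ("birth_year", "postal_code"), threshold=0.25
)
```

In this example the first rule alone does not reduce the share of unique records, so the second rule is applied as well. If the rules run out while the risk is still above the threshold, the function returns the last measured risk so that the case can be reviewed manually.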
Phase 6 – Use case and risk periodic review
Because new records may be added or the context may change during the project or after additional risk mitigation, a periodic review of the phased approach is recommended.