Secondary Use of Personal Information in Health Research: Case Studies, November 2002

Executive Summary

The challenge facing Canada today is to reach a workable and practical balance between the value Canadians place on the improvement of their health, the effectiveness of their health care services and the sustainability of their health care system, and the equally compelling value they place on their right to privacy and confidentiality with respect to their personal information.

Health research, particularly in the areas of health services and policy, population and public health, critically depends on the ready availability of existing data about people. Such data include health surveys; hospital, physician and laboratory records; provincial and federal billing and registration data; birth and death records; socio-demographic data; cancer registry data; and employment records. Large volumes of such data are generally needed in order to assemble unbiased samples from which health researchers can draw meaningful conclusions that are representative of populations.

The data are analyzed for the purposes of:

  • monitoring the health of the population;
  • identifying populations at high risk of disease;
  • determining the effectiveness of treatment;
  • quantifying prognosis and survival;
  • assessing the usefulness of preventive strategies, diagnostic tests and screening programs;
  • informing health policy through studies on cost-effectiveness;
  • supporting administrative functions; and
  • monitoring the adequacy of care.

Health research based on the secondary use of data contributes to our present level of understanding of the causes, patterns of expression and natural history of diseases. It also helps us to assess the impact of strategies for improving prevention, diagnosis and treatment and to evaluate policies for increasing the effectiveness and economic efficiency of health services.

Indeed, in the present climate of major public concern about the quality and sustainability of our health care system, health research is urgently needed to help inform and guide health care reform.

While health research is of great social importance, Canadians also highly value their rights to privacy and confidentiality. These rights are intimately connected with the right to respect for one’s dignity, integrity and autonomy in a free and democratic society and are constitutionally enshrined in the Canadian Charter of Rights and Freedoms (Part I of the Constitution Act, 1982, being enacted as Schedule B to the Canada Act 1982, c. 11) and Quebec’s Charter of Human Rights and Freedoms (R.S.Q. c. C-12). Privacy and confidentiality lie at the root of international and national ethics guidelines, as well as professional codes of conduct.

Data protection legislation is rapidly emerging across the country, with different requirements applying at the provincial, territorial or federal level, to personal information generally or to personal health information specifically, and in the private or public sector. Yet health services and population health research often crosses provincial or even national borders and requires access to general personal information (e.g. income level, education, work history), as well as personal health information (e.g. physician, laboratory and hospital records, registration and billing data), that may be derived from either public or private sources. By their very nature, therefore, these types of studies can invoke multiple laws with varying, and often inconsistent, requirements.

Despite the patchwork of existing legislation, most data protection laws are generally modeled after the internationally accepted Guidelines on the Protection of Privacy and Transborder Flows of Personal Data developed by the Organization for Economic Cooperation and Development (OECD) in 1980. These guidelines have since been adapted by Canadian businesses, consumer groups and governments, under the auspices of the Canadian Standards Association, reformulated into the Model Code for the Protection of Personal Information CAN/CSA-Q830-96 (the ‘CSA’ Code) and more recently incorporated as Schedule 1 of the federal Personal Information Protection and Electronic Documents Act (S.C. 2000, c.5).

The CSA Code is based on ten fair information principles:

  • Accountability
  • Identifying purposes
  • Consent
  • Limiting collection
  • Limiting use, disclosure and retention
  • Accuracy
  • Safeguards
  • Openness
  • Individual access
  • Challenging compliance

It is in this context that CIHR created an ad hoc working group composed of health researchers in order to provide real-life examples of research involving the secondary use of data in Canada (see Appendix A for a list of members and Appendix B for a description of the method followed). The objective of this initiative is to foster dialogue:

  • with those who draft policies and laws and those responsible for interpreting them, by providing tangible illustrations of their practical application in the health research context;
  • among researchers about how to comply with the spirit of the fair information principles and how to improve current information practices; and
  • with privacy advocates and the broader public on the benefits and concrete realities of health research.

Nineteen (19) case studies were developed to describe real-life examples of research involving secondary use of data in Canada (see note 2). They attempt to outline why each study is socially valuable and the benefits that have resulted, or may result. They detail the kind of information health services and population health researchers need and why. They explain how these researchers collect, use and disclose data, what retention practices are followed, what security safeguards are used and what review and oversight mechanisms are in place. When considered in light of existing laws and ethics guidelines, these case studies highlight the practical challenges that arise when applying various legal and ethical norms in the specific context of population health and health services research. Accordingly, the case studies identify a number of ethical and legal issues that warrant further consideration and discussion.

An in-depth analysis of these case studies focused on a number of important questions specific to health services and population health research, including:

  1. Why do health researchers need to make secondary use of data?
  2. Why is it sometimes impracticable to obtain consent?
  3. Why do research databases need to be retained over a long period of time?
  4. Why do current access policies and linkage practices among data custodians vary so widely?
  5. What security safeguards do health researchers currently use?
  6. What review and oversight mechanisms are currently in place?

Note 2: Although case studies 5 and 6 involve secondary use of tissue originally collected for another purpose, the present document does not purport in any way to cover the complex issues specifically related to the collection and storage of human tissue for research purposes. These warrant further development and analysis that are beyond the scope of this project.

Secondary Use of Data

The case studies demonstrate that the ability to conduct health research, particularly research on health services and population health, depends heavily on large volumes of readily-accessible, existing data. Such data may include information derived from: personal interviews; analyses of tissue samples; results of scientific tests; physician, hospital and laboratory records; birth and death records; billing claims and employee records. The case studies focus on examples of research using data that were originally collected for another purpose (secondary use of data). Such existing data are often found to be extremely useful for identifying and understanding problems, as well as for providing potential solutions.

Researchers who study health services or the health of populations rarely have any direct interest in knowing the specific identities of the people they study. Their focus is on aggregate trends. So, while personal information about identifiable individuals may be the source of data, this type of research is conducted with information that has either been made completely anonymous or has had as many identifiers as possible removed and replaced with encrypted codes. Indeed, many investigators conducting studies would not need any personal identifiers at all were it not for the need to consider the effect of important individual characteristics or to link data about individuals so as to construct histories over time.

In some cases, therefore, the possibility of linking de-identified data to other potentially identifying information remains crucial. This is necessary in order to:

  • study the relationship between certain health determinants and health status;
  • group together individuals on the basis of common characteristics such as age or geographic location; or
  • track individuals over time in order to study the evolution of certain diseases after long latent periods or to assess their progress through the continuum of health care.

Researchers should implement deliberate strategies that make it impossible (or at least extremely difficult) to determine the identity of an individual from the data they use. Current practices for anonymizing, de-identifying and linking personal information (whether carried out by the original data-holder before releasing the data for research purposes or by the researchers themselves once in possession of the data) tend to vary significantly according to what is considered ‘identifiable’. The ongoing challenge will be to reach agreement on what constitutes an appropriate degree of identifiability, recognizing that this concept will continue to evolve. Approaches for de-identifying and linking data need to achieve greater consistency to streamline efforts for meeting and continually improving best practices.
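As one illustration of replacing identifiers with codes while keeping records linkable, the following is a minimal Python sketch. It is not drawn from the case studies: the field names (health_no, name), the function names and the choice of a keyed hash (HMAC-SHA256) are all assumptions made for illustration, and actual practices vary among custodians and researchers.

```python
import hashlib
import hmac


def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Return a keyed-hash code that stands in for a direct identifier.

    The same identifier and key always give the same code, so records
    can still be linked; without the key the code cannot be recomputed.
    """
    normalized = identifier.strip().upper()
    return hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def deidentify(records, id_field, secret_key, drop_fields=()):
    """Drop direct identifiers and attach a link code to each record.

    `id_field` and `drop_fields` are hypothetical field names chosen
    by the caller (for example "health_no" and "name").
    """
    cleaned = []
    for record in records:
        kept = {
            key: value
            for key, value in record.items()
            if key != id_field and key not in drop_fields
        }
        kept["link_code"] = pseudonymize(record[id_field], secret_key)
        cleaned.append(kept)
    return cleaned


def link(left, right):
    """Join two de-identified data sets on their shared link code."""
    index = {record["link_code"]: record for record in right}
    return [
        {**record, **index[record["link_code"]]}
        for record in left
        if record["link_code"] in index
    ]
```

In this sketch, whoever holds the secret key (for example, a data custodian) could apply deidentify to each source before release, so that researchers would see only link_code values. Removing direct identifiers does not by itself guarantee anonymity, since the remaining fields may still identify individuals in combination.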

Consent

In clinical research studies, researchers directly interact with potential participants in well-defined protocols and provide them with the detailed information required for obtaining their informed consent. However, strict application of traditional consent procedures in health services and population health research raises problematic issues, particularly for retrospective studies that rely on already-existing, historical or archival data, including sample survey data. Among the factors that often make seeking consent impracticable, impossible or self-defeating in these particular types of studies are:

  • the sheer size of the populations studied;
  • the proportion of individuals who may have since relocated or died;
  • the risk of introducing potential bias through the consent procedure itself thereby affecting the generalizability and validity of research results;
  • the creation of even greater privacy risks by having to link otherwise de-identified data with nominal identifiers in order to communicate with individuals so as to seek their consent;
  • the risk of inflicting psychological, social or other harm by contacting individuals and/or their families in delicate circumstances;
  • the difficulty of contacting individuals directly when there are no ongoing relations with them;
  • the difficulty of contacting individuals even indirectly through public means such as advertisements and notices; and
  • the undue hardship that would be caused by the additional financial, material, human, organizational or other resources required to obtain individual consent.

As regards prospective data collection (for inclusion in a registry, for example), obtaining express consent for future research purposes also poses a challenge. On the one hand, obtaining specific consent for all possible secondary uses of the information, which often cannot be predicted at the time of collection, is not feasible. On the other hand, obtaining unqualified, blanket consent for undefined future health research purposes is empty and meaningless and may sometimes reduce, rather than increase, privacy protection by giving false assurance that informed consent has been obtained.

The case studies demonstrate the need for constructive, creative and innovative ways of respecting people’s right to know and to control how their information is used without necessarily requiring that express consent be obtained from every individual in every instance. They also point to the need to develop appropriate alternatives to the traditional consent model, specifically for population health and health services research, taking into account the overall balance of risks and benefits both to individuals and to society as a whole. These alternatives do not in any way abrogate the obligations to ensure, among other things, that:

  • an open, transparent and accountable process is implemented for managing privacy;
  • the appropriate confidentiality agreements are in place binding users of the information; and
  • effective safeguards are taken to protect the data against unauthorized disclosure.

Retention and Destruction of Data

The existence of data archives and registries, and the retention of data by researchers more generally, make it possible to reconstitute the data if and when needed, since researchers are often required to retain data for possible verification and auditing purposes (though these requirements tend to vary among sponsors and/or publishers). They also enable the expansion of the research question to eventually incorporate additional dimensions or examine related hypotheses in the future. Finally, in unique cases, they allow the identification of potentially affected patients in order to notify them about possible long-term risks of contracting fatal diseases or experiencing adverse effects that were unknown at the time of certain interventions (e.g. risks of contracting Creutzfeldt-Jakob disease in human growth hormone trials, contracting HIV in hepatitis B trials or experiencing adverse effects from certain vaccines).

The automatic destruction of data and/or all possible identifiers immediately upon the fulfillment of each discrete research purpose would prevent these and other important subsequent uses. Furthermore, the destruction of large databases would result in a huge waste of valuable public funds. Having to re-create data archives for each new research project would be impossible or prohibitively expensive. Creative means need to be further explored to assess under what conditions databases should be retained over the long term and, if so, how they should be secured (for instance, kept in the hands of trusted guardians, subject to formal periodic audits and proper oversight).

Data Access Policies and Capacity for Data Linkage

Access to existing data for purposes of carrying out health services or population health research often involves many different data stewards or custodians. These may include hospitals, public health clinics and laboratories, physicians’ offices, research centres, pharmacies, employers, registries, health information producers, and federal/provincial/territorial/municipal government departments and agencies.

The case studies reveal a significant degree of variation in the access policies of these different custodians. Data access policies may involve research agreements that custodians require researchers to sign before releasing any data to them. Whether such agreements are more or less sophisticated and/or detailed in content and conditions is often dictated by different legislative requirements that vary across institutional settings, sectors and jurisdictions. Data access policies may also require, as a condition for releasing data, review and oversight by designated bodies. Some custodians may require approval by their own internal oversight body, in addition to any other review and oversight mechanism(s) that may be required by law. The criteria by which these multiple oversight bodies will consider data access requests will depend, once again, on any internal criteria specific to the custodian and whatever other criteria are imposed by legislation across institutional settings, sectors and jurisdictions.

Moreover, the ability of custodians to perform data linkage in-house for researchers will largely depend on their capacity and resources, which also tend to vary quite significantly. Custodians that can perform data linkage completely in-house rarely need to release any identifying information to researchers. However, custodians that do not have the capacity to perform this service on behalf of researchers are often requested to release identifying information so that researchers can carry out the linkage themselves. Clearly there is a need to harmonize legislative requirements and data access policies, accommodating different capacity levels across institutions, in order to better streamline requests for access to information for research purposes.

Security Safeguards

The case studies reveal a wide range of security safeguards that are commonly used in the context of research, including:

  • organizational safeguards, such as limited personnel access, security clearance and employee confidentiality agreements;
  • physical safeguards, such as locked rooms, filing cabinets and facilities; and
  • technological safeguards, such as data encryption, special algorithms, passwords, access codes, tracking features, firewalls, etc.

Further options for protecting personal information are multiplying rapidly with advances in computing technology, and a spectrum of new solutions is emerging. More sophisticated techniques are increasingly available for authenticating individual users and limiting access to only the minimal data needed in the most general form possible, thereby ensuring confidentiality of the data while also retaining their usefulness for research purposes. The challenges now lie in: better disseminating information about existing security systems and processes; developing a set of minimum standards; ensuring greater consistency in the application of minimum standards; and continually reviewing, updating and adapting those standards as technology advances. There is clearly a need to identify best practices that are pragmatic, cost-effective and sufficiently flexible to evolve over time and accommodate different research methods.
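The idea of releasing only the minimal data needed, in the most general form possible, can be sketched as follows. This is an illustrative Python example under stated assumptions: the field names, the ten-year age bands and the use of the first three characters of a Canadian postal code (the forward sortation area) are hypothetical choices, not standards described in the case studies.

```python
def generalize_age(age: int, width: int = 10) -> str:
    """Replace an exact age with a band such as '40-49'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def generalize_postal_code(postal_code: str) -> str:
    """Keep only the first three characters (the forward sortation area)."""
    return postal_code.replace(" ", "").upper()[:3]


def minimal_view(record: dict, needed_fields: set) -> dict:
    """Return only the requested fields, generalizing age and postal code.

    Fields that were not requested, or that are absent from the record,
    are left out of the result entirely.
    """
    view = {}
    for field in needed_fields:
        if field not in record:
            continue
        value = record[field]
        if field == "age":
            view["age_band"] = generalize_age(value)
        elif field == "postal_code":
            view["region"] = generalize_postal_code(value)
        else:
            view[field] = value
    return view
```

A custodian using an approach of this kind would pass each record through minimal_view with the set of fields a given study actually requires, so that names and other unrequested fields never leave the source system. How coarse the bands and regions should be is a judgment that depends on the study and on what is considered identifiable.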

Review and Oversight Mechanisms

Health services and population health studies conducted in universities and affiliated institutions are typically reviewed by research ethics boards (REBs). REBs are composed of specialized experts in research, ethics and law, as well as lay members of the community. REBs are close to the ground and sensitive to local needs and values. They play a critical role in ensuring the protection of individual privacy, within a larger ethical framework, in accordance with fundamental principles of:

  • respect for human dignity;
  • respect for free and informed consent;
  • respect for vulnerable persons;
  • respect for privacy and confidentiality;
  • respect for justice and inclusiveness;
  • balancing of harms and benefits;
  • minimizing harm; and
  • maximizing benefit.

In Canada, only those research studies approved by REBs are eligible for funding by the three federal granting agencies and/or regulatory approval under the Food and Drugs Act (R.S.C. c. F-27). In particular cases involving secondary use of data or proposed data linkages, academic REBs consider, among other factors: the sensitivity of the information involved; the possibility of identifying particular individuals; the magnitude and probability of harm or stigma resulting from identification; the context in which the information was originally collected; the possibility of obtaining consent; the appropriateness of using alternative strategies for informing participants and/or consulting with representative members of the study group; as well as any legal provisions that may apply in the situation. In their review, REBs apply a proportionate approach in balancing risks and benefits and modulate their requirements accordingly (Canadian Institutes of Health Research, Social Sciences and Humanities Research Council, Natural Sciences and Engineering Research Council, Tri-Council Policy Statement: Ethical Conduct for Research Involving Humans, August 1998).

Areas for further improvement include: strengthening privacy expertise and education of REB members, particularly in light of rapidly evolving technology and emerging legislation; ensuring adequate resources for REBs to meet their mandate for continuing review, monitoring and periodic audits; increasing public accountability and transparency of REBs; and further exploring the relationship between REBs, privacy commissioners and other oversight bodies. Indeed, in some provinces, legislation requires that privacy commissioners or special privacy committees designated by law also approve (or at least be notified of) the proposed research or data linkage. As discussed above, various data custodians may, in addition, require review and approval by their own internal data access committees. The challenge, therefore, will be to align these various review and oversight bodies to ensure complementary forms of meaningful protection rather than imposing unnecessary and duplicative hurdles.

Conclusion

In summary, the case studies provide examples of how researchers who use secondary data attempt, in various ways, to comply with the spirit of the fair information principles contained in the CSA Model Code. They suggest that there is a need to further develop creative, effective and innovative mechanisms for protecting privacy and confidentiality of data, as well as a need for ongoing discussion and continual improvement of best practices. The CIHR case studies further provide concrete illustrations of the importance of interpreting and applying privacy laws and policies in a flexible, feasible and workable manner in order to permit the valuable social benefits of health research to continue. The case studies suggest that the health research community should work actively with privacy advocates, consumers and the general public to identify and implement strategies for balancing the right of individuals to have their personal information protected and their desire for improved health, more effective health services and a strengthened and sustainable health system.