Data resource basics

Canada’s primary health care system

The Canadian health care system provides universal, publicly funded health services to its population. Coverage includes visits to primary care, specialists, hospitalization and, in some provinces, universal drug coverage. Health care is funded and managed within each provincial and territorial jurisdiction, under the oversight of the Canada Health Act and with financial assistance from the federal government.

Primary care is typically the first point of contact in the health care system and is responsible for much of the prevention, diagnosis and management of chronic disease in the community. Among high-income countries, Canada has one of the lowest rates of electronic medical record (EMR) use among family physicians.1 Consequently, Canada largely relies on administrative databases (e.g. physician billing claims, hospital discharge abstracts) and surveys to answer questions about primary health care. Many countries with a longer history of EMR use, such as the UK and The Netherlands, have successfully harnessed these data for research and surveillance for many decades. In recent years, Canada has been rapidly catching up with its international peers in the use of EMR systems, and the opportunity to use these data now exists. The detailed clinical information found in EMR data can provide a more comprehensive and contextually relevant perspective of primary care activities in Canada, which is useful for policy decision making, disease surveillance, health services research and clinical quality improvement.

The Canadian Primary Care Sentinel Surveillance Network (CPCSSN)

CPCSSN is the first and only pan-Canadian primary care EMR database. Development began in 2005 to establish a data source to support primary care research and national chronic disease surveillance, though clinicians have increasingly found these data useful for improving patient care and practice efficiencies.

CPCSSN is organized as a ‘network of networks’, in which existing primary care practice-based research networks (PC-PBRN) across the country (Figure 1) unite to contribute de-identified patient data from their participating family physicians and nurse practitioners practising in full-service, community-based primary care clinics, who thus become CPCSSN ‘sentinels’. Recently, some PC-PBRNs have expanded recruitment to include community paediatricians, though currently this represents a small proportion of the data holdings. As data custodians, sentinels consent on behalf of their patients to participate in CPCSSN and all patients are therefore included in the database unless they have specifically opted out.2 Each province operates under this implied consent model with the exception of Quebec, where health legislation requires patients to give consent individually.3 Participating practices are given patient information posters and brochures to display within the clinic. These inform patients about the CPCSSN project and provide information about opting out.

Figure 1.

Location of primary care practice-based research networks (PC-PBRNs) contributing to CPCSSN.

CPCSSN has developed extraction processes to capture data from 15 different EMR systems across the country (Appendix 1, available as Supplementary data at IJE online). The routine extraction of de-identified patient data is performed remotely in order to minimize disruption to the participating clinics. Once the relevant data are extracted, they are transferred to a local server specific to each contributing PC-PBRN, where cleaning and coding algorithms are applied and the data are standardized into a common format. The data from each network are then merged into the national database. Both local and national databases are held on CPCSSN servers within the Centre for Advanced Computing4 at Queen’s University in Kingston, ON, Canada.
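The flow described above (extract, standardize within each network, then merge nationally) can be sketched in a few lines of Python. This is a minimal illustration only: the EMR names, source field names, common-format field names and network identifiers below are hypothetical and do not reproduce the actual CPCSSN schema or extraction code.

```python
# Hypothetical mappings from two EMR export layouts to one common format.
FIELD_MAPS = {
    "emr_a": {"pt_id": "patient_id", "dx": "diagnosis_text", "visit_dt": "encounter_date"},
    "emr_b": {"PatientKey": "patient_id", "Problem": "diagnosis_text", "EncounterDate": "encounter_date"},
}


def standardize(row, emr_name, network_id):
    """Rename EMR-specific fields to the common format and tag the source network."""
    mapping = FIELD_MAPS[emr_name]
    common = {target: row[source] for source, target in mapping.items()}
    common["network_id"] = network_id
    return common


def merge_networks(*network_tables):
    """Concatenate the standardized tables of each network into one national table."""
    national = []
    for table in network_tables:
        national.extend(table)
    return national


# Made-up, already de-identified rows from two different EMR systems.
rows_a = [{"pt_id": "A1", "dx": "HTN", "visit_dt": "2015-03-02"}]
rows_b = [{"PatientKey": "B7", "Problem": "Diabetes", "EncounterDate": "2015-04-11"}]

network_1 = [standardize(r, "emr_a", "network_1") for r in rows_a]
network_2 = [standardize(r, "emr_b", "network_2") for r in rows_b]
national = merge_networks(network_1, network_2)
```

Keeping the per-EMR mapping in a lookup table, rather than in the merge step, mirrors the idea that each network standardizes its own data before anything reaches the national database.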

Case definitions

The comprehensiveness and variability of EMR data necessitate case definitions designed specifically for this data source. Although most Canadian primary care EMR products use the International Classification of Diseases, Ninth Revision (ICD-9) to code diagnoses, free text is also frequently used, which adds to the complexity of identifying patients with a particular condition. CPCSSN created case definitions for eight chronic and neurological conditions (hypertension, diabetes, osteoarthritis, depression, chronic obstructive pulmonary disease, dementia, epilepsy and parkinsonism) using a variety of text words, ICD-9 codes and disease-specific criteria such as medications or laboratory results.5,6 The case definitions were applied to the CPCSSN database and validated by networks across the country using the original patient chart as the gold standard; most definitions demonstrated very good sensitivity and specificity.5,6 Definitions for additional conditions are under development.
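To make the validation step concrete, the following minimal Python sketch applies a purely illustrative case definition to a handful of toy records and computes sensitivity and specificity against a chart-review label. The record fields, the text word and the HbA1c threshold are assumptions chosen for illustration; they are not the published CPCSSN case definitions.

```python
from dataclasses import dataclass, field


@dataclass
class PatientRecord:
    """Toy patient record; field names are hypothetical, not the CPCSSN schema."""
    icd9_codes: list = field(default_factory=list)
    diagnosis_text: list = field(default_factory=list)
    hba1c_results: list = field(default_factory=list)  # percent
    chart_positive: bool = False  # gold standard from chart review


def meets_illustrative_definition(rec):
    """Illustrative rule only: an ICD-9 250.x code, a text word, or a high HbA1c."""
    has_code = any(code.startswith("250") for code in rec.icd9_codes)
    has_text = any("diabetes" in text.lower() for text in rec.diagnosis_text)
    has_lab = any(value >= 6.5 for value in rec.hba1c_results)  # assumed threshold
    return has_code or has_text or has_lab


def sensitivity_specificity(records):
    """Compare the algorithm's classification with the chart-review label."""
    tp = fp = tn = fn = 0
    for rec in records:
        predicted = meets_illustrative_definition(rec)
        if predicted and rec.chart_positive:
            tp += 1
        elif predicted:
            fp += 1
        elif rec.chart_positive:
            fn += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Made-up records: two true positives, one missed case, one false positive,
# and two true negatives.
toy_records = [
    PatientRecord(icd9_codes=["250.0"], chart_positive=True),
    PatientRecord(diagnosis_text=["Type 2 Diabetes"], chart_positive=True),
    PatientRecord(hba1c_results=[5.4], chart_positive=True),
    PatientRecord(hba1c_results=[7.1], chart_positive=False),
    PatientRecord(icd9_codes=["401"], chart_positive=False),
    PatientRecord(diagnosis_text=["knee pain"], chart_positive=False),
]

sens, spec = sensitivity_specificity(toy_records)
```

With these toy records, both measures equal 2/3; real validation work would use a much larger chart-review sample and report confidence intervals.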

Ethics

Each PC-PBRN has received research ethics approval for the CPCSSN project at their respective host universities. Any studies external to current CPCSSN approvals require separate research ethics approval from the researcher’s home university and an Institutional Data Sharing Agreement with Queen’s University.

Data security & confidentiality

CPCSSN has taken numerous measures to ensure the highest data security and adherence to privacy policies.2 All data are transferred securely using an encrypted Virtual Private Network (VPN) connection. Both regional and national CPCSSN data are stored in the Centre for Advanced Computing (CAC) at Queen’s University, a high-quality computing facility with multiple layers of physical and digital security.4

Each contributing PC-PBRN has conducted at least one Privacy Impact Assessment, involving extensive documentation of all project-related processes with an assigned risk score for each privacy requirement, the development of risk mitigation procedures, and a review of provincial health legislation and university-specific security policies as applicable to the project.

Ensuring patient and provider confidentiality is a core value of the CPCSSN network. The names and personal information of participating sentinels are never released without their explicit consent. Re-identification of patients cannot occur outside the clinic environment or without the consent of the sentinels and separate approval from the research ethics board, giving sentinels full control over their patients’ sensitive information.

Data resource area & population coverage

As of May 2016, there were nearly 1200 sentinels participating in CPCSSN from over 200 practice sites.7 Clinical and demographic information for more than 1.5 million patients is contained within the database, with approximately 700 000 patients recording at least one clinic visit in the previous 12 months. PC-PBRNs recruit practices from most provinces and territories across Canada, including British Columbia, Alberta, Manitoba, Ontario, Quebec, Newfoundland and Labrador, Nova Scotia and the Northwest Territories (Figure 1). CPCSSN continues to recruit sentinels and additional PC-PBRNs.

Representativeness

As a data source for surveillance and research, much consideration is given to ensuring that contributing patients, sentinels and clinics are representative of their respective Canadian base populations. As expected in a primary care sample,8,9 older adults and females are over-represented in the CPCSSN database as compared with the national population.10 As such, it is important to consider age and sex standardization and/or adjustment for all surveillance and research studies employing these data. It may also be likely that patients in the CPCSSN database have higher socioeconomic status than that of the general Canadian population.11 However, to date, the CPCSSN database does not contain the necessary information to allow users to properly adjust for systematic differences in socioeconomic status.
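As an illustration of why standardization matters here, the sketch below applies direct age and sex standardization to made-up counts in which older adults and females are over-represented relative to a reference population. The strata, counts and reference population are invented for the example and are not CPCSSN or census figures.

```python
def directly_standardized_rate(stratum_cases, stratum_patients, standard_population):
    """Direct standardization: weight each stratum-specific rate by the
    share of that stratum in a reference (standard) population."""
    total_standard = sum(standard_population.values())
    rate = 0.0
    for stratum, standard_count in standard_population.items():
        stratum_rate = stratum_cases[stratum] / stratum_patients[stratum]
        rate += stratum_rate * (standard_count / total_standard)
    return rate


# Made-up sample in which older adults and females are over-represented.
patients = {("F", "20-64"): 400, ("F", "65+"): 300, ("M", "20-64"): 200, ("M", "65+"): 100}
cases = {("F", "20-64"): 40, ("F", "65+"): 90, ("M", "20-64"): 30, ("M", "65+"): 40}

# Made-up reference population with a younger, sex-balanced structure.
standard = {("F", "20-64"): 400, ("F", "65+"): 100, ("M", "20-64"): 400, ("M", "65+"): 100}

crude_rate = sum(cases.values()) / sum(patients.values())
standardized_rate = directly_standardized_rate(cases, patients, standard)
```

In this toy example the crude prevalence is 0.20, whereas the standardized prevalence is 0.17, because the sample contains proportionally more older patients, who have higher stratum-specific rates.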

Participating physician and clinic characteristics were evaluated against the 2013 National Physician Survey, which collects information about location of practice (urban or rural), type of practice (academic or community-based), and age and sex of the provider.12 Sentinel physicians contributing to CPCSSN tend to be more often female and younger than physicians in general in Canada, though provincial variability is evident.10

Data collected

Patient data

Depending on when contributing clinics introduced their EMR system into their practice or when historical data were entered into the patient chart, CPCSSN data start at various time points, with some records going back to the early 1990s. Data from 2008 onwards are considered acceptably robust for analysis, as the increased uptake and more complete use of EMR systems over time has contributed to better data quality and volume in the later years.

Patient-level information is collected from almost the entire medical record, including non-identifiable demographics, current and historical diagnoses, medications, physical measures (such as blood pressure, height and weight), laboratory results, referrals, medical procedures, risk factors and physician billing submissions (if available). Data elements currently being captured are summarized in Box 1. At present, CPCSSN does not extract scanned documents because the data within these are not easily extracted and often include identifiable text that can be difficult to redact. Physician notes are also excluded for similar reasons. Due to the exceptionally large volume of laboratory and physical examination data, CPCSSN processes only those tests and values that are related to the eight conditions for which a case definition has been validated (Box 1). Additional laboratory and examination values will be added based on resources and priorities within the network.

CPCSSN has developed extensive cleaning algorithms to address data quality issues that can impede the use of EMR data. Many data elements are assigned to a cleaned field (including both ICD-9 codes and text words) alongside the original entry. The cleaning process takes into consideration the many different ways of entering diagnoses into the EMR, including abbreviations and misspelled words, and maps them to a diagnosis category based on ICD-9 classification headings.
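The idea of a cleaned field stored alongside the original entry can be sketched as follows. This is an assumption-laden illustration, not CPCSSN's algorithm: the abbreviation table, the term list and the use of fuzzy string matching (Python's standard `difflib`) to catch misspellings are all choices made for the example.

```python
import difflib

# Hypothetical lookup tables; the real CPCSSN mappings are not reproduced here.
ABBREVIATIONS = {
    "htn": "hypertension",
    "dm": "diabetes mellitus",
    "copd": "chronic obstructive pulmonary disease",
    "oa": "osteoarthritis",
}
ICD9_CATEGORY_BY_TERM = {
    "hypertension": "401",
    "diabetes mellitus": "250",
    "chronic obstructive pulmonary disease": "496",
    "osteoarthritis": "715",
}


def clean_diagnosis(raw_text, cutoff=0.8):
    """Return the original entry together with a cleaned ICD-9 category (or None)."""
    normalized = raw_text.strip().lower()
    term = ABBREVIATIONS.get(normalized, normalized)
    if term not in ICD9_CATEGORY_BY_TERM:
        # Fall back to fuzzy matching to catch misspelled entries.
        matches = difflib.get_close_matches(
            term, list(ICD9_CATEGORY_BY_TERM), n=1, cutoff=cutoff
        )
        term = matches[0] if matches else None
    return {"original": raw_text, "cleaned_icd9": ICD9_CATEGORY_BY_TERM.get(term)}


cleaned = [clean_diagnosis(entry) for entry in ["HTN", "hypertenson", "Osteoarthritus", "knee pain"]]
```

Entries that cannot be mapped keep their original text and receive no cleaned value, so nothing entered at the clinic is lost in the process.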

Box 1. CPCSSN Data Elements (as of May 2016)


 
Practice and provider information

Provider: Provider ID; Provider type; Provider start date; Provider end date; Provider birth year; Provider sex; Approximate panel size; Medical qualification obtained from Canadian school (Y/N); Year advanced medical qualification obtained
Practice site: Site ID; Location type; Site postal code; Urban/rural location; Academic practice (Y/N); Practice focus, if applicable
Electronic Medical Record system: EMR name; EMR version; EMR effective date; Extraction date

Patient information

Demographics: CPCSSN patient ID; Sex; Birth month and year; Ethnicity; Occupation; Highest education; Housing status; Language; Residence postal code; Patient EMR status; Year deceased
Encounter diagnoses (visit): Record creation date; Encounter date; Diagnosis text; Diagnosis ICD-9 code
Profile/health conditions: Record creation date; Onset date; Diagnosis text; Diagnosis ICD-9 code
CPCSSN condition (a): Condition present; Condition count; Disease case indicator; Index date
Medication: Start/stop date; Medication name; ATC code; DIN; Strength, dose, frequency
Laboratory results (b): Record creation date; Performed date; Laboratory name; Laboratory result
Physical examinations (c): Record creation date; Examination name; Performed date; Examination result
Risk factors: Record creation date; Start/end date; Risk factor name; Risk factor status
Allergies: Record creation date; Start date; Allergy name; Reaction type; Severity; Status
Vaccines: Record creation date; Given date; Expiry date; Vaccine name; ATC code; DIN; Route, dose, lot
Medical procedures: Record creation date; Procedure name
Referrals: Record creation date; Referral name
Physician billing: Record creation date; Service date; Diagnosis text; Diagnosis ICD-9 code

ATC, Anatomical Therapeutic Chemical classification; DIN, Drug Identification Number; Y/N, yes/no.

(a) For conditions with a validated CPCSSN case definition: hypertension, diabetes, osteoarthritis, depression, chronic obstructive pulmonary disease, dementia, parkinsonism, epilepsy.

(b) Laboratory results for: fasting glucose, glucose tolerance, hemoglobin A1C, high density lipoprotein (HDL), low density lipoprotein (LDL), total cholesterol, triglycerides, microalbumin, urine albumin creatinine ratio, international normalized ratio (INR), thyroid stimulating hormone (TSH), hemoglobin, creatinine, glomerular filtration rate (GFR).

(c) Physical examinations for: systolic & diastolic blood pressure, height, weight, waist circumference, waist to hip ratio, body mass index, peak expiratory flow rate.


 
Practice and provider data

Providers and practices are distinguished in the database by a non-identifying study ID. A select number of provider and clinic characteristics are available and linked to each patient record. These variables include type of provider (family physician, nurse practitioner or paediatrician), sentinel year of birth, sex, Canadian or foreign medical education, year of completed medical training, whether the clinic is academic or community based and whether in a rural or urban setting (Box 1).

Data anonymization

To meet the technical requirements for ensuring CPCSSN data repositories contain only anonymized information, CPCSSN employed a Research Privacy and Ethics Officer to review and evaluate a three-phase approach to de-identification. The first phase was the exclusion of all structured direct and indirect identifier data fields (i.e. name, address, provincial health number) in the source EMR data. This is conducted during the initial data extraction phase by either the PC-PBRN data manager or directly through the EMR vendor. Alternatively, the PC-PBRN data manager can enter into a confidentiality and security agreement with the clinic, which permits the data manager to access the most recent back-up file stored in an encrypted folder on the EMR server. Once the health data are stripped of any directly and indirectly identifying information from the structured data fields, only data from specific EMR fields are extracted.

The second phase of de-identification is the application of algorithms to replace patient identifiers that may appear in unstructured, free-text fields with random digits (for example, a phone number 316‐544‐8371 would become <tel#>). First and last names, as well as regular expression pattern matching for other identifiers, are suppressed and replaced by a series of X’s.
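A minimal sketch of this kind of free-text scrubbing is shown below, using Python's standard `re` module. The phone-number pattern, the function name and the list of known names are assumptions made for illustration; the actual CPCSSN algorithms and patterns are not reproduced here.

```python
import re

# Assumed pattern for ten-digit phone numbers written with separators.
PHONE_PATTERN = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")


def scrub_free_text(text, known_names):
    """Replace phone numbers with a placeholder tag and known names with X's."""
    text = PHONE_PATTERN.sub("<tel#>", text)
    for name in known_names:
        name_pattern = re.compile(r"\b" + re.escape(name) + r"\b", flags=re.IGNORECASE)
        # Replace each match with as many X's as the name has characters.
        text = name_pattern.sub(lambda match: "X" * len(match.group(0)), text)
    return text


scrubbed = scrub_free_text(
    "Call Jane Doe at 316-544-8371 re: results.",
    known_names=["Jane", "Doe"],
)
```

Here `scrubbed` is `Call XXXX XXX at <tel#> re: results.`. Pattern matching of this kind catches only identifiers that follow an anticipated format, which is one reason a further statistical risk assessment is still useful.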

The third and final phase of data anonymization is the application of the PARAT tool13 to further reduce the statistical risk of re-identification by combinations of different fields. If the tool detects higher or unacceptable levels of potential re-identification risk, it suppresses fields within certain records or reduces the level of detail within certain fields, such as dropping one or more characters in a postal code or rounding the year of birth to 5-year bands. The PARAT tool is applied before the release of CPCSSN data for ethics board-approved research.
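The two generalizations mentioned above (shortening a postal code and banding the year of birth) can be illustrated with a generic sketch. This is not the PARAT tool or its risk model: the function names, field names and the simple "smallest group size" check are assumptions used only to show how generalization makes combinations of fields less unique.

```python
from collections import Counter


def generalize(record, postal_chars=3, band_width=5):
    """Keep only the leading postal code characters and band the birth year."""
    postal = record["postal_code"].replace(" ", "")[:postal_chars]
    band_start = (record["birth_year"] // band_width) * band_width
    band = f"{band_start}-{band_start + band_width - 1}"
    return {**record, "postal_code": postal, "birth_year": band}


def smallest_group_size(records, quasi_identifiers):
    """Size of the rarest combination of quasi-identifier values."""
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(counts.values())


# Made-up records; every combination of the three fields is unique.
toy_records = [
    {"postal_code": "K7L 3N6", "birth_year": 1952, "sex": "F"},
    {"postal_code": "K7L 2V7", "birth_year": 1954, "sex": "F"},
    {"postal_code": "K7M 1A1", "birth_year": 1961, "sex": "M"},
    {"postal_code": "K7M 9Z9", "birth_year": 1963, "sex": "M"},
]
fields = ["postal_code", "birth_year", "sex"]

before = smallest_group_size(toy_records, fields)
after = smallest_group_size([generalize(r) for r in toy_records], fields)
```

In this toy data the smallest group grows from one record to two after generalization, at the cost of some geographic and age detail.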

Data resource use

CPCSSN provides a unique data source not currently available elsewhere in Canada. The national CPCSSN data have been used to answer a variety of relevant primary care research questions, with over 45 publications and 250 conference posters and presentations to date. Most notably, along with the publication of the validated CPCSSN case definitions,6 we have explored the epidemiology of these conditions in Canadian primary care settings.14–19 Additionally, CPCSSN has an emerging role in pharmacovigilance, where the data can be used to monitor adverse reactions in the post-marketing surveillance of medications prescribed in primary care in Canada.20

CPCSSN data are used at regional and local levels to answer research questions derived from these contexts, as well as at a provider and clinic level for quality improvement purposes. For instance, a family physician in rural southern Alberta was able to examine patient outcomes related to a lifestyle intervention developed specifically for obese patients, which was completed without cost to the clinic and assisted the clinic in empirically evaluating their programme.21 A primary care team in Toronto, ON, used CPCSSN data to create registries of patients with chronic diseases to assist with their clinical management and monitoring patient outcomes.22

CPCSSN data have also formed the basis of multiple student research projects at the graduate and undergraduate levels, including several important methodological advancements to which students have contributed. One such project explored the opportunity for using CPCSSN data as a source of national healthy weight data.23 Although the data have some important limitations that need to be carefully considered, the CPCSSN database contains more body mass index (BMI) records than all the objective BMI measurements collected by Statistics Canada health surveys over the past 20 years.23

More recently, the Public Health Agency of Canada has funded CPCSSN to further develop, implement and evaluate the CPCSSN Data Presentation Tool (DPT) in primary care clinics and departments of public health across the country.24 The CPCSSN-DPT is a customized web-based graphical interface that provides users with ready access to clinic- or jurisdiction-specific CPCSSN data after it has undergone processing and cleaning. It is anticipated that the CPCSSN-DPT will facilitate the adoption of public health methods of surveillance by primary care practitioners to enhance the monitoring, prevention and management of chronic disease across Canada.

A full list of publications and conference abstracts involving CPCSSN data can be found on the website at [www.cpcssn.ca] (English version) or [www.rcsssp.ca] (French version).

Future opportunities

Linkage studies combining CPCSSN data with those from other sources of health, social or census data are a significant opportunity for research that has recently been explored in several provinces. Primary care EMR data from CPCSSN linked with administrative sources provide a powerful method for following patients throughout the primary-tertiary health care system, contributing significant insight into important topics such as high system use and predicting hospitalizations, and including social determinants when reporting on chronic diseases. This type of linkage research has taken place in local networks (for instance, CPCSSN data linked with census data for studying socioeconomic status and obesity25 and linkages with deprivation scores to examine the socioeconomic influences on diabetes health26) and more broadly within several provinces, such as collaborations with the Institute of Clinical Evaluative Sciences (ICES) in Toronto, ON, the Manitoba Centre for Health Policy (MCHP) in Winnipeg, MB, and the Newfoundland and Labrador Centre for Health Information (NLCHI) in St. John’s, NL. Other CPCSSN networks are closely following suit, with linkage activities beginning to take place in their respective provinces.

Researchers across the country are currently using the CPCSSN data to develop EMR-specific definitions for pelvic floor disorders in women, childhood asthma, speech disorders in the elderly, chronic kidney disease, chronic pain and heart failure. Plans to include menopause, inflammatory bowel disease, multiple sclerosis and injury from falls are under way. Researchers with an interest in a specific condition are welcome to present proposals for new case definition development and validation using the CPCSSN dataset, including acute and communicable diseases.

Strengths and weaknesses

Strengths

A key strength of the CPCSSN database is the ability to follow patients over time and perform longitudinal, patient-level analyses using up-to-date clinical data. CPCSSN data are more comprehensive and fine-grained than traditional sources of Canadian primary care information, such as physician billing claims and other administrative datasets. Using CPCSSN data reduces subjective biases found in self-reported health surveys, since EMR data comprise physician-identified diagnoses, objective laboratory and examination results, and prescriptions issued to patients.

One distinctive feature of CPCSSN is the ongoing cleaning, coding and standardization of the data extracted from multiple EMR systems, which are often entered as free text at the clinic. Without these continuous cleaning processes, the data would not be useable for research and analysis.

Further, the national scope of the database is a major asset. In Canada, health is federally mandated and provincially administered, meaning that administrative data are housed within each province; this makes inter-provincial comparisons complex and time consuming. CPCSSN amalgamates data from different provinces into a single federated database in a privacy-sensitive way, allowing for an inexpensive and cohesive source of national primary care information.

Weaknesses

The challenge of using EMR data for purposes other than clinical care is to transform them so that they are fit for secondary use (i.e. research and surveillance). CPCSSN’s cleaning and coding algorithms have converted unstructured, unformatted clinical information into searchable data items, though there are still limitations imposed by the overall data quality inherent in most EMRs. Large blocks of narrative text are especially difficult to clean, and useable information is hard to parse out of them. Behavioural risk factors, such as smoking, diet, alcohol use and exercise, are often documented in this way.

Missing data are not uncommon; for instance, fewer than 3% of patients in the CPCSSN database have their ethnicity recorded. Ultimately, CPCSSN is only able to extract data that are entered into the EMR in some reasonably useable manner. Therefore, the data available are limited to: (i) those patients who attend community primary care clinics; (ii) data that are entered into the record; and (iii) data that can be directly accessed and, if necessary, cleaned and coded.

Additionally, the health system in Canada is organized so that patients may seek care from multiple physicians and other providers of their choosing. Although new primary care models emphasize patient rostering, whether formal or informal, some patients may exist in the CPCSSN database more than once. At present, CPCSSN is unable to identify duplicated patients within its anonymous database. Although this problem is believed to be small, it is a limitation of the current structure of the CPCSSN database, especially as expansion of the patient sample continues. It is also difficult to monitor attrition from patients leaving the clinic, moving to a new city or province or dying, as this is not always known or recorded in the EMR.

Last, the non-random enrolment of providers may impart a selection bias, as sentinels are likely early adopters of EMR systems with an interest in research and quality improvement. The CPCSSN database excludes providers using paper records, though this is quickly becoming an obsolete practice.

Data resource access

Researchers interested in conducting primary care research using CPCSSN data are encouraged to visit the website at [www.cpcssn.ca] (English version) or [www.rcsssp.ca] (French version). A CPCSSN Data Product Package is available by request, which contains the data dictionary, the entity relationship diagram (ERD) for the CPCSSN database, a short presentation summarizing the data holdings and its potential uses, and a sample dataset of 200 anonymized and delinked patient records. Qualified researchers are able to submit a letter of intent online, summarizing the proposed research project using CPCSSN data. After an internal review by CPCSSN’s Surveillance and Research Sub-Committee, applicants are invited to submit a full protocol and provide their letter of research ethics approval. A secure transfer of the CPCSSN dataset is initiated after the appropriate documentation is complete.

The CPCSSN data are available to university-affiliated (or equivalent) health researchers on a cost recovery basis, and a discounted rate is available for students. CPCSSN can provide additional services, such as data manipulation or analysis, for a nominal fee.

CPCSSN also welcomes collaborative research; please visit the website for information about current research projects, publications and CPCSSN co-investigators across the country. Any additional queries can be sent directly to CPCSSN at [research@cpcssn.org].

Profile in a Nutshell

  • The Canadian Primary Care Sentinel Surveillance Network (CPCSSN) is the country’s first and largest primary care electronic medical record (EMR) database available for health research, disease surveillance and practice improvement.

  • Since 2005, CPCSSN has recruited more than 1200 sentinel family physicians and nurse practitioners, who contribute de-identified EMR data for over 1.5 million patients. These data include patient demographics, diagnoses, medications, laboratory results, physical measurements, risk factors, medical procedures, billing submissions, referrals, allergies and vaccinations.

  • CPCSSN has created and validated robust case definitions specific to its EMR data, including conditions such as diabetes, hypertension, depression, chronic obstructive pulmonary disease, osteoarthritis, dementia, epilepsy and parkinsonism.

  • Extensive cleaning and processing algorithms are applied to the CPCSSN database to address quality issues that arise from free-text documentation and unstandardized data entry.

  • Many provinces are beginning to link CPCSSN data with other health and social data sources, such as hospitalizations, emergency department use, deprivation indices and population registries. This will enable novel use of EMR data in Canada and will contribute to a more comprehensive understanding of primary care activities nationally.

  • Information about the CPCSSN database, including a publication list, a sample data package, and guidelines for submitting an electronic letter of intent for potential research projects, is available at [www.cpcssn.ca] (English version) or [www.rcsssp.ca] (French version).

Supplementary Data

Supplementary data are available at IJE online.

Funding

The Public Health Agency of Canada initially provided substantial funding through a contribution agreement to begin the development of CPCSSN in 2005, and has continued to provide smaller, project-specific funding. Since 2015, additional funding support has been received from Canada Health Infoway, Health Canada, several provincial health ministries, universities and the private sector.

Conflict of interest: None.

References

1. Schoen C, Osborn R, Squires D et al. A survey of primary care doctors in ten countries shows progress in use of health information technology, less in other areas. Health Aff 2012;31:2805–16.
2. Kotecha JA, Manca D, Lambert-Lanning A et al. Ethics and privacy issues of a practice-based surveillance system. Can Fam Physician 2011;57:1165–73.
3. Gagnon J, Leggett JA, Richard C, Lussier MT. Facilitating informed consent for EMR research in Quebec. Can Fam Physician 2014;60:90.
4. Centre for Advanced Computing. 2016. http://cac.queensu.ca (8 May 2016, date last accessed).
5. Kadhim-Saleh A, Green M, Williamson T, Hunter D, Birtwhistle R. Validation of the diagnostic algorithms for 5 chronic conditions in the Canadian Primary Care Sentinel Surveillance Network (CPCSSN): a Kingston primary care practice-based research network (PCPBRN). J Am Board Fam Med 2013;26:159–67.
6. Williamson T, Green ME, Birtwhistle R et al. Validating the 8 CPCSSN case definitions for chronic disease surveillance in a primary care database of electronic health records. Ann Fam Med 2014;12:367–72.
7. Canadian Primary Care Sentinel Surveillance Network. 2016. www.cpcssn.ca (15 September 2015, date last accessed).
8. Nabalamba A, Millar WJ. Going to the doctor. Health Rep 2007;18:23–35.
9. Bertakis KD, Azari R, Helms LJ, Callahan EJ, Robbins JA. Gender differences in the utilization of health care services. J Fam Pract 2000;49:147–52.
10. Queenan J, Williamson T, Khan S et al. The cross sectional representativeness of patients and providers in the Canadian Primary Care Sentinel Surveillance Network (CPCSSN). CMAJ Open 2015;4:E28–32.
11. Birtwhistle R, Glazier R, Green M et al. Linking electronic medical records with administrative data: diabetic control and hospital emergency room utilization. North American Primary Care Research Group 43rd Annual Meeting; 24–28 October 2015, Cancun, Mexico, 2015.
12. National Physician Survey. 2013 National Physician Survey. 2016. http://nationalphysiciansurvey.ca/surveys/2013-survey/ (15 September 2015, date last accessed).
13. Privacy Analytics. Privacy Analytics. 2016. http://www.privacy-analytics.com (10 June 2016, date last accessed).
14. Greiver M, Williamson T, Barber D et al. Prevalence and epidemiology of diabetes in Canadian primary care practices: a report from the Canadian Primary Care Sentinel Surveillance Network. Can J Diabetes 2014;38:179–85.
15. Wong ST, Manca D, Barber D et al. The diagnosis of depression and its treatment in Canadian primary care practices: an epidemiological study. CMAJ Open 2014;2:E337–42.
16. Godwin M, Williamson T, Khan S et al. Prevalence and management of hypertension in primary care practices with electronic medical records: a report from the Canadian Primary Care Sentinel Surveillance Network. CMAJ Open 2015;3:E76–82.
17. Green ME, Natajaran N, O’Donnell DE et al. Chronic obstructive pulmonary disease in primary care: an epidemiologic cohort study from the Canadian Primary Care Sentinel Surveillance Network. CMAJ Open 2015;3:E15–22.
18. Birtwhistle R, Morkem R, Peat G et al. Prevalence and management of osteoarthritis in primary care: an epidemiologic cohort study from the Canadian Primary Care Sentinel Surveillance Network. CMAJ Open 2015;3:E270–75.
19. Drummond N, Birtwhistle R, Williamson T, Khan S, Garies S, Molnar F. Prevalence and management of dementia in primary care practices with electronic medical records: a report from the Canadian Primary Care Sentinel Surveillance Network. CMAJ Open 2016;4:E177–84.
20. Williamson T, Lévesque L, Morkem R, Birtwhistle R. CPCSSN’s role in improving pharmacovigilance. Can Fam Physician 2014;60:678.
21. Garies S, Irving A, Williamson T, Drummond N. Using EMR data to evaluate a physician-developed lifestyle plan for obese patients in primary care. Can Fam Physician 2015;61:e225–31.
22. Greiver M, Martin K, Aliarzadeh B, Lambert-Lanning A, Leggett J; Canadian Primary Care Sentinel Surveillance Network. Implementing a Scalable Tool for Quality Improvement in Primary Care: A Report for Canada Health Infoway. Toronto, ON: Canadian Primary Care Sentinel Surveillance Network, 2013.
23. Rigobon AV, Birtwhistle R, Khan S et al. Adult obesity prevalence in primary care users: an exploration using Canadian Primary Care Sentinel Surveillance Network (CPCSSN) data. Can J Public Health 2015;106:e283–89.
24. Queenan JA, Birtwhistle R, Drummond N. Supporting primary care public health functions. Can Fam Physician 2016;62:603.
25. Biro S, Williamson T, Leggett JA et al. Utility of linking primary care electronic medical records with Canadian census data to study the determinants of chronic disease: an example based on socioeconomic status and obesity. BMC Med Inform Decis Mak 2016;16:32.
26. Greiver M, Aliarzadeh B, Moineddin R, Meaney C, Ivers N. Diabetes screening with hemoglobin A1c prior to a change in guideline recommendations: prevalence and patient characteristics. BMC Fam Pract 2011;12:91–98.
