
Assessing methods for measurement of clinical outcomes and quality of care in primary care practices



To evaluate the appropriateness of potential data sources for the population of performance indicators for primary care (PC) practices.


This project was a cross-sectional study of seven multidisciplinary primary care teams in Ontario, Canada. Practices were recruited and 5-7 physicians per practice agreed to participate in the study. Twenty to thirty patients per participating physician were recruited sequentially as they presented for a visit. Data collection included patient, provider and practice surveys, chart abstraction and linkage to administrative data sets. Matched-pairs analysis was used to examine the differences in the observed results for each indicator obtained using multiple data sources.


Seven teams, 41 physicians, 94 associated staff and 998 patients were recruited. The survey response rate was 81% for patients, 93% for physicians and 83% for associated staff. Chart audits were successfully completed on all but one patient, and linkage to administrative data was successful for all subjects. There were significant differences between the data collection methods for many measures. No single method of data collection was best for all outcomes. For most measures of technical quality of care, chart audit was the most accurate method of data collection. Patient surveys were more accurate for immunizations, chronic disease advice/information dispensed, some general health promotion items and possibly for medication use. Administrative data appear useful for indicators including chronic disease diagnosis and osteoporosis/breast screening.


Multiple data collection methods are required for a comprehensive assessment of performance in primary care practices. The choice of which methods are best for any one particular study or quality improvement initiative requires careful consideration of the biases that each method might introduce into the results. In this study, both patients and providers were willing to participate in, and consent to, the collection and linkage of information from multiple sources that would be required for such assessments.



Background

Primary care, the first point of contact between patients and the health care system, includes disease prevention, health promotion, and chronic disease management. For over a decade, improving primary care has been a key element of health system reform around the world, intended to improve health outcomes and reduce the cost of health care [1-9]. A program of research to evaluate the impacts of such reforms is essential [10, 11]. The Canadian Institute for Health Information (CIHI) has proposed a comprehensive set of primary care outcome indicators [12], but has also noted significant gaps in the availability and quality of the data sources needed to populate these indicators [13]. There is a need for a set of validated, field-tested measurement tools to facilitate the population of these indicators [14]. These tools could also be used for quality improvement initiatives in primary care practices and to enhance accurate reporting of performance for pay-for-performance incentives [15]. Administrative data, electronic health records, chart audits, patient surveys, provider surveys, population-level surveys and direct observation have all been used for performance measurement in primary care, and could be used in the evaluation of primary care reform initiatives [14, 16-28].

The specific aim of this study was to attempt a comprehensive measure of the quality of primary care provided by community based, multidisciplinary, primary care practices and to assess which measurement methods were best for which elements of care. This paper focuses on indicators related to the CIHI objectives on the delivery of “comprehensive” and “high quality and safe primary health care services” [12]. Five of the proposed comprehensive care indicators and fifteen proposed quality of care indicators under this objective were selected for inclusion (Table 1). Our underlying hypothesis was that the method of measurement would have a significant impact on the results obtained for many outcomes. An improved understanding of the types, magnitude, and direction of bias or error introduced by particular methods was identified as being required to aid in the appropriate selection of methods for practice level performance reporting and quality improvement activities.

Table 1 Indicators and potential data sources


Methods

Setting and participants

This cross-sectional study was set in seven Family Health Teams (FHTs) in Eastern Ontario. FHTs are multidisciplinary group practices that share one of three non-fee-for-service funding mechanisms and receive support for information technology and for the integration of allied health providers into the practice. A convenience sample of 7 teams was approached and all agreed to participate. Within each FHT, 5-7 physicians were selected for participation as determined by local factors such as office locations and the division of larger sites into functional units. In-office sequential recruitment of 20-30 patients per physician was conducted over a 9-month period in 2008. Recruitment was conducted on regular clinic days when a mix of patients was being seen. Providers and practices were not informed about which patients had agreed to participate in the study. Each participating practice, physician and associated nursing or allied health professional (AHP) was asked to complete a survey, and the physicians were asked to consent to identification of their information in administrative data sets. Each patient was asked to complete a 2-part survey and to consent to a chart audit and linkage of their survey and chart audit data to administrative data sets. Physician, AHP, and practice survey data were collected for measurement of other outcomes (for example, team functioning) that were provided to the teams as a part of our feedback process and are reported elsewhere [29].

Data collection

Table 1 presents a summary of the CIHI indicators selected, their definitions and the data sources that could be used to populate them. In addition, data on specific guideline-related outcomes for the included chronic diseases were collected, such as detailed medication use, neuropathy screening in diabetes mellitus (DM), and control of hyperlipidemia.

Patient survey

This survey consisted of 2 sections. The first section was completed in the waiting room before the visit with the provider. It captured patient descriptive information and elicited patients’ experiences of the practice’s performance on measures covering a broad range of dimensions of health care service delivery. The second section was completed after the appointment with the provider and captured visit-specific information, including measures of activities related to health prevention, promotion and chronic disease management. Questions were derived from other validated survey tools, including the Primary Care Assessment Tool (PCAT-Adult), the Patient Perceptions of Patient-Centredness (PPPC), the Canadian Community Health Survey and the National Physician Survey [22-26].

Chart audit

The chart audit forms captured information for four thematic areas: 1) patient demographic information, 2) visit activities (including referrals, prescriptions and orders), 3) chart organization and 4) measures of performance of technical quality of care, including prevention and chronic disease management. Chart abstractors were all provided standardized training and detailed written support material, based on those used in another major study of primary care models in Ontario [21]. For items such as colorectal cancer screening, where multiple options for screening are available, information on each individual method was collected and a calculated value for completion was determined during data analysis. While all practices in this study had electronic health records (EHRs), we used a trained research associate to extract data rather than any automated data search strategy. This allowed a search of free-text areas of EHRs, supplementary paper records (which were still retained or in use in many practices), scanned image files of reports, and old charts. Questions arising during a chart abstraction were emailed centrally and resolved with input from the investigators. Independent re-abstraction of a sample of 60 charts was conducted to validate the data extraction process. Discrepancies were adjudicated by a third party, and all disagreements between chart abstractors were recorded and tallied. There was over 95% agreement between abstractors. The final data set was adjusted with the consensus value when discrepancies were noted.
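The derivation of a composite completion value for a multi-option manoeuvre such as colorectal cancer screening can be sketched as follows. This is an illustration only: the field names and the per-method validity windows are assumptions, not the study's actual abstraction form or algorithm.

```python
from datetime import date

# Derive a single "colorectal screening completed" flag from the dates
# collected for each individual screening method. The field names and
# the per-method validity windows below are illustrative assumptions.
SCREENING_WINDOWS_YEARS = {
    "fobt": 2,            # fecal occult blood test
    "sigmoidoscopy": 5,
    "colonoscopy": 10,
}

def colorectal_screening_complete(record, as_of):
    """True if any accepted method was completed within its window."""
    for method, years in SCREENING_WINDOWS_YEARS.items():
        done = record.get(method)  # date of most recent test, or None
        if done and (as_of - done).days <= years * 365.25:
            return True
    return False

# A patient with a recent FOBT counts as screened even with no colonoscopy.
patient = {"fobt": date(2007, 3, 1), "colonoscopy": None}
print(colorectal_screening_complete(patient, date(2008, 6, 1)))  # True
```

Computing the flag at analysis time, rather than during abstraction, keeps the raw per-method dates available for sensitivity analyses with different windows.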

Administrative billing data review

The Institute for Clinical Evaluative Sciences (ICES) is an agency supported by the government of Ontario yet operating at arm’s length, which is charged with the analysis of health sector administrative data in Ontario. Data holdings include a number of databases with information on providers and their practices, such as physician billings, drug utilization for publicly funded prescription medications, hospital inpatient and emergency room care, and census data. Consent to access related administrative billing data was obtained from all participating physicians and patients. For those measures for which administrative data were available, a performance score was determined using algorithms developed for earlier studies [18]. A data set with the results for each patient was created and linked to the chart and survey data using the ICES Key Number (IKN), a unique identifier assigned based on the patient’s Ontario Health Insurance Plan (OHIP) number. Provider ID numbers at ICES are based on College of Physicians and Surgeons of Ontario registration numbers and can be linked in a similar manner. All other study data were indexed using anonymous study ID numbers. Profiles of participating physicians and practices were created to allow comparison between study patients; all patients of study physicians (to assess bias introduced by the recruitment method for patients); and a larger sample of all patients of FHTs in comparable locations (all Ontario FHT patients with the exception of those in Toronto and rural areas). This final comparison was to see if our convenience sample, which included 5 academic FHTs and 2 community-based FHTs from two urban areas in Ontario outside of Toronto, was similar to other FHTs in comparable locations.
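The deterministic linkage described above amounts to a keyed join of the practice-collected records onto the administrative results. A minimal sketch of the idea, with hypothetical field names (the actual linkage was performed at ICES using the IKN):

```python
# Left-join administrative results onto study records by a shared key.
# The key plays the role of the IKN; all field names are hypothetical.
def link_on_key(study_rows, admin_rows, key="ikn"):
    admin_by_key = {row[key]: row for row in admin_rows}
    linked = []
    for row in study_rows:
        merged = dict(row)                             # copy the study record
        merged.update(admin_by_key.get(row[key], {}))  # add admin fields, if any
        linked.append(merged)
    return linked

study = [{"ikn": "A1", "survey_mammo": 1}, {"ikn": "A2", "survey_mammo": 0}]
admin = [{"ikn": "A1", "billing_mammo": 1}]
linked = link_on_key(study, admin)
# Patient A2 has no administrative record and keeps only survey fields.
```

A left join preserves every study patient even when no administrative record matches, which is what a per-patient matched-pairs comparison requires.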

Data management and analysis

Survey and audit data were entered into SPSS version 16. Statistical analysis of comparisons between survey and chart audit data was also conducted in SPSS. Any analysis involving comparisons to administrative data was conducted in SAS version 9.2. The results obtained from each data source were compared directly using both the proportion of concordant pairs and the kappa statistic. The clinical significance of the presence of discordant items, the likely reasons underlying discordant data, and the magnitude of the difference in results between methods were also considered [30, 31]. To facilitate analysis we made minor modifications to some of the definitions established by CIHI (for example, looking at the individual components of composite measures and modifying time frames to match standard administrative data analysis algorithms). We did not assume that the chart was the “gold standard” for each item, instead using clinical experience and knowledge of the limitations of each data source to consider why the data sources did not always agree.
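The two agreement measures used above can be computed as sketched below; the toy data are invented for illustration, and the study's actual analyses were run in SPSS and SAS, not with this code.

```python
# Percent agreement (proportion of concordant pairs) and Cohen's kappa
# for one indicator measured on the same patients by two data sources.
# Each pair is (result_from_source_1, result_from_source_2), coded 0/1.

def percent_agreement(pairs):
    return sum(1 for a, b in pairs if a == b) / len(pairs)

def cohens_kappa(pairs):
    n = len(pairs)
    p_o = percent_agreement(pairs)             # observed agreement
    p1 = sum(a for a, _ in pairs) / n          # "yes" rate, source 1
    p2 = sum(b for _, b in pairs) / n          # "yes" rate, source 2
    p_e = p1 * p2 + (1 - p1) * (1 - p2)        # agreement expected by chance
    return (p_o - p_e) / (1 - p_e)

# Invented example: chart audit vs patient survey for one indicator.
pairs = [(1, 1)] * 60 + [(0, 0)] * 20 + [(1, 0)] * 12 + [(0, 1)] * 8
print(round(percent_agreement(pairs), 2))  # 0.8
print(round(cohens_kappa(pairs), 2))       # 0.52
```

Reporting both statistics matters because, as discussed in the Results, they can diverge substantially for the same indicator.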


Ethics review and approval was obtained from the Research Ethics Boards at Queen’s University, The University of Ottawa, The Ottawa Hospital, and Sunnybrook Hospital (ICES). All study procedures underwent privacy review and approval at ICES.


Results

Seven teams, 41 physicians, 94 associated staff and 998 patients were successfully recruited. Results from one site that kept detailed logs revealed an overall patient participation rate of 90%. Fifteen to twenty patients per day were recruited by a team of 2 research assistants. Completed surveys were returned by 813 patients (81%), 38 physicians (93%) and 77 associated staff (83%). For the items in this paper for which survey data are presented, valid responses were obtained from most subjects (86%-99%). Chart audits were successfully completed on all but one patient. There was over 95% agreement between chart abstractors on the sample of charts selected for validation. Linkage to administrative data was successful for 100% of participating patients and physicians. The results of the physician, AHP, and practice surveys were not used to determine patient-level outcomes and will be reported elsewhere. Table 2 outlines the socio-demographic characteristics of the study patient sample in relation to other patients from the same practices and patients from all the FHTs in Ontario. The table shows that the study participants included more female, older, and sicker patients than those found in the same practices and in other Ontario FHTs.

Table 2 Socio-demographics of study sample

Tables 3, 4, 5, 6 and 7 present the results for preventive health interventions (Table 3), health promotion (Table 4), and chronic disease status and management (Tables 5, 6 and 7). For measurement of preventive health activities, Table 3 shows that administrative data had both over 80% agreement and kappa statistics >.4 for mammography and BMD when compared to chart abstraction data. The kappa statistics were between .35 and .45 and the levels of agreement lower (70-75%) for colorectal and cervical cancer screening, with a tendency for administrative data to underestimate rates of completion, while for immunization against influenza there was <60% agreement and a kappa of .25. The patient survey showed >75% agreement with the chart abstraction for mammography, BMD, cervical screening and clinical breast exam, but the kappa statistic exceeded .4 only for BMD. There was less than 75% agreement for influenza immunization and colorectal screening, with kappa values under .21 for both.

Table 3 Screening and preventive care (Mammography /PAP smear/ Influenza vaccination/Bone mineral density/Colorectal cancer screening) - Table of administrative data vs chart abstraction vs patient survey
Table 4 Health promotion (Diet/exercise/smoking status) - Table of chart abstraction vs patient survey
Table 5 Chronic disease status/management - Comparison of chart abstraction with both administrative data and the patient survey
Table 6 Comparison of chronic disease management (CDM) between administrative data and patient survey
Table 7 Comparison of chronic disease management (CDM) between chart abstraction and patient survey

Table 4 outlines the agreement observed for health promotion activities between the patient survey replies and the information found in the chart only, as this information is not available through billing records. The only item with concordance over 75% was current smoking status. Past smoking status showed less than 50% agreement, and provision of advice on diet and exercise showed 70-75% concordance, with kappa levels all <.2. The levels of agreement between the administrative and chart data, as well as between the patient survey and the chart, for the presence of the index conditions of interest and the use of two broad categories of medications are presented in Table 5. Table 6 supplements this with a comparison between the survey data and administrative data for medication use in patients over 65. There was >85% agreement on the presence of the diagnoses. For medication use, concordance between the chart data and patient surveys was >75%, while the concordance between administrative and chart data was >75% for antihypertensives and only 57% for anti-lipidemics, with much higher rates of lipid-lowering agent use noted in the charts. In Table 6 the level of agreement between administrative data and the patient survey was between 70 and 75%, slightly lower than the level of agreement between the chart and the patient. The kappa statistic was higher for antihypertensives than for anti-lipidemic drugs.

Table 7 presents the comparison for a number of disease-specific recommendations for the chronic diseases we examined, including a more detailed review of medication usage. Documentation of advice or resources provided had a level of agreement of <70%, with fewer than half of the events reported by patients being noted in the chart and a kappa <.4. For relatively less common but important key events such as MI and hospitalization, while agreement was >90% overall, there was poor agreement between the chart and patients who reported having had these events, with less than half being identified in the charts. For the remaining process-of-care measures there was a mixed result, with levels of agreement >75% for FBS, lipid profiles, control of hyperlipidemia, A1C and foot exams, and <75% for the remainder. Kappa values were <.4 for all process-of-care outcomes other than medications. The more detailed medication profiles showed levels of agreement >85%, with kappas >.6, the exceptions being ASA and statins. ASA is an over-the-counter drug and, despite a 78% agreement rate and a kappa >.4, was recorded in the chart for only about 2/3 of patients who reported using it. In contrast, statins had only 54% agreement and a kappa <.2, due to patients reporting not taking them despite the drug being noted as active in their chart.


Discussion

There are relatively few studies in primary care performance measurement that have examined the differences in results obtained through different methods of measurement, especially involving administrative data [17, 20, 32, 33]. This study examined the validity of data on primary care performance indicators obtained by various methods, as well as the acceptability, feasibility and potential biases of using a practice-based recruitment approach for the collection of linked data from a range of different methods. Our ability to collect data on multiple measures by audit, survey, and use of administrative data, with good participation rates, high agreement rates between chart assessors, high rates of completion for surveys, and little objection to data linkage, shows that the collection of linked data from multiple sources is both acceptable and feasible. This study used sequential recruitment of patients presenting for care in participating practices [22]. Participants had 50-100% higher rates of chronic disease and multi-morbidity than the practice population, children were under-represented, and women and the elderly were over-represented. Studies seeking a representative sample of all practice patients may wish to use other methods of recruitment. Studies on the care experiences of chronic disease patients, workload or work process issues, or the daily experiences of practitioners may find this approach both appropriate and efficient. As our study was focused on the concordance between data collection methods, this design was unlikely to impact our results.

Significant differences in performance were found using the different data collection methods for many indicators. No single data collection method emerged as consistently the most valid across all performance indicators. Only a limited number of indicators had kappa statistics >.4 (moderate or better agreement) [30]; however, in some cases these were much lower than the degree of concordance estimated by the proportion of concordant pairs. When interpreting our data, both kappa and the degree of concordance should be considered [31]. With the increasing use of administrative data for primary care performance reporting [19, 34, 35], remuneration and funding decisions [28], disease registries [27] and public health reporting [36], the comparison of administrative data results to multiple different data sources is especially important for guiding the use and interpretation of administrative data results in future research, policy, and planning [27, 37, 38].
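The way kappa can lag well behind raw concordance is a known property of the statistic when one response dominates: the more skewed the prevalence, the higher the agreement expected by chance, and the smaller the room for kappa. The hypothetical counts below (not study data) illustrate the effect:

```python
# Two hypothetical indicators with identical 90% raw agreement but very
# different kappas. Counts are invented: both_yes/both_no are concordant
# pairs; dis_a/dis_b are the two kinds of discordant pairs.
def kappa_and_agreement(both_yes, both_no, dis_a, dis_b):
    n = both_yes + both_no + dis_a + dis_b
    p_o = (both_yes + both_no) / n                 # observed agreement
    p1 = (both_yes + dis_a) / n                    # "yes" rate, source 1
    p2 = (both_yes + dis_b) / n                    # "yes" rate, source 2
    p_e = p1 * p2 + (1 - p1) * (1 - p2)            # chance agreement
    return p_o, (p_o - p_e) / (1 - p_e)

# Balanced prevalence: 90% agreement yields a strong kappa.
print(kappa_and_agreement(45, 45, 5, 5))
# Skewed prevalence (nearly everyone "yes"): the same 90% agreement,
# but kappa falls sharply because chance agreement is already high.
print(kappa_and_agreement(88, 2, 5, 5))
```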

Good agreement across measurement methods was seen for preventive health imaging such as mammograms and bone mineral density scans. Previous research has found similar mammogram screening rates using patient surveys and chart audits [17, 32]. This study adds the use of administrative data to this comparison and finds it to be a reasonable alternative. Notable areas of discordance between administrative data and other methods of data collection included Pap smears, colon cancer screening and influenza vaccination, all of which had much lower rates noted in the administrative data. Manoeuvres like Pap tests and colon cancer screening may be completed in contexts other than the delivery of primary care. Where services are delivered in settings not captured by routine administrative data sources, and recourse to those settings is widespread, administrative data may be less accurate in capturing the care received by the patient than other methods. Thus, the local context of care may significantly alter the validity of administrative data results for primary care performance.

Important differences between patient survey reports and chart audits were also found. Pap smears and influenza vaccination were reported at higher rates in the surveys. Patient over-reporting of Pap test rates has been previously reported [32]. Influenza vaccination in Ontario often occurs in public flu clinics and is therefore not always noted in the chart, so higher survey rates would be expected for this outcome. Patients who reported having received diet and exercise advice had a notation to this effect in their chart over 90% of the time. The more significant difference was in the opposite direction, with 25-30% of patients whose charts indicated this was discussed failing to report it in the survey. This finding contradicts previous research comparing chart audit and patient survey to actual observation of patient visits, which found patients reported receiving advice more frequently than was noted in the chart [17]. This could be due in part to the multidisciplinary nature of these family health teams, where allied health providers would be contributing to the delivery of these services and their documentation in the patient record. However, this finding may also reflect the difficulty of communicating health promotion messages in ways that are memorable or retained by patients.

For most medications there was good agreement between chart and survey responses, and moderate agreement between survey and administrative data. There were two exceptions. For statins, agreement between the chart and survey was poor, while there was much better agreement with administrative data. These discrepancies may represent issues of medication adherence, or poor understanding or recognition of a medication. Aspirin, which is available over the counter (OTC), was reported more frequently by patients than noted in the charts, likely representing lack of documentation in the chart. These findings suggest that the patient survey may be the more accurate source for medications used; however, this source is not without its own potential biases. In terms of clinical outcomes, patients were more likely to report that their blood pressure and lipids were at target than was reported in the chart. These findings highlight the importance of clear communication between patients, their physicians and other providers on issues such as medication use and management targets.

Recent efforts to assess the quality of chronic disease management (CDM) have relied on administrative data, EHR audits and population-based surveys to identify the patient population with the condition of interest [36, 38]. For identifying the population with diabetes, there was strong agreement across measurement methods. For hypertension, the level of agreement was still good, but not as strong as for diabetes. For estimation of prevalence in the sample the margin of difference is fairly small (39% vs 36%), but there are large numbers of discordant pairs. In these situations, administrative data identify more cases than the charts, perhaps a reflection of care that occurs in specialist or hospital settings and is not captured in the primary care chart. For the purposes of registry generation, either method would be appropriate, with the cost of administrative data being much lower. Performance measurement in primary care increasingly forms the basis of quality improvement investments, performance bonuses, and population health planning and reporting [15]. Others have expressed concerns about the unintended consequences of pay-for-performance systems and about the impact of performance measurement more generally on good clinical practice [38, 39]. Our results indicate that careful consideration needs to be given to the methods used for assessing performance if these concerns are to be minimized. Future performance reporting should account for potential bias in results based on the data collection method and indicators measured.


Limitations

The study was carried out in only one region of Ontario, Canada, using a convenience sample of practices that included a large number of academic teaching practices. Many of these practices used hospital labs, reducing the ability of administrative data to capture those tests. The structure and quality of records may impact the results of the chart audit and may not be reflective of chart content in other locations. Ontario has extensive administrative data sets that have been cleaned for use in health services research, along with an extensive program of development of programming expertise and algorithms, which may not be available in all jurisdictions. Patients in Canada may also have different attitudes towards data privacy and linkage than those in other countries. We applied rules for eligibility for manoeuvres based on age, sex and the presence of index conditions, and also assessed for any exclusion criteria within existing guidelines, but did not attempt to determine reasons for non-completion (i.e., patient refusal, deliberate deviation from guidelines due to co-morbidities, etc.). In addition, due to concerns with patient recall about the timing of specific manoeuvres, we elected to use an “ever received” format for our survey questions. The potential bias introduced would increase the degree of discrepancy observed and may partially explain the lower degree of agreement seen between the patient survey and chart abstraction for colorectal screening, which for most patients in Ontario is conducted with an annual FOBT. In any study there is the possibility of a Hawthorne effect; however, in this study it is unlikely, as providers were not aware of which patients were participating and data collection was retrospective.


Conclusions

For many measures of technical quality of care, chart audit remains the most accurate method of data collection. Patient surveys are required for more accurate assessment of indicators such as immunizations, chronic disease advice/information dispensed, medication use, and some general health promotion items. Consecutive sampling of patients in the waiting room captures a population that is sicker, older and more likely to be female compared to the practice population. Administrative data appear useful for a number of indicators, including several aspects of screening and chronic disease diagnosis. Administrative data are much less costly than other methods of data collection and can cover entire populations. Recruitment rates of physicians and patients remained high even when permission was requested to link the data collected at the practice to the provincial health administrative databases. A comprehensive understanding of primary care performance will require the use of multiple data collection methods for the foreseeable future. The choice of which methods are best for any one particular study or quality improvement initiative requires careful consideration of the biases that each method might introduce into the results. Future studies should also consider assessing the reasons underlying divergence between decisions made at the individual patient level and recommended guidelines.

Prior Presentations

Portions of this manuscript have been presented at the Canadian Institutes of Health Research Primary Care Summit, Toronto, Jan 18, 2010; the Canadian Association for Health Services and Policy Research Annual Meeting, Calgary, May 12, 2009; and the North American Primary Care Research Group, San Juan, Puerto Rico, Nov 16, 2008.


Funding

This project was funded by a grant from the Ontario Ministry of Health and Long-Term Care. ICES and the Centre for Health Services and Policy Research are also supported in part by the Ontario Ministry of Health and Long-Term Care. The opinions, results and conclusions reported in this paper are those of the authors and are independent of the funding sources. No endorsement by ICES or the Ontario MOHLTC is intended or should be inferred.



Abbreviations

FHT: Family Health Team
FFS: Fee for Service
CIHI: Canadian Institute for Health Information
ICES: Institute for Clinical Evaluative Sciences
BMD: Bone Mineral Density
FOBT: Fecal Occult Blood Test
PAP: Papanicolaou Smear


  1. 1.

    Romanow RJ: Primary Health Care and Prevention. Building on Values: The Future of Health Care in Canada. 2002, Commission on the Future of Health Care in Canada, Saskatoon, 115-35.

    Google Scholar 

  2. 2.

    United States Congress: The Affordable Care Act: HR 3962. To provide affordable quality health care for all Americans and reduce the growth in health care spending and for other purposes. 2010, US Government Printing Office, Washington, DC, Accessed online at:

    Google Scholar 

  3. 3.

    Department of Health: Excellence and Equity: Liberating the NHS. London, England, Crown Copyright. 2010, UK: The Stationary Company Limited on behalf of the Controller of Her Majesty’s Stationary Office, London, Accessed online at:

    Google Scholar 

  4. 4.

    White KL, Williams TF, Greenberg BG: The Ecology of Medical Care. NEJM. 1961, 265: 885-892. 10.1056/NEJM196111022651805.

    CAS  Article  PubMed  Google Scholar 

  5. 5.

    Freyer GE, Green LA, Dovey SM, Yawn BP, Phillips RL, Lanier D: Variations in the Ecology of Medical Care. Ann Fam Med. 2003, 1: 81-89. 10.1370/afm.52.

    Article  Google Scholar 

  6. 6.

    Starfield B: Is Primary Care Essential?. Lancet. 1994, 344 (8930): 1129-1133. 10.1016/S0140-6736(94)90634-3.

    CAS  Article  PubMed  Google Scholar 

  7. 7.

    Starfield B, Shi L: Policy Relevant Determinants of Health: An International Perspective. Health Policy. 2002, 60 (3): 201-218. 10.1016/S0168-8510(01)00208-1.

    Article  PubMed  Google Scholar 

  8. 8.

    Macinko J, Starfield B, Shi L: The Contribution of Primary Care Systems to Health Outcomes Within the Organization for Economic Development (OECD) Countries, 1970-1998. Health Services Research. 2003, 38 (3): 831-865. 10.1111/1475-6773.00149.

    Article  PubMed  PubMed Central  Google Scholar 

  9. 9.

    Shortt SED: Primary Care Reform: Is There a Clinical Rationale?. Implementing Primary Care Reform: Barriers and Facilitators. Edited by: Wilson R, Shortt SED, Dorland J. 2004, Kingston: McGill-Queen’s University Press, 11-24.

    Google Scholar 

  10. 10.

    Watson DE, Broemeling AM, Wong ST: A Results Based Logic Model for Pirmary Healthcare: A Conceptual Foundation for Population-Based Information Systems. Healthcare Policy. 2009, 5 (sp): 33-36.

    PubMed  PubMed Central  Google Scholar 

  11.

    Wong ST, Yin D, Bhattacharyya O, Wang B, Liqun L, Chen B: Developing a Performance Measurement Framework and Indicators for Community Health Service Facilities in Urban China. BMC Family Practice. 2010, 11: 91-10.1186/1471-2296-11-91.

  12.

    CIHI: Pan-Canadian Primary Health Care Indicators: Report 1 Volumes 1 and 2. 2006, Ottawa: Canadian Institute for Health Information

  13.

    CIHI: Pan-Canadian Primary Health Care Indicators: Report 2. Enhancing the Primary Health Care Data Collection Infrastructure in Canada. 2006, Ottawa: Canadian Institute for Health Information

  14.

    Broemeling AM, Watson DE, Black C, Wong ST: Measuring the Performance of Primary Healthcare: Existing Capacity and Potential Information to Support Population-Based Analyses. Healthcare Policy. 2009, 5 (sp): 47-64.

  15.

    Johnston S, Dahrouge S, Hogg W: Gauging to gain: Primary care performance measurement. Can Fam Phys. 2008, 54: 1215-17.

  16.

    Bonomi AE, Wagner EH, Glasgow RE, VonKorff M: Assessment of chronic illness care (ACIC): A practical tool to measure quality improvement. HSR. 2002, 37 (3): 791-820.

  17.

    Stange KC, Zyzanski SJ, Smith TF, Kelly R, Langa D, Flocke SA, Jaen C: How Valid are Medical Records and Patient Questionnaires for Physician Profiling and Health Services Research?: A Comparison with Direct Observation of Patient Visits. Med Care. 1998, 36 (6): 851-867. 10.1097/00005650-199806000-00009.

  18.

    Jaakkimainen L, Klein-Geltink JE, Guttmann A, Barnsley J, Zagorski BM, Kopp A, Saskin R, Leong A, Wang L: Indicators of Primary Care Based on Administrative Data. Primary Care in Ontario. Edited by: Jaakkimainen L, Upshur R, Klein-Geltink J, Leong A, Maaten S, Schultz S, Wang L. 2006, An ICES Atlas, ICES, Toronto, November

  19.

    Katz A, Soodeen R-A, Bogdanovic B, De Coster C, Chateau D: Can the Quality of Care in Family Practice be Measured Using Administrative Data?. HSR. 2006, 41 (6): 2238-2254.

  20.

    Gerbert B, Stone G, Stulbarg M, Gullion DS, Greenfield S: Agreement Among Physician Assessment Methods: Searching for the Truth Among Fallible Methods. Med Care. 1988, 26 (6): 519-535. 10.1097/00005650-198806000-00001.

  21.

    Hogg W, Gyorfi-Dyke E, Johnston S, Dahrouge S, Liddy C, Russell G, Kristjansson E: Chart audits in practice-based primary care research: A user’s guide. Can Fam Phys. 2010, 56: 495.

  22.

    Hogg W, Johnston S, Russell G, Dahrouge S, Gyorfi-Dyke E, Kristjansson E: Conducting waiting room surveys in practice-based primary care research: A user’s guide. Can Fam Phys. 2010, 56: 1375.

  23.

    Shi L, Starfield B, Jiahong N: Validating the Adult Primary Care Assessment Tool. J Fam Pract. 2001, 51: 161. Full text accessed online at

  24.

    Stewart M, Belle Brown J, Donner A, McWhinney IR, Oates J, Weston WW: The Impact of Patient-Centered Care on Outcomes. J Fam Pract. 2000, 49 (9). Full text accessed online at

  25.

    Canadian Community Health Survey: Accessed online at

  26.

    National Physician Survey Long Form: 2007, Accessed online at

  27.

    Manuel DG, Rosella LC, Stukel TA: Importance of accurately identifying chronic disease in studies using electronic health records. BMJ. 2010, 341: c4226-10.1136/bmj.c4226.

  28.

    Cancer Care Ontario: ColonCancerCheck Screening Activity Reports for Family Physicians. 4, Accessed online at:

  29.

    Johnston S, Green ME, Thille P, Savage C, Roberts L, Russell G, Hogg W: Performance feedback in interdisciplinary primary care teams. BMC Family Practice. 2011, 12: 14-10.1186/1471-2296-12-14.

  30.

    Altman DG: Practical Statistics for Medical Research. 1991, London England: Chapman and Hall

  31.

    Feinstein AR, Cicchetti DV: High agreement but low kappa: I. The problems of two paradoxes. J Clin Epidemiol. 1990, 43 (6): 543-549. 10.1016/0895-4356(90)90158-L.

  32.

    Conroy MB, Majchrzak NE, Silverman CB, Chang Y, Regan S, Schneider LI, Rigotti NA: Measuring provider adherence to tobacco treatment guidelines: A comparison of electronic medical record review, patient survey, and provider survey. Nicotine and Tobacco Research. 2004, 7 (S1): S35-S43.

  33.

    Montano DE, Phillips WR: Cancer Screening by Primary Care Physicians: A Comparison of Rates Obtained from Physician Self-Report, Patient Survey, and Chart Audit. Am J Public Health. 1995, 85: 795-800. 10.2105/AJPH.85.6.795.

  34.

    Roos LL, Gupta S, Soodeen RA, Jebamani L: Data quality in an information rich environment: Canada as an example. Canadian Journal on Aging. 2005, 24: 153-10.1353/cja.2005.0055.

  35.

    Jaakkimainen L, Upshur R, Klein-Geltink J, Leong A, Maaten S, Schultz S, Wang L: Primary Care in Ontario. 2006, Toronto: An ICES Atlas, ICES, November

  36.

    Hux J, Ivis F, Flintoft V, Bica A: Diabetes in Ontario: Determination of prevalence and incidence using a validated administrative algorithm. Diabetes Care. 2002, 25 (3): 512-6. 10.2337/diacare.25.3.512.

  37.

    Powell AE, Davies H, Thomson R: Using routine comparative data to assess the quality of health care: understanding and avoiding common pitfalls. Quality and Safety in Health Care. 2003, 12 (2): 122-8. 10.1136/qhc.12.2.122.

  38.

    McDonald R, Roland M: Pay for Performance in Primary Care in England and California: Comparison of Unintended Consequences. Ann Fam Med. 2009, 7: 121-127. 10.1370/afm.946.

  39.

    Werner RM, Asch DA: Clinical Concerns About Clinical Performance Measurement. Ann Fam Med. 2007, 5: 159-163. 10.1370/afm.645.

Pre-publication history

  1. The pre-publication history for this paper can be accessed here:



Acknowledgements

We would like to acknowledge the following individuals for their contributions to this study: Walter Rosser, Alex Kopp, Marlo Whitehead, Irene Armstrong, Lynn Roberts, Tiina Liinimaa, Julie Klein-Geltink, Susan Efler, Anita Jessup, and Jennifer Biggs.

Author information



Corresponding author

Correspondence to Michael E Green.

Additional information

Competing interests

None of the authors have any conflicts of interest to report.

Authors’ contributions

MEG and WH conceived the study. All authors contributed to the study design. MEG, WH, SJ and CS were responsible for primary data collection. RJL, RHG, MEG and CS were responsible for the administrative data analysis. MEG and CS directed the initial data analysis. All authors contributed to decisions on the interpretation of results and to the drafting of the manuscript. All authors approved the final version of the manuscript prior to submission.

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


About this article

Cite this article

Green, M.E., Hogg, W., Savage, C. et al. Assessing methods for measurement of clinical outcomes and quality of care in primary care practices. BMC Health Serv Res 12, 214 (2012).



Keywords

  • Performance measurement
  • Primary care
  • Quality of care
  • Evaluation