Cohort study

Editor-In-Chief: C. Michael Gibson, M.S., M.D. [1]


A cohort study or panel study is a form of longitudinal study used in medicine and social science. It is one type of study design and should be compared with a cross-sectional study.

A cohort is a group of people who share a common characteristic or experience within a defined period (e.g., are born, leave school, lose their job, are exposed to a drug or a vaccine, etc.). Thus a group of people who were born on a day or in a particular period, say 1948, form a birth cohort. The comparison group may be the general population from which the cohort is drawn, or it may be another cohort of persons thought to have had little or no exposure to the substance under investigation, but otherwise similar. Alternatively, subgroups within the cohort may be compared with each other.


In medicine, a cohort study is often undertaken to obtain evidence to try to refute the existence of a suspected association between cause and disease; failure to refute a hypothesis strengthens confidence in it. Crucially, the cohort is identified before the appearance of the disease under investigation. The study groups, so defined, are observed over a period of time to determine the incidence of the studied disease among them. The cohort cannot therefore be defined as a group of people who already have the disease. Distinguishing causality from mere correlation cannot usually be done with results of a cohort study alone.

The advantage of cohort study data is the longitudinal observation of the individual through time, and the collection of data at regular intervals, so recall error is reduced. However, cohort studies are expensive to conduct, are sensitive to attrition and take a long time to generate useful data.

Some cohort studies track groups of children from their birth, and record a wide range of information (exposures) about them. The value of a cohort study depends on the researchers' capacity to stay in touch with all members of the cohort. Some of these studies have continued for decades.


An example of an epidemiologic question that can be answered by the use of a cohort study is: does exposure to X (say, smoking) correlate with outcome Y (say, lung cancer)? Such a study would recruit a group of smokers (the exposed group) and a group of non-smokers (the unexposed group), follow them for a set period of time, and note differences in the incidence of lung cancer between the groups at the end of this time. The groups are matched on many other variables, such as economic status and other health status, so that the variable being assessed, the independent variable (in this case, smoking), can be isolated as the cause of the dependent variable (in this case, lung cancer).

In this example, a statistically significant increase in the incidence of lung cancer in the smoking group as compared to the non-smoking group is evidence in favor of the hypothesis. However, rare outcomes, such as lung cancer, are generally not studied with the use of a cohort study, but are rather studied with the use of a case-control study.
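
As a rough illustration of the comparison described above, the incidence in each group and the resulting relative risk can be computed from a 2x2 table. The counts below are hypothetical, and the confidence interval uses the common log-based (Katz) approximation, which is an assumption of this sketch rather than something specified in the text.

```python
import math

def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Return incidence in each group, the relative risk, and an approximate 95% CI."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    rr = risk_exposed / risk_unexposed
    # Standard error of log(RR) using the log (Katz) method
    se_log_rr = math.sqrt(
        1 / exposed_cases - 1 / exposed_total
        + 1 / unexposed_cases - 1 / unexposed_total
    )
    lower = math.exp(math.log(rr) - 1.96 * se_log_rr)
    upper = math.exp(math.log(rr) + 1.96 * se_log_rr)
    return risk_exposed, risk_unexposed, rr, (lower, upper)

# Hypothetical counts: 30 cases among 1,000 smokers, 10 cases among 1,000 non-smokers
risk_smokers, risk_nonsmokers, rr, ci = relative_risk(30, 1000, 10, 1000)
print(f"incidence in smokers: {risk_smokers:.3f}, in non-smokers: {risk_nonsmokers:.3f}")
print(f"relative risk: {rr:.2f}, 95% CI {ci[0]:.2f} to {ci[1]:.2f}")
```

A confidence interval that excludes 1 corresponds to a statistically significant difference in incidence at roughly the 5% level.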

Shorter term studies are commonly used in medical research as a form of clinical trial, or means to test a particular hypothesis of clinical importance. Such studies typically follow two groups of patients for a period of time and compare an endpoint or outcome measure between the two groups.

Randomized controlled trials, or RCTs, are a superior methodology in the hierarchy of evidence, because they limit the potential for bias by randomly assigning one patient pool to an intervention and another patient pool to non-intervention (or placebo). This minimises the chance that the incidence of confounding variables will differ between the two groups.

Nevertheless, it is sometimes not practical or ethical to perform RCTs to answer a clinical question. To take our example, if we already had reasonable evidence that smoking causes lung cancer then persuading a pool of non-smokers to take up smoking in order to test this hypothesis would generally be considered quite unethical.

An example of a cohort study that has been going on for more than 50 years is the Framingham Heart Study.

The largest cohort study in women is the Nurses' Health Study. Started in 1976, it is tracking over 120,000 nurses and has been analyzed for many different conditions and outcomes.


Retrospective cohort

A "prospective cohort" defines the groups before the study is done, while a "retrospective cohort" does the grouping after the data are collected. Prospective cohorts are usually summarized with the relative risk. Retrospective cohorts are sometimes summarized with the odds ratio, although the relative risk can also be calculated when the full exposed and unexposed groups are known. An example of a retrospective cohort is Long-Term Mortality after Gastric Bypass Surgery.[1]
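
The two summary measures mentioned above can be computed from the same 2x2 table. The sketch below uses hypothetical counts to show that the odds ratio stays close to the relative risk when the outcome is rare and drifts away from it as the outcome becomes common.

```python
def risk_ratio(a, b, c, d):
    """a, b: exposed with/without the outcome; c, d: unexposed with/without the outcome."""
    return (a / (a + b)) / (c / (c + d))

def odds_ratio(a, b, c, d):
    """Cross-product ratio of the same 2x2 table."""
    return (a * d) / (b * c)

# Hypothetical tables with the same relative risk but different outcome frequency
rare_outcome = (30, 970, 10, 990)      # risks of 3% and 1%
common_outcome = (300, 700, 100, 900)  # risks of 30% and 10%

for label, table in (("rare", rare_outcome), ("common", common_outcome)):
    print(f"{label}: RR = {risk_ratio(*table):.2f}, OR = {odds_ratio(*table):.2f}")
```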

Nested case-control study

An example of a nested case-control study is Inflammatory markers and the risk of coronary heart disease in men and women, a case-control analysis nested within the Nurses' Health Study and Health Professionals Follow-up Study cohorts.[2]

Household panel survey

Household panel surveys are an important sub-type of cohort study. These draw representative samples of households and survey them, following all individuals through time, usually on an annual basis. Examples include the US Panel Study of Income Dynamics (since 1968), the German Socio-Economic Panel (since 1984), the British Household Panel Survey (since 1991), the Household, Income and Labour Dynamics in Australia Survey (since 2001) and the European Community Household Panel (1994-2001).

Statistical analysis

Because of the non-randomized allocation of subjects in a cohort study, several statistical approaches have been developed to reduce confounding from selection bias.

One comparison study evaluated three approaches (multiple regression, propensity score, and grouped treatment variable) in a cohort of patients who refused randomization in a chemotherapy trial.[3] It examined how well each approach could use the nonrandomized patients to replicate the results from the patients who consented to randomization. The propensity score did not add to traditional multiple regression, while the grouped treatment variable was least successful.[3]

Multiple regression

Multiple regression, for example with the Cox proportional hazards model, can be used to adjust for confounding variables. Multiple regression can only correct for confounding by independent variables that have been measured.
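
The idea of adjusting for a measured confounder can be illustrated without fitting a regression model. The sketch below is not a Cox model; it swaps in a simpler stratified analysis (the Mantel-Haenszel risk ratio) on hypothetical counts in which age confounds the crude comparison.

```python
# Each stratum: (exposed_cases, exposed_total, unexposed_cases, unexposed_total)
# Hypothetical data: older subjects are more often exposed and have a higher risk.
strata = {
    "young": (4, 200, 16, 800),
    "old": (80, 800, 20, 200),
}

def crude_risk_ratio(strata):
    """Risk ratio after pooling all strata, ignoring the confounder."""
    a = sum(s[0] for s in strata.values())
    n1 = sum(s[1] for s in strata.values())
    c = sum(s[2] for s in strata.values())
    n0 = sum(s[3] for s in strata.values())
    return (a / n1) / (c / n0)

def mantel_haenszel_risk_ratio(strata):
    """Weighted average of stratum-specific risk ratios (Mantel-Haenszel weights)."""
    numerator = denominator = 0.0
    for a, n1, c, n0 in strata.values():
        total = n1 + n0
        numerator += a * n0 / total
        denominator += c * n1 / total
    return numerator / denominator

crude = crude_risk_ratio(strata)
adjusted = mantel_haenszel_risk_ratio(strata)
print(f"crude RR = {crude:.2f}, age-adjusted RR = {adjusted:.2f}")
```

In these made-up data the apparent association disappears after adjustment, but only because age was measured and could be stratified on.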

Grouped treatment variable

Creating a grouped treatment variable attempts to correct for unmeasured confounding influences.[4] In the grouped treatment approach, the "treatment individually assigned is considered to be confounded by indication, which means that patients may be selected to receive one of the treatments because of known or unknown prognostic factors."[3] For example, in an observational study that included several hospitals, creating a variable for the proportion of patients exposed to the treatment may account for biases in each hospital in deciding which patients get the treatment.[3]
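
A minimal sketch of the hospital example, assuming a hypothetical list of patient records with `hospital` and `treated` fields: each patient is given the proportion of patients treated at the same hospital, and that proportion (rather than the individual treatment) would then enter the outcome model.

```python
from collections import defaultdict

# Hypothetical patient records; field names are illustrative only
patients = [
    {"id": 1, "hospital": "A", "treated": 1},
    {"id": 2, "hospital": "A", "treated": 1},
    {"id": 3, "hospital": "A", "treated": 1},
    {"id": 4, "hospital": "A", "treated": 0},
    {"id": 5, "hospital": "B", "treated": 1},
    {"id": 6, "hospital": "B", "treated": 0},
    {"id": 7, "hospital": "B", "treated": 0},
    {"id": 8, "hospital": "B", "treated": 0},
]

def add_grouped_treatment(records, group_key="hospital", treatment_key="treated"):
    """Attach to each record the proportion of treated patients in its group."""
    totals = defaultdict(int)
    treated = defaultdict(int)
    for record in records:
        totals[record[group_key]] += 1
        treated[record[group_key]] += record[treatment_key]
    return [
        {**record, "grouped_treatment": treated[record[group_key]] / totals[record[group_key]]}
        for record in records
    ]

with_group_variable = add_grouped_treatment(patients)
for record in with_group_variable:
    print(record["id"], record["hospital"], record["grouped_treatment"])
```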

Inverse probability weighting

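Inverse probability weighting weights each subject by the inverse of the estimated probability of receiving the treatment that the subject actually received, given the measured covariates.[5] Like multiple regression, it can only correct for confounding by variables that have been measured.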

Examples of cohort studies using this adjustment are:

  • the North American AIDS Cohort Collaboration on Research and Design (NA-ACCORD) study of Human Immunodeficiency Virus.[6]
  • A post‐hoc analysis of a cohort in the ARISE trial[7]
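
A minimal sketch of the weighting step, using hypothetical aggregated counts and a single measured binary covariate (severe illness). The probability of treatment is estimated as the proportion treated within each covariate stratum, which is a simplification; in practice it is usually estimated with a regression model.

```python
from collections import defaultdict

# Hypothetical rows: (severe, treated, outcome, count)
rows = [
    (0, 1, 1, 5), (0, 1, 0, 95),
    (0, 0, 1, 20), (0, 0, 0, 380),
    (1, 1, 1, 60), (1, 1, 0, 240),
    (1, 0, 1, 20), (1, 0, 0, 80),
]

def propensity_by_stratum(rows):
    """Estimate P(treated | covariate) as the treated proportion in each stratum."""
    n = defaultdict(int)
    n_treated = defaultdict(int)
    for covariate, treated, _outcome, count in rows:
        n[covariate] += count
        n_treated[covariate] += treated * count
    return {covariate: n_treated[covariate] / n[covariate] for covariate in n}

def weighted_risk(rows, propensity, arm):
    """Risk in one arm after weighting by the inverse probability of that arm."""
    events = total = 0.0
    for covariate, treated, outcome, count in rows:
        if treated != arm:
            continue
        p = propensity[covariate]
        weight = 1 / p if arm == 1 else 1 / (1 - p)
        total += weight * count
        events += weight * count * outcome
    return events / total

propensity = propensity_by_stratum(rows)
crude_rr = (65 / 400) / (40 / 500)  # unweighted risks read directly from the rows
weighted_rr = weighted_risk(rows, propensity, 1) / weighted_risk(rows, propensity, 0)
print(f"crude RR = {crude_rr:.2f}, weighted RR = {weighted_rr:.2f}")
```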

Principal components analysis

Principal components analysis was developed by Pearson in 1901.[8] Principal components analysis can only correct for confounding by independent variables that have been measured.

Prior event rate ratio

The prior event rate ratio has been used with observational data from electronic health records to replicate the results of the Scandinavian Simvastatin Survival Study[9] and the HOPE and EUROPA trials.[10][11] Like the grouped treatment variable, the prior event rate ratio attempts to correct for unmeasured confounding influences. However, unlike the grouped treatment variable, which controls for the proportion of subjects selected for treatment, the prior event rate ratio uses the "ratio of event rates between the Exposed and Unexposed cohorts prior to study start time to adjust the study hazard ratio".[10]

A limitation of the prior event rate ratio is that it cannot be applied to outcomes that have not occurred before the onset of treatment. For example, the prior event rate ratio cannot control for confounding in studies of primary prevention.
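
The adjustment can be sketched as a simple division. The example below uses hypothetical event counts and person-years, and it substitutes crude rate ratios for the hazard ratios named in the quotation, which is a simplifying assumption.

```python
def rate(events, person_years):
    return events / person_years

def prior_event_rate_ratio(study_exposed, study_unexposed, prior_exposed, prior_unexposed):
    """Each argument is an (events, person_years) pair.

    Returns the study-period ratio, the prior-period ratio, and the adjusted ratio.
    """
    study_ratio = rate(*study_exposed) / rate(*study_unexposed)
    prior_ratio = rate(*prior_exposed) / rate(*prior_unexposed)
    return study_ratio, prior_ratio, study_ratio / prior_ratio

# Hypothetical data: the exposed cohort already had more events before treatment began
study_ratio, prior_ratio, adjusted_ratio = prior_event_rate_ratio(
    study_exposed=(45, 1000),
    study_unexposed=(50, 1000),
    prior_exposed=(60, 1000),
    prior_unexposed=(40, 1000),
)
print(f"study ratio = {study_ratio:.2f}, prior ratio = {prior_ratio:.2f}, "
      f"adjusted ratio = {adjusted_ratio:.2f}")
```

If no events occur in the prior period, the prior ratio is undefined, which is the limitation for primary prevention noted above.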

Propensity score matching

The propensity score was introduced by Rosenbaum and Rubin in 1983.[12][13] The propensity score is the "conditional probability of receiving one of the treatments under comparison ... given the observed covariates."[3] The propensity score can only correct for confounding by independent variables that have been measured.

Cohort studies with propensity matching may[14] or may not[15] reproduce the results of randomized controlled trials. This may depend on how closely the cohort study emulates the protocol of a randomized controlled trial, as was done in the RCT-DUPLICATE initiative.[14]
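
One common way to match on the propensity score is greedy 1:1 nearest-neighbor matching within a caliper. The sketch below assumes the scores have already been estimated (for example by logistic regression); the subjects, scores, and caliper width are hypothetical, and other matching algorithms exist.

```python
def match_on_propensity(subjects, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    subjects: list of dicts with 'id', 'treated' (0/1) and 'score' (propensity score).
    Returns a list of (treated_id, control_id) pairs.
    """
    treated = sorted((s for s in subjects if s["treated"]), key=lambda s: s["score"])
    controls = [s for s in subjects if not s["treated"]]
    pairs = []
    for t in treated:
        if not controls:
            break
        best = min(controls, key=lambda c: abs(c["score"] - t["score"]))
        # Accept the closest control only if it lies within the caliper
        if abs(best["score"] - t["score"]) <= caliper:
            pairs.append((t["id"], best["id"]))
            controls.remove(best)
    return pairs

subjects = [
    {"id": "T1", "treated": 1, "score": 0.30},
    {"id": "T2", "treated": 1, "score": 0.62},
    {"id": "T3", "treated": 1, "score": 0.90},
    {"id": "C1", "treated": 0, "score": 0.28},
    {"id": "C2", "treated": 0, "score": 0.33},
    {"id": "C3", "treated": 0, "score": 0.60},
    {"id": "C4", "treated": 0, "score": 0.10},
]

pairs = match_on_propensity(subjects)
print(pairs)
```

Treated subjects with no control inside the caliper (here the subject with a score of 0.90) are left unmatched and dropped from the comparison.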

Sensitivity analysis

Sensitivity analysis can estimate how strong an unmeasured confounder must be to reduce the effect of a factor under study.[16] An example of this analysis was a nonrandomized comparison of when to initiate treatment for asymptomatic Human Immunodeficiency Virus infection in the North American AIDS Cohort Collaboration on Research and Design (NA-ACCORD) study.[6]
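
One way to carry out such an analysis is the classic external-adjustment formula for a single binary unmeasured confounder. The sketch below assumes that formula and hypothetical inputs: the prevalence of the confounder among the exposed and unexposed, and the confounder's relative risk for the disease.

```python
def confounding_bias_factor(prev_exposed, prev_unexposed, rr_confounder_disease):
    """Bias in the observed RR caused by a binary unmeasured confounder."""
    return (
        (prev_exposed * (rr_confounder_disease - 1) + 1)
        / (prev_unexposed * (rr_confounder_disease - 1) + 1)
    )

def externally_adjusted_rr(observed_rr, prev_exposed, prev_unexposed, rr_confounder_disease):
    """Observed RR divided by the bias factor."""
    bias = confounding_bias_factor(prev_exposed, prev_unexposed, rr_confounder_disease)
    return observed_rr / bias

# Hypothetical scenario: observed RR of 2.0; the confounder is present in
# 50% of the exposed and 20% of the unexposed.
for strength in (2, 3, 5, 11):
    adjusted = externally_adjusted_rr(2.0, 0.5, 0.2, strength)
    print(f"confounder-disease RR = {strength}: adjusted RR = {adjusted:.2f}")
```

Scanning over plausible confounder strengths shows how strong the confounder would need to be to explain away the observed association under these assumed prevalences.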

Determining causality

Bradford Hill criteria

If statistically significant associations are found, the Bradford Hill criteria can help determine whether the associations represent true causality. The Bradford Hill criteria were proposed in 1965:[17]

  • Strength or magnitude of association?
  • Consistency of association across studies?
  • Specificity of association?
  • Temporality of association?
  • Plausibility based on biological knowledge?
  • Biological gradient, or dose-response relationship?
  • Coherence? Does the proposed association explain other observations?
  • Experimental evidence?
  • Analogy?


Immortal time bias

"Immortal time is a span of cohort follow-up during which, because of exposure definition, the outcome under study could not occur."[18]

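The effect of misclassifying immortal time can be shown with simple person-time arithmetic. In the hypothetical cohort below, exposure begins at a first prescription some time after cohort entry, so eventual users contribute immortal person-years before that prescription. Counting those years as exposed dilutes the exposed event rate; the corrected analysis in this sketch counts them as unexposed, which is one common correction and an assumption here.

```python
def rate_ratio(exposed_events, exposed_py, unexposed_events, unexposed_py):
    return (exposed_events / exposed_py) / (unexposed_events / unexposed_py)

# Hypothetical deaths and person-years (py)
immortal_py = 200      # eventual users, from cohort entry to first prescription
exposed_py = 300       # eventual users, after first prescription
exposed_deaths = 20    # all occur after the first prescription
unexposed_py = 500     # never-users
unexposed_deaths = 40

# Biased: immortal person-years are wrongly counted as exposed follow-up
biased = rate_ratio(exposed_deaths, immortal_py + exposed_py,
                    unexposed_deaths, unexposed_py)

# Corrected: immortal person-years are counted as unexposed follow-up
corrected = rate_ratio(exposed_deaths, exposed_py,
                       unexposed_deaths, unexposed_py + immortal_py)

print(f"biased rate ratio = {biased:.2f}, corrected rate ratio = {corrected:.2f}")
```

In these made-up numbers the treatment appears protective only while the immortal time is misclassified.
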
Assessing the quality

Many scales and checklists have been proposed for assessing the quality of cohort studies.[19] The most common items assessed with these tools are:

  • Selecting study participants (92% of tools)
  • Measurement of study variables (exposure, outcome and/or confounding variables) (86% of tools)
  • Sources of bias (including recall bias, interviewer bias and biased loss to follow-up but excluding confounding) (86% of tools)
  • Control of confounding (78% of tools)
  • Statistical methods (78% of tools)
  • Conflict of interest (3% of tools)

Of these tools, only one was designed for use in comparing cohort studies in any clinical setting for the purpose of conducting a systematic review of cohort studies[20]; however, this tool has been described as "extremely complex and require considerable input to calculate raw scores and to convert to final scores, depending on the primary study design and methods".[19]

The Newcastle-Ottawa Scale (NOS) may help assess the quality of nonrandomized studies.[21][22]

Standards for reporting

Standards are available for the reporting of observational studies[23][24][25][26] with accompanying explanation and elaboration[27].

Alternative study designs

Case-control study

Rare outcomes, or those that slowly develop over long periods, are generally not studied with the use of a cohort study, but are rather studied with the use of a case-control study. Retrospective studies may exaggerate associations.[28]

Randomized controlled trial

Randomized controlled trials (RCTs) are a superior methodology in the hierarchy of evidence, because they limit the potential for bias by randomly assigning one patient pool to an intervention and another patient pool to non-intervention (or placebo). This minimizes the chance that the incidence of confounding variables will differ between the two groups.[29][30]

Empirical comparisons of observational studies and RCTs conflict: some find[31][32][33][34][35] and others do not find[36][37] evidence of exaggerated results from cohort studies.

Nevertheless, it is sometimes not practical or ethical to perform RCTs to answer a clinical question. To take our example, if we already had reasonable evidence that smoking causes lung cancer then persuading a pool of non-smokers to take up smoking in order to test this hypothesis would generally be considered quite unethical.

References


  1. Adams TD, Gress RE, Smith SC; et al. (2007). "Long-term mortality after gastric bypass surgery". N. Engl. J. Med. 357 (8): 753–61. doi:10.1056/NEJMoa066603. PMID 17715409.
  2. Pai JK, Pischon T, Ma J; et al. (2004). "Inflammatory markers and the risk of coronary heart disease in men and women". N. Engl. J. Med. 351 (25): 2599–610. doi:10.1056/NEJMoa040967. PMID 15602020.
  3. Schmoor C, Caputo A, Schumacher M (2008). "Evidence from nonrandomized studies: a case study on the estimation of causal effects". Am. J. Epidemiol. 167 (9): 1120–9. doi:10.1093/aje/kwn010. PMID 18334500.
  4. Johnston SC, Henneman T, McCulloch CE, van der Laan M (2002). "Modeling treatment effects on binary outcomes with grouped-treatment variables and individual covariates". Am. J. Epidemiol. 156 (8): 753–60. PMID 12370164.
  5. Hernán MA, Lanoy E, Costagliola D, Robins JM (2006). "Comparison of dynamic treatment regimes via inverse probability weighting". Basic Clin. Pharmacol. Toxicol. 98 (3): 237–42. doi:10.1111/j.1742-7843.2006.pto_329.x. PMID 16611197.
  6. Kitahata MM, Gange SJ, Abraham AG; et al. (2009). "Effect of Early versus Deferred Antiretroviral Therapy for HIV on Survival". N. Engl. J. Med. doi:10.1056/NEJMoa0807252. PMID 19339714.
  7. Bulle EB, Peake SL, Finnis M, Bellomo R, Delaney A; et al. (2020). "Time to antimicrobial therapy in septic shock patients treated with an early goal‐directed resuscitation protocol: A post‐hoc analysis of the ARISE trial". Emergency Medicine Australasia. doi:10.1111/1742-6723.13634. ISSN 1742-6731.
  8. Pearson, K (1901). "On lines and planes of closest fit to systems of points in space". Philosophical Magazine. 2: 559–572.
  9. Weiner MG, Xie D, Tannen RL (2008). "Replication of the Scandinavian Simvastatin Survival Study using a primary care medical record database prompted exploration of a new method to address unmeasured confounding". Pharmacoepidemiol Drug Saf. doi:10.1002/pds.1585. PMID 18327857.
  10. Tannen RL, Weiner MG, Xie D (2008). "Replicated studies of two randomized trials of angiotensin-converting enzyme inhibitors: further empiric validation of the 'prior event rate ratio' to adjust for unmeasured confounding by indication". Pharmacoepidemiol Drug Saf. doi:10.1002/pds.1584. PMID 18327852.
  11. Tannen RL, Weiner MG, Xie D (2009). "Use of primary care electronic medical record database in drug efficacy research on cardiovascular outcomes: comparison of database and randomised controlled trial findings". BMJ. 338: b81. PMID 19174434.
  12. Rosenbaum PR, Rubin DB (1983). "The central role of the propensity score in observational studies for causal effects". Biometrika. 70 (1): 41. doi:10.1093/biomet/70.1.41.
  13. Hill J (2008). "Discussion of research using propensity-score matching: Comments on 'A critical appraisal of propensity-score matching in the medical literature between 1996 and 2003' by Peter Austin, Statistics in Medicine". Stat Med. 27 (12): 2055–2061. doi:10.1002/sim.3245. PMID 18446836.
  14. Wang SV, Schneeweiss S, RCT-DUPLICATE Initiative, Franklin JM, Desai RJ, Feldman W; et al. (2023). "Emulation of Randomized Clinical Trials With Nonrandomized Database Analyses: Results of 32 Clinical Trials". JAMA. 329 (16): 1376–1385. doi:10.1001/jama.2023.4221. PMID 37097356.
  15. Forbes SP, Dahabreh IJ (2020). "Benchmarking Observational Analyses Against Randomized Trials: a Review of Studies Assessing Propensity Score Methods". J Gen Intern Med. 35 (5): 1396–1404. doi:10.1007/s11606-020-05713-5. PMC 7210373. PMID 32193818.
  16. Schneeweiss S (2006). "Sensitivity analysis and external adjustment for unmeasured confounders in epidemiologic database studies of therapeutics". Pharmacoepidemiol Drug Saf. 15 (5): 291–303. doi:10.1002/pds.1200. PMID 16447304.
  17. Hill AB (1965). "The Environment And Disease: Association Or Causation?". Proc. R. Soc. Med. 58: 295–300. PMC 1898525. PMID 14283879.
  18. Suissa S (2008). "Immortal time bias in pharmaco-epidemiology". Am. J. Epidemiol. 167 (4): 492–9. doi:10.1093/aje/kwm324. PMID 18056625.
  19. Sanderson S, Tatt ID, Higgins JP (2007). "Tools for assessing quality and susceptibility to bias in observational studies in epidemiology: a systematic review and annotated bibliography". Int J Epidemiol. 36 (3): 666–76. doi:10.1093/ije/dym018. PMID 17470488.
  20. Margetts BM, Thompson RL, Key T, Duffy S, Nelson M, Bingham S; et al. (1995). "Development of a scoring system to judge the scientific quality of information from case-control and cohort studies of nutrition and disease". Nutr Cancer. 24 (3): 231–9. PMID 8610042.
  21. GA Wells, B Shea, D O'Connell, J Peterson, V Welch, M Losos, P Tugwell. The Newcastle-Ottawa Scale (NOS) for assessing the quality of nonrandomised studies in meta-analyses
  22. Lo CK, Mertz D, Loeb M (2014). "Newcastle-Ottawa Scale: comparing reviewers' to authors' assessments". BMC Med Res Methodol. 14: 45. doi:10.1186/1471-2288-14-45. PMC 4021422. PMID 24690082.
  23. Equator Network. Guidance for reporting observational studies
  24. National Library of Medicine. Research Reporting Guidelines and Initiatives: By Organization
  25. STrengthening the Reporting of OBservational (STROBE) studies in Epidemiology
  26. von Elm E, Altman DG, Egger M, Pocock SJ, Gøtzsche PC, Vandenbroucke JP; et al. (2007). "The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: guidelines for reporting observational studies". Ann Intern Med. 147 (8): 573–7. PMID 17938396.
  27. Vandenbroucke JP, von Elm E, Altman DG, Gøtzsche PC, Mulrow CD, Pocock SJ; et al. (2007). "Strengthening the Reporting of Observational Studies in Epidemiology (STROBE): explanation and elaboration". Ann Intern Med. 147 (8): W163–94. PMID 17938389.
  28. Eikelboom JW, Lonn E, Genest J, Hankey G, Yusuf S (1999). "Homocyst(e)ine and cardiovascular disease: a critical review of the epidemiologic evidence". Ann. Intern. Med. 131 (5): 363–75. PMID 10475890.
  29. Pocock SJ, Elbourne DR (2000). "Randomized trials or observational tribulations?". N. Engl. J. Med. 342 (25): 1907–9. PMID 10861329.
  30. Barton S (2000). "Which clinical studies provide the best evidence? The best RCT still trumps the best observational study". BMJ. 321 (7256): 255–6. PMC 1118259. PMID 10915111.
  31. Hemkens LG, Contopoulos-Ioannidis DG, Ioannidis JP (2016). "Agreement of treatment effects for mortality from routinely collected data and subsequent randomized trials: meta-epidemiological survey". BMJ. 352: i493. doi:10.1136/bmj.i493. PMC 4772787. PMID 26858277.
  32. Ioannidis JP, Haidich AB, Pappa M; et al. (2001). "Comparison of evidence of treatment effects in randomized and nonrandomized studies". JAMA. 286 (7): 821–30. PMID 11497536.
  33. Guyatt GH, DiCenso A, Farewell V, Willan A, Griffith L (2000). "Randomized trials versus observational studies in adolescent pregnancy prevention". J Clin Epidemiol. 53 (2): 167–74. PMID 10729689.
  34. Kunz R, Oxman AD (1998). "The unpredictability paradox: review of empirical comparisons of randomised and non-randomised clinical trials". BMJ. 317 (7167): 1185–90. PMC 28700. PMID 9794851.
  35. Phillips AN, Grabar S, Tassie JM, Costagliola D, Lundgren JD, Egger M (1999). "Use of observational databases to evaluate the effectiveness of antiretroviral therapy for HIV infection: comparison of cohort studies with randomized trials. EuroSIDA, the French Hospital Database on HIV and the Swiss HIV Cohort Study Groups". AIDS. 13 (15): 2075–82. PMID 10546860.
  36. Benson K, Hartz AJ (2000). "A comparison of observational studies and randomized, controlled trials". N. Engl. J. Med. 342 (25): 1878–86. PMID 10861324.
  37. Concato J, Shah N, Horwitz RI (2000). "Randomized, controlled trials, observational studies, and the hierarchy of research designs". N. Engl. J. Med. 342 (25): 1887–92. PMC 1557642. PMID 10861325.
