Abstract

This paper describes the cohort study in terms of its design, the research questions it can answer, common analytic techniques, and the strengths and limitations of this type of study. We also describe the main cohort studies of older populations, many of which are available for secondary data analysis.

Keywords: cohort study, follow-up, mortality, longitudinal, older people, research methods

Introduction

Compiling evidence for cause and effect is the basis of much ageing research. The best evidence comes from randomised controlled trials (RCTs), but this design is not always possible in practice, particularly where the ‘cause’ is potentially harmful, for instance smoking or obesity. The next best evidence then comes from observational cohort studies. In this article, we describe the design of cohort studies, including the type of research question they can answer, common analytical techniques for cohort studies, and the strengths and limitations of this type of design. Finally, we provide a brief description of the major cohort studies of older populations, many of which are available to researchers outside of the study team for secondary data analysis. Comparative analysis of cohort studies from different populations can considerably strengthen evidence of cause and effect or, in some cases, can help tease out mediating factors that are available in one cohort but not the other.

Design of cohort studies

A cohort study follows over time a set of individuals who share some characteristic. In cohort studies of ageing the shared characteristic is generally birth year (or years), but it might be another characteristic such as occupation or geography. The majority of ageing research is fundamentally trying to determine whether there is a relationship between a cause or exposure, and an effect or outcome. The ‘exposure’ might be naturally occurring, a behaviour, or a characteristic.
Examples of research questions answered by cohort studies include ‘Does a healthy diet (behavioural exposure) reduce the risk of dementia (outcome)?’ or ‘Does childhood socioeconomic status (characteristic – exposure) predict midlife psychological distress (outcome)?’. In some cases, the exposure might be a service or intervention that cannot be evaluated by an RCT, often because the service is already in place and withdrawal for some would be unethical or impossible. In all cases, the exposure must happen before the outcome to infer cause and effect. Hence, the natural starting point is with the exposure: either observe the outcome(s) that result from a given exposure (observational cohort study) or control the exposure and record the outcome(s) (RCT). Thus, if we were researching the effect of vitamin supplementation on a disease, we could contrast the designs of an RCT and a cohort study:

In an RCT, the investigator decides which participants receive vitamin supplements and which do not, and compares outcomes for the two groups.
In a cohort study, people who happen to be taking vitamin supplements are compared with those who are not, i.e. there is no active intervention by the investigator.

The two designs can also be contrasted in terms of the timing of the selection of participants and the occurrence of the exposure (Figure 1).

Figure 1. Design of RCT and observational cohort study.

Observing the occurrence of the outcome may not necessarily involve collecting data directly from study members: mortality, cancer incidence and certain other diseases, for example, may be ascertained from linkage with administrative databases or registers. However, since cohort studies have the capacity to investigate multiple outcomes, many collect data directly from study members at regular follow-ups.
For ageing cohorts, follow-up intervals may shorten as the cohorts reach very old age, so as not to ‘miss’ outcomes through death.

Strengths and limitations of cohort studies

The major strength of cohort studies is that they follow the natural course of events, so there is no confusion about whether the exposure preceded the outcome or vice versa. Multiple outcomes may be investigated, and this is the case for the major cohort studies of ageing. Unlike case-control studies, cohort studies avoid bias in exposure measurement, as the outcome is unknown when information on the exposure is ascertained.

Limitations of cohort studies include the large number of participants and the generally long follow-up period required, which make cohort studies costly. Many existing cohort studies of ageing sample only from community-dwelling older adults and are therefore not representative of the total population. This is obviously a limitation for estimating prevalence and incidence, but it may also be an issue when investigating causal mechanisms [1]. There is also the problem of loss to follow-up, often through death of study members before outcomes occur, which means the cohort becomes less representative over time. When cohort studies require long follow-up periods, diagnostic criteria, methods and optimal exposure measurement may change over time. Hence, there is sometimes the view that the answer to the research question has arrived too late.

Common analytical techniques for cohort studies

Most research from cohort studies attempts to determine the association between an exposure (or risk factor) and an outcome, and to quantify the magnitude of the effect of the risk factor on the outcome. The simplest method is to calculate the risk ratio or rate ratio: the risk (or rate) of the outcome in the exposed cohort divided by the risk (or rate) in the unexposed cohort.
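As a minimal sketch of this calculation, the Python snippet below computes a risk ratio and a rate ratio from summary counts. The `Group` class, the function names and all of the numbers are hypothetical and chosen only for illustration; they do not come from any of the studies discussed here. Person-years are taken to be the total observed follow-up time summed over participants.

```python
from dataclasses import dataclass


@dataclass
class Group:
    """Summary counts for one exposure group (illustrative values only)."""
    n_at_baseline: int    # everyone in the group at the start of follow-up
    events: int           # participants who developed the outcome
    person_years: float   # observed follow-up time summed over participants


def risk(group: Group) -> float:
    """Proportion of the baseline group that developed the outcome."""
    return group.events / group.n_at_baseline


def rate(group: Group) -> float:
    """Events per person-year of observation."""
    return group.events / group.person_years


def risk_ratio(exposed: Group, unexposed: Group) -> float:
    return risk(exposed) / risk(unexposed)


def rate_ratio(exposed: Group, unexposed: Group) -> float:
    return rate(exposed) / rate(unexposed)


# Hypothetical cohort: the exposed group accrues less follow-up time
# per person, for example because of earlier deaths or drop-out.
exposed = Group(n_at_baseline=200, events=40, person_years=1600.0)
unexposed = Group(n_at_baseline=400, events=40, person_years=3600.0)

rr = risk_ratio(exposed, unexposed)  # (40/200) / (40/400)
ir = rate_ratio(exposed, unexposed)  # (40/1600) / (40/3600)

print(f"Risk ratio: {rr:.2f}")
print(f"Rate ratio: {ir:.2f}")
```

In this made-up example the two measures differ because the groups contribute different amounts of follow-up time per person, which only the rate takes into account.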
The denominator for the risk is the whole exposed (or unexposed) cohort, while for the rate it is the person-years of observation, which takes account of any losses to follow-up.

In an RCT, the process of randomisation generally (though not always) results in the exposed and non-exposed groups being balanced on all other factors that might be associated with both the exposure and the outcome. In cohort studies, however, the analysis must try to account for this by controlling, or adjusting, for such confounding factors, which might obscure the effect of the exposure on the outcome. Potential confounding variables are generally identified through causal diagrams and/or careful consideration of the causal pathways that govern the outcome of interest, drawing on previous research. To account for confounding, the simple comparison of the risk in the exposed and unexposed is extended to a logistic regression model in which the dependent variable is whether or not the outcome has occurred and exposure is the independent variable. Potential confounding factors are then added to see whether they affect the relationship between exposure and outcome. However, care must be taken that these other factors are not mediators, that is, part of the causal pathway between the exposure and the outcome.

When the outcome is continuous (e.g. gait speed) and measured at baseline and at follow-ups, the analysis must take account of the dependence between the multiple observations on the same individual. For balanced data (no missing observations and equally spaced follow-ups), the simplest analysis is a repeated measures analysis of variance. However, such repeated measures can be fitted much more flexibly in the multi-level modelling framework, which can accommodate outcomes that are binary, ordinal or even time-to-event (see below) as well as continuous.
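To illustrate how a multi-level model handles this dependence, the simplest case is a random-intercept model for a continuous outcome. The notation is an assumption made here for illustration and does not appear in the text above: y_{ij} is the outcome (e.g. gait speed) for individual i at occasion j, t_{ij} is the time since baseline, x_i is the exposure, u_i is an individual-specific random intercept, and \varepsilon_{ij} is the occasion-level residual, with u_i and \varepsilon_{ij} assumed independent.

```latex
\begin{align}
y_{ij} &= \beta_0 + \beta_1 t_{ij} + \beta_2 x_i + u_i + \varepsilon_{ij}, \\
u_i &\sim N(0, \sigma_u^2), \qquad \varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2), \\
\operatorname{Corr}(y_{ij}, y_{ik} \mid t, x) &= \frac{\sigma_u^2}{\sigma_u^2 + \sigma_\varepsilon^2}, \qquad j \neq k .
\end{align}
```

Under these assumptions the shared term u_i is what induces correlation between repeated measurements on the same person, and unbalanced or unequally spaced follow-ups pose no difficulty because each individual simply contributes the occasions at which they were observed. Adding a term such as \beta_3 t_{ij} x_i would allow the rate of change to differ by exposure.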
Training material for multi-level models is available from the Bristol Centre for Multilevel Modelling (http://www.bristol.ac.uk/cmm/), but these models can also be fitted in standard statistical software packages such as the Statistical Package for the Social Sciences (SPSS).

Some cohort studies are interested in how long it takes for an outcome or event to occur; such data are often referred to as time-to-event data, survival data or failure data. In many cases, the event is death. For these types of data, a routinely used test to detect differences in survival between groups is the log-rank test, which does not account for confounding. Survival in two groups can be compared through graphical plots of the Kaplan–Meier estimate of the survivor function and/or the Nelson–Aalen estimate of the cumulative hazard. If there is a need to assess the effects of an exposure after accounting for confounding, the most commonly used modelling technique is the Cox proportional hazards model. Other techniques can be used but have additional assumptions.

These techniques are an outline of the main statistical methods found in the cohort study literature and are not exhaustive. Other more specialist techniques are available, for example multi-state modelling when the outcome is not binary (dead/alive) but a series of health states (for example independent/disabled/dead) observed on multiple occasions over time. Most statistical textbooks cover the basic methods, but complex modelling of longitudinal data would benefit from a more in-depth approach [2].

Major cohort studies of ageing

It is impossible here to review all the cohort studies that include older populations, or which were developed to answer research questions on ageing. Here we focus on studies which have been developed to answer general research questions on ageing, rather than those focused on specific chronic conditions such as osteoporosis, diabetes or cardiovascular disease.
General ageing cohorts can be divided into two groups: those covering an age range, and those recruiting participants from a single birth year. An important subgroup of the former is the growing number of studies based on and harmonised with the long-running US Health and Retirement Study (HRS) [3], including the English Longitudinal Study of Ageing (ELSA) [4], the Irish Longitudinal Study on Ageing (TILDA) [5], the Survey of Health, Ageing and Retirement in Europe (SHARE) [6] and the China Health and Retirement Longitudinal Study (CHARLS) [7]. Data from HRS and many of its international sister studies are publicly available, and the full list of studies, with tools and resources from cross harmonisation, is available on the USC Gateway to Global Aging (https://g2aging.org/).

Other long-running cohort studies of ageing covering more than a single birth cohort have generally been regional rather than national; these include the Longitudinal Aging Study Amsterdam (LASA) [8], the Jerusalem Longitudinal Study [9], the Cambridge City over-75s Cohort (CC75C) [10], the Cognitive Function and Ageing Studies [11, 12] and a set of Australian studies that have been brought together and harmonised in the Dynamic Analyses to Optimize Ageing (DYNOPTA) project [13]. Others are occupationally rather than geographically based, such as the Whitehall Study [14], the GAZEL cohort [15] and the Helsinki Health Study [16].

Cohort studies recruiting participants from a single birth year can also be subdivided into those that recruit from birth or childhood, thereby collecting early life information without recall bias, and those that recruit in old age. The UK is fortunate in having a rich history of cohorts followed regularly from birth; the first of these, the 1946 National Birth Cohort [17], has now reached older age (65+ years), and the second, the 1958 British Birth Cohort, will reach 60 in 2018 [18].
Two other British cohorts had data from birth records but were not retraced until later in life: the Newcastle Thousand Families Study [19] and the Hertfordshire Cohort Study [20]. Studies such as the Lothian Birth Cohorts of 1921 and 1936 [21], the Boyd Orr cohort [22] and the Aberdeen Children of the 1950s cohort [23] retraced in adulthood those who took part in surveys or tests as children, thereby providing early life data.

The very old, those aged 85 years and over, are the fastest growing section of most populations in the developed world. Cohort studies of ageing which have a lower age limit of 50 or 65 years often contain relatively few very old people, unless this age group is oversampled (as in CFAS). To overcome this, there has been a set of studies with similar designs, based on single-year birth cohorts but recruited at very old age: the Leiden 85+ cohort [24], the Newcastle 85+ study [25] and the Life and Living in Advanced Age Cohort Study in New Zealand (LiLACS NZ) [26]. Other cohort studies of the very old (not necessarily single birth years) include the Vitality 90+ study from Finland [27], the Swedish Panel Study of Living Conditions of the Oldest Old (SWEOLD) [28], and the various centenarian cohorts [29].

The list of cohort studies of ageing cited here is not exhaustive. Outcomes usually cover functional and cognitive health, but many studies also encompass social functioning. Increasingly, objective as well as subjective measures of health and functioning are collected, including biological measures. Cohorts from the Nordic countries especially, and increasingly the UK, link administrative data or data on health and social care usage and, for younger members, employment.
Given the expense of setting up and maintaining cohort studies, there is a growing number of initiatives to meta-analyse cohort studies through post harmonisation, such as IALSA (Integrative Analysis of Longitudinal Studies of Aging and Dementia, available at https://staging.maelstrom-research.org/mica/network/ialsa/) and the UK-based CLOSER (Cohort & Longitudinal Studies Enhancement Resources, available at https://www.closer.ac.uk/).

Of the studies outlined here, some features are worthy of mention:

Population-based samples do not always include those living in assisted care facilities (residential or nursing homes); in studies of the very old this can be problematic and can give a distorted picture of the health of this age group.
Although studies followed from birth have data from across the life course, they require very long-term funding and may be relatively underpowered by advanced age.
Methods of measuring health in younger old populations may not be appropriate for very old populations, leading to biased results; in particular, measures may have ceiling or floor effects at younger or very old ages.
Though clinic-based assessments may provide more accurate assessment, they have the potential to introduce bias since participants with significant mobility or cognitive impairments may not take part [30].
Maintaining active contact with participants between assessment waves is crucial to reduce withdrawal from the study, as well as for ascertaining address changes or deaths, which can reduce unnecessary or inappropriate contact.

Conclusions

If well conducted with minimal attrition, cohort studies can provide strong evidence of causal links between an outcome and an earlier exposure. Many cohort studies of older populations already exist, but they are expensive to set up and maintain, and therefore new longitudinal cohorts are being focussed on countries where none exist.
The HRS set of studies, with similar designs and measures, maximises the potential to validate findings across countries with different cultures, societal norms and health delivery, thereby strengthening evidence. Other post harmonisation efforts, especially through meta-analysis, are important and should be employed more often to provide stronger evidence for policy in the shortest time.

Key points

If well conducted, cohort studies can provide strong evidence of causal links between outcomes and exposures.
Exclusion of care home residents from cohort studies can give a distorted picture of health or causal relationships.
Meta-analysis of cohort studies by post harmonisation of measures is encouraged to provide timely answers to research questions.

Conflict of interest

None.

Funding

None.

References

1 Kelfve S. Underestimated health inequalities among older people – a consequence of excluding the most disabled and disadvantaged. J Gerontol B Psychol Sci Soc Sci 2017. doi: 10.1093/geronb/gbx032
2 Singer JD, Willett JB. Applied Longitudinal Data Analysis: Modeling Change and Event Occurrence. Oxford: Oxford University Press, 2003.
3 Sonnega A, Faul JD, Ofstedal MB, Langa KM, Phillips JW, Weir DR. Cohort profile: the Health and Retirement Study (HRS). Int J Epidemiol 2014; 43: 576–85.
4 Steptoe A, Breeze E, Banks J, Nazroo J. Cohort profile: the English longitudinal study of ageing. Int J Epidemiol 2013; 42: 1640–8.
5 Kearney PM, Cronin H, O'Regan C et al. Cohort profile: the Irish longitudinal study on ageing. Int J Epidemiol 2011; 40: 877–84.
6 Börsch-Supan A, Brandt M, Hunkler C et al.; SHARE Central Coordination Team. Data resource profile: the Survey of Health, Ageing and Retirement in Europe (SHARE). Int J Epidemiol 2013; 42: 992–1001.
7 Zhao Y, Hu Y, Smith JP, Strauss J, Yang G. Cohort profile: the China Health and Retirement Longitudinal Study (CHARLS). Int J Epidemiol 2014; 43: 61–8.
8 Huisman M, Poppelaars J, van der Horst M J et al. Cohort profile: the Longitudinal Aging Study Amsterdam. Int J Epidemiol 2011; 40: 868–76.
9 Jacobs JM, Cohen A, Bursztyn M, Azoulay D, Ein-Mor E, Stessman J. Cohort profile: the Jerusalem longitudinal cohort study. Int J Epidemiol 2009; 38: 1464–9.
10 Fleming J, Zhao E, O'Connor DW, Pollitt PA, Brayne C. Cohort profile: the Cambridge City over-75s Cohort (CC75C). Int J Epidemiol 2007; 36: 40–6.
11 Brayne C, McCracken C, Matthews FE; Medical Research Council Cognitive Function and Ageing Study (CFAS). Cohort profile: the Medical Research Council Cognitive Function and Ageing Study (CFAS). Int J Epidemiol 2006; 35: 1140–5.
12 Matthews FE, Arthur A, Barnes LE et al. A two-decade comparison of prevalence of dementia in individuals aged 65 years and older from three geographical areas of England: results of the Cognitive Function and Ageing Study I and II. Lancet 2013; 382: 1405–12.
13 Anstey KJ, Byles JE, Luszcz MA et al. Cohort profile: the Dynamic Analyses to Optimize Ageing (DYNOPTA) project. Int J Epidemiol 2010; 39: 44–51.
14 Marmot M, Brunner E. Cohort profile: the Whitehall II study. Int J Epidemiol 2005; 34: 251–6.
15 Goldberg M, Leclerc A, Zins A. Cohort profile update: the GAZEL cohort study. Int J Epidemiol 2015; 44: 77–77g.
16 Lahelma E, Aittomäki A, Laaksonen M et al. Cohort profile: the Helsinki Health Study. Int J Epidemiol 2013; 42: 722–30.
17 Wadsworth M, Kuh D, Richards M, Hardy R. Cohort profile: the 1946 National birth cohort (MRC National Survey of Health and Development). Int J Epidemiol 2006; 35: 49–54.
18 Power C, Elliott J. Cohort profile: 1958 British birth cohort (National Child Development Study). Int J Epidemiol 2006; 35: 34–41.
19 Pearce MS, Unwin NC, Parker L, Craft AW. Cohort profile: the Newcastle Thousand Families 1947 birth cohort. Int J Epidemiol 2009; 38: 932–7.
20 Syddall HE, Aihie Sayer A, Dennison EM, Martin HJ, Barker DJ, Cooper C. Cohort profile: the Hertfordshire cohort study. Int J Epidemiol 2005; 34: 1234–42.
21 Deary IJ, Gow AJ, Pattie A, Starr JM. Cohort profile: the Lothian Birth Cohorts of 1921 and 1936. Int J Epidemiol 2012; 41: 1576–84.
22 Martin RM, Gunnell D, Pemberton J, Frankel S, Davey Smith G. Cohort profile: the Boyd Orr cohort – an historical cohort study based on the 65 year follow-up of the Carnegie Survey of Diet and Health (1937–39). Int J Epidemiol 2005; 34: 742–9.
23 Leon DA, Lawlor DA, Clark H, Macintyre S. Cohort profile: the Aberdeen Children of the 1950s study. Int J Epidemiol 2006; 35: 549–52.
24 der Wiel AB, van Exel E, de Craen AJ et al. A high response is not essential to prevent selection bias. J Clin Epidemiol 2002; 55: 1119–25.
25 Collerton J, Barrass K, Bond J et al. The Newcastle 85+ study: biological, clinical and psychosocial factors associated with healthy ageing: study protocol. BMC Geriatr 2007; 7: 7–14.
26 Kerse N, Teh R, Moyes SA et al. Cohort profile: Te Puawaitanga o Nga Tapuwae Kia Ora Tonu, Life and Living in Advanced Age: a Cohort Study in New Zealand (LiLACS NZ). Int J Epidemiol 2015; 44: 1823–32.
27 Jylhä M, Hervonen A. Functional status and need of help among people aged 90 or over: a mailed survey with a total home-dwelling population. Scand J Public Health 1999; 27: 106–11.
28 Lennartsson C, Agahi N, Hols-Salén L et al. Data resource profile: the Swedish Panel Study of Living Conditions of the Oldest Old (SWEOLD). Int J Epidemiol 2014; 43: 731–8.
29 Willcox DC, Willcox BJ, Poon BJ. Centenarian studies: important contributors to our understanding of the aging process and longevity. Curr Gerontol Geriatr Res 2010; 2010: 1–6.
30 van Bemmel T, Delgado V, Bax JJ et al. Impact of valvular heart disease on activities of daily living of nonagenarians: the Leiden 85-plus study, a population based study. BMC Geriatr 2010; 10: 1.

© The Author 2017. Published by Oxford University Press on behalf of the British Geriatrics Society. All rights reserved. For permissions, please email: email@example.com
Age and Ageing – Oxford University Press
Published: Mar 1, 2018