Cohort study

Suppose we want to know whether an exposure causes a disease, for example whether smoking causes adenocarcinoma of the lung. We could set up a randomized controlled trial in which we recruit 1000 people and divide them randomly into two groups of 500, group A and group B. Group A has to start smoking and group B must not smoke. That is not possible: you cannot force people to smoke if they don’t want to, and doing so is considered unethical. However, we can use a different study design: the cohort study. In this design we recruit x smokers and x non-smokers. This gives us two groups: one that is exposed (to smoking) and one that is not exposed (to smoking). We follow them over time and measure outcomes such as death, disease incidence or number of hospitalizations. The most important outcome we measure is called the primary outcome. Studies usually also measure secondary outcomes, but these are of less importance.

Benefits & limitations

In short, a cohort study compares an exposed group with a non-exposed group and follows them to see whether the exposure is related to the primary outcome. We can do this prospectively, meaning participants have been exposed but have not yet developed the disease or died when the study starts. We can also do this retrospectively, meaning the outcome has already occurred and we go back through existing records to examine who was exposed to the factor we want to know about. For example, 1000 workers of an asbestos factory have died and you go back in time to see if there is a difference in exposure. You find out that 900 of them were exposed to asbestos while 100 were not. The key feature of retrospective cohort research is that the researcher goes back in time to find out what might be associated with an outcome.
To put it another way, in a cohort study participants are placed into two groups and followed over a period of time, and we measure the outcomes that interest us: how many people died, how many people got a disease? We then try to correlate those outcomes with the factors or exposures the participants were exposed to. But how do we know that the disease they developed is due to the specific exposure and not due to differences in age, sex, income, education level, etc.? For example, the group of smokers might have less money than the group of non-smokers and therefore visit the GP less. Or the non-smokers could be 70 years old while the smokers are 40 years old. Factors like these are called confounders, and the researcher tries to correct for them. They try to cancel them out or, even better, compare groups that have everything in common and differ only in the exposure or risk factor. That would be ideal, but such groups are almost impossible to find in real life. Thus, one of the biggest limitations of cohort studies is that it is hard to establish whether an association between an exposure and an outcome is causal. In other words: how sure can we be that exposure X really causes disease Z, without other factors such as age, sex, the place where one lives or the food one eats playing a role?
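The age example can be made concrete with a small sketch. The counts below are invented purely for illustration (they come from no study), and the names `cohort`, `risk_ratio` and `pooled` are hypothetical. Within each age group, smokers and non-smokers have exactly the same risk, but because the non-smokers are mostly old, the pooled comparison suggests a difference that is really due to age.

```python
# Hypothetical counts, invented only to illustrate confounding by age.
# Each entry is (number of people, number who developed the disease).
cohort = {
    "old":   {"smokers": (100, 20), "non_smokers": (400, 80)},
    "young": {"smokers": (400, 20), "non_smokers": (100, 5)},
}


def risk(people, cases):
    """Proportion of a group that developed the disease."""
    return cases / people


def risk_ratio(exposed, unexposed):
    """Risk in the exposed group divided by risk in the unexposed group."""
    return risk(*exposed) / risk(*unexposed)


def pooled(group):
    """Add up people and cases for one exposure group across all age strata."""
    people = sum(stratum[group][0] for stratum in cohort.values())
    cases = sum(stratum[group][1] for stratum in cohort.values())
    return people, cases


# Comparison that ignores age (crude) versus comparison within each age group.
crude = risk_ratio(pooled("smokers"), pooled("non_smokers"))
per_stratum = {
    age: risk_ratio(groups["smokers"], groups["non_smokers"])
    for age, groups in cohort.items()
}

print(f"crude risk ratio: {crude:.2f}")
for age, ratio in per_stratum.items():
    print(f"risk ratio among the {age}: {ratio:.2f}")
```

With these made-up numbers the crude ratio is about 0.47, which would make smoking look protective, while the ratio within each age group is 1.00. Comparing like with like (here: within age groups) is one simple way of correcting for a confounder.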

To correct for these confounders effectively and still find significant results, a cohort study usually has to run for a long time. Keep in mind that it takes some time to develop adenocarcinoma of the lung after starting to smoke. This long period of time causes another limitation: people’s circumstances change. Some in the smoking group decide to stop smoking, others die, and others move to another country and drop out of the research. What about social factors? A society might come to look down on smokers, who might then underreport their smoking. Conversely, smoking might be considered cool, and some in the non-smoker group might start to smoke. There are many variables that can change and that make the results hard to interpret. Researchers are then tempted to select people who are unlikely to move or who are committed to the study, and this leads to selection bias.


The word cohort comes from the Latin cohors, meaning a group of warriors proceeding together in time. The term cohort study is attributed to Frost, who studied tuberculosis in the beginning of the 20th century. In the 1960s the Dutch scientist Korteweg used this method of study to analyse the incidence of lung cancer in the Netherlands. Yet this does not mean that no cohort studies were performed before the 1930s. They just had a different name, such as longitudinal, follow-up or simply prospective studies. In the late 1800s, data on health became necessary for making effective policy. Not only policymakers required data, but insurance companies as well. They recorded, for example, the number of deaths for specific occupations. In the 1950s, several landmark studies were started that helped us gain insight into risk factors for certain outcomes. They continue to this day. One example is the Framingham study, which studies an entire town in the US to find out what the risk factors for cardiovascular disease are. Closer to home, we have the Generation R study of the Erasmus MC, which follows children over a period of time.

A lot of our knowledge in medicine (cancer and radiation, asbestos and mesothelioma, high blood pressure and heart attack) comes from cohort studies. Cohort studies express an association as a relative risk. For example, high cholesterol gives a 4 times greater risk of a heart attack compared with low cholesterol. We have to note that cohort studies do not prove anything in the strict scientific sense; they only express risks or probabilities of association. High cholesterol is associated with heart attack, but causation is not proven in the strict scientific sense. But what if researchers take large groups of people and examine them? What if not only a researcher in the USA, but also researchers in India, Thailand, Australia and Greece examine the same outcome for the same exposure and reach the same conclusion and (approximately) the same relative risks? Or what if 90% of the smokers develop lung carcinoma and only 5% of the non-smokers? Is a consistent relative risk of 20 times greater risk of developing a certain outcome enough proof?
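As a minimal sketch of the arithmetic behind a relative risk, the snippet below divides the risk in the exposed group by the risk in the unexposed group. The percentages are the hypothetical 90% and 5% from the paragraph above; the group sizes of 1000 each and the function name `relative_risk` are assumptions made for illustration.

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk among the exposed divided by risk among the unexposed."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    return risk_exposed / risk_unexposed


# Hypothetical cohort: 1000 smokers of whom 900 (90%) develop the disease,
# and 1000 non-smokers of whom 50 (5%) develop it. Group sizes are assumed.
rr = relative_risk(900, 1000, 50, 1000)
print(f"relative risk: {rr:.0f}")
```

With these numbers the relative risk comes out at 18, in the same range as the "20 times greater risk" mentioned above.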

What is meant by the strict scientific sense are Koch’s postulates, which date from before the era of cohort studies. Koch formulated criteria to establish a hard causal relationship between a disease and a microbe, since at the end of the 19th century people were mostly concerned with communicable diseases (diseases that spread, such as those caused by viruses and bacteria). Koch stated, for example, that a microbe that causes a disease must be present in all sick individuals and not in healthy ones, and that introducing the microbe into a healthy individual must make that person sick as well. These postulates could not really be used for risk factors. With them, you could not prove that smoking causes lung cancer, because people who had never smoked got the same lung cancer too. Therefore, with the rise of non-communicable (chronic) diseases, the need for a different type of ‘proving’ emerged.

Over the years, scientists have used different analytical models, such as regression and proportional hazards models, to improve cohort studies and to make predictions from them. Results are accompanied by many analyses, p-values and confidence intervals, which we will discuss in other articles.

Quality of Trials

Many studies contradict each other in the conclusions that can be drawn from them. One study might say that drug X is good for everyone, while another might say drug X is actually very bad for your health. The essential question here is how we can rate an article. Conflicting studies are not the only problem: many articles are simply wrong. They are not reproducible, and if they are, the results are not. Research has been flawed and will continue to be flawed. The reasons why, and how to investigate this, will be discussed in another post. In this post, we will learn how to distinguish well-executed research from poorly executed research.

When reading an article, you need to know whether the results could be due to chance and what the biases and limitations are, because there will always be some. We also need to judge the validity of the study by asking questions such as: when research is done on a population in Sweden, can the results be applied to a population in the Netherlands?

A study is built on a structure of introduction, methods, results and discussion. All of these are essential, but the methods and results form the core of the research. In general, a study answers the following questions through its different sections:

  1. What was the motive behind the research?
  2. How did we perform the research?
  3. What did we find in the experiment?
  4. Why did we find these results and what biases/limitations did we experience?

As a general rule, we assume that all conclusions of an article are wrong unless they are reproduced. This way, we can be super critical. Let’s start with important aspects that a trial needs to have, starting from the beginning.

The title should be concise, describe in one sentence what the research is about and do so in an objective manner.
The introduction of a well-written article describes why the research is being done in the first place. Why now? And how is this trial different from the ones that preceded it? I mean, you have to have a good reason to do a trial. Good quality studies use the PICO model, which stands for patients, intervention, control and (primary) outcome. Studies should describe the study subjects: who are they and where do they come from? What intervention was used: a drug, a procedure or something else? Who is the control group and how are they ‘controlled’? Finally, what do we want to know? For example, a primary outcome could be mortality after 3 years. Mortality is a fairly objective parameter to measure, since we have a good objective definition of death. In contrast, happiness is not (yet) an objective measurement. A PICO description could look like this: we studied 60 male patients between 30 and 40 years old, who were randomly divided into an intervention group receiving drug X and a control group receiving placebo, and we measured mortality after 3 years.

Concerning the methodology of a study, the most important factor is whether the study is reproducible. This does not necessarily mean getting the same results, but being able to reproduce the conditions under which the trial took place. For example, if researchers used substance X and don’t specify what that entails, other researchers cannot reproduce the study. Another important aspect is how generalizable the study is. This connects to the question asked earlier about (external) validity. If an intervention proves to work in athletes, what makes you so sure it will also work in ordinary people? The third most important factor is the power of a study, which relates to sample size. If an intervention really works, what is the probability that the study will detect a difference between the intervention and control groups? We usually want this to be 80%, and we can achieve this by adjusting the sample size.
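To give a feel for how power and sample size are connected, here is a rough sketch based on a common normal-approximation formula for comparing two proportions. The formula choice, the function name `sample_size_per_group` and the example mortality figures (20% in the control group, 10% in the intervention group) are assumptions for illustration. A real trial would have its sample size worked out by a statistician.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(p_control, p_intervention, alpha=0.05, power=0.80):
    """Approximate subjects needed per group to detect a difference between
    two proportions (two-sided test, normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # cut-off for significance
    z_power = NormalDist().inv_cdf(power)          # cut-off for desired power
    variance = p_control * (1 - p_control) + p_intervention * (1 - p_intervention)
    difference = p_control - p_intervention
    return ceil((z_alpha + z_power) ** 2 * variance / difference ** 2)


# Hypothetical example: 20% mortality with placebo, 10% with drug X.
n = sample_size_per_group(0.20, 0.10)
print(f"subjects needed per group for 80% power: {n}")
```

Under these assumptions roughly 200 subjects per group are needed. Asking for more power, or trying to detect a smaller difference, pushes the required number up.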

Of course, other methodological aspects matter as well. One is whether a control population was set up, because otherwise the intervention has nothing to be compared with. Is randomization applied? Randomization ensures that a subject in the intervention group was not hand-picked to be put there, but was assigned at random. A researcher might hand-pick a healthy person to undergo a surgical intervention, knowing that this person’s chances of survival are better. That is why we need randomization. Even better is a double-blind randomized trial, in which neither the researcher nor the subject knows which group the subject belongs to. A computer decides at random. Note that randomization is sometimes not possible or not ethical.
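Letting a computer decide can be sketched as follows. This is a simplified illustration of simple randomization into two equally sized groups, not the procedure of any particular trial. The function name `randomize` and the subject numbers 1 to 60 are assumptions.

```python
import random


def randomize(subject_ids, seed=None):
    """Randomly split subjects into an intervention and a control group
    of (nearly) equal size."""
    rng = random.Random(seed)      # a seed makes the allocation repeatable
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)          # random order, so nobody is hand-picked
    half = len(shuffled) // 2
    return {"intervention": shuffled[:half], "control": shuffled[half:]}


# Hypothetical trial with 60 subjects, numbered 1 to 60.
groups = randomize(range(1, 61), seed=42)
print(len(groups["intervention"]), len(groups["control"]))
```

Because the order is shuffled before the list is cut in half, the researcher has no influence on who ends up in which group. For blinding, the resulting allocation list would additionally have to be kept hidden from both researcher and subject.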

Other important factors, which we will not deal with here, concern how the subjects are analysed: on an intention-to-treat basis or on a per-protocol basis? Intention-to-treat analysis keeps everyone in the group they were randomized to, while per-protocol analysis includes only those who completed the study as planned, so the choice affects how much of the benefit of randomization is preserved.
In the results, the most important thing to look for is the drop-out rate. How many people did not continue with the study? This affects the study’s power. Also, it should not be the case that 50% of the people in the intervention group dropped out and only 20% in the control group, while the researcher still makes claims of significance. When people drop out, it may mean that only the better or healthier patients remained. So what happened to the patients who no longer come to the trial center? Are they too ill to make it? That is why the drop-out rate is important to consider. Also important, albeit less so, is that figures and charts can be interpreted without reading the text. They should be self-explanatory.

In the discussion, we need to focus on why and how the results of this trial are different from or similar to those of other trials. Every study has limitations and biases: has the author described them and taken measures to correct for them?


This list is not comprehensive, but it gives you a handle on judging articles. Future posts will focus on the pyramid of trial structures (epidemiological studies, cohorts, RCTs, case-reviews etc.), p-values & confidence intervals, and reasons why articles might be flawed.