Quality of Trials

Many studies contradict each other in the conclusions that can be drawn from them. One study might say that drug X is good for everyone, while another says that drug X is actually very bad for your health. The essential question is how we can rate an article. Conflicting studies are not the only problem: many articles are simply wrong. They are not reproducible, and when they are, the results often are not. Research has been flawed and will continue to be flawed. The reasons why, and how to investigate this, will be discussed in another post. In this post, we will learn how to distinguish well-executed research from poorly executed research.

When reading an article, you need to know whether the results could be due to chance and what the biases and limitations are, because there will always be some. We also need to judge the validity of the study by asking questions such as: when research is done on a population in Sweden, can the results be applied to a population in the Netherlands?

A study is built on a structure of introduction, method, results, and discussion. All of these are essential, but the methodology and results form the core of the research. In general, a study answers the following questions through its different sections:

  1. What was the motive behind the research?
  2. How did we perform the research?
  3. What did we find in the experiment?
  4. Why did we find these results and what biases/limitations did we experience?

As a general rule, we assume that all conclusions of an article are wrong unless they have been reproduced. This way, we can be super critical. Let’s go through the important aspects that a trial needs to have, starting from the beginning.

The title should be concise, describe in one sentence what the research is about, and do so in an objective manner.

The introduction of a well-written article describes why the research is being done in the first place. Why now? And how is this trial different from the ones that preceded it? After all, you need a good reason to do a trial.

Good quality studies use the PICO model. This stands for patients, intervention, control and (primary) outcome. Studies should describe the study subjects: who are they and where do they come from? What intervention was used: was it a drug, a procedure or something else? Who is the control group and how are they ‘controlled’? Finally, what do we want to know? For example, a primary outcome could be mortality after 3 years. Mortality is a fairly objective parameter to measure, because we have a good objective definition of death. In contrast, happiness is not (yet) an objective measurement. A PICO description could look like this: we tested 60 male patients between 30 and 40 years old, who were divided randomly into an intervention group receiving drug X and a control group receiving placebo, and we measured mortality after 3 years.

Concerning the methodology of a study, the most important factor is whether the study is reproducible. This does not necessarily mean obtaining the same results, but being able to reproduce the conditions under which the trial took place. For example, if researchers used substance X and don’t specify what that entails, other researchers cannot reproduce the study. Another important aspect is how generalizable the study is. This connects to the question asked earlier about (external) validity. If an intervention proves to work in athletes, what makes you so sure it will also work in the general population? The third most important factor is the power of a study, which relates to sample size. If an intervention really works, what is the probability that the study will detect a difference between the intervention and control groups? We usually want this to be 80%, and we can achieve this by adjusting the sample size.
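To make the link between power and sample size concrete, here is a minimal sketch using a common normal-approximation formula for comparing two proportions. The function name, the 5% two-sided significance level and the mortality rates (20% on placebo, 10% on drug X) are assumptions chosen for illustration; real trials often use more refined methods.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(p_control, p_intervention, alpha=0.05, power=0.80):
    """Approximate number of subjects needed per group to compare two proportions.

    Uses the simple normal approximation for a two-sided test.
    """
    if p_control == p_intervention:
        raise ValueError("The two proportions must differ.")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_power = NormalDist().inv_cdf(power)          # desired power
    variance = p_control * (1 - p_control) + p_intervention * (1 - p_intervention)
    effect = p_control - p_intervention
    return ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)


# Hypothetical example: 3-year mortality of 20% on placebo vs 10% on drug X.
n = sample_size_per_group(0.20, 0.10)
print(f"Subjects needed per group: {n}")
```

Under these assumed rates the answer comes out at roughly two hundred subjects per group. Asking for more power, or expecting a smaller difference between the groups, pushes the required sample size up.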

Of course, other methodological aspects matter as well. One is whether a control population is set up. Without one, the effect of an intervention cannot be compared against anything. Is randomization applied? Randomization ensures that a subject in the intervention group was not hand-picked to be put there, but was assigned at random. A researcher might hand-pick a healthy person to undergo surgery, knowing that this person’s chances of survival are better. That is why we need randomization. Even better is a randomized trial that is also double-blind. This is when neither the researcher nor the subject knows which group the subject belongs to. A computer decides the assignment at random. Note that randomization is sometimes not possible or not ethical.
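As a rough sketch of what "a computer randomly decides" can look like, the snippet below shuffles the subjects and splits them into two equally sized arms. The function name, subject IDs and seed are hypothetical, and real trials typically use more elaborate schemes (for example stratified or block randomization) managed by someone outside the study team.

```python
import random


def randomize(subject_ids, seed=None):
    """Assign subjects to two equally sized arms in random order."""
    rng = random.Random(seed)  # a fixed seed makes the allocation auditable
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    allocation = {sid: "intervention" for sid in shuffled[:half]}
    allocation.update({sid: "control" for sid in shuffled[half:]})
    return allocation


# Hypothetical subject IDs for a trial with 60 participants.
subjects = [f"S{i:02d}" for i in range(1, 61)]
allocation = randomize(subjects, seed=2024)
counts = {
    arm: list(allocation.values()).count(arm)
    for arm in ("intervention", "control")
}
print(counts)
```

For blinding, the allocation table would be kept away from both the researchers and the subjects until the analysis is done.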

Another important factor, which we will not deal with here, is how the subjects are analysed. Is this done on an intention-to-treat basis or a per-protocol basis? The choice matters, because excluding subjects after randomization can undo the balance that randomization created.

In the results, the most important thing to look for is the drop-out rate. How many people did not continue with the study? This affects the study’s power. A researcher also should not make strong claims of significance when 50% of the intervention group dropped out and only 20% of the control group did. When people drop out, it may mean that only the better or healthier patients remained. So what happened to the patients who no longer come to the trial center? Are they too ill to make it? That is why the drop-out rate is important to consider. Of lesser importance, figures and charts should be self-explanatory: it should be possible to interpret them without reading the text.
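The drop-out comparison described above comes down to simple arithmetic, sketched here with hypothetical counts that reproduce the 50% versus 20% example. The function name and the warning threshold are assumptions for illustration only; the threshold is not an established cut-off.

```python
def dropout_rate(enrolled, completed):
    """Fraction of enrolled subjects who did not finish the study."""
    if enrolled <= 0 or not 0 <= completed <= enrolled:
        raise ValueError("Need enrolled > 0 and 0 <= completed <= enrolled.")
    return (enrolled - completed) / enrolled


# Hypothetical counts matching the 50% vs 20% example in the text.
intervention_rate = dropout_rate(enrolled=30, completed=15)
control_rate = dropout_rate(enrolled=30, completed=24)
difference = abs(intervention_rate - control_rate)

# Illustrative threshold only; not an established cut-off.
WARN_DIFFERENCE = 0.10
flagged = difference > WARN_DIFFERENCE
print(
    f"intervention {intervention_rate:.0%}, "
    f"control {control_rate:.0%}, flagged: {flagged}"
)
```

A large gap between the two arms is a reason to ask why subjects left, not proof by itself that the results are wrong.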

In the discussion, we need to focus on why and how the results of this trial differ from or resemble those of other trials. Every study has limitations and biases. Has the author described them and taken measures to correct for them?


This list is not comprehensive, but it offers a starting point for judging articles. Future posts will focus on the pyramid of trial structures (epidemiological studies, cohorts, RCTs, case reviews etc.), p-values and confidence intervals, and the reasons why articles might be flawed.