A short list of definitions for the lazy exam candidate

The college loves to include these in the exam paper, largely because the definitions of statistics terminology are usually quite fixed, not open to interpretation, and therefore easy to grade (i.e. you either get zero or you get full marks).

  • Question 7 from the second paper of 2021 (sensitivity, specificity, ROC curve)
  • Question 27 from the second paper of 2020 (external validity, allocation concealment, stratification, sensitivity analysis, fragility index)
  • Question 24 from the second paper of 2019 (stratification, intention to treat analysis, sensitivity analysis, Kaplan-Meier curve and the analysis of competing risk).
  • Question 17 from the first paper of 2012 (Define "EBM", levels of evidence, intention to treat analysis)
  • Question 23 from the second paper of 2011 (RR, NNT, P-value, CI)
  • Question 9 from the second paper of 2010 (RR, AR, NNT, power of a study)
  • Question 23 from the second paper of 2008 (Type 1 and 2 error, power, effect size, sample size)
  • Question 29.1 from the first paper of 2008 (Phases of a clinical trial; also calculate ARR, RRR, NNT)
  • Question 29.2 from the first paper of 2008 (Calculate sensitivity, specificity, PPV and NPV)
  • Question 10 from the second paper of 2006 (Define intention to treat analysis and randomisation)
  • Question 25 from the first paper of 2006 (RR, AR, NNT, power of a study)
  • Question 13 from the first paper of 2005 (Define sensitivity, specificity, PPV and NPV)
  • Question 14 from the second paper of 2002 (Define sensitivity, specificity, PPV and NPV)
  • Question 2b from the first paper of 2001 (Fluff; a creative writing task)

Most often, the question takes the shape of "define this or that concept", or (more worryingly) "calculate the following statistical variables". The only exception was the ancient Question 2b from the first paper of 2001, which asked "What is the relevance of Evidence Based Medicine to your patients and how will you apply this?", which is equivalent to asking a high school English class to write essays on the topic of "What does freedom mean to me?". Fortunately, this sort of thing has not been seen for a while. Even raw definitions of such pleb items as sensitivity and specificity have not been seen since about 2008 (i.e. they disappeared around the same time as the CICM Part I appeared on the scene; this primary exam then seems to have absorbed many of the EBM definition questions). These days, Part II tends to feature "fellow-level" questions about interpreting funnel plots and examining meta-analysis articles for validity.

Without further ado, here is a list of all the previously examined definitions in the statistics questions from the CICM Fellowship exam.

The definition of evidence-based medicine

"Evidence based medicine is the conscientious, explicit and judicious use of current best evidence in making decisions about the care of individual patients. The practice of evidence based medicine means integrating individual clinical expertise with the best available external clinical evidence from systematic research."

Levels of evidence

  • 1a: Systematic review (with homogeneity) of RCTs
  • 1b: Individual RCT (with narrow Confidence Interval)
  • 1c: All or none (i.e. all patients died before the Rx became available, but some now survive on it; or when some patients died before the Rx became available, but none now die on it)
  • 2a: Systematic review (with homogeneity) of cohort studies
  • 2b: Individual cohort study (including low quality RCT; e.g., <80% follow-up)
  • 2c: "Outcomes" Research or ecologic studies (studies of group characteristics)
  • 3a: Systematic review (with homogeneity) of case-control studies
  • 3b: Individual Case-Control Study
  • 4: Case-series (and poor quality cohort and case-control studies)
  • 5: Expert opinion or based on physiology, bench research or "first principles"

Phases of a clinical trial

  • In vitro activity
  • Animal model
  • Phase 1: healthy volunteers
  • Phase 2: Patients with the disease of interest
  • Phase 3: large scale trial on the patients with the disease
  • Phase 4: "post-marketing experience"

Randomisation in clinical trials

  • Assignment of clinical trial participants so that each participant has an equal chance of being assigned to any of the groups.
  • Successful randomisation requires that group assignment cannot be predicted in advance.
  • Minimises selection bias
  • Allows probability theory to be used to express the likelihood that chance is responsible for the differences in outcome among groups


Stratification

  • Stratification is the partitioning of subjects and results by a factor other than the treatment given.
  • Stratification ensures that pre-identified confounding factors are equally distributed, to achieve balance. The objective is to remove "nuisance variables", eg. the presence of neutropenia in a trial performed on septic patients. One would want to ensure that the treatment group and the placebo group had equal numbers of these haematology disasters.

Intention to treat analysis

  • "Once randomised, always analysed"
  • All randomised patients have to be a part of the final analysis, in the groups to which they were originally allocated
  • This preserves the bias-protective effect of randomisation
  • Minimises Type 1 errors (false positives)
  • When intention-to-treat analysis agrees with per-protocol analysis, it increases the validity of the study

Sensitivity analysis

  • Analysis of the data from a clinical trial where some of the assumptions are intentionally changed
  • One example of this is to assume that all the patients lost to follow-up or who dropped out of the study have failed treatment. If the 

Confidence interval

  • The range of values within which the "true" population value is likely to be found.
  • A CI of 95% means that if a trial was repeated an infinite number of times, and a CI was calculated each time, 95% of those intervals would contain the "true" value.
  • The CI gives an indication of the precision of the sample mean as an estimate of the "true" population mean
  • A wide CI can be caused by small samples or by a large variance within a sample.
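
The effect of sample size on the width of a CI can be illustrated with a short calculation. This is only a sketch: the function name and the data are made up for illustration, and it uses a normal approximation (for very small samples, a t-distribution would usually be preferred).

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, level=0.95):
    """Normal-approximation CI for the mean of a sample (illustrative only)."""
    n = len(sample)
    m = mean(sample)
    se = stdev(sample) / sqrt(n)               # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + level / 2)  # roughly 1.96 for a 95% CI
    return m - z * se, m + z * se


# Hypothetical measurements: same spread of values, different sample sizes.
small_sample = [10, 11, 9, 10]
large_sample = [10, 11, 9, 10, 10, 11, 9, 10] * 10

wide = mean_confidence_interval(small_sample)
narrow = mean_confidence_interval(large_sample)
print("n = 4: ", wide)
print("n = 80:", narrow)
```

The smaller sample produces the wider interval, even though the individual measurements are much the same.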


P-value

  • Loosely, the probability of the observed result arising by chance alone
  • More precisely, the p-value is the chance of getting the reported study result (or one even more extreme) when the null hypothesis is actually true.

Type 1 error

  • This is a "false positive".
  • The null hypothesis is incorrectly rejected (i.e. there really is no treatment effect, but the study finds one)
  • The alpha value determines the risk of this happening. An alpha value of 0.05 (the threshold below which a p-value is called significant) means there is a 5% chance of making a Type 1 error when the null hypothesis is true.

Type 2 error

  • This is a "false negative"
  • The null hypothesis is incorrectly accepted (i.e. there really is a treatment effect, but you fail to find it)
  • The beta determines the risk of this happening. Where beta is 0.2 (a common setting), at a power of 0.8 (1-beta), there is a 20% chance of making a Type 2 error.


Power

  • The power of a statistical test is the probability that it correctly rejects the null hypothesis, when the null hypothesis is false.
  • In other words, the chance of a study detecting a treatment effect which truly exists
  • Power = (1 - false negative rate)
  • Power = (1- beta error)
  • Normally, power is 80% (i.e. a 20% chance of a false negative result)

Determinants of sample size

  • Alpha value: the level of significance (normally 0.05)
  • Beta value: the accepted risk of a Type 2 error (normally 0.2, which corresponds to a power of 0.8)
  • The statistical test you plan to use
  • The variance of the population (the greater the variance, the larger the sample size)
  • The effect size (the smaller the effect size, the larger the required sample)
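
How these determinants interact can be sketched with one commonly used normal-approximation formula for comparing two proportions. The formula, the function name and the event rates below are assumptions chosen for illustration; a real trial would use whatever method matches its planned statistical test.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(p_control, p_treatment, alpha=0.05, power=0.8):
    """Approximate patients needed per group to compare two event rates.

    Assumes a two-sided test and equal group sizes (illustrative sketch).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # set by the alpha value
    z_beta = NormalDist().inv_cdf(power)           # set by the power (1 - beta)
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    effect = p_control - p_treatment               # the effect size
    return ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Hypothetical event rates: 30% mortality in the control group.
large_effect = sample_size_per_group(0.30, 0.10)  # 20% absolute difference
small_effect = sample_size_per_group(0.30, 0.20)  # 10% absolute difference
high_power = sample_size_per_group(0.30, 0.20, power=0.9)
print(large_effect, small_effect, high_power)
```

Halving the effect size increases the required sample several-fold, and asking for more power increases it further.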

Effect size

  • Effect size is a quantitative reflection of the magnitude of a phenomenon, eg. the difference in the incidence of an arbitrarily defined outcome between the treatment group and the placebo group.
  • Effect size suggests the clinical relevance of an outcome.
  • The effect size is agreed upon a priori so that a sample size can be calculated (as the study needs to be powered appropriately to detect a given effect size)

Absolute risk

  • Actual event rate in the group (treatment or placebo). Essentially, it is the incidence rate.

Relative Risk (risk ratio)

  • The rate of events in the treatment group, divided by the rate of events in the control group.
    The college describes it as "the difference in event rates between 2 groups expressed as proportion of the event rate in the untreated group" (strictly speaking, this wording is closer to a description of relative risk reduction).

Odds ratio

  • The Odds Ratio represents the odds that an outcome will occur given a particular exposure, compared to the odds of the outcome occurring in the absence of that exposure.
  • An OR =1 suggests there is no association.
  • If the CI for an OR includes 1, then the OR is not significant (i.e. there might not be an association)
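
As a rough sketch, the OR and its CI can be calculated from a 2×2 table. The CI here uses the log (Woolf) method, which is one common choice and an assumption on my part; the function name and the counts are invented for illustration.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def odds_ratio(exposed_events, exposed_no_events,
               unexposed_events, unexposed_no_events, level=0.95):
    """Odds ratio with an approximate CI (log method); needs non-zero cells."""
    a, b = exposed_events, exposed_no_events
    c, d = unexposed_events, unexposed_no_events
    estimate = (a / b) / (c / d)                 # odds in exposed / odds in unexposed
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of log(OR)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    lower = exp(log(estimate) - z * se_log)
    upper = exp(log(estimate) + z * se_log)
    return estimate, lower, upper


# Hypothetical data: 20 of 100 exposed and 10 of 100 unexposed had the outcome.
estimate, lower, upper = odds_ratio(20, 80, 10, 90)
significant = not (lower <= 1 <= upper)
print(estimate, lower, upper, significant)
```

In this made-up example the OR is 2.25, but the CI just includes 1, so the association would not be called significant.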

Relative risk reduction

  • RRR= absolute risk reduction divided by the control group risk.
  • Or, one can calculate it by subtracting relative risk (RR) from 1.
  • Thus, RRR = (1-RR)

Absolute risk reduction

  • This is the difference between the baseline population risk and the treatment risk.
  • It is an effective way of demonstrating a treatment effect.
  • ARR = control event rate - experimental event rate

Attributable risk

  • This is a measure of the absolute effect of an exposure: the risk in those exposed compared to the risk in those unexposed.
  • It is the absolute difference in outcome rates between the exposed and the unexposed groups: AR = incidence in exposed - incidence in unexposed

Analysis of competing risk:

  • A competing risk is an event that either hinders the observation of the event of interest or modifies the chance that this event occurs
  • An example is death while on dialysis and getting a kidney transplant (the two events interfere with one another)
  • Conventional methods (eg. Kaplan–Meier and standard Cox regression) ignore the competing events and may not be appropriate; competing risk analysis methods must be employed instead

Numbers needed to treat or harm

  • NNT = 1 / (control event rate - experimental event rate)
  • One must use the absolute, rather than the relative, values here.  NNT is the inverse of absolute risk reduction.
  • Let's say the absolute risk reduction is 10%. Thus, NNT = 1/0.1, or 10.
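
For the "calculate the following" style of question, the risk-based measures defined above can all be derived from the two event rates. The sketch below uses a hypothetical function name and invented trial numbers, and takes the control-minus-treatment convention from the NNT formula.

```python
def risk_measures(control_events, control_total,
                  treatment_events, treatment_total):
    """RR, ARR, RRR and NNT from raw event counts (illustrative sketch)."""
    cer = control_events / control_total      # control event rate
    eer = treatment_events / treatment_total  # experimental event rate
    arr = cer - eer                           # absolute risk reduction
    return {
        "RR": eer / cer,                      # relative risk
        "ARR": arr,
        "RRR": arr / cer,                     # same as 1 - RR
        "NNT": 1 / arr,                       # undefined if the rates are equal
    }


# Hypothetical trial: 20/100 deaths with placebo, 10/100 with treatment.
result = risk_measures(20, 100, 10, 100)
print(result)
```

With these numbers the ARR is 10% and the NNT is 10, whereas the RRR is a much more impressive-sounding 50%.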


Sensitivity

  • Sensitivity = true positives / (true positives + false negatives)
  • This is the proportion of patients with the disease in whom the disease was correctly identified


Specificity

  • Specificity = true negatives / (true negatives + false positives)
  • This is the proportion of healthy patients in whom disease was correctly excluded

Positive predictive value

  • The proportion of positive test results which are true positives (i.e. the patient really has the disease) is the Positive Predictive Value
  • PPV = true positives / total positives (true and false)

Negative predictive value

  • The proportion of negative test results which are true negatives (i.e. the patient really does not have the disease) is the Negative Predictive Value
  • NPV = true negatives / total negatives (true and false)
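
The four formulae for sensitivity, specificity, PPV and NPV can be applied to a 2×2 table as follows. This is a minimal sketch with a hypothetical function name and invented counts.

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from the cells of a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),  # of those with disease, how many test positive
        "specificity": tn / (tn + fp),  # of those without disease, how many test negative
        "PPV": tp / (tp + fp),          # of the positive tests, how many are true
        "NPV": tn / (tn + fn),          # of the negative tests, how many are true
    }


# Hypothetical test applied to 1000 patients, 100 of whom have the disease.
accuracy = diagnostic_accuracy(tp=90, fp=50, fn=10, tn=850)
print(accuracy)
```

In this example, a test with 90% sensitivity and about 94% specificity still has a PPV of only about 64%, because the disease is uncommon in the tested group.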

Kaplan-Meier curve

  • A Kaplan-Meier curve is defined as the probability of surviving in a given length of time while considering time in many small intervals
  • The curve itself is a plot of the fraction of patients surviving in each group over time
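
The "many small intervals" idea can be sketched as a product-limit calculation: at each time where a death occurs, the running survival probability is multiplied by the fraction of at-risk patients who survive that moment. The function name and the follow-up data are hypothetical.

```python
def kaplan_meier(times, died):
    """Kaplan-Meier survival estimates for one group.

    times: follow-up time for each patient
    died:  True if the patient died at that time, False if censored
    Returns a list of (time, survival probability) steps.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, d in zip(times, died) if ti == t and d)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censored
        if deaths:
            survival *= 1 - deaths / at_risk         # survive this interval
            curve.append((t, survival))
        at_risk -= leaving                           # no longer followed up
    return curve


# Hypothetical group of five patients; two are censored (lost or study ended).
curve = kaplan_meier(times=[1, 2, 3, 4, 5],
                     died=[True, False, True, True, False])
print(curve)
```

Censored patients do not drop the curve themselves, but they shrink the at-risk denominator, so each later death produces a bigger step down.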

Receiver operating characteristic curve (ROC curve)

  • This is a plot of sensitivity vs. false positive rate (1-specificity), for a range of cut-off values of the test result
  • Sensitivity is on the y-axis, from 0% to 100%
  • The ROC curve graphically represents the compromise between sensitivity and specificity in tests which produce results on a numerical scale, rather than binary (positive vs. negative results)
  • The ROC curve can be used to determine the cut-off point at which the compromise between sensitivity and specificity is optimal.

Area Under the Curve (AUC)

  • AUC is the Area Under the ROC curve.
  • The higher the AUC, the more accurate the test
  • An AUC of 1.0 means the test is 100% accurate
  • An AUC of 0.5 (50%) means the ROC curve is a straight diagonal line, which represents the "ideal bad test", one which is only ever accurate by pure chance.
  • When comparing two tests, the more accurate test is the one with an ROC curve further to the top left corner of the graph, with a higher AUC.
  • The best cutoff point for a test (which separates positive from negative values) is the point on the ROC curve which is closest to the top left corner of the graph.
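
A rough sketch of how the ROC curve, the AUC and the "closest to the top left corner" cut-off relate to each other is given below. The function names and the test results are invented; the AUC is estimated by the trapezoidal rule, and a result is treated as positive when it is at or above the cut-off.

```python
from math import hypot


def roc_points(scores, diseased):
    """(false positive rate, sensitivity, cutoff) for each candidate cutoff."""
    positives = sum(diseased)
    negatives = len(diseased) - positives
    points = [(0.0, 0.0, float("inf"))]  # cutoff so high that nothing is positive
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, d in zip(scores, diseased) if s >= cutoff and d)
        fp = sum(1 for s, d in zip(scores, diseased) if s >= cutoff and not d)
        points.append((fp / negatives, tp / positives, cutoff))
    return points


def area_under_curve(points):
    """Trapezoidal area under the ROC points."""
    area = 0.0
    for (x0, y0, _), (x1, y1, _) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


def best_cutoff(points):
    """Cutoff whose point lies closest to the top left corner (0, 1)."""
    return min(points, key=lambda p: hypot(p[0], 1 - p[1]))[2]


# Hypothetical biomarker results for nine patients.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]
diseased = [False, False, False, True, False, True, True, True, True]

points = roc_points(scores, diseased)
auc = area_under_curve(points)
cutoff = best_cutoff(points)
print(auc, cutoff)
```

Other ways of choosing the cut-off exist, and the choice in practice also depends on the relative cost of false positives and false negatives.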

Likelihood ratio

  • The slope of a tangent at a point on the ROC curve represents the likelihood ratio for a single test value
  • Positive likelihood ratio = sensitivity / (1-specificity)
    • The probability of a person who has the disease testing positive divided by the probability of a person who does not have the disease testing positive.
  • Negative likelihood ratio = (1-sensitivity) / specificity
    • The probability of a person who has the disease testing negative divided by the probability of a person who does not have the disease testing negative.
  • Pre-test probability = (true positive + false negative) / total sample
  • Pre-test odds: pre-test probability / (1- pre-test probability)
  • Post-test odds: likelihood ratio × pre-test odds
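
The chain from pre-test probability to post-test probability can be sketched as follows. The function names and the numbers are hypothetical; the final step (converting odds back to a probability as odds / (1 + odds)) is not listed above but is the standard conversion.

```python
def likelihood_ratios(sensitivity, specificity):
    """Positive and negative likelihood ratios for a test."""
    positive_lr = sensitivity / (1 - specificity)
    negative_lr = (1 - sensitivity) / specificity
    return positive_lr, negative_lr


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert probability to odds, apply the LR, convert back to probability."""
    pre_test_odds = pre_test_probability / (1 - pre_test_probability)
    post_test_odds = likelihood_ratio * pre_test_odds
    return post_test_odds / (1 + post_test_odds)


# Hypothetical test: sensitivity 90%, specificity 80%, pre-test probability 20%.
positive_lr, negative_lr = likelihood_ratios(0.9, 0.8)
after_positive = post_test_probability(0.2, positive_lr)
after_negative = post_test_probability(0.2, negative_lr)
print(positive_lr, negative_lr, after_positive, after_negative)
```

Here a positive result raises the probability of disease from 20% to about 53%, and a negative result lowers it to about 3%.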