Statistical tests for quantitative data

Question 4 from the second paper of 2004, which asked for details about parametric and non-parametric tests, was passed by 22% of the candidates. This is slightly better than Question 4 from the second paper of 2003, which asked the candidates to "compare and contrast the use of the Chi-squared test, Fisher’s Exact Test and logistic regression when analysing data". Such things now seem to be behind us.

Statistical tests for quantitative data

You use these to arrive at the p-value, i.e. the probability of obtaining results at least as extreme as those observed, purely by chance, if the null hypothesis were true.

These tests are broadly divided into parametric and non-parametric tests, according to the assumptions they make about the distribution of the data.

Parametric tests

Description of parametric tests

Parametric tests are more statistically powerful, but require assumptions to be made about the data, e.g. that the data are normally distributed (in a bell curve). If the data deviate strongly from these assumptions, a parametric test may lead to incorrect conclusions.

If the sample size is too small, parametric tests may also lead to incorrect conclusions, because a small sample cannot be relied upon to approximate the normal distribution.

Examples of parametric tests (a worked sketch follows the list):

  • z-test (based on the normal distribution)
  • Student's t-test
  • Analysis of variance (ANOVA)
  • Pearson's correlation coefficient
  • Linear regression and multiple regression
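
By way of illustration, a Student's t-test comparing two groups takes only a few lines of Python. This is a minimal sketch using scipy, with invented data:

    from scipy import stats

    # Invented example data: a lab value measured in two patient groups
    group_a = [4.1, 3.9, 4.5, 4.2, 4.8, 3.7, 4.4]
    group_b = [5.2, 5.9, 4.9, 6.1, 5.5, 5.8, 5.1]

    # Two-sample (unpaired) Student's t-test; assumes both samples come
    # from normally distributed populations with equal variance.
    t_statistic, p_value = stats.ttest_ind(group_a, group_b)
    print(f"t = {t_statistic:.2f}, p = {p_value:.4f}")

A small p-value here would suggest that a difference in group means this large is unlikely to have arisen by chance under the null hypothesis.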

Non-parametric tests

Description of non-parametric tests

Non-parametric tests make no assumptions about the distribution of the data. If the assumptions for a parametric test are not met (e.g. the distribution is heavily skewed), one may be able to use an analogous non-parametric test.

Non-parametric tests are particularly suited to small sample sizes (<30). However, non-parametric tests have less statistical power, i.e. they are more likely to miss a real effect.

Examples of non-parametric tests (again, a sketch follows the list):

  • Mann-Whitney U test (equivalent to the Wilcoxon rank-sum test)
  • Wilcoxon signed-rank test
  • Kruskal-Wallis test
  • Friedman's test
  • Spearman's rank order correlation
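
Using the same invented data as above, a non-parametric comparison of the two groups might use the Mann-Whitney U test, again sketched with scipy:

    from scipy import stats

    group_a = [4.1, 3.9, 4.5, 4.2, 4.8, 3.7, 4.4]
    group_b = [5.2, 5.9, 4.9, 6.1, 5.5, 5.8, 5.1]

    # The Mann-Whitney U test compares the ranks of the pooled observations,
    # making no assumption about the shape of the underlying distributions.
    u_statistic, p_value = stats.mannwhitneyu(group_a, group_b,
                                              alternative="two-sided")
    print(f"U = {u_statistic:.1f}, p = {p_value:.4f}")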

Linear regression

  • Linear regression fits a "line of best fit" that describes the association between two variables.
  • b = the slope of this line of best fit (the regression coefficient; computed in the sketch below)
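
A minimal sketch of fitting such a line with scipy's linregress, using invented paired observations:

    from scipy import stats

    # Invented paired observations, e.g. drug dose (x) vs response (y)
    x = [1, 2, 3, 4, 5, 6, 7]
    y = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9]

    result = stats.linregress(x, y)
    # result.slope is b, the regression coefficient;
    # result.rvalue is Pearson's r for the same data.
    print(f"b (slope) = {result.slope:.2f}, intercept = {result.intercept:.2f}")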

Logistic regression analysis

  • A method of predicting a binary variable (e.g. dead or alive) on the basis of numerous predictive factors
  • ICU mortality prediction scores (e.g. APACHE) are derived using logistic regression analysis
  • Regression coefficients allow the contribution of different predictor variables to be analysed.
  • Goodness of fit can be estimated using a variety of mathematical methods, e.g. the Hosmer-Lemeshow test (a sketch of a toy model follows below).
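
A minimal sketch of the idea using scikit-learn's LogisticRegression, with invented predictors and outcomes (a real mortality model would require far more data and rigour):

    from sklearn.linear_model import LogisticRegression

    # Invented predictors per patient: [age in years, admission lactate]
    X = [[45, 1.2], [72, 4.5], [63, 2.1], [80, 6.0],
         [52, 1.8], [77, 5.2], [39, 1.0], [68, 3.9]]
    y = [0, 1, 0, 1, 0, 1, 0, 1]  # binary outcome: 0 = alive, 1 = dead

    model = LogisticRegression().fit(X, y)
    # The coefficients describe each predictor's contribution
    # to the log-odds of the outcome.
    print("coefficients:", model.coef_)
    # Predicted probability of death for a new (hypothetical) patient:
    print("P(death):", model.predict_proba([[70, 4.0]])[0][1])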

Correlation

  • Pearson's correlation: the correlation coefficient (r) quantifies the residual scatter around the regression line, i.e. the strength of the linear association. When r = 0, there is no correlation (the data are all noise and no line of best fit can confidently be drawn). When r = 1 (or r = -1), the correlation is perfectly positive (or negative).
  • Spearman's rank correlation: the non-parametric analogue of the above, suited to small or skewed samples. It presents us with a value called rs, which is interpreted in the same way as r (both coefficients are computed in the sketch below).
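
Both coefficients take one line each to compute with scipy (another sketch with invented data):

    from scipy import stats

    x = [1, 2, 3, 4, 5, 6, 7]
    y = [2.0, 4.1, 5.9, 8.3, 9.7, 12.4, 13.8]

    r, p = stats.pearsonr(x, y)        # Pearson's r: linear association
    rs, p_s = stats.spearmanr(x, y)    # Spearman's rs: rank-based analogue
    print(f"r = {r:.3f} (p = {p:.4f}), rs = {rs:.3f} (p = {p_s:.4f})")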
