Quantitative data: measures of variability and central tendency

Quantitative data types

  • Expressed numerically, and ordered on a scale
  • Interval data: values are spaced at constant intervals, but the scale has no true zero, e.g. temperature on the Celsius scale (0°C does not mean "no temperature")
  • Ratio data: interval data which has a true zero, e.g. pressure
  • Binary data: yes or no answers
  • Discrete data: isolated data points separated by gaps
  • Continuous data: part of a continuous range of values

This sort of data is described by measures of central tendency, dispersion, and "shape".

Measure of central tendency

  • This is the "typical" or average value of a population, allowing the population to be represented by a single value.
  • Examples: Median, mode (value which occurs most frequently) and the means (arithmetic mean and geometric mean)
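
As a minimal sketch, the measures listed above can be computed with Python's standard `statistics` module. The `scores` list is made-up illustrative data, not taken from any study.

```python
import statistics

# Hypothetical scores, chosen only for illustration
scores = [2, 4, 4, 5, 8, 16]

arithmetic_mean = statistics.mean(scores)            # sum / count = 39 / 6 = 6.5
median = statistics.median(scores)                   # middle of the sorted values = 4.5
mode = statistics.mode(scores)                       # most frequent value = 4
geometric_mean = statistics.geometric_mean(scores)   # nth root of the product of n values

print(arithmetic_mean, median, mode, round(geometric_mean, 2))
```

In this example the single large value (16) pulls the arithmetic mean above the median, which is why the different measures can disagree for asymmetrical data.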

Measures of variability

  • These describe the dispersion of data around a central value (usually the mean).
    • Range: the difference between the highest and the lowest score
    • Percentile: the value below which a given percentage of scores fall (median = the 50th percentile)
    • Deviation: distance between an observed score and the mean score
    • Variance: the average of the squared deviations
    • Standard deviation: square root of variance
      • Measure of the average spread of individual samples from the mean
      • Reporting the SD along with the mean gives an impression of how well that mean represents the data (i.e. if the SD is large relative to the mean, the mean is a poor measure of central tendency, because the data is so widely scattered.)
    • Standard error
      • This is an estimate of the spread of sample means around the population mean.
      • You don't know the population mean; you only know the sample mean and the standard deviation for your sample. If the standard deviation is large, the sample mean may be rather far from the population mean. How far is it? The SE can estimate this.

Standard Error (SE) = SD / square root of n, where n is the sample size

      • The variability among sample means increases with (a) wide variability of the individual data and (b) small sample size
      • SE is used to calculate the confidence interval.
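
The measures of variability above can be sketched in a few lines of Python. The `scores` list is hypothetical. The variance here is the sample variance (dividing by n - 1), and the 95% confidence interval uses the normal approximation (mean ± 1.96 × SE), which is an assumption of this sketch; for small samples a t-distribution multiplier would usually be more appropriate.

```python
import math
import statistics

# Hypothetical scores, chosen only for illustration
scores = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(scores)
mean = statistics.mean(scores)                    # 40 / 8 = 5
data_range = max(scores) - min(scores)            # 9 - 2 = 7
median_50th = statistics.quantiles(scores, n=4)[1]  # second quartile = 50th percentile

deviations = [x - mean for x in scores]           # distance of each score from the mean
sample_variance = statistics.variance(scores)     # sum of squared deviations / (n - 1) = 32 / 7
sample_sd = statistics.stdev(scores)              # square root of the sample variance
standard_error = sample_sd / math.sqrt(n)         # SE = SD / square root of n

# Assumed normal approximation for a 95% confidence interval
ci_low = mean - 1.96 * standard_error
ci_high = mean + 1.96 * standard_error

print(data_range, median_50th, round(sample_sd, 3), round(standard_error, 3))
print(round(ci_low, 2), round(ci_high, 2))
```

Note that the deviations always sum to zero, which is why they are squared before being averaged into the variance.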


Shape of the data

  • This refers to the shape of the probability distribution (for normally distributed data, the bell curve).
  • Skewness is a measure of the asymmetry of the probability distribution, i.e. the tendency of the curve to have a longer tail on one side.
  • Kurtosis or "peakedness" describes the width and height of the peak of the curve, i.e. the tendency for the scores to gather around the middle. More precisely, it reflects how heavy the tails are compared with a normal distribution.
  • A normal distribution is a perfectly symmetrical bell curve, and is not skewed.
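
Skewness and kurtosis can be illustrated with a short sketch based on central moments. The helper functions and the two data sets are hypothetical, and the moment-based (population) formulas used here are only one of several conventions; statistical packages may apply bias corrections that give slightly different numbers.

```python
def central_moment(data, k):
    """Average of (x - mean) raised to the power k."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** k for x in data) / len(data)

def skewness(data):
    """Third central moment scaled by SD cubed; 0 for symmetrical data."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def excess_kurtosis(data):
    """Fourth central moment scaled by variance squared, minus 3,
    so that a normal distribution scores 0."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3

# Hypothetical data sets, chosen only for illustration
symmetrical = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 10]

print(skewness(symmetrical))    # 0.0: no asymmetry
print(skewness(right_skewed))   # positive: long tail towards high values
print(excess_kurtosis(right_skewed))
```

A positive skewness indicates a longer tail towards high values, and a negative skewness a longer tail towards low values.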