How can the ‘central tendency’ of data be measured (50% of marks)? How is the ‘degree of dispersion’ described (50% of marks)?
There was a lack of sufficient knowledge to pass this question. The first question required a
definition of mean, median and mode and some explanation about when the use of one
would be preferred over another. One candidate gave an example of a simple data set (a set
of numbers) and calculated the mean, median & mode and explained the effect of an outlier.
This simple exercise demonstrated they had a very good understanding of the terms.
The degree of dispersion was poorly answered. A list of measures with a sentence or
equation describing each would have scored well. The list could have included range, interquartile range (and box and whisker plots), mean absolute deviation, variance, standard deviation and coefficient of variation. Not all measures needed to be discussed to pass the question. Many candidates confused standard deviation with standard error of the mean, and many candidates failed to discuss ranges, which are the simplest way to describe dispersion.
What follows is "a list of measures with a sentence or equation describing each", which should have "scored well".
Measures of central tendency:
- Median is the middle number in a data set that is ordered from least to greatest (with an even number of values, it is the average of the two middle numbers)
- Mode is the number that occurs most often in a data set
- Arithmetic mean is the average of a set of numerical values: their sum divided by the number of values
- Geometric mean is the nth root of the product of n numbers
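The examiner's comments mention a candidate who scored well by calculating these measures for a simple data set and showing the effect of an outlier. A minimal sketch of that exercise follows, using Python's standard `statistics` module. The data set and the function name `summarise` are made up for illustration and are not taken from the exam.

```python
import statistics

# Hypothetical data set, chosen only for illustration.
data = [2, 3, 3, 4, 5, 6, 12]
# The same data with one extreme value (an outlier) added.
with_outlier = data + [100]


def summarise(values):
    """Return the four measures of central tendency listed above."""
    return {
        "mean": statistics.mean(values),          # sum / count
        "median": statistics.median(values),      # middle of the sorted values
        "mode": statistics.mode(values),          # most frequent value
        "geometric_mean": statistics.geometric_mean(values),  # nth root of the product
    }


for label, values in (("original", data), ("with outlier", with_outlier)):
    print(label, summarise(values))
```

In this example the outlier pulls the arithmetic mean from 5 up to 16.875, while the median only moves from 4 to 4.5 and the mode stays at 3. This is the usual reason for preferring the median when the data are skewed or contain extreme values.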
Degree of dispersion: measures which describe how widely the data are spread, usually around a central value such as the mean
- Range: the difference between the highest and the lowest score (often reported simply as the lowest and highest values)
- Interquartile range: the difference between 75th and 25th percentiles
- Percentile: the value below which a given percentage of scores fall (median = the 50th percentile)
- Deviation: distance between an observed score and the mean score
- Variance: the average of the squared deviations from the mean
- Standard deviation: square root of variance; a measure of the average spread of individual samples from the mean. Reporting the SD along with the mean gives an impression of how well that mean represents the data (i.e. if the SD is large relative to the mean, the data is widely scattered and the mean alone is a poor summary of central tendency).
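- Standard error: an estimate of the spread of sample means around the population mean, i.e. how precisely the sample mean estimates the population mean. Unlike the SD, it does not describe the scatter of individual observations.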
SE = SD / square root of n. It is used to calculate the confidence interval.
- Coefficient of variation, also known as "relative standard deviation", is the SD divided by the mean.
- Mean absolute deviation is the average of the absolute deviations from a central point
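The measures of dispersion above can be sketched in the same way. The data set and the function name `dispersion` are again hypothetical. Two choices here are assumptions rather than anything stated in the notes: the variance and SD use the sample (n - 1) formula, and the quartiles use the "inclusive" method of `statistics.quantiles`. Other conventions give slightly different numbers.

```python
import math
import statistics

# Hypothetical data set, chosen only for illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9]


def dispersion(values):
    """Return the measures of dispersion listed above for one data set."""
    n = len(values)
    mean = statistics.mean(values)
    # Quartile cut points; the interpolation method is an assumed convention.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    sd = statistics.stdev(values)  # sample SD (divides by n - 1)
    return {
        "range": max(values) - min(values),
        "iqr": q3 - q1,                           # 75th minus 25th percentile
        "variance": statistics.variance(values),  # sample variance (n - 1)
        "sd": sd,                                 # square root of the variance
        "se": sd / math.sqrt(n),                  # SE = SD / square root of n
        "cv": sd / mean,                          # SD divided by the mean
        "mad": sum(abs(x - mean) for x in values) / n,  # about the mean
    }


for name, value in dispersion(scores).items():
    print(f"{name}: {value:.3f}")
```

Note that the SE is always smaller than the SD for n > 1 and shrinks as the sample grows, whereas the SD does not systematically change with sample size. This is the distinction the examiners reported many candidates confusing.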