Data distributions

The keywords to memorise, but not necessarily understand, are as follows:

  • Normal distribution
    • This is the familiar "bell curve" distribution.
    • The bell curve is symmetrical about the mean. A sample whose values follow this shape probably resembles the population as a whole.
    • Thus, the sample mean probably resembles the population mean.
    • The standard deviation (sd) of the sample is likely to represent the sd of the population.
    • The more irregular-looking the curve, the less likely the sample is to represent the population.
    • Small samples and non-random sampling result in irregular curves.
  • Standard normal distribution
    • This is a "z-transformation" of a normal distribution. The result has a mean of 0 and a standard deviation of 1.
    • The z-transformation converts each point on a normal distribution into a multiple of the standard deviation away from the population mean: z = (x - mean) / sd. These multiples are called z values.
  • Data transformation
    • If your data is skewed, you can use data transformation (for example, taking the logarithm of each value) to make it resemble a normal distribution. This lets you use statistical methods which require normally distributed data.
  • Binomial distribution
    • This term describes the distribution of a binary variable (e.g. alive/dead): the number of times one of the two outcomes occurs in a sample of a given size.
    • The larger the sample, the closer the binomial distribution is to the normal distribution.
  • Poisson distribution
    • This describes the probability of a given number of events occurring per time period (or per area of space).
    • The events must be random and independent of each other.
    • You only need prior knowledge of one parameter: the mean number of events per unit time (or per area of space)
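
The definitions above can be made concrete with a short Python sketch. It is an illustration only, using the standard library. The data values, the function names and the choice of a base-10 logarithm as the transformation are all assumptions made for the example, not part of the notes above.

```python
import math
import statistics


def z_scores(values):
    """Z-transformation: each value as a multiple of the sd away from the mean."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # treats the values as the whole population
    return [(v - mean) / sd for v in values]


def normal_pdf(x, mean, sd):
    """Height of the normal bell curve at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def binomial_pmf(k, n, p):
    """Probability of exactly k outcomes of one kind (e.g. deaths) among n patients."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, rate):
    """Probability of exactly k events in one unit of time, given only the mean rate."""
    return math.exp(-rate) * rate**k / math.factorial(k)


# Hypothetical measurements (made up for illustration).
heights = [150.0, 160.0, 170.0, 180.0, 190.0]
z = z_scores(heights)  # z values now have mean 0 and sd 1

# Data transformation: a log transform (one common choice, assumed here)
# pulls in the long right-hand tail of skewed data.
skewed = [1.0, 10.0, 100.0, 1000.0, 10000.0]
logged = [math.log10(v) for v in skewed]


def binomial_vs_normal_gap(n, p=0.5):
    """Gap between the binomial and the matching normal curve at the centre k = n*p."""
    k = round(n * p)
    mean, sd = n * p, math.sqrt(n * p * (1 - p))
    return abs(binomial_pmf(k, n, p) - normal_pdf(k, mean, sd))


gap_small = binomial_vs_normal_gap(10)    # small sample
gap_large = binomial_vs_normal_gap(100)   # larger sample, smaller gap

# Poisson: the only input is the mean rate, e.g. 3 admissions per hour (hypothetical).
p_no_admissions = poisson_pmf(0, 3.0)

if __name__ == "__main__":
    print("z values:", [round(v, 2) for v in z])
    print("log-transformed:", logged)
    print("binomial vs normal gap, n=10 and n=100:", gap_small, gap_large)
    print("P(no admissions in an hour):", p_no_admissions)
```

Note that z_scores uses the population standard deviation (statistics.pstdev). If the values are a sample and you want an estimate of the population sd, statistics.stdev would be the usual substitute.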