The t-distribution is used to estimate population parameters when the sample size is small and/or the population standard deviation is unknown. It is a continuous probability distribution, also known as Student's t-distribution.
According to the central limit theorem, when the sample size (n) is large enough (a common rule of thumb is n ≥ 30), the sampling distribution of the sample mean will be approximately normal. Thus, when we know the standard deviation of the population, we can compute a z-score and use the normal distribution to evaluate probabilities for the sample mean.
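The z-score calculation described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `z_score` and the numbers (μ = 100, σ = 15, n = 36, sample mean 104) are made up for the example and do not come from any data set.

```python
import math
from statistics import NormalDist


def z_score(sample_mean, pop_mean, pop_sd, n):
    """Standardize a sample mean when the population sd is known."""
    return (sample_mean - pop_mean) / (pop_sd / math.sqrt(n))


# Illustrative (made-up) values: mu = 100, sigma = 15, n = 36, sample mean = 104
z = z_score(104, 100, 15, 36)

# Probability of observing a sample mean at or below 104
p_below = NormalDist().cdf(z)

print(f"z = {z:.2f}, P(Z <= z) = {p_below:.4f}")
```

Because the population standard deviation is treated as known here, the standard normal distribution is the reference distribution.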
But sample sizes (n) are sometimes small, and often we don't know the standard deviation of the population. When either of these problems occurs, the standardized sample mean follows a t-distribution, which is similar in many respects to the normal distribution. The formula for calculating the t-statistic (or t-score) is:

$$ t = \frac{\bar{x} - \mu}{s / \sqrt{n}}, \qquad df = n - 1. $$

In the above equation, x̄ is the sample mean, μ is the population mean, s is the sample standard deviation, n is the sample size, s/√n is the standard error (the estimated standard deviation of the sample mean, which arises because the sample standard deviation is used in place of the unknown population standard deviation), and df is the degrees of freedom.
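As a rough sketch, the t-statistic and its degrees of freedom can be computed directly from a sample. The function name `t_statistic`, the sample values, and the hypothesized mean of 5.0 are assumptions made up for illustration.

```python
import math
import statistics


def t_statistic(sample, pop_mean):
    """Return (t, df) for a one-sample t-statistic."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)       # sample sd, divides by n - 1
    standard_error = s / math.sqrt(n)  # estimated sd of the sample mean
    t = (x_bar - pop_mean) / standard_error
    return t, n - 1


# Made-up measurements, tested against a hypothesized population mean of 5.0
sample = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.4, 5.7]
t, df = t_statistic(sample, pop_mean=5.0)
print(f"t = {t:.3f} with df = {df}")
```

Note that `statistics.stdev` already uses the n – 1 denominator, which is what the formula expects for s.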
- Degrees of freedom: There are many different t-distributions; the particular form of a t-distribution is determined by its degrees of freedom. The degrees of freedom refer to the number of independent observations in a set of data. When estimating a mean or a proportion from a single sample of size n, the number of independent observations equals the sample size minus one, that is, degrees of freedom = n – 1. We will use the symbol t(k) to denote the t-distribution with k degrees of freedom.
The t-statistic (or t-score) follows a t-distribution provided that (i) the population from which the sample was drawn is approximately normal, or the sample size is large enough (n ≥ 30), and (ii) the sample drawn from the population is a simple random sample (SRS).
As the sample size (n) increases, the degrees of freedom (df) also increase. As the degrees of freedom increase, the t-distribution gets closer to the standard normal distribution.
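One way to see this convergence numerically is to compare the t density with the standard normal density for growing degrees of freedom. The sketch below uses only the Python standard library; the helper names `t_pdf` and `max_gap_from_normal`, the grid range of ±4, and the chosen df values are assumptions for illustration.

```python
import math
from statistics import NormalDist


def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def max_gap_from_normal(df, half_width=4.0, points=801):
    """Largest absolute density difference from N(0, 1) on a grid."""
    normal = NormalDist()
    step = 2 * half_width / (points - 1)
    xs = [-half_width + i * step for i in range(points)]
    return max(abs(t_pdf(x, df) - normal.pdf(x)) for x in xs)


# The gap shrinks as the degrees of freedom grow
for df in (1, 5, 30, 100):
    print(f"df = {df:>3}: max density gap = {max_gap_from_normal(df):.4f}")
```

The maximum gap is only one of several reasonable measures of closeness; comparing tail probabilities would tell the same story.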
t-table: The table used for t-scores is set up differently from the table used for z-scores. In standard normal tables, the marginal entries are z-scores and the table entries are the corresponding areas under the normal curve to the left of z. In the t-table, the left-hand column gives the degrees of freedom, the top margin gives upper-tail probabilities, and the table entries are the corresponding critical values of t required to achieve that probability. Here, we will use t* (or z*) to indicate critical values.
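To make the table layout concrete, the following sketch reproduces individual t-table entries: given an upper-tail probability and the degrees of freedom, it searches for the critical value t*. It uses only the Python standard library, with Simpson's rule for the tail area and bisection for the search; in practice a statistics library would normally supply this lookup. The function names `t_pdf`, `t_upper_tail`, and `t_critical` are hypothetical, and the step count and iteration limits are arbitrary choices.

```python
import math


def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_upper_tail(x, df, steps=2000):
    """P(T > x) for x >= 0, via Simpson's rule on [0, x] (steps is even)."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


def t_critical(upper_tail_prob, df):
    """Critical value t* such that P(T > t*) = upper_tail_prob."""
    if not 0 < upper_tail_prob < 0.5:
        raise ValueError("upper_tail_prob must be between 0 and 0.5")
    lo, hi = 0.0, 1.0
    while t_upper_tail(hi, df) > upper_tail_prob:  # widen until bracketed
        hi *= 2
    for _ in range(60):                            # bisection search
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > upper_tail_prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# One row of a t-table: df = 10, several upper-tail probabilities
for p in (0.10, 0.05, 0.025, 0.01):
    print(f"df = 10, upper tail = {p}: t* = {t_critical(p, 10):.3f}")
```

Each printed value corresponds to one cell of the t-table: the row is chosen by the degrees of freedom and the column by the upper-tail probability.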