
A comprehensive guide with examples
The two fundamental concepts in inferential statistics are population and sample. The goal of inferential statistics is to infer the properties of a population based on samples.
A population is the entire set of elements in a group, whereas a sample is a randomly selected subset of the population. It is not always feasible or possible to collect population data, so we perform the analysis using samples.
For instance, all college students in the US form a population, and 1000 randomly selected college students throughout the US form a sample of this population.
It would not be reliable to directly apply the sample analysis results to the entire population. We need systematic ways to justify that the sample results are applicable to the population. This is why we need statistical tests: they evaluate how likely it is that the sample results are a true representation of the population.
Suppose we are doing a research project on obesity. Within the scope of this project, we want to compare the average weight of 20-year-old people in two different countries, A and B. Since we cannot collect the population data, we take samples and perform a statistical test.
We set the null and alternative hypotheses as below:
- Null hypothesis (H0): The average weights of the two groups are not different.
- Alternative hypothesis (H1): The average weights of the two groups are different.
When comparing two groups, a t-test is preferred over ANOVA. However, when we have more than two groups, a t-test is not the optimal choice because a separate t-test needs to be performed for each pair of groups.
Assume we are comparing three countries, A, B, and C. We need to apply a t-test to the A-B, A-C, and B-C pairs. As the number of groups increases, the number of pairwise tests becomes harder to manage, and running many separate tests also increases the chance of a false positive. Thus, we choose to go with ANOVA.
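Before moving on to ANOVA, here is a minimal sketch of how the two-country comparison above could be run in R with the built-in t.test function. The sample sizes, means, and standard deviations below are made-up values chosen only for illustration:
> weightsA = rnorm(100, mean = 70, sd = 8)   # hypothetical weight sample from country A
> weightsB = rnorm(100, mean = 74, sd = 9)   # hypothetical weight sample from country B
> t.test(weightsA, weightsB)                 # two-sample t-test comparing the group means
The p-value reported by t.test is interpreted in the same way as in ANOVA: a small value indicates that the observed difference in sample means is unlikely to be due to chance alone.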
In the case of comparing three or more groups, ANOVA is preferred. ANOVA is based on two sources of variation:
- Variation within each group
- Variation between groups
The ANOVA result is based on the F ratio, which is calculated as follows:
F ratio = (variation between groups) / (variation within groups)
In other words, the F ratio compares the variation between groups to the variation within groups.
Higher F ratio values indicate that the variation between groups is larger than the variation within the individual groups. In such cases, it is more likely that the means of the groups are different.
By contrast, lower F ratio values indicate that the variation within the individual groups is larger than the variation between groups. In that case, the differences we observe come mostly from the elements within each group rather than from differences between the groups themselves.
The larger the F ratio, the more likely that the groups have different means.
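To make the formula concrete, here is a minimal sketch that computes the F ratio by hand in R on a made-up toy dataset with three small groups; the numbers are arbitrary and chosen only to illustrate the calculation:
> x = list(g1 = c(2, 4, 6), g2 = c(8, 10, 12), g3 = c(3, 5, 7))   # toy data, three groups
> grandMean = mean(unlist(x))                                     # mean of all observations
> n = sapply(x, length)                                           # group sizes
> ssBetween = sum(n * (sapply(x, mean) - grandMean)^2)            # variation between groups
> ssWithin = sum(sapply(x, function(g) sum((g - mean(g))^2)))     # variation within groups
> fRatio = (ssBetween / (length(x) - 1)) / (ssWithin / (sum(n) - length(x)))
The sums of squares are divided by their degrees of freedom (covered later in this post) before taking the ratio; this is the F value that a one-way ANOVA reports.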
We have covered the intuition behind ANOVA and when it is typically used. The next step is to do an example. We will use the R programming language to perform the ANOVA test.
The rnorm function generates an array of numbers sampled from a normal distribution with the given mean and standard deviation values.
> rnorm(5, mean=10, sd=3)
[1] 8.624795 8.431731 10.570984 7.136710 11.801554
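Since rnorm draws random numbers, the exact values will differ on every run. If reproducible output is needed, a seed can be set before sampling (the value 42 below is arbitrary):
> set.seed(42)   # fix the random number generator so the same values are drawn each time
> rnorm(5, mean = 10, sd = 3)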
We will use the rnorm function to generate sample data and then stack the groups into a single data frame.
> A = rnorm(100, mean = 60, sd = 5)
> B = rnorm(100, mean = 71, sd = 10)
> C = rnorm(100, mean = 65, sd = 7)
> groups = stack(data.frame(cbind(A, B, C)))
The values column contains the values, and the ind column shows which group each value belongs to. The ANOVA test is done using the aov function:
> anovaResults = aov(values ~ ind, data = groups)
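The aov call fits the model but does not print the ANOVA table on its own; the table is displayed with the summary function. Since the data are generated randomly, a fresh run will produce slightly different numbers from the ones discussed below:
> summary(anovaResults)   # prints the ANOVA table with Df, F value, and p-value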
- The F value is 58.56, which indicates that the groups are different. F values well above 1 suggest that at least one of the groups is different from the others.
- The p-value is very small, which indicates that the results are statistically significant (i.e., not generated by random chance alone). Typically, results with p-values less than 0.05 are considered statistically significant.
- Df is the degrees of freedom. The first line is for the variation between groups and the second line is for the variation within groups. They are calculated as the number of groups minus one (3 - 1 = 2) and the total number of observations minus the number of groups (300 - 3 = 297), respectively.
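These two numbers can also be read directly from the groups data frame, as a quick sanity check:
> nlevels(groups$ind) - 1               # between-groups df: number of groups - 1
> nrow(groups) - nlevels(groups$ind)    # within-groups df: total observations - number of groups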
Let’s do another example with groups that have very close average values.
> A = rnorm(100, mean = 70, sd = 5)
> B = rnorm(100, mean = 71, sd = 6)
> C = rnorm(100, mean = 70, sd = 7)
> groups = stack(data.frame(cbind(A, B, C)))
> anovaResults = aov(values ~ ind, data = groups)
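As before, the results are inspected with the summary function; the exact numbers will vary between runs because the data are random:
> summary(anovaResults)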
In this case, we have selected the same or very close mean values for each group. As a result, the F value is very small and the p-value is large, so we cannot conclude that the group means are different.