Introduction
The F-test is a widely used statistical hypothesis test based on the F-distribution. Its purpose is to compare two (or more) data sets and to draw an inference from the test result.
The test is commonly used to decide whether the variances or the means of two or more data sets are significantly different, or whether they differ only by chance (randomly). Our null
hypothesis (H0) is that the variances (or means) of the data sets are equal, while the alternative hypothesis (Ha) is that they are different.
As in every hypothesis test, we use a mathematical formula to generate a single number (in this case F), and by comparing it to a critical F value (taken from a pre-defined F table) we decide whether to reject the null hypothesis.
Sir Ronald Aylmer Fisher, an English biostatistician (combining biology and statistics), and George Waddel Snedecor, an American statistician, reached new heights in the
field of statistics by pioneering the design of experiments, analysis of variance (ANOVA) and Snedecor's F-distribution (also called the Fisher–Snedecor distribution).
Source: qMindset.com
Key Features
In quality management (especially in quality control), we commonly use two major F-test methods:
- F-test for precision (also called the F-test for equality of variances).
- One-way ANOVA F-test (analysis of variance).
In this section we will discuss both methods.
F-test for equality of variances
The F-test for equality of variances is used to compare the variances (the squared standard deviations) of two data sets (groups), and to state statistically whether the difference
between the variances is significant at a given confidence level.
Hypothesis (0): S₁² = S₂² ... variances are equal
Hypothesis (alternative): S₁² ≠ S₂² ... variances are not equal
The formula of the F-test for equality of variances (Source: qMindset.com):
F = S₁² / S₂²
where F is the F statistic, S₁ is the standard deviation of group one, S₁² is the variance of group one, S₂ is the standard deviation of group two, and
S₂² is the variance of group two.
Example: we want to compare if the variance of group one is statistically different from group two with 95% confidence. We use the F-test for this decision.
Step 1: calculate the F value.
The groups are featured by the following descriptive statistics:
Group statistics of the groups:

| | Group 1 | Group 2 |
|---|---|---|
| Standard deviation of data set (S) | 8.0 | 7.2 |
| Variance (S²) | 64.0 | 51.84 |
| Sample size (n) | 13 | 11 |
| Degrees of freedom (df = n − 1) | 12 | 10 |
F is calculated by the following formula: F = S₁² / S₂² = 64.0 / 51.84 = 1.2346
So our F value for the two groups is 1.2346, which we need to compare to a critical F value to decide whether we can reject the null hypothesis. We will use the F tables
for this. We set the required confidence level at the beginning of the test at 95%, which means our alpha level (α) is 5%.
Because we perform a two-tailed test, we use half of the 5%, i.e. 0.025 (for two-tailed tests, alpha must be divided by two). This is very important, as there are
different F-tables for different alpha levels. It is also important that:
- In case of two-tailed or right-tailed test, the larger variance must be divided by the smaller one (larger variance in the numerator, smaller variance in the denominator)!
- In case of a left-tailed test, the smaller variance must be divided by the larger one (smaller variance in the numerator, larger variance in the denominator)!
Step 2: define the degrees of freedom values.
The degrees of freedom (df) equals the sample size of the group minus 1 (df = n − 1). Group one's df is 12, while group two's df is 10. In the F tables the columns
represent the numerator degrees of freedom and the rows represent the denominator degrees of freedom. Our numerator df is 12 and our denominator df is 10, so we can easily look up the value in the F-table that
belongs to the α = 0.025 alpha level.
IMPORTANT! As with the variances, in case of a left-tailed test you have to use the df of the group with the smaller variance in the numerator and the other df in
the denominator. Once you have your critical F value, take its reciprocal and compare that number with your calculated F value.
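The reciprocal rule above can be checked numerically. The following is an illustrative sketch using scipy (the library choice is mine, not the article's): the lower-tail critical value with degrees of freedom (df1, df2) equals the reciprocal of the upper-tail critical value with the degrees of freedom swapped.

```python
# Numeric check of the reciprocal rule for left-tailed F critical values.
# Illustrative sketch using scipy; df values taken from the worked example.
from scipy.stats import f

alpha = 0.025
lower_crit = f.ppf(alpha, 12, 10)          # left-tail critical value, dfs (12, 10)
upper_swapped = f.ppf(1 - alpha, 10, 12)   # right-tail critical value, dfs swapped

print(abs(lower_crit - 1 / upper_swapped) < 1e-8)  # True
```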
Step 3: use the df values and the alpha level for selecting the critical F value in the F table.
Critical values of F for significance level (α) = 0.025 (Source: stat.purdue.edu):

| df2 \ df1 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 12 | 15 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 647.7890 | 799.5000 | 864.1630 | 899.5833 | 921.8479 | 937.1111 | 948.2169 | 956.6562 | 963.2846 | 968.6274 | 976.7079 | 984.8668 |
| 2 | 38.5063 | 39.0000 | 39.1655 | 39.2484 | 39.2982 | 39.3315 | 39.3552 | 39.3730 | 39.3869 | 39.3980 | 39.4146 | 39.4313 |
| 3 | 17.4434 | 16.0441 | 15.4392 | 15.1010 | 14.8848 | 14.7347 | 14.6244 | 14.5399 | 14.4731 | 14.4189 | 14.3366 | 14.2527 |
| 4 | 12.2179 | 10.6491 | 9.9792 | 9.6045 | 9.3645 | 9.1973 | 9.0741 | 8.9796 | 8.9047 | 8.8439 | 8.7512 | 8.6565 |
| 5 | 10.0070 | 8.4336 | 7.7636 | 7.3879 | 7.1464 | 6.9777 | 6.8531 | 6.7572 | 6.6811 | 6.6192 | 6.5245 | 6.4277 |
| 6 | 8.8131 | 7.2599 | 6.5988 | 6.2272 | 5.9876 | 5.8198 | 5.6955 | 5.5996 | 5.5234 | 5.4613 | 5.3662 | 5.2687 |
| 7 | 8.0727 | 6.5415 | 5.8898 | 5.5226 | 5.2852 | 5.1186 | 4.9949 | 4.8993 | 4.8232 | 4.7611 | 4.6658 | 4.5678 |
| 8 | 7.5709 | 6.0595 | 5.4160 | 5.0526 | 4.8173 | 4.6517 | 4.5286 | 4.4333 | 4.3572 | 4.2951 | 4.1997 | 4.1012 |
| 9 | 7.2093 | 5.7147 | 5.0781 | 4.7181 | 4.4844 | 4.3197 | 4.1970 | 4.1020 | 4.0260 | 3.9639 | 3.8682 | 3.7694 |
| 10 | 6.9367 | 5.4564 | 4.8256 | 4.4683 | 4.2361 | 4.0721 | 3.9498 | 3.8549 | 3.7790 | 3.7168 | 3.6209 | 3.5217 |
The table gives us the critical F value, which is 3.6209. Now we have to compare it to the calculated F value. If the calculated F value exceeded the critical F value,
we could reject the null hypothesis, which would mean that the variances are significantly different at the 95% confidence level.
Step 4: conclusion
- Calculated F value: 1.2346
- Critical F value (0.025, 12, 10): 3.6209
Conclusion: the calculated F value does not exceed the critical F (1.2346 < 3.6209), so we cannot reject the null hypothesis. From a quality standpoint, we see no significant
difference between the two populations of product / process characteristics at the 5% significance level (equivalent to a 95% confidence level).
Remark: confidence level (1 − α) = 95%, significance level (α) = 5%.
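The whole two-tailed variance test above can be reproduced in a few lines of Python. This is an illustrative sketch using scipy (the tool choice is mine, not the article's), with the variances and sample sizes taken from the worked example:

```python
# Two-tailed F-test for equality of variances, reproducing the worked example.
from scipy.stats import f

var1, n1 = 64.0, 13   # group 1: variance, sample size
var2, n2 = 51.84, 11  # group 2: variance, sample size

# For a two-tailed test, the larger variance goes in the numerator.
F = var1 / var2
df1, df2 = n1 - 1, n2 - 1  # 12 and 10

alpha = 0.05
# Two-tailed test: compare against the upper alpha/2 quantile.
F_crit = f.ppf(1 - alpha / 2, df1, df2)

print(round(F, 4))       # 1.2346
print(round(F_crit, 4))  # 3.6209
print(F > F_crit)        # False -> cannot reject H0
```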
One-way ANOVA F-test
The one-way ANOVA F-test is more complex than the F-test for equality of variances, and it is used to compare means. With an ANOVA F-test we can compare even more
than two groups, and check whether their means are statistically equal at a given confidence level.
First we set up our null-hypothesis and alternative hypothesis.
Hypothesis (0): µ1 = µ2 = ... = µk ... means are equal
Hypothesis (alternative): not all means are equal
In case of the one-way ANOVA F-test, the F value represents the ratio of explained variance to unexplained variance, in other words the between-group variance versus the within-group
variance.
F = between-group variance / within-group variance
Example: we want to compare if the means of three groups (data sets) are statistically equal, or different from each other with 95% confidence. We use the one-way ANOVA F-test
for this decision.
The values in the groups are the following:
Nine values in three groups:

| m1 | m2 | m3 |
|---|---|---|
| 4 | 8 | 7 |
| 6 | 10 | 7 |
| 5 | 9 | 7 |
Step 1: calculate the overall mean (x̄), also called the grand mean.
x̄ = (4 + 6 + 5 + 8 + 10 + 9 + 7 + 7 + 7) / 9 = 7
Step 2: calculate the group means (x̄1; x̄2; x̄3).
x̄1 = (4 + 6 + 5) / 3 = 5
x̄2 = (8 + 10 + 9) / 3 = 9
x̄3 = (7 + 7 + 7) / 3 = 7
Step 3: calculate the total sum of squares.
SST = (4 − 7)² + (6 − 7)² + (5 − 7)² + (8 − 7)² + (10 − 7)² + (9 − 7)² + (7 − 7)² + (7 − 7)² + (7 − 7)² =
= 9 + 1 + 4 + 1 + 9 + 4 + 0 + 0 + 0 = 28
Step 4: calculate the total degrees of freedom by multiplying the number of groups (m) by the number of samples per group (n) and subtracting 1.
DFTotal = m * n − 1 = 3 * 3 − 1 = 9 − 1 = 8
Step 5: calculate the within-group sum of squares (SSW) by taking each value's deviation from its group mean, squaring it, and summing over all 9 values.
Deviations from the group means:

| m1 | m2 | m3 |
|---|---|---|
| 4 − 5 = −1 | 8 − 9 = −1 | 7 − 7 = 0 |
| 6 − 5 = 1 | 10 − 9 = 1 | 7 − 7 = 0 |
| 5 − 5 = 0 | 9 − 9 = 0 | 7 − 7 = 0 |
SSW = (−1)² + (1)² + (0)² + (−1)² + (1)² + (0)² + (0)² + (0)² + (0)² =
1 + 1 + 0 + 1 + 1 + 0 + 0 + 0 + 0 = 4
Step 6: calculate the within-group degrees of freedom (DFWithin).
DFWithin = m (n – 1) = 3 * (3 – 1) = 3 * 2 = 6
Step 7: calculate the between-group sum of squares (SSB).
SSB = n (group 1 mean − grand mean)² + n (group 2 mean − grand mean)² + n (group 3 mean − grand mean)²
SSB = 3 * (5 − 7)² + 3 * (9 − 7)² + 3 * (7 − 7)² = 3 * (−2)² + 3 * (2)² + 3 * (0)² =
3 * 4 + 3 * 4 + 3 * 0 = 12 + 12 + 0 = 24
Step 8: calculate the between-group degrees of freedom (DFBetween).
DFBetween = m − 1 = 3 − 1 = 2
Step 9: calculate the between-group mean square (MSB) and the within-group mean square (MSW).
MSB = SSB / DFBetween = 24 / 2 = 12
MSW = SSW / DFWithin = 4 / 6 = 0.6667
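Steps 1–9 above can be reproduced with a short NumPy sketch (an illustration I added; the group data come from the worked example):

```python
# Manual one-way ANOVA computation: grand mean, SST, SSW, SSB, MSB, MSW.
import numpy as np

groups = [np.array([4, 6, 5]), np.array([8, 10, 9]), np.array([7, 7, 7])]
m = len(groups)       # number of groups (3)
n = len(groups[0])    # samples per group (3; equal group sizes assumed)

grand_mean = np.mean(np.concatenate(groups))                 # 7.0
group_means = [g.mean() for g in groups]                     # [5.0, 9.0, 7.0]

# Total, within-group and between-group sums of squares.
sst = sum(((g - grand_mean) ** 2).sum() for g in groups)     # 28.0
ssw = sum(((g - g.mean()) ** 2).sum() for g in groups)       # 4.0
ssb = sum(n * (gm - grand_mean) ** 2 for gm in group_means)  # 24.0

msb = ssb / (m - 1)        # 24 / 2 = 12.0
msw = ssw / (m * (n - 1))  # 4 / 6 ≈ 0.6667
```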
Step 10: summarize all data in a comprehensive table:
Summarized ANOVA table:

| Source of variation | Sum of squares (SS) | Degrees of freedom (DF) | Mean squares (MS) | F statistic |
|---|---|---|---|---|
| Between group | SSB | m − 1 | MSB = SSB / (m − 1) | F = MSB / MSW |
| Within group | SSW | m (n − 1) | MSW = SSW / m (n − 1) | |
| Total | SST | m * n − 1 | | |
Filling up the table with the calculated numbers:

| Source of variation | Sum of squares (SS) | Degrees of freedom (DF) | Mean squares (MS) | F statistic |
|---|---|---|---|---|
| Between group | 24 | 2 | 12 | 18 |
| Within group | 4 | 6 | 0.6667 | |
| Total | 28 | 8 | | |
Step 11: calculate the F value by dividing MSB by MSW.
F = MSB / MSW = 12 / 0.6667 = 18.0
The table is useful for a crosscheck: SSB + SSW must equal SST, and DFBetween + DFWithin must equal DFTotal.
From the numbers we can already see that the total variation mainly comes from the variation between the groups. This suggests that the F value will be
high, and indeed we get 18.0.
But do we know yet whether we can reject the null hypothesis? Not yet: we have to finish the test by finding the critical F value and comparing it with our calculated F value
(also called the F-statistic).
The numerator (between-group) degrees of freedom is 2, and the denominator (within-group) degrees of freedom is 6. Using the α = 0.05 F-table (corresponding to the
95% confidence level) with these DF values, we find that the critical F value is 5.1433.
Critical values of F for significance level (α) = 0.05 (Source: socr.ucla.edu):

| df2 \ df1 | 1 | 2 |
|---|---|---|
| 1 | 161.4476 | 199.5000 |
| 2 | 18.5128 | 19.0000 |
| 3 | 10.1280 | 9.5521 |
| 4 | 7.7086 | 6.9443 |
| 5 | 6.6079 | 5.7861 |
| 6 | 5.9874 | 5.1433 |
Step 12: conclusion
- Calculated F value: 18.0
- Critical F value (0.05, 2, 6): 5.1433
As our calculated F value greatly exceeds the critical F value from the table, we can reject the null hypothesis at the 95% confidence level. In practice, this
means we can state that the group means differ significantly from each other.
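The worked ANOVA can be cross-checked with scipy's built-in one-way ANOVA function; this is a sketch I added for illustration, with the group data from the example:

```python
# One-way ANOVA F-test on the three example groups, cross-checked with scipy.
from scipy.stats import f, f_oneway

g1, g2, g3 = [4, 6, 5], [8, 10, 9], [7, 7, 7]

F_stat, p_value = f_oneway(g1, g2, g3)
print(round(F_stat, 1))   # 18.0

# Critical F at alpha = 0.05 with df1 = 2 (between) and df2 = 6 (within).
F_crit = f.ppf(0.95, 2, 6)
print(round(F_crit, 4))   # 5.1433

print(F_stat > F_crit)    # True -> reject H0
```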
Hints
Using F-tests in practice can support decision making, as we can statistically assess the difference between independent populations. When you compare different lots,
shifts, etc. in your production based on a given product or process characteristic, you can back up your decision with an F-test. It is mainly useful when the samples are large (over 100
pcs per sample).
When conducting a two-tailed test, you have to divide alpha by 2; so for a two-tailed test at 95% confidence, use the 0.025 F-table instead of
the 0.05 F-table.
One-tailed test vs two-tailed test (Source: qMindset.com; stat.purdue.edu)
In case of the F-test of equal variances, the larger variance must be placed in the numerator and the smaller in the denominator.
The F-test is quite sensitive to normality, so the underlying populations from which the samples are taken should be normally distributed. This sensitivity decreases with larger
sample sizes, and the group sample sizes should be close to each other (large, equal sample sizes are best for an F-test). In addition, the F-test gives a better indication if the standard deviations of
the samples do not differ greatly (according to Sullivan, the highest S should be less than double the lowest S; other statisticians are more cautious).
When there are only two sample means to compare, the F-test is essentially the same as the two-sample t-test. In such a case the F-statistic is equal to the square of the t value (F = t²).
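The F = t² relationship can be demonstrated numerically. The following is a small sketch using scipy; the sample data are made up purely for illustration:

```python
# With two groups, the one-way ANOVA F-statistic equals the square of the
# pooled-variance two-sample t statistic.
from scipy.stats import f_oneway, ttest_ind

a = [4, 6, 5, 7, 6]    # illustrative data, not from the article
b = [8, 10, 9, 11, 9]

F_stat, _ = f_oneway(a, b)
t_stat, _ = ttest_ind(a, b)   # classic pooled-variance t-test (equal_var=True)

print(abs(F_stat - t_stat ** 2) < 1e-8)  # True
```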
To conclude statistical tests (such as the F-test or t-test) easily, use statistical software, which calculates the statistics for you in milliseconds. It is also advisable
to use such tests for Statistical Process Control (SPC), rather than performing only range, mean, Cp and Cpk calculations.