Analysing the results (3): Anova
We shall now look in detail at each of the scales. For each scale, the question to be answered is: which washing machine is best (or worst)? Consider the monaural judgement of overall quality, shown in the worksheet “Anova – monaural quality whole”, which gives the scores for the question about purchase influence. So in this case we want to find out which washing machine sound is most likely to influence people to purchase the product.
First we must determine whether there is a significant variation in judgements on this scale, i.e. are the washing machines judged to differ, or are they all judged to have similar sounds? The graph in the worksheet shows the mean judgements on the scale, along with error bars showing the 95% confidence limits on the mean. The 95% confidence limits are calculated by taking (approximately) 2 standard deviations divided by the square root of the number of measurements. This graph is also reproduced below.
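The confidence limit calculation can be sketched in a few lines of Python. This is only an illustration of the "2 standard deviations over the square root of the number of measurements" rule: the function name and the scores are made up for the example and are not taken from the worksheet.

```python
import math
import statistics


def mean_with_confidence_limits(scores):
    """Return (mean, half_width) using the approximate 2-standard-error rule."""
    n = len(scores)
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)  # sample standard deviation (divides by n - 1)
    half_width = 2 * sd / math.sqrt(n)  # approximate 95% confidence limit
    return mean, half_width


# Hypothetical scores from 10 subjects for one washing machine (invented data)
scores = [6, 7, 5, 8, 6, 7, 7, 6, 8, 5]
mean, half_width = mean_with_confidence_limits(scores)
print(f"mean = {mean:.2f}, error bar = +/- {half_width:.2f}")
```

The error bar for each washing machine then runs from the mean minus the half width to the mean plus the half width.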
We can see that washing machine B scores higher on this scale, and washing machine appears to score lower (even allowing for the experimental error). However, not all cases are so clear cut. In cases which are less certain, how can you show statistically that the variation is real and not down to chance? This is done with a one-way analysis of variance (Anova).
Excel has an Anova tool within the Analysis ToolPak. You may have to install the ToolPak before using it by selecting “Tools”, “Add-Ins” from the top menu. The tool you require will then appear under “Tools”, “Data Analysis”. You want “Anova: Single Factor”.
The “Input range” is the set of scores for each subject. In our case each subject has their scores in a single column, and each washing machine has its scores in a single row:
In this case the data is grouped by rows, i.e. the different products have a row each, so select “Rows” under “Grouped By” in the dialog box. Select an output range, and then hit “OK”. The output range will look like this:
The rows 1-5 refer to the 5 washing machine types. The top part of the table shows that there are 10 subjects per washing machine, along with the average score and the variance for each washing machine. By converting the variance to confidence limits (approximately 2 times the square root of the variance divided by the number of subjects) you can form the error bars necessary to plot the graph.
The bottom table is what we need. This shows how much variation there is within the groups as compared to between the groups. If the variation between the groups (washing machines) is sufficiently larger than the variation due to the different subject scores within the groups (washing machines), then we can say the variation between the washing machines (or groups) is significant. In the above table, the probability that differences this large arose purely by chance is shown by the P-value of 7E-8 (7 x 10^-8). This needs to be multiplied by 100 to give a percentage, so the percentage chance is 7 x 10^-6, which is a minute number. So in this case the variation shown by the graph is significant: the washing machine scores are different.
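To make the between-groups and within-groups comparison concrete, here is a minimal Python sketch of a single-factor Anova. It is not how Excel implements its tool: the function names and the scores are invented for illustration, and the P-value is obtained from the F distribution using a standard continued-fraction evaluation of the incomplete beta function. If SciPy is available, `scipy.stats.f_oneway` should give the same F and P-value in one call.

```python
import math


def _beta_continued_fraction(a, b, x, max_iter=200, eps=3e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        terms = (
            m * (b - m) * x / ((qam + m2) * (a + m2)),
            -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2)),
        )
        for aa in terms:
            d = 1.0 + aa * d
            d = d if abs(d) > tiny else tiny
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            d = 1.0 / d
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def regularized_incomplete_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_continued_fraction(a, b, x) / a
    return 1.0 - front * _beta_continued_fraction(b, a, 1.0 - x) / b


def f_survival(f, df1, df2):
    """Probability that an F(df1, df2) variable exceeds f (the P-value)."""
    if f <= 0.0:
        return 1.0
    return regularized_incomplete_beta(df2 / 2.0, df1 / 2.0, df2 / (df2 + df1 * f))


def one_way_anova(groups):
    """Single-factor Anova; each group is the list of scores for one product."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation between the group means, and within each group about its mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    # Assumes the within-group variation is not exactly zero
    f = (ss_between / df_between) / (ss_within / df_within)
    return {"ss_between": ss_between, "ss_within": ss_within,
            "df_between": df_between, "df_within": df_within,
            "f": f, "p_value": f_survival(f, df_between, df_within)}


# Hypothetical scores: one row per washing machine, one entry per subject
machine_scores = [[5, 6, 5, 7, 6], [8, 9, 8, 7, 9], [4, 5, 3, 4, 5]]
result = one_way_anova(machine_scores)
print(f"F = {result['f']:.2f}, P-value = {result['p_value']:.2g} "
      f"({100 * result['p_value']:.2g}% chance)")
```

The dictionary returned mirrors the bottom table of the Excel output: sums of squares and degrees of freedom between and within groups, the F ratio, and the P-value.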
This case was very clear cut, but that will not always be true. If the P-value is greater than 0.05 then that would indicate a more than 5% chance that the variation is due to chance, and at this point the analysis stops because the washing machines cannot be told apart statistically. Below is an example from some data where that is true. You can see that the error bars overlap, a useful first indication that the variation of judgements for each washing machine is too large to differentiate between washing machines. In this case the P-value is 0.16 (16%). See the spreadsheet of data for this case.
If you do have a case where the variation is significant, it is then possible to look at which washing machine is best, and which is worst. This requires a multiple comparison test, as described on the next page.