Say you run a Likert-scale survey of how confident people are that X is linked to Y, e.g. whether the sample believes that socioeconomic status has an effect on health outcomes, rated from no effect (1) to most effect (10).

And since there are many such questions, ANOVA rather than t-tests is suitable. But how d’you know if you can trust your ANOVA p-values and the reliability of the findings? Btw, some of the associations are null, so they serve as a control comparison for guessed confidence.

Some say the residuals have to be normally distributed for ANOVA to be ‘robust’; others say it doesn’t matter. So, if it matters, how d’you determine the residuals? Anyway, ultimately, how does one test/assess whether ANOVA results are valid or not? Please explain like I’m a weird precocious five-year-old.

When using an ANOVA, we assume some things about our data to be true. These assumptions are: 1. (For ELI5 purposes forget about residuals) The data points in each group follow a normal distribution (a.k.a. a bell curve). 2. The variance around the mean is the same in every group (the bell curves for the groups are equally wide). 3. Each data point is independent of the other data points (i.e. none of your participants interacted, were counted multiple times, etc.).

For 1 and 2, you can test your data for normality and for homogeneity of variances to see how far off they are. When people say an ANOVA is robust to non-normality, that just means your data can differ somewhat from being perfectly normal without affecting the results of your ANOVA too much. For 3, if you know your data isn’t independent, that generally means flawed experimental design or execution, and I wouldn’t use that data.
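
Since the question asks how you actually get the residuals: in a one-way ANOVA, a residual is just each answer minus the mean of its own group. Here is a minimal sketch in plain Python, not a definitive recipe. The ratings and the group names are made up for illustration, and the variance ratio is only a rough eyeball check that I'm assuming is acceptable here; in practice you would usually hand the residuals to a formal normality test (e.g. Shapiro-Wilk) and the groups to a homogeneity test (e.g. Levene's).

```python
from statistics import mean, variance


def residuals(groups):
    """Residual = each answer minus the mean of its own group."""
    out = []
    for answers in groups.values():
        m = mean(answers)
        out.extend(a - m for a in answers)
    return out


def variance_ratio(groups):
    """Largest group variance divided by the smallest (1.0 = identical spread)."""
    variances = [variance(answers) for answers in groups.values()]
    return max(variances) / min(variances)


def f_statistic(groups):
    """One-way ANOVA F: between-group mean square over within-group mean square."""
    all_answers = [a for answers in groups.values() for a in answers]
    grand = mean(all_answers)
    k, n = len(groups), len(all_answers)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    ss_within = sum(r ** 2 for r in residuals(groups))
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical 1-10 ratings, one list per independent group of respondents.
ratings = {
    "group_a": [7, 8, 9, 8, 10, 7],
    "group_b": [6, 7, 8, 7, 9, 6],
    "group_c": [2, 1, 3, 2, 1, 3],
}

res = residuals(ratings)
print("residuals:", res)  # plot these as a histogram to see how bell-shaped they are
print("variance ratio:", variance_ratio(ratings))
print("F:", f_statistic(ratings))
```

If a histogram of the residuals is badly lopsided, or one group is far more spread out than another, that is the "long, skinny rectangle" warning sign for the F value printed at the end.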

The more your data differs from those assumptions, the less reliable the results of your ANOVA will be because you are essentially using the wrong formula. You can think of it like your data is a rectangle and you want to find the area. An ANOVA is saying it’s the length of one side squared. It works as long as your data is a square and is still a good approximation for small deviations, but if you have a long, skinny rectangle you should use a different formula.

Well, for a start, a fixed scale from 1-10 is not really a good way to tell whether two samples choose different numeric values. It’s just too coarse, no? People tend to pick extreme values on that scale rather than being normally distributed around, say, 5.

If I face such weird distributions, I never rely on t-tests but use the Wilcoxon test instead. I’m not an expert in Likert-scale analysis, but you should be able to find pros and cons of using Wilcoxon for these kinds of questions.
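
To make the rank-based idea concrete, here is a rough sketch in plain Python. I'm assuming the two-independent-groups version of the test (Wilcoxon rank-sum, also known as Mann-Whitney U), and the ratings are invented. The p-value comes from shuffling the group labels (a permutation approach I've swapped in to keep the example dependency-free), not from the usual tables or normal approximation, so treat it as an illustration only.

```python
import random


def mann_whitney_u(x, y):
    """U for x: count the (x, y) pairs where x is higher; ties count as half."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)


def permutation_p_value(x, y, n_perm=5000, seed=0):
    """Two-sided p-value from shuffling group labels.

    Counts how often a random relabelling gives a U at least as far from
    its no-difference value (len(x) * len(y) / 2) as the observed U.
    """
    rng = random.Random(seed)
    centre = len(x) * len(y) / 2
    observed = abs(mann_whitney_u(x, y) - centre)
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        u = mann_whitney_u(pooled[:len(x)], pooled[len(x):])
        if abs(u - centre) >= observed:
            hits += 1
    # The +1 counts the observed labelling itself, so p is never exactly 0.
    return (hits + 1) / (n_perm + 1)


# Hypothetical ratings piled up at the extremes, as Likert answers often are.
linked_question = [9, 10, 8, 10, 9, 10, 7, 10]
null_question = [1, 2, 1, 3, 1, 1, 2, 5]

u = mann_whitney_u(linked_question, null_question)
p = permutation_p_value(linked_question, null_question)
print("U:", u, "p:", p)
```

Because it only asks which answer is higher in each pair, this does not care whether the ratings look like a bell curve. The trade-off is that it compares two groups at a time; with many questions you would need a multi-group rank test or a correction for multiple comparisons.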