I'm not sure what level you're asking at. I can give you an ELI5 as someone who has studied stats myself, but this can quickly go above my head, and it would probably be better suited to a math- or stats-specific subreddit.
I assume you know the difference between a “sample”, which is the data you have, and a “population”, which is 100% of the actual real-world data out there that you are trying to analyze.
ANOVA testing is a form of statistical analysis that A. calculates the *mean* value of each sample group, B. compares how far apart those sample means are relative to the spread of the data within each group, and C. uses that comparison to judge whether it is plausible that the *actual means of the populations* are all equal.
For example, say I give you the heights of 1,000 35-year-old men in New York City, London, and Beirut. ANOVA calculates the means of those 3 sample sets and then seeks to determine whether the mean heights of the actual populations of all 35-year-old men in NYC, London, and Beirut are *actually the same*, at a chosen confidence level, typically 95%.
So if my sample mean for men in NYC is 6′ tall, London is 5′10″, and Beirut is 6′2″, can we say at the 95% confidence level that the population means are actually the same in all 3 cities, or are the differences too big to be explained by sampling luck alone?
It’s worth noting that if we reject the hypothesis that the population means are the same, ANOVA on its own cannot tell us which one is the outlier, or whether all 3 are different. It just says “yes” or “no”.
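If it helps to see the arithmetic, here is a rough Python sketch of a one-way ANOVA on the three-city example. Treat it as an illustration, not a reference implementation: the heights are simulated (made-up true means of 72, 70, and 74 inches, a made-up spread of 3 inches, and 1,000 men per city), the function names are my own, and the p-value shortcut is a closed form that only holds when there are exactly three groups. In practice you would more likely call a library routine such as `scipy.stats.f_oneway`.

```python
import random
from statistics import mean


def one_way_anova(groups):
    """Return (F statistic, df_between, df_within) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [mean(g) for g in groups]

    # Variation of the group means around the overall mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of individual values around their own group's mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def p_value_three_groups(f_stat, df_within):
    """P(F >= f_stat) if the population means are equal.

    Closed form that is valid ONLY for exactly three groups
    (df_between == 2); other group counts need a general F distribution.
    """
    return (1 + 2 * f_stat / df_within) ** (-df_within / 2)


# Tiny hand-checkable case: group means 2, 3, 4 give F = 3.0 and p = 0.125.
tiny_f, _, tiny_df_within = one_way_anova([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
tiny_p = p_value_three_groups(tiny_f, tiny_df_within)

# Simulated heights in inches. These are NOT real measurements.
rng = random.Random(0)


def fake_heights(true_mean, n=1000, sd=3.0):
    return [rng.gauss(true_mean, sd) for _ in range(n)]


samples = {
    "NYC": fake_heights(72),     # 6'0"
    "London": fake_heights(70),  # 5'10"
    "Beirut": fake_heights(74),  # 6'2"
}

f_stat, df_between, df_within = one_way_anova(list(samples.values()))
p = p_value_three_groups(f_stat, df_within)

print(f"F = {f_stat:.1f}, p = {p:.3g}")
print("reject equal means" if p < 0.05 else "cannot reject equal means")
```

Note that the result is a single F statistic and a single p-value for all three cities together, which is why the test only gives a “yes” or “no” and does not point at a particular city.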