Copyright © 2021 AnswerCult

EDIT: not exactly ELI5, but since you are asking about estimators I'll assume you are at least a little bit familiar with statistics.

In statistics, estimators take sample data and attempt to compute an estimate for the value of some parameter of the population from which the sample was drawn (e.g. in a simple case, imagine that you record the age of a random set of 10,000 people in NYC and calculate the arithmetic mean of the sample; you then claim that the mean age of the population of NYC is that number, *at a certain level of confidence*. You are likely at least slightly wrong, but I'll get to that). Maximum likelihood, log-likelihood, OLS, and other estimators function similarly for different kinds of data and relationships between variables in the data. **Estimators** try to gauge something about a population using information from a limited subset of that population.
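To make the NYC example concrete, here is a minimal sketch in Python. The "population" is entirely synthetic (the age distribution and sizes are made up for illustration); the point is just that the sample mean estimates the population mean closely, but almost never exactly.

```python
import random
import statistics

# Synthetic stand-in for a population: 100,000 "ages" drawn from a
# made-up distribution (mean 38, sd 15). All numbers are illustrative.
random.seed(0)
population = [random.gauss(38, 15) for _ in range(100_000)]

# The estimator: take a random sample of 10,000 people and
# compute the arithmetic mean of the sample.
sample = random.sample(population, 10_000)
estimate = statistics.mean(sample)

# Compare against the (normally unknowable) true population mean.
true_mean = statistics.mean(population)
error = abs(estimate - true_mean)
```

With a sample this large the error is small, but it is essentially never zero, which is why estimates come with a confidence level rather than a guarantee.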

Depending on your sampling process, the natural structure of the data, measurement types, etc., certain estimators may be more or less appropriate to the data you have or the relationship you want to model. Take OLS: when the Gauss-Markov assumptions are satisfied for a linear regression model, OLS produces an estimate of the effect of some input(s) **X** on some outcome **Y** that is *unbiased* and has the *lowest variance* of any linear unbiased estimator (it is "BLUE"). Not all (not any, if you want to be very strict semantically) models and sample datasets can ensure that these assumptions are satisfied, though, and so OLS estimates often carry some amount of bias or error that is suboptimal if we suppose that *more data* or a *better model* could exist. An easy example of this is the application of conventional OLS to autocorrelated time series data: OLS will find estimates that completely ignore the time-dependent variation in the data, and those estimates will be very incorrect.
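A quick sketch of OLS behaving well when its assumptions hold: the data below are simulated from a linear model with independent noise (true intercept 2, true slope 3, both made up), and the estimates are computed from the closed-form formulas for simple regression.

```python
import random

# Simulate y = a + b*x + noise with i.i.d. Gaussian noise, so the
# Gauss-Markov assumptions hold. a_true and b_true are arbitrary.
random.seed(1)
a_true, b_true = 2.0, 3.0
xs = [random.uniform(0, 10) for _ in range(5_000)]
ys = [a_true + b_true * x + random.gauss(0, 1) for x in xs]

# Closed-form simple OLS:
#   b_hat = cov(x, y) / var(x),  a_hat = mean(y) - b_hat * mean(x)
n = len(xs)
mx = sum(xs) / n
my = sum(ys) / n
cov_xy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
var_x = sum((x - mx) ** 2 for x in xs) / n

b_hat = cov_xy / var_x
a_hat = my - b_hat * mx
```

Here the estimates land very close to the true coefficients. If you instead fed this estimator autocorrelated time-series data, nothing in these formulas accounts for the time dependence, which is exactly the failure described above.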

So then, we might be interested in using some *alternative estimator* like those above (e.g. MLE, log-likelihood, etc.) because that estimator does something different that allows it to find estimates with less bias, smaller error, and the like, given a particular set of data and accompanying model.

Appropriateness will depend *heavily* on the exact data and model of interest. For example, likelihood-based estimation would be a more appropriate choice for a binary-outcome model like *Supports Policy A = C + Age + Sex + Income + … + e* (with C the model intercept and e the error term); this is because fitting a logistic regression by maximizing the (log-)likelihood coerces the output of the model to the [0,1] interval, interpreted as the probability that a given observation supports the policy or not.
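The [0,1] point can be shown without fitting anything. Below, a linear predictor with made-up coefficients (all hypothetical, chosen only for illustration) produces "probabilities" outside [0,1] for some inputs, while the logistic transform always returns a valid probability.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def linear_predictor(age, sex, income, C=-2.0):
    # Hypothetical linear model for "Supports Policy A";
    # every coefficient here is made up for illustration.
    return C + 0.04 * age + 0.5 * sex - 0.00001 * income

# A raw linear model can output values above 1 or below 0...
raw_high = linear_predictor(age=95, sex=1, income=0)       # above 1
raw_low = linear_predictor(age=18, sex=0, income=100_000)  # below 0

# ...while the logistic transform yields valid probabilities.
p_high = sigmoid(raw_high)
p_low = sigmoid(raw_low)
```

This is why a linear probability model can predict nonsense like a 230% chance of support, whereas the logistic regression's output is always interpretable as a probability.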