
Copyright © 2023 AnswerCult

It usually works by shrinking the coefficients you get at the end. The coefficients get shrunk because a penalty is applied. Normally you're essentially minimizing the squared error by finding the best coefficients. With regularization, you add a penalty for each coefficient based on its size. So unless a coefficient explains enough variance to overcome the penalty, it gets shrunk toward 0.
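Not part of the original answer, but the penalty idea above can be sketched in a few lines of numpy. Ridge regression (the L2 penalty) even has a closed form, so you can see the shrinkage directly; the toy data and the penalty strength `lam=10.0` here are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: y depends strongly on x0 and only weakly on x1.
X = rng.normal(size=(50, 2))
y = 3.0 * X[:, 0] + 0.1 * X[:, 1] + rng.normal(scale=0.5, size=50)

def ols(X, y):
    # Ordinary least squares: minimize ||y - Xb||^2
    return np.linalg.solve(X.T @ X, X.T @ y)

def ridge(X, y, lam):
    # Ridge: minimize ||y - Xb||^2 + lam * ||b||^2
    # Adding lam to the diagonal is what pulls coefficients toward 0.
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

b_ols = ols(X, y)
b_ridge = ridge(X, y, lam=10.0)
# The ridge coefficients are pulled toward 0 relative to OLS;
# the strong x0 coefficient survives mostly intact, the weak one
# contributes little and gets squashed.
```

Setting `lam=0.0` recovers plain least squares, and cranking `lam` up shrinks everything harder, which matches the "penalty vs. variance explained" tug-of-war in the paragraph above.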

This whole process is nice because a regular linear regression fits the sample as closely as possible, but at the cost of high variance: the coefficients are tuned to the data you have, not the data you don't. The penalty deliberately introduces some bias into the estimate (the shrinkage) in order to reduce that variance, trying to find the best overall balance between the two rather than just maximizing fit to the sample.