How do small percentages continue to work out in the long run?

If something has a 5% chance of success, and I’ve failed 19 consecutive times, the next time independently still has a 5% chance of success, fairly low, rather than a guaranteed 100% chance of success in view of the previous attempts. How does this relate to updating percentages in light of new evidence, or is that something separate?

8 Answers

Anonymous 0 Comments

You’re mixing up two concepts. You’re starting with the assumption that the chance of success is 5% – that is, you *know* the probability going in. There’s nothing for you to update with new observations; you’re already certain it’s 5%. Since you seem to be assuming the trials are independent (even if you aren’t saying so), the probability remains 5% regardless of past trials (since that’s what independent means in the first place).

If, on the other hand, you were not certain about the underlying probability, and you fail 19 times, your estimate of the underlying probability *would* correctly shift downward (since you’d be more likely to observe 19 failures if the probability were low than if it were high). [Bayes’ theorem](https://en.wikipedia.org/wiki/Bayes%27_theorem) is the mathematical law that tells you how to update your beliefs in light of new evidence, at least in simple cases like this (it can be tricky to apply in real world settings). The underlying probability is still fixed, but your *estimate* of that probability is changing, and (under some relatively weak assumptions) will approach the true value as you gather more data.
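As a rough illustration of that update, here is a minimal sketch of Bayes’ theorem applied to 19 straight failures. The four candidate probabilities and the equal starting weights are assumptions picked for the example, not anything from the question.

```python
def update_on_failures(candidates, prior, failures):
    """Bayes' theorem on a small grid of candidate success probabilities."""
    # Likelihood of seeing `failures` failures in a row if the true chance is p.
    unnormalized = [w * (1 - p) ** failures for p, w in zip(candidates, prior)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

# Hypothetical candidates for the unknown success chance, equally weighted.
candidates = [0.01, 0.05, 0.20, 0.50]
prior = [0.25, 0.25, 0.25, 0.25]

posterior = update_on_failures(candidates, prior, failures=19)
for p, w in zip(candidates, posterior):
    print(f"p = {p:.2f}: belief {w:.4f}")
```

Belief moves toward the low candidates and almost entirely away from 50%, even though whatever the true probability is never changed.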

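If your question is “how does it adjust if it fails a bunch to ‘get back to 5%'” – it doesn’t. Nothing compensates for a streak; over many trials, streaks are simply diluted by the sheer volume of later results, so the overall success rate drifts toward 5%.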
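A quick simulation can make this concrete. This is only a sketch, assuming independent tries with a known 5% chance; the run count and seed are arbitrary choices. It keeps only the runs that start with 19 failures and checks how often the 20th try succeeds.

```python
import random

def next_try_after_streak(p=0.05, streak=19, runs=100_000, seed=0):
    """Count how often a try succeeds right after `streak` straight failures."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    streaks_seen = 0
    wins_after_streak = 0
    for _ in range(runs):
        # A try fails when the random draw lands at or above p.
        if all(rng.random() >= p for _ in range(streak)):
            streaks_seen += 1
            if rng.random() < p:
                wins_after_streak += 1
    return streaks_seen, wins_after_streak

streaks_seen, wins_after_streak = next_try_after_streak()
rate_after_streak = wins_after_streak / streaks_seen
print(f"{streaks_seen} streaks, success rate on the next try: {rate_after_streak:.3f}")
```

The rate on the next try should land close to 0.05, not anywhere near 1.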

Anonymous 0 Comments

The key to this is whether each attempt is independent of the other attempts or is influenced by them. For example, if you flip a coin 20 times in a row, each flip is independent of the previous or successive flips. They don’t impact it at all.

But if you are drawing a colored ball out of a bin, and removing that ball from the sample, that does influence subsequent picks.

Anonymous 0 Comments

>How does this relate to updating percentages in light of new evidence, or is that something separate?

That’s something completely separate. New evidence would completely invalidate what you previously believed and whatever model you were using. All of that math needs to be thrown out the window because you have been wrong the entire time.

This has nothing to do with probabilities as you described, correctly, before that.

Anonymous 0 Comments

It’s related to independent and dependent events.

A coin toss is an independent event. Every time I flip a coin, the odds are always 50% for any given result. It doesn’t matter if the coin has landed a certain way previously; prior results don’t affect subsequent ones because the way a coin lands does not fundamentally alter the coin.

Things like pulling a certain sock color from a drawer are dependent. If I have 19 white socks and one black sock in a drawer, there’s a 5% chance that I pull the black sock on the first try. If that first sock turns out to be white and I don’t put it back in the drawer for my second choice, the odds of getting the black sock are now up to 1/19 instead of 1/20. Those odds will keep going up as I remove more white socks because I’m changing the situation. As there are fewer and fewer socks to pick, it becomes more likely that I get the black one. My future results depend on the changes to the system caused by previous results.
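The sock drawer can be worked out exactly. A small sketch, assuming every sock removed so far was white and none were put back (the function name is just for illustration):

```python
from fractions import Fraction

def black_sock_chance(whites_removed, white=19, black=1):
    """Chance the next sock is black after removing some white socks for good."""
    remaining = white + black - whites_removed
    return Fraction(black, remaining)

# Chance on each successive pull, given every earlier pull was white.
chances = [black_sock_chance(k) for k in range(20)]
for pull, chance in enumerate(chances, start=1):
    print(f"pull {pull:2d}: {chance} = {float(chance):.1%}")
```

The chance climbs from 1/20 on the first pull to a certainty on the twentieth, which is exactly what does not happen when the sock goes back in the drawer each time.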

Anonymous 0 Comments

The chance of a 5% event happening at least once in n tries is equal to 1 – (the probability of it not happening at all in n tries).

If we have a 5% chance to succeed on a given attempt (1/20) we have a 95% chance to fail (19/20). If we try 20 times, the odds it doesn’t happen at all is (19/20)^20 or ~35.8% chance. That means the chance of succeeding at least once is ~64.2%. The odds of succeeding exactly once is 20 * 1/20 * (19/20)^19 or ~37.7%. The odds of succeeding exactly twice are 190 * (1/20)^2 * (19/20)^18 , or ~18.9%

The formula for exactly k successes in n tries is nCk * P(win)^k * P(lose)^(n-k). If we add together k=0, 1, 2, 3…n, we get 1. If this formula looks familiar, each term comes from a binomial expansion. As long as the events are independent, this is how things will go. Like rolling a die, the previous outcome does not affect the next. If it were something like a deck of cards, if I pull every card out of the deck one at a time and haven’t pulled the 8 of hearts, the last card must be the 8 of hearts. That is because these are dependent events, where each draw is impacted by previous draws. But if I pull out a card, replace it, and then shuffle the deck, the next card pull is independent of the previous.
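For anyone who wants to check the figures for 20 independent tries at 5%, here is a short sketch using Python’s standard library; `binomial_pmf` is just a name chosen for this example.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Chance of exactly k successes in n independent tries, each with chance p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 20, 0.05
p_none = binomial_pmf(0, n, p)
p_at_least_one = 1 - p_none
p_exactly_one = binomial_pmf(1, n, p)
p_exactly_two = binomial_pmf(2, n, p)
total = sum(binomial_pmf(k, n, p) for k in range(n + 1))

print(f"no successes:      {p_none:.1%}")
print(f"at least one:      {p_at_least_one:.1%}")
print(f"exactly one:       {p_exactly_one:.1%}")
print(f"exactly two:       {p_exactly_two:.1%}")
print(f"sum over all k:    {total:.6f}")
```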

Anonymous 0 Comments

So you’re getting a couple things confused here. There are actually 3 different situations you’re sort of conflating:

Situation A: You write the numbers 1-20 on some pieces of cardboard, and put all the pieces in a jar. You mix them up thoroughly and pick a piece without looking; then you look at it, see it’s not a 20, then rip it to shreds and throw it in the trash. Repeat the previous sentence 19 times.

Situation B: You write the numbers 1-20 on some pieces of cardboard, and put all the pieces in a jar. You mix them up thoroughly and pick a piece without looking; then you look at it, see it’s not 20, and put it back in the jar. Repeat the previous sentence 19 times.

Situation C: I pick a secret number N, write the numbers 1-N on some pieces of cardboard, and put all the pieces in a jar. You mix them up thoroughly and pick a piece without looking; then you look at it, see it’s 10 or smaller, and put it back in the jar. Repeat the previous sentence 19 times.

After each situation, you participate in an auction. The prize in the auction is a piece of paper that says, “I will mix the pieces in the jar fairly and draw a piece. If the piece says 20, I will pay you $100. – The Bank”.

The probability of winning, multiplied by the $100 payout, is then the most you should be willing to pay for that piece of paper.

– For Situation A, you should be willing to pay $100. After 19 pieces have been drawn *and shredded* the remaining piece is definitely the 20.
– For Situation B, you should be willing to pay $5. There are 20 pieces equally likely to be chosen, and only one of them has the winning text.
– For Situation C, it’s a little fuzzier and the math is a bit complicated. Basically you assume (before you draw the pieces of paper) how likely I am to pick various values of N, then update those assumptions based on what the pieces of paper say. If you draw 19 pieces of paper and none of them are greater than 10, the Bank’s contract is nearly worthless, as it’s extremely unlikely I wrote N=20; you’d be vastly overpaying for The Bank’s contract even if you only paid $0.01.
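To put a rough number on Situation C, here is a minimal sketch. It assumes, purely for illustration, that before any draws you consider every N from 1 to 40 equally likely; that prior is not part of the original setup, and a different one would give a different price.

```python
def posterior_over_n(max_n=40, draws=19, cap=10):
    """Belief about the secret N after `draws` draws that were all <= cap."""
    weights = {}
    for n in range(1, max_n + 1):
        # Chance a single draw from 1..n is at most `cap`.
        chance_small = min(cap, n) / n
        # Uniform prior (an assumption): its constant factor cancels out.
        weights[n] = chance_small ** draws
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()}

def contract_value(belief, prize=100.0, target=20):
    """Expected payout: a piece saying `target` exists only if N >= target."""
    return prize * sum(prob / n for n, prob in belief.items() if n >= target)

prior = {n: 1 / 40 for n in range(1, 41)}
posterior = posterior_over_n()

value_before = contract_value(prior)
value_after = contract_value(posterior)
print(f"fair price before any draws: ${value_before:.2f}")
print(f"fair price after 19 small draws: ${value_after:.8f}")
```

Under this assumed prior the fair price falls from a couple of dollars to a tiny fraction of a cent.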

Situation A is like a card game. “Drawing without replacement” is jargon for Situation A.

Situation B is like an experiment where the outcome’s determined by a repeatable process with randomized elements. Jargon for Situation B is “Drawing with replacement” or “Bernoulli trial.”

Situation C is like updating your worldview based on new evidence. Jargon for Situation C is “Bayesian inference” or “updating your prior”.

And now we get to the word “Independence.” Independence refers to multiple random events that don’t affect each other. (In other words, knowing the outcome of one event doesn’t change the probability you assign to the other event.)

In B the draws are independent. In A the draws are not independent. In C, it’s complicated: if you know N, it’s Situation B (with N instead of 20) and the draws are independent. If you don’t know N, the draws aren’t independent, because they’re all influenced by the unknown N (which you treat as another random variable).

The way your OP is worded, it’s not clear whether “independently” is part of the problem statement, or part of the solution.

If you intended to say this:

– Version 1: *Problem:* Trying has a 5% chance of success, and I’ve tried and failed 19 consecutive times. What’s my chance of succeeding on the next try? *Solution:* Independently, there’s still a 5% chance of success.

– Version 2: *Problem:* Trying has a 5% chance of success, and I’ve tried and failed 19 consecutive times. Tries are independent. What’s my chance of succeeding the next time? *Solution:* There’s still a 5% chance of success.

In Version 2 the solution’s correct and all is well.

In Version 1, things are problematic. You don’t know whether you’re in Situation A or Situation B; the problem didn’t tell you. If you only use the information given in the problem statement, you have to say “I don’t know, because the problem didn’t include enough information to tell me whether I’m in Situation A or Situation B.”

Anonymous 0 Comments

> If something has a 5% chance of success … How does this relate to updating percentages

It doesn’t. If _p_ for a series of independent events truly is 5%, or you are sampling _with replacement_, it’s 5%. That doesn’t “update.”

If you are drawing without replacement, then _p_ would change with each draw; were it 1 winner and 19 losers in a hat, _p_ of the next draw winning would approach 100% till the winner is drawn.

> updating percentages

Let’s presume that _p_ is unknown. You could study the phenomenon and try to estimate _p_. If the event seems to be with replacement or independent each time, then your _most likely_ value of _p_ would be whatever wins-per-draws ratio you observed. Statistics can also give you a _confidence interval_ around that estimate, which adds another level of chance. For example, let’s say you observe 10,000 times and see exactly 442 successes. The most likely value of _p_ is 4.42%, and a _95% confidence interval_ works out to roughly 4.0% to 4.8%. Strictly speaking, the 95% describes the method rather than this one interval: if _p_ is a fixed value and you repeated the whole experiment many times, about 95% of the intervals built this way would contain _p_, and about 5% would miss it because of a strange run of luck.

If we continue to sample, the range will get narrower, but we still don’t _know_ what _p_ is or if our sample is becoming more or less representative of _p_. If we sample to 4,420 in 100,000, the most likely value of _p_ is still 4.42%, and then our range becomes 4.29% to 4.55%, but we’re still only 95% confident.
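Ranges like these can be reproduced with a short sketch. It assumes the simple normal-approximation (Wald) interval, which may not be the exact method behind the figures quoted here; other interval methods give slightly different endpoints.

```python
from math import sqrt
from statistics import NormalDist

def wald_interval(successes, trials, confidence=0.95):
    """Normal-approximation confidence interval for a success probability."""
    p_hat = successes / trials
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / trials)
    return p_hat - margin, p_hat + margin

small = wald_interval(442, 10_000)
large = wald_interval(4_420, 100_000)
print(f"442 in 10,000:    {small[0]:.2%} to {small[1]:.2%}")
print(f"4,420 in 100,000: {large[0]:.2%} to {large[1]:.2%}")
```

Ten times the data shrinks the width by a factor of about the square root of ten, which is why the second range is roughly a third as wide as the first.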

We _can’t know_ what _p_ is from sampling, but depending on what is most important to us—risk of being wrong, accuracy of our guessed range, or the cost of additional sampling—we usually can get to something satisfactory enough to estimate with, or to decide if our theory is supported or should be reconsidered.