Monte Carlo: 100k iteration simulation five times vs 500k iteration simulation once

I don't know much about Monte Carlo methods. What would be the difference in the reliability/accuracy of the final expected value in a Monte Carlo simulation under the following two methods:

1) Run a 100k-iteration Monte Carlo simulation five times over, take the average of each of the five simulations, and then take the average of those five averages.
2) Run a 500k-iteration Monte Carlo once.

Presumably the second would be more accurate and reliable, but I am not sure why.

5 Answers

Anonymous

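Monte Carlo is used for a huge variety of tasks in statistics and probability, and there are many different ways to do it. In the simplest case you draw samples from a known distribution and evaluate a function at the sample points, either to estimate some statistical property of the result or to compute an integral. This is known as crude MC, and here the two approaches you describe are equivalent. The samples are all drawn independently from the same distribution, so 5 sets of 100k are no different from a single set of 500k: because the runs are the same size, the average of the five averages is exactly the average of all 500k samples. (This assumes the five runs use independent random number streams, not the same seed.) In both cases the standard error of the estimate shrinks in proportion to 1/sqrt(N), where N is the total number of samples.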
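
To make the crude MC case concrete, here is a minimal sketch in Python with NumPy. The target quantity (E[X^2] for a standard normal X, whose true value is 1), the function name `compare` and the seed are all assumptions chosen for illustration, not something from the question. It draws 500k samples, then computes both the pooled average and the average of five 100k-sample averages from the same draws.

```python
import numpy as np

def compare(n_runs=5, n_per_run=100_000, seed=0):
    """Pooled mean vs. average of per-run means for crude Monte Carlo.

    Illustrative target (an assumption): E[X^2] with X ~ N(0, 1), true value 1.
    """
    rng = np.random.default_rng(seed)
    # All draws are independent and come from the same distribution.
    values = rng.standard_normal(n_runs * n_per_run) ** 2

    # Method 2: one big run, one average.
    pooled = values.mean()

    # Method 1: split into equal-size runs, average each, then average those.
    per_run_means = values.reshape(n_runs, n_per_run).mean(axis=1)
    avg_of_avgs = per_run_means.mean()

    # Standard error of the pooled estimate: sample std / sqrt(N).
    std_err = values.std(ddof=1) / np.sqrt(values.size)
    return pooled, avg_of_avgs, std_err

if __name__ == "__main__":
    pooled, avg_of_avgs, std_err = compare()
    print(f"pooled mean:         {pooled:.6f}")
    print(f"average of averages: {avg_of_avgs:.6f}")
    print(f"standard error:      {std_err:.6f}")
```

The two estimates should agree up to floating-point rounding. One practical reason to keep the five separate averages anyway is that their spread gives a quick sanity check on the standard error.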

Burn-in becomes relevant in Markov chain Monte Carlo (MCMC), a common tool for inferring distributions, i.e. learning the distribution of variables you cannot observe directly from some related quantity that you can measure. In this case you cannot sample from the target distribution directly. Instead you propose samples and accept or reject them (the simplest such method is the Metropolis-Hastings algorithm) in such a way that the “chain” of successive samples eventually converges to the correct distribution. Because the early samples are not yet drawn from the correct distribution, as they would be in crude MC, some of them must be discarded. Every chain needs its own burn-in, so one run of 500k discards a smaller proportion of the total than five runs of 100k, which each pay the same burn-in cost. You may still prefer five runs of 100k, because they can run in parallel and finish sooner in wall-clock time.
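
As a rough illustration of burn-in, here is a sketch of random-walk Metropolis, the special case of Metropolis-Hastings with a symmetric proposal. The standard normal target, the starting point far from the mode, the step size and the burn-in length of 1000 are all hypothetical choices for the example. In practice the burn-in length depends on the problem.

```python
import numpy as np

def metropolis(log_density, x0, n_steps, step_size, rng):
    """Random-walk Metropolis: Gaussian proposal centred on the current point."""
    chain = np.empty(n_steps)
    x, logp = x0, log_density(x0)
    for i in range(n_steps):
        proposal = x + step_size * rng.standard_normal()
        logp_prop = log_density(proposal)
        # Accept with probability min(1, p(proposal) / p(x)), done in log space.
        if np.log(rng.random()) < logp_prop - logp:
            x, logp = proposal, logp_prop
        chain[i] = x  # on rejection the current point is repeated
    return chain

def discarded_fraction(n_chains, n_steps, burn_in):
    """Share of all samples lost when every chain drops its first burn_in draws."""
    return n_chains * burn_in / (n_chains * n_steps)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Unnormalised log density of N(0, 1); the chain starts far away at x0 = 10.
    chain = metropolis(lambda x: -0.5 * x * x, 10.0, 20_000, 1.0, rng)
    kept = chain[1_000:]  # hypothetical burn-in of 1000 steps
    print("mean after burn-in:", kept.mean())
    print("one chain of 500k discards:  ", discarded_fraction(1, 500_000, 1_000))
    print("five chains of 100k discard: ", discarded_fraction(5, 100_000, 1_000))
```

With the same per-chain burn-in, the single long chain throws away 0.2% of its samples and the five shorter chains throw away 1%, which is the trade-off against being able to run the shorter chains in parallel.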
