Bob Carpenter wrote today about how Markov chains cannot thoroughly cover high-dimensional spaces, and that they do not need to. That’s kinda th

MCMC and the coupon collector problem

submited by
Style Pass
2024-05-07 01:30:05

Bob Carpenter wrote today about how Markov chains cannot thoroughly cover high-dimensional spaces, and that they do not need to. That’s kinda the point of Monte Carlo integration. If you want systematic coverage, you can/must sample systematically, and that’s impractical in high dimensions.

Bob gives the example that if you want to get one integration point in each quadrant of a 20-dimensional space, you need a million samples. (220 to be precise.) But you may be able to get sufficiently accurate results with a lot less than a million samples.

If you wanted to be certain to have one sample in each quadrant, you could sample at (±1, ±1, ±1, …, ±1). But if for some weird reason you wanted to sample randomly and hit each quadrant, you have a coupon collector problem. The expected number of samples to hit each of N cells with uniform [1] random sampling is

[1] We’re assuming each cell is sampled independently with equal probability each time. Markov chain Monte Carlo is more complicated than that because the draws are not independent.

Leave a Comment