Friday, January 1, 2010

Sampling with and without replacement

In the bootstrapping literature (Efron & Tibshirani, 1993), the idea of ‘sampling with replacement’ is essential. It means that every bootstrap sample of size n is drawn from the same data set of size n, yet the samples generally differ from the original data set and from one another, because a unit may appear more than once in a sample while another unit is left out. For instance, bootstrap samples from the data set of the numbers {1,9,11,12} may be: {1,1,11,12}; {1,9,11,9}; {1,9,11,12} and so on.
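A minimal sketch of how such bootstrap samples could be drawn for the example data, using Python's standard random module. The function name bootstrap_sample and the fixed seed are illustrative assumptions, not something taken from Efron & Tibshirani (1993).

```python
import random

def bootstrap_sample(data, rng):
    """Draw one bootstrap sample: len(data) units, with replacement."""
    return rng.choices(data, k=len(data))

data = [1, 9, 11, 12]
rng = random.Random(0)  # seed chosen arbitrarily, only for reproducibility

# Three bootstrap samples, each of the same size as the original data set.
samples = [bootstrap_sample(data, rng) for _ in range(3)]
for s in samples:
    print(s)
```

Each printed sample has four units, all taken from the original data, and any unit may appear more than once.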
However, the procedure underlying sampling with replacement differs completely from sampling without replacement, in which the same unit can never appear twice.
In fact, sampling with replacement supposes that units are drawn one after another, and that a copy of each drawn unit is put back into the data before the next draw (the actual replacement). In contrast, sampling without replacement allows one to ‘grab’ n units at the same time.
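The two procedures can be sketched side by side as follows. This is only an illustration under the assumption that the data are held in a Python list; the function names draw_with_replacement and draw_without_replacement are made up for this example.

```python
import random

def draw_with_replacement(data, n, rng):
    """Draw n units one after another; every drawn unit stays available."""
    sample = []
    for _ in range(n):
        unit = rng.choice(data)  # data is never modified, so the unit is 'put back'
        sample.append(unit)
    return sample

def draw_without_replacement(data, n, rng):
    """Grab n units at once; no unit can be picked twice."""
    return rng.sample(data, n)

data = [1, 9, 11, 12]
rng = random.Random(1)  # arbitrary seed, only for reproducibility

with_repl = draw_with_replacement(data, len(data), rng)
without_repl = draw_without_replacement(data, len(data), rng)
print(with_repl)     # may contain repeated units
print(without_repl)  # the original units in some order
```

Note that grabbing n units without replacement from a data set of size n always returns the whole data set, only in a different order, which is why it is of no use for bootstrapping.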
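A similar effect to the sequential drawing described above for sampling with replacement would be obtained if n units were grabbed at once from a data set consisting of n copies of the original data. That data set would then be of size n². Every sample that sequential drawing can produce is then possible, although the probabilities are not exactly the same, because each grabbed unit slightly depletes the enlarged data set.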
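How close the construction with n copies comes to true sampling with replacement can be checked by exact enumeration for the small example data. The sketch below compares, under both schemes, the probability of obtaining every original unit exactly once. The variable names are illustrative, and the comparison of this one event is a choice made for the example.

```python
from fractions import Fraction
from itertools import combinations, product

data = [1, 9, 11, 12]
n = len(data)

# Sampling with replacement: all n**n ordered draws are equally likely.
ordered = list(product(data, repeat=n))
p_with = Fraction(sum(len(set(s)) == n for s in ordered), len(ordered))

# Pooled version: n copies of the data (n**2 units), grab n units at once.
pool = data * n
grabs = list(combinations(range(len(pool)), n))  # positions keep the copies distinct
p_pool = Fraction(sum(len({pool[i] for i in g}) == n for g in grabs), len(grabs))

print(p_with)  # 3/32
print(p_pool)  # 64/455
```

Both schemes can produce the same samples, but the pooled version yields the all-distinct sample somewhat more often (64/455 versus 3/32), since removing a unit from the pool makes its remaining copies slightly less likely to be grabbed.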

References

Efron, B., & Tibshirani, R. J. (1993). An introduction to the bootstrap. New York: Chapman & Hall.