蓄水池抽样

This problem can be solved easily. You don't have to use reservoir sampling at all. You just need to:

  1. calculate the length of the list

  2. randomly draw a number from 1 ~ length (let's say i), then return the ith element, and it's done.

But if you want to know more about reservoir sampling, take a looking at the following part.

Problem Desciption

  • Choose k entries from n numbers. Make sure each number is selected with the probability of k/n

  • This problem is the special case where k=1

Basic idea

  • Choose 1, 2, 3, ..., k first and put them into the reservoir.

  • For the next k+1 element, pick it with a probability of k/(k+1), and randomly replace a number in the reservoir.

  • Same as above, for the following k+i element, pick it with a probability of k/(k+i), and randomly replace a number in the reservoir.

  • Repeat until k+i reaches n

Proof

  • For k+i, the probability that it is selected and will replace a number in the reservoir is k/(k+i) (As we defined above)

  • For a number in the reservoir before (let's say X), the probability that it keeps staying in the reservoir is

    • P(X was in the reservoir last time) × P(X is not replaced by k+i)

    • = P(X was in the reservoir last time) × (1 - P(k+i is selected and replaces X))

    • = k/(k+i-1) × (1 - k/(k+i) × 1/k

    • = k/(k+i)

  • When k+i reaches n, the probability of each number staying in the reservoir is k/n

Examples

Example1: Choose 3 numbers from [111, 222, 333, 444]. Each number should be selected with a probability of 3/4

  1. Choose [111, 222, 333] as the initial reservior

  2. Then choose 444 with a probability of 3/4

  3. For 111, it stays with a probability of

    • P(444 is not selected) + P(444 is selected but it replaces 222 or 333)

    • = 1/4 + 3/4*2/3

    • = 3/4

  4. The same case with 222 and 333

  5. Now all the numbers have the probability of 3/4 to be picked

Example2: Choose 1 number from [111, 222, 333]. Each number should be selected with a probability of 1/3

  1. Choose [111] as the initial reservior

  2. Now we iterate to 222, choose it with the probility of 1/2

    1. if it is chosen (1/2 probility), we replace 111 with 222 . In this case, 222 is picked with the probility of 1/2

    2. if it is not chosen (another 1/2 probility), 111 stays. In this case, 111 stays with the probility of 1/2

    3. Combine the above 2 cases, both 111 and 222 have the probility of 1/2

  3. Now we iterate to 333, choose it with the probility of 1/3

    1. if it is chosen (1/3 probility), we replace the result of last iteration to 333. In this case, 333 is picked with the probility of 1/3

    2. if it is not chosen (2/3 probility), the result of last iteration stays, and in last iteration, both 111 and 222 have the probility of 1/2. In this round, their probability become 2/3 * 1/2 = 1/3.

    3. Now all the nums 111, 222, 333 have the probility of 1/3 to be picked

Sample Code

func (this *Solution) GetRandom() int {
    res, n := 0, 0               // res: reservoir, n: counter
    for h := this.Head; h != nil; h = h.Next {
        n++
        if rand.Intn(n) == 0 {   // rand.Intn(n) draws from [0 ~ n-1], so it has a probability of 1/n to equals 0.
            res = h.Val          // If it equals, means it is chosen and should replace the reservoir
        }
    }
    return res
}

Last updated