1. Introduction

In this tutorial, we’ll learn how to shuffle an array.

2. Shuffling

In many applications, we have an array a = [a_1, a_2, \ldots, a_n] and need to shuffle it. For example, that’s the case when doing permutation tests to check feature importance in machine learning. Shuffling the array a means getting a random permutation of its elements.

We require all permutations to be equally likely.

3. Naïve Approach

We could first try something like this. We could iterate over i=1,2,\ldots, n, and for each i randomly choose an index j \in \{0, 1, \ldots, n\} and swap a_i with a_j:

algorithm NaiveShuffling(a):
    // INPUT
    //    a = an array [a_1, a_2, ..., a_n]
    // OUTPUT
    //    A random permutation of a

    for i <- 1 to n:
        j <- a random number from {1, 2, ..., n}
        Swap a[i] and a[j]

    return a

This may come as an intuitive thing to do, but the issue is that not all permutations are equally likely.

In each of the n iterations, we can choose any of the n elements of a. So, there are n^n equally likely executions but only n!=n \cdot (n-1) \cdot \ldots \cdot 2 \cdot 1 different permutations. Since n isn’t divisible by n-1 for any n > 2, we can’t spread out n^n executions over n! permutations evenly. So, some permutations are more likely than others.

3.1. Example

Here are all possible executions when the input is [1, 2, 3]:

Shuffle [1, 2, 3] with the naive algorithm.

As we see, permutations [1, 3, 2], [2, 1, 3], [2, 3, 1] occur with the probability \frac{5}{27} each, whereas the rest with \frac{4}{27}.

4. The Fisher-Yates Algorithm

Instead, we can use the Fisher-Yates algorithm. For each i = 1, 2, \ldots, n-1, it randomly selects an index j from \{ i, i+1, \ldots, n\} to make a swap:

algorithm FisherYatesAlgorithm(a):
    // INPUT
    //    a = an array [a_1, a_2, ..., a_n]
    // OUTPUT
    //    A random permutation of a

    for i <- 1 to n - 1:
        j <- a random number from {i, i + 1, ..., n}
        Swap a[i] and a[j]

    return a

Basically, whenever we choose a_j to swap with the current element a_i, we discard it from participating in the swaps with a_{i+1}, a_{i+2}, \ldots, a_{n}. We achieve that by restricting the range for j.

Therefore, the Fisher-Yates algorithm splits the array into two parts: a[1:(i-1)]  and a[i:n]. We’ll call them the inactive and active parts. In the ith iteration, the algorithm can choose only active elements for the exchange with a_i. Afterward, the inactive part grows to include the element at the  ith position.

So, for each i=1,2,\ldots,n-1, there are n-i+1 active elements to choose from. As a result, the total number of possible executions is:

    [n \codt (n-1) \cdot (n-2) \cdot \ldots \cdot 2  = n!]

which matches the number of permutations. Since we can choose an element for a swap only once, there is only one way to get each permutation. Consequently, they’re all equally likely and occur with the probability of \frac{1}{n!}.

4.1. Example

Let’s say a = [1, 2, 3, 4]. Here’s a possible execution where | denotes the split:

    [\begin{matrix} | & 1 & 2 & 3 & 4 \\ 2 & | & 1 & 3 & 4 \\ 2 & 4 & | & 3 & 1 \\ 2 & 4 & 1 & | & 3 \end{matrix}]

In the last step, there’s only one possibility for a_4, so we stop and return [2, 4, 1, 3].

4.2. Potential Issues

Random integer generation is at the core of the Fisher-Yates algorithm. If we can’t sample from a truly uniform distribution over \boldsymbol{\{i, i+1, \ldots, n\}}, the algorithm will be biased toward some permutations. That can happen in two ways.

Let’s say we have a random number generator rand that returns integers from 0 to N-1 inclusive (N is usually the maximal integer). We can get j \in \{i, i+1, \ldots, n\} like this:

    [i + \left(rand() \bmod (n - i + 1) \right)]

For all the remainders to be equally likely, N must be divisible by each possible n-i+1, which we can’t assume to be the case. As some remainders are more likely than others, not all js we compute with the above formula are equally probable. However, the higher the N, the less the discrepancy.

On the other hand, if we have a generator that returns random numbers from [0, 1], we can get j like this:

    [i + round\left( rand() \cdot (n - i) \right)]

Since computers have finite precision, rand() can’t return all real numbers from [0, 1], so rounding won’t result in equally probable 0, 1, \ldots, n-i. Adding i to the result of rounding does give us integers from the desired range, but not all are equally likely.

5. Conclusion

In this article, we showed how to shuffle an array.

The naïve approach is intuitive but incorrect since not all permutations are equally likely. The Fisher-Yates algorithm shuffles an array and returns each permutation with the same probability.


» 下一篇: 什么是数据湖?