This article needs additional citations for verification. Please help improve this article by adding citations to reliable sources. Unsourced material may be challenged and removed.Find sources: "American flag sort" – news · newspapers · books · scholar · JSTOR (July 2017) (Learn how and when to remove this message)

An American flag sort is an efficient, in-place variant of radix sort that distributes items into buckets. Non-comparative sorting algorithms such as radix sort and American flag sort are typically used to sort large objects such as strings, for which comparison is not a unit-time operation.^[1] American flag sort iterates through the bits of the objects, considering several bits of each object at a time. For each set of bits, American flag sort makes two passes through the array of objects: first to count the number of objects that will fall in each bin, and second to place each object in its bucket. This works especially well when sorting a byte at a time, using 256 buckets. With some optimizations, it is twice as fast as quicksort for large sets of strings.^[1]

The name American flag sort comes by analogy with the Dutch national flag problem in the last step: efficiently partition the array into many "stripes".

Algorithm

This article may be confusing or unclear to readers. In particular, it is mentioned above that (1) American flag sort is typically used to sort large objects such as strings, and (2) American flag sort is twice as fast as quicksort for large sets of strings; yet this section states that American flag sort can only sort integers. The additional clarification "or objects that can be interpreted as integers" is quite meaningless since every concievable object in computer memory can be interpreted as an integer. Please help clarify the article. There might be a discussion about this on the talk page. (October 2020) (Learn how and when to remove this message)

Sorting algorithms in general sort a list of objects according to some ordering scheme. In contrast to comparison-based sorting algorithms, such as quicksort, American flag sort is based on directly comparing the bytes (numerical representation) of the underlying objects. In-place sorting algorithms, including American flag sort, run without allocating a significant amount of memory beyond that used by the original array. This is a significant advantage, both in memory savings and in time saved copying the array.

American flag sort works by successively dividing a list of objects into buckets based on the first digit of their base-N representation (the base used is referred to as the radix). When N is 3, each object can be swapped into the correct bucket by using the Dutch national flag algorithm. When N is larger, however, objects cannot be immediately swapped into place, because it is unknown where each bucket should begin and end. American flag sort gets around this problem by making two passes through the array. The first pass counts the number of objects that belong in each of the N buckets. The beginning of each bucket is then computed as the sum of sizes of the preceding buckets. The second pass puts each object into the correct bucket.

American flag sort is most efficient with a radix that is a power of 2, because bit-shifting operations can be used instead of expensive exponentiations to compute the value of each digit. When sorting strings using 8- or 7-bit encodings such as ASCII, it is typical to use a radix of 256 or 128, which amounts to sorting character-by-character.^[1]

Performance considerations

It is worth noting that for pure English alphabet text, the counts histogram is always sparse. Depending on the hardware, it may be worth clearing the counts in correspondence with completing a bucket (as in the original paper); or it may be worth maintaining a max and min active bucket, or a more complex data structure suitable for sparse arrays. It is also important to use a more basic sorting method for very small data sets, except in pathological cases where keys may share very long prefixes.

Most critically, this algorithm follows a random permutation, and is thus particularly cache-unfriendly for large datasets.^[2]^{[user-generated source]} It is a suitable algorithm in conjunction with a k-way merge algorithm.^{[citation needed]} (The original paper was written before cached memory was in common use.)

Pseudocode

american_flag_sort(Array, Radix)
    for each digit D:
        # first pass: compute counts
        Counts <- zeros(Radix)
        for object X in Array:
            Counts[digit D of object X in base Radix] += 1
        # compute bucket offsets
        Offsets <- [ sum(Counts[0..i]) for i in 1..Radix]
        # swap objects into place
        for object X in Array:
            swap X to the bucket starting at Offsets[digit D of X in base Radix]
        for each Bucket:
            american_flag_sort(Bucket, Radix)

Sample implementation in Python

This example written in the Python programming language will perform American flag sort for any radix of 2 or greater. Simplicity of exposition is chosen over clever programming, and so the log function is used instead of bit shifting techniques.

from math import floor, log
from copy import copy

def get_radix_val(x, digit, radix):
    return int(floor(x / radix**digit)) % radix

def compute_offsets(a_list, start, end, digit, radix):
    counts = [0 for _ in range(radix)]
    for i in range(start, end):
        val = get_radix_val(a_list[i], digit, radix)
        counts[val] += 1
    offset = start
    offsets = [start]
    for i in range(radix):
        offset += counts[i]
        offsets.append(offset)
    return offsets

def swap(a_list, offsets, start, end, digit, radix):
    next_free = copy(offsets)
    cur_block = 0
    while cur_block < radix:
        i = next_free[cur_block]
        if i >= offsets[cur_block+1]:
            cur_block += 1
            continue
        radix_val = get_radix_val(a_list[i], digit, radix)
        if radix_val != cur_block:
            swap_to = next_free[radix_val]
            a_list[i], a_list[swap_to] = a_list[swap_to], a_list[i]
        next_free[radix_val] += 1

def american_flag_sort_helper(a_list, start, end, digit, radix):
    offsets = compute_offsets(a_list, start, end, digit, radix)
    swap(a_list, offsets, start, end, digit, radix)
    if digit == 0:
        return
    for i in range(len(offsets)-1):
        american_flag_sort_helper(a_list, offsets[i], offsets[i+1], digit-1, radix)

def american_flag_sort(a_list, radix):
    for x in a_list:
        assert type(x) == int
    max_val = max(a_list)
    max_digit = int(floor(log(max_val, radix)))
    american_flag_sort_helper(a_list, 0, len(a_list), max_digit, radix)

References

General

This article incorporates public domain material from Paul E. Black. "American flag sort". Dictionary of Algorithms and Data Structures. NIST.

v t e Sorting algorithms
Theory	Computational complexity theory Big O notation Total order Lists Inplacement Stability Comparison sort Adaptive sort Sorting network Integer sorting X + Y sorting Transdichotomous model Quantum sort
Exchange sorts	Bubble sort Cocktail shaker sort Odd–even sort Comb sort Gnome sort Proportion extend sort Quicksort
Selection sorts	Selection sort Heapsort Smoothsort Cartesian tree sort Tournament sort Cycle sort Weak-heap sort
Insertion sorts	Insertion sort Shellsort Splaysort Tree sort Library sort Patience sorting
Merge sorts	Merge sort Cascade merge sort Oscillating merge sort Polyphase merge sort
Distribution sorts	American flag sort Bead sort Bucket sort Burstsort Counting sort Interpolation sort Pigeonhole sort Proxmap sort Radix sort Flashsort
Concurrent sorts	Bitonic sorter Batcher odd–even mergesort Pairwise sorting network Samplesort
Hybrid sorts	Block merge sort Kirkpatrick–Reisch sort Timsort Introsort Spreadsort Merge-insertion sort
Other	Topological sorting Pre-topological order Pancake sorting Spaghetti sort
Impractical sorts	Stooge sort Slowsort Bogosort