WSP-Sort-Unstable: The Fastest Shell Sort Gap Sequence Calculation and Sorting Algorithm Without Auxiliary Partition Space

William Stafford Parsons developed a bitwise calculation for an optimal sequence of unstable Shell Sort gaps as a substantial improvement to Ciura, Knuth, Sedgwick and Tokuda sequences.

Library

Source

void wsp_sort_unstable_ascending(unsigned long input_count, unsigned long *input) { unsigned long _input; unsigned long gap = (input_count >> 5) + (input_count >> 3) + 1; unsigned long i; unsigned long j; while (gap > 0) { i = gap; while (i < input_count) { _input = input[i]; j = i; while ( j >= gap && input[j - gap] > _input ) { input[j] = input[j - gap]; j -= gap; } input[j] = _input; i++; } if ( gap > 3 || gap == 1 ) { gap = (gap >> 5) + (gap >> 2); } else { gap = 1; } } } void wsp_sort_unstable_descending(unsigned long input_count, unsigned long *input) { unsigned long _input; unsigned long gap = (input_count >> 5) + (input_count >> 3) + 1; unsigned long i; unsigned long j; while (gap > 0) { i = gap; while (i < input_count) { _input = input[i]; j = i; while ( j >= gap && input[j - gap] < _input ) { input[j] = input[j - gap]; j -= gap; } input[j] = _input; i++; } if ( gap > 3 || gap == 1 ) { gap = (gap >> 5) + (gap >> 2); } else { gap = 1; } } }

Reference

wsp_sort_unstable_ascending() is the sorting function in ascending order that accepts the following 2 arguments.

input_count is the count of elements in the input array.

input is the unsigned long array to sort. The default unsigned long data type is interchangeable with any integral data type. The integral data type of the input pointer must match the integral data type of _input.

The return value data type is void.

wsp_sort_unstable_descending() is the sorting function in descending order that accepts the following 2 arguments.

input_count is the count of elements in the input array.

input is the unsigned long array to sort. The default unsigned long data type is interchangeable with any integral data type. The integral data type of the input array must match the integral data type of _input.

The return value data type is void.

Requirements

C compiler with C89 (ISO/IEC 9899:1990) standard compatibility.

Explanation

WSP-Sort-Unstable is designed to both increase the speed and decrease resource usage for all Shell Sort implementations.

It's the fastest in-place sorting algorithm on average that doesn't allocate additional memory for input partitions. This applies across all input data types, distributions and sizes.

It's portable for both 32-bit and 64-bit systems.

It meets strict compliance, portability and code security requirements.

It doesn't use modulus, multiplication or division arithmetic operations.

Before sorting, it doesn't pre-calculate the upper limit with a loop starting from 0. Instead, the gap numbers in each sorting instance are dynamically-calculated with inconsistencies among different input_count values.

Elements are always guaranteed to be sorted within bounds using conditional statements that guarantee the final pass is always a regular insertion sort with a gap value of 1.

After each pass, gap > 3 prevents any result of gap calculation from jumping from either 3 or 2 to 0 instead of 1 based on the following calculation output table.

Gap Calculation Result 0 0 1 0 2 0 3 0 4 1 5 1 6 1 7 1 8 2 9 2 10 2

The following speed test results were performed with 1 million randomized elements.

Compared to optimized Shell Sort, it's 31% faster.

Compared to optimized Knuth's Sequence, it's 24% faster without requiring either division operations or pre-calculated gap increments.

Furthermore, it's still faster when the gap increments are hard-coded.

Compared to optimized Ciura, Sedgewick and Tokuda Sequences, it's marginally-faster without a hard-coded upper limit of elements in auxiliary memory.

The speed is faster with a variety of input_count values and input data types as shown in the following benchmarking table.

10k 2-Byte Inputs Ciura 0.005s Tied Knuth 0.005s Tied SedgeWick 0.005s Tied Tokuda 0.005s Tied Shell 0.006s ---- WSP-Sort-Unstable 0.005s Tied 100k 2-Byte Inputs Ciura 0.040s --- Knuth 0.040s --- SedgeWick 0.039s --- Tokuda 0.040s --- Shell 0.047s --- WSP-Sort-Unstable 0.037s Won 250k 2-Byte Inputs Ciura 0.103s --- Knuth 0.110s --- SedgeWick 0.101s --- Tokuda 0.102s --- Shell 0.126s --- WSP-Sort-Unstable 0.091s Won 500k 2-Byte Inputs Ciura 0.211s --- Knuth 0.243s --- SedgeWick 0.212s --- Tokuda 0.209s --- Shell 0.291s --- WSP-Sort-Unstable 0.191s Won 1m 1-Byte Inputs Ciura 0.302s --- Knuth 0.349s --- SedgeWick 0.291s --- Tokuda 0.304s --- Shell 0.380s --- WSP-Sort-Unstable 0.266s Won 1m 2-Byte Inputs Ciura 0.461s --- Knuth 0.570s --- SedgeWick 0.453s --- Tokuda 0.451s --- Shell 0.634s --- WSP-Sort-Unstable 0.423s Won

It's faster than every other sequence as well, including Hibbard, Papernov & Stasevich and Pratt.

All speed tests were performed locally on a Pixelbook Go M3 using Debian.

Games

ContrivityContrivity Icon

Contrivity

Spawn into the hostile quantum laboratory and destroy oscillations.