WSP-Sort-Unstable: The Fastest Shell Sort Gap Sequence Calculation and Sorting Algorithm Without Auxiliary Partition Space
WSP-Sort-Unstable is a bitwise calculation for an optimal sequence of unstable Shell Sort gaps as a substantial improvement to Ciura, Knuth, Sedgwick and Tokuda sequences.
Library
Source
wsp_sort_unstable.c
wsp_sort_unstable.h
Reference
wsp_sort_unstable_ascending() is the sorting function in ascending order that accepts the following 2 arguments in left-to-right order.
1: input_count is the const unsigned long count of elements in the input array.
2: input is the unsigned long array to sort. The data type is interchangeable with any integral data type and must match the integral data type of input_capture.
The return value data type is void.
wsp_sort_unstable_descending() is the sorting function in descending order that accepts the following 2 arguments in left-to-right order.
1: input_count is the const unsigned long count of elements in the input array.
2: input is the unsigned long array to sort. The data type is interchangeable with any integral data type and must match the integral data type of input_capture.
The return value data type is void.
Example
#include <stdio.h>
#include "wsp_sort_unstable.h"
int main(void) {
unsigned long input[10] = {1, 111, 111, 11, 111111, 111, 1, 11, 111111111, 1};
unsigned long i = 0;
wsp_sort_unstable_ascending(10, input);
while (i != 10) {
printf("%u\n", input[i]);
i++;
}
wsp_sort_unstable_descending(10, input);
i = 0;
while (i != 10) {
printf("%u\n", input[i]);
i++;
}
return 0;
}
Requirements
It adheres to the C89 standard draft (ISO/IEC 9899:1990), although it's convertible to other programming languages and standards.
Explanation
WSP-Sort-Unstable is designed to both increase the speed and decrease resource usage for all Shell Sort implementations.
It's the fastest in-place sorting algorithm on average that doesn't allocate additional memory for input partitions. This applies across all input data types, distributions and sizes.
It's portable for both 32-bit and 64-bit systems.
It meets strict compliance, portability and code security requirements.
It doesn't use modulus, multiplication or division arithmetic operations.
Before sorting, it doesn't pre-calculate the upper limit with a loop starting from 0. Instead, the gap numbers in each sorting instance are dynamically-calculated with inconsistencies among different input_count values.
Elements are always guaranteed to be sorted within bounds using conditional statements that guarantee the final pass is always a regular insertion sort with a gap value of 1.
After each pass, gap > 3 prevents any result of gap calculation from jumping from either 3 or 2 to 0 instead of 1 based on the following calculation output table.
Gap Calculation
Result
0 0
1 0
2 0
3 0
4 1
5 1
6 1
7 1
8 2
9 2
10 2
Compared to optimized Ciura, Knuth, Sedgewick and Tokuda gap sequences, it's faster with varying input_count values as shown in the following benchmarking table.
100k 2-Byte Inputs
Ciura 0.040s ---
Knuth 0.040s ---
Sedgewick 0.039s ---
Tokuda 0.040s ---
Shell 0.047s ---
WSP-Sort-Unstable 0.037s Won
250k 2-Byte Inputs
Ciura 0.103s ---
Knuth 0.110s ---
Sedgewick 0.101s ---
Tokuda 0.102s ---
Shell 0.126s ---
WSP-Sort-Unstable 0.091s Won
500k 2-Byte Inputs
Ciura 0.211s ---
Knuth 0.243s ---
Sedgewick 0.212s ---
Tokuda 0.209s ---
Shell 0.291s ---
WSP-Sort-Unstable 0.191s Won
1m 1-Byte Inputs
Ciura 0.302s ---
Knuth 0.349s ---
Sedgewick 0.291s ---
Tokuda 0.304s ---
Shell 0.380s ---
WSP-Sort-Unstable 0.266s Won
1m 2-Byte Inputs
Ciura 0.461s ---
Knuth 0.570s ---
Sedgewick 0.453s ---
Tokuda 0.451s ---
Shell 0.634s ---
WSP-Sort-Unstable 0.423s Won
It's faster than every other sequence as well, including Hibbard, Papernov & Stasevich and Pratt.
All speed tests are performed locally on a Pixelbook Go M3 using Debian.