WSP-Search-Sorted: The Fastest Search Algorithm Derived From Binary Search
WSP-Search-Sorted is an optimal search algorithm as a substantial improvement to Binary Search, Exponential Search, Fibonacci Search and Interpolation Search.
Library
Source
wsp_search_sorted.c
wsp_search_sorted.h
Reference
wsp_search_sorted() is the searching function for a list of elements already sorted in either ascending or descending order with a marginal performance tradeoff compared to the 2 following functions.
wsp_search_sorted_ascending() is the searching function for a list of elements already sorted in ascending order.
wsp_search_sorted_descending() is the searching function for a list of elements already sorted in descending order.
Each of the aforementioned functions accept the following 5 arguments in left-to-right order.
1: low is the unsigned long lowest index bound to search in haystack.
2: high is the unsigned long highest index bound to search in haystack and must be greater than or equal to low.
3: haystack is the const unsigned long array of elements to search. The data type is interchangeable with any integral data type after modifying the relevant function parameter accordingly.
4: needle is the const unsigned long element to search for in haystack. The data type is interchangeable with any integral data type after modifying the relevant function parameter to match the haystack data type.
5: position is the unsigned long pointer containing the index of the searched element.
The return value data type is unsigned char.
When the return value is 1, position contains the index of the found needle.
When the return value is 0, a new index value isn't assigned to position.
Example
#include <stdio.h>
#include "wsp_search_sorted.h"
int main(void) {
const unsigned long input[6] = {1, 11, 111, 1111, 11111, 111111};
const unsigned long input_ascending[6] = {1, 11, 111, 1111, 11111, 111111};
const unsigned long input_descending[6] = {111111, 11111, 1111, 111, 11, 1};
unsigned long position;
if (wsp_search_sorted(0, 5, input, 11111, &position) == 1) {
printf("11111 is found at position %lu in input.\n", position);
}
if (wsp_search_sorted_ascending(0, 5, input_ascending, 111, &position) == 1) {
printf("111 is found at position %lu in input_ascending.\n", position);
}
if (wsp_search_sorted_descending(0, 5, input_descending, 1, &position) == 1) {
printf("1 is found at position %lu in input_descending.\n", position);
}
return 0;
}
Requirements
It adheres to the C89 standard draft (ISO/IEC 9899:1990), although it's convertible to other programming languages and standards.
Explanation
WSP-Search-Sorted is designed as a practical optimization for all sorted-list search algorithm implementations.
It's portable for both 32-bit and 64-bit systems.
It meets strict compliance, portability and code security requirements.
It doesn't use modulus, multiplication or division arithmetic operations.
Beyond the common optimization of verifying the existence of needle within haystack before searching, it modifies the high bound instead of initializing with a median pivot value.
Subtraction is always the first operation on the high bound, so it decreases repeatedly in the same direction without either wasting CPU cycles on unnecessary conditional statements or exceeding the low bounds that are already verified to contain needle. This drastically decreases the number of gap calculations and comparisons in the average instance.
Furthermore, the positioning of conditional statements with gap calculations offset by + 1 prevents unnecessary array accesses and equality operations after each modification of the high bound. This drastically increases speed on average, especially for large arrays.
This process repeats in each iteration with alternating directions whenever the high bound almost exceeds the position of the needle element.
Compared to Binary Search, it's close to a lossless optimization with up to 890% faster speed on average in the best instances and 15% faster speed on average in most instances.
A specific best instance against Binary Search is demonstrated with the following code example.
#include <stdio.h>
#include "wsp_search_sorted.h"
int main(void) {
unsigned short input[1111111];
unsigned long position;
unsigned long verification = 0;
unsigned long i = 0;
unsigned long j = 0;
unsigned long k;
while (i != 1111111) {
input[i] = i;
i++;
}
while (j != 200) {
i = 1111111 - j;
k = 111111 - j;
while (k != 1000) {
i--;
if (wsp_search_sorted_ascending(0, k, input, input[i], &position) != 0) {
verification++;
}
k--;
}
j++;
}
printf("The verification is %lu.\n", verification);
return 0;
}
Compared to Interpolation Search, it doesn't rely on evenly-distributed haystack values for heuristic calculations and it's 1000% faster than the worst case.
Compared to other sorted-list search algorithms for sorted arrays, it's at least 10% faster on average across a range of data types and data, both randomized and uniform.
It's optimized for instances where low is greater than 0 and where needle is either found or not found in haystack.
Average speed for uniform data is tested with the following code example.
#include <stdio.h>
#include "wsp_search_sorted.h"
int main(void) {
unsigned short input[1111111];
unsigned long position;
unsigned long verification = 0;
unsigned long i = 0;
unsigned long j = 0;
unsigned long k;
while (i != 1111111) {
input[i] = i | 3;
i++;
}
while (j != 200) {
i = 111110 - j;
k = 1111101 - j;
while (i != 10000) {
i--;
if (wsp_search_sorted_ascending(0, k, input, input[i], &position) != 0) {
verification++;
}
k--;
}
j++;
}
printf("The verification is %lu.\n", verification);
return 0;
}
Furthermore, the average speed is faster with randomized distribution using the aforementioned code, but with sorted numbers generated from various seeded states in WSP-PRNG-32.
Different degrees of distribution are tested with bitwise OR operands 3, 7, 15 and so on.
The following code example demonstrates the aforementioned worst instance with Interpolation Search. It searches for 1 in an array with all values defined as 2 except for the first value defined as 0 and the second value defined as 1.
#include <stdio.h>
#include "interpolation_search.h"
int main(void) {
unsigned short input[1111111];
unsigned long position;
unsigned long verification = 0;
unsigned long i = 0;
unsigned long j = 0;
unsigned long k;
while (i != 1111111) {
input[i] = 2;
i++;
}
input[0] = 0;
input[1] = 1;
input[2] = 2;
while (j != 200) {
i = 1111100 - j;
k = 1111101 - j;
while (i != 10000) {
i--;
if (interpolation_search_ascending(0, k, input, 1, &position) != 0) {
verification++;
}
k--;
}
j++;
}
printf("The verification is %lu.\n", verification);
return 0;
}
The aforementioned instance with WSP-Search-Sorted is faster than the same instance with Binary Search.
All speed tests are performed locally on a Pixelbook Go M3 using Debian.