William Stafford Parsons

WSP-Hash-OAAT: The Fastest One-at-a-Time Hashing Algorithm With Good Quality

WSP-Hash-OAAT is a tiny 32-bit OAAT hashing algorithm as a substantial improvement to 32-bit FNV-1a, Jenkin's OAAT, MicroOAAT and MurmurOAAT.

Library

Source

wsp_hash_oaat.c
wsp_hash_oaat.h

Reference

wsp_hash_oaat() is the all-at-once hashing function that accepts the 2 following arguments in left-to-right order.

1: input_count is the const unsigned long count of elements in the input array.

2: input is the const uint8_t array to hash.

The return value data type is uint32_t.

It returns the 32-bit unsigned integer hash digest result.

wsp_hash_oaat_initialize() is the initialization function that accepts the following argument.

1: s is the struct wsp_hash_oaat_s pointer.

wsp_hash_oaat_transform() is the core hashing loop function that accepts the 4 following arguments in left-to-right order.

1: i is the unsigned long starting index position of elements in the input array.

2: input_count is the const unsigned long count of elements in the input array.

3: input is the const uint8_t array to hash.

4: s is the struct wsp_hash_oaat_s pointer.

wsp_hash_oaat_finalize() is the finalization function that accepts the following argument.

1: s is the struct wsp_hash_oaat_s pointer. s.mix contains the finalized hash digest result.

The return value data type is void.

Example

#include <stdio.h> #include "wsp_hash_oaat.h" int main(void) { struct wsp_hash_oaat_s s; uint8_t input[8] = {'m', 'e', 's', 's', 'a', 'g', 'e', 0}; unsigned char i = 0; while (i != 10) { i++; printf("All-at-once result %u is 0x%08x.\n", i, wsp_hash_oaat(8, input)); input[7]++; } input[7] = 0; i = 0; while (i != 10) { i++; wsp_hash_oaat_initialize(&s); wsp_hash_oaat_transform(0, 8, input, &s); wsp_hash_oaat_finalize(&s); input[7]++; printf("Segmented result %u is 0x%08x.\n", i, s.mix); } return 0; }

Requirements

It adheres to the C99 standard draft (ISO/IEC 9899:1999), although it's convertible to other programming languages and standards.

Explanation

WSP-Hash-OAAT is designed to hash 32-bit keys in data structures in place of existing implementations whenever multi-byte memory reading is forbidden.

It's a lossless performance and statistical quality improvement for all input sizes larger than 3 bytes.

Input sizes less than or equal to 3 bytes are slower by a small margin with higher statistical quality.

It's portable for both 32-bit and 64-bit systems.

It meets strict compliance, portability and code security requirements.

It doesn't use modulus, multiplication or division arithmetic operations.

The core hashing loop structure appears similar to MicroOAAT due to the limited options when designing the fastest non-multiplicative, one-at-a-time hashing function with good collision avoidance, but the technical difference is vast.

It uses a XOR assignment operation for each input byte instead of an addition assignment, similar to FNV-1a. The remaining operations emulate multiplication before finalizing with enhanced bit distribution.

Any bitwise multiplication with a left shift operand greater than << 3 is too slow in the core hashing loop and anything less doesn't result in enough distribution to omit a second bitwise multiplication.

It left-rotates bits in mix_offset instead of right-rotating bits in mix.

mix_offset combines with mix using addition instead of subtraction to keep bits rolling in the same direction.

Furthermore, the bit rotation is non-blocking with instruction-level parallelism for each next byte iteration.

The speed gains from these design differences allow for a minimal finalization sequence with finely-tuned shift values and additive bit rotations in both directions.

The result deprecates the usage of 32-bit FNV-1a, Jenkin's OAAT, MicroOAAT and MurmurOAAT by comparison in most practical instances. There are significant improvements across all SMHasher tests with no critical collision vulnerabilities relative to the compared hashing algorithms.

Seed tests in SMHasher are omitted to both discourage using hashing algorithms as PRNGs and prevent collision vulnerabilities from 2³² different initialized states.

The strict avalanche test in SMHasher is omitted as it's only relevant to the analysis of cryptographic hash function properties where the worst instance exposes a critical cryptographic vulnerability.

Furthermore, the following code tests collision counts for truncated digests against each of the 8 bits flipped within a single input byte ranging from 1 to 255 at all positions for all input_count values ranging from 1 to 256.

#include <stdio.h> #include "wsp_hash_oaat.h" int main(void) { uint8_t input[256]; uint32_t result = 0; uint32_t result_flipped = 1; unsigned long bit_collisions_counts[32]; unsigned short sparse_byte = 0; unsigned short i = 0; unsigned short j = 0; unsigned short k = 0; unsigned short l = 0; unsigned short m = 0; while (i != 32) { bit_collisions_counts[i] = 0; i++; } i = 1; while (sparse_byte != 16) { while (i != 256) { j = 1; while (j != 256) { k = 0; while (k != j) { while (l != j) { input[l] = sparse_byte; l++; } if (sparse_byte == i) { k = j; l = 0; continue; } input[k] = i; result = wsp_hash_oaat(j, input); l = 0; while (l != 8) { input[k] = input[k] ^ (1 << l); result_flipped = wsp_hash_oaat(j, input); input[k] = input[k] ^ (1 << l); m = 1; while (m != 32) { if ( (result & ((2 << m) - 1)) == (result_flipped & ((2 << m) - 1)) ) { bit_collisions_counts[m - 1]++; } m++; } l++; m = 0; } k++; l = 0; } j++; k = 0; } i++; } sparse_byte++; i = 1; } i = 2; j = 0; while (i != 33) { printf("%2u-Bit Segmented Collisions: %9lu\n", i, bit_collisions_counts[i - 2]); i++; j++; } return 0; }

The following collision results demonstrate a sufficient collision-based avalanche effect in the worst instance with non-cryptographic, universal hashing relevance.

2-Bit Segmented Collisions: 268422364 3-Bit Segmented Collisions: 136568989 4-Bit Segmented Collisions: 69294412 5-Bit Segmented Collisions: 34875523 6-Bit Segmented Collisions: 17265466 7-Bit Segmented Collisions: 8343604 8-Bit Segmented Collisions: 4141758 9-Bit Segmented Collisions: 2065115 10-Bit Segmented Collisions: 1022555 11-Bit Segmented Collisions: 509805 12-Bit Segmented Collisions: 253307 13-Bit Segmented Collisions: 126814 14-Bit Segmented Collisions: 62568 15-Bit Segmented Collisions: 31040 16-Bit Segmented Collisions: 15333 17-Bit Segmented Collisions: 7648 18-Bit Segmented Collisions: 3715 19-Bit Segmented Collisions: 1871 20-Bit Segmented Collisions: 918 21-Bit Segmented Collisions: 472 22-Bit Segmented Collisions: 211 23-Bit Segmented Collisions: 109 24-Bit Segmented Collisions: 65 25-Bit Segmented Collisions: 27 26-Bit Segmented Collisions: 8 27-Bit Segmented Collisions: 6 28-Bit Segmented Collisions: 2 29-Bit Segmented Collisions: 0 30-Bit Segmented Collisions: 0 31-Bit Segmented Collisions: 0 32-Bit Segmented Collisions: 0

0 32-bit collisions out of 125 million groups of 8 single-bit flip tests proves the worst instances are rare enough to be considered random occurrences in practical implementations.

The semi-linear collision increments after each truncation are within an acceptable range based on the probability of collisions at each bit size. For example, 200-300 million collisions in the lower 2 bits are expected out of 1 billion total keys, but only up to a few are expected in the full 32 bits.

Compared to 32-bit FNV-1a, it's 53% faster on average for small inputs and up to 70% faster on average for large inputs.

Compared to Jenkin's OAAT, it's 110% faster on average for small inputs and up to 112% faster on average for large inputs.

Compared to MicroOAAT, it's 13% faster on average for small inputs and up to 45% faster on average for large inputs.

Compared to MurmurOAAT, it's 88% faster on average for small inputs and up to 145% faster on average for large inputs.

In the instances when multi-byte memory reading is permitted and byte order isn't expected to change during runtime, WSP-Hash-32 is a faster alternative.

All speed tests are performed locally on a Pixelbook Go M3 using Debian.

Games

Contrivity

Contrivity

Spawn into the hostile quantum laboratory and destroy oscillations.