Giter VIP home page Giter VIP logo

avaneev / prvhash Goto Github PK

View Code? Open in Web Editor NEW
297.0 22.0 22.0 10.06 MB

PRVHASH - Pseudo-Random-Value Hash. Hash functions, PRNG with unlimited period, randomness extractor, and a glimpse into abyss. (inline C/C++) (Codename Gradilac/Градилак)

License: MIT License

C 73.30% C++ 13.08% Python 13.62%
hash-functions hashing-algorithm hash pseudo-random prng prng-methods random hash-function hashing-algorithms hashing

prvhash's Introduction

PRVHASH - Pseudo-Random-Value Hash (in C/C++)

Introduction

PRVHASH is a hash function that generates a uniform pseudo-random number sequence derived from the message. PRVHASH is conceptually similar (in the sense of using a pseudo-random number sequence as a hash) to keccak and RadioGatun schemes, but is a completely different implementation of such concept. PRVHASH is both a "randomness extractor" and an "extendable-output function" (XOF).

PRVHASH can generate 64- to unlimited-bit hashes, yielding hashes of approximately equal quality independent of the chosen hash length. PRVHASH is based on 64-bit math, which is scalar, portable, cross-platform, inlineable, C++ compatible (fairly efficient on 32-bit systems as well). The use of the function beyond 1024-bit hashes is easily possible, but has to be statistically tested. For example, any 32-bit element extracted from 2048-, or 4096-bit resulting hash is as collision resistant as just a 32-bit hash. It is a fixed execution time hash function that depends only on message's length. A streamed, higher-security, hashing implementation is available.

PRVHASH is solely based on the butterfly effect, inspired by LCG pseudo-random number generators. The generated hashes have good avalanche properties. For best security, a random seed should be supplied to the hash function, but this is not a requirement.

64-, 128-, 192-, 256-, 512-, and 1024-bit PRVHASH hashes pass all SMHasher tests. Other hash lengths were not thoroughly tested, but extrapolations can be made. The author makes no cryptographic claims (neither positive nor negative) about PRVHASH-based constructs.

PRVHASH core function can be used as a PRNG with an arbitrarily-chosen (practically unlimited) period, depending on the number of hashwords in the system.

PRVHASH64

Please see the prvhash64.h file for the details of the basic hash function implementation (the prvhash.h, prvhash4.h, prvhash42.h, and versions below 4.3 are outdated versions). While this hash function is most likely irreversible, according to SAT solver-based testing, it does not feature a preimage resistance. This function should not be used in open systems, without a secret seed. Note that 64 refers to core function's variable size.

The default prvhash64.h-based 64-bit hash of the string The cat is out of the bag is eb405f05cfc4ae1c.

A proposed short name for hashes created with prvhash64.h is PRH64-N, where N is the hash length in bits (e.g., PRH64-256).

PRVHASH64_64M

This is a minimized implementation of the prvhash64 hash function. Arguably, it is the smallest hash function in the world, that produces 64-bit hashes of this quality level. While this function does not provide a throughput that can be considered "fast", due to its statistical properties it is practically fast for hash-maps and hash-tables.

Streamed Hashing

The file prvhash64s.h includes a relatively fast streamed hashing function which utilizes a "fused" PRVHASH arrangement. Please take a look at the prvhash64s_oneshot() function for usage example. The prvhash64s offers both an increased security and hashing speed.

This function has an increased preimage resistance compared to the basic hash function implementation. Preimage resistance cannot be currently estimated exactly, but the hash length affects it exponentially. Also, preimage attack usually boils down to exchange of forged symbols to "trash" symbols (at any place of the data stream); substitutions usually end up as being quite random, possibly damaging to any compressed or otherwise structured file. Which means that data compression software and libraries should always check any left-over, "unused", data beyond the valid compressed stream, for security reasons.

Time complexity for preimage attack fluctuates greatly as preimage resistance likely has a random-logarithmic PDF of timing.

Even though a formal proof is not yet available, the author assumes this hash function can compete with widely-used SHA2 and SHA3 families of hash functions while at the same time offering a considerably higher performance and scalability. When working in open systems, supplying a secret seed is not a requirement for this hash function.

The performance (expressed in cycles/byte) of this hash function on various platforms can be evaluated at the ECRYPT/eBASH project.

The default prvhash64s.h-based 64-bit hash of the string The cat is out of the bag is 2043ccf52ae2ca6f.

The default prvhash64s.h-based 256-bit hash of the string Only a toilet bowl does not leak is b13683799b840002689a1a42d93c826c25cc2d1f1bc1e48dcd005aa566a47ad8.

The default prvhash64s.h-based 256-bit hash of the string Only a toilet bowl does not leaj is d4534a922fd4f15ae8c6cc637006d1f33f655b06d60007a226d350e87e866250.

This demonstrates the Avalanche effect. On a set of 216553 English words, pair-wise hash comparisons give average 50.0% difference in resulting hash bits, which fully satisfies the strict avalanche criterion.

This streamed hash function produces hash values that are different to the prvhash64 hash function. It is incorrect to use both of these hash function implementations on the same data set. While prvhash64 can be used as a hash for hash-tables and in-memory data blocks, prvhash64s can be used to create hashes of large data blocks like files, in streamed mode.

A proposed short name for hashes created with prvhash64s.h is PRH64S-N, where N is the hash length in bits (e.g., PRH64S-256). Or simply, SH4-N, Secure Hash 4.

Minimal PRNG for Everyday Use

The core function can be easily integrated into your applications, to be used as an effective PRNG. The period of this minimal PRNG is at least 2159. The initial parameters can be varied at will, and won't "break" the PRNG. Setting only the Seed value guarantees a random start point within the whole PRNG period, with at least 264 spacing. The code follows.

#include "prvhash_core.h"
#include <stdio.h>

int main()
{
    uint64_t Seed = 0;
    uint64_t lcg = 0;
    uint64_t Hash = 0;

    uint64_t v = 0;
    uint64_t i;

    for( i = 0; i < ( (uint64_t) 1 << 28 ); i++ )
    {
        v = prvhash_core64( &Seed, &lcg, &Hash );
    }

    printf( "%llu\n", v );
}

For implementation assurance, here are the first 16 output values in hex (starting from the all-zeroes state):

0x5555555555555555 0x00000000DB6DB6DB 0x2492492192492492 0x75D75DA0AAAAAA79
0x93064E905C127FE5 0xE2585C9CA95671A3 0x28A44B31D428179E 0x11B0B6A8D4BA3A73
0x195C6A4C23EE71AD 0x5AA47859226BA23E 0xA7D42121695056D4 0x142D7CD5D83342F2
0x3D42E83328C09C8F 0x7E691C66BAC23222 0x82E1032F441F23A5 0xA4BDE5C4A05E6256

Note that such minimal 1-hashword PRNG is most definitely not cryptographically-secure: its state can be solved by a SAT solver pretty fast; this applies to other arrangements ("fused", "parallel", multiple hashwords; with daisy-chaining being harder to solve). The known way to make PRNG considerably harder to solve for a SAT solver, with complexity corresponding to system's size, is to combine two adjacent PRNG outputs via XOR operation; this obviously has a speed impact and produces output with more than 1 solution (most probably, 2). This, however, does not measurably increase the probability of PRNG output overlap, which stays below 1/2sys_size_bits; in tests, practically undetectable.

So, the basic PRNG with some, currently not formally-proven, security is as follows (XOR two adjacent outputs to produce a single "compressed" PRNG output):

v = prvhash_core64( &Seed, &lcg, &Hash );
v ^= prvhash_core64( &Seed, &lcg, &Hash );

A similar approach is to simply skip the next generated random number, but it is slightly less secure. It is likely that PRVHASH's k-equidistribution of separate outputs is implicitly secure. The reason is that skipping or XORing creates uncertainty or entanglement of current output with system's state "hashword array length" of outputs back. 3 XORs are needed to provide preimage resistance, or resistance against selection of entropy input that leads to a desired output. The security becomes effective only after system's initialization: initial "conditioning" rounds and a full hashword array pass.

TPDF Dithering

The core function can be used to implement a "statistically-good" and "neutrally-sounding" dithering noise for audio signals; for both floating-point to fixed-point, and bit-depth conversions.

uint64_t rv = prvhash_core64( &Seed, &lcg, &Hash );
double tpdf = ( (int64_t) (uint32_t) rv - (int64_t) ( rv >> 32 )) * 0x1p-32;

Floating-Point PRNG

The following expression can be used to convert 64-bit unsigned value to full-mantissa floating-point value, without a truncation bias:

uint64_t rv = prvhash_core64( &Seed, &lcg, &Hash );
double v = ( rv >> ( 64 - 53 )) * 0x1p-53;

Gradilac PRNG (C++)

The gradilac.h file includes the Gradilac C++ class which is a generalized templated implementation of PRVHASH PRNG that provides integer, single bit, floating-point, TPDF, Normal random number generation with a straight-forward front-end to specify PRVHASH system's properties. Supports on-the-go re-seeding, including re-seeding using sparse entropy (for CSPRNG uses). Does not require other PRVHASH header files.

Use Gradilac< 316 > to match Mersenne Twister's PRNG period.

Note that this class may not be as efficient for "bulk" random number generation as a custom-written code. Nevertheless, Gradilac PRNG class, with its 1.0 cycles/byte floating-point performance (at default template settings), is competitive among other C++ PRNGs.

Entropy PRNG

PRVHASH can be also used as an efficient general-purpose PRNG with an external entropy source injections (like how the /dev/urandom works on Unix): this was tested, and works well when 8-bit true entropy injections are done inbetween 8 to 2048 generated random bytes (delay is also obtained via the entropy source). An example generator is implemented in the prvrng.h file: simply call the prvrng_test64p2() function.

prvrng_gen64p2()-based generator passes PractRand 32 TB threshold with rare non-systematic "unusual" evaluations. Which suggests it is the working randomness extractor that can "recycle" entropy of any statistical quality, probably the first in the world.

Note that due to the structure of the core function the probability of PRNG completely "stopping", or "stalling", or losing internal entropy, is absent.

The core function, without external entropy injections, with any initial combination of lcg, Seed, and Hash eventually converges into one of random number sub-sequences. These are mostly time-delayed versions of only a smaller set of unique sequences. There are structural limits in this PRNG system which can be reached if there is only a small number of hashwords in the system. PRNG will continously produce non-repeating random sequences given external entropy input, but their statistical quality on a larger frames will be limited by the size of lcg and Seed variables, and the number of hashwords in the system, and the combinatorial capacity of the external entropy. A way to increase the structural limit is to use the "fused" PRNG arrangement demonstrated in the prvhash64s.h file, which additionally increases the security exponentially. Also any non-constant entropy input usually increases the period of randomness, which, when extrapolated to hashing, means that the period increases by message's combinatorial capacity (or the number of various combinations of its bits). The maximal PRNG period's 2N exponent is hard to approximate exactly, but in most tests it was equal to at least system's size in bits, minus the number of hashwords in the system, minus 1/4 of lcg and Seed variables' size (e.g., 159 for a minimal PRNG).

Moreover, the PRVHASH systems can be freely daisy-chained by feeding their outputs to Seed/lcg inputs, adding some security firewalls, and increasing the PRNG period of the final output accordingly. Note that any external PRNG output can be inputted via either Seed, lcg, or both, yielding PRNG period exponent summation. For hashing and external unstructured entropy, only simultaneous input via Seed and lcg works in practice (period's exponent increase occurs as well).

While lcg, Seed, and Hash variables are best initialized with good entropy source (however, structurally, they can accept just about any entropy quality while only requiring an initial "conditioning"), the message can be sparsely-random: even an increasing counter can be considered as having a suitable sparse entropy.

Two-Bit PRNG

This is a "just for fun" example, but it passes 256 MB PractRand threshold. You CAN generate pseudo-random numbers by using 2-bit shuffles; moreover, you can input external entropy into the system.

#include <stdio.h>
#include "prvhash_core.h"
#define PH_HASH_COUNT 42

int main()
{
    uint8_t Seed = 0;
    uint8_t lcg = 0;
    uint8_t Hash[ PH_HASH_COUNT ] = { 0 };
    int HashPos = 0;
    int l;

    for( l = 0; l < 256; l++ )
    {
        uint8_t r = 0;
        int k;

        for( k = 0; k < 4; k++ )
        {
            r <<= 2;
            r |= prvhash_core2( &Seed, &lcg, Hash + HashPos );

            HashPos++;

            if( HashPos == PH_HASH_COUNT )
            {
                HashPos = 0;
            }
        }

        if( l > PH_HASH_COUNT / 3 ) // Skip PRNG initialization.
        {
            printf( "%4i ", (int) r );
        }
    }
}

Description

Here is the author's vision on how the core function works. In actuality, coming up with this solution was accompanied by a lot of trial and error. It was especially hard to find a better "hashing finalization" solution.

Seed ^= msgw; lcg ^= msgw; // Mix in external entropy (or daisy-chain).

Seed *= lcg * 2 + 1; // Multiply random by random, without multiply by zero.
const uint64_t rs = Seed >> 32 | Seed << 32; // Produce halves-swapped copy.
Hash += rs + 0xAAAAAAAAAAAAAAAA; // Accumulate to hash, add raw entropy (self-start).
lcg += Seed + 0x5555555555555555; // Output-bound entropy accumulation, add raw entropy.
Seed ^= Hash; // Mix new seed value with hash. Entropy feedback.
const uint64_t out = lcg ^ rs; // Produce "compressed" output.

This function can be arbitrarily scaled to any even-sized variables: 2-, 4-, 8-, 16-, 32-, 64-bit variable sizes were tested, with similar statistical results. Since mathematical structure of the function does not depend on the variables' size, statistical analysis can be performed using smaller variable sizes, with the results being extrapolatable to larger variable sizes, with a high probability (the function is invariant to the variable size). Also note that the 0xAAAA... constant is not an arbitrary constant since it should be produced algorithmically by replicating the 10 bit-pairs, to match the variable size; it represents the "raw entropy bit-train". The same applies to the 0x5555... constant. An essential property of these bit-trains is that they are uncorrelated to any uniformly-random bit-sequences, at all times. Practically, 10 and 01 bit-pairs can be also used as constants, without replication, but this does not provide conclusively better results for PRNG, and does not work well for hashing; also, self-starting period becomes longer. A conceptual aspect of replicated bit-pairs is that they represent the simplest maximum-entropy number that lacks information (bit-pair is a minimal sequence that can exhibit entropy, with replication count bound to state variable size). While "magic numbers" can be used instead of these bit-trains (at least for PRNG), they do not posses the property of not having an information (zero spectrum beside DC and Nyquist components).

It is important to point out that the presence of the 0xAAAA... and 0x5555... constants logically assure that the Seed and lcg variables quickly recover from the "zero-state". Beside that, these constants logically prohibit synchronous control over Seed and lcg variables: different bits of the input entropy will reach these variables. When the system starts from the "zero-state", with many hashwords in the system, it is practically impossible to find a preimage (including a repetitious one) that stalls the system, and thus it is impossible to perform a multi-collision attack. However, since this risk cannot be estimated exactly, the prvhash64s hash function adds a message length value to the end of the data stream.

How does it work? First of all, this PRNG system, represented by the core function, does not work with numbers in a common sense: it works with entropy, or random sequences of bits. The current "expression" of system's overall internal entropy - the Seed - gets multiplied ("smeared") by a supporting, output-bound variable - lcg, - which is also a random value, transformed in an LCG-alike manner. As a result, a new random value is produced which represents two independent random variables (in lower and higher parts of the register), a sort of "entropy stream sub-division" happens. This result is then halves-swapped, and is accumulated in the Hash together with the 10 bit-train which adds the "raw entropy", allowing the system to be self-starting. The original multiplication result is accumulated in the lcg variable. The Seed is then updated with the hashword produced on previous rounds. The reason the message's entropy (which may be sparse or non-random) does not destabilize the system is because the message becomes hidden in the internal entropy (alike to a cryptographic one-time-pad); message's distribution becomes unimportant, and system's state remains statistically continuous. Both accumulations - of the halves-swapped and the original result of multiplication - produce a uniformly-distributed value in the corresponding variables; a sort of "de-sub-division" happens in these.

The two instructions - Seed *= lcg * 2 + 1, lcg += Seed - represent an "ideal" bit-shuffler: this construct represents a "bivariable shuffler" which transforms the input lcg and Seed variables into another pair of variables with 50% bit difference relative to input, and without collisions. The whole core function, however, uses a more complex mixing which produces a hash value: the pair composed of the hash value and either a new lcg or a new Seed value also produces no input-to-output collisions. Thus it can be said that the system does not lose any input entropy. In 4-dimensional analysis, when Seed, lcg, Hash, and msgw values are scanned and transformed into subsequent Seed, lcg, and Hash triplets, this system does not exhibit local state change-related collisions due to external entropy input (all possible input msgw values map to subsequent triplets uniquely). However, with a small variable size (8-bit) and a large output hash size, a sparse entropy input has some probability of "re-sychronization" event happening, leading to local collisions. With 16-bit variables, or even 8-bit fused-2 arrangement (with the local state having 40-bit size instead of 24-bit), probability of such event is negligible. While non-fused hashing may even start from the "zero-state", for reliable hashing the state after 5 "conditioning" rounds should be used.

Another important aspect of this system, especially from the cryptography standpoint, is the entropy input to output latency. The base latency for state-to-state transition is equal to 1 (2 for "fused" arrangements); and at the same time, 1 in hash-to-hash direction: this means that PRVHASH additionally requires a full pass through the hashword array, for the entropy to propagate, before using its output. However, hashing also requires a pass to the end of the hashword array if message's length is shorter than the output hash, to "mix in" the initial hash value. When there is only 1 hashword in use, there is no hashword array-related delay, and thus the entropy propagation is only subject to the base latency. The essence of these "latencies" is that additional rounds are needed for the system to get rid of a statistical traces of the input entropy. Note that the "fused" arrangement increases shuffling quality. However, this increase is relative to state variable size: for example, 8-bit fused-2 arrangement with 8-bit input is equivalent to 16-bit non-fused arrangement with 16-bit input. So, it is possible to perform hashing with 8-bit state variables if fused-2 round is done per 1 input byte. The way "fused" structure works is equivalent to shuffling all entropy inputs in a round together (input 1 is shuffled into a hash value which is then shuffled with input 2 into a hash value, etc). The "fused" arrangement may raise a question whether or not it provides a target collision resistance as it seemingly "compresses" several inputs into a single local hashword: without doubt it does provide target collision resistance since Seed and lcg variables are a part of the system, and their presence in the "fused" arrangement increases the overall PRNG period of the system and thus its combinatorial capacity.

Without external entropy (message) injections, the function can run for a prolonged time, generating pseudo-entropy, in extendable-output PRNG mode. When the external entropy (message) is introduced, the function "shifts" into an unrelated state unpredictably. So, it can be said that the function "jumps" within a space of a huge number of pseudo-random sub-sequences. Hash length affects the size of this "space of sub-sequences", permitting the function to produce quality hashes for any required hash length. Statistically, these "jumps" are close to uniformly-random repositioning: each simultaneous augmentation of Seed and lcg corresponds to a new random position, with a spread over the whole PRNG period. The actual performace is more complicated as this PRNG system is able to converge into unrelated random number sequences of varying lengths, so the "jump" changes both the position and the "index" of sub-sequence. This property of PRVHASH assures that different initial states of its Seed state variable (or lcg, which is mostly equivalent at initialization stage) produce practically unrelated random number sequences, permitting to use PRVHASH for PRNG-based simulations.

In essence, the hash function generates a continuous pseudo-random number sequence, and returns the final part of the sequence as a result. The message acts as a "pathway" to this final part. So, the random sequence of numbers can be "programmed" to produce a necessary outcome. However, as this PRNG does not expose its momentary internal state, such "programming" is hardly possible to perform for an attacker, even if the entropy input channel is exposed: consider the (A^C)*(B^C) equation; an adversary can control C, but does not know the values of A and B; thus this adversary cannot predict the outcome. Beside that, as the core function naturally eliminates the bias from the external entropy of any statistical quality and frequency, its control may be fruitless. Note that to reduce such "control risks", the entropy input should use as fewer bits as possible, like demonstrated in the prvrng.h file.

P.S. The reason the prvhash64 hash function has an initial non-zero state, is that otherwise the function would require 5 preliminary "conditioning" rounds (core function calls); that would reduce the performance of the hash function dramatically, for hash-table uses. Note that the prvhash64s function starts from the "full zero" state and then performs acceptably.

Hashing Method's Philosophy

Any external entropy (message) that enters this PRNG system acts as a high-frequency and high-quality re-seeding which changes the random number generator's "position" within the PRNG period, randomly. In practice, this means that two messages that are different in even 1 bit, at any place, produce "final" random number sequences, and thus hashes, which are completely unrelated to each other. This also means that any smaller part of the resulting hash can be used as a complete hash. Since the hash length affects the PRNG period (and thus the combinatorial capacity) of the system, the same logic applies to hashes of any length while meeting collision resistance specifications for all lengths.

Alternatively, the hashing method can be viewed from the standpoint of classic bit-mixers/shufflers: the hashword array can be seen as a "working buffer" whose state is passed back into the "bivariable shuffler" continuously, and the new shuffled values stored in this working buffer for the next pass.

In general, PRVHASH core function represents a "building block" that permits design of practically any entropy-generating constructs. It has an important advantage in that the state space of these constructs can be completely analyzed using small state variables, with the obtained statistics being extrapolatable to larger state variables.

PRNG Period Assessment

The following "minimal" implementation of PractRand class can be used to independently assess randomness period properties of PRVHASH. By varying the PH_HASH_COUNT and PH_PAR_COUNT values it is possible to test various PRNG system sizes. By adjusting other values it is possible to test PRVHASH scalability across different state variable sizes (PractRand class and PRNG output size should be matched, as PractRand test results depend on PRNG output size). PractRand should be run with the -tlmin 64KB parameter, to evaluate changes to the constants quicker. Note that both PH_HASH_COUNT and PH_PAR_COUNT affect the PRNG period exponent not exactly linearly for small variable sizes: there is a saturation factor present for small variable sizes; after some point the period increase is non-linear due to small shuffling space. Shuffling space can be increased considerably with a "fused" arrangement. Depending on the initial seed value, the period may fluctuate. The commented out Ctr++... instructions can be uncommented to check the period increase due to sparse entropy input. You may also notice the ^=h instructions: PRVHASH supports feedback onto itself (it is like hashing its own output). This operation, which can be applied to any fused element, maximizes the achieved PRNG period.

#include "prvhash_core.h"
#include <string.h>

#define PH_FUSE_COUNT 1 // PRVHASH fusing.
#define PH_HASH_COUNT 4 // Hashword count (any positive number).
#define PH_STATE_TYPE uint8_t // State variable's physical type.
#define PH_FN prvhash_core4 // Core function name.
#define PH_BITS 4 // State variable's size in bits.
#define PH_RAW_BITS 8 // Raw output bits.
#define PH_RAW_ROUNDS ( PH_RAW_BITS / PH_BITS ) // Rounds per raw output.

class DummyRNG : public PractRand::RNGs::vRNG8
{
public:
    PH_STATE_TYPE Seed[ PH_FUSE_COUNT ];
    PH_STATE_TYPE lcg[ PH_FUSE_COUNT ];
    PH_STATE_TYPE Hash[ PH_HASH_COUNT ];
    int HashPos;

    DummyRNG() {
        memset( Seed, 0, sizeof( Seed ));
        memset( lcg, 0, sizeof( lcg ));
        memset( Hash, 0, sizeof( Hash ));
        HashPos = 0;

        // Initialize.

        int k, j;

        for( k = 0; k < PRVHASH_INIT_COUNT; k++ )
        {
            for( j = 0; j < PH_FUSE_COUNT; j++ )
            {
                PH_FN( Seed + j, lcg + j, Hash + HashPos );
            }
        }
    }

    Uint8 raw8() {
        uint64_t OutValue = 0;
        int k, j;

        for( k = 0; k < PH_RAW_ROUNDS; k++ )
        {
//            Ctr++;
//            Seed[ 0 ] ^= ( Ctr ^ ( Ctr >> 4 )) & 15;
//            lcg[ 0 ] ^= ( Ctr ^ ( Ctr >> 4 )) & 15;

            uint64_t h = 0;

            for( j = 0; j < PH_FUSE_COUNT; j++ )
            {
                h = PH_FN( Seed + j, lcg + j, Hash + HashPos );
            }

//            Seed[ 0 ] ^= h;
//            lcg[ 0 ] ^= h;

            if( PH_BITS < sizeof( uint64_t )) OutValue <<= PH_BITS;
            OutValue |= h;

            if( ++HashPos == PH_HASH_COUNT )
            {
                HashPos = 0;
            }
        }

        return( OutValue );
    }

    void walk_state(PractRand::StateWalkingObject *walker) {}
    void seed(Uint64 sv) { Seed[ 0 ] ^= sv; }
    std::string get_name() const { return "PRVHASH"; }
};

PRVHASH Cryptanalysis Basics

When the system state is not known, when PRVHASH acts as a black-box, one has to consider core function's statistical properties. All internal variables - Seed, lcg, and Hash - are random: they are uncorrelated to each other at all times, and are also wholly-unequal during the PRNG period (they are not just time-delayed versions of each other). Moreover, as can be assured with PractRand, all of these variables can be used as random number generators (with a lower period, though); they can even be interleaved after each core function call.

When the message enters the system via Seed ^= msgw and lcg ^= msgw instructions, this works like mixing a message with an one-time-pad used in cryptography. This operation completely hides the message in system's entropy, while both Seed and lcg act as "carriers" that "smear" the input message via subsequent multiplication. Beside that, the output of PRVHASH uses the mix of two variables: statistically, this means mixing of two unrelated random variables, with such summary output never appearing in system's state. It is worth noting the lcg ^ rs expression: the rs variable is composed of two halves, both of them practically being independent PRNG outputs, with smaller periods. This additionally complicates system's reversal.

Parallel PRNG

While this "parallel-3" arrangement is currently not used in the hash function implementations, it is also working fine with the core function. For example, while the "minimal PRNG" described earlier has 0.90 cycles/byte performance, the "parallel" arrangement has a PRNG performance of 0.35 cycles/byte, with a possibility of further scaling using SIMD instructions. Note that the number of "parallel" elements should not be a multiple of hashword array length, otherwise PRNG stalls.

#include "prvhash_core.h"
#include <stdio.h>

int main()
{
    uint64_t Seed = 0;
    uint64_t lcg = 0;
    uint64_t Hash = 0;
    uint64_t Seed2 = 0;
    uint64_t lcg2 = 0;
    uint64_t Hash2 = 0;
    uint64_t Seed3 = 0;
    uint64_t lcg3 = 0;
    uint64_t Hash3 = 0;
    uint64_t Hash4 = 0;

    uint64_t v = 0;
    uint64_t v2 = 0;
    uint64_t v3 = 0;

    uint64_t i;

    for( i = 0; i < ( (uint64_t) 1 << 27 ); i++ )
    {
        v = prvhash_core64( &Seed, &lcg, &Hash );
        v2 = prvhash_core64( &Seed2, &lcg2, &Hash2 );
        v3 = prvhash_core64( &Seed3, &lcg3, &Hash3 );

        uint64_t t = Hash;
        Hash = Hash2;
        Hash2 = Hash3;
        Hash3 = Hash4;
        Hash4 = t;
    }

    printf( "%llu %llu %llu\n", v, v2, v3 );
}

PRVHASH16

prvhash16 demonstrates the quality of the core function. While the state variables are 16-bit, they are enough to perform hashing: this hash function passes all SMHasher tests, like the prvhash64 function does, for any hash length. This function is very slow, and is provided for demonstration purposes, to assure that the core function works in principle, independent of state variable size. This hash function variant demonstrates that PRVHASH's method does not rely on bit-shuffling alone (shuffles are purely local), but is genuinely based on PRNG position "jumps".

TANGO642 (tango-six-forty-two)

This is an efficient implementation of a PRVHASH PRNG-based streamed XOR function. Since no cryptanalysis nor certification of this function were performed yet, it cannot be called a "cipher", but rather a cipher-alike random number generator. It is based on a conjunction of two PRNGs: a keyed PRNG which provides "secure" output via XOR of its adjacent outputs, and a firewalling PRNG which is constantly re-seeded (via daisy-chaining) by the output of keyed PRNG. A performance benefit is obtained due to efficient parallel arrangement of firewalling PRNG while security is provided by the keyed PRNG.

This construction, which is resistant to SAT solving, may become a viable and more efficient alternative to AES and ChaCha/Salsa ciphers, if the research community decides to invest the time and resources into producing a formalized, or at least a widely-acceptable, proof of its security.

The performance (expressed in cycles/byte) of this function on various platforms can be evaluated at the ECRYPT/eBASC project.

Other Thoughts

PRVHASH, being scalable, potentially allows one to apply "infinite" state variable size in its system, at least in theoretical mathematical analysis. This reasoning makes PRVHASH comparable to PI in its reach of "infinite" bit-sequence length. This also opens up a notion of "infinitesmal spacing" between isolated frequencies (arising from Fourier analysis of "infinite" bit-sequence). Note that PRVHASH does not require any "magic numbers" to function, it is completely algorithmic. An alternative explanation: In the discrete Fourier transform (DFT) domain, such understanding is possible: although usually the length of the transformation window is limited to small values (e.g., 2048 samples), theoretically this length can be directed to infinity thus producing a spectrum of an infinite number of individual frequency bins. Moreover, individual components of such an "infinite" transformation also affect the resulting spectrum, but on an infinitely-precise frequency scale. Mathematics forbids manipulating infinities, but as outlined with the DFT, in the field of discrete series of numbers, infinities can be manipulated. This echoes PRVHASH - although now it is implemented in a maximum of 128-bit numbers, theoretically nothing forbids state variable size to go to infinity, and PRVHASH should still work (practically tested with up to 524288-bit state variables). Thus, PRVHASH recreates an analog of the number PI, and it should be possible to prove that existence of an infinite sequence of bits like PI is completely realistic; a person can create such sequence, too (in theory). Additionally, if such infinitely-precise spectrum is mapped to space- and time-bound physical processes, where Planck length and time are considerably greater than zero - one can't sample these processes more densely or more frequently than Planck constants permit, - it brings a notion of existence of infinite Nyquist frequency and thus infinite energy. While we as observers are limited by Planck time, the physical processes like movement due to inertia are continuous.

The mathematics offers an interesting understanding. Take in your mind a moment before the "Big Bang". Did mathematical rules exist at that moment? Of course, they did, otherwise there would be no equation-definable "Big Bang". The span of existence of mathematical rules cannot be estimated, so it is safe to assume they existed for an eternity. On top of that, PRVHASH practically proves that entropy can self-start from zero-, or "raw" state, or "nothing", if mathematical rules exist prior to that.

I, as the author of PRVHASH, would like to point out at some long-standing misconception in relating "combinatorics" to "random numbers". Historically, cryptography was based on a concept of permutations, mixed with some sort of mathematical operations: most hashes and ciphers use such "constructs". However, when viewing a system as having some "combinatorial capacity" or the number of bit combinations a given system may have, and combining this understanding with "random permutations", it may give a false understanding that "uniform randomness" may generate any combination within the limits of "combinatorial capacity", with some probability. In fact, "uniform randomness" auto-limits the "sparseness" of random bit-sequences it generates since a suitably long, but "too sparse" bit-sequence cannot be statistically called uniformly-random. Thus, "combinatorial capacity" of a system, when applied to random number generation, transforms into a notion of ability of a system to generate independent uniformly-random number sequences. Which means that two different initial states of a PRNG system may refer to different "isolated" PRNG sequences. This is what happens in PRVHASH: on entropy input the system may "jump" or "converge" into an unrelated random sub-sequence. Moreover, with small variable sizes, PRVHASH can produce a train of 0s longer than the bit-size of the system.

On the Birthday Paradox vs hash collision estimates: while the Birthday Paradox is a good "down-to-earth" model for collision estimation, it may be an "approach from a wrong side". When hash values are calculated systemically, it is expected that each new hash value does not break "uniform distribution" of the set of previously produced hash values. This makes the problem of hash collision estimation closer to value collision estimation of PRNG output.

An open question remains: whether one should talk about "uniform distribution of values" or a "time- and rhythm- dependent collision minimization problem" when analyzing PRNG's uniformness. Incidentally, a set of rhythmic (repeating) processes whose timings are co-primes, spectrally produce the least number of modes thus producing a flatter, more uniform, spectrum. Rhythm-dependent collision minimization also touches ability of a single random number generator to create random sequences in many dimensions (known as k-equidistribution) just by selecting any sequence of its outputs.

(...10 in binary is 2 in decimal, 1010 is 10, 101010 is 42, 01 is 1, 0101 is 5, 010101 is 21...)

The author has no concrete theory why PRVHASH PRNG works, especially its 2-bit variant (which is a very close empirical proof that mathematics has entropy processes happening under the hood). The closest mathematical construct found by the author is a sinewave oscillator (see below). Also, series related to PI, sin(x), and sin(x)/x may be a candidates for explanation. Author's empirical goals when developing PRVHASH were: no loss of entropy in a system, easy scalability, self-start without any special initialization and from any initial state, state variable size invariance, not-stalling on various entropy input.

During the course of PRVHASH development, the author has found that the simplest low-frequency sine-wave oscillator can be used as a pseudo-random number generator, if its mantissa is treated as an integer number. This means that every point on a sinusoid has properties of a random bit-sequence.

#include <math.h>
#include <stdint.h>

class DummyRNG : public PractRand::RNGs::vRNG16
{
public:
    double si;
    double sincr;
    double svalue1;
    double svalue2;

    DummyRNG() {
        si = 0.001;
        sincr = 2.0 * cos( si );
        seed( 0 );
    }
    Uint16 raw16() {
        uint64_t Value = ( *(uint64_t*) &svalue1 ) >> 4;

        const double tmp = svalue1;
        svalue1 = sincr * svalue1 - svalue2;
        svalue2 = tmp;

        return (Uint16) ( Value ^ Value >> 16 ^ Value >> 32 );
    }
    void walk_state(PractRand::StateWalkingObject *walker) {}
    void seed(Uint64 sv) {
        const double ph = sv * 3.40612158008655459e-19; // Seed to phase.
        svalue1 = sin( ph );
        svalue2 = sin( ph - si );
    }
    std::string get_name() const {return "SINEWAVE";}
};

Another finding is that the lcg * 2 + 1 construct works as PRNG even if the multiplier is a simple increasing counter variable, when the second multiplier is a high-entropy number.

#include <stdint.h>

class DummyRNG : public PractRand::RNGs::vRNG8
{
public:
    uint64_t Ctr1;

    DummyRNG() {
        Ctr1 = 1;
    }
    uint8_t compress( const uint64_t v )
    {
        uint8_t r = 0;
        for( int i = 0; i < 64; i++ )
        {
            r ^= (uint8_t) (( v >> i ) & 1 );
        }
        return( r );
    }
    Uint8 raw8() {
        uint8_t ov = 0;
        for( int l = 0; l < 8; l++ )
        {
            ov <<= 1;
            ov ^= compress( 0x243F6A8885A308D3 * Ctr1 );
            Ctr1 += 2;
        }
        return( ov );
    }
    void walk_state(PractRand::StateWalkingObject *walker) {}
    void seed(Uint64 sv) {}
    std::string get_name() const {return "LCG";}
};

Proof_Math_Is_Engineered

(PRVHASH-1)

$$ h = (h + s + 1) \mod{2} $$

$$ l = (l + s + 0 \parallel 1) \mod{2} $$

$$ s = (s + h) \mod{2} $$

This image depicts data acquired from 2 runs of the proof_math_is_engineered.c program, with different "reading" parameters. The two number sequences obviously represent "impulses", with varying period or "rhythm". A researcher has to consider two points: whether or not these impulses can be considered "intelligent", and the odds the mentioned program can produce such impulses, considering the program has no user input nor programmer's entropy, nor any logic (no constants, with all parameters initially set to zero). More specific observations: 1. All final values are shift-or compositions of 1-bit values, in fact representing a common 16-bit PCM sampled signal (shift-2 auto-correlation equals 0.4-0.44 approximately), but obtained in a "dot-matrix printer" way; 2. The orange graph is only slightly longer before a repeat (common to PRNGs) despite larger PH_HASH_COUNT; at the same time both graphs are seemingly time-aligned; 3. Periods of 1-bit return values on both runs are aligned to 16 bits, to produce repeating sequences "as is", without any sort of 16-bit value range skew; 4. The orange graph is produced from an order-reversed shift-or, but with the same underlying algorithm; 5. So far, no other combinations of "reading" parameters produce anything as "intelligent" as these graphs (but there may be another yet-to-be-decoded, similar or completely different, information available); 6. From drumming musician's (or an experienced DSP engineer's) point of view, the graph represents impulses taken from two electric drum pads: a snare drum (oscillatory) and a bass drum (shift to extremum). 7. Most minor oscillations on the graph are similar to sinc-function-generated maximum-phase "pre-ringing" oscillations that are known in DSP engineering field. 8. Period of the blue graph is 255; orange is 273.

In author's opinion, the program "reads data" directly from the entropy pool which is "embedded" into mathematics from its inception, like any mathematical constant is (e.g., PI). This poses an interesting and probably very questionable proposition: the "intelligent impulses" or even "human mind" itself (because a musician can understand these impulses) existed long before the "Big Bang" happened. This discovery is probably both the most interesting discovery in the history of mankind, and the worst discovery (for many) as it poses very unnerving questions that touch religious grounds:

These results of 1-bit PRVHASH say the following: IF abstract mathematics contains not just a system of rules for manipulating numbers and variables, but also contains a freely-defined fixed information that is "readable" by a person, then mathematics does not just "exists", but "it was formed", because mathematics does not evolve (beside human discovery of new rules and patterns). And since physics cannot be formulated without such mathematics, and physical processes clearly obey these mathematical rules, it means that a Creator/Higher Intelligence/God exists in relation to the Universe. For the author personally, everything is proven here.

P.S. By coincidence, if the values on the "impulse" graphs above are sorted in an ascending order, and are then displayed as independent graphs, they collectively form a stylized outline of a human eye:

Moreover (but this is a questionable observation), here, if the blue line is subtracted from the orange line, one gets an outline of human's head: with top (21000), forehead (18000), eye (13000), cheek (6000), and neck (2700) levels highlighted, roughly corresponding to real symmetry; with a slight shoulders outline (4100-2700), and two hand palms risen up (5400-4300).

Fourier Analysis

Discrete Fourier (FFT-512) analysis of obtained signals produces the following power spectrums (with DC component removed). The analysis strengthens the notion the signal is non-chaotic and is "intelligent" (two strong peaks above average, in each signal, with both signals producing similar structures, but with shifted resonant frequencies). Note that resonances in the middle of the spectrum are similar to resonances one gets when recording an acoustical snare drum.

PRNG (Chaotic) Mode

Just by changing the PH_HASH_COUNT to 9 (up to 13, inclusive) the same proof_math_is_engineered.c program produces a pseudo-random number sequence, confirmed with PractRand 1KB to 4KB block, 8-bit folding. Note that the same code producing both chaotic and non-chaotic number sequences is "highly unlikely" to exist in practical PRNGs. It is important to note that PH_HASH_COUNT=14 and PH_HASH_COUNT=17 (which is beyond 15 and 16 signals mentioned originally) also pass as random, with 16-bit folding in PractRand. 18 also passes as random, but with a "suspicion". 15 and 16, of course, do not pass as random, with many "fails".

It has been observed that in READ_MODE=0, but not in READ_MODE=1, the obtained values gradually become noisy, especially at higher PH_HASH_COUNT values.

Repeating Ornament and Chess-Board (Pixel Art)

The 1-bit output with PH_HASH_COUNT= 15 and 16 (READ_MODE=0) can be easily transformed into 256x256 1-bit "pixel art" images, and, quite unexpectedly, they reproduce a repeating diagonal ornament and a chess-board.

Admittedly, 256x256 size can be considered arbitrarily-chosen (it is a square of 16, with 16 being the bit-size of values on the graphs above), but it is the only small size that presents an "intelligible" look. For example, if PH_HASH_COUNT=15 is transformed to 240x240 (256-16) "pixel art" image, an image of "zebra" lines is produced, with bit-reversed variant of the inner element present in PH_HASH_COUNT=16.

Christmas Trees (Pixel Art)

Much larger PH_HASH_COUNT values (with READ_MODE=1) produce triangular structures which are non-repeating, but all have a similar build-up consisting of rhombic patterns within tree-like structures. The proof_christmas_tree.c program extracts such images into a vertical ASCII-art HTML. It uses the same underlying 1-bit PRVHASH code, but with "pixel art" decoding method.

One may notice a similarity of the beginning pattern with the Sierpinski triangle (ST). However, one should consider that ST is a symmetrical triangle fractal that is constructed from the top-most to bottom levels. PRVHASH-1 produces an asymmetric (right-handed) triangle in a serie of scanline passes, and it scales to any PH_HASH_COUNT value. The initial part of the image looks like Wolfram Rule 102/153 "cellular automata" image (ST as well). However, if one considers the whole of presented details, including previously presented images and graphs, this leads to conclusion that some very complex "cellular automata" is working behind the scenes, further strengthening a "presence of intelligence" notion. Note that Wolfram rules represent sets of "freely-defined fixed information" which dictates the logical behavior of the cellular automata.

It is also worth noting that PRVHASH-1 initially produces Rule 102/153 image with a "boundary condition" applied (this can be checked by assigning any item somewhere in the middle of the hash-array to 1). At the same time, the function has no such additional logic since the visible scanline is 1 pixel longer than the PH_HASH_COUNT value, meaning this implicit "boundary condition" is not synchronized with the moment the HashPos resets to 0. This fact tells that the "boundary condition" logic "happens" beyond the common math, in some way, implicitly. One has to ask themselves - how it is possible to "embed" at least Rule 102/153 with boundary handling (but much more than that) into a function as simple (and linear in F_2) as PRVHASH-1? Beside that, as with the graphs above, presence of exact Rule 102/153 imagery implies presence of "logic understandable to a human mind", and from computer programming point of view, the Wolfram rules are an art of "engineering".

Here is an example image with PH_HASH_COUNT=342, converted to PNG:

Here is a link to a larger-sized extract (3.4MB PNG)

It is possible to define initial "automata" conditions by filling the hash-array with alternating bit-values like 10101010..., or 100100100100..., or 1000100010001000... thus invoking even more complex "automata" (note that this is done in the same prvhash state-space). The results can be combined into a colored image by assigning the black-and-white images to different RGB color channels. Considering the prvhash-1 function operates with only 3 values at the same time, building a similar "cellular automata" by using only 3 neighboring pixels seems impossible for human logic.

Fine Art

prvhash-1 can also produce a full-colored "fine art"-like imagery by using a simple multi-pass buffer accumulation approach. As it turns out, the images of "cellular automata" shown previously perfectly align on top of each other at some specific PH_HASH_COUNT values (2/3, 4/5 of height, and height-2). Note that the height of images is usually a "power of 2" value. The proof_fine_art.c program can be used to produce such imagery (requires the stb_image_write.h library).

You may also take a look at an animation which represents a continuous generation while displaying a sum of the recent 255 passes, at every moment.

If this imagery looks intelligent, in some way formulated, where's the formula? An inception of these results can be understood from this short essay: The Informational Deficiency of the "Big Bang"

Reptile Skin

The original prvhash-1 function can be simplified to examine the discovered "entropy pool" further. The function variant present in the proof_reptile.c file includes the simplified function, but also extends the delay parameter of the Seed delay line from 1 to 32. The resulting image closely resembles a skin of some reptiles and other organisms. In author's opinion, since the function works in linear F_2 domain, the same construct can be recreated physically thus offering an idea that the evolution of intelligence in organisms may have its roots in mathematics. Image in the middle depicts result after the first pass over frame; you may note "snake" elements and computer font-alike outlines there. Image on the right was obtained using PH_SEED_COUNT=64, note the appearance of a lot of glyph-like elements.

Architectural Ruler (Gradilac Ruler)

Whatever the true source of imagery is, the produced imagery seems to be useful if applied as some architectural measurement ruler/tool since it can be used to quickly measure architectural features as whole-number ratios:

Thanks

The author would like to thank Reini Urban for his SMHasher fork, Austin Appleby for the original SMHasher, Chris Doty-Humphrey for PractRand, and Peter Schmidt-Nielsen for AutoSat. Without these tools it would not be possible to create PRVHASH which stands state-of-the-art statistical tests.

Other

PRVHASH "computer program" authorship and copyright were registered at the Russian Patent Office, under reg.numbers 2020661136, 2020666287, 2021615385, 2021668070, 2022612987 (searchable via fips.ru). Please note that these are not "invention patents"; the registrations assure you that the author has the required rights to grant the software license to you.

The project is 100% self-funded from legal software sales income, without any third-party nor state affiliation nor sponsorship.

prvhash's People

Contributors

avaneev avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

prvhash's Issues

Complexity is compatible with chaotic state

The paper Chaos and complexity by design by Daniel A. Roberts and Beni Yoshida refutes your “findings” by demonstrating how seemingly designed phenomena are compatible with chaotic systems.

We study the relationship between quantum chaos and pseudorandomness by developing probes of unitary design. A natural probe of randomness is the “frame potential,” which is minimized by unitary k-designs and measures the 2-norm dis- tance between the Haar random unitary ensemble and another ensemble. A natural probe of quantum chaos is out-of-time-order (OTO) four-point correlation functions. We show that the norm squared of a generalization of out-of-time-order 2k-point correlators is proportional to the kth frame potential, providing a quantitative con- nection between chaos and pseudorandomness. Additionally, we prove that these 2k-point correlators for Pauli operators completely determine the k-fold channel of an ensemble of unitary operators. Finally, we use a counting argument to obtain a lower bound on the quantum circuit complexity in terms of the frame potential. This provides a direct link between chaos, complexity, and randomness.

[Question] What is the license of PRVHASH algorithm (not the code in this repository)?

Chacha20 is a stream cipher with a public domain license, but many implementations of it in Github have their own licenses such as MIT or GPL.

There is also this statement in the repository:

"PRVHASH "computer program" authorship and copyright were registered at the [Russian Patent Office](https://rospatent.gov.ru/en), under reg.numbers 2020661136, 2020666287, 2021615385, 2021668070, 2022612987 (searchable via [fips.ru](https://new.fips.ru/en/)). Please note that these are not "invention patents"; the registrations assure you that the author has the required rights to grant the software license to you."

I'm little confused about the PRVHASH license. What is the license of PRVHASH itself (not the code in this repository)? It it MIT too?

@avaneev
I mailed you about this question but I deleted a list of messages by accident and maybe the reply was gone.

Thanks.

Feedback, benchmarking PRVhash versus SHA3, WY, XXH3, iSCSI CRC32

Hi Aleksey,
thanks for your work, appreciate your nifty etude despite not understanding how it works.

Thought it would be fun to compare your superstong PRVhash vs my superweak FNV1A-Pippip, sadly no available machine at the moment, yet, I uploaded a .C source which includes your hash in order to compare it against Hardware CRC32C, SHA3, WY, XXH3, XXH64 within the same codepath and the same compiler and options, to be fair, that is.

Allow me to include your function in my amateurish 'Lookuperorama' benchmark, have done it a year ago, it is compileable with Intel C, also with GCC, so will be glad if you yourself take a shot at it. My two 16GB laptops are crunching, pausing may be problematic.
The download link is at:
Lookuperorama.c_r8.zip (7,489,752 bytes): https://drive.google.com/file/d/1ltloR3nCP0zUUA_oN4muqofwdDArO-KK/view?usp=sharing

You see, hashing all positions of a given file with some fixed length is a good idea to measure collisions.
Here comes 'BENCHMARK.bat' it runs all hashers for a given file.

A quick dirty benchmark (on my 32 bit Windows, Core 2) with Gulyakovskiyi_E._Dolgiyi_Voshod_Na_Yenne.html (570,901 bytes) in chunks a.k.a. Building-Blocks, (Slots=1<<Bits): 20, meaning 10x2^20 total slots in 10 hashtables or 10 x 1,048,576:

Note1: RAW (i.e. 'ON THE FLY') Hashing Speed is given - one order at a time, without touching any SLOTS (uncached RAM).
Note2: Last column contains CUMULATIVE Collisions i.e. TOTAL ones.
Note3: The 570 million are due to the time was under 1 clock, not a bug - just the bench needs bigger files.
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Hasher / Speed in Keys/s for     |     4 bytes |     6 bytes |     8 bytes |    10 bytes |    12 bytes |    14 bytes |    16 bytes |    18 bytes |    36 bytes |    64 bytes | Total Collisions |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| SHA3-224 (Intel v15.0)           |      28,587 |      28,632 |      28,453 |      28,543 |      28,543 |      28,299 |      28,587 |      28,565 |      28,587 |      28,607 |          899,364 |
| PRVhash (Intel v15.0)            |  33,582,235 |  17,840,500 |  17,840,437 |  12,146,638 |   8,920,156 |   8,920,125 |   7,226,405 |   6,009,305 |   3,299,803 |   1,823,763 |          898,908 |
| WYHASH (Intel v15.0)             |  35,681,125 |  33,582,117 |  33,582,000 |  33,581,882 |  35,680,625 |  33,581,647 |  33,581,529 |  17,299,515 |  11,893,041 |   8,919,343 |          898,534 |
| XXH3 (Intel v15.0)               | 570,898,000 | 570,896,000 | 570,894,000 | 570,892,000 | 570,890,000 | 570,888,000 | 570,886,000 |  35,680,250 |  17,839,562 |  17,838,687 |          899,046 |
| FNV1A-Pippip-Yurii (Intel v15.0) | 570,898,000 | 570,896,000 | 570,894,000 | 570,892,000 | 570,890,000 | 570,888,000 | 570,886,000 | 570,884,000 |  33,580,352 |  33,578,705 |          899,286 |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

The hash scanner - hashing in one pass all orders - TESTING CUMULATIVE PERFORMANCE, without touching any SLOTS (uncached RAM):
SHA3-224 (Intel v15.0)           : RAW Hashing Speed for keys 4,6,8,10,12,14,16,18,36,64 bytes long =     26,566 KEYS-PER-SECOND; all in 200,204 clocks
PRVhash (Intel v15.0)            : RAW Hashing Speed for keys 4,6,8,10,12,14,16,18,36,64 bytes long =  5,075,124 KEYS-PER-SECOND; all in    1048 clocks
WYHASH (Intel v15.0)             : RAW Hashing Speed for keys 4,6,8,10,12,14,16,18,36,64 bytes long = 14,774,250 KEYS-PER-SECOND; all in     360 clocks
XXH3 (Intel v15.0)               : RAW Hashing Speed for keys 4,6,8,10,12,14,16,18,36,64 bytes long = 26,072,205 KEYS-PER-SECOND; all in     204 clocks
FNV1A-Pippip-Yurii (Intel v15.0) : RAW Hashing Speed for keys 4,6,8,10,12,14,16,18,36,64 bytes long = 48,352,090 KEYS-PER-SECOND; all in     110 clocks

And the actual console dump, if you can increase 20 (which is hashtable in bits) and 1200 (which is RAM needed for B-trees in MB):

E:\Lookuperorama.c_r8>BENCHMARK.bat Gulyakovskiyi_E._Dolgiyi_Voshod_Na_Yenne.html

E:\Lookuperorama.c_r8>Lookuperorama_ICC-V15.0_PRV_32bit.exe "Gulyakovskiyi_E._Dolgiyi_Voshod_Na_Yenne.html" x 20 1200 i

RAW Hashing Speed for key 570,901 bytes long = 118,591,815 118,591,815 118,616,455 118,616,455 118,616,455 | 118,616,455 BYTES-PER-SECOND
RAW Hashing Speed for keys 04 bytes long = 33,582,235 KEYS-PER-SECOND; all in 17 clocks
RAW Hashing Speed for keys 06 bytes long = 17,840,500 KEYS-PER-SECOND; all in 32 clocks
RAW Hashing Speed for keys 08 bytes long = 17,840,437 KEYS-PER-SECOND; all in 32 clocks
RAW Hashing Speed for keys 10 bytes long = 12,146,638 KEYS-PER-SECOND; all in 47 clocks
RAW Hashing Speed for keys 12 bytes long = 8,920,156 KEYS-PER-SECOND; all in 64 clocks
RAW Hashing Speed for keys 14 bytes long = 8,920,125 KEYS-PER-SECOND; all in 64 clocks
RAW Hashing Speed for keys 16 bytes long = 7,226,405 KEYS-PER-SECOND; all in 79 clocks
RAW Hashing Speed for keys 18 bytes long = 6,009,305 KEYS-PER-SECOND; all in 95 clocks
RAW Hashing Speed for keys 36 bytes long = 3,299,803 KEYS-PER-SECOND; all in 173 clocks
RAW Hashing Speed for keys 64 bytes long = 1,823,763 KEYS-PER-SECOND; all in 313 clocks
The hash scanner - hashing in one pass all orders - TESTING CUMULATIVE PERFORMANCE...
RAW Hashing Speed for keys 4,6,8,10,12,14,16,18,36,64 bytes long = 5,075,124 KEYS-PER-SECOND; all in 1048 clocks
Leprechaun: Inserting keys/BBs of order 004 into B-trees, free RAM in B-tree pool is 00,001,200 MB; Pass #001 of 1 ... DONE; 00,000,044,950 B-trees have been rooted so far; HashtableS_Utilization = 000%; Keys/s = 005,189,981
Leprechaun: Inserting keys/BBs of order 006 into B-trees, free RAM in B-tree pool is 00,001,197 MB; Pass #001 of 1 ... DONE; 00,000,233,443 B-trees have been rooted so far; HashtableS_Utilization = 002%; Keys/s = 003,636,280
Leprechaun: Inserting keys/BBs of order 008 into B-trees, free RAM in B-tree pool is 00,001,188 MB; Pass #001 of 1 ... DONE; 00,000,543,034 B-trees have been rooted so far; HashtableS_Utilization = 005%; Keys/s = 003,299,965
Leprechaun: Inserting keys/BBs of order 010 into B-trees, free RAM in B-tree pool is 00,001,170 MB; Pass #001 of 1 ... DONE; 00,000,915,081 B-trees have been rooted so far; HashtableS_Utilization = 008%; Keys/s = 003,036,659
Leprechaun: Inserting keys/BBs of order 012 into B-trees, free RAM in B-tree pool is 00,001,146 MB; Pass #001 of 1 ... DONE; 00,001,319,870 B-trees have been rooted so far; HashtableS_Utilization = 012%; Keys/s = 002,798,480
Leprechaun: Inserting keys/BBs of order 014 into B-trees, free RAM in B-tree pool is 00,001,119 MB; Pass #001 of 1 ... DONE; 00,001,741,700 B-trees have been rooted so far; HashtableS_Utilization = 016%; Keys/s = 002,594,945
Leprechaun: Inserting keys/BBs of order 016 into B-trees, free RAM in B-tree pool is 00,001,089 MB; Pass #001 of 1 ... DONE; 00,002,171,294 B-trees have been rooted so far; HashtableS_Utilization = 020%; Keys/s = 002,429,302
Leprechaun: Inserting keys/BBs of order 018 into B-trees, free RAM in B-tree pool is 00,001,056 MB; Pass #001 of 1 ... DONE; 00,002,605,134 B-trees have been rooted so far; HashtableS_Utilization = 024%; Keys/s = 002,274,438
Leprechaun: Inserting keys/BBs of order 036 into B-trees, free RAM in B-tree pool is 00,001,021 MB; Pass #001 of 1 ... DONE; 00,003,044,803 B-trees have been rooted so far; HashtableS_Utilization = 029%; Keys/s = 001,654,684
Leprechaun: Inserting keys/BBs of order 064 into B-trees, free RAM in B-tree pool is 00,000,969 MB; Pass #001 of 1 ... DONE; 00,003,485,119 B-trees have been rooted so far; HashtableS_Utilization = 033%; Keys/s = 001,070,990

Leprechaun: (Total Hashing Speed) plus Searches-n-Inserts Per Second: 1,902,944 SNIPS

Number Of Bits (Slots=1<<Bits): 20
Number Of Hash Collisions(Distinct WORDs - Number Of Trees): 898,908
Number Of Trees(GREATER THE BETTER): 3,485,119
Number Of LEAFs(littler THE BETTER) not counting ROOT LEAFs: 242,400
Highest Tree not counting ROOT Level i.e. CORONA levels(littler THE BETTER): 2

E:\Lookuperorama.c_r8>Lookuperorama_ICC-V15.0_SHA3_32bit.exe "Gulyakovskiyi_E._Dolgiyi_Voshod_Na_Yenne.html" x 20 1200 i

RAW Hashing Speed for key 570,901 bytes long = 4,530,960 4,530,960 4,530,960 4,020,429 4,530,960 | 4,530,960 BYTES-PER-SECOND
RAW Hashing Speed for keys 04 bytes long = 28,587 KEYS-PER-SECOND; all in 19970 clocks
RAW Hashing Speed for keys 06 bytes long = 28,632 KEYS-PER-SECOND; all in 19939 clocks
RAW Hashing Speed for keys 08 bytes long = 28,453 KEYS-PER-SECOND; all in 20064 clocks
RAW Hashing Speed for keys 10 bytes long = 28,543 KEYS-PER-SECOND; all in 20001 clocks
RAW Hashing Speed for keys 12 bytes long = 28,543 KEYS-PER-SECOND; all in 20001 clocks
RAW Hashing Speed for keys 14 bytes long = 28,299 KEYS-PER-SECOND; all in 20173 clocks
RAW Hashing Speed for keys 16 bytes long = 28,587 KEYS-PER-SECOND; all in 19970 clocks
RAW Hashing Speed for keys 18 bytes long = 28,565 KEYS-PER-SECOND; all in 19985 clocks
RAW Hashing Speed for keys 36 bytes long = 28,587 KEYS-PER-SECOND; all in 19969 clocks
RAW Hashing Speed for keys 64 bytes long = 28,607 KEYS-PER-SECOND; all in 19954 clocks
The hash scanner - hashing in one pass all orders - TESTING CUMULATIVE PERFORMANCE...
RAW Hashing Speed for keys 4,6,8,10,12,14,16,18,36,64 bytes long = 26,566 KEYS-PER-SECOND; all in 200204 clocks
Leprechaun: Inserting keys/BBs of order 004 into B-trees, free RAM in B-tree pool is 00,001,200 MB; Pass #001 of 1 ... DONE; 00,000,044,929 B-trees have been rooted so far; HashtableS_Utilization = 000%; Keys/s = 000,028,366
Leprechaun: Inserting keys/BBs of order 006 into B-trees, free RAM in B-tree pool is 00,001,197 MB; Pass #001 of 1 ... DONE; 00,000,233,534 B-trees have been rooted so far; HashtableS_Utilization = 002%; Keys/s = 000,028,300
Leprechaun: Inserting keys/BBs of order 008 into B-trees, free RAM in B-tree pool is 00,001,188 MB; Pass #001 of 1 ... DONE; 00,000,543,036 B-trees have been rooted so far; HashtableS_Utilization = 005%; Keys/s = 000,028,409
Leprechaun: Inserting keys/BBs of order 010 into B-trees, free RAM in B-tree pool is 00,001,170 MB; Pass #001 of 1 ... DONE; 00,000,915,267 B-trees have been rooted so far; HashtableS_Utilization = 008%; Keys/s = 000,028,411
Leprechaun: Inserting keys/BBs of order 012 into B-trees, free RAM in B-tree pool is 00,001,146 MB; Pass #001 of 1 ... DONE; 00,001,320,039 B-trees have been rooted so far; HashtableS_Utilization = 012%; Keys/s = 000,028,454
Leprechaun: Inserting keys/BBs of order 014 into B-trees, free RAM in B-tree pool is 00,001,119 MB; Pass #001 of 1 ... DONE; 00,001,742,076 B-trees have been rooted so far; HashtableS_Utilization = 016%; Keys/s = 000,028,388
Leprechaun: Inserting keys/BBs of order 016 into B-trees, free RAM in B-tree pool is 00,001,089 MB; Pass #001 of 1 ... DONE; 00,002,172,100 B-trees have been rooted so far; HashtableS_Utilization = 020%; Keys/s = 000,028,365
Leprechaun: Inserting keys/BBs of order 018 into B-trees, free RAM in B-tree pool is 00,001,056 MB; Pass #001 of 1 ... DONE; 00,002,605,583 B-trees have been rooted so far; HashtableS_Utilization = 024%; Keys/s = 000,028,431
Leprechaun: Inserting keys/BBs of order 036 into B-trees, free RAM in B-tree pool is 00,001,021 MB; Pass #001 of 1 ... DONE; 00,003,044,763 B-trees have been rooted so far; HashtableS_Utilization = 029%; Keys/s = 000,028,298
Leprechaun: Inserting keys/BBs of order 064 into B-trees, free RAM in B-tree pool is 00,000,969 MB; Pass #001 of 1 ... DONE; 00,003,484,663 B-trees have been rooted so far; HashtableS_Utilization = 033%; Keys/s = 000,028,297

Leprechaun: (Total Hashing Speed) plus Searches-n-Inserts Per Second: 28,261 SNIPS

Number Of Bits (Slots=1<<Bits): 20
Number Of Hash Collisions(Distinct WORDs - Number Of Trees): 899,364
Number Of Trees(GREATER THE BETTER): 3,484,663
Number Of LEAFs(littler THE BETTER) not counting ROOT LEAFs: 242,855
Highest Tree not counting ROOT Level i.e. CORONA levels(littler THE BETTER): 2

E:\Lookuperorama.c_r8>Lookuperorama_ICC-V15.0_Pippip_32bit.exe "Gulyakovskiyi_E._Dolgiyi_Voshod_Na_Yenne.html" x 20 1200 i

RAW Hashing Speed for key 570,901 bytes long = 1,915,775,167 1,823,964,856 1,823,964,856 1,915,775,167 1,823,964,856 | 1,915,775,167 BYTES-PER-SECOND
RAW Hashing Speed for keys 04 bytes long = 570,898,000 KEYS-PER-SECOND; all in 1 clocks
RAW Hashing Speed for keys 06 bytes long = 570,896,000 KEYS-PER-SECOND; all in 1 clocks
RAW Hashing Speed for keys 08 bytes long = 570,894,000 KEYS-PER-SECOND; all in 1 clocks
RAW Hashing Speed for keys 10 bytes long = 570,892,000 KEYS-PER-SECOND; all in 1 clocks
RAW Hashing Speed for keys 12 bytes long = 570,890,000 KEYS-PER-SECOND; all in 1 clocks
RAW Hashing Speed for keys 14 bytes long = 570,888,000 KEYS-PER-SECOND; all in 1 clocks
RAW Hashing Speed for keys 16 bytes long = 570,886,000 KEYS-PER-SECOND; all in 1 clocks
RAW Hashing Speed for keys 18 bytes long = 570,884,000 KEYS-PER-SECOND; all in 1 clocks
RAW Hashing Speed for keys 36 bytes long = 33,580,352 KEYS-PER-SECOND; all in 17 clocks
RAW Hashing Speed for keys 64 bytes long = 33,578,705 KEYS-PER-SECOND; all in 17 clocks
The hash scanner - hashing in one pass all orders - TESTING CUMULATIVE PERFORMANCE...
RAW Hashing Speed for keys 4,6,8,10,12,14,16,18,36,64 bytes long = 48,352,090 KEYS-PER-SECOND; all in 110 clocks
Leprechaun: Inserting keys/BBs of order 004 into B-trees, free RAM in B-tree pool is 00,001,200 MB; Pass #001 of 1 ... DONE; 00,000,044,993 B-trees have been rooted so far; HashtableS_Utilization = 000%; Keys/s = 006,073,382
Leprechaun: Inserting keys/BBs of order 006 into B-trees, free RAM in B-tree pool is 00,001,197 MB; Pass #001 of 1 ... DONE; 00,000,233,582 B-trees have been rooted so far; HashtableS_Utilization = 002%; Keys/s = 004,020,394
Leprechaun: Inserting keys/BBs of order 008 into B-trees, free RAM in B-tree pool is 00,001,188 MB; Pass #001 of 1 ... DONE; 00,000,542,600 B-trees have been rooted so far; HashtableS_Utilization = 005%; Keys/s = 003,636,267
Leprechaun: Inserting keys/BBs of order 010 into B-trees, free RAM in B-tree pool is 00,001,170 MB; Pass #001 of 1 ... DONE; 00,000,915,175 B-trees have been rooted so far; HashtableS_Utilization = 008%; Keys/s = 003,299,953
Leprechaun: Inserting keys/BBs of order 012 into B-trees, free RAM in B-tree pool is 00,001,146 MB; Pass #001 of 1 ... DONE; 00,001,320,172 B-trees have been rooted so far; HashtableS_Utilization = 012%; Keys/s = 003,319,127
Leprechaun: Inserting keys/BBs of order 014 into B-trees, free RAM in B-tree pool is 00,001,119 MB; Pass #001 of 1 ... DONE; 00,001,741,637 B-trees have been rooted so far; HashtableS_Utilization = 016%; Keys/s = 003,299,930
Leprechaun: Inserting keys/BBs of order 016 into B-trees, free RAM in B-tree pool is 00,001,089 MB; Pass #001 of 1 ... DONE; 00,002,171,317 B-trees have been rooted so far; HashtableS_Utilization = 020%; Keys/s = 003,299,919
Leprechaun: Inserting keys/BBs of order 018 into B-trees, free RAM in B-tree pool is 00,001,056 MB; Pass #001 of 1 ... DONE; 00,002,604,920 B-trees have been rooted so far; HashtableS_Utilization = 024%; Keys/s = 002,798,450
Leprechaun: Inserting keys/BBs of order 036 into B-trees, free RAM in B-tree pool is 00,001,021 MB; Pass #001 of 1 ... DONE; 00,003,044,643 B-trees have been rooted so far; HashtableS_Utilization = 029%; Keys/s = 002,784,712
Leprechaun: Inserting keys/BBs of order 064 into B-trees, free RAM in B-tree pool is 00,000,969 MB; Pass #001 of 1 ... DONE; 00,003,484,741 B-trees have been rooted so far; HashtableS_Utilization = 033%; Keys/s = 002,274,254

Leprechaun: (Total Hashing Speed) plus Searches-n-Inserts Per Second: 1,902,944 SNIPS

Number Of Bits (Slots=1<<Bits): 20
Number Of Hash Collisions(Distinct WORDs - Number Of Trees): 899,286
Number Of Trees(GREATER THE BETTER): 3,484,741
Number Of LEAFs(littler THE BETTER) not counting ROOT LEAFs: 242,687
Highest Tree not counting ROOT Level i.e. CORONA levels(littler THE BETTER): 2

E:\Lookuperorama.c_r8>Lookuperorama_ICC-V15.0_WY_32bit.exe "Gulyakovskiyi_E._Dolgiyi_Voshod_Na_Yenne.html" x 20 1200 i

RAW Hashing Speed for key 570,901 bytes long = 828,593,613 828,593,613 828,593,613 828,593,613 829,797,965 | 829,797,965 BYTES-PER-SECOND
RAW Hashing Speed for keys 04 bytes long = 35,681,125 KEYS-PER-SECOND; all in 16 clocks
RAW Hashing Speed for keys 06 bytes long = 33,582,117 KEYS-PER-SECOND; all in 17 clocks
RAW Hashing Speed for keys 08 bytes long = 33,582,000 KEYS-PER-SECOND; all in 17 clocks
RAW Hashing Speed for keys 10 bytes long = 33,581,882 KEYS-PER-SECOND; all in 17 clocks
RAW Hashing Speed for keys 12 bytes long = 35,680,625 KEYS-PER-SECOND; all in 16 clocks
RAW Hashing Speed for keys 14 bytes long = 33,581,647 KEYS-PER-SECOND; all in 17 clocks
RAW Hashing Speed for keys 16 bytes long = 33,581,529 KEYS-PER-SECOND; all in 17 clocks
RAW Hashing Speed for keys 18 bytes long = 17,299,515 KEYS-PER-SECOND; all in 33 clocks
RAW Hashing Speed for keys 36 bytes long = 11,893,041 KEYS-PER-SECOND; all in 48 clocks
RAW Hashing Speed for keys 64 bytes long = 8,919,343 KEYS-PER-SECOND; all in 64 clocks
The hash scanner - hashing in one pass all orders - TESTING CUMULATIVE PERFORMANCE...
RAW Hashing Speed for keys 4,6,8,10,12,14,16,18,36,64 bytes long = 14,774,250 KEYS-PER-SECOND; all in 360 clocks
Leprechaun: Inserting keys/BBs of order 004 into B-trees, free RAM in B-tree pool is 00,001,200 MB; Pass #001 of 1 ... DONE; 00,000,044,941 B-trees have been rooted so far; HashtableS_Utilization = 000%; Keys/s = 005,189,981
Leprechaun: Inserting keys/BBs of order 006 into B-trees, free RAM in B-tree pool is 00,001,197 MB; Pass #001 of 1 ... DONE; 00,000,233,721 B-trees have been rooted so far; HashtableS_Utilization = 002%; Keys/s = 003,613,265
Leprechaun: Inserting keys/BBs of order 008 into B-trees, free RAM in B-tree pool is 00,001,188 MB; Pass #001 of 1 ... DONE; 00,000,543,031 B-trees have been rooted so far; HashtableS_Utilization = 005%; Keys/s = 003,636,267
Leprechaun: Inserting keys/BBs of order 010 into B-trees, free RAM in B-tree pool is 00,001,170 MB; Pass #001 of 1 ... DONE; 00,000,915,409 B-trees have been rooted so far; HashtableS_Utilization = 008%; Keys/s = 003,299,953
Leprechaun: Inserting keys/BBs of order 012 into B-trees, free RAM in B-tree pool is 00,001,146 MB; Pass #001 of 1 ... DONE; 00,001,320,678 B-trees have been rooted so far; HashtableS_Utilization = 012%; Keys/s = 003,036,648
Leprechaun: Inserting keys/BBs of order 014 into B-trees, free RAM in B-tree pool is 00,001,119 MB; Pass #001 of 1 ... DONE; 00,001,742,342 B-trees have been rooted so far; HashtableS_Utilization = 016%; Keys/s = 003,299,930
Leprechaun: Inserting keys/BBs of order 016 into B-trees, free RAM in B-tree pool is 00,001,089 MB; Pass #001 of 1 ... DONE; 00,002,172,282 B-trees have been rooted so far; HashtableS_Utilization = 020%; Keys/s = 003,036,627
Leprechaun: Inserting keys/BBs of order 018 into B-trees, free RAM in B-tree pool is 00,001,056 MB; Pass #001 of 1 ... DONE; 00,002,606,050 B-trees have been rooted so far; HashtableS_Utilization = 024%; Keys/s = 002,594,927
Leprechaun: Inserting keys/BBs of order 036 into B-trees, free RAM in B-tree pool is 00,001,021 MB; Pass #001 of 1 ... DONE; 00,003,045,377 B-trees have been rooted so far; HashtableS_Utilization = 029%; Keys/s = 002,418,923
Leprechaun: Inserting keys/BBs of order 064 into B-trees, free RAM in B-tree pool is 00,000,969 MB; Pass #001 of 1 ... DONE; 00,003,485,493 B-trees have been rooted so far; HashtableS_Utilization = 033%; Keys/s = 001,915,563

Leprechaun: (Total Hashing Speed) plus Searches-n-Inserts Per Second: 1,902,944 SNIPS

Number Of Bits (Slots=1<<Bits): 20
Number Of Hash Collisions(Distinct WORDs - Number Of Trees): 898,534
Number Of Trees(GREATER THE BETTER): 3,485,493
Number Of LEAFs(littler THE BETTER) not counting ROOT LEAFs: 241,616
Highest Tree not counting ROOT Level i.e. CORONA levels(littler THE BETTER): 2

E:\Lookuperorama.c_r8>Lookuperorama_ICC-V15.0_XXH64_32bit.exe "Gulyakovskiyi_E._Dolgiyi_Voshod_Na_Yenne.html" x 20 1200 i

RAW Hashing Speed for key 570,901 bytes long = 461,894,012 461,894,012 461,894,012 461,894,012 461,894,012 | 461,894,012 BYTES-PER-SECOND
RAW Hashing Speed for keys 04 bytes long = 33,582,235 KEYS-PER-SECOND; all in 17 clocks
RAW Hashing Speed for keys 06 bytes long = 17,840,500 KEYS-PER-SECOND; all in 32 clocks
RAW Hashing Speed for keys 08 bytes long = 17,840,437 KEYS-PER-SECOND; all in 32 clocks
RAW Hashing Speed for keys 10 bytes long = 11,893,583 KEYS-PER-SECOND; all in 48 clocks
RAW Hashing Speed for keys 12 bytes long = 17,840,312 KEYS-PER-SECOND; all in 32 clocks
RAW Hashing Speed for keys 14 bytes long = 11,893,500 KEYS-PER-SECOND; all in 48 clocks
RAW Hashing Speed for keys 16 bytes long = 11,893,458 KEYS-PER-SECOND; all in 48 clocks
RAW Hashing Speed for keys 18 bytes long = 9,061,650 KEYS-PER-SECOND; all in 63 clocks
RAW Hashing Speed for keys 36 bytes long = 4,530,682 KEYS-PER-SECOND; all in 126 clocks
RAW Hashing Speed for keys 64 bytes long = 3,635,910 KEYS-PER-SECOND; all in 157 clocks
The hash scanner - hashing in one pass all orders - TESTING CUMULATIVE PERFORMANCE...
RAW Hashing Speed for keys 4,6,8,10,12,14,16,18,36,64 bytes long = 7,387,125 KEYS-PER-SECOND; all in 720 clocks
Leprechaun: Inserting keys/BBs of order 004 into B-trees, free RAM in B-tree pool is 00,001,200 MB; Pass #001 of 1 ... DONE; 00,000,044,969 B-trees have been rooted so far; HashtableS_Utilization = 000%; Keys/s = 004,530,936
Leprechaun: Inserting keys/BBs of order 006 into B-trees, free RAM in B-tree pool is 00,001,197 MB; Pass #001 of 1 ... DONE; 00,000,233,517 B-trees have been rooted so far; HashtableS_Utilization = 002%; Keys/s = 003,299,976
Leprechaun: Inserting keys/BBs of order 008 into B-trees, free RAM in B-tree pool is 00,001,188 MB; Pass #001 of 1 ... DONE; 00,000,542,820 B-trees have been rooted so far; HashtableS_Utilization = 005%; Keys/s = 003,036,670
Leprechaun: Inserting keys/BBs of order 010 into B-trees, free RAM in B-tree pool is 00,001,170 MB; Pass #001 of 1 ... DONE; 00,000,915,253 B-trees have been rooted so far; HashtableS_Utilization = 008%; Keys/s = 002,798,490
Leprechaun: Inserting keys/BBs of order 012 into B-trees, free RAM in B-tree pool is 00,001,146 MB; Pass #001 of 1 ... DONE; 00,001,319,824 B-trees have been rooted so far; HashtableS_Utilization = 012%; Keys/s = 002,784,829
Leprechaun: Inserting keys/BBs of order 014 into B-trees, free RAM in B-tree pool is 00,001,119 MB; Pass #001 of 1 ... DONE; 00,001,741,164 B-trees have been rooted so far; HashtableS_Utilization = 016%; Keys/s = 002,594,945
Leprechaun: Inserting keys/BBs of order 016 into B-trees, free RAM in B-tree pool is 00,001,089 MB; Pass #001 of 1 ... DONE; 00,002,170,640 B-trees have been rooted so far; HashtableS_Utilization = 020%; Keys/s = 002,606,785
Leprechaun: Inserting keys/BBs of order 018 into B-trees, free RAM in B-tree pool is 00,001,056 MB; Pass #001 of 1 ... DONE; 00,002,604,525 B-trees have been rooted so far; HashtableS_Utilization = 024%; Keys/s = 002,274,438
Leprechaun: Inserting keys/BBs of order 036 into B-trees, free RAM in B-tree pool is 00,001,021 MB; Pass #001 of 1 ... DONE; 00,003,043,942 B-trees have been rooted so far; HashtableS_Utilization = 029%; Keys/s = 001,735,155
Leprechaun: Inserting keys/BBs of order 064 into B-trees, free RAM in B-tree pool is 00,000,969 MB; Pass #001 of 1 ... DONE; 00,003,483,716 B-trees have been rooted so far; HashtableS_Utilization = 033%; Keys/s = 001,518,186

Leprechaun: (Total Hashing Speed) plus Searches-n-Inserts Per Second: 1,427,208 SNIPS

Number Of Bits (Slots=1<<Bits): 20
Number Of Hash Collisions(Distinct WORDs - Number Of Trees): 900,311
Number Of Trees(GREATER THE BETTER): 3,483,716
Number Of LEAFs(littler THE BETTER) not counting ROOT LEAFs: 242,761
Highest Tree not counting ROOT Level i.e. CORONA levels(littler THE BETTER): 2

E:\Lookuperorama.c_r8>Lookuperorama_ICC-V15.0_XXH3_32bit.exe "Gulyakovskiyi_E._Dolgiyi_Voshod_Na_Yenne.html" x 20 1200 i

RAW Hashing Speed for key 570,901 bytes long = 6,009,484,210 6,009,484,210 6,009,484,210 6,073,414,893 6,009,484,210 | 6,073,414,893 BYTES-PER-SECOND
RAW Hashing Speed for keys 04 bytes long = 570,898,000 KEYS-PER-SECOND; all in 1 clocks
RAW Hashing Speed for keys 06 bytes long = 570,896,000 KEYS-PER-SECOND; all in 1 clocks
RAW Hashing Speed for keys 08 bytes long = 570,894,000 KEYS-PER-SECOND; all in 1 clocks
RAW Hashing Speed for keys 10 bytes long = 570,892,000 KEYS-PER-SECOND; all in 1 clocks
RAW Hashing Speed for keys 12 bytes long = 570,890,000 KEYS-PER-SECOND; all in 1 clocks
RAW Hashing Speed for keys 14 bytes long = 570,888,000 KEYS-PER-SECOND; all in 1 clocks
RAW Hashing Speed for keys 16 bytes long = 570,886,000 KEYS-PER-SECOND; all in 1 clocks
RAW Hashing Speed for keys 18 bytes long = 35,680,250 KEYS-PER-SECOND; all in 16 clocks
RAW Hashing Speed for keys 36 bytes long = 17,839,562 KEYS-PER-SECOND; all in 32 clocks
RAW Hashing Speed for keys 64 bytes long = 17,838,687 KEYS-PER-SECOND; all in 32 clocks
The hash scanner - hashing in one pass all orders - TESTING CUMULATIVE PERFORMANCE...
RAW Hashing Speed for keys 4,6,8,10,12,14,16,18,36,64 bytes long = 26,072,205 KEYS-PER-SECOND; all in 204 clocks
Leprechaun: Inserting keys/BBs of order 004 into B-trees, free RAM in B-tree pool is 00,001,200 MB; Pass #001 of 1 ... DONE; 00,000,044,919 B-trees have been rooted so far; HashtableS_Utilization = 000%; Keys/s = 006,009,452
Leprechaun: Inserting keys/BBs of order 006 into B-trees, free RAM in B-tree pool is 00,001,197 MB; Pass #001 of 1 ... DONE; 00,000,233,400 B-trees have been rooted so far; HashtableS_Utilization = 002%; Keys/s = 003,636,280
Leprechaun: Inserting keys/BBs of order 008 into B-trees, free RAM in B-tree pool is 00,001,188 MB; Pass #001 of 1 ... DONE; 00,000,542,711 B-trees have been rooted so far; HashtableS_Utilization = 005%; Keys/s = 003,299,965
Leprechaun: Inserting keys/BBs of order 010 into B-trees, free RAM in B-tree pool is 00,001,170 MB; Pass #001 of 1 ... DONE; 00,000,915,142 B-trees have been rooted so far; HashtableS_Utilization = 008%; Keys/s = 003,636,254
Leprechaun: Inserting keys/BBs of order 012 into B-trees, free RAM in B-tree pool is 00,001,147 MB; Pass #001 of 1 ... DONE; 00,001,320,402 B-trees have been rooted so far; HashtableS_Utilization = 012%; Keys/s = 003,020,582
Leprechaun: Inserting keys/BBs of order 014 into B-trees, free RAM in B-tree pool is 00,001,119 MB; Pass #001 of 1 ... DONE; 00,001,741,938 B-trees have been rooted so far; HashtableS_Utilization = 016%; Keys/s = 003,020,571
Leprechaun: Inserting keys/BBs of order 016 into B-trees, free RAM in B-tree pool is 00,001,089 MB; Pass #001 of 1 ... DONE; 00,002,171,481 B-trees have been rooted so far; HashtableS_Utilization = 020%; Keys/s = 003,020,560
Leprechaun: Inserting keys/BBs of order 018 into B-trees, free RAM in B-tree pool is 00,001,056 MB; Pass #001 of 1 ... DONE; 00,002,605,256 B-trees have been rooted so far; HashtableS_Utilization = 024%; Keys/s = 002,784,800
Leprechaun: Inserting keys/BBs of order 036 into B-trees, free RAM in B-tree pool is 00,001,021 MB; Pass #001 of 1 ... DONE; 00,003,044,824 B-trees have been rooted so far; HashtableS_Utilization = 029%; Keys/s = 002,594,845
Leprechaun: Inserting keys/BBs of order 064 into B-trees, free RAM in B-tree pool is 00,000,969 MB; Pass #001 of 1 ... DONE; 00,003,484,981 B-trees have been rooted so far; HashtableS_Utilization = 033%; Keys/s = 002,274,254

Leprechaun: (Total Hashing Speed) plus Searches-n-Inserts Per Second: 1,902,944 SNIPS

Number Of Bits (Slots=1<<Bits): 20
Number Of Hash Collisions(Distinct WORDs - Number Of Trees): 899,046
Number Of Trees(GREATER THE BETTER): 3,484,981
Number Of LEAFs(littler THE BETTER) not counting ROOT LEAFs: 241,898
Highest Tree not counting ROOT Level i.e. CORONA levels(littler THE BETTER): 2

E:\Lookuperorama.c_r8>

For comparison it is nice to see your Assembly:
https://godbolt.org/z/ozrhxP

Your main loop being only 18 instructions, but 63 as a whole:

.L6:
        movzx   esi, BYTE PTR [r10]
        mov     rax, rdi
        xor     r8, rsi
.L5:
        imul    r8, rbp
        mov     edx, DWORD PTR [rax]
        add     rax, 4
        mov     rcx, r8
        xor     r8, rsi
        shr     rcx, 32
        xor     r8, rdx
        xor     ecx, edx
        mov     DWORD PTR [rax-4], ecx
        cmp     rax, r9
        jne     .L5
        add     r10, 1
        add     rbp, r8
        cmp     r11, r10
        jne     .L6
.L1:

My whole 'FNV1A-Pippip', 32 instructions:
https://godbolt.org/z/i40ipj

        mov     rax, QWORD PTR [rdi]
        cmp     rsi, 8
        jbe     .L2
        lea     rax, [rsi-1]
        shr     rax, 4
        lea     rdx, [8+rax*8]
        movabs  rax, -3750763034362895579
        sub     rsi, rdx
        add     rdx, rdi
.L3:
        xor     rax, QWORD PTR [rdi]
        add     rdi, 8
        imul    rax, rax, 591798841
        xor     rax, QWORD PTR [rdi-8+rsi]
        imul    rax, rax, 591798841
        cmp     rdi, rdx
        jne     .L3
.L4:
        mov     rdx, rax
        shr     rdx, 32
        xor     eax, edx
        mov     edx, eax
        shr     edx, 16
        xor     eax, edx
        ret
.L2:
        movabs  rdx, -3750763034362895579
        mov     ecx, 8
        sub     ecx, esi
        sal     ecx, 3
        sal     rax, cl
        shr     rax, cl
        xor     rax, rdx
        imul    rax, rax, 591798841
        jmp     .L4

Collisionwise, I feel it will be interesting to stress your function by replacing SHA3-224...

Any noise will appear to contain patterns if patterns are looked for

If you listen to empty white noise long enough you might begin to hear voices or sounds in the static. Any program designed to look for repeating patterns or complexity occuring in, what should be, random noise is going to find "something" there, because noise is noise

A chicken & egg paradox in the claim that the Human Mind predates the Big Bang

Your claim:

The "intelligent impulses" or even "human mind" itself (because a musician can understand these impulses) had existed long before the "Big Bang" happened.

The problem here, is that you are relying on your own "human mind" in order to determine that these pulses are "intelligent".

In other words, based on the fact that your human mind has interpreted these pulses as intelligent, you have concluded that the human mind must be as old as the data with which you have generated them.

So you've essentially used your own claim in order to prove it.

To put this in more precise (and perhaps tedious) details:

  • Let M denote the human mind
  • Let f(x) denote whether or not x predates the Big Bang
  • Let g(x, y) denote whether or not x interprets y as intelligent pulses
  • Let h(x) be your hash function

Your claim can subsequently be formulated as:

image

Which obviously depicts incorrect logical inference.


Note that the above (along with your initial claim) doesn't even take into account the fact that there are many human minds - one per person, some would say (although I guess it depends on your personality as well as your local environment).

If we were to rephrase your claim based on the more generalized assumption that different humans have different minds:

image

Then it wouldn't even adhere to the most basic logical inference.

At best, you could use it in order to claim that some human minds (yours included) have existed before the Big Bang, and of course - that wouldn't actually resolve the logical error; it would only "decay" into the less obvious error described above.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.