xxHash.st

Extremely fast non-cryptographic hash algorithm xxhash


xxHash is an extremely fast hash algorithm, running at RAM speed limits. It successfully completes the SMHasher test suite, which evaluates the collision, dispersion and randomness qualities of hash functions.

Installation

PM> Install-Package Standart.Hash.xxHash

Benchmarks

This benchmark was run on Windows 10.0.19044.1706 (21H2). The reference system uses an AMD Ryzen 7 2700 (1 CPU, 16 logical and 8 physical cores).

BenchmarkDotNet=v0.13.1, OS=Windows 10.0.19044.1706 (21H2)
AMD Ryzen 7 2700, 1 CPU, 16 logical and 8 physical cores
.NET SDK=6.0.300
  [Host]     : .NET 6.0.5 (6.0.522.21309), X64 RyuJIT
  Job-HQVLOG : .NET 6.0.5 (6.0.522.21309), X64 RyuJIT
Runtime=.NET 6.0  

| Method         | x64        |
|:---------------|-----------:|
| Hash32 Array   |  6.65 GB/s |
| Hash64 Array   | 12.28 GB/s |
| Hash128 Array  | 12.04 GB/s |
| Hash3 Array    | 12.08 GB/s |
| Hash32 Span    |  6.65 GB/s |
| Hash64 Span    | 12.28 GB/s |
| Hash128 Span   | 12.04 GB/s |
| Hash3 Span     | 12.08 GB/s |
| Hash32 Stream  |  3.22 GB/s |
| Hash64 Stream  |  4.81 GB/s |

Comparison between C# and C implementations

| Method              | Platform | Language | 1KB Time | 1MB Time | 1GB Time | Speed      |
|:--------------------|:---------|:---------|---------:|---------:|---------:|-----------:|
| Hash32              | x64      | C#       | 138.0 ns | 130.2 us | 150.3 ms |  6.65 GB/s |
| Hash32              | x64      | C        | 140.2 ns | 129.6 us | 150.3 ms |  6.65 GB/s |
| Hash64              | x64      | C#       |  73.9 ns |  64.6 us |  81.4 ms | 12.28 GB/s |
| Hash64              | x64      | C        |  75.5 ns |  65.2 us |  84.5 ms | 11.83 GB/s |
| Hash128 (SSE2/AVX2) | x64      | C#       | 84.95 ns |  56.9 us |  73.2 ms | 13.66 GB/s |
| Hash128 (SSE2/AVX2) | x64      | C        | 84.35 ns |  38.1 us |  57.2 ms | 17.48 GB/s |
| Hash3 (SSE2/AVX2)   | x64      | C#       |  75.8 ns |  56.6 us |  74.6 ms | 13.40 GB/s |
| Hash3 (SSE2/AVX2)   | x64      | C        |  74.1 ns |  42.1 us |  59.5 ms | 16.80 GB/s |
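
The Speed column appears to be the data size divided by the 1GB time. That is an assumption based on the numbers lining up, not something the README states. A minimal sketch of the arithmetic, where the `Throughput` class is a hypothetical helper and not part of the library:

```csharp
using System;
using System.Globalization;

public static class Throughput
{
    // Throughput in GB/s: data size in GB divided by elapsed time in seconds.
    public static double GigabytesPerSecond(double sizeInGigabytes, double elapsedMilliseconds)
    {
        if (elapsedMilliseconds <= 0)
            throw new ArgumentOutOfRangeException(nameof(elapsedMilliseconds));

        return sizeInGigabytes / (elapsedMilliseconds / 1000.0);
    }

    // Formats the Hash32 (C#) row: 1 GB hashed in 150.3 ms.
    public static string Hash32Example()
    {
        double speed = GigabytesPerSecond(1.0, 150.3);
        return speed.ToString("F2", CultureInfo.InvariantCulture); // "6.65"
    }
}
```

Applied to the Hash32 row (1 GB in 150.3 ms), this gives roughly the 6.65 GB/s shown in the table.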

API

public static uint ComputeHash(byte[] data, int length, uint seed = 0) { throw null; }
public static uint ComputeHash(Span<byte> data, int length, uint seed = 0) { throw null; }
public static uint ComputeHash(Stream stream, int bufferSize = 4096, uint seed = 0) { throw null; }
public static async ValueTask<uint> ComputeHashAsync(Stream stream, int bufferSize = 4096, uint seed = 0) { throw null; }
public static uint ComputeHash(string str, uint seed = 0) { throw null; }


public static ulong ComputeHash(byte[] data, int length, ulong seed = 0) { throw null; }
public static ulong ComputeHash(Span<byte> data, int length, ulong seed = 0) { throw null; }
public static ulong ComputeHash(Stream stream, int bufferSize = 8192, ulong seed = 0) { throw null; }
public static async ValueTask<ulong> ComputeHashAsync(Stream stream, int bufferSize = 8192, ulong seed = 0) { throw null; }
public static ulong ComputeHash(string str, uint seed = 0) { throw null; }

public static uint128 ComputeHash(byte[] data, int length, uint seed = 0) { throw null; }
public static uint128 ComputeHash(Span<byte> data, int length, uint seed = 0) { throw null; }
public static uint128 ComputeHash(string str, uint seed = 0) { throw null; }

// allocations
public static byte[] ComputeHashBytes(byte[] data, int length, uint seed = 0) { throw null; }
public static byte[] ComputeHashBytes(Span<byte> data, int length, uint seed = 0) { throw null; }
public static byte[] ComputeHashBytes(string str, uint seed = 0) { throw null; }

Examples

A few examples of how to use the API:

byte[] data = Encoding.UTF8.GetBytes("veni vidi vici");

ulong h64_1 = xxHash64.ComputeHash(data, data.Length);
ulong h64_2 = xxHash64.ComputeHash(new Span<byte>(data), data.Length);
ulong h64_3 = xxHash64.ComputeHash(new ReadOnlySpan<byte>(data), data.Length);
ulong h64_4 = xxHash64.ComputeHash(new MemoryStream(data));
ulong h64_5 = await xxHash64.ComputeHashAsync(new MemoryStream(data));
ulong h64_6 = xxHash64.ComputeHash("veni vidi vici");

uint h32_1 = xxHash32.ComputeHash(data, data.Length);
uint h32_2 = xxHash32.ComputeHash(new Span<byte>(data), data.Length);
uint h32_3 = xxHash32.ComputeHash(new ReadOnlySpan<byte>(data), data.Length);
uint h32_4 = xxHash32.ComputeHash(new MemoryStream(data));
uint h32_5 = await xxHash32.ComputeHashAsync(new MemoryStream(data));
uint h32_6 = xxHash32.ComputeHash("veni vidi vici");

ulong h3_1 = xxHash3.ComputeHash(data, data.Length);
ulong h3_2 = xxHash3.ComputeHash(new Span<byte>(data), data.Length);
ulong h3_3 = xxHash3.ComputeHash(new ReadOnlySpan<byte>(data), data.Length);
ulong h3_4 = xxHash3.ComputeHash("veni vidi vici");

uint128 h128_1 = xxHash128.ComputeHash(data, data.Length);
uint128 h128_2 = xxHash128.ComputeHash(new Span<byte>(data), data.Length);
uint128 h128_3 = xxHash128.ComputeHash(new ReadOnlySpan<byte>(data), data.Length);
uint128 h128_4 = xxHash128.ComputeHash("veni vidi vici");

Guid guid    = h128_1.ToGuid();
byte[] bytes = h128_1.ToBytes();

byte[] hash_bytes_1 = xxHash128.ComputeHashBytes(data, data.Length);
byte[] hash_bytes_2 = xxHash128.ComputeHashBytes(new Span<byte>(data), data.Length);
byte[] hash_bytes_3 = xxHash128.ComputeHashBytes(new ReadOnlySpan<byte>(data), data.Length);
byte[] hash_bytes_4 = xxHash128.ComputeHashBytes("veni vidi vici");

Made in 🔰 Ukraine with ❤️

Contributors

ksmith3036, legigor, mcraiha, moien007, olmelnyk, uranium62


xxhash's Issues

.NET Standard 2.1

Hello,
The NuGet package is only available for .NET 6 projects.
Please change the target framework to .NET Standard 2.1 so that other libraries targeting .NET Standard 2.1 can use it.

System.BadImageFormatException: Could not load file or assembly

If the runtime environment is an x86 platform, or an Azure App Service is configured to support only 32-bit, you will get the following exception:

System.BadImageFormatException: Could not load file or assembly 'Standart.Hash.xxHash, Version=1.0.6.0, Culture=neutral, PublicKeyToken=null'. An attempt was made to load a program with an incorrect format.
File name: 'Standart.Hash.xxHash, Version=1.0.6.0, Culture=neutral, PublicKeyToken=null'

Also, after compiling code that depends on the Standart.Hash.xxHash library, you may get a warning:

C:\Program Files (x86)\Microsoft Visual Studio\2017\Professional\MSBuild\15.0\Bin\Microsoft.Common.CurrentVersion.targets(2110,5): Warning MSB3270: There was a mismatch between the processor architecture of the project being built "MSIL" and the processor architecture of the reference "C:\Users\username\.nuget\packages\standart.hash.xxhash\1.0.6\lib\netstandard2.0\Standart.Hash.xxHash.dll", "AMD64". This mismatch may cause runtime failures. Please consider changing the targeted processor architecture of your project through the Configuration Manager so as to align the processor architectures between your project and references, or take a dependency on references with a processor architecture that matches the targeted processor architecture of your project.

To fix this, change PlatformTarget in the csproj files to AnyCPU:
<PlatformTarget>AnyCPU</PlatformTarget>

PS Thanks for your library!
From Ukraine with ❤️

Error installing the package with NuGet

Hello,
When I try to install the package with NuGet, I get this error:
Could not install package 'Standart.Hash.xxHash 4.0.5'. You are trying to install this package into a project that targets '.NETFramework,Version=v4.6.2', but the package does not contain any assembly references or content files that are compatible with that framework. For more information, contact the package author.

I'm really a newbie in .NET, can you please advise?

thanks a lot

Hash of continuous byte arrays

Hello,
Thank you for your work. I'm new to xxHash. My question: if I split a byte array into sub-arrays, how can I get the same result as for the original array?
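
One possible answer, as a sketch rather than the maintainers' recommendation: hashing each sub-array separately and combining the results will not reproduce the hash of the whole array, so the simplest route is to join the pieces back into one contiguous buffer and hash that. `ChunkedHashing` and its methods are hypothetical names. The snippet assumes the Standart.Hash.xxHash package is installed and uses the `byte[]` overload listed in the API section.

```csharp
using System;
using System.Collections.Generic;
using Standart.Hash.xxHash;

public static class ChunkedHashing
{
    // Copies the chunks, in order, into one contiguous buffer.
    public static byte[] Concat(IReadOnlyList<byte[]> chunks)
    {
        if (chunks is null) throw new ArgumentNullException(nameof(chunks));

        int total = 0;
        foreach (byte[] chunk in chunks) total += chunk.Length;

        byte[] joined = new byte[total];
        int offset = 0;
        foreach (byte[] chunk in chunks)
        {
            Buffer.BlockCopy(chunk, 0, joined, offset, chunk.Length);
            offset += chunk.Length;
        }
        return joined;
    }

    // Hashes the concatenation, which equals the hash of the original array
    // as long as the chunks are passed in their original order.
    public static ulong HashChunks(IReadOnlyList<byte[]> chunks)
    {
        byte[] joined = Concat(chunks);
        return xxHash64.ComputeHash(joined, joined.Length);
    }
}
```

For data too large to copy, the `Stream` overloads from the API section read the input in buffers, so wrapping the pieces in a stream is another option.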

Incorrect hash when passing `string` into `xxHash64.ComputeHash`

Using ComputeHash with a string will yield an "incorrect" hash because the library is casting a string instance into an unsafe char* which can cause a different hash to be returned depending on the system it's running on.

public static unsafe ulong ComputeHash(string str, uint seed = 0)
{
    Debug.Assert(str != null);
    fixed (char* c = str)
    {
        byte* ptr = (byte*) c;
        int length = str.Length * 2;
        return UnsafeComputeHash(ptr, length, seed);
    }
}

The official .NET documentation clearly specifies that the default encoding can vary between systems, and additionally that the string and char types use UTF-16 internally.

The correct approach here would be to create a stack-allocated Span<byte> and then use Encoding.UTF8.GetBytes.
Additionally an optional encoding parameter could also be added, with the default being UTF8.
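
A minimal sketch of what this suggestion could look like on the caller's side. `Utf8StringHashing` is a hypothetical wrapper, not part of the library, and the 512-byte stack limit is an arbitrary choice. It assumes the Standart.Hash.xxHash package and the `Span<byte>` overload of `xxHash64.ComputeHash` from the README.

```csharp
using System;
using System.Text;
using Standart.Hash.xxHash;

public static class Utf8StringHashing
{
    private const int MaxStackBytes = 512; // arbitrary limit for stack allocation

    // Encodes the string with the given encoding (UTF-8 by default) and hashes the bytes.
    public static ulong ComputeHash(string str, ulong seed = 0, Encoding encoding = null)
    {
        if (str is null) throw new ArgumentNullException(nameof(str));
        encoding ??= Encoding.UTF8;

        // Worst-case encoded size; small inputs stay on the stack.
        int maxBytes = encoding.GetMaxByteCount(str.Length);
        Span<byte> buffer = maxBytes <= MaxStackBytes
            ? stackalloc byte[MaxStackBytes]
            : new byte[maxBytes];

        int written = encoding.GetBytes(str.AsSpan(), buffer);
        return xxHash64.ComputeHash(buffer.Slice(0, written), written, seed);
    }
}
```

With this wrapper, `"veni vidi vici"` is hashed as 14 UTF-8 bytes instead of 28 UTF-16 bytes, so the result matches hashing `Encoding.UTF8.GetBytes("veni vidi vici")` directly. Behavior for an empty string depends on how the library version in use handles empty input.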


Public APIs should have argument validation and not Debug.Asserts

Debug.Assert is, as the name suggests, only active at development time. For runtime, public APIs should perform proper argument validation.

Or, as this comes with a small performance drop, it should at least be documented that no validation is done and that it is up to the caller of these APIs to provide valid arguments.

The status quo is that these APIs are really unsafe 😉 not just in some implementations, but from the API's point of view.
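
Until the library validates its arguments, a caller could wrap the calls in a checked helper. The sketch below is one way to do that: `CheckedXxHash64` is a hypothetical name, and it assumes the Standart.Hash.xxHash package and the `byte[]` overload shown in the README.

```csharp
using System;
using Standart.Hash.xxHash;

public static class CheckedXxHash64
{
    // Validates the arguments at runtime, then forwards to the library call.
    public static ulong ComputeHash(byte[] data, int length, ulong seed = 0)
    {
        if (data is null)
            throw new ArgumentNullException(nameof(data));

        if (length < 0 || length > data.Length)
            throw new ArgumentOutOfRangeException(
                nameof(length), length, "Length must be between 0 and data.Length.");

        return xxHash64.ComputeHash(data, length, seed);
    }
}
```

The checks cost a couple of comparisons per call, which is the small performance drop mentioned above.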

.NET 4.5 is not supported

The package does not support .NET Framework 4.5. It would be a great help if support for this version were added.

Right now I see the following error in my console:

Severity	Code	Description	Project	File	Line	Suppression State
Error		Could not install package 'Standart.Hash.xxHash 3.1.0'. You are trying to install this package into a project that targets '.NETFramework,Version=v4.5', but the package does not contain any assembly references or content files that are compatible with that framework. For more information, contact the package author.	

Sign assembly with strong name

Please consider adding a strong name to this assembly, so it can be consumed as a dependency from another signed assembly.

Thank you in advance!

C Test Harness

Firstly, thanks for this excellent library!

In the readme you provide comparison against the reference C code, but I can't find any C project in this repo - is it possible to publish the test harness you used to create these benchmarks?

Comparison to other xxHash libraries

Hello!

First of all, thank you for sharing your code :)

Why? I think that results below speak for themselves :D

                                            Method |       Mean |     Error |    StdDev |
---------------------------------------------------|-----------:|----------:|----------:|
  System.Data.HashFunction.xxHash 2.0.0            | 7,866.3 ms | 66.149 ms | 61.875 ms |
  Extensions.Data.xxHash.core20 1.0.2.1 (XXHash32) | 7,834.1 ms | 70.787 ms | 66.214 ms |
  Extensions.Data.xxHash.core20 1.0.2.1 (XXHash64) | 4,234.4 ms | 18.427 ms | 17.237 ms |
  Standart.Hash.xxHash 1.0.6 (xxHash64)            |   679.1 ms |  4.434 ms |  4.147 ms |
  Standart.Hash.xxHash 1.0.6 (xxHash32)            | 1,201.7 ms |  7.507 ms |  7.022 ms |

I used BenchmarkDotNet to create this little comparison. Test file was of 1.8 GiB size. Feel free to use it anywhere you want.

If you'd like to test it yourself, here's the source code:

using System.Data.HashFunction.xxHash;
using System.IO;
using System.Linq;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Configs;
using BenchmarkDotNet.Running;
using BenchmarkDotNet.Validators;
using Extensions.Data;
using Standart.Hash.xxHash;

namespace XXHashBenchmark
{
public class Program
{
    public class AllowNonOptimized : ManualConfig
    {
        public AllowNonOptimized()
        {
            Add(
                JitOptimizationsValidator
                    .DontFailOnError); // ALLOW NON-OPTIMIZED DLLS

            Add(
                DefaultConfig.Instance.GetLoggers()
                    .ToArray()); // manual config has no loggers by default
            Add(
                DefaultConfig.Instance.GetExporters()
                    .ToArray()); // manual config has no exporters by default
            Add(
                DefaultConfig.Instance.GetColumnProviders()
                    .ToArray()); // manual config has no columns by default
        }
    }

    public class XxHashBenchmark
    {
        private const string FilePath = "file2";

        private FileStream GetStream()
        {
            return new FileStream(
                FilePath,
                FileMode.Open);
        }

        [Benchmark]
        public void Func1()
        {
            using (var stream = GetStream())
            {
                var x = xxHashFactory.Instance.Create(
                    new xxHashConfig
                    {
                        Seed = 42
                    });

                x.ComputeHash(stream);
            }
        }

        [Benchmark]
        public void Func2()
        {
            var x = XXHash32.Create(42);

            using (var stream = GetStream())
            {
                x.ComputeHash(stream);
            }
        }
        
        [Benchmark]
        public void Func3()
        {
            var x = XXHash64.Create(42);

            using (var stream = GetStream())
            {
                x.ComputeHash(stream);
            }
        }
        
        [Benchmark]
        public void Func4()
        {
            using (var stream = GetStream())
            {
                xxHash64.ComputeHash(stream);
            }
        }
        
        [Benchmark]
        public void Func5()
        {
            using (var stream = GetStream())
            {
                xxHash32.ComputeHash(stream);
            }
        }
    }

    public static void Main(string[] args)
    {
        BenchmarkRunner.Run<XxHashBenchmark>(new AllowNonOptimized());
    }
}
}

Cheers!

Exception on empty input

When the input is an empty array or span, all hash functions throw an exception.

var bytes = new byte[0];
var hash = xxHash3.ComputeHash(bytes, bytes.Length); //exception

var span = bytes.AsSpan();
var hash2 = xxHash3.ComputeHash(span, span.Length); //exception

All hash functions should have a guard clause like this:

if (data.Length == 0)
    return ComputeEmptyHash(seed);
...
private static unsafe uint ComputeEmptyHash(uint seed)
{
    var pData = stackalloc byte[1]; //doesn't work with zero
    return UnsafeComputeHash(pData, 0, seed);
}

XXH3 and .NET 5 Compatibility

Thanks for sharing this cool implementation. It performs faster than some other xxHash C# libraries I tested, like HashDepot, especially for long strings.

Do you have any plans to keep it up-to-date with XXH3, and .NET 5 besides the current .NET Standard 2.0 compatibility as it seems like .NET Standard may be retired?

Support ReadOnlySpan

First of all, awesome lib! In my test it was about an order of magnitude faster than another implementation I was evaluating.

Would you consider adding, or changing the api, to support ReadOnlySpan?

Did this to get around it for the moment, not particularly pretty though.

      var span = new Span<byte>(Unsafe.AsPointer(ref MemoryMarshal.GetReference(roSpan)), roSpan.Length);
      return (int)xxHash32.ComputeHash(span, span.Length);
