hashstructure's Introduction

hashstructure is a Go library for creating a unique hash value for arbitrary values in Go.

This can be used to key complex values in a hash-based container (a map, a set, etc.). Common use cases include comparing two values without sending the data across the network, and caching or de-duplicating values locally.

Features

  • Hash any arbitrary Go value, including complex types.

  • Tag a struct field to ignore it and not affect the hash value.

  • Tag a slice type struct field to treat it as a set where ordering doesn't affect the hash code but the field itself is still taken into account to create the hash value.

  • Optionally, specify a custom hash function to optimize for speed, collision avoidance for your data set, etc.

  • Optionally, hash the output of .String() on structs that implement fmt.Stringer, allowing effective hashing of time.Time.

  • Optionally, override the hashing process by implementing Hashable.

Installation

Standard go get:

$ go get github.com/mitchellh/hashstructure/v2

Note on v2: It is highly recommended that you use the "v2" release, since it fixes some significant hash collision issues from v1. In practice, we used v1 for many years in real projects at HashiCorp and never had issues, but this depends heavily on the shape of the data you're hashing and how you use those hashes.

When using v2+, you can still generate weaker v1 hashes by using the FormatV1 format when calling Hash.

Usage & Example

For usage and examples see the Godoc.

A quick code example is shown below:

type ComplexStruct struct {
    Name     string
    Age      uint
    Metadata map[string]interface{}
}

v := ComplexStruct{
    Name: "mitchellh",
    Age:  64,
    Metadata: map[string]interface{}{
        "car":      true,
        "location": "California",
        "siblings": []string{"Bob", "John"},
    },
}

hash, err := hashstructure.Hash(v, hashstructure.FormatV2, nil)
if err != nil {
    panic(err)
}

fmt.Printf("%d", hash)
// Output:
// 2307517237273902113

hashstructure's People

Contributors

bflad, desimone, f21, fortytw2, gavv, larsfronius, matfax, matheus-meneses, mitchellh, mwhooker, teraken0509

hashstructure's Issues

SlicesAsSets option behaves poorly with duplicate items in the slice

If two slices differ only by an item that is repeated in both, it's trivial to construct hash collisions.

package main

import (
	"fmt"

	hashstructure "github.com/mitchellh/hashstructure/v2"
)

func main() {
	// clearly two different lists
	list1 := []string{"a", "b", "c", "e", "e"}
	list2 := []string{"a", "b", "c", "d", "d"}

	// with the same hash
	fmt.Println(hashstructure.Hash(list1, hashstructure.FormatV2, &hashstructure.HashOptions{SlicesAsSets: true}))
	fmt.Println(hashstructure.Hash(list2, hashstructure.FormatV2, &hashstructure.HashOptions{SlicesAsSets: true}))
}

// output
// 12638153115695167423 <nil>
// 12638153115695167423 <nil>
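
One way to see why this can happen, sketched under the assumption that set hashing combines per-element hashes with XOR (as the XOR issue further down this page describes): any element that appears an even number of times cancels itself out. The helpers elemHash and xorSetHash are hypothetical and use FNV-1a from the standard library, not the library's own walker.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// elemHash hashes one string with FNV-1a (a stand-in for a per-element hash).
func elemHash(s string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(s)) // hash.Hash.Write never returns an error
	return h.Sum64()
}

// xorSetHash combines element hashes with XOR, so element order is irrelevant.
func xorSetHash(items []string) uint64 {
	var acc uint64
	for _, it := range items {
		acc ^= elemHash(it)
	}
	return acc
}

func main() {
	list1 := []string{"a", "b", "c", "e", "e"}
	list2 := []string{"a", "b", "c", "d", "d"}
	// Each duplicated pair XORs to zero, leaving only a^b^c in both cases.
	fmt.Println(xorSetHash(list1) == xorSetHash(list2)) // true
}
```

Because x ^ x is always 0, the collision does not depend on the hash function chosen for the elements.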

Same hash returned for prometheus regex field

I am seeing the same hash returned when the contents of the struct differ, even though the field in question is exported.
The field is the regex field from the Prometheus config, which is indeed exported, yet changing it does not change the hash.
https://github.com/prometheus/prometheus/blob/0e12f11d6112585415140bcdf0fd30c42e4255fe/config/config.go#L525
https://github.com/prometheus/prometheus/blob/0e12f11d6112585415140bcdf0fd30c42e4255fe/model/relabel/relabel.go#L89C2-L89C8
I see this only for the regex field. Below are the two configs I am using:

scrape_configs:
  - job_name: kube-proxy
    scrape_interval: 30s
    label_limit: 63
    label_name_length_limit: 511
    label_value_length_limit: 1023
    kubernetes_sd_configs:
    - role: pod
    relabel_configs:
    - action: keep
      source_labels:
      - __meta_kubernetes_namespace
      - __meta_kubernetes_pod_name
      separator: "/"
      regex: kube-system/kube-proxy.+
    - source_labels:
      - __address__
      action: replace
      target_label: __address__
      regex: "regex1"
      replacement: "$$1:10249"

scrape_configs:
  - job_name: kube-proxy
    scrape_interval: 30s
    label_limit: 63
    label_name_length_limit: 511
    label_value_length_limit: 1023
    kubernetes_sd_configs:
    - role: pod
    relabel_configs:
    - action: keep
      source_labels:
      - __meta_kubernetes_namespace
      - __meta_kubernetes_pod_name
      separator: "/"
      regex: kube-system/kube-proxy.+
    - source_labels:
      - __address__
      action: replace
      target_label: __address__
      regex: "regex2"
      replacement: "$$1:10249"

Hashing time.Time does not generate unique hashes

See the included Ginkgo test case. Two different times return the same hash.

It("hashes time differently", func() {
			t1 := time.Now()
			t2 := t1.Add(1 * time.Second)

			hashedT1, err := hashstructure.Hash(t1, nil)
			Expect(err).Should(BeNil())

			hashedT2, err := hashstructure.Hash(t2, nil)
			Expect(err).Should(BeNil())

			Expect(hashedT1).ShouldNot(Equal(hashedT2))
		})
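
For context, time.Time keeps its state in unexported fields, so a reflection-based walker that skips unexported fields has nothing to tell two instants apart. The Features list mentions hashing the output of .String() as a way around this. The sketch below illustrates that idea with the standard library only. hashStringer and oneSecondApartDiffer are hypothetical helpers, not part of hashstructure.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"time"
)

// hashStringer hashes the textual form of any fmt.Stringer with FNV-1a.
func hashStringer(s fmt.Stringer) uint64 {
	h := fnv.New64a()
	h.Write([]byte(s.String()))
	return h.Sum64()
}

// oneSecondApartDiffer reports whether two fixed instants one second apart
// produce different hashes when hashed through their String() output.
func oneSecondApartDiffer() bool {
	t1 := time.Unix(0, 0).UTC()
	t2 := t1.Add(1 * time.Second)
	return hashStringer(t1) != hashStringer(t2)
}

func main() {
	fmt.Println(oneSecondApartDiffer()) // true
}
```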

Same hash for different struct?

I have this simple code:

package main

import (
        "fmt"

        hashstructure "github.com/mitchellh/hashstructure/v2"
)

type Kitchen struct {
        numOfPlates int
        cost int
}


func hash(myType Kitchen) uint64 {
        hash, err := hashstructure.Hash(myType, hashstructure.FormatV2, nil)
        fmt.Println(err)
        return hash
}

func main() {
        fmt.Println("Starting...")
        myStruct :=  Kitchen{numOfPlates: 11, cost:3000}
        myStruct2 :=  Kitchen{numOfPlates: 10, cost:2000}
        myHash1 := hash(myStruct)
        myHash2 := hash(myStruct2)
        fmt.Printf("This is hash1: %d and this is hash2: %d\n", myHash1, myHash2)
        if myHash1 == myHash2 {
                fmt.Println("They are the same!")
        }
        fmt.Println("Finishing...")
}

The hash in both cases is the same even though the values of the structs are different. This is the output:

Starting...
<nil>
<nil>
This is hash1: 6527152372158230899 and this is hash2: 6527152372158230899
They are the same!
Finishing...

I'd expect a different hash. Is this a bug?

Unexported identifiers are not factored into hashing

func TestUnexportedFields(t *testing.T) {
	type str struct {
		v string
	}

	k1 := str{"123"}
	k2 := str{"456"}

	v1, _ := hashstructure.Hash(k1, nil)
	v2, _ := hashstructure.Hash(k2, nil)

	if v1 == v2 {
		t.Errorf("cache collision: %d vs %d", v1, v2)
	}
}

Expected: nothing
Actual: cache collision: 15609384054371896546 vs 15609384054371896546

This was surprising and unexpected. If this is intended behavior I suggest updating the documentation and considering making this an option.
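
The documentation quoted in the visit() issue on this page says that unexported struct fields are ignored. The sketch below uses only the reflect package to show which fields are exported. exportedFields and PublicKitchen are hypothetical names introduced for illustration.

```go
package main

import (
	"fmt"
	"reflect"
)

type Kitchen struct {
	numOfPlates int
	cost        int
}

// PublicKitchen is the same shape with exported field names.
type PublicKitchen struct {
	NumOfPlates int
	Cost        int
}

// exportedFields returns the names of the exported fields of a struct value.
// It panics if v is not a struct.
func exportedFields(v interface{}) []string {
	t := reflect.TypeOf(v)
	var names []string
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		if f.PkgPath == "" { // PkgPath is empty only for exported fields
			names = append(names, f.Name)
		}
	}
	return names
}

func main() {
	fmt.Println(exportedFields(Kitchen{}))       // []
	fmt.Println(exportedFields(PublicKitchen{})) // [NumOfPlates Cost]
}
```

With no exported fields, every value of the type looks identical to a walker that skips unexported fields, which matches the equal hashes reported above.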

Hashes generated for complex structs containing pointers don't seem to be correct

Hi,

Looking for another pair of eyes here...Not sure what's wrong with this bit of code, but shouldn't the hash values be different?

	type Object struct {
		ID string
	}

	type TransactionExample struct {
		id    string
		objects []*Object
	}

	txn := TransactionExample{
		id:"txn1",
		objects: []*Object{},
	}

	txn2 := TransactionExample{
		id:"txn2",
		objects: []*Object{
			{ID: "test"},
		},
	}

	hash, _ := hashstructure.Hash(txn, nil)
	hash2, _ := hashstructure.Hash(txn2, nil)

	fmt.Println("Hashes:", hash, hash2)

Result:

Hashes: 5827385550866931895 5827385550866931895

Would appreciate some insight on this issue.

Thanks!
Dani

ZeroNil has no effect on int pointers

I just ran into this issue, took some time to figure out.

type t struct {
    A *int
}
var zero int

fmt.Println(hashstructure.Hash(t{}, hashstructure.FormatV2, nil))
fmt.Println(hashstructure.Hash(t{A: &zero}, hashstructure.FormatV2, nil))

opts := &hashstructure.HashOptions{ZeroNil: true}
fmt.Println(hashstructure.Hash(t{}, hashstructure.FormatV2, opts))
fmt.Println(hashstructure.Hash(t{A: &zero}, hashstructure.FormatV2, opts))

opts.ZeroNil = false
fmt.Println(hashstructure.Hash(t{}, hashstructure.FormatV2, opts))
fmt.Println(hashstructure.Hash(t{A: &zero}, hashstructure.FormatV2, opts))

The output from the code above is as follows, showing that the ZeroNil option has no effect.

16677325619216437773 <nil>
16677325619216437773 <nil>
16677325619216437773 <nil>
16677325619216437773 <nil>
16677325619216437773 <nil>
16677325619216437773 <nil>

Note that for other types I tried (string, bool) the setting did seem to work correctly.

Cyclic reference causes stack overflow

Hi, I found that a self-reference or cyclic reference leads to a stack overflow. Here is an example:

package main

import "github.com/mitchellh/hashstructure/v2"

type Node struct {
        Ptr *Node
}

func main() {
        n := &Node{
                Ptr: nil,
        }
        n.Ptr = n
        hash, err := hashstructure.Hash(n, hashstructure.FormatV2, nil)
        if err != nil {
                panic(err)
        }
        println(hash)
}
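
A caller could guard against this by checking for cycles before hashing. The following is a minimal sketch, assuming it is enough to track pointers and maps on the current traversal path. Self-containing slices are not tracked. containsCycle, hasCycle and visitKey are hypothetical names, not part of hashstructure.

```go
package main

import (
	"fmt"
	"reflect"
)

type Node struct {
	Ptr *Node
}

// visitKey identifies a reference by address and type, because a struct and
// its first field can share the same address.
type visitKey struct {
	addr uintptr
	typ  reflect.Type
}

// hasCycle reports whether walking v reaches a pointer or map that is already
// on the current traversal path.
func hasCycle(v reflect.Value, onPath map[visitKey]bool) bool {
	switch v.Kind() {
	case reflect.Ptr, reflect.Map:
		if v.IsNil() {
			return false
		}
		key := visitKey{v.Pointer(), v.Type()}
		if onPath[key] {
			return true
		}
		onPath[key] = true
		defer delete(onPath, key) // leave the path when this call returns
		if v.Kind() == reflect.Ptr {
			return hasCycle(v.Elem(), onPath)
		}
		for _, k := range v.MapKeys() {
			if hasCycle(v.MapIndex(k), onPath) {
				return true
			}
		}
	case reflect.Interface:
		if !v.IsNil() {
			return hasCycle(v.Elem(), onPath)
		}
	case reflect.Struct:
		for i := 0; i < v.NumField(); i++ {
			if hasCycle(v.Field(i), onPath) {
				return true
			}
		}
	case reflect.Slice, reflect.Array:
		for i := 0; i < v.Len(); i++ {
			if hasCycle(v.Index(i), onPath) {
				return true
			}
		}
	}
	return false
}

// containsCycle is the entry point: it starts a walk with an empty path.
func containsCycle(x interface{}) bool {
	return hasCycle(reflect.ValueOf(x), map[visitKey]bool{})
}

func main() {
	n := &Node{}
	n.Ptr = n
	fmt.Println(containsCycle(n))                       // true
	fmt.Println(containsCycle(&Node{Ptr: &Node{}}))     // false
}
```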

Two different contents of same struct give the same hash

Hello,
I have the same hash with this:

package main

import (
	"fmt"

	"github.com/mitchellh/hashstructure"
)

type bar struct {
	stuf string
}

type foobarbar map[string][4][]string

type (
	foo []bar
)

type cacheData struct {
	foobar  foo
	foobar1 foobarbar
}

func main() {

	{
		hash, err := hashstructure.Hash(cacheData{}, nil)
		if err != nil {
			panic(err)
		}

		fmt.Println(hash) // => 9141985097084809465
	}

	{
		hash, err := hashstructure.Hash(cacheData{
			foobar1: foobarbar{
				"hello": [4][]string{[]string{"world"}, nil, nil, nil},
			},
		}, nil)
		if err != nil {
			panic(err)
		}

		fmt.Println(hash) // => 9141985097084809465 !!! the same as hashstructure.Hash(cacheData{}, nil)
	}

}
go version
go version go1.7.5 linux/amd64

go env
GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/johndoe/go"
GORACE=""
GOROOT="/usr/lib/go"
GOTOOLDIR="/usr/lib/go/pkg/tool/linux_amd64"
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build521523302=/tmp/go-build -gno-record-gcc-switches"
CXX="g++"
CGO_ENABLED="1"

visit() visits unexported fields and panics

Hello @mitchellh,

Thanks for publishing this library. I ran into this panic trace while trying it out:

panic: reflect.Value.Interface: cannot return value obtained from unexported field or method [recovered]
        panic: reflect.Value.Interface: cannot return value obtained from unexported field or method

goroutine 25 [running]:
testing.tRunner.func1(0xc8200ac360)
        /usr/local/go/src/testing/testing.go:450 +0x171
reflect.valueInterface(0x12e2a0, 0xc82010a030, 0x79, 0xc82010c001, 0x0, 0x0)
        /usr/local/go/src/reflect/value.go:912 +0xe7
reflect.Value.Interface(0x12e2a0, 0xc82010a030, 0x79, 0x0, 0x0)
        /usr/local/go/src/reflect/value.go:901 +0x48
github.com/mitchellh/hashstructure.(*walker).visit(0xc8200c5d90, 0x12e2a0, 0xc82010a030, 0x79, 0xc8201080f0, 0xaf63bd4c8601b7a9, 0x0, 0x0)
        /home/rev/go/src/github.com/mitchellh/hashstructure/hashstructure.go:193 +0x10fe
...

On line 193 of hashstructure.go we have:

parent := v.Interface()

In this case the Interface() method will panic if v is an unexported field.

According to the hashstructure docs:

Unexported fields on structs are ignored and do not affect the hash value.

But this is not accurate: while exported fields with the tag hash:"ignore" are
effectively ignored, unexported fields are visit()ed anyway, and when
Interface() is called on an unexported struct field the visit function panics.

You can replicate said error with the following snippet:

type structWithoutExportedFields struct {
  v struct{}
}

Hash(structWithoutExportedFields{}, nil)

We could call CanInterface() before calling Interface(), but we can also avoid the
whole situation by checking whether the field is unexported and skipping it, just
like what you do with hash:"ignore".
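
The CanInterface() guard suggested above can be sketched with the reflect package alone. interfaceOrSkip and visibleFieldCount are hypothetical helpers for illustration, not the library's actual fix.

```go
package main

import (
	"fmt"
	"reflect"
)

type structWithoutExportedFields struct {
	v struct{}
}

// interfaceOrSkip returns the field's value, or ok=false when reflection is
// not allowed to expose it (for example, an unexported struct field).
func interfaceOrSkip(field reflect.Value) (val interface{}, ok bool) {
	if !field.CanInterface() {
		return nil, false
	}
	return field.Interface(), true
}

// visibleFieldCount counts the fields of a struct value whose contents can be
// read through Interface() without panicking.
func visibleFieldCount(s interface{}) int {
	v := reflect.ValueOf(s)
	n := 0
	for i := 0; i < v.NumField(); i++ {
		if _, ok := interfaceOrSkip(v.Field(i)); ok {
			n++
		}
	}
	return n
}

func main() {
	fmt.Println(visibleFieldCount(structWithoutExportedFields{})) // 0
	fmt.Println(visibleFieldCount(struct{ A int }{A: 1}))         // 1
}
```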

Error for syscall.Errno

I encountered an error with syscall.Errno (defined as uintptr):

var err syscall.Errno
_, herr := hashstructure.Hash(err, hashstructure.FormatV2, nil)
binary.Write: invalid type syscall.Errno

I could test for that but I'm wondering what kind of restrictions I should assume for calling hashstructure.Hash.
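
The error text points at encoding/binary, which only encodes fixed-size values. uintptr, the underlying type of syscall.Errno, is not among them. The sketch below reproduces the failure with the standard library and shows one possible workaround: converting to uint64 before hashing. writeFixed is a hypothetical helper.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"syscall"
)

// writeFixed tries to encode data with encoding/binary and returns any error.
func writeFixed(data interface{}) error {
	var buf bytes.Buffer
	return binary.Write(&buf, binary.LittleEndian, data)
}

func main() {
	var errno syscall.Errno = 2
	// uintptr-based types have no fixed encoded size, so this fails.
	fmt.Println(writeFixed(errno) != nil) // true
	// Converting to a fixed-size integer first succeeds.
	fmt.Println(writeFixed(uint64(errno)) == nil) // true
}
```

The same restriction applies to plain int and uint values passed directly to binary.Write.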

ZeroNil has no effect on slices

No matter how the ZeroNil option is set, nil slices and empty initialized slices are always considered the same.

Drawing from the TestHash_equalNil test function, adding this case will break tests:

{
	Test{
		Str:   nil,
		Int:   nil,
		Map:   nil,
		Slice: nil,
	},
	Test{
		Str:   nil,
		Int:   nil,
		Map:   nil,
		Slice: make([]string, 0),
	},
	false,
	false,
},

This test sets ZeroNil to false (the first false), which should distinguish nil from []string{}, and therefore expects the hashes to differ (the second false).

Changing the second false to true (thereby expecting same hashes from the two tests) passes the test, which obviously shouldn't be the case.
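
To illustrate the distinction the test relies on: a nil slice and an empty slice both have length zero, so a walker that only hashes elements sees the same thing for both. What differs is reflect.Value.IsNil. The sketch below shows only that difference and makes no claim about how the library handles it. isNilSlice and sliceLen are hypothetical helpers.

```go
package main

import (
	"fmt"
	"reflect"
)

// isNilSlice reports whether s is a slice whose value is nil.
func isNilSlice(s interface{}) bool {
	v := reflect.ValueOf(s)
	return v.Kind() == reflect.Slice && v.IsNil()
}

// sliceLen returns the length of a slice value via reflection.
func sliceLen(s interface{}) int {
	return reflect.ValueOf(s).Len()
}

func main() {
	var nilSlice []string
	empty := make([]string, 0)

	fmt.Println(sliceLen(nilSlice), sliceLen(empty))     // 0 0
	fmt.Println(isNilSlice(nilSlice), isNilSlice(empty)) // true false
}
```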

bitwise XOR to combine hashes for set/struct fails miserably even for basic cases

We are using the hashstructure library in one of our projects. It produces collisions even in basic cases. See the example below:

package main
import (
	"github.com/mitchellh/hashstructure"
	"fmt"
)

func main() {

	type Item struct {
		name		string
		priority	int
	}

	type BitwiseXORTest struct {
		name	string
		items	[]Item		`hash:"set"`
	}

	testCase1 := BitwiseXORTest{
		"testCase",
		[]Item{{"FirstItem", 1}, {"SecondItem", 2}},
	}
	testCase2 := BitwiseXORTest{
		"testCase",
		[]Item{{"SecondItem", 1}, {"FirstItem", 2}},
	}

	hash1, _ := hashstructure.Hash(testCase1, nil)
	hash2, _ := hashstructure.Hash(testCase2, nil)

	if hash1 == hash2 {
                //Prints Hash Matches:  8963032841294998213 8963032841294998213
		fmt.Println("Hash Matches: ", hash1, hash2)
	}

}

In the above case, the generated hash is the same even though the priorities of the two items are swapped. This is probably because the bits touched by FNV64(priority) are not touched by the FNV hashes of the other fields. XOR also fails in the case of identical values.

Since we have all the values in the set/struct before hand, a better approach instead of bitwise XOR would be to

  1. calculate the hashes of all the elements in the set/struct
  2. sort them
  3. combine them in the same way as ordered slice

This approach will solve the above mentioned problems. Thoughts?
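
The three steps proposed above can be sketched as follows, using FNV-1a from the standard library as a stand-in hash. elemHash and sortedSetHash are hypothetical helpers, and the way the sorted hashes are fed into the final hasher is an assumption, not the library's ordered-slice routine.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
	"sort"
)

// elemHash hashes one string with FNV-1a.
func elemHash(s string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(s))
	return h.Sum64()
}

// sortedSetHash hashes a set of strings independently of their order:
// 1. hash every element, 2. sort the hashes, 3. combine them in that order.
func sortedSetHash(items []string) uint64 {
	hashes := make([]uint64, len(items))
	for i, it := range items {
		hashes[i] = elemHash(it)
	}
	sort.Slice(hashes, func(i, j int) bool { return hashes[i] < hashes[j] })

	h := fnv.New64a()
	var buf [8]byte
	for _, eh := range hashes {
		binary.LittleEndian.PutUint64(buf[:], eh)
		h.Write(buf[:])
	}
	return h.Sum64()
}

func main() {
	// Order does not matter.
	fmt.Println(sortedSetHash([]string{"a", "b"}) == sortedSetHash([]string{"b", "a"})) // true
	// Duplicates no longer cancel out, unlike with XOR.
	fmt.Println(sortedSetHash([]string{"a", "e", "e"}) == sortedSetHash([]string{"a", "d", "d"}))
}
```

The cost is an extra allocation and an O(n log n) sort per set, compared with a single pass for XOR.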

Different slice in struct result in the same hash

Here's how to reproduce it:

package main

import (
	"fmt"
	"github.com/mitchellh/hashstructure/v2"
)

type TempStruct struct {
	Strings []string `hash:"set"`
}

func main() {
	channelIds := &[]string{
		"66693f7ecdd2e6e2b6f30c18", "66693f7ecdd2e6e2b6f30c19",
	}

	structt := TempStruct{Strings: *channelIds}

	hashInt, _ := hashstructure.Hash(structt, hashstructure.FormatV2,
		&hashstructure.HashOptions{SlicesAsSets: true})
	channelIds1 := &[]string{
		"66759857b42b04c45ed0c6e6", "66759857b42b04c45ed0c6e7",
	}
	struct1 := TempStruct{Strings: *channelIds1}

	hashInt1, _ := hashstructure.Hash(struct1, hashstructure.FormatV2, nil)
	fmt.Println(hashInt)
	fmt.Print(hashInt1)
}

which will print:

4385387346745637338
4385387346745637338

Embedded structs not working

If I have something like:

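type MyBase struct {
    ID string
}

type MyThing struct {
    MyBase
    Name string
}

func doSomething() {
    // Fields promoted from an embedded struct cannot be set directly in a
    // composite literal; they must be set through the embedded type.
    thing1 := MyThing{
        MyBase: MyBase{ID: "one"},
        Name:   "me",
    }

    thing2 := MyThing{
        MyBase: MyBase{ID: "two"},
        Name:   "me",
    }

    _, _ = thing1, thing2 // hash both values here to compare the results
}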

The ID field is not considered when hashing, so in the above example thing1 and thing2 will have the same hash.
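
For reference, this is how reflection exposes an embedded struct: it appears as a single anonymous field, and a walker has to descend into it to reach ID. The sketch below does not diagnose the reported behavior. It only lists the leaf fields a walker would need to visit. fieldPaths and leafFields are hypothetical helpers.

```go
package main

import (
	"fmt"
	"reflect"
)

type MyBase struct {
	ID string
}

type MyThing struct {
	MyBase
	Name string
}

// leafFields lists field paths of a struct type, descending into embedded
// (anonymous) struct fields instead of treating them as a single leaf.
func leafFields(t reflect.Type, prefix string) []string {
	var out []string
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		if f.Anonymous && f.Type.Kind() == reflect.Struct {
			out = append(out, leafFields(f.Type, prefix+f.Name+".")...)
			continue
		}
		out = append(out, prefix+f.Name)
	}
	return out
}

// fieldPaths is a convenience wrapper that accepts a struct value.
func fieldPaths(v interface{}) []string {
	return leafFields(reflect.TypeOf(v), "")
}

func main() {
	fmt.Println(fieldPaths(MyThing{})) // [MyBase.ID Name]
}
```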

ExampleHash test failure

With Golang 1.6 I'm getting the following test failure:

=== RUN   ExampleHash 
--- FAIL: ExampleHash (0.00s) 
got: 
6691276962590150517 
want: 
8669281753891521642 
FAIL 

Cyclic structure support

package main

import (
    "fmt"

    "github.com/mitchellh/hashstructure"
)

func hash(i interface{}) {
    hash, err := hashstructure.Hash(i, nil)
    if err != nil {
        panic(err)
    }

    fmt.Printf("%v: %d\n", i, hash)
}

func main() {
    m := map[string]interface{}{}
    m["1"] = "1"
    // prints 0
    hash(m)
    m["2"] = "2"
    // prints 0
    hash(m)
    m["3"] = 3
    // does not print 0
    hash(m)
    m["m"] = m
    // fatal error: stack overflow
    hash(m)
}

Cyclic structures aren't supported. Is this expected, and is there any plan to add support?

Also note that in the first two cases the hash is the same (0) even though the map was updated.

IgnoreZeroValue panics for structs containing a map

Simple example:

package main

import (
	"hash/fnv"

	"github.com/mitchellh/hashstructure/v2"
)

type test struct {
	Field map[string]string
}

func main() {
	t := new(test)
	hashstructure.Hash(t, hashstructure.FormatV2, &hashstructure.HashOptions{
		Hasher:          fnv.New64(),
		IgnoreZeroValue: true,
	})
}

Output:

panic: runtime error: comparing uncomparable type map[string]string

goroutine 1 [running]:
github.com/mitchellh/hashstructure/v2.(*walker).visit(0xc000068ed0, 0x4b4460, 0xc00000e028, 0x16, 0x0, 0x1007f2a71c34418, 0x7f2a71c325d0, 0x0)
	/tmp/gopath678501233/pkg/mod/github.com/mitchellh/hashstructure/[email protected]/hashstructure.go:327 +0x1145
github.com/mitchellh/hashstructure/v2.Hash(0x4b4460, 0xc00000e028, 0x2, 0xc000068f50, 0xc000034778, 0xc000068f78, 0x4056e5)
	/tmp/gopath678501233/pkg/mod/github.com/mitchellh/hashstructure/[email protected]/hashstructure.go:128 +0x1cb
main.main()
	/tmp/sandbox448559066/prog.go:15 +0xb6

Playground link

Is this an assumed behavior, or can something be done to prevent the panic?
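
The panic message suggests an == comparison between interface values that hold maps, which Go rejects at run time. The sketch below reproduces that panic with the standard library and shows reflect.Value.IsZero as a zero check that is safe for maps. compareInterfaces and isZero are hypothetical helpers, and whether the library's zero-value check works this way is an assumption based only on the trace above.

```go
package main

import (
	"fmt"
	"reflect"
)

// compareInterfaces compares two interface values with == and converts the
// run-time panic raised for uncomparable dynamic types into an error.
func compareInterfaces(a, b interface{}) (equal bool, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("%v", r)
		}
	}()
	return a == b, nil
}

// isZero reports whether v holds the zero value of its type, without using ==.
func isZero(v interface{}) bool {
	rv := reflect.ValueOf(v)
	return !rv.IsValid() || rv.IsZero()
}

func main() {
	var m map[string]string

	_, err := compareInterfaces(m, m)
	fmt.Println(err != nil) // true: maps are not comparable

	fmt.Println(isZero(m))                            // true
	fmt.Println(isZero(map[string]string{"a": "b"})) // false
}
```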
