prysmaticlabs / go-ssz
Simple Serialize, the canonical serialization library for the Ethereum Serenity project
License: MIT License
Example: https://github.com/golang/go/blob/master/src/encoding/json/fuzz.go
I imagine that we can drop in encoding/json fuzz test if we match their API for marshal/unmarshal.
See prysmaticlabs/prysm#2959 for an example.
Let's add benchmarks for ssz and compare against other serializers to understand how well or poorly ssz performs in Go.
Suggestion: add go-ssz to https://github.com/alecthomas/go_serialization_benchmarks.
New spec tests:
http_archive(
name = "eth2_spec_tests",
build_file_content = """
filegroup(
name = "test_data",
srcs = glob([
"**/*.yaml",
]),
visibility = ["//visibility:public"],
)
""",
sha256 = "a531804ac35d2398d37cfa755a686280d8cb3a9649e993e3cf89640f06191d5e",
url = "https://github.com/prysmaticlabs/eth2.0-spec-tests/releases/download/v0.8.1/base64_encoded_archive.tar.gz",
)
Failing for various reasons. Blocking prysmaticlabs/prysm#2312
The ssz spec has been changed to support only fixed-size arrays (not dynamic-size arrays).
Currently our ssz implementation only supports byte slices, which are akin to dynamic-size arrays.
We need to update the implementation and align our encoding/decoding/hash scheme with the spec.
Related source code:
https://github.com/prysmaticlabs/prysm/tree/master/shared/ssz
I currently see the following error when running tests with bazel in the repository:
--- FAIL: TestBlockCache_maxSize (0.05s)
hash_cache_test.go:127: Expected hash cache key size to be 10000, got 10025
FAIL
I suspect this has to do with a race condition in the hash cache package leading to the problem. Tagging @shayzluf on this.
This is a tracking issue to prevent misuse of the SigningRoot public function, which is explicitly used to compute the hash tree root of objects without their signature field. We should throw an error if the last field of the struct is not called Signature.
There is no concept of nil in the Simple Serialize specification. Instead, we should treat nil objects as instantiated, zero-valued objects. This is a tracking issue for a solution to this problem.
Currently, we treat nil in marshaling as []byte{}.
This is a tracking issue for a particular problem encountered when marshaling a beacon block body type as generated from YAML in Go.
SSZ is a straightforward approach to taking an arbitrary input and producing a slice of bytes from it. Since the input can take the form of structs and lists whose elements may be of variable size, while the output is a flattened slice of bytes, we need some way to determine item delimiters, or the indices at which variable-size and fixed-size elements occur. Let's define some terms first:
A type is said to be fixed size if it is a basic type (such as uintN or bool), or a fixed-length array/container composed entirely of fixed-size types.
A type is said to be variable size if it is a slice/list, or a container with at least one variable-size field.
Let's see some examples:
[32][]byte{} is considered variable size
[4]byte{} is considered fixed size
SomeItem struct { Slot uint64; Root [32]byte } is considered fixed size
AnotherItem struct { CustodyBitfield []byte; Slot uint32 } is considered variable size
and finally
type Item struct {
Root [32]byte
Slot uint64
NestedItem struct {
Slot uint64
}
}
is considered fixed size and
type Item struct {
Root [32]byte
Slot uint64
NestedItem struct {
Slot uint64
SomeField [32][]uint64
}
}
is considered variable size.
Consider the type [][]uint16{{1, 2}, {3}}, which will marshal into
[8 0 0 0 12 0 0 0 1 0 2 0 3 0]
This says that the first element begins at index 8 and the next element begins at index 12, since we need a way to determine where the variable-size parts of the marshaled bytes begin.
Offsets do not always happen at the beginning of the marshaled result, as they can be interwoven with fixed size elements. Consider the struct:
type Item struct {
ValidatorIndex uint64
NestedItem struct {
SomeField []byte
Slot uint64
}
}
The marshaled results will look something along the lines of
[ValidatorIndex, offset where SomeField begins, SomeField, slot]
All of our marshal functions return the size of the element we are marshaling. This will help us track the next offset. For example, say we are serializing:
type Item struct {
ValidatorIndex uint64
NestedItem struct {
SomeField []byte
Slot uint64
}
}
We first compute the total size of the item. Let's assume SomeField has 3 elements in it. Then the total size of the struct is 8 (for ValidatorIndex) + 4 (for SomeField's offset) + 3 (SomeField's bytes) + 8 (for Slot) = 23 bytes. We use this to preallocate a buffer we will write into during the marshaling process.
We set a current writing index = 0, and then marshal ValidatorIndex, which returns a size of 8, so we update the current writing index += 8. We then determine that SomeField is variable size, so we write an offset for it starting at index 8 (offsets are uint32, so we write 4 bytes). Then we marshal the byte slice, which returns length 3, so we update the current index += 4 + 3. Then we serialize the Slot field, which has size 8, so we update the current index += 8, and we are done. The same procedure applies to all sorts of complex types.
BeaconBlockBody struct {
Value struct {
RandaoReveal []byte `json:"randao_reveal" ssz:"size=96"`
Eth1Data struct {
DepositRoot []byte `json:"deposit_root" ssz:"size=32"`
DepositCount uint64 `json:"deposit_count"`
BlockHash []byte `json:"block_hash" ssz:"size=32"`
} `json:"eth1_data"`
Graffiti []byte `json:"graffiti" ssz:"size=32"`
ProposerSlashings []struct {
ProposerIndex uint64 `json:"proposer_index"`
Header1 struct {
Slot uint64 `json:"slot"`
ParentRoot []byte `json:"parent_root" ssz:"size=32"`
StateRoot []byte `json:"state_root" ssz:"size=32"`
BodyRoot []byte `json:"body_root" ssz:"size=32"`
Signature []byte `json:"signature" ssz:"size=96"`
} `json:"header_1"`
Header2 struct {
Slot uint64 `json:"slot"`
ParentRoot []byte `json:"parent_root" ssz:"size=32"`
StateRoot []byte `json:"state_root" ssz:"size=32"`
BodyRoot []byte `json:"body_root" ssz:"size=32"`
Signature []byte `json:"signature" ssz:"size=96"`
} `json:"header_2"`
} `json:"proposer_slashings"`
AttesterSlashings []struct {
Attestation1 struct {
CustodyBit0Indices []uint64 `json:"custody_bit_0_indices"`
CustodyBit1Indices []uint64 `json:"custody_bit_1_indices"`
Data struct {
BeaconBlockRoot []byte `json:"beacon_block_root" ssz:"size=32"`
SourceEpoch uint64 `json:"source_epoch"`
SourceRoot []byte `json:"source_root" ssz:"size=32"`
TargetEpoch uint64 `json:"target_epoch"`
TargetRoot []byte `json:"target_root" ssz:"size=32"`
Crosslink struct {
Shard uint64 `json:"shard"`
StartEpoch uint64 `json:"start_epoch"`
EndEpoch uint64 `json:"end_epoch"`
ParentRoot []byte `json:"parent_root" ssz:"size=32"`
DataRoot []byte `json:"data_root" ssz:"size=32"`
} `json:"crosslink"`
} `json:"data"`
Signature []byte `json:"signature" ssz:"size=96"`
} `json:"attestation_1"`
Attestation2 struct {
CustodyBit0Indices []uint64 `json:"custody_bit_0_indices"`
CustodyBit1Indices []uint64 `json:"custody_bit_1_indices"`
Data struct {
BeaconBlockRoot []byte `json:"beacon_block_root" ssz:"size=32"`
SourceEpoch uint64 `json:"source_epoch"`
SourceRoot []byte `json:"source_root" ssz:"size=32"`
TargetEpoch uint64 `json:"target_epoch"`
TargetRoot []byte `json:"target_root" ssz:"size=32"`
Crosslink struct {
Shard uint64 `json:"shard"`
StartEpoch uint64 `json:"start_epoch"`
EndEpoch uint64 `json:"end_epoch"`
ParentRoot []byte `json:"parent_root" ssz:"size=32"`
DataRoot []byte `json:"data_root" ssz:"size=32"`
} `json:"crosslink"`
} `json:"data"`
Signature []byte `json:"signature" ssz:"size=96"`
} `json:"attestation_2"`
} `json:"attester_slashings"`
Attestations []struct {
AggregationBitfield []byte `json:"aggregation_bitfield"`
Data struct {
BeaconBlockRoot []byte `json:"beacon_block_root" ssz:"size=32"`
SourceEpoch uint64 `json:"source_epoch"`
SourceRoot []byte `json:"source_root" ssz:"size=32"`
TargetEpoch uint64 `json:"target_epoch"`
TargetRoot []byte `json:"target_root" ssz:"size=32"`
Crosslink struct {
Shard uint64 `json:"shard"`
StartEpoch uint64 `json:"start_epoch"`
EndEpoch uint64 `json:"end_epoch"`
ParentRoot []byte `json:"parent_root" ssz:"size=32"`
DataRoot []byte `json:"data_root" ssz:"size=32"`
} `json:"crosslink"`
} `json:"data"`
CustodyBitfield []byte `json:"custody_bitfield"`
Signature []byte `json:"signature" ssz:"size=96"`
} `json:"attestations"`
Deposits []struct {
Proof [][]byte `json:"proof" ssz:"size=32,32"`
Data struct {
Pubkey []byte `json:"pubkey" ssz:"size=48"`
WithdrawalCredentials []byte `json:"withdrawal_credentials" ssz:"size=32"`
Amount uint64 `json:"amount"`
Signature []byte `json:"signature" ssz:"size=96"`
} `json:"data"`
} `json:"deposits"`
VoluntaryExits []struct {
Epoch uint64 `json:"epoch"`
ValidatorIndex uint64 `json:"validator_index"`
Signature []byte `json:"signature" ssz:"size=96"`
} `json:"voluntary_exits"`
Transfers []struct {
Sender uint64 `json:"sender"`
Recipient uint64 `json:"recipient"`
Amount uint64 `json:"amount"`
Fee uint64 `json:"fee"`
Slot uint64 `json:"slot"`
Pubkey []byte `json:"pubkey" ssz:"size=48"`
Signature []byte `json:"signature" ssz:"size=96"`
} `json:"transfers"`
} `json:"value"`
Serialized []byte `json:"serialized"`
Root []byte `json:"root" ssz:"size=32"`
} `json:"BeaconBlockBody,omitempty"`
In order to reproduce, check out core-spec-test locally, navigate to the spectests directory, and run go test . The error message is:
--- FAIL: TestYaml (0.20s)
panic: runtime error: slice bounds out of range [recovered]
panic: runtime error: slice bounds out of range
goroutine 36 [running]:
testing.tRunner.func1(0xc000144100)
/usr/local/go/src/testing/testing.go:792 +0x387
panic(0x13e7c80, 0x177ea80)
/usr/local/go/src/runtime/panic.go:513 +0x1b9
github.com/prysmaticlabs/go-ssz.marshalByteArray(0x13bd4a0, 0xc00126ae00, 0x197, 0xc0007f0000, 0xc1a, 0xc1a, 0xbca, 0xbca, 0x0, 0x0)
This occurs when marshaling the very last item of the beacon block body, which is the Signature of the Transfers slice. When marshaling SSZ, we preallocate a byte buffer of a certain size that we write into for each field. We have a size.go file that recursively determines how much we should allocate based on the size of every value within the input. This is confirmed to be correct and to match the length of the serialized output (3098 == 3098).
The last bytes of the expected output are:
... 61 113 55 224 117 108 211 134 98 146 124 185 89 161 212 98 220 247 168 37 111 33 206 114 171 141 150 253 180 155 190 162 183 0 103 95 235 228 35 224 131 224 254 140 154 224 78 33 193 85 115 128 36 125 51 78 20 88 183 98 135 108 228 62 33 124 115 126 62 84 117 250 32 79 42 89 167 52 154 236 133 181 236 119 190 153 252 151 113 199 245 221 69 22 191 78 77 13 151 3 117 255 20 247 24 42 2]
But our buffer right before the panic looks like:
... 61 113 55 224 117 108 211 134 98 146 124 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
This looks like it is missing the 16 extra bytes needed to fit the signature ([96]byte), as there are only 80 zeros.
Putting the text through diffchecker.com shows that we add several extra offsets (4, to be exact) right before the []struct fields such as []ProposerSlashing, []Transfer, etc.
Left: Expected, Right: Our Output
The primary suspicion is that something is being added when we are determining whether a struct field is fixed or variable size in the struct marshaler found in marshal.go. There is a condition in that function that goes as follows:
for i, f := range fields {
if !isVariableSizeType(val.Field(i), f.typ) {
fixedIndex, err = f.sszUtils.marshaler(val.Field(i), buf, fixedIndex)
if err != nil {
return 0, err
}
} else {
fmt.Printf("Writing %v\n", f.name)
nextOffsetIndex, err = f.sszUtils.marshaler(val.Field(f.index), buf, currentOffsetIndex)
if err != nil {
return 0, err
}
// Write the offset.
offsetBuf := make([]byte, BytesPerLengthOffset)
binary.LittleEndian.PutUint32(offsetBuf, uint32(currentOffsetIndex-startOffset))
copy(buf[fixedIndex:fixedIndex+uint64(BytesPerLengthOffset)], offsetBuf)
// We increase the offset indices accordingly.
currentOffsetIndex = nextOffsetIndex
fixedIndex += uint64(BytesPerLengthOffset)
	}
}
We keep track of the last index written for fixed-size elements and for variable-size elements. The code basically says: if a field is not a variable-size type, we marshal it normally and update the fixedIndex variable to the last written index of that marshaling event. If it is of variable size, we write an offset accordingly.
My suspicion is that block operations such as []Transfer, etc. should not be treated as variable size, since they have a limiting value such as MAX_TRANSFERS, which might be causing some of these problems. However, initial conversations with Cayman from ChainSafe have led me to believe that they should indeed be treated as variable-size elements.
The goal of this tracking issue is to determine other ways of debugging these problems and getting to the bottom of this small diff that prevents our tests from passing.
Right now our docs for SSZ are lacking. We should improve the overall docs of the repo and clearly document its gotchas in the main README.
Thanks @prestonvanloon for pointing this out. There are several places in our ssz repo where we could improve inefficient code and have a large impact on our Prysm runtime, such as excessive usage of typ.Field(i) with reflect, etc. Opening this issue for tracking.
Hi, as in the test function PerformSSZCheck:
go-ssz/spectests/generic_test.go
Line 677 in 6d5c9aa
When performing the valid check, the current logic checks:
deserialize(serialized_bytes) returns error?
serialize(deserialize(serialized_bytes)) =? serialized_bytes
However, as the spec-test says, we should check:
deserialize(serialized_bytes) =? yaml_value
serialize(yaml_value) =? serialized_bytes
So why not use the passed yamlPath to get the value?
On the other hand, the value written in the yaml file is also encoded (serialized in some way) in the SSZ format, so we need to decode (deserialize) it to get the original value. That seems strange and not independent. Is this the reason why you do not use it?
Refer to the test case generator code below:
This is a tracking issue for modifying all fmt.Errorf calls in the go-ssz repo to the errors.Wrapf pattern, which offers more benefits in terms of tracing error calls and their origin.
This PR tracks the overall progress of aligning our implementation of SSZ to the dev branch of the spec.
Move completed SSZ work in the spec-v0.6 branch to this repo. | #6
(the master branch was migrated rather than the spec-v0.6 branch)
selfSignedRoot
truncate_last
sszChunkSize to bytesPerChunk
bytesPerLengthOffset
bitsPerByte
Create new helper functions | #5
pack
merkleize
mix_in_length
mix_in_type
Modifications to encoding composite types
Modifications to existing helper functions
Modifications to merkleHash
BYTES_PER_CHUNK to a new pack function
merkleHash to merkleize
Update TreeHash to the new definition
There seems to be a panic occurring when some types are unmarshaled.
Discussed with @rauljordan that we should enhance our spectest suite to perform a roundtrip test.
Hopefully that will shine light on the issue and we can resolve it from there.
The godocs for this library are lacking any examples. It would be helpful for readers to see examples to understand the library.
In Go, arrays cannot be nil. However, we have pseudo-arrays in SSZ in struct types that can be specified as:
type Block struct {
Slot uint64
Root []byte `ssz-size:"32"`
}
According to SSZ, that Root field type is an array, not a slice. However, it is perfectly valid Go code to declare:
b := &Block{Slot: 5}
without filling in the Root field, which leads to a major problem that causes marshal/unmarshal to go haywire. The code cannot possibly understand how an array type could be nil, leading to faulty results and major problems at runtime.
Encouraging users not to do this is not an appropriate solution; it must be handled within the SSZ codebase.
This is a tracking issue to ensure our go-ssz implementation is up to date with the latest v0.8 specification. This specification has been tagged as frozen by the Ethereum researchers, meaning there will be no future feature additions or removals, and only critical bug fixes infrequently.
Currently the only way to find the tree root of an object is by calling HashTreeRoot on the object. However, if the object is already serialized, there is currently no way to find the tree root of that object using the serialized data. The expected API would look something like this:
encodedObj, err := Marshal(object)
root, err := SerializedTreeRoot(encodedObj)
This issue will be resolved when this method is implemented.
It would be nice if we could allow strings to be valid SSZ types by simply treating them as byte slices when marshaling/unmarshaling/computing the hash tree root in SSZ. This is a good first issue for anyone wanting to contribute.
Line 95 in f36c537
Example: 1099511627776 will overflow int
This may be a more general use case, but we specifically want to encode/decode byte slices at a fixed length using a struct tag.
Example:
type Something struct {
	Data []byte `ssz:"length=32"`
}
This struct tag specifies that the Data field should be treated as [32]byte, even though it is a variable-length slice.
Currently the only valid SSZ number type is uint64. However, we have some field types that would conveniently be represented as int32; the specific example is protobuf enum fields.
The suggestion here is to support int32 fields and encode/decode them like uint64. The resulting decoding may lose information for large numbers that do not fit in int32, but the enum use case I described has no risk of this (values range from 0 to 127).