The zeropb from danhhz

specialize offsets map

The offsets map is currently implemented with FastIntMap since it was convenient, but FastIntMap has to either be tuned in one general way or else multiple versions code generated to get the nice compiler specializations for all the masks. We could make the happy zero-allocation case more likely if we specialize it at code generation time using our knowledge of the message fields.

For densely packed fields an array is sufficient. It'd be easiest to start with a []uint32 but the values are offsets into a message and so we likely don't need the full range of 4294967296, so that would use more space than necessary. Maybe a followup to make it smaller? We'd have a similar problem to FastIntMap then where we'd have to have all the various sizes we want to use in the zeropb package (or maybe codegen the crazy ones into the generated proto code, eek). We also have to decide how to handle offsets that are larger than whatever range we pick. If we want any valid input to work, we need to fall back to allocating a map and we've basically re-derived FastIntMap (this is exactly why I used it in the first place).

My initial thought is starting with []uint32 and eating the struct size. That's definitely the simplest. It's also not as bad as it sounds. Most of the fields that would be generated other protobuf libraries are 4-8 bytes per field, which means best case is the same size struct and worst case is 2x. Sigh, maybe uint16 really is the way to go, seems like 65536 bytes should be enough. Instead of the map fallback, decode could error if given a buffer longer than max offset, which seems like a reasonable limitation.

If we do specialize densely packed fields, we need a heuristic and also a second impl for non-densely packed fields. My initial idea for the heuristic is if there are no more holes than fields. That is, if max field id <= 2 * num fields. The second impl could initially always be a map (unfortunate but predictable performance). We could add a proto option to let the user override the heuristic if they're okay with the tradeoff.

Both offset impls would be structs that happen to expose the same set of methods. Then the codegen would simply put one or the other in the generated message struct and the rest of the codegen would be the same. No interfaces so no allocations.

There's probably things we could eventually do to improve the second impl if we cared (though I initially do not). It could be an array (of a size computed by some heuristic over the field types) and a slice view of that array (or I guess it'd have to be two of these?). Then we keep all the fields sorted and do everything via binary search, appending to add new parsed fields. That would be allocation free until we got too many fields, which isn't bad. It also doesn't have the unpredictability of FastIntMap's flip from small to large.

support encoding

First, an assumption that encoding should use the same structs as decoding. These structs don't have a native go field per message field, they just have an embedded byte slice and accessors. I think it's possible to do encoding entirely within this slice in response to calling mutator functions. In short, always maintain a serialized version of the message in the buffer (also keeping the offsets up to date).

setting a previously unset single field: append the field tag and data to the end of the buffer, update the offsets map
unsetting a set single field: copy everything in the buffer forward over the old data and remove from the offset map, adjust all affected offsets in the map
setting a previously set single field: if the field tag + data is exactly the same size as the old tag + data, overwrite; if not, unset the field and then proceed as if setting an unset field
appending to a repeated field: append the field tag and data to the end of the buffer, set the offsets map if it's unset (if the repeated field was previously empty)
unsetting a repeated field: find each instance of the field in the buffer and compact the rest of the buffer over it, adjust all affected offsets in the map
bulk overwrite of repeated field: if non-empty, unset; then proceed to append them all as above

Most modifications to simple fields will likely be the same size and get the fast case. Ditto appending to a repeated field. Anything that ends up compacting will be slower, but I'd guess most uses of protobufs don't repeatedly modify the fields anyway.

Unintended benefit of all this: Encode is free!

Note that this even works with unknown fields, which are just passed through unmodified (though perhaps earlier in the buffer than they were.)

danhhz / zeropb Goto Github PK

zeropb's People

Contributors

Stargazers

Watchers

zeropb's Issues

specialize offsets map

support encoding

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent