Giter VIP home page Giter VIP logo

Comments (9)

duckbrain avatar duckbrain commented on May 26, 2024 1

I think it should be enough to leave values implementing the binary.BinaryMarshaller interface untouched when normilizing to map[string]interface{} in order to preserve custom types.

Right you are. For some reason, I was trying to Marshal them there, which was making it a lot harder than it needs to be. PR is up.

Anyway, when documents are loaded from disk, they are simply unmarshalled from the msgpack encoding.

I see how that works and the need for the Document type for building the indexes and having a way to pass the values around, but don't personally see when I would get a document out of the database and not immediately convert it into a type-safe object.

I would like to optimize this in the future, by removing the json step, which is actually a way to simply convert between types

The "simple" way to handle that of course is custom reflection or something like structmap.

On the idea of optimizing for what I see as the common case. I messed around with the idea deferring the unmarshalling until requested or needed to use Get and the like. That's is not completely functional, nor suggested, but you're welcome to peruse/use.

https://github.com/duckbrain/clover/tree/feature/delay-decoding

from clover.

ostafen avatar ostafen commented on May 26, 2024 1

Thank you for the PR, which I just merged in the v2 branch.

The "simple" way to handle that of course is custom reflection or something like structmap.

On the idea of optimizing for what I see as the common case. I messed around with the idea deferring the unmarshalling until requested or needed to use Get and the like. That's is not completely functional, nor suggested, but you're welcome to peruse/use.

https://github.com/duckbrain/clover/tree/feature/delay-decoding

These are interesting ideas, which I think should be handled in a separate issue.
I think this one is solved, so I'm closing it

from clover.

ostafen avatar ostafen commented on May 26, 2024

Hi, @duckbrain, thanks for the suggestion. However, using a textual encoding, such as encoding.TextMarshaller, would have a severe impact on query performance, considering that, at query time, much time is wasted in document deserialization.
This is why, I switched from json to msgpack, which is a binary encoding and is thus more performant.
Supporting custom types is definetely a good idea, but I think this feature should not rely on some assumptions made about the underlying encoding.

from clover.

duckbrain avatar duckbrain commented on May 26, 2024

In that case, encoding.Binary*?

from clover.

ostafen avatar ostafen commented on May 26, 2024

encoding/binary only allows you to encode integer types. Anyway, the problem here is not about finding a better encoding for documents, since msgpack is already doing its job. Supporting custom types should not be tight to a particular encoding scheme, since it would be impossible to change it in the future. The way encoding libraries (among these, json and msgpack itself) allow the user to include custom types is by letting the user define custom Marshal() and Unmarshal() methods on the custom object, which serialize the custom type to byte slices.

from clover.

duckbrain avatar duckbrain commented on May 26, 2024

I was referring to the encoding.BinaryMarshaller and encoding.BinaryUnmarshaller interfaces instead of the text-based interfaces in the same package.

They serve as a standard for encoding and decoding custom types into a binary format, not an encoding scheme, but instead can be used by encoding formats like gob.

Standard library types also implement these standards, so it could remove some special handling for certain types.

from clover.

ostafen avatar ostafen commented on May 26, 2024

Oh, sorry about that, we were basically saying the same thing. Would you like to work on this feature?

from clover.

duckbrain avatar duckbrain commented on May 26, 2024

After looking into it. msgpack is already using encoding.BinaryMarshaller and encoding.BinaryUnmarshaller, but document.Document is normalizing everything into map[string]interface{} and that's what is failing to handle the binary encoding.

It looks like internal.Convert is using json to copy values from one object to another. In general, an normal lookup into an object goes from binary encoded msgpack -> map[string]interface{} -> json -> your object. To fix this Document would need to be able to use an object and reflect directly on it.

I don't see a way to do this without rewriting large swaths of the internals, and I'm not sure backwards compatibility can be preserved with the API.

from clover.

ostafen avatar ostafen commented on May 26, 2024

After looking into it. msgpack is already using encoding.BinaryMarshaller and encoding.BinaryUnmarshaller, but document.Document is normalizing everything into map[string]interface{} and that's what is failing to handle the binary encoding.

I think it should be enough to leave values implementing the binary.BinaryMarshaller interface untouched when normilizing to map[string]interface{} in order to preserve custom types.

It looks like internal.Convert is using json to copy values from one object to another. In general, an normal lookup into an object goes from binary encoded msgpack -> map[string]interface{} -> json -> your object. To fix this Document would need to be able to use an object and reflect directly on it.

Yes, but this only holds if you want to convert a Document to a custom struct object, using Unmarshal() (I would like to optimize this in the future, by removing the json step, which is actually a way to simply convert between types).
Anyway, when documents are loaded from disk, they are simply unmarshalled from the msgpack encoding.

from clover.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.