Giter VIP home page Giter VIP logo

Comments (13)

justend29 avatar justend29 commented on May 27, 2024 2

I believe an adaptor to provide an array-like interface to a pair_t would flip the behavior while using the existing readers & writers.
It could wither be used directly, or the compile time option would wrap the value in it.

Anyway, I'll look into it and file a PR

from glaze.

justend29 avatar justend29 commented on May 27, 2024 1

That all seems understandable, and most of this debate seems to be coming from just looking at different sides of the same coin.
JSON is really limited, which is half its beauty, but it doesn't map perfectly to language constructs. I still think default map and pairs should be JSON objects, but choosing the alternative for non-string keys seems palatable. Maybe @stephenberry can choose what a more sensible default is for non-string like keys. Still, the issue of ints and enums being acceptable object keys.

Thanks for sharing the perspective. I think adding a modifier/wrapper to read/write object-like types as arrays could be in the cards, for sure. That doesn't break anything, is opt-in, and supports both trains of thought.

from glaze.

justend29 avatar justend29 commented on May 27, 2024

This is currently the case but I don't believe it has to be. I'm looking into it.

from glaze.

justend29 avatar justend29 commented on May 27, 2024

Pull request #509 introduces support for reading arbitrary types as JSON object keys. It also provides a fix for writing arbitrary types as JSON object keys. I'd like to get a review on it before merging.

As for your example, it works with the PR but there are a few things to note:

  1. Types must provide a comparison operator to be used as the key_type in std::maps. One is therefore needed for type Foo.
  2. Your intention may be to detect a parse error, but the sample JSON would always fail to parse. std::map is read/written as a JSON object, making it {\"values\":{\"3\":7,\"7\":3,\"73\":37}} or, as a raw string literal, R"({"values":{"3":7,"7":3,"73":37}})".

Here is an altered version of your example:

#include <cstdint>
#include <cassert>
#include <compare>

#include <glaze/core/meta.hpp>
#include <glaze/core/macros.hpp>
#include <glaze/json/quoted.hpp>

struct Foo {
  int64_t value;
  [[nodiscard]] std::strong_ordering operator<=>(const Foo&) const noexcept = default;
};


template<>
struct glz::meta<Foo> {
  static constexpr auto value{ &Foo::value };
};


struct Bar {
  std::map<Foo, int> values;
	 
  GLZ_LOCAL_META(Bar, values);
};


int main() {
  const auto parsed = glz::read_json<Bar>(R"({"values":{"3":7,"7":3,"73":37}})");
  assert(parsed.has_value());

  const Bar expected{
    .values = {{{3}, 7}, {{7}, 3}, {{73}, 37}}
  };
  assert(expected.values == parsed->values);
}

from glaze.

Haatschii avatar Haatschii commented on May 27, 2024

Hey,

thank you very much for looking into this and for directly providing a pull request! I am not sufficiently accustomed with glaze to give a reasonable review on the internal changes, but I think that it would solve my problem and is everything I currently need. However I think it might makes sense to discuss the general serialization syntax, as raised in your second point (you are totally correct with your first point btw, my comparison operator must have gotten lost when creating the minimal example):

You say that a std::map is read/written as a JSON object and not as an array of pairs (as in my example). I totally see how this is obvious choice for maps with string-like keys, but I would argue that the array of pairs is better suited for the general case. The reasoning being:

  1. From an abstract perspective the goal should be to map between the c++ object and the JSON representation preserving as many attributes as possible. As std::map is an ordered container, it seems reasonable to account for this ordering in the serialization. This is achieved by the array-of-pairs, as by specification a JSON array is an ordered sequence of zero or more values. A JSON object on the other hand is an unordered set of name/value pairs, and can not be expected to reliably preserve the ordering. Apart from the philosophical aspect, it also makes sense from a performance perspective to guarantee the ordering, as for example perfect emplace_hint can always be used for deserialization. This is even more the case for std::flat_map.

  2. The JSON specification only allows strings as object keys. This means that in order to serialize a map with non-string keys, the keys have to be serialized to an escaped string (which is what you do in the pull request,a s far as I understand). The result is having a JSON object where some strings are themself JSON objects, which seems like a burden on code complexity (in glaze), performance (escaping everything) and readability of the JSON. In contrast using the array-of-pairs approach there is no need to create json strings, as the keys can directly be written as JSON objects.

  3. There is real trouble when trying to use the JSON object approach for a std::multimap, as non-unique keys are a grey area of the JSON specification at best. In contrast the array-of-pairs is well defined also for duplicate keys.

That being said, I am not sure if there is the best way to handle map-like types and it might makes sense to support several options. There also doesn't seem to be a consensus on this problem among other c++ JSON libraries, on how to handle map-like types with non-string keys.

from glaze.

justend29 avatar justend29 commented on May 27, 2024

@Haatschii

I'm really pleased that this will help resolve your problem. Addressing your points above:

  1. here are a few points against changing the default serialization of maps from JSON objects to arrays. You can reference this issue where some decisions about preferring objects were made.

    1. a std::map is in sorted order by key, whereas an array is not - it merely preserves the order of its elements. As such, there is no guarantee that some serialized JSON array is in the sorted order of the deserialized std::map. The order argument conflates two types of ordering.
    2. Where the serialized array is not in sorted order by key, the performance gain is not realized. Nevertheless, the same performance gain would be realized using JSON objects with entries in sorted order by key. It's not specific to arrays, despite arrays inherently offering better access times.
    3. Although neither JSON objects nor arrays enforce the uniqueness of keys required by maps (except for multimaps), arrays don't lend themselves against repetition. The keys in an object should be unique.
    4. Serializing pairs as arrays would require additional checks for lengths; only a length of 2 would be acceptable. This change from array to object representation of pairs was performed in the PR for the referenced issue above
    5. This would be a breaking change to existing users.
    6. I personally think most would believe maps are more akin to JSON objects than arrays.
  2. You're right, having non-string keys as JSON object keys is kind of unnatural. I was just trying to allow Glaze to support it. Should you want to serialize to an array of arrays instead of a JSON object, glaze will already support this. You just need to serialize an input-range of non-pairs instead of pairs. An array, tuple, or whatever could be used. This can be easily formed with C++20 ranges.

    Extending your original submission to serialize to arrays can be done as follows:

      const glz::obj bar_with_arr_x{
        "x", expected.values | std::views::transform([](const auto& pair){
                                  return std::tie(pair.first, pair.second); 
                               })
      };
    
      assert(glz::write_json(bar_with_arr_x) == R"({"x":[[3,7],[7,3],[73,37]]})");
  3. JSON objects should have unique keys, but it's not enforced by the serialized JSON, it's enforced by the implementation. In the case of a map, the first key value is used, all others are ignored. In the case of a multi-map, they're all used.

Considering the idea of additionally allowing arrays of arrays may be worthwhile, possibly in the form of a modifier/wrapper, and could be useful for multi-maps.
In any case, Glaze supports non-string keys in maps if that's best for the use case.

If a map isn't right for your use case, there are other options. you could look into an array type + std::sort + std::unique, a heap with std::make_heap, std::priority_queue, or others. Custom serialization/deserialization overloads are always an option, too.
What qualities of the map are you interested in? Sorting, unique keys, or both?

from glaze.

Haatschii avatar Haatschii commented on May 27, 2024

Hey, thank you for your thoughts. My actual use case, which made me stumble over the inability to use arbitrary types as keys, is really rather simple. Think of Foo in my example as a time class, with the int64_t representing microseconds since epoch (and no other data). For this particular application it really doesn't matter whether the map is represented as an array of pairs or as an JSON object, as there is no loss writing the int64_t, as a string as there is no escaping necessary and it remains readable. Apart from that I do work with a lot map-like structures, however I substituted std::map by a vector of (custom-)pairs almost everywhere for performance reasons. Should I decide to migrate those to glaze as well, I'll probably just write a custom read/write function achieving the array of pairs syntax. That shouldn't be too complicated.

All in all I think your patch will work fine for me. I just thought I'll raise this issue now, because once the patch is merged, changing the way maps with custom keys are serialized in the future would be a breaking change, while right now you can freely choose the format. So for me the following is just for the sake of argument. Feel free to ignore it if you have already made up your mind. (-:

  1. .

    1. I don't think I agree. In contrast to the JSON object the JSON array encodes order, which gives meaning to said order. What meaning that is, depends on the API specification, but for a std::map it would be reasonable to use the order of the map. Of course in an untrusted environment there no guarantee the order is actually correct, but this would also be true for any other type like std::vector or std::list.
    2. I guess this is true and for the not so uncommon use case that writing and reading is both done by glaze it would probably make sense the use emplace_hint also when using the JSON object format. For me personally it would just not feel right to assume that keys are in an expected order when the specification specifically states that they are unordered. But that's probably just me.
    3. True, but assuming you want to handle std::map and std::multimap the same way, I think this is still better as it is completely within the JSON specifications, just not enforcing the uniqueness, while duplicate keys are outside of the specifications.
    4. In an untrusted environment checking for length 2 would be recommended, yes. However, when using a JSON object, I would also advice to check that the object contains only a single key. Not sure if there's much of a difference. Generally serializing a std::pair as key and value of an JSON object has the same issue discussed in point 2, i.e. that the key can only be a string.
    5. Not necessarily. One JSON library I used before (I think DAW JSON, but not sure), had a different behavior depending on the key type: serializing maps with string-like keys to JSON objects and all other key types to an array of pairs. As custom types were not supported until now, implementing it like that would not be breaking. Although I agree that it might be more confusing if number types, although not string-like, would be handled like strings and not like other classes.
    6. I think this really depends on the kind of key you have in mind. For string-like keys I totally agree, but having a JSON key, which is itself an escaped JSON serialization of an complex object seems obscure.
  2. No worries, as I said, the patch should work well for me. But yes, I really don't like having complex objects serialized and escaped to act as JSON keys. However as this is already the case for the serialization of std::pair, I guess there is no way around it without breaking changes.

  3. Well, this is only true for as long as all reading/writing is done by the same implementation. As duplicate keys are not (really) part of the JSON specifications, other implementations may chose to handle them differently or not all. Users of such implementation would then not be able to parse a std::multimap written this way.

from glaze.

stephenberry avatar stephenberry commented on May 27, 2024

This is a great discussion. In my thinking, pair and dictionary types should default to JSON objects, because we already have std::array, std::vector, std::tuple, glz::array, and glz::arr that default to arrays. However, I see some of the benefits of using an array in this context, particularly because the quotes are excessive. However, I do like the quotes because they indicate well that these are keys, and they match the serialization of a key in JSON. I think that @justend29's implementation should be the default. And, I think we could add a compiler time option that would encode/decode pairs as arrays, which would help your use case @Haatschii. I think we'll come across others who want std::pair as an array in some contexts, so we can just make this a compile time option.

from glaze.

justend29 avatar justend29 commented on May 27, 2024

I agree completely.

from glaze.

stephenberry avatar stephenberry commented on May 27, 2024

@justend29 Definitely, provide an adapter (what we typically call in glaze a wrapper). Just follow how wrappers like glz::quoted or any of the other wrappers work. This way the user can specify it in their glz::meta and they don't have to use custom glz::read<...> and glz::write<...> calls.

from glaze.

Haatschii avatar Haatschii commented on May 27, 2024

Thank you for your responses. I think that providing an adapter like glz::quoted to switch the serialization of std::pair and std::map to arrays would be a great solution! This would also allow users to select the serialization format for each class separately and mixing them if desired.

Not sure if I understand the suggestion for a compile time option correctly. That would be a compile flag switching the format globally? I guess that I would personally stand to profit from such an option, as I obviously prefer the array format in most cases. However if there are those adapters I am not sure such option would even be necessary and worth the additional maintenance and testing needed to maintain such a global option.

from glaze.

justend29 avatar justend29 commented on May 27, 2024

@Haatschii
By compile-time option, we're not referring to build option that's part of the build-system, but instead a flag that alters the behaviour at compile-time. glz::opts is a struct that collects these compile-time flags. readers/writers change their behaviour based on the flags in there at compile-time.

from glaze.

Haatschii avatar Haatschii commented on May 27, 2024

Ah yes, that makes more sense. Thanks for the explanation.

from glaze.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.