Giter VIP home page Giter VIP logo

Comments (7)

stephenberry avatar stephenberry commented on June 25, 2024

Thanks for bringing up this issue and explaining your use case.

By requiring null termination we can greatly reduce the number of checks and branches, which makes parsing around 5% faster. When Glaze can use padding on your buffer (by using a non-const std::string) this performance is increased even more. The reduction in parsing time due to null termination and padding can be more than the time it takes to memcpy the buffer into an intermediate. I just say this to explain the design motivation and express how copying the data in your case could be faster than implementing a version that supports non-null termination.

But, I do agree that it is best to avoid copies and allocations, so this is a very valid concern. Plus, I'm sure your code would be simplified by not requiring these copies.

I'll have to think about this more. Could you provide an example structure that you want to parse and how it is being handled for 0-copy? I'm curious as to how TLS is related to JSON.

I'm very interested because I'm currently working on HTTP/websocket code and working Glaze into it.

from glaze.

stephenberry avatar stephenberry commented on June 25, 2024

The underlying buffer simply needs to contain a null character within its allocated memory somewhere after the message. So, intermediate string_views do not need null termination.
Right now Glaze checks input buffer std::string_views for null termination.
I’ll add an option to turn off this checking by the developer.

from glaze.

shiretu avatar shiretu commented on June 25, 2024

Some gore details in my env:
In my case, the TLS is just another filter (in my local terminology) in the chain. This filter is also 0-copy up to it (from the wire) and from it (towards the app layer). Itself, it is not 0 copy, as it has to decrypt/encrypt.

But the rest of the chain on top of it is 0-copy. (except when it gets ziped :) )

Okay, you got the idea, there are filters in the chain that requires copying and massive alteration/mutation of the data. It is their nature. However, JSON parsing should be kind of a read-only thing.

I can not add anything to the buffers, all I can do is read from them or copy them. Reason is that the filters are not aware of each other. They just know they have neighbours and they stimulate them with data (inbound and outbound). It makes things super fast and also encourages metaprogramming (no polymorphism in my case).

My humble and unsolicited recommendation for glaze library is to just accept any kind of input as long as it has std::data() and std::size() overloaded for that type. This way there is no need for custom types to describe the input.

std::string, std::string_view, std::vector<>, std::array<>, etc they all have that overloaded.

My custom buffer used inside the chain also has that. It is the most generic and friction-less way of accepting a chunk of memory for processing.

from glaze.

shiretu avatar shiretu commented on June 25, 2024

At the heart of walking sits 2 logical operations:

  1. advance
  2. compare something with something to see if we reached the end

Currently glaze probably does a pointer increment and then compares the dereferencing of that pointer with \0.

The suggestion I'm making is to keep the pointer advance as is, but compare it un-dereferenced with another end pointer which is computed one-time by doing it like this: (pseudo code)

const char *pCursor=std::data(input);
const char *pEnd=pCursor+std::size(input);
while((pCursor++)!=pEnd){
 //do stuff
}

from glaze.

stephenberry avatar stephenberry commented on June 25, 2024

If data is null terminated then when we need to check for the existence of a character, let's say a {, then we can simply do if (*pCursor == '{'), but if we need to check against an end, then we need to do if (pCursor != pEnd && *pCursor == '{')

Notice that we now have two comparisons and another boolean operation for every single character that we need to check. I'm just explaining this so you can see the performance reason for requiring null termination.

from glaze.

stephenberry avatar stephenberry commented on June 25, 2024

I plan to solve this problem in a few different ways for different use cases. One quick question I have for you is would you be able to assume the input JSON is valid? If we can assume the input JSON is valid we can make a more efficient solution.

I do plan on adding support for unknown JSON that could have errors without null termination, but this will take a bit more time. But, I'll keep this issue open until I've added the support.

from glaze.

meftunca avatar meftunca commented on June 25, 2024

I have the same problem with binary transactions. @stephenberry

#include <cstddef>
#include <glaze/glaze.hpp>
#include <iostream>

int main()
{
   // Example Users
   std::string filePath = "/Users/meftunca/Desktop/big_projects/son/users.beve";
   glz::json_t users = glz::json_t::array_t{glz::json_t{{"id", "58018063-ce5a-4fa7-adfd-327eb2e2d9a5"},
                                                        {"email", "[email protected]"},
                                                        {"password", "@cWLwgM#Knalxeb"},
                                                        {"city", "Sayre ville"},
                                                        {"streetAddress", "1716 Harriet Alley"}}};

   std::vector<std::byte> bytes;
   bytes.shrink_to_fit();
   auto ec = glz::write_file_binary(users, filePath, bytes); // Write to file
   if (ec) {
      std::cerr << "Error: " << glz::format_error(ec) << std::endl;
      return 1;
   }

   glz::json_t user2;
   std::vector<std::byte> bytes2;
   ec = glz::read_file_binary(user2, filePath, bytes2);
   if (ec) {
      std::cerr << "Error: " << glz::format_error(ec) << std::endl;
      return 1;
   }

   std::cout << user2[0]["id"].get<std::string>() << std::endl;
   std::cout << user2[0]["email"].get<std::string>() << std::endl;
   std::cout << user2[0]["password"].get<std::string>() << std::endl;
   std::cout << user2[0]["city"].get<std::string>() << std::endl;
   std::cout << user2[0]["streetAddress"].get<std::string>() << std::endl;

   return 0;
}

Sorry, reading worked fine with std::string.

from glaze.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.