Comments (7)
Thanks for bringing up this issue and explaining your use case.
By requiring null termination we can greatly reduce the number of checks and branches, which makes parsing around 5% faster. When Glaze can use padding on your buffer (by using a non-const std::string) this performance is increased even more. The reduction in parsing time due to null termination and padding can be more than the time it takes to memcpy the buffer into an intermediate. I just say this to explain the design motivation and express how copying the data in your case could be faster than implementing a version that supports non-null termination.
But, I do agree that it is best to avoid copies and allocations, so this is a very valid concern. Plus, I'm sure your code would be simplified by not requiring these copies.
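For reference, the copy into an intermediate buffer mentioned above can be sketched as follows. This is only an illustration: `to_null_terminated` is a hypothetical helper, not part of Glaze, and it relies solely on the standard guarantee that a `std::string` is always followed by a `'\0'`.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a Glaze function): copy a chunk that is not
// null-terminated into an owning std::string. Since C++11 std::string
// guarantees that data()[size()] == '\0', so the copy is safe to hand to a
// parser that scans for a terminating null character.
inline std::string to_null_terminated(const char* data, std::size_t size)
{
    return std::string(data, size); // one allocation plus one memcpy-like copy
}
```

The cost is one allocation and one copy per message, which is the trade-off discussed in this comment.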
I'll have to think about this more. Could you provide an example structure that you want to parse and how it is being handled for 0-copy? I'm curious as to how TLS is related to JSON.
I'm very interested because I'm currently working on HTTP/websocket code and working Glaze into it.
from glaze.
The underlying buffer simply needs to contain a null character within its allocated memory somewhere after the message. So, intermediate string_views do not need null termination.
Right now Glaze checks input buffer std::string_views for null termination.
I’ll add an option that lets the developer turn off this check.
from glaze.
Some gory details about my environment:
In my case, TLS is just another filter (in my local terminology) in the chain. The chain is 0-copy up to this filter (from the wire) and from it (towards the app layer). The filter itself is not 0-copy, since it has to decrypt/encrypt.
But the rest of the chain on top of it is 0-copy (except when the data gets zipped :) ).
Okay, you get the idea: there are filters in the chain that require copying and massive alteration/mutation of the data. It is their nature. However, JSON parsing should be more of a read-only thing.
I cannot add anything to the buffers; all I can do is read from them or copy them. The reason is that the filters are not aware of each other. They just know they have neighbours, and they stimulate them with data (inbound and outbound). This makes things super fast and also encourages metaprogramming (no polymorphism in my case).
My humble and unsolicited recommendation for the glaze library is to just accept any kind of input as long as it has std::data() and std::size() overloaded for that type. This way there is no need for custom types to describe the input.
std::string, std::string_view, std::vector<>, std::array<>, etc. all have those overloads.
My custom buffer used inside the chain also has them. It is the most generic and frictionless way of accepting a chunk of memory for processing.
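A minimal sketch of that idea, assuming nothing about Glaze itself: `count_open_braces` is a hypothetical function that accepts any contiguous container for which `std::data()` and `std::size()` work, and walks it with a begin/end pointer pair.

```cpp
#include <array>
#include <cstddef>
#include <iterator>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical example (not Glaze API): count '{' characters in any input
// that supports std::data() and std::size(), without copying it.
template <class Buffer>
std::size_t count_open_braces(const Buffer& input)
{
    const auto* cursor = std::data(input);
    const auto* const end = cursor + std::size(input);
    std::size_t count = 0;
    for (; cursor != end; ++cursor) {
        // The cast also allows element types such as std::byte.
        if (static_cast<char>(*cursor) == '{') {
            ++count;
        }
    }
    return count;
}
```

The same template instantiates for `std::string`, `std::string_view`, `std::vector<char>`, `std::array<char, N>`, and any custom buffer type that provides `data()` and `size()` members.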
from glaze.
At the heart of walking the buffer sit 2 logical operations:
- advance
- compare something with something to see if we reached the end
Currently glaze probably does a pointer increment and then compares the dereferenced pointer with \0.
The suggestion I'm making is to keep the pointer advance as is, but compare the un-dereferenced pointer with an end pointer that is computed once, like this (pseudo code):
const char *pCursor = std::data(input);
const char *const pEnd = pCursor + std::size(input);
for (; pCursor != pEnd; ++pCursor) {
    // do stuff with *pCursor
}
from glaze.
If the data is null terminated, then when we need to check for the existence of a character, let's say a {, we can simply do if (*pCursor == '{'). But if we need to check against an end pointer, then we need to do if (pCursor != pEnd && *pCursor == '{').
Notice that we now have two comparisons and another boolean operation for every single character that we need to check. I'm just explaining this so you can see the performance reason for requiring null termination.
from glaze.
I plan to solve this problem in a few different ways for different use cases. One quick question: can you assume that the input JSON is valid? If we can assume valid input, we can make a more efficient solution.
I do plan on adding support for unknown JSON that could contain errors and is not null terminated, but this will take a bit more time. I'll keep this issue open until I've added the support.
from glaze.
I have the same problem with binary transactions. @stephenberry
#include <cstddef>
#include <glaze/glaze.hpp>
#include <iostream>
#include <string>
#include <vector>

int main()
{
    // Example users
    std::string filePath = "/Users/meftunca/Desktop/big_projects/son/users.beve";
    glz::json_t users = glz::json_t::array_t{glz::json_t{{"id", "58018063-ce5a-4fa7-adfd-327eb2e2d9a5"},
                                                         {"email", "[email protected]"},
                                                         {"password", "@cWLwgM#Knalxeb"},
                                                         {"city", "Sayre ville"},
                                                         {"streetAddress", "1716 Harriet Alley"}}};

    // Write the users to the file, using a std::byte buffer
    std::vector<std::byte> bytes;
    bytes.shrink_to_fit();
    auto ec = glz::write_file_binary(users, filePath, bytes);
    if (ec) {
        std::cerr << "Error: " << glz::format_error(ec) << std::endl;
        return 1;
    }

    // Read the file back into a second json_t, again through a std::byte buffer
    glz::json_t user2;
    std::vector<std::byte> bytes2;
    ec = glz::read_file_binary(user2, filePath, bytes2);
    if (ec) {
        std::cerr << "Error: " << glz::format_error(ec) << std::endl;
        return 1;
    }

    // Print every field of the first user
    std::cout << user2[0]["id"].get<std::string>() << std::endl;
    std::cout << user2[0]["email"].get<std::string>() << std::endl;
    std::cout << user2[0]["password"].get<std::string>() << std::endl;
    std::cout << user2[0]["city"].get<std::string>() << std::endl;
    std::cout << user2[0]["streetAddress"].get<std::string>() << std::endl;
    return 0;
}
Sorry, reading worked fine with std::string.
from glaze.