Comments (13)

jkeiser commented on May 17, 2024

Another design point: SAX APIs inherently force branches, because they call a different function for each kind of value. A branch-free-friendly API that passes you a uniform object for everything, and lets you decide whether to branch on the type of object encountered, might significantly improve performance for at least some users. For example, if they want to ignore all numbers and booleans because their file doesn't have them, they can assume all values are strings and use an error mask for anomalies.
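
To make the idea concrete, here is a minimal sketch of a uniform event object consumed without branching on its type. Every name in it (`value_type`, `value_event`, `consume_as_strings`) is hypothetical and chosen for illustration; none of this is simdjson's actual API.

```cpp
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <vector>

// Hypothetical names throughout: this is an illustration, not simdjson's API.
enum class value_type : std::uint8_t { string = 0, number = 1, boolean = 2, null = 3 };

// One uniform record for every value the parser encounters.
struct value_event {
  value_type type;
  std::string_view raw;  // the value's bytes in the input
};

struct string_totals {
  std::size_t bytes = 0;         // total length of all values, treated as strings
  std::uint32_t error_mask = 0;  // bit N (N != 0) is set if a value of type N appeared
};

// Handle every value as a string without branching on its type. Anything that
// was not a string is recorded in the mask, to be checked once at the end.
inline string_totals consume_as_strings(const std::vector<value_event>& events) {
  string_totals t;
  for (const value_event& e : events) {
    t.bytes += e.raw.size();
    const auto tag = static_cast<std::uint32_t>(e.type);
    t.error_mask |= (1u << tag) & ~1u;  // drop bit 0 (string), keep anomalies
  }
  return t;
}
```

A caller would run the loop over the whole document and inspect `error_mask` a single time afterwards, instead of testing the type of each value as it arrives.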

from simdjson.

michaeleisel commented on May 17, 2024

+1, this would be very helpful for improving speed when creating custom objects from JSON.

michaeleisel commented on May 17, 2024

I'd be happy to contribute to this. What would be the best way to do it? For example, it could use templating with write_tape to decide if it should write to a tape (DOM) or call a callback function (streaming)?
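
One possible reading of the templating idea, as a toy sketch: the walker takes its output sink as a template parameter, so the same loop can either fill a tape or invoke a user callback. The types `tape_sink` and `callback_sink` and the function `toy_stage2` are assumptions made up for this example and do not reflect simdjson's internals.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical sketch; these types are not simdjson's internals.
// DOM-style sink: record every value on a tape.
struct tape_sink {
  std::vector<std::string_view> tape;
  void on_value(std::string_view v) { tape.push_back(v); }
};

// Streaming sink: hand each value straight to user code.
template <typename Callback>
struct callback_sink {
  Callback cb;
  void on_value(std::string_view v) { cb(v); }
};

// A toy "stage 2" that walks pre-split tokens. The sink type is a template
// parameter, so on_value is resolved statically and can be inlined.
template <typename Sink>
void toy_stage2(const std::vector<std::string_view>& tokens, Sink& sink) {
  for (std::string_view t : tokens) {
    sink.on_value(t);
  }
}
```

Calling `toy_stage2(tokens, tape)` with a `tape_sink` gives the DOM-style behavior, while passing a `callback_sink` wrapping a lambda gives the streaming one, with no runtime switch between the two.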

lemire commented on May 17, 2024

I think it would involve making it so that our stage 2 calls custom functions instead of the ones that build the tape. It probably won't work "as is". I think that the easiest thing would be to make it "practical" for people to build their own custom stage 2.

jkeiser commented on May 17, 2024

Hmm, I think the check-in we just made gets us pretty close to this. We should make up a JSON SAX-like interface on top of it now!

... if we do provide this interface, it'd be nice if the user had the choice whether to parse numbers and strings, or not ... or even to use their own routines to do so. We could expose a parse_string and parse_number or a callback to make it easy for those who just want it parsed for them.

jkeiser commented on May 17, 2024

It would give them the option of being even more restrictive about the JSON--requiring integers only for a field, for example, and eschewing any actual float processing.
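
As a rough illustration of that option, the sketch below assumes the parser hands the caller the raw text of a number and leaves the parsing decision to them. `parse_integer_only` is a hypothetical helper, not a simdjson function, and it is not a full JSON number validator (it would accept leading zeros, for instance).

```cpp
#include <charconv>
#include <cstdint>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper, not a simdjson function: the caller receives the raw
// number text and decides whether, and how strictly, to parse it.
inline std::optional<std::int64_t> parse_integer_only(std::string_view raw) {
  std::int64_t out = 0;
  const char* first = raw.data();
  const char* last = raw.data() + raw.size();
  const auto res = std::from_chars(first, last, out);
  // Reject anything that is not entirely an integer, such as "1.5" or "1e3",
  // so no floating-point processing ever happens for this field.
  if (res.ec != std::errc{} || res.ptr != last) {
    return std::nullopt;
  }
  return out;
}
```

A caller that never invokes the helper pays nothing for number parsing, and a caller that wants floats can substitute its own routine.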

lemire commented on May 17, 2024

> We should make up a JSON SAX-like interface on top of it now!

Yes!!! I'm eager to write benchmarks... :-)

lemire commented on May 17, 2024

> ... if we do provide this interface, it'd be nice if the user had the choice whether to parse numbers and strings, or not ... or even to use their own routines to do so. We could expose a parse_string and parse_number or a callback to make it easy for those who just want it parsed for them.

That sounds like a good idea actually... If it can be done with templates and at zero cost...

jkeiser commented on May 17, 2024

For example, it could pass you an object with an enum telling you the type of thing encountered.

jkeiser commented on May 17, 2024

One issue here: if we provide a SAX-like, callback-based API, we either have to give up inlining or let the user compile their callbacks for each supported architecture. Giving up inlining of the callbacks would leave a lot of performance on the floor. Letting the user create architecture-specific versions of their code is not a bad idea anyway, but it requires design work: our multi-architecture compilation and selection is pretty flexible now, but it is limited to simdjson methods.

lemire commented on May 17, 2024

@jkeiser Astute observation.

jkeiser commented on May 17, 2024

@lemire I believe this is covered by on demand, but I'll let you be the judge of that.

lemire commented on May 17, 2024

Moved to 0.6 and closing.
