Comments (11)

tomchristie commented on July 16, 2024

Noting that JSON requires \x00 - \x1F to always be escaped.

http://www.ietf.org/rfc/rfc4627.txt
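
To make the rule concrete, here is a minimal sketch using Python's standard `json` module (the choice of Python and the helper name `parses` are only for illustration). It shows a control character being escaped on output, and a raw control character inside a string being rejected by the default strict parser.

```python
import json

# Encoding: the control character U+0001 is written as a \u escape.
encoded = json.dumps("a\x01b")
print(encoded)  # "a\u0001b"


def parses(text: str) -> bool:
    """Hypothetical helper: True if `text` is accepted by the strict parser."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


raw = '"a\x01b"'         # string literal containing a literal U+0001
escaped = '"a\\u0001b"'  # the same value with the control character escaped

print(parses(raw))      # False
print(parses(escaped))  # True
```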

from http-api-design.

geemus commented on July 16, 2024

@tomchristie great question, and one I don't think we happened to run across in the wild, so I don't think I've given it too much thought just yet. At a glance, I suppose I might lean toward (assuming both are valid) the one that involves the least processing/change, i.e. if we don't have a good reason to force an encode on our side (and the resultant decode on the client), it seems like it would be easier to save ourselves the trouble of doing it (as well as remembering that it is required). That said, I haven't run into this before, so I fully suspect there are nuances that I'm missing. Could you elaborate a bit on the pro-encoding side of things and/or let me know what you think about the leave-it-be argument I've roughly set forward? Thanks!

tomchristie commented on July 16, 2024

the one that involves the least processing/change

That'd be a valid option. What that actually means might depend on which frameworks and/or JSON encoding libs you're using for your various services. For example, see the differences between Rails 3.2.13 and Rails 4.0.

Could you elaborate a bit on the pro-encoding side of things and/or let me know what you think about the leave-it-be argument I've roughly set forward?

The option that requires least thought is clearly to use escaped characters.

However it's nicer to users if the API presents un-escaped characters - that way command line tools such as curl which simply dump the response body will be outputting properly rendered text.
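
As a small illustration of the two options, here is a sketch using Python's standard `json` module (the payload is made up). The same value is serialized once with non-ASCII characters escaped and once left as raw Unicode; both decode to the same data, so the difference is only in how the body reads when dumped by a tool like curl.

```python
import json

payload = {"name": "café"}

escaped = json.dumps(payload)                        # ASCII-only output
unescaped = json.dumps(payload, ensure_ascii=False)  # raw Unicode output

print(escaped)    # {"name": "caf\u00e9"}
print(unescaped)  # {"name": "café"}

# Both forms decode to the same value, so a JSON client sees no difference.
same = json.loads(escaped) == json.loads(unescaped) == payload
print(same)  # True
```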

It's not clear to me what further subtleties there might be around un-escaped characters, though. For example it's probably still a good idea to escape control characters in that case, as per this example. If leaving as UTF-8, what ranges would still need encoding? The current set as used in Rails might be an okay choice, but it's not obvious.

(Note: I'm only using Rails as an example here as it's the one place I've noticed where there's actually been some kind of conscious design decision)

geemus commented on July 16, 2024

Yeah, I think it is definitely helpful to reference places where some effort already went into making this decision. Un-escaped does seem likely to work in the most places (without extra work), at least at a guess. Control characters are an odd case, but perhaps there would be cases where an API would want to include them for curl or something? It would be pretty weird, but maybe possible. You could argue that in most cases you probably shouldn't be including these in API responses in the first place I suppose.

Anyway, seems like we are still leaning more toward unescaped/raw if I'm not mistaken. I'd maybe even say that we could just leave it as that generic recommendation and defer whether or not we need to say it should be escaped for some narrower character set or not until somebody more explicitly runs up against the question so we can have clearer examples/inputs to work from. What do you think?

tomchristie commented on July 16, 2024

Coming back to this shortly, but in the meantime referencing Python's behaviour with ensure_ascii=False...

The standard JSON escape chars are escaped, using their short forms... (i.e. not the hex version)

  • "
  • \
  • /
  • \b
  • \f
  • \n
  • \r
  • \t

The following control characters are escaped to the hex notation:

  • \x00 - \x1F (Except where listed above)
  • \u2028 and \u2029

Everything else is regular unicode.

(Linking to simplejson because I can't find a link to the Python source code ATM... https://github.com/simplejson/simplejson/blob/master/simplejson/encoder.py#L21)
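
For anyone wanting to check this locally, here is a minimal sketch against the standard library `json` module rather than simplejson (the helper name `dump` is just for illustration). One caveat worth hedging: as far as I can tell, the standard library encoder leaves `/`, U+2028 and U+2029 as-is, so those parts of the lists above should be verified against whichever library you actually use.

```python
import json


def dump(text: str) -> str:
    """Serialize one string, keeping non-ASCII characters unescaped."""
    return json.dumps(text, ensure_ascii=False)


print(dump('say "hi"'))   # "say \"hi\""   (short form)
print(dump("a\\b"))       # "a\\b"         (short form)
print(dump("tab\there"))  # "tab\there"    (short form)
print(dump("bell\x07"))   # "bell\u0007"   (hex notation)
print(dump("naïve"))      # "naïve"        (regular Unicode)
print(dump("a/b"))        # "a/b"          (left alone by the stdlib)

# The stdlib encoder also leaves the line/paragraph separators raw.
separator_is_raw = "\u2028" in dump("x\u2028y")
print(separator_is_raw)   # True
```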

tomchristie commented on July 16, 2024

Relevant link on \u2028 and \u2029... http://stackoverflow.com/questions/2965293/javascript-parse-error-on-u2028-unicode-character
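
If an encoder does not escape these two characters itself, one possible workaround is to post-process its output. The sketch below assumes Python's standard `json` module; `dumps_js_safe` is a hypothetical helper name, not part of any library.

```python
import json


def dumps_js_safe(value) -> str:
    """Hypothetical helper: raw Unicode output with U+2028/U+2029 escaped."""
    text = json.dumps(value, ensure_ascii=False)
    # In json.dumps output these characters can only appear inside string
    # literals, so a plain textual replacement keeps the document valid.
    return text.replace("\u2028", "\\u2028").replace("\u2029", "\\u2029")


out = dumps_js_safe({"msg": "line\u2028sep"})
print(out)  # {"msg": "line\u2028sep"}  (with a literal backslash-u escape)

# The escaped form still decodes to the original value.
round_trip = json.loads(out)
```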

Do shout me down if I'm being too verbose :) Seems best to put these things down for future reference.

tomchristie commented on July 16, 2024

Not sure if relevant or not, but JSLint on 'unsafe' chars that should be escaped (in the context of a browser)... http://www.jslint.com/lint.html#unsafe

tomchristie commented on July 16, 2024

Okay, my thoughts after all that:

I guess I'd probably recommend either as okay, but Unicode as preferred, due to being more user-friendly when displayed. It would probably be okay to underspecify the required escapes, perhaps simply noting that control characters do need escaping, as per the JSON spec.

Alternatively, consider this as out-of-scope and offer no guidance one way or the other. (Also not unreasonable)

geemus commented on July 16, 2024

I'd be up for recommending Unicode + escaped control characters. I think that seems reasonable and a good thing to note (thanks for the detailed references/notes). Would you be up for a pull request with the related verbiage? Thanks!

tomchristie commented on July 16, 2024

Sure thing, consider it on my todo list.
Feel free to nudge me again on here if not done in the next week or so.

geemus commented on July 16, 2024

@tomchristie no worries, certainly no hurry here. In the meantime we can definitely refer people to this discussion if it comes up. Will just be nice to polish it up and get it in there when you have a moment. Thanks!
