
Comments (16)

itchyny commented on September 18, 2024

Isn't this an issue with indexing in general, rather than just with the update operator?

$ jq -n '.x'
null

$ jaq -n '.x'
Error: cannot index null

from jaq.

01mf02 commented on September 18, 2024

Ok, so this history is a bit complicated.
Let us consider a slightly different example: 0 | .x. What does jq say to that?

$ jq -c -n '0 | .x'
jq: error (at <unknown>): Cannot index number with string "x"

Ok, so we cannot index numbers with strings. That is, there is no single number for which that operation makes sense.
Now comes the question: What sense does it make to index a null? I believe, none. That's why I throw an error there, because I believe it is most likely that the user made this operation by accident. (And that's why I also throw an error when updating against null.)
On the other hand, why does jaq yield null for something like {a: 1} | .b, for example? That's because there is nothing wrong in principle with indexing an object with a string, so even though the object does not contain the key "b", jaq returns null here instead of throwing an error.
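A minimal Python sketch of the two indexing policies discussed above. It only models the behaviour described in this thread; `index_jq` and `index_strict` are hypothetical names and not functions of jq or jaq, and JSON null is represented as Python's `None`.

```python
# Toy model of indexing a JSON value with a string key.
# Both function names are made up for illustration.

def index_jq(value, key):
    """Lenient policy: null and objects can be indexed with a string."""
    if value is None:
        return None                # null | .x yields null
    if isinstance(value, dict):
        return value.get(key)      # a missing key yields null
    raise TypeError(f"cannot index {type(value).__name__} with string {key!r}")


def index_strict(value, key):
    """Strict policy: only objects can be indexed with a string."""
    if isinstance(value, dict):
        return value.get(key)      # a missing key still yields null
    raise TypeError(f"cannot index {type(value).__name__} with string {key!r}")


print(index_jq(None, "x"))          # None
print(index_strict({"a": 1}, "b"))  # None
```

Under both policies a number cannot be indexed; the only difference in this model is whether null is rejected as well.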

While I'm on the subject of update operators: jq has a history of inventing values "out of thin air". For example:

$ jq -c -n '[1] | .[2] = 0'
[1,null,0]

Here, we update a single value of the array and implicitly grow the array at the same time.
I'll be frank here: I do not like this. Implicit behaviour like that, in my experience, bites you at some point, or is bad for performance.
That's why jaq is more strict than jq here and disallows that.
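To make the difference concrete, here is a small Python sketch of the two update policies. It is a model under stated assumptions, not either tool's implementation: the function names are hypothetical, only non-negative indices are considered, and the strict variant is assumed to signal the problem with an error.

```python
# Toy model of `.[i] = value` on an array, for non-negative i only.
# Both function names are made up for illustration.

def set_index_growing(items, i, value):
    """Lenient policy: pad the array with null (None) up to index i."""
    result = list(items)
    if i >= len(result):
        result.extend([None] * (i - len(result) + 1))
    result[i] = value
    return result


def set_index_strict(items, i, value):
    """Strict policy: refuse to update an index that does not exist yet."""
    if not 0 <= i < len(items):
        raise IndexError(f"index {i} out of bounds for array of length {len(items)}")
    result = list(items)
    result[i] = value
    return result


print(set_index_growing([1], 2, 0))  # [1, None, 0]
```

The lenient variant reproduces the `[1,null,0]` result shown above; the strict variant rejects the same update.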

Now comes the question: Do you think that I should document these differences (of which there might be quite a few) somewhere, and if so, where and in what form? In the README? In the tests? In the source?


itchyny commented on September 18, 2024

I respect your design because this is your product. But README.md states "while preserving compatibility with jq in most cases", so users who encounter a behavior difference in the most basic filter may get confused. I don't think the behavior of jq is surprising. Adding notes on the strictness to README.md will help people understand how the tool differs from jq.


01mf02 commented on September 18, 2024

Ok, so I am currently drafting a section in the README about smaller differences between jaq and jq.
While doing that, I thought more about the case null | .a = 0. I can see that if you interpret null as a neutral element, this makes sense. For me the question is: Does this pattern appear in idiomatic jq code? Did you ever use this pattern?


01mf02 commented on September 18, 2024

I updated the README with a few differences in 0d00650, but this is surely not yet exhaustive.


itchyny commented on September 18, 2024

Thanks for listing the differences from jq in README.md. Another difference worth noting is the behavior of updating with multiple paths; {} | (.x, .y) = 0. The application of this simple example makes jq a much more powerful JSON processing tool IMO; (.. | strings) |= gsub("pattern"; "replacer") is very useful for replacing strings in JSON recursively. As for null indexing, I think there is a lot of code depending on this behavior, especially with the alternative operator; [some complex indexing] // [default value] (note that the alternative operator does not catch errors, so null | .x // 1 yields a different result in jaq than in jq).


01mf02 commented on September 18, 2024

> Another difference worth noting is the behavior of updating with multiple paths; {} | (.x, .y) = 0. The application of this simple example makes jq a much more powerful JSON processing tool IMO; (.. | strings) |= gsub("pattern"; "replacer") is very useful for replacing strings in JSON recursively.

I agree, this feature makes jq quite powerful. I have already documented the difference. In the long run, I think it would be nice for jaq to also support this, but so far I have not found a nice way to implement it; that is, a way that is performant and that does not duplicate lots of code.

> As for null indexing, I think there is a lot of code depending on this behavior, especially with the alternative operator; [some complex indexing] // [default value] (note that the alternative operator does not catch errors, so null | .x // 1 yields a different result in jaq than in jq).

Fair point. So I just drafted some code that makes null | .x yield null instead of an error.
My current rationale for allowing this is that null can be interpreted as an empty array or as an empty object, depending on the circumstances. However, jq itself does not follow that interpretation of null, because running null | .[] in jq yields an error instead of an empty sequence! (Both [] | .[] and {} | .[] yield an empty sequence, so I would find it logical if null | .[] yielded that as well.)
So how would you go about resolving this conflict? Would you find it acceptable if null | .[] yields the empty sequence, diverging from jq's behaviour? Or do you see another rationale why null | .x should not fail?


01mf02 commented on September 18, 2024

And on a related note: What should null | .[] = 1 yield? With my proposed interpretation of null, I believe that the result should be null. But jq yields an error ...


01mf02 commented on September 18, 2024

Also, null | keys yields an error in jq, while I think that [] would be a perfectly fine result.


01mf02 commented on September 18, 2024

By the way, my rationale is also consistent with the behaviour that [] + null yields [] and {} + null yields {}.


01mf02 commented on September 18, 2024

I have a somewhat more formal specification of what I believe should hold for null values:

I call the following values neutral: false, 0, "", [], and {}.
I say that a filter f yields a value if it does not yield an error.

Proposal for operations on null: An operation f(null) should yield a value v if and only if:

  1. for at least one neutral value x, f(x) does yield a value, and
  2. if there are two neutral values x1 and x2 such that f(x1) yields a value y1 and f(x2) yields a value y2, then y1 must be equal to y2.

For example, what about null | .x yielding null? In this case, we can consider def f(n): n | .x. Property 1 is satisfied because f({}) (expanding to {} | .x) yields null. Property 2 is satisfied because there is no other neutral value x than {} that makes f(x) yield a value.

What about null | .[] yielding the empty sequence? In this case, def f(n): n | .[]. Property 1 is satisfied because f([]) yields the empty sequence. Furthermore, property 2 is satisfied because also f({}) (expanding to {} | .[]) yields the empty sequence. No other neutral value yields a value; for example, false | .[] yields an error.
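The proposal can be checked mechanically. The following Python sketch is only a model: filters are represented as functions that return the list of their outputs or raise `TypeError`, the names `index_x`, `iterate`, and `proposed_null_result` are hypothetical, and outputs are compared via their JSON serialisation so that Python's `False == 0` does not blur the distinction between the neutral values.

```python
import json

# The neutral values from the proposal above.
NEUTRAL_VALUES = [False, 0, "", [], {}]


def index_x(value):
    """Model of `.x`: only objects can be indexed with a string."""
    if isinstance(value, dict):
        return [value.get("x")]
    raise TypeError("cannot index")


def iterate(value):
    """Model of `.[]`: only arrays and objects can be iterated."""
    if isinstance(value, list):
        return list(value)
    if isinstance(value, dict):
        return list(value.values())
    raise TypeError("cannot iterate")


def proposed_null_result(f):
    """Return the outputs that f(null) should yield under the proposal,
    or None if the proposal says f(null) should be an error."""
    outputs = []
    for x in NEUTRAL_VALUES:
        try:
            outputs.append(json.dumps(f(x), sort_keys=True))
        except TypeError:
            continue  # this neutral value does not yield a value
    # Property 1: at least one neutral value yields a value.
    # Property 2: all neutral values that yield a value agree.
    if outputs and all(o == outputs[0] for o in outputs):
        return json.loads(outputs[0])
    return None


print(proposed_null_result(index_x))  # [None], i.e. a single null
print(proposed_null_result(iterate))  # [], i.e. the empty sequence
```

In this model, `.x` satisfies both properties only through `{}`, while `.[]` satisfies them through both `[]` and `{}`, matching the two examples above.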


itchyny commented on September 18, 2024

Making null iterable as well as indexable feels like very consistent semantics, but I'm afraid it changes the behavior of various filters:

  • def keys: [path(.[])[]]; emits [] on null (as you noticed, looks reasonable),
  • def to_entries: [keys[] as $k | {key: $k, value: .[$k]}]; emits [] on null (looks ok)
  • def map(f): [.[] | f]; emits [] on null (not sure whether that's ok)
  • def join($x): reduce .[] ...; emits "" on null (maybe ok?)
  • def recurse: recurse(.[]?); (does not change the behavior on null?).

Your formal specification looks like a sophisticated way of reasoning about the changes, but I can give a counterexample with no explicit branching; def f(n): n * n > n; both f(0) and f({}) emit false, but f(null) should throw an error (there's no multiplication on null). Another example is def f(n): 0 * n > n; for which f(0) == f("").


01mf02 commented on September 18, 2024

> Your formal specification looks like a sophisticated way of reasoning about the changes, but I can give a counterexample with no explicit branching; def f(n): n * n > n; both f(0) and f({}) emit false, but f(null) should throw an error (there's no multiplication on null). Another example is def f(n): 0 * n > n; for which f(0) == f("").

Thank you for your counterexample! I will have to think about this a bit more ...

On a different note, I have implemented in ab5c764 some initial support for updating with non-path expressions! This enables support for {} | (.x, .y) = 0, for example. Currently supported on the left-hand side are paths, the pipe operator (|), the comma operator (,), and if ... then ... else ... end. I am optimistic that recurse / .. should be possible to implement too, but I did not get around to it yet.

The implementation of the update operation should be significantly faster than in jq, because jaq does not explicitly construct paths. Furthermore, jaq aims to give more "intuitive" answers; for example, [1, 2, 3, 4, 5] | .[] |= empty yields [] in jaq, whereas it yields [2, 4] in jq. The latter is because jq first builds the paths for .[], which are 0, 1, 2, 3, 4 (the indices of the list). Then, it deletes the 0th element, yielding [2, 3, 4, 5]. Then it deletes the 1st element of the new list, which is not 2, but 3! This yields [2, 4, 5]. Then, jq deletes the 2nd element, which is now 5, yielding [2, 4]. Finally, it deletes the 3rd and 4th elements, which leaves the list unchanged, because it no longer has elements at those indices. I consider this behaviour to be deeply flawed.
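The path-by-path deletion described above can be reproduced with a few lines of Python. This is a sketch of the described procedure, not jq's or jaq's actual code; both function names are hypothetical, and the second function assumes that an update producing no output removes the element and that only the first output is kept otherwise.

```python
def delete_paths_one_by_one(items):
    """Compute the indices up front, then delete them one at a time,
    each against the list that has already been shortened."""
    result = list(items)
    for index in range(len(items)):  # paths 0, 1, ..., n-1 are fixed in advance
        if index < len(result):      # deleting a missing index changes nothing
            del result[index]
    return result


def update_each(items, f):
    """Apply f to every element without building paths; f returns a list
    of outputs, and an empty list (like `empty`) drops the element."""
    result = []
    for item in items:
        outputs = f(item)
        if outputs:
            result.append(outputs[0])
    return result


print(delete_paths_one_by_one([1, 2, 3, 4, 5]))      # [2, 4]
print(update_each([1, 2, 3, 4, 5], lambda _: []))    # []
```

The first function ends with [2, 4] because every deletion shifts the remaining elements to the left while the precomputed indices keep counting up.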

Still, the current implementation of updating in jaq is quite new, so if you find some bugs there, I'll be happy to fix them. :)


itchyny commented on September 18, 2024

That deletion issue of jq is well known (jqlang/jq#2051) and I suggested a fix two years ago (jqlang/jq#2133).


01mf02 commented on September 18, 2024

Oh, that is interesting to read! I think it is a pity that your pull request was never integrated ...


itchyny commented on September 18, 2024

I'm closing this because it is a design decision of jaq rather than a compatibility issue.

