Isn't this an indexing issue in general, rather than just an issue with the update operator?
$ jq -n '.x'
null
$ jaq -n '.x'
Error: cannot index null
from jaq.
Ok, so this history is a bit complicated.
Let us consider a slightly different example: `0 | .x`. What does jq say to that?
$ jq -c -n '0 | .x'
jq: error (at <unknown>): Cannot index number with string "x"
Ok, so we cannot index numbers with strings. That is, there is no single number for which that operation makes sense.
Now comes the question: What sense does it make to index a null? I believe, none. That's why I throw an error there, because I believe it is most likely that the user made this operation by accident. (And that's why I also throw an error when updating against null.)
On the other hand, why does jaq yield `null` for something like `{a: 1} | .b`, for example? That's because there is nothing fundamentally wrong with indexing an object with a string, so even though the object does not contain the key "b", jaq returns `null` here instead of throwing an error.
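To make the distinction concrete, here is a minimal Python sketch of the two indexing policies discussed above. It is only a model of the described behaviour, not code from jq or jaq; the names `FilterError`, `index`, and `null_indexable` are made up for illustration, and JSON `null` is represented by Python's `None`.

```python
class FilterError(Exception):
    """Stands in for a runtime error raised by a filter (hypothetical name)."""


def index(value, key, null_indexable):
    """Model of `value | .key` for a string key.

    null_indexable=True models jq (null | .x yields null);
    null_indexable=False models the strict behaviour described above.
    """
    if isinstance(value, dict):
        # A missing key is not an error: the result is null (None).
        return value.get(key)
    if value is None and null_indexable:
        # Lenient policy: indexing null yields null.
        return None
    # Numbers, strings, arrays, booleans (and null under the strict policy).
    raise FilterError(f"cannot index {value!r} with {key!r}")


print(index({"a": 1}, "b", null_indexable=False))  # None
print(index(None, "x", null_indexable=True))       # None
```

Under both policies `index(0, "x", ...)` raises, which mirrors the `0 | .x` example; only the treatment of `null` differs.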
While I'm on the subject of update operators: jq has a history of inventing values out of thin air. For example:
$ jq -c -n '[1] | .[2] = 0'
[1,null,0]
Here, we update a single value of the array and implicitly grow the array at the same time.
I'll be frank here: I do not like this. Implicit behaviour like that, in my experience, bites you at some point, or is bad for performance.
That's why jaq is more strict than jq here and disallows that.
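As a rough illustration of the two policies, the following Python sketch models `.[i] = value` on an array. It is an assumption-laden model rather than either tool's implementation: `set_index` and its `grow` flag are hypothetical names, and the strict variant is modelled simply as raising an error.

```python
def set_index(array, i, value, grow):
    """Model of `array | .[i] = value` for a non-negative index i."""
    out = list(array)  # do not mutate the input
    if i < len(out):
        out[i] = value
        return out
    if not grow:
        # Strict policy: refuse to create elements that were never there.
        raise IndexError(f"index {i} is out of bounds for length {len(out)}")
    # Lenient policy: pad with null (None) up to index i, then assign.
    out.extend([None] * (i - len(out)))
    out.append(value)
    return out


print(set_index([1], 2, 0, grow=True))  # [1, None, 0]
```

With `grow=False`, the same call raises instead of silently inserting the `None` in the middle.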
Now comes the question: Do you think that I should document these differences (of which there might be quite a few) somewhere, and if so, where and in what form? In the README? In the tests? In the source?
from jaq.
I respect your design because this is your product. But README.md states "while preserving compatibility with jq in most cases", so users who encounter a behavior difference in the most basic filter may get confused. I don't think the behavior of jq is surprising. Adding notes on the strictness to README.md will help people understand how the tool differs from jq.
from jaq.
Ok, so I am currently drafting a section in the README about smaller differences between jaq and jq.
While doing that, I thought more about the case `null | .a = 0`. I can see that if you interpret `null` as a neutral element, this makes sense. For me the question is: Does this pattern appear in idiomatic jq code? Did you ever use this pattern?
from jaq.
I updated the README with a few differences in 0d00650, but this is surely not yet exhaustive.
from jaq.
Thanks for listing the differences from jq in README.md. Another difference worth noting is the behavior of updating with multiple paths, such as `{} | (.x, .y) = 0`. Applying this simple idea makes jq a much more powerful JSON processing tool IMO; `(.. | strings) |= gsub("pattern"; "replacer")` is very useful for replacing strings in JSON recursively. As for null indexing, I think there is a lot of code depending on this behavior, especially with the alternative operator: `[some complex indexing] // [default value]` (note that the alternative operator does not catch errors, so `null | .x // 1` yields a different result in jaq than in jq).
from jaq.
> Another difference worth noting is the behavior of updating with multiple paths, such as `{} | (.x, .y) = 0`. Applying this simple idea makes jq a much more powerful JSON processing tool IMO; `(.. | strings) |= gsub("pattern"; "replacer")` is very useful for replacing strings in JSON recursively.
I agree, this feature makes jq quite powerful. I have already documented the difference. In the long run, I think it would be nice for jaq to also support this, but so far I have not found a nice way to implement it; that is, a way that is performant and that does not duplicate lots of code.
> As for null indexing, I think there is a lot of code depending on this behavior, especially with the alternative operator: `[some complex indexing] // [default value]` (note that the alternative operator does not catch errors, so `null | .x // 1` yields a different result in jaq than in jq).
Fair point. So I just drafted some code that makes `null | .x` yield `null` instead of an error.
My current rationale for allowing this is that `null` can be interpreted as an empty array or as an empty object, depending on the circumstance. However, jq itself does not follow that interpretation of `null`, because running `null | .[]` in jq yields an error instead of an empty sequence! (Both `[] | .[]` and `{} | .[]` yield an empty sequence, so I would find it logical if `null | .[]` yielded that as well.)
So how would you go about resolving this conflict? Would you find it acceptable if `null | .[]` yielded the empty sequence, diverging from jq's behaviour? Or do you see another rationale why `null | .x` should not fail?
from jaq.
And on a related note: What should `null | .[] = 1` yield? With my proposed interpretation of `null`, I believe that the result should be `null`. But jq yields an error ...
from jaq.
Also, `null | keys` yields an error in jq, while I think that `[]` would be a perfectly fine result.
from jaq.
By the way, my rationale is also consistent with the behaviour that `[] + null` yields `[]` and `{} + null` yields `{}`.
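The "null as neutral element" reading of addition can be sketched in a few lines of Python. This is only a model restricted to `null`, arrays, and objects; the function name `add` is chosen for illustration, and `None` stands for JSON `null`.

```python
def add(a, b):
    """Model of `a + b` for null, arrays and objects only."""
    if a is None:
        return b  # null is the identity on the left
    if b is None:
        return a  # ... and on the right
    if isinstance(a, list) and isinstance(b, list):
        return a + b  # array concatenation
    if isinstance(a, dict) and isinstance(b, dict):
        return {**a, **b}  # object merge, right-hand keys win
    raise TypeError(f"cannot add {a!r} and {b!r} in this model")


print(add([], None))  # []
print(add({}, None))  # {}
```

In this model `null` behaves like `[]` next to an array and like `{}` next to an object, which is exactly the interpretation argued for above.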
from jaq.
Here is a somewhat more formal specification of what I believe needs to hold for `null` values:
I call the following values neutral: `false`, `0`, `""`, `[]`, and `{}`.
I say that a filter `f` yields a value if it does not yield an error.
Proposal for operations on `null`: An operation `f(null)` should yield a value v if and only if:
- for at least one neutral value x, `f(x)` does yield a value, and
- if there are two neutral values x1 and x2 such that `f(x1)` yields a value y1 and `f(x2)` yields a value y2, then y1 must be equal to y2.
For example, what about `null | .x` yielding `null`? In this case, we can consider `def f(n): n | .x`. Property 1 is satisfied because `f({})` (expanding to `{} | .x`) yields `null`. Property 2 is satisfied because there is no neutral value x other than `{}` that makes `f(x)` yield a value.
What about `null | .[]` yielding the empty sequence? In this case, `def f(n): n | .[]`. Property 1 is satisfied because `f([])` yields the empty sequence. Furthermore, property 2 is satisfied because `f({})` (expanding to `{} | .[]`) also yields the empty sequence. No other neutral value yields a value; for example, `false | .[]` yields an error.
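The proposal can be checked mechanically. Below is a small Python sketch of the rule, applied to toy models of `.x` and `.[]`. Everything here is an assumption made for illustration: the names `on_null`, `index_x`, `iterate`, and `FilterError` are hypothetical, filters are modelled as functions returning the list of their outputs, and results are compared via their JSON encoding so that `false` and `0` are not conflated (as Python's `==` would do).

```python
import json

NEUTRAL = [False, 0, "", [], {}]


class FilterError(Exception):
    """Stands in for a filter yielding an error."""


def index_x(v):
    # Toy model of `.x`: only objects can be indexed with a string.
    if isinstance(v, dict):
        return [v.get("x")]
    raise FilterError("cannot index")


def iterate(v):
    # Toy model of `.[]`: only arrays and objects can be iterated.
    if isinstance(v, list):
        return list(v)
    if isinstance(v, dict):
        return list(v.values())
    raise FilterError("cannot iterate")


def on_null(f):
    """Apply the proposal: decide what f(null) should do."""
    encoded = []
    for x in NEUTRAL:
        try:
            encoded.append(json.dumps(f(x), sort_keys=True))
        except FilterError:
            pass  # this neutral value does not yield a value
    # Property 1: some neutral value yields. Property 2: all that yield agree.
    if encoded and all(e == encoded[0] for e in encoded):
        return ("value", json.loads(encoded[0]))
    return ("error", None)


print(on_null(index_x))  # ('value', [None])
print(on_null(iterate))  # ('value', [])
```

The first result corresponds to `null | .x` yielding `null`, the second to `null | .[]` yielding the empty sequence.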
from jaq.
Making `null` iterable as well as indexable feels like very consistent semantics, but I'm afraid it changes the behavior of various filters:
- `def keys: [path(.[])[]];` emits `[]` on `null` (as you noticed, looks reasonable),
- `def to_entries: [keys[] as $k | {key: $k, value: .[$k]}];` emits `[]` on `null` (looks ok),
- `def map(f): [.[] | f]` emits `[]` on `null` (not sure whether that is ok),
- `def join($x): reduce .[] ...;` emits `""` on `null` (maybe ok?),
- `def recurse: recurse(.[]?);` (does not change the behavior on `null`?).
Your formal specification looks like a sophisticated way of reasoning about the changes, but I can give a counterexample with no explicit branching: `def f(n): n * n > n;`. Both `f(0)` and `f({})` emit `false`, but `f(null)` should throw an error (there's no multiplication on `null`). Another example is `def f(n): 0 * n > n;`, for which `f(0) == f("")`.
from jaq.
> Your formal specification looks like a sophisticated way of reasoning about the changes, but I can give a counterexample with no explicit branching: `def f(n): n * n > n;`. Both `f(0)` and `f({})` emit `false`, but `f(null)` should throw an error (there's no multiplication on `null`). Another example is `def f(n): 0 * n > n;`, for which `f(0) == f("")`.
Thank you for your counterexample! I will have to think about this a bit more ...
On a different note, I have implemented in ab5c764 some initial support for updating with non-path expressions! This enables support for `{} | (.x, .y) = 0`, for example. Currently supported on the left-hand side are paths, `|`, `,` and `if ... then ... else ... end`. I am optimistic that `recurse` / `..` should be possible to implement too, but I did not get around to it yet.
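One way an update over `,` and `|` could work without materialising paths is to treat each left-hand side as a function that takes a value and an update function. The Python sketch below illustrates that idea only; it is not taken from jaq's source, and `field`, `comma`, and `pipe` are hypothetical names.

```python
def field(key):
    """Updater for the path `.key`: apply f to that field of an object."""
    def run(value, f):
        out = dict(value)  # copy, so the input stays untouched
        out[key] = f(out.get(key))
        return out
    return run


def comma(left, right):
    """Updater for `left, right`: update via left, then via right."""
    def run(value, f):
        return right(left(value, f), f)
    return run


def pipe(left, right):
    """Updater for `left | right`: update inside whatever left selects."""
    def run(value, f):
        return left(value, lambda inner: right(inner, f))
    return run


# Models {} | (.x, .y) = 0
print(comma(field("x"), field("y"))({}, lambda _old: 0))  # {'x': 0, 'y': 0}

# Models {"a": {"b": 1}} | (.a | .b) |= . + 1
print(pipe(field("a"), field("b"))({"a": {"b": 1}}, lambda v: v + 1))  # {'a': {'b': 2}}
```

Because each combinator just composes functions, no list of paths is ever built; whether this matches the actual implementation is not something the sketch claims.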
The implementation of the update operation should be significantly faster than in jq, because jaq does not explicitly construct paths. Furthermore, jaq aims to give more "intuitive" answers; for example, `[1, 2, 3, 4, 5] | .[] |= empty` yields `[]` in jaq, whereas it yields `[2, 4]` in jq. The latter is because jq first builds the paths for `.[]`, which are 0, 1, 2, 3, 4 (the indices of the list). Then, it deletes the element at index 0, yielding [2, 3, 4, 5]. Then it deletes the element at index 1 of the new list, which is not 2, but 3! This yields [2, 4, 5]. Then, jq deletes the element at index 2, which is 5, yielding [2, 4]. Finally, it deletes the elements at indices 3 and 4, which leaves the list unchanged, because it no longer has elements at those indices. I consider this behaviour to be deeply flawed.
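The index-shifting effect is easy to reproduce. The following Python sketch simulates deleting along a precomputed list of indices and contrasts it with an update that visits each element exactly once. It is a simulation of the behaviour described here, not code from jq or jaq, and both function names are made up for illustration.

```python
def delete_via_precomputed_paths(items):
    """Compute all indices up front, then delete them one by one."""
    paths = list(range(len(items)))  # 0, 1, 2, 3, 4 for a 5-element list
    out = list(items)
    for i in paths:
        if i < len(out):
            del out[i]  # later indices now point at shifted elements
        # indices past the end are silently ignored
    return out


def update_each_element(items, f):
    """Visit each original element once; f returns a list of outputs."""
    return [y for x in items for y in f(x)]


print(delete_via_precomputed_paths([1, 2, 3, 4, 5]))     # [2, 4]
print(update_each_element([1, 2, 3, 4, 5], lambda x: []))  # []
```

Tracing the first function step by step gives [2, 3, 4, 5], then [2, 4, 5], then [2, 4], after which indices 3 and 4 no longer exist.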
Still, the current implementation of updating in jaq is quite new, so if you find some bugs there, I'll be happy to fix them. :)
from jaq.
That deletion issue of jq is well known (jqlang/jq#2051) and I suggested a fix two years ago (jqlang/jq#2133).
from jaq.
Oh, that is interesting to read! I think it is a pity that your pull request was never integrated ...
from jaq.
I'm closing this issue because this is a design decision of jaq rather than a compatibility issue.
from jaq.