Comments (46)
In the meantime, here's a simple tsv hack that takes advantage of string substitution (and will break if you have any tabs embedded in strings):
$ echo '{"one":1,"two":"x"}' | jq --raw-output '"\(.one)\t\(.two)"'
1 x
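As a cross-language illustration of why embedded tabs break this hack (a hypothetical Python sketch, not part of jq), escaping tab characters before joining is one way to keep columns intact:

```python
# Naive TSV by joining on tabs, mirroring the jq interpolation hack above:
# an embedded tab inside a value silently shifts every later column.
def naive_tsv(values):
    return "\t".join(str(v) for v in values)

# Escaping backslashes, tabs, and newlines before joining keeps column
# boundaries unambiguous; consumers must unescape on the way back out.
# (Hypothetical helpers for illustration - jq itself does none of this.)
def escaped_tsv(values):
    def esc(v):
        s = str(v)
        return s.replace("\\", "\\\\").replace("\t", "\\t").replace("\n", "\\n")
    return "\t".join(esc(v) for v in values)

print(naive_tsv([1, "x\twith tab"]))    # the tab blurs the column boundary
print(escaped_tsv([1, "x\twith tab"]))  # the tab survives as literal \t
```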
from jq.
This is working but undocumented in master, with a slightly strange syntax that may change. @csv is a builtin function that reformats its argument (which must be an array) as a CSV string. So, something like:
./jq -r '@csv'
will output 1,2,"hello, world" when given [1,2,"hello, world"].
from jq.
One issue is that order is significant for csv, but not for JSON fields: we can't rely on the order of fields in JSON. This can be solved by specifying the mapping from JSON named fields to csv positional fields, by constructing an array of those fields, using [.date,.count,.title]:
input: { "date": "2011-01-12 13:14", "count": 17, "title":"He's dead, Jim!" }
jq -r '[.date,.count,.title] | @csv'
"2011-01-12 13:14",17,"He's dead, Jim!"
The csv output has slightly overzealous quoting (more than your target), but does no harm.
A second issue is that you want to apply this to an array of objects, not just one. The .[] operator streams each item of an array in turn:
jq -r '.[] | [.date, .count, .title] | @csv'
"2011-01-12 13:14",17,"He's dead, Jim!"
"2011-01-13 21:30",4711,"What do you mean, ""dead""?"
"2011-01-14 00:07",,"Dead!"
Finally, you also want a header, stating the csv field names at the top. The easiest way to do this is literally (we need parentheses because , binds tighter than |):
jq -r '["date", "count", "title"], (.[] | [.date, .count, .title]) | @csv'
"date","count","title"
"2011-01-12 13:14",17,"He's dead, Jim!"
"2011-01-13 21:30",4711,"What do you mean, ""dead""?"
"2011-01-14 00:07",,"Dead!"
But duplication is bad. We can avoid repeating the same list of field names by reusing the header array to look up the fields in each object - but this looks confusing to me; I wonder if there's a clearer way:
jq -r '["date", "count", "title"] as $fields| $fields, (.[] | [.[$fields[]]]) | @csv'
EDIT here it is as a function, with a slightly nicer field syntax, using path() (but you still need to remember the -r switch - maybe a --csv flag does make sense):
def csv(fs): [path(null|fs)[]] as $fields| $fields, (.[] | [.[$fields[]]]) | @csv;
USAGE: csv(.date, .count, .title)
PS: if the input is not an array of objects, but just a sequence of objects (your second, JSONish, input), then we could just omit the .[] - but then we can't get the header at the top. It's probably best to convert it to an array, using the --slurp/-s option (or put [] around it, if it's generated within jq).
PPS: You can also get an array of the fields of an object using keys, but it returns them in alphabetical ("unicode codepoint") order. Of course, one could create a tool that does retain the ordering of JSON fields - but it's not part of the JSON spec, and JSON tools and web APIs don't seem to do it. (Though I think it would be convenient in some ways, e.g. ordering the string/number fields first makes JSON much more human-readable when deeply nested.)
from jq.
Looks like .rates | to_entries | .[] | [.key,.value] | @csv is what you want, then.
from jq.
Hrm. This would definitely be useful, but there are a lot of annoying special cases here. I'm not sure what should happen if the output's not a set of json objects, or if they have different fields. Also, CSV quoting rules are a nightmare, but I suppose I wouldn't have to parse the CSV afterwards...
from jq.
Just make it Excel compliant...
from jq.
+1
Just require the output to be arrays, and map those arrays directly to tsv/csv. Quoting shouldn't be too hard since you're generating it and not parsing it (as you point out).
from jq.
+1
from jq.
That sort of sounds like something else. If unclear, my wish is to get the csv output:
date,count,title
2011-01-12 13:14,17,"He's dead, Jim!"
2011-01-13 21:30,4711,"What do you mean, \"dead\"?"
2011-01-14 00:07,,Dead!
…as opposed to either of these two currently supported JSON/JSONish representations of the same data:
[ { "date": "2011-01-12 13:14", "count": 17, "title":"He's dead, Jim!" }
, { "date": "2011-01-13 21:30", "count": 4711, "title":"What do you mean, \"dead\"?" }
, { "date": "2011-01-14 00:07", "title":"Dead!" }
]
…or:
{ "date": "2011-01-12 13:14", "count": 17, "title":"He's dead, Jim!" }
{ "date": "2011-01-13 21:30", "count": 4711, "title":"What do you mean, \"dead\"?" }
{ "date": "2011-01-14 00:07", "title":"Dead!" }
Is that what you are shooting for, or something else that's useful for some other purpose?
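For comparison, Python's standard csv module (an outside illustration, not jq) produces almost exactly this minimal-quoting style; the one difference is that RFC-4180-style writers double embedded quotes rather than backslash-escaping them:

```python
import csv
import io

# The rows from the examples above, header first. Values are taken
# verbatim from the JSON; the empty string stands in for the missing
# "count" of the third record.
rows = [
    ["date", "count", "title"],
    ["2011-01-12 13:14", 17, "He's dead, Jim!"],
    ["2011-01-13 21:30", 4711, 'What do you mean, "dead"?'],
    ["2011-01-14 00:07", "", "Dead!"],
]

buf = io.StringIO()
# QUOTE_MINIMAL quotes only fields containing the delimiter, the quote
# character, or a newline - fields like Dead! stay bare.
csv.writer(buf, quoting=csv.QUOTE_MINIMAL).writerows(rows)
print(buf.getvalue())
```

Note that the embedded quotes come out as ""dead"" (doubled) rather than \"dead\" (backslash-escaped); most CSV consumers expect the doubled form.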
from jq.
While json attribute order can be arbitrary, I would be happy if (in situations where the jq command line does not stipulate an ordering) the csv column order is given by the order of attribute discovery from the input file, i.e. date, count and title in my example, as given by the input order of the first object (while ignoring the order of all subsequent input lines, since none of them introduce any additional attributes / columns).
For (typical?) cases where the jq command line itself names what fields the output should have, it would indeed be nice to figure out a DRY syntax that infers order from your already stated order on that command line, and either always print the header names on the first line, or have a special csv-mode flag to specifically disable that output (make the rare case a flag).
My gut feeling is that asking for csv output is best done as a command-line mode flag rather than as a step in the pipeline, so that kind of housekeeping logic becomes the responsibility of jq rather than yourself, while also making it easier to switch output formats back to json again, should you want to - but maybe that is substantially harder to implement?
from jq.
Sorry for digging out this topic, but does @csv work in the current version? I got a "compile error" when I tried to use @csv, @html or others. I am using the Windows jq version.
from jq.
No, it's not included in the current version.
from jq.
Is it possible to download somewhere the previous version that had this implemented? I found that it is the only way to parse json...
from jq.
IMO this could be left to a separate utility.
from jq.
@svpenn Thank you! CSV works, however this build adds some numbers to the parsed data. For example:
←[0m←[34;1m"id"←[0m←[37m: ←[0m←[0m518377←[0m←[37m,
The original data is id:518377.
Does anybody know how to resolve this issue?
from jq.
@svnpenn Thanks a lot! :)
from jq.
👍 working well for me in master. love this feature.
from jq.
@csv is great! Can we get a @tsv as well?
from jq.
+1 for @TSV 👍
Update: if you arrived here trying to find a way to do tsv formatting, the command below should also work:
jq -r '"\(.date)\t\(.count)\t\(.title)"'
from jq.
Here's the syntax for tab-delimited output when using DOS:
jq -r ".aggregations[] | .buckets[] | \"\(.key)\t\(.doc_count)\""
from jq.
I failed to use @csv and the suggestions of @stevezieglerva. Assume that I have:
[
{
"a": 1,
"b": 2
},
{
"a": 3,
"b": 3
},
{
"a": 2,
"b": 1
}
]
stored in foo.bar. How can I export it to CSV? I tried some combinations, but I didn't manage to make it work...
from jq.
OK. Got it...
cat foo.bar | jq '.[] | [.a, .b] | @csv'
Follow up: how can I get rid of the "'s and their escapes (in the case where the value is a string, for instance)?
from jq.
@drorata - You might find the -r command-line option useful:
$ jq -n '["a","b","a\"b"]|@csv'
"\"a\",\"b\",\"a\"\"b\""
$ jq -r -n '["a","b","a\"b"]|@csv'
"a","b","a""b"
Notice in particular the double-double-quotes in the last line.
However, using the "-r" option may also not be what you want, depending on which variant of CSV you expect. For example, the "-r" option will cause the newline in "a\nb" to become an actual newline.
@csv is useful but it was not designed to produce output that conforms strictly with RFC 4180 (which, for example, requires CR/LF). It looks as though the intent was to provide a simple "UTF-8-style CSV", so that if some specific style of CSV is required, an external tool (or tool chain) can be used.
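To make the embedded-newline point concrete, here is how an RFC-4180-style writer (Python's csv module, used only as an outside comparison) handles such a field - the newline stays inside a quoted region, and records end with CRLF:

```python
import csv
import io

buf = io.StringIO()
writer = csv.writer(buf)  # default dialect ends records with CRLF
writer.writerow(["a\nb", "c"])

out = buf.getvalue()
# The field containing the newline is quoted, so the record remains
# one logical row even though it spans two physical lines.
assert out == '"a\nb",c\r\n'

# A conforming reader recovers the original value, newline intact.
row = next(csv.reader(io.StringIO(out)))
assert row == ["a\nb", "c"]
```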
from jq.
Perhaps @csv should output actual CSVs?
from jq.
@slapresta asked:
Perhaps @csv should output actual CSVs?
@csv must produce valid JSON, since it's a jq filter, not a "post-filter". I believe that jq in combination with -r does in fact produce "valid CSV" (see below), it being understood that there are many variations in common use.
Regarding the particular point about embedded newlines, jq is in conformance with the "800 pound gorilla CSV" standard:
http://www.creativyst.com/Doc/Articles/CSV/CSV01.htm#EmbedBRs
In fact, this document also sanctions the use of EITHER CR or CRLF as a record separator:
Each record is one line ...but
A record separator may consist of a line feed (ASCII/LF=0x0A), or a carriage return and line feed pair (ASCII/CRLF=0x0D 0x0A).
Perhaps what's needed is a reference to some "CSV standard" to which @csv conforms? Or @csv4180? In addition to @TSV, of course :-)
from jq.
Instead of polluting jq, why not use an external tool to convert JSON to CSV (e.g. https://github.com/jehiah/json2csv) ?
from jq.
Agree with JohnParkerXNU
from jq.
Update: see jq-hopkok's tabular folder.
For reference: RFC4180, Common Format and MIME Type for Comma-Separated Values (CSV) Files
http://www.ietf.org/rfc/rfc4180.txt
I use two scripts I pipe JSON into, and get CSV/TSV output. Object keys are used to generate a header.
https://github.com/joelpurra/jq-hopkok/blob/master/src/tabular/array-of-objects-to-csv.sh
https://github.com/joelpurra/jq-hopkok/blob/master/src/tabular/array-of-objects-to-tsv.sh
Because of implementation details in to_entries (see #561: to_entries always outputs keys sorted - can it be avoided?), I have to name keys so they're sorted in the output. I follow the format 01--My first column, 02--My second column, 03--A third column, and clean them with another script.
https://github.com/joelpurra/jq-hopkok/blob/master/src/tabular/clean-csv-sorted-header.sh
https://github.com/joelpurra/jq-hopkok/blob/master/src/tabular/clean-tsv-sorted-header.sh
I'll throw the scripts into the (future) pile of utilities that is jq-hopkok when I have time to clean up code.
[
{
"01--Employee name": "Homer Simpson",
"02--Logins": 578
},
{
"01--Employee name": "Carl Carlsson",
"02--Logins": 75926
}
]
<"input.json" array-of-objects-to-tsv.sh | clean-tsv-sorted-header.sh >"output.json"
Employee name Logins
Homer Simpson 578
Carl Carlsson 75926
The same input piped through the CSV scripts produces this.
"Employee name","Logins"
"Homer Simpson",578
"Carl Carlsson",75926
I prefer TSV as it's much easier to split the raw data into columns (some tex/latex packages don't play nice with RFC4180 input). I've taken a shortcut in replacing \t in strings with \\t - in my own data I "assume" there are no tabs in the output.
Hope these scripts can help someone.
from jq.
If I have a structure like this:
{
"timestamp": 1319730758,
"base": "USD",
"rates": {
"AED": 3.672626,
"AFN": 48.3775,
"ALL": 110.223333,
"AMD": 409.604993
}
}
jq '.rates | @csv' responds with "jq: error: object cannot be csv-formatted, only array"
Is it necessary to "cast" the object to an array? If so, how?
from jq.
There's no real casting here. All the data types are too discrete, overall, for casting (except via string interpolation, but then all you have is a string). The issue is that all @csv does is print an array out in csv form. How exactly did you want the output printed?
from jq.
@wtlangford In this case, I am converting a json source to flat delimited for bulk loading into a relational database table (PostgreSQL). I can pipe the output of " jq '.rates' " into awk, filter for the pairs and replace delimiter characters, but keeping everything in jq would be cleaner.
Example:
jq '.rates' rates.json | awk -v OFS=',' '/:/ { gsub(/[\":,]/,"",$0); print "'\''" $1 "'\''" , $2 }'
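For comparison, the same object-to-delimited conversion can be sketched in a few lines of Python (a standalone illustration that mimics the awk output style above, including the single quotes around the currency code):

```python
import json

doc = """{"timestamp": 1319730758, "base": "USD",
          "rates": {"AED": 3.672626, "AFN": 48.3775}}"""

# Take the key/value pairs of .rates and emit one delimited line per
# pair, wrapping the code in single quotes like the awk pipeline does.
rates = json.loads(doc)["rates"]
lines = ["'%s',%s" % (code, rate) for code, rate in rates.items()]
print("\n".join(lines))
# prints:
# 'AED',3.672626
# 'AFN',48.3775
```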
from jq.
@wtlangford Very close! Is there a way to omit the brackets, and print key, value on same line?
"AZN", 0.786006
instead of
[
"AZN",
0.786006
]
from jq.
Don't forget the @csv on the end. You might also want to pass -r to jq when you invoke it to strip out the quoting escapes. Keep in mind that this may not quite be what you want, however. If any of your strings has a newline in it, that'll get printed out unescaped.
from jq.
Just out of curiosity, why was @csv not sufficient at the end?
On Mon, Sep 22, 2014 at 6:09 PM, Matt Keranen [email protected]
wrote:
Thanks - this is close enough:
jq -c '.rates | to_entries | .[] | [.key,.value]' rates.json | tr -d "[]" | sed "s/\"/'/g"
from jq.
It was, I neglected it in a cut-and-paste error:
jq -c -r '.rates | to_entries | .[] | [.key,.value] | @csv'
from jq.
Excellent! Glad to have helped.
from jq.
Hi, I also have a problem with "Cannot index array with string"
My code: curl 'https://api.github.com/repos/stedolan/jq/commits?per_page=5' | ./jq -r '.[] | .committer | {login: .login,id: .id} | @csv'
Help would be appreciated
from jq.
The login and id values are associated with the "author" field:
jq -c -r '.[] | .author | {login,id} '
To output them using @csv:
jq -c -r '.[] | .author | [.login, .id] | @csv '
from jq.
Thank you!
from jq.
Thanks! the @csv option is very helpful 👍
from jq.
I've finally moved my CSV/TSV utility scripts to jq-hopkok's tabular folder -- see also other helpful shell scripts for parallel processing etcetera.
from jq.
It would be really great to have a way to turn lists of objects into csvs (with header) automatically.
Right now I'm resorting to json2csv:
curl 'http://.../nfl/participants/' | jq -c '.apiResults[].league.players[] | {teamId: .team.teamId, nickname: .team.nickname, playerId: .playerId, firstName: .firstName, lastName: .lastName}' | json2csv > players.csv
I would love to instead apply | @csv directly to a list of objects.
curl 'http://.../nfl/participants/' | jq -r '.apiResults[].league.players[] | {teamId: .team.teamId, nickname: .team.nickname, playerId: .playerId, firstName: .firstName, lastName: .lastName} | @csv' > players.csv
Is there any chance something like this can be implemented?
EDIT: while https://github.com/joelpurra/jq-hopkok/blob/master/src/tabular/array-of-objects-to-csv.sh exists, I really need something that's either built into jq or at least easily installable like npm install -g json2csv. Built into jq would be the ideal here.
EDIT2: woah, did not see this before commenting: https://github.com/joelpurra/jqnpm
from jq.
Here's a def that takes a stream of JSON objects and produces output suitable for feeding directly to @csv or @TSV. It does NOT require the keys to be consistently ordered, and only pays attention to the keys in the first object. An illustrative example follows.
def object2array(stream):
foreach stream as $x (null;
if . == null then $x | [true, keys_unsorted] else .[0]=false end;
(if .[0] then .[1] else empty end),
.[1] as $keys | $x | [getpath( $keys[] | [.]) ] );
Example
def data: [{a:1,b:2}, {b:22,a:11,c:0}];
object2array(data[])
Output
["a","b"]
[1,2]
[11,22]
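The same header-from-first-object idea can be sketched outside jq; in this hypothetical Python analogue, missing keys map to None much as getpath yields null:

```python
def object2array(stream):
    """Yield a header row built from the first object's keys (in
    insertion order, like jq's keys_unsorted), then one row per object,
    looking fields up by those keys. Missing keys become None, and keys
    that only appear in later objects are ignored."""
    keys = None
    for obj in stream:
        if keys is None:
            keys = list(obj)
            yield keys
        yield [obj.get(k) for k in keys]

data = [{"a": 1, "b": 2}, {"b": 22, "a": 11, "c": 0}]
for row in object2array(data):
    print(row)
# prints:
# ['a', 'b']
# [1, 2]
# [11, 22]
```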
to_table
to_table (available here) takes a very different approach. It can handle JSON regardless of the depth of nesting, but is only useful if there is sufficient regularity, notably of the ordering of keys within corresponding objects.
from jq.
How can I easily use "object2array" myself and on my team? Is there some way to define global functions and share them? I would like to just run "| object2array" and have it work on my laptop and my colleagues' laptops. What's the best distribution strategy?
from jq.
Unfortunately, all the simple options that I know of have one drawback or another.
Probably the simplest approach would be for everyone to add the desired utility functions to a file called ~/.jq since, if it exists, it will automatically be read in on startup. The main drawback of this approach is that it prevents ~/.jq from being used as a directory, which jq's module system by default recognizes as the location of jq modules.
Another option would be to take advantage of jq's module system. This is a tiny bit complicated, but is certainly worth considering. Here is one approach:
- Copy the desired def into ~/.jq/jq.jq
- Invoke jq as usual but preface the jq program with
include "jq";
e.g. jq 'include "jq"; ...' input.json
A third option which I've had some success with is to piggy-back off npm, but as best I can tell, it's not completely trivial.
from jq.
@tsv seems to work for me on jq-1.6 (Manjaro Linux)
from jq.