nestly

nestly is a Node module and command line tool that creates structured data from tabular input using a declarative "meta-structure" written in JSON or YAML, which looks something like this:

values:
  '{x}':
    '{y}': '{z}'

In a meta-structure, any object key containing { and } markers becomes a nesting point for your data. Everything else (such as the top-level values key) is preserved as-is. In the case above, the tabular input is assumed to have columns named x, y, and z. The other implicit assumption is that there is exactly one value of z for each combination of x and y. So, if you had some CSV data like this:

x,y,z
a,a,2
a,b,3
b,a,1
b,b,5

Then nestly would combine the above meta-structure and data to create the following output in YAML:

values:
  'a':
    'a': '2'
    'b': '3'
  'b':
    'a': '1'
    'b': '5'
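
To make the nesting rule concrete, here is a minimal sketch of how a meta-structure could be applied to parsed rows. It is not nestly's actual implementation or API: the function names fill, mergeRow, and nest are hypothetical, column names are assumed to be simple word characters, and arrays in the meta-structure are not handled.

```javascript
// Hypothetical sketch: not nestly's actual implementation or API.
// Replace every {column} marker in a template string with the row's value.
function fill(template, row) {
  return template.replace(/\{(\w+)\}/g, (match, column) =>
    column in row ? String(row[column]) : match
  );
}

// Merge one row into `target`, following the shape of `meta`.
function mergeRow(meta, row, target) {
  for (const [key, value] of Object.entries(meta)) {
    const resolvedKey = fill(key, row); // literal keys pass through unchanged
    if (value !== null && typeof value === "object") {
      // Nested object: descend, reusing whatever earlier rows already built.
      target[resolvedKey] = mergeRow(value, row, target[resolvedKey] || {});
    } else {
      // Leaf: a later row with the same key path overwrites an earlier one.
      target[resolvedKey] = typeof value === "string" ? fill(value, row) : value;
    }
  }
  return target;
}

// Fold all rows into a single nested object.
function nest(meta, rows) {
  return rows.reduce((result, row) => mergeRow(meta, row, result), {});
}

const meta = { values: { "{x}": { "{y}": "{z}" } } };
const rows = [
  { x: "a", y: "a", z: "2" },
  { x: "a", y: "b", z: "3" },
  { x: "b", y: "a", z: "1" },
  { x: "b", y: "b", z: "5" },
];

const nested = nest(meta, rows);
console.log(JSON.stringify(nested, null, 2));
```

In this sketch, the "one value of z per combination of x and y" assumption shows up in the leaf branch: if two rows shared the same x and y, the later row would silently replace the earlier one.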

Command line interface

The nestly command line tool works like this:

nestly [--config file | filename] data-filename [-o | output-filename]

Options:
  --cf          The format of the config file: "json" or "yaml"
                                                       [default: "json"]
  --if          The format of the input data: "csv", "tsv", "json", or
                "yaml"
  --of          The format of the output data: "json" or "yaml"
                                                       [default: "json"]
  -c, --config  The path to your nesting configuration file
  -i, --in      The path of your input data file
  -o, --out     The name of the output file
  -h, --help    Show this help screen

The config should be a JSON or YAML file that encodes the "meta-structure" described above. If you don't provide the --if (input format) and --of (output format) options, the formats are inferred from the filenames. (You should provide the format options if you're piping to or from stdin or stdout, where there is no filename to inspect.) The YAML output shown earlier would be generated with:

nestly --config structure.yml xyz.csv -o nested.yml
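
As an illustration of what inferring a format from a filename can look like, here is a small sketch. The extension table and the inferFormat function are assumptions made for this example; nestly's real lookup may differ.

```javascript
const path = require("path");

// Hypothetical mapping from file extension to format name; nestly's real
// lookup table may differ.
const FORMATS_BY_EXTENSION = {
  ".csv": "csv",
  ".tsv": "tsv",
  ".json": "json",
  ".yml": "yaml",
  ".yaml": "yaml",
};

// Prefer an explicit format (as --if or --of would supply), then fall back
// to the filename's extension. Returns undefined when neither is available,
// which is the situation when piping through stdin or stdout.
function inferFormat(filename, explicitFormat) {
  if (explicitFormat) return explicitFormat;
  if (!filename) return undefined;
  return FORMATS_BY_EXTENSION[path.extname(filename).toLowerCase()];
}

console.log(inferFormat("xyz.csv"));
console.log(inferFormat("nested.yml"));
console.log(inferFormat(undefined, "json"));
```

With no filename and no explicit format the sketch returns undefined, which is why the format options matter when data arrives through a pipe.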

What problem does this solve?

I made it to ease the incorporation of data into Jekyll projects, where tabular formats can be tricky to work with. For instance, let's say you have some data in a spreadsheet like this:

City            Year   Population
San Diego       2012   1,337,000
San Diego       2013   1,356,000
San Francisco   2012   827,420
San Francisco   2013   837,442
San Jose        2012   982,579
San Jose        2013   998,537

If I wanted to get the population for a particular city and year in a template, I would need to do something funky like this:

{% assign row = site.data.cities | where: 'City', city | where: 'Year', year | first %}
{{ row.Population }}

But if we generated data with a structure like this:

'{City}':
  population:
    '{Year}': '{Population}'

Then getting the population for a city and year becomes:

{{ site.data.cities[city].population[year] }}
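
The same contrast can be sketched outside of Liquid. The example below assumes the spreadsheet has been parsed into row objects with string values, and it builds the nested structure by hand for this one meta-structure rather than calling nestly. The names populationFromRows and cities are hypothetical.

```javascript
// Rows as they might look after parsing the spreadsheet (values kept as strings).
const rows = [
  { City: "San Diego", Year: "2012", Population: "1,337,000" },
  { City: "San Diego", Year: "2013", Population: "1,356,000" },
  { City: "San Francisco", Year: "2012", Population: "827,420" },
  { City: "San Francisco", Year: "2013", Population: "837,442" },
  { City: "San Jose", Year: "2012", Population: "982,579" },
  { City: "San Jose", Year: "2013", Population: "998,537" },
];

// Flat lookup: scan the rows for a match on both columns, like the
// chained `where` filters in the template.
function populationFromRows(allRows, city, year) {
  const row = allRows.find((r) => r.City === city && r.Year === year);
  return row ? row.Population : undefined;
}

// Nested lookup: build the shape described by
// '{City}': { population: { '{Year}': '{Population}' } } once, then index into it.
const cities = {};
for (const row of rows) {
  if (!cities[row.City]) cities[row.City] = { population: {} };
  cities[row.City].population[row.Year] = row.Population;
}

console.log(populationFromRows(rows, "San Jose", "2013"));
console.log(cities["San Jose"].population["2013"]);
```

Both lookups return the same value. The nested form moves the work to build time, so the template only has to index by key.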

nestly's People

Contributors

shawnbot

nestly's Issues

Nestly should create a new directory to output files into

For instance, with the following usage in EITI, I expected the command to create the directory _data/county_revenue and write a series of .yml files into it:

data/county_revenue:
    $(query) --format ndjson " \
        SELECT \
          state, \
          fips, \
          county, \
          year, \
          ROUND(revenue) AS revenue \
        FROM county_revenue \
        WHERE \
          state IS NOT NULL AND \
          county IS NOT NULL \
        ORDER BY state, fips, year" \
      | $(nestly) --if ndjson \
          -c _meta/county_revenue.yml \
          -o '_$@/{state}.yml'
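
The behavior requested here can be sketched as follows. This is an illustration of the idea only, not nestly's code: expandPath and writeGrouped are hypothetical names, the rows are made-up placeholders rather than real revenue figures, and the files are written as JSON under a temporary directory to keep the sketch free of dependencies.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Expand {column} markers in an output filename template for one row.
function expandPath(template, row) {
  return template.replace(/\{(\w+)\}/g, (match, column) =>
    column in row ? String(row[column]) : match
  );
}

// Group rows by their expanded output path, create any missing parent
// directories, and write one file per group. Returns the paths written.
function writeGrouped(template, rows) {
  const groups = new Map();
  for (const row of rows) {
    const file = expandPath(template, row);
    if (!groups.has(file)) groups.set(file, []);
    groups.get(file).push(row);
  }
  for (const [file, group] of groups) {
    fs.mkdirSync(path.dirname(file), { recursive: true }); // the step the issue asks for
    fs.writeFileSync(file, JSON.stringify(group, null, 2));
  }
  return [...groups.keys()];
}

// Made-up placeholder rows, written under a temporary directory.
const root = fs.mkdtempSync(path.join(os.tmpdir(), "nestly-sketch-"));
const template = path.join(root, "_data", "county_revenue", "{state}.json");
const written = writeGrouped(template, [
  { state: "AK", county: "Example County", year: 2013, revenue: 100 },
  { state: "AK", county: "Example County", year: 2014, revenue: 120 },
  { state: "WY", county: "Sample County", year: 2013, revenue: 90 },
]);
console.log(written.map((file) => path.basename(file)));
```

Passing recursive: true to fs.mkdirSync creates every missing directory along the path and does not fail if the directory already exists, so the output directory no longer has to be created ahead of time.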
