Name | Badges |
---|---|
agate | |
agate-charts | |
agate-dbf | |
agate-excel | |
agate-lookup | |
agate-remote | |
agate-sql | |
agate-stats | |
csvkit | |
leather | |
proof | |
wireservice / lookup
A repository of journalist's lookup tables.
This one might be too big...
This is pretty awesome, and what I'm suggesting is possibly overkill, but I was wondering if you had considered using one of the CSV schema formats for specifying the fields in the CSV. These seem to be the two biggest ones out there:
I will admit this seems like a bit of overkill for a CSV of states, but it might be useful if you wanted to validate future changes or additions with an automated test, which would give you CI for your CSV. For instance, Goodtables is a validator that uses the JSON schema format (although it needs some work). CSVLint is another new entrant; I haven't evaluated it, but it also uses the JSON schema format (which seems like the one to consider now).
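To make the "CI for your CSV" idea concrete, here is a minimal sketch using only the Python standard library. The `SCHEMA` dictionary, the column names, and the `validate_csv` helper are all hypothetical and invented for illustration; this is not the Goodtables or CSVLint API, nor either schema format.

```python
import csv
import io

# Hypothetical schema: column name -> callable that raises on bad input.
# The column names are assumptions, not taken from the actual lookup tables.
SCHEMA = {
    "name": str,
    "usps": str,
    "fips": int,
}


def validate_csv(text, schema):
    """Return a list of (row_number, column, message) problems found in text."""
    problems = []
    reader = csv.DictReader(io.StringIO(text))
    fieldnames = reader.fieldnames or []

    # Report columns that the schema expects but the header lacks.
    missing = [col for col in schema if col not in fieldnames]
    for col in missing:
        problems.append((1, col, "missing column"))

    # Data rows start on line 2, after the header.
    for row_number, row in enumerate(reader, start=2):
        for col, cast in schema.items():
            if col in missing:
                continue
            try:
                cast(row[col])
            except (TypeError, ValueError):
                problems.append((row_number, col, f"cannot parse {row[col]!r}"))
    return problems
```

A test suite could call `validate_csv` on each table and fail the build when the returned list is non-empty.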
We have a bunch of stuff like that over in latimes-statestyle that might fit here.
I find the stated description of this repo "A repository of journalist's lookup tables." quite ambiguous.
What types of open data are the maintainers willing to accept?
Should we have an "overflow" repository for other open data which is beyond the scope of this repository, with a more permissive merging strategy?
Sao Tome, for instance.
The documentation makes it seem as if `columns` will only allow for a key and value pair. But what if there's a 3-way lookup, e.g. "New York", "NY", "N.Y.", etc...I'm guessing that's alluded to here:
but is `key/colname: datatype` enough? Or rather, is the succinctness worth the limitation in expanding the format?
I'm thinking of Census decade-to-decade lookup tables, in which later tracts sometimes incorporate a combination of past tracts; this complexity would seemingly need to be stated at the `columns` level of metadata.
Also, having a "human readable full name" attribute for each column would be nice.
Anyway, I know these aren't easy questions, and every answer involves tradeoffs...but thanks for taking charge on this!
I say this because I often see FIPS codes provided with leading zeros. Forcing everything to integers might be a workaround for that problem.
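A small sketch of that workaround, assuming codes arrive as either strings or integers: compare on an integer key so that "01" and 1 match, and zero-pad only when a display form is needed. The helper names `fips_key` and `fips_display` are made up for this example and are not part of lookup or agate.

```python
def fips_key(code):
    """Normalize a FIPS code to an int so '01', ' 1 ', and 1 compare equal."""
    return int(str(code).strip())


def fips_display(code, width=2):
    """Zero-pad a FIPS code back to a fixed width (2 for states, 5 for counties)."""
    return str(fips_key(code)).zfill(width)
```

The tradeoff is that the integer form loses the padding, so the expected width has to be known wherever the code is written back out.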
I was just grabbing this month's Canadian house price index data, and of course they decided to encode their dates like this:
Date | Index |
---|---|
Jan-2015 | 167.110 |
Feb-2015 | 167.320 |
Mar-2015 | 167.830 |
Apr-2015 | 168.090 |
May-2015 | 169.750 |
Jun-2015 | 172.220 |
Jul-2015 | 174.530 |
Aug-2015 | 176.590 |
Sep-2015 | 177.760 |
Oct-2015 | 177.960 |
Nov-2015 | 178.350 |
Dec-2015 | 178.260 |
Jan-2016 | 178.010 |
Feb-2016 | 179.200 |
It'd be nice to be able to automatically re-encode these to ISO 8601. Would this be a good application of lookup? There's bound to be some variation in how the months are abbreviated, so I'm not entirely sure. Also, days of the month might not always be in the dataset…
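For the month-year case shown in the table, a rough sketch of the conversion might look like the following. It assumes English month names or abbreviations, a `Mon-YYYY` layout, and no day of the month, so it emits the reduced-precision ISO 8601 form `YYYY-MM`. The `MONTHS` table and `to_iso_month` function are hypothetical names, not an existing lookup table.

```python
# Hypothetical lookup table keyed on the first three letters of the month name.
MONTHS = {
    "jan": 1, "feb": 2, "mar": 3, "apr": 4, "may": 5, "jun": 6,
    "jul": 7, "aug": 8, "sep": 9, "oct": 10, "nov": 11, "dec": 12,
}


def to_iso_month(value):
    """Convert strings like 'Jan-2015' or 'Sept.-2015' to 'YYYY-MM'.

    Raises KeyError for an unrecognized month and ValueError for a
    malformed string or a non-numeric year.
    """
    month_text, year_text = value.strip().split("-")
    # Tolerate trailing periods and longer spellings ("Sept.", "January").
    key = month_text.strip().rstrip(".").lower()[:3]
    return f"{int(year_text):04d}-{MONTHS[key]:02d}"
```

Keying on the first three letters covers common abbreviation variants, but it would not handle other languages or numeric months.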