csv-err's People

Contributors

aaronlidman, sbma44

csv-err's Issues

Obsolete postgres for normalizing

Everything in csv-err goes through postgres for normalizing. Import and export are slow, and installation is complicated and fragile; I don't like relying on it.

Eventually, I would like to rely on ogr2ogr and something like csv-fix, and keep everything simple (and hopefully faster) with CLI tools.


For osmi:
ogr2ogr -s_srs EPSG:4326 -t_srs EPSG:4326 -f CSV -lco GEOMETRY=AS_WKT the_csv.csv role_mismatch_hull.gml

Representative geometry

Some of the errors come with geometry, and we should use it so that errors can be filtered by bounding box. It doesn't have to be the full, exact geometry; a representative point is enough.

  • Maybe we should have a consistent geometry column?
  • What format should we standardize around? Right now everything is WKT because it's lightweight?

cc @sbma44
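As a rough illustration of the representative-point idea, here is a minimal Python sketch that reduces any 2D WKT geometry to the centre of its bounding box. The function name `representative_point` is hypothetical, the parsing is a plain regex rather than a real WKT parser, and it assumes coordinates are plain `x y` pairs with no Z values. A bounding-box centre can fall outside the geometry itself, which should be acceptable if the point is only used for bounding-box filtering.

```python
import re

# Matches each "x y" coordinate pair inside a 2D WKT string.
_NUMBER = r"-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?"
_COORD = re.compile(r"(%s)\s+(%s)" % (_NUMBER, _NUMBER))


def representative_point(wkt):
    """Return the bounding-box centre of a 2D WKT geometry as 'POINT (x y)'."""
    coords = [(float(x), float(y)) for x, y in _COORD.findall(wkt)]
    if not coords:
        raise ValueError("no coordinates found in WKT: %r" % wkt)
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    # Midpoint of the bounding box; cheap and good enough for bbox filtering.
    cx = (min(xs) + max(xs)) / 2
    cy = (min(ys) + max(ys)) / 2
    return "POINT (%s %s)" % (cx, cy)
```

The result could be written to a single, consistently named geometry column regardless of what kind of geometry the source provided.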

Split specific sources in s3

Currently everything is grouped by its origin. For example, all the keepright errors are gathered together, each in its own csv.

So there is:

  • keepright-tasks/error1.csv
  • keepright-tasks/error2.csv
  • keepright-tasks/error3.csv

and then that directory is zipped up.

But sometimes certain errors can't be generated because of random problems; the world is chaotic and such (#2). So we only want to replace certain files, which isn't currently possible with the directory zip. These all need to be individual files on s3 that get replaced when possible.

Then, how do we know when something was generated? Where can I store a timestamp?
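One possible answer to the timestamp question, sketched in Python: S3 objects can carry user-defined metadata (and S3 also records a last-modified time per object), so each CSV could be uploaded under its own key with a timestamp attached. The sketch below only builds the upload plan and does not talk to S3; the function name `plan_uploads` and the `generated-at` metadata field are assumptions for illustration, and the file's local modification time stands in for the generation time.

```python
import os
from datetime import datetime, timezone


def plan_uploads(local_dir, prefix):
    """Map each CSV in local_dir to an individual object key plus metadata."""
    plan = []
    for name in sorted(os.listdir(local_dir)):
        if not name.endswith(".csv"):
            continue
        path = os.path.join(local_dir, name)
        # Use the file's modification time as a stand-in for "generated at".
        mtime = datetime.fromtimestamp(os.path.getmtime(path), tz=timezone.utc)
        plan.append({
            "path": path,
            "key": "%s/%s" % (prefix, name),
            # Hypothetical metadata field name; any key would do.
            "metadata": {"generated-at": mtime.strftime("%Y-%m-%dT%H:%M:%SZ")},
        })
    return plan
```

Each entry could then be handed to whatever upload tool is in use, so a failed source only leaves its own file stale instead of blocking the whole directory.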

cat all major and all minor

There's only a marginal difference between the 1, 2, and 5 classifications. I've been combining them this week and it's fine.

Handle download errors

Previously: osmlab/to-fix#2

Right now the error is

Server error! We are sorry for the inconvenience. 35
Please contact info@geofabrik if the error persists.

but I've seen slightly different errors as well, so we can't count on that exact message. The best approach would be to check the file size after download and only proceed with importing the files that actually exist. This means the export scripts also need to cope with certain files being missing.

So the goal is that some things might get updated and some might not, and that's ok. Everything is just ok.
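The size check described above could look something like this in Python. This is a sketch only: the function name `usable_downloads` and the 1024-byte threshold are assumptions, chosen because the server error message is far shorter than any real export would be.

```python
import os


def usable_downloads(paths, min_bytes=1024):
    """Split paths into (usable, skipped) based on existence and size."""
    usable, skipped = [], []
    for path in paths:
        try:
            size = os.path.getsize(path)
        except OSError:
            # File is missing or unreadable: skip it rather than fail the run.
            skipped.append(path)
            continue
        # Tiny files are assumed to be error pages, not real data.
        (usable if size >= min_bytes else skipped).append(path)
    return usable, skipped
```

Only the `usable` list would go on to import; the `skipped` list can be logged so the previous version of those files stays in place.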

Switch to geojson

Need to look into how difficult this will be downstream in to-fix.

The goal is to bake vector tiles for easy styling and to see coverage. That's easiest with GeoJSON.
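To get a feel for the conversion, here is a minimal Python sketch that turns CSV rows carrying a WKT point into a GeoJSON FeatureCollection. It assumes the geometry lives in a column named `WKT` and that each geometry is a simple `POINT (lon lat)`; the function name `csv_to_geojson` is hypothetical, and rows with any other geometry are skipped rather than converted.

```python
import csv
import io
import re

_NUMBER = r"-?\d+(?:\.\d+)?"
_POINT = re.compile(
    r"^\s*POINT\s*\(\s*(%s)\s+(%s)\s*\)\s*$" % (_NUMBER, _NUMBER),
    re.IGNORECASE,
)


def csv_to_geojson(csv_text, wkt_column="WKT"):
    """Build a GeoJSON FeatureCollection from CSV text with WKT points."""
    features = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        match = _POINT.match(row.get(wkt_column) or "")
        if not match:
            continue  # skip rows without a parseable point
        lon, lat = float(match.group(1)), float(match.group(2))
        # Every non-geometry column becomes a feature property.
        props = {k: v for k, v in row.items() if k != wkt_column}
        features.append({
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": props,
        })
    return {"type": "FeatureCollection", "features": features}
```

The returned dict can be serialized with `json.dump` and fed to a tile-baking tool.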
