lovely-rita's Introduction

Project status: DELIVERED

This project has been delivered and is no longer maintained. If you'd like to reproduce or improve on it, please contact OpenOakland's Steering Committee at [[email protected]](mailto:[email protected]). Let us know why you're interested and what you hope to accomplish. Consider using the [Project Exploration Worksheet](https://docs.google.com/document/d/1k24P9JiAUEzJLPFRDjVh7aRZexax6NUhfPFLSI3R80M/edit?usp=sharing) (required for all new OpenOakland projects).

Lovely Rita: Insights from Oakland Citation Data

Lovely Rita is a set of tools for reading, cleaning, and saving parking citation datasets. The name pays homage to the Beatles song "Lovely Rita".

The project is a part of Oakland's Code for America brigade OpenOakland. You can read more about the project in this presentation.

With Lovely Rita, you can load historical parking citation data, clean the data (addresses and dates), geocode (turn addresses into geospatial coordinates), and save cleaned data to shapefiles for GIS analyses.

Check out our documentation for more detail.

Installation

It is good practice to use a virtual environment.

git clone https://github.com/openoakland/lovely-rita.git
cd lovely-rita
pip install -r requirements.txt
pip install . --user

Raw data format

Raw data should be provided in a .csv with the column names (in any order):

ticket_number
ticket_issue_date
ticket_issue_time
street
street_name
street_number
street_suffix
violation_external_code
violation_desc_long
state
city
badge_number
fine_amount
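
Before loading a file, it can help to confirm that its header contains every column listed above. The following is a minimal sketch using only the standard library; `missing_columns` is a hypothetical helper written for illustration and is not part of the lovelyrita package.

```python
import csv
import io

# Column names taken from the list above.
REQUIRED_COLUMNS = {
    "ticket_number", "ticket_issue_date", "ticket_issue_time",
    "street", "street_name", "street_number", "street_suffix",
    "violation_external_code", "violation_desc_long",
    "state", "city", "badge_number", "fine_amount",
}


def missing_columns(csv_file):
    """Return the sorted list of required columns absent from the header row.

    Hypothetical helper for illustration; not part of lovelyrita.
    """
    reader = csv.reader(csv_file)
    header = next(reader, [])
    present = {name.strip() for name in header}
    return sorted(REQUIRED_COLUMNS - present)


# Example with an in-memory file whose header lacks most columns.
sample = io.StringIO("ticket_number,street,city\n1,MAIN ST,OAKLAND\n")
print(missing_columns(sample))
```

Because the check compares sets, column order in the file does not matter, which matches the "in any order" requirement.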

Command line interface

Several useful workflows can be run from the command line. Learn about the available workflows using lovelyrita --help. Learn about a specific workflow using lovelyrita <workflow> --help.

Python interface

There is also a Python interface if you want to dive deeper into the data. There are more involved examples in the notebooks folder.

Read in the data

from lovelyrita.data import read_data
citations = read_data(data_path)

Clean the data

Lovely Rita can also clean and parse addresses and dates.

from lovelyrita.data import read_data
from lovelyrita.clean import clean
citations = read_data(data_path)
citations = clean(citations)

Analyze the data

  1. Number of citations per zip code
  2. Time-series, number of citations
  3. Type of violation by zip code
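
The three analyses above could be sketched roughly as follows. This is an illustration with made-up records, not lovelyrita's own API: the `zip_code` field is an assumed column that would have to come from geocoding, and the dates are assumed to be ISO formatted (`YYYY-MM-DD`).

```python
from collections import Counter

# Hypothetical cleaned records; `zip_code` is an assumed column that would
# come from geocoding, and the dates are assumed to be ISO formatted.
records = [
    {"zip_code": "94612", "ticket_issue_date": "2016-01-05",
     "violation_desc_long": "METER EXPIRED"},
    {"zip_code": "94612", "ticket_issue_date": "2016-01-19",
     "violation_desc_long": "NO PARKING"},
    {"zip_code": "94607", "ticket_issue_date": "2016-02-02",
     "violation_desc_long": "METER EXPIRED"},
]

# 1. Number of citations per zip code.
per_zip = Counter(r["zip_code"] for r in records)

# 2. Time series: number of citations per month (YYYY-MM prefix of the date).
per_month = Counter(r["ticket_issue_date"][:7] for r in records)

# 3. Type of violation by zip code, keyed by (zip_code, violation) pairs.
violation_by_zip = Counter(
    (r["zip_code"], r["violation_desc_long"]) for r in records
)

print(per_zip.most_common())
print(sorted(per_month.items()))
```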

Save the data

There is also support for saving the data to shapefiles.

from lovelyrita.data import write_shapefile
write_shapefile(citations, 'my-shapefile.shp')

Documentation

Clone the gh-pages branch

git clone -b gh-pages http://github.com/openoakland/lovely-rita.git lovely-rita-docs

Make changes to docs/source/*.rst in master branch.

Build the docs.

cd docs
make html

Docs are built to ../../lovely-rita-docs/html

git add -u
git commit -m "docs message"
git push origin gh-pages

Tests

There will be tests.

Contributing

Please read CONTRIBUTING.md for details on our code of conduct, and the process for submitting pull requests to us.

Authors

The many wonderful people who helped design and build Lovely Rita (* denotes active contributors):

License

This project is licensed under the MIT License - see the license file for details.

Acknowledgments

We would like to acknowledge the help of Danielle Dai and the Oakland Department of Transportation for providing the data and invaluable guidance for this project.

lovely-rita's People

Contributors

atomahawk, drewerickson, jjia25, r-b-g-b, ricky-boebel, slavster, theecrit


lovely-rita's Issues

Data Visualization Tool

Based on the meeting with Danielle, I got the impression that a useful tool would be something to visualize the data segmented in different ways (by neighborhood, by officer, by citation) in various plots. It got me thinking that if we could narrow that down to a few key plots she might want to see, we could design a visualization tool that could allow you to view that plot with different segmentations.

Requirements:

  • Website for data visualization
  • API to serve data to website (can be on same machine as website)
  • DB to store data and respond to API

This might be a long-term goal to add. To accomplish this, we'll need JavaScript or similar skills. We could mention that at the next presentation.

Create Data Processing Pipeline

We need an initial module that handles the cleaning, joining, and imputing of parking citation data.

Assignments

  • Andrew to complete
  • Drew to do code review

Reformat address line data to increase geocoding success rate

Sub-Issue list:

  • Block specification within the data. Examples: 'bll', 'bllk', 'bk', 'blck', 'bl0ck', 'bkl', 'block', 'bfk' (some codes may have other meanings).
  • Too many digits within the street number, sometimes caused by a missing "-" (https://www.reddit.com/r/explainlikeimfive/comments/2own82/eli5_how_do_street_addresses_work_why_is_my/).
  • The street name is a number and merges with the street number. Example: 112713th ave oakland california.
  • The word "lot" is inserted within the text.
  • Corner of two streets. Example: perry pl orange st oakland california.
  • Irregular symbols and punctuation.
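
A first pass at some of the sub-issues above (block misspellings, the stray word "lot", and irregular punctuation) might look like the sketch below. `normalize_address` and `BLOCK_VARIANTS` are hypothetical names introduced here, not existing lovelyrita code. The sketch does not attempt to split fused street numbers or detect intersections, and dropping short codes such as 'bk' is an assumption that should be checked against the data.

```python
import re

# Misspellings of "block" quoted in the list above. Very short codes such as
# "bk" may carry other meanings, so this set is an assumption to review.
BLOCK_VARIANTS = {"bll", "bllk", "bk", "blck", "bl0ck", "bkl", "block", "bfk"}


def normalize_address(raw):
    """Lowercase, strip odd punctuation, and drop 'block'/'lot' tokens.

    Hypothetical helper for illustration; not part of lovelyrita.
    """
    text = raw.lower()
    # Replace anything that is not a letter, digit, or whitespace with a space.
    # Note that this also turns hyphens into spaces.
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    tokens = [t for t in text.split() if t not in BLOCK_VARIANTS and t != "lot"]
    return " ".join(tokens)


print(normalize_address("1200 bllk broadway #oakland, california"))
```

Matching whole tokens rather than substrings keeps the rule from touching street names that merely contain one of the codes.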

README needs work

Although we have an outline, it could definitely use some details.

Geocode Data Scrape / Import Pipeline

We need an initial module that will:

  • Collect unique addresses from the data
  • Query those addresses against a geocode API / DB
  • Process the response data into an entry in our DB
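
The three steps above can be sketched as follows, under stated assumptions: `fake_geocode` stands in for whichever geocoding API or DB is eventually chosen, its coordinates are made up for the example, and SQLite is used only because it ships with Python. None of these names come from the lovelyrita codebase.

```python
import sqlite3


def fake_geocode(address):
    """Stand-in for a real geocoding API call; returns (lat, lon) or None.

    The coordinates below are made up for illustration.
    """
    lookup = {"1 main st oakland california": (37.8, -122.27)}
    return lookup.get(address)


def geocode_unique(addresses, geocode, conn):
    """Geocode each distinct address once and store successful results."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS geocodes "
        "(address TEXT PRIMARY KEY, lat REAL, lon REAL)"
    )
    # Step 1: collect unique addresses so each is queried only once.
    for address in sorted(set(addresses)):
        # Step 2: query the geocoder.
        result = geocode(address)
        if result is None:
            continue  # leave failures out so they can be retried later
        # Step 3: store the response as a row in the DB.
        conn.execute(
            "INSERT OR REPLACE INTO geocodes VALUES (?, ?, ?)",
            (address, result[0], result[1]),
        )
    conn.commit()


conn = sqlite3.connect(":memory:")
geocode_unique(
    ["1 main st oakland california", "1 main st oakland california",
     "unknown place"],
    fake_geocode,
    conn,
)
print(conn.execute("SELECT address, lat, lon FROM geocodes").fetchall())
```

Passing the geocoder in as a function keeps the pipeline testable without network access and makes it easy to swap providers later.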

Assignments

  • Drew to complete.
  • Andrew to do code review.

Refine Project Goals

We need to refine some key project goals based on the information Danielle gave us. This can be added as an additional doc in the repo.

Since Andrew and Joanna have been involved the longest, I'm going to assign them first, but it makes sense for all of us to review and give feedback.
