caltrainfails

Reads information about Caltrain delays from a Twitter feed and tees up some analytics.

details:

My new job makes Caltrain an attractive commuting option, but based on the last time I tried public transit to the South Bay I'm rather skeptical that it will work out. But was it my bad luck or is Caltrain routinely horrible?

I couldn't find a well-structured dataset that would provide me with delay information, but there is this cool Twitter feed, @caltrain. There is a style guide (http://cow.org/c/updating-guide), so there's some structure, but it is still messy enough to make a nice insomnia problem :P

From a CSV file of Caltrain fails, we output the delay in minutes, the time (from the tweet's timestamp), and the direction of the train by processing the tweets with regular expressions (REs). See the Excel sheet to understand how I worked some of the data up.
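As a rough illustration of the RE step, here is a minimal sketch. The patterns and the function names (`parse_delay_minutes`, `parse_direction`) are assumptions made for this example, not the expressions the project actually uses, and real tweets will need more cases than these cover.

```python
import re
from typing import Optional

# Hypothetical patterns; the project's real expressions are not shown in this README.
DELAY_RE = re.compile(r"(\d+)\s*min", re.IGNORECASE)
DIRECTION_RE = re.compile(r"\b(NB|SB|northbound|southbound)\b", re.IGNORECASE)


def parse_delay_minutes(text: str) -> Optional[int]:
    """Return the first '<number> min' value found in the tweet, or None."""
    match = DELAY_RE.search(text)
    return int(match.group(1)) if match else None


def parse_direction(text: str) -> Optional[str]:
    """Return 'NB' or 'SB' for the first direction word found, or None."""
    match = DIRECTION_RE.search(text)
    if match is None:
        return None
    word = match.group(1).lower()
    return "NB" if word in ("nb", "northbound") else "SB"


if __name__ == "__main__":
    tweet = "NB 217 running 15 min late out of SF"
    print(parse_delay_minutes(tweet), parse_direction(tweet))
```

Returning `None` when nothing matches keeps "no direction in the tweet" distinct from a wrong guess, which is one way to approach the missing-direction case listed below.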

known areas for improvement:

  • set up a cron job to update the tweets regularly (and get around the Twitter API-imposed limit for retrieving them)
  • better way of figuring out where we need to update from than writing an ID and then tossing it out later
  • generally get more data to improve the analysis
  • improve the tests, which really are just function calls at this point :P
  • remove the double hits for NB vs SB, and better handle absence of train direction
  • general improved extraction of data from the tweets (thoughts?)
  • use the timestamp in the tweet text rather than from the tweet object
  • the timestamps look like they are occasionally coming out AM when they should be PM
  • for an actual commuter, what matters is how the NB morning leg and the SB evening leg do (vs. NB in general). I need to think of a clean way of structuring this vs. just looking at graphs to make inferences
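
One possible way to structure the commuter view from the last item is to tag each parsed record with a commute leg and average the delays per leg. This is only a sketch: the hour windows, the `commute_leg` and `summarize` names, and the `(direction, timestamp, minutes)` record shape are all assumptions, and the timestamps are assumed to be in local time already.

```python
from datetime import datetime
from typing import Dict, Iterable, Optional, Tuple

# Assumed commute windows (local time); adjust to the actual schedule.
MORNING_HOURS = range(6, 10)   # 06:00-09:59
EVENING_HOURS = range(16, 20)  # 16:00-19:59

Record = Tuple[Optional[str], datetime, Optional[int]]


def commute_leg(direction: Optional[str], when: datetime) -> Optional[str]:
    """Label a record as a commute leg, or None if it is outside both legs."""
    if direction == "NB" and when.hour in MORNING_HOURS:
        return "morning_NB"
    if direction == "SB" and when.hour in EVENING_HOURS:
        return "evening_SB"
    return None


def summarize(records: Iterable[Record]) -> Dict[str, float]:
    """Average delay in minutes per commute leg, skipping unusable records."""
    totals: Dict[str, Tuple[int, int]] = {}
    for direction, when, minutes in records:
        leg = commute_leg(direction, when)
        if leg is None or minutes is None:
            continue
        count, total = totals.get(leg, (0, 0))
        totals[leg] = (count + 1, total + minutes)
    return {leg: total / count for leg, (count, total) in totals.items()}


if __name__ == "__main__":
    sample = [
        ("NB", datetime(2014, 3, 3, 7, 30), 10),
        ("SB", datetime(2014, 3, 3, 17, 15), 5),
    ]
    print(summarize(sample))
```

Records with no direction or no parsed delay are dropped rather than counted toward both legs, which also avoids double hits between NB and SB.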
