trillium-solutions / gtfs-feed-archive
GTFS (general transit feed) archive utility
@ed-g -- When attempting to download a full archive from the ODOT private portal, users are consistently receiving this error message. Could you take a look when you have a minute?
Let me know if you need any other info.
http://archive.oregon-gtfs.com/oregon-private-feeds/archive-target
Integrate with the timbre library so we can capture log messages and show them in the web interface, in addition to writing them to a file or printing them to the console.
Reliability improvements for archive.oregon-gtfs.com.
If the archive file is available, send its contents with status 200 OK.
If archive file has been requested but is not yet available, return 204 No Content.
If the archive file does not exist and has not been requested, return 404 Not Found.
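The three cases above can be sketched as a small lookup. This is only an illustration in Python, not the project's Clojure code; `ArchiveState` and `archive_response_status` are hypothetical names invented for the sketch.

```python
from enum import Enum
from http import HTTPStatus


class ArchiveState(Enum):
    """Hypothetical states for an archive file, as seen by the web handler."""
    AVAILABLE = "available"   # file exists on disk and is complete
    REQUESTED = "requested"   # a build was requested but has not finished
    UNKNOWN = "unknown"       # no file, and nobody has asked for one


def archive_response_status(state: ArchiveState) -> HTTPStatus:
    """Map an archive state to the HTTP status described in the issue."""
    if state is ArchiveState.AVAILABLE:
        return HTTPStatus.OK          # 200, body carries the archive contents
    if state is ArchiveState.REQUESTED:
        return HTTPStatus.NO_CONTENT  # 204, client should retry later
    return HTTPStatus.NOT_FOUND       # 404, never requested


if __name__ == "__main__":
    for state in ArchiveState:
        status = archive_response_status(state)
        print(state.value, int(status), status.phrase)
```

A client polling for an archive would treat 204 as "try again shortly" and 404 as "request the archive first".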
Load/save cache manager to an EDN file, so we can remember what we've already downloaded.
Running download agents should not be persisted.
When loading we should verify that referenced files actually exist.
This will create a need to expire old/unnecessary cache entries, since otherwise the cache will just keep growing... :-)
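A rough sketch of the load/save behaviour, written in Python for illustration. JSON stands in for EDN here because Python's standard library has no EDN reader, and the entry keys (`feed`, `file`, `state`) are assumptions rather than the cache manager's real schema.

```python
import json
import os
import tempfile


def save_cache(entries, path):
    """Persist finished downloads only; running download agents are skipped."""
    finished = [e for e in entries if e.get("state") == "finished"]
    # Write to a temporary file and rename it into place, so a crash
    # cannot leave a half-written cache file behind.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(finished, f, indent=2)
    os.replace(tmp_path, path)


def load_cache(path):
    """Load cache entries, dropping any whose referenced file is missing."""
    if not os.path.exists(path):
        return []
    with open(path, encoding="utf-8") as f:
        entries = json.load(f)
    return [e for e in entries if os.path.isfile(e["file"])]
```

Expiring old entries is not covered; one option would be to filter on an age field inside `load_cache`, at the same point where missing files are dropped.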
That way multiple downloads won't clobber each other.
I am interested in adding some information to the zip file of zip files:
the data location URL, perhaps as a third column in the last-updates-csv file?
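For discussion, here is what a three-column file could look like. The column names (`feed_name`, `last_modified`, `download_url`) and the sample row are assumptions for illustration; the existing last-updates file may use different headers.

```python
import csv
import io


def write_last_updates(rows):
    """Render last-update rows as CSV text, with the URL as a third column."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["feed_name", "last_modified", "download_url"])
    for row in rows:
        writer.writerow([row["feed_name"], row["last_modified"], row["download_url"]])
    return buf.getvalue()


if __name__ == "__main__":
    print(write_last_updates([
        {
            "feed_name": "example-transit",  # hypothetical feed
            "last_modified": "2024-01-09T00:00:00Z",
            "download_url": "https://example.com/gtfs/google_transit.zip",
        }
    ]))
```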
For example, https://www.miapp.ca/GTFS/google_transit.zip
It looks like the download never starts, or it times out.
Give a warning if not all files from the GTFS list can be downloaded, and if the user elects to continue anyway, produce a subset of the feeds as Oregon-GTFS-feeds-INCOMPLETE-date.zip
This seems better than producing no archive file if a feed is unfetchable.
Of course when they go to their feeds page the broken feeds will show up but they may not have checked recently.
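A minimal sketch of the decision, in Python for illustration. Only the `-INCOMPLETE-` name comes from the issue; the name of a complete archive, the ISO date format, and the `plan_archive` helper are assumptions.

```python
import datetime


def archive_name(complete, day):
    """Build the archive file name; incomplete archives are marked as such."""
    marker = "" if complete else "-INCOMPLETE"
    return f"Oregon-GTFS-feeds{marker}-{day.isoformat()}.zip"


def plan_archive(results, day, allow_incomplete=False):
    """Decide what goes into the archive.

    `results` maps feed name -> local file path, or None when the
    download failed. Returns (archive file name, included feeds, missing feeds).
    """
    missing = sorted(name for name, path in results.items() if path is None)
    if missing and not allow_incomplete:
        # The caller can turn this into the warning shown to the user.
        raise RuntimeError("could not download: " + ", ".join(missing))
    included = {name: path for name, path in results.items() if path is not None}
    return archive_name(not missing, day), included, missing
```

The web interface would call `plan_archive` once without `allow_incomplete`, show the warning listing the missing feeds, and call it again with `allow_incomplete=True` if the user elects to continue.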
Either for all GTFS feeds, or only those changed since a certain date. Drop the archives in the users' archive download directory.
- GTFS Archive Tool needs to run automatically on a regular basis
- We need agency name, feed name, feed URL, and last modified date
to be available either in a CSV or from an API
Capturing a note.
I think the ZIP file download system is clunky for any purpose other than actually making an archive.
What we want is a way to query the historical versions of feeds contained in the archive and download their data via REST.
How about an endpoint that gives:
Compare the last-modified dates of files in the cache-manager. A function (already-have-fresh-feed? feed-name date) will check based on the cache refresh interval.
If so, the download agent should have a "successful" state with file-saved => true; however, it should change its file name to match the existing finished download.
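One way to read this, sketched in Python rather than the project's Clojure. The sketch assumes `date` is the time of the check and that each cache entry records the cached copy's `last_modified` time and `file` name; both the entry shape and `reuse_existing_download` are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def already_have_fresh_feed(cache, feed_name, now, refresh_interval):
    """True when the cached copy is younger than the refresh interval."""
    entry = cache.get(feed_name)
    if entry is None:
        return False
    return now - entry["last_modified"] < refresh_interval


def reuse_existing_download(agent, entry):
    """Mark an agent successful and point it at the existing finished file."""
    return {
        **agent,
        "state": "successful",
        "file_saved": True,
        "file_name": entry["file"],  # same name as the finished download
    }


if __name__ == "__main__":
    now = datetime(2024, 1, 10, tzinfo=timezone.utc)
    cache = {
        "example-feed": {
            "last_modified": datetime(2024, 1, 9, tzinfo=timezone.utc),
            "file": "example-feed-2024-01-09.zip",
        }
    }
    if already_have_fresh_feed(cache, "example-feed", now, timedelta(days=7)):
        agent = {"feed": "example-feed", "state": "running", "file_name": None}
        print(reuse_existing_download(agent, cache["example-feed"]))
```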
this link is published on oregon-gtfs.com
http://archive.oregon-gtfs.com/oregon-public-feeds/archive-creator
it yields an error message
cc @ed-g
spotted by ODOT
The :download-file name depends on the modification time of the feed file.
Therefore until the feed is reachable, we won't know what to call it. Feeds where the network was down when the download agent was started were getting file names with no date. Instead, we can just wait until we grab the file, and then use its modification date.
Locate a Clojure library that supports checking the last modified time using the FTP "LIST" command. clj-ftp does support downloading but there is no LIST operation.
Some agencies don't provide a public download link but the data is available after filling out a form. We should just skip entries that don't have a download URL.
Download the file if they don't provide a modification-time, then compare it against existing archives to see if it's the same file we already have: if it compares the same, then pretend as though it had the same modification time.
We could also short-cut by checking the file size. It's not guaranteed to change when the feed does, but it probably will. Then download, say, once per week to make sure.
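A sketch of the comparison, in Python for illustration. It checks the file size first as the cheap short-cut and only hashes when sizes match; SHA-256 and the `(path, modification_time)` pair layout are choices made for this sketch, not something the project specifies.

```python
import hashlib
import os


def file_digest(path, chunk_size=65536):
    """SHA-256 of a file, read in chunks so large feeds stay out of memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def effective_modification_time(new_path, archived):
    """Return the modification time of a byte-identical archived copy.

    `archived` is a list of (path, modification_time) pairs. Returns None
    when no archived file has the same contents as `new_path`.
    """
    new_size = os.path.getsize(new_path)
    new_digest = None
    for path, mtime in archived:
        if os.path.getsize(path) != new_size:
            continue  # different size, so the contents cannot be identical
        if new_digest is None:
            new_digest = file_digest(new_path)  # hash the new file only once
        if file_digest(path) == new_digest:
            return mtime
    return None
```

When the function returns None the download is a new version and needs its own timestamp, for example the time it was fetched.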
to work around servers which require a specific user agent
Check the download URL for 200 / 204 / 400 status.
Since some front-ends don't allow an HTTP HEAD request, we must send a GET first, find our redirect, and then check the modification-time.
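A hedged sketch of the GET-based check, using only Python's standard library. `urllib` follows redirects by default, so the `Last-Modified` header read here comes from the final URL. The user-agent string and function names are placeholders invented for the sketch.

```python
import urllib.request
from email.utils import parsedate_to_datetime


def parse_last_modified(header_value):
    """Parse an HTTP Last-Modified header; None when absent or malformed."""
    if not header_value:
        return None
    try:
        return parsedate_to_datetime(header_value)
    except (TypeError, ValueError):
        return None


def fetch_last_modified(url, user_agent="gtfs-feed-archive-sketch/0.1", timeout=30):
    """GET the URL (following redirects) and report where we ended up.

    Returns (final_url, status, last_modified). A custom User-Agent is
    sent because some servers reject the library default. Error statuses
    such as 404 raise urllib.error.HTTPError instead of returning.
    """
    request = urllib.request.Request(
        url, headers={"User-Agent": user_agent}, method="GET"
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        last_modified = parse_last_modified(response.headers.get("Last-Modified"))
        return response.geturl(), response.status, last_modified
```

The response body is not read inside the `with` block, so the full feed is not deliberately downloaded, though the server may already have started sending it.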