cameron-powell / wunderground-scraper
A web scraper for Weather Underground's history pages.
Given a location and a date (optional?), retrieve a valid history URL to pass to scrape_weather_data.
Currently, to cut down on how much JS we have to inject, we only attempt to set the location when getting the results URL. The results URL follows a set pattern, so we can change the date to what the user requested and use the much more reliable requests module to get the data.
Ex: Los Angeles, California
Currently expects only single-word city names. Multi-word city names need to be handled.
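The date swap described above could look roughly like the sketch below. The exact URL pattern is not stated here, so the `/YYYY/M/D/` segment, the example URL, and the function name `set_history_date` are all assumptions for illustration.

```python
import re
from datetime import date

# Assumed shape of the date segment inside a history URL: /YYYY/M/D/
_DATE_SEGMENT = re.compile(r"/\d{4}/\d{1,2}/\d{1,2}/")


def set_history_date(results_url, requested):
    """Return results_url with its date segment replaced by `requested`.

    Hypothetical helper: it only works if the URL really contains a
    /YYYY/M/D/ segment, which is an assumption about the site's pattern.
    """
    new_segment = f"/{requested.year}/{requested.month}/{requested.day}/"
    updated, count = _DATE_SEGMENT.subn(new_segment, results_url, count=1)
    if count == 0:
        # Fail loudly instead of silently requesting the wrong day.
        raise ValueError(f"no date segment found in {results_url!r}")
    return updated


# Illustrative URL only; the real results URL may be shaped differently.
example = "https://www.wunderground.com/history/airport/PARL/2017/1/1/DailyHistory.html"
print(set_history_date(example, date(2017, 10, 12)))
```

Raising on a missing date segment means a change in the site's URL pattern shows up as an error rather than as data for the wrong day.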
Explain what steps the program takes to accomplish the task and how it accomplishes each.
Explain how to kick off the unit tests.
Airport PARL returns an odd/incomplete data history (at least for month 10, day 12, year 2017)
Handle this or fail gracefully.
If there are network issues, etc., we may never receive a response. scrape_weather_data should plan for this.
Currently, the ads on the https://www.wunderground.com/history page sometimes keep the page from finishing loading (verify this is actually the issue). An attempted fix with 'window.stop();' seems to have improved things, but the problem has not gone away entirely.
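One way to plan for a response that never arrives is to put a timeout on the request and treat a failure as a normal outcome. The sketch below uses only the standard library so that it runs on its own. `fetch_history_page` and its parameters are hypothetical names, not part of the project. The requests module accepts a comparable `timeout` argument on `requests.get`.

```python
import socket
import urllib.error
import urllib.request


def fetch_history_page(url, timeout_seconds=10.0, opener=urllib.request.urlopen):
    """Return the page body as text, or None if no usable response arrives.

    `opener` is injectable so the function can be exercised without a network.
    """
    try:
        with opener(url, timeout=timeout_seconds) as response:
            return response.read().decode("utf-8", errors="replace")
    except (urllib.error.URLError, socket.timeout) as exc:
        # URLError covers DNS, connection, and HTTP errors;
        # socket.timeout covers a read that stalls past the deadline.
        print(f"Could not retrieve {url}: {exc}")
        return None
```

Returning None keeps the decision about what to do next (retry, skip, abort) with the caller instead of burying it in the fetch.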
Theories:
Could be the JS is just stuck in the queue and there's so much going on it will take forever to get to it.
Could be the JS just isn't getting executed at all.
The steps aren't displaying correctly on GitHub. Modify to make readable again.
The step numbers were not changed when the steps were updated for accuracy. Do this.
get_inputs validates inputs but does not tell the user which formats it expects. Display the formatting requirements for user inputs before asking the user for input.
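A minimal sketch of showing the expected formats before prompting is given below. The real signature of get_inputs and the accepted date format are not stated here, so the `MM/DD/YYYY` format, the prompt wording, and the `input_fn` and `output_fn` parameters are assumptions for illustration.

```python
# Format descriptions shown to the user; the date format is an assumption.
LOCATION_FORMAT = "City, State (example: Los Angeles, California)"
DATE_FORMAT = "MM/DD/YYYY (example: 10/12/2017)"


def get_inputs(input_fn=input, output_fn=print):
    """Print the expected formats, then ask for a location and a date.

    input_fn and output_fn are injectable so the prompts can be tested
    without a terminal. Validation is left out of this sketch.
    """
    output_fn("Expected formats:")
    output_fn(f"  Location: {LOCATION_FORMAT}")
    output_fn(f"  Date:     {DATE_FORMAT}")
    location = input_fn("Location: ")
    date_text = input_fn("Date: ")
    return location, date_text
```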
'city, state' needs to be reformatted to 'city,+state' for the get request to work correctly.
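The reformatting could be sketched as follows. `format_location` is a hypothetical helper. Replacing spaces inside a multi-word city name with `+` is an assumption extended from the 'city,+state' form above.

```python
def format_location(location):
    """Turn 'City, State' into 'City,+State' for use in a GET request.

    Spaces inside a name (as in 'Los Angeles') are also replaced with '+',
    which is an assumption about what the site accepts.
    """
    parts = [part.strip() for part in location.split(",")]
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"expected 'city, state', got {location!r}")
    city, state = ("+".join(part.split()) for part in parts)
    return f"{city},+{state}"


print(format_location("Los Angeles, California"))
```

The comma is kept literal on purpose. A general-purpose encoder such as `urllib.parse.quote_plus` would percent-encode it, which would not match the 'city,+state' form.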
Currently, get_url relies on unacceptable "guesses" at how long to wait to make things work.
Make waits more intelligent and implement retries.
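A generic sketch of the two ideas: poll for a condition up to a deadline instead of sleeping for a guessed duration, and retry a failing step with a growing delay. `wait_until` and `with_retries` are hypothetical helpers. How they would hook into get_url (for example, which condition signals that the results URL is ready) is not specified here.

```python
import time


def wait_until(condition, timeout_seconds=10.0, poll_interval=0.25):
    """Poll condition() until it is truthy or the deadline passes.

    Returns True as soon as the condition holds, False on timeout.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)


def with_retries(action, attempts=3, base_delay=1.0):
    """Call action() up to `attempts` times, doubling the delay each time.

    Re-raises the last exception if every attempt fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return action()
        except Exception as exc:  # broad on purpose in this sketch
            last_error = exc
            if attempt < attempts - 1:
                time.sleep(base_delay * 2 ** attempt)
    raise last_error
```

Polling returns as soon as the page is ready, so the fast case is not slowed down by a worst-case sleep, and the timeout bounds the slow case.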
Modify scrape_weather_data to store temperature data in a dict instead of building a string.
Convert the dict to JSON using the json module. This is a cleaner method.
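A minimal sketch of the dict-then-JSON approach is shown below. The function name, the `(label, value)` input shape, and the field names in the example are assumptions. The actual fields scraped by scrape_weather_data are not listed here.

```python
import json


def temperatures_to_json(readings):
    """Collect (label, value) pairs into a dict and serialize it as JSON.

    `readings` stands in for whatever scrape_weather_data extracts;
    the labels used below are illustrative only.
    """
    temperatures = {}
    for label, value in readings:
        temperatures[label] = value
    # json.dumps handles quoting and escaping that hand-built strings can miss.
    return json.dumps(temperatures, indent=2, sort_keys=True)


print(temperatures_to_json([("mean_temperature", 55), ("max_temperature", 61)]))
```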