A scraper for Redfin.com, written in Python 3, that collects relevant real estate information for recently sold homes.
Three methods are used to reduce the risk of being blacklisted while scraping:
- random sleep between requests
- random User-Agent per request
- rotating between different proxies
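
The three methods above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the project's actual implementation: the `USER_AGENTS` and `PROXIES` pools, the delay bounds, and the function names `random_sleep`, `build_opener_and_headers`, and `fetch` are all assumptions made for the example.

```python
import random
import time
import urllib.request

# Hypothetical pools; replace with real User-Agent strings and working proxies.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/17.0 Safari/605.1.15",
]
PROXIES = ["http://10.0.0.1:8080", "http://10.0.0.2:8080"]  # placeholder addresses


def random_sleep(min_s=1.0, max_s=5.0, sleep=time.sleep):
    """Pause for a random duration so requests are not evenly spaced."""
    delay = random.uniform(min_s, max_s)
    sleep(delay)
    return delay


def build_opener_and_headers(user_agents=USER_AGENTS, proxies=PROXIES):
    """Pick a random proxy and User-Agent for the next request."""
    proxy = random.choice(proxies)
    headers = {"User-Agent": random.choice(user_agents)}
    handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    opener = urllib.request.build_opener(handler)
    return opener, headers, proxy


def fetch(url, timeout=10):
    """Fetch a URL after a random delay, through a random proxy and User-Agent."""
    random_sleep()
    opener, headers, _ = build_opener_and_headers()
    request = urllib.request.Request(url, headers=headers)
    with opener.open(request, timeout=timeout) as response:
        return response.read()
```

Choosing a new proxy and User-Agent on every call keeps consecutive requests from sharing an obvious fingerprint, and the `sleep` parameter makes the delay easy to stub out in tests.
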
- RedfinDownloadParser: parses the CSV download offered on the page. If the download exists, it is downloaded and parsed; otherwise, the parser reports failure.
- RedfinPageParser: if the CSV is not provided, the HTML page is parsed instead. It should handle multiple pages of results.
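
The CSV-first, HTML-fallback flow described above might look something like this. It is only a sketch under stated assumptions: `parse_csv_download`, `parse_listing`, and the `parse_page` callback are hypothetical names standing in for `RedfinDownloadParser` and `RedfinPageParser`, and the column names in the usage example are illustrative rather than Redfin's actual export format.

```python
import csv
import io


def parse_csv_download(csv_text):
    """Return the CSV rows as dicts, or None if no usable CSV was provided."""
    if not csv_text or not csv_text.strip():
        return None
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [dict(row) for row in reader]
    return rows or None


def parse_listing(csv_text, html_pages, parse_page):
    """Prefer the CSV download; fall back to parsing every HTML page.

    parse_page is a hypothetical callable that turns one HTML page
    into a list of record dicts.
    """
    rows = parse_csv_download(csv_text)
    if rows is not None:
        return rows
    results = []
    for page in html_pages:
        results.extend(parse_page(page))
    return results
```

For example, `parse_listing("ADDRESS,PRICE\n1 Main St,500000\n", [], parse_page)` would return the single CSV row without ever calling `parse_page`, while passing `None` as `csv_text` would run `parse_page` over each page in turn.
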
- Modular
- Handle request responses for different HTTP status codes
- Test
- SQL persistence
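
As one possible reading of the HTTP status code item above, a scraper could map each response code to a follow-up action. The mapping below is purely an assumption for illustration (the function name `action_for_status` and the action labels are hypothetical), not a description of how this project handles responses.

```python
from http import HTTPStatus


def action_for_status(status_code):
    """Map an HTTP status code to a hypothetical scraper action."""
    if status_code == HTTPStatus.OK:
        return "parse"   # page fetched successfully
    if status_code in (HTTPStatus.FORBIDDEN, HTTPStatus.TOO_MANY_REQUESTS):
        return "rotate"  # likely blocked: switch proxy and User-Agent
    if 500 <= status_code < 600:
        return "retry"   # server-side error: try again after a delay
    if status_code == HTTPStatus.NOT_FOUND:
        return "skip"    # listing no longer exists
    return "fail"        # anything else is treated as unhandled
```

Keeping this decision in one small function makes it straightforward to cover with unit tests, which fits the modular and test items in the list.
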