This project demonstrates basic scraping techniques using the Python Scrapy library on the school catalogue published at https://www.atlasskolstvi.cz.
Python 3.9+
pip package installer
virtualenv library
Go to a folder and create a Python virtual environment.
cd your-parent-folder
virtualenv venv
This should create a venv environment in your folder.
Next, run the command below to activate your virtual environment. The path shown is the Windows layout; on Linux or macOS, use source venv/bin/activate instead.
./venv/Scripts/activate
pip install -r requirements.txt
pip list
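As an optional cross-check of the steps above, the following minimal sketch uses only the standard library to report whether the interpreter is running inside a virtual environment and which version of a package is installed. The helper names in_virtualenv and installed_version are made up for this example, and it assumes that requirements.txt lists Scrapy.

```python
import sys
from importlib import metadata
from typing import Optional


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a virtual environment."""
    # Inside a virtual environment, sys.prefix points at the environment,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    # Assumption: requirements.txt lists Scrapy.
    print("scrapy version:", installed_version("scrapy"))
```

If the first line prints False, the environment was probably not activated in the current shell.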
Generate a basic Scrapy project and your first spider
scrapy startproject src
scrapy genspider atlasskol atlasskolstvi.cz
Set DATA_OUT in src/settings.py to a path on your computer where crawled data will be stored.
Inspect other settings to fit your needs.
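How the project reads DATA_OUT is not shown here. As a hedged illustration of one way a custom setting like this can be consumed, the sketch below defines an item pipeline that appends each scraped item as one JSON line to a file under that path. The class name JsonLinesExportPipeline and the file name items.jsonl are hypothetical; from_crawler, open_spider, close_spider and process_item are Scrapy's usual pipeline hooks.

```python
import json
from pathlib import Path


class JsonLinesExportPipeline:
    """Hypothetical pipeline: write each item as one JSON line under DATA_OUT."""

    def __init__(self, data_out):
        self.out_dir = Path(data_out)
        self.file = None

    @classmethod
    def from_crawler(cls, crawler):
        # Scrapy calls this hook with the running crawler; custom settings
        # such as DATA_OUT are available through crawler.settings.
        return cls(crawler.settings.get("DATA_OUT"))

    def open_spider(self, spider):
        # Create the output folder if needed, then open the file for appending.
        self.out_dir.mkdir(parents=True, exist_ok=True)
        # "items.jsonl" is an assumed file name, not taken from the project.
        self.file = (self.out_dir / "items.jsonl").open("a", encoding="utf-8")

    def close_spider(self, spider):
        if self.file is not None:
            self.file.close()

    def process_item(self, item, spider):
        # dict(item) works for plain dicts and scrapy.Item objects alike.
        self.file.write(json.dumps(dict(item), ensure_ascii=False) + "\n")
        return item
```

To try something like this, the class would need to be registered under ITEM_PIPELINES in src/settings.py, using whatever module path it ends up in.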
Woohoo!
scrapy crawl atlasskol