This is a web crawling project built using the Scrapy framework in Python.
This project contains a spider that crawls books.toscrape.com and collects book data from every page of the site. A custom middleware and item pipeline are also included, so the scraped items can be written directly to a MySQL database as the spider runs, or exported to a file with `scrapy crawl booksider -O <file>.json` (or `.csv`).
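The pipeline code is not shown here, but as a rough sketch of what a MySQL item pipeline can look like (the connection settings, table name, and item fields below are illustrative assumptions, not this project's actual schema), something along these lines would be registered under `ITEM_PIPELINES` in `settings.py`:

```python
# Minimal sketch of a MySQL item pipeline. Requires `pip install pymysql`.
# All names below (database, table, columns) are placeholders for illustration.
import pymysql


class MySQLPipeline:
    def open_spider(self, spider):
        # Connection parameters are placeholders; in practice read them from settings.py.
        self.conn = pymysql.connect(
            host="localhost", user="root", password="password", database="books"
        )
        self.cursor = self.conn.cursor()
        # Create the target table if it does not exist yet (assumed columns).
        self.cursor.execute(
            "CREATE TABLE IF NOT EXISTS books ("
            "  title VARCHAR(255),"
            "  price VARCHAR(32),"
            "  availability VARCHAR(64)"
            ")"
        )

    def process_item(self, item, spider):
        # Insert one row per scraped item.
        self.cursor.execute(
            "INSERT INTO books (title, price, availability) VALUES (%s, %s, %s)",
            (item.get("title"), item.get("price"), item.get("availability")),
        )
        self.conn.commit()
        return item

    def close_spider(self, spider):
        self.conn.close()
```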
- Python 3.6 or higher
- Scrapy (`pip install scrapy`)
Clone this repository:

    git clone https://github.com/yourname/scrapy-crawler.git
    cd scrapy-crawler
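After cloning, install Scrapy and run the spider from the project directory; for example (the output filename is only an illustration):

```console
pip install scrapy
scrapy crawl booksider -O books.json   # use a .csv extension for CSV output
```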