Grab Food Delivery Web Scraper

This project aims to develop a web scraper to extract specific information from the Grab Food Delivery platform.

One View: Objective, Approach, Challenges, and Go-to Strategy Look (Finalized Approach-II)

Introduction 🧩

The scraper collects restaurant lists, restaurant details, delivery fees, and estimated delivery times for selected locations. It is implemented in Python with Selenium, follows object-oriented programming (OOP) concepts, and uses multithreading to improve scalability and performance.

Tasks 📝

The tasks performed by the web scraper include:

Extracting restaurant lists with details.
Creating a unique restaurant list.
Extracting average delivery fees and estimated delivery time for selected locations.
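
The last two tasks can be illustrated with a short sketch. It assumes each scraped record is a dict; the key names (`restaurant_id`, `location`, `delivery_fee`, `delivery_minutes`) and the sample values are hypothetical and not taken from the project code.

```python
from collections import defaultdict
from statistics import mean


def unique_restaurants(records):
    """Keep the first record seen for each restaurant ID."""
    seen = {}
    for record in records:
        seen.setdefault(record["restaurant_id"], record)
    return list(seen.values())


def average_by_location(records):
    """Average delivery fee and delivery time for each location."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["location"]].append(record)
    return {
        location: {
            "avg_delivery_fee": mean(r["delivery_fee"] for r in rows),
            "avg_delivery_minutes": mean(r["delivery_minutes"] for r in rows),
        }
        for location, rows in grouped.items()
    }


# Illustrative sample data (hypothetical keys and values).
sample = [
    {"restaurant_id": "A1", "location": "Loc-1", "delivery_fee": 2.0, "delivery_minutes": 30},
    {"restaurant_id": "A1", "location": "Loc-1", "delivery_fee": 2.0, "delivery_minutes": 30},
    {"restaurant_id": "B2", "location": "Loc-1", "delivery_fee": 4.0, "delivery_minutes": 40},
]
unique = unique_restaurants(sample)
averages = average_by_location(unique)
```

Deduplicating before averaging keeps a restaurant that appears twice in the raw results from being counted twice in the per-location averages.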

Data Extraction ⌗

The scraper extracts the following fields/column data visible on the Grab Food Delivery website:

Restaurant Name
Restaurant Cuisine
Restaurant Rating
Estimated Time of Delivery
Restaurant Distance from Delivery Location
Promotional Offers
Restaurant Notice
Image Link of the Restaurant
Is Promo Available (True/False)
Restaurant ID
Restaurant Latitude and Longitude
Estimated Delivery Fee
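
One possible way to model these fields in an OOP style is a dataclass, sketched below. The class name, attribute names, and units (minutes, kilometres) are assumptions made for illustration, not the project's actual schema, and the example values are made up.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class RestaurantRecord:
    """One row of scraped data (hypothetical field names and units)."""
    restaurant_id: str
    name: str
    cuisine: str
    rating: Optional[float] = None
    estimated_delivery_minutes: Optional[int] = None
    distance_km: Optional[float] = None
    promotional_offers: Optional[str] = None
    notice: Optional[str] = None
    image_link: Optional[str] = None
    latitude: Optional[float] = None
    longitude: Optional[float] = None
    estimated_delivery_fee: Optional[float] = None

    @property
    def is_promo_available(self) -> bool:
        # Derived flag: True when any promotional offer text is present.
        return bool(self.promotional_offers)

    def to_row(self) -> dict:
        """Flatten the record into a dict suitable for CSV or NDJSON output."""
        row = asdict(self)
        row["is_promo_available"] = self.is_promo_available
        return row


# Made-up example values.
record = RestaurantRecord(
    restaurant_id="A1",
    name="Example Kitchen",
    cuisine="Noodles",
    rating=4.5,
    promotional_offers="10% off",
)
row = record.to_row()
```

Deriving `is_promo_available` from the offers field, rather than storing it separately, avoids the two values drifting apart.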

Documentation 📄

Approach and Methodology

Scraping Logic: The scraper opens the Grab Food Delivery website, selects the delivery location, and then relies on the site's API calls to fetch the restaurant data.
OOP Implementation: The code follows object-oriented programming principles, ensuring modularity and maintainability.
Optimization: Multithreading is employed to enhance performance and scalability, enabling efficient data extraction.
Data Handling: Extracted data is saved as CSV and as gzip-compressed NDJSON (newline-delimited JSON) for storage and analysis.
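
The multithreading and data-handling points above could look roughly like the following sketch. `fetch_restaurants` is a hypothetical stand-in for the real per-location scraping step, and the file names and location labels are placeholders; only the standard library is used.

```python
import csv
import gzip
import json
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def fetch_restaurants(location):
    """Hypothetical stand-in for the per-location scraping step."""
    return [{"restaurant_id": f"{location}-1", "name": "Example", "location": location}]


def scrape_all(locations, max_workers=4):
    """Scrape several locations concurrently and flatten the results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input locations.
        results = pool.map(fetch_restaurants, locations)
        return [row for rows in results for row in rows]


def save_csv(rows, path):
    """Write rows to CSV, using the keys of the first row as the header."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)


def save_ndjson_gz(rows, path):
    """Write one JSON object per line into a gzip-compressed file."""
    with gzip.open(path, "wt", encoding="utf-8") as handle:
        for row in rows:
            handle.write(json.dumps(row, ensure_ascii=False) + "\n")


out_dir = tempfile.mkdtemp()
csv_path = os.path.join(out_dir, "restaurants.csv")
ndjson_path = os.path.join(out_dir, "restaurants.ndjson.gz")

rows = scrape_all(["Loc-1", "Loc-2"])
save_csv(rows, csv_path)
save_ndjson_gz(rows, ndjson_path)
```

Threads suit this workload because the time is spent waiting on network responses rather than on CPU-bound work.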

Challenges Faced ✅

Selenium Wire: The selenium-wire package depends on Blinker but does not support Blinker's latest version, so Blinker has to be pinned explicitly to 1.7.0.
Blocking and Authentication: Proxy/IP rotation is used so that requests do not all come from a single IP address that could get blocked.
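
As a rough illustration of proxy rotation, the sketch below cycles through a proxy list and builds an options dict in the shape Selenium Wire documents for its `proxy` setting. The proxy URLs are placeholders, and the helper name `proxy_options` is hypothetical; the project's own rotation code may differ.

```python
from itertools import cycle

# Placeholder proxy endpoints (not real servers).
PROXIES = [
    "http://proxy-a.example:8080",
    "http://proxy-b.example:8080",
]


def proxy_options(proxy_url):
    """Build an options dict for one proxy.

    The dict is meant to be passed as `seleniumwire_options=` when creating
    a `seleniumwire.webdriver.Chrome` driver.
    """
    return {
        "proxy": {
            "http": proxy_url,
            "https": proxy_url,
            "no_proxy": "localhost,127.0.0.1",
        }
    }


# Round-robin rotation: each new driver takes the next proxy in the list.
rotation = cycle(PROXIES)
first = proxy_options(next(rotation))
second = proxy_options(next(rotation))
third = proxy_options(next(rotation))  # wraps around to the first proxy
```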

Improvements and Optimizations

Error Handling: Implement more robust error handling mechanisms to handle edge cases gracefully.
Proxy Rotation: Rotate proxies more efficiently; at the moment, rotation happens only at the very first step.
Multi-Processing: The concurrency setup can be improved considerably; with more time, it will be optimized further.
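
The first two improvements could be combined along these lines: retry a failed request with exponential backoff and switch to the next proxy on every attempt. This is a sketch of one option, not the project's implementation; `fetch_with_retry`, `flaky_fetch`, and the proxy URLs are all hypothetical.

```python
import time
from itertools import cycle

# Placeholder proxy endpoints (not real servers).
PROXIES = ["http://proxy-a.example:8080", "http://proxy-b.example:8080"]


def fetch_with_retry(fetch, proxies, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch(proxy), rotating proxies and backing off between failures."""
    rotation = cycle(proxies)
    last_error = None
    for attempt in range(max_attempts):
        proxy = next(rotation)
        try:
            return fetch(proxy)
        except Exception as error:  # real code should catch narrower exceptions
            last_error = error
            if attempt < max_attempts - 1:
                # Exponential backoff: base_delay, then 2x, then 4x, ...
                sleep(base_delay * 2 ** attempt)
    raise RuntimeError(f"all {max_attempts} attempts failed") from last_error


# Demo: the first call fails (simulated block), the second succeeds.
calls = []
delays = []


def flaky_fetch(proxy):
    calls.append(proxy)
    if len(calls) == 1:
        raise ConnectionError("simulated block")
    return f"ok via {proxy}"


# delays.append replaces time.sleep so the demo records waits instead of sleeping.
result = fetch_with_retry(flaky_fetch, PROXIES, sleep=delays.append)
```

Passing `sleep` as a parameter keeps the function easy to test without real waiting.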

Execution Steps 🚀

# Clone this project
$ git clone https://github.com/{{YOUR_GITHUB_USERNAME}}/food-grab-web-scraping

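# Enter the project directory
$ cd food-grab-web-scraping

# Set up and activate a virtual environment (Linux/macOS; on Windows run venv\Scripts\activate)
$ python3 -m venv venv
$ source venv/bin/activate

# Install dependencies
$ pip install -r requirements.txt

# Run the project (from the directory that contains XHR.py)
$ python3 XHR.py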

Please note, this project is developed for education and learning purposes only.

Aadi71/food-grab-web-scraping
