🏢💼 Jobinja Job Listings Scraper 🔍💻

This repository contains a 🐍 Python script that scrapes job listings from Jobinja. The script is designed to extract detailed information from job ads, such as job title 📋, job type ⏱️, location 📍, and other relevant attributes, and output the data for further 📊 analysis.

✨ Features

Scrape Job Listings: Extracts information ℹ️ from job listings available on the Jobinja 🌐 website.
Detailed Data Extraction: Collects various attributes including job title 📜, company name 🏢, location 📍, work experience requirements 💼, contract type 📃, gender 🚻, minimum salary 💰, and education level 🎓.
Data Sorting and Display: Organizes the extracted data based on specified attributes and displays it in a tabular format 🧮 for easy analysis.
Save Extracted Data: Saves the sorted job listings as individual text files 📄 in a specified directory for later review.

🛠️ Requirements

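🐍 Python 3.x
The following third-party Python libraries are required:
- requests
- beautifulsoup4 (imported as BeautifulSoup from bs4)
- pandas

The script also uses os and time from the Python standard library, which need no installation.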

To install the dependencies, run:

pip install requests beautifulsoup4 pandas

🚀 Usage

1. 🔄 Clone the Repository

git clone https://github.com/yourusername/jobinja-job-scraper.git
cd jobinja-job-scraper

2. 📝 Edit the Main Script

Update the base URL 🔗 or headers 📋 if necessary:

url = "https://jobinja.ir/"
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3"
}
scraper = JobinjaScraper(url, headers)
scraper.scrape()
scraper.descriptive_statistics()
scraper.sort_data()

3. ▶️ Run the Script

Execute the script by running:

python jobinja_scraper.py

The script will scrape data from Jobinja 🌐, generate descriptive statistics 📊, and display sorted job data 📋. It will also save individual job data into .txt files 📁 in C:/Users/negin/jobinja_sorted_data.

📚 Class Description

`JobinjaScraper`

__init__(self, base_url, headers): Initializes the scraper with the base URL 🔗 and headers 📋.
get_links(self): Retrieves all relevant links 🔗 from the Jobinja base page for further processing.
extract_subpage_text(self): Extracts job attributes such as job title 📋, type ⏱️, location 📍, company 🏢, and other relevant details from each subpage.
scrape(self): Executes the process by calling get_links 🔗 and extract_subpage_text 📜 to gather job data 🗂️.
descriptive_statistics(self): Uses pandas to generate descriptive statistics 📊 for the dataset.
sort_data(self): Sorts the job data based on attributes like job title 📜, job type ⏱️, location 📍, etc., and displays the data in a structured matrix 🧮. It also saves the sorted data into text files 📁 for easy access.
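As a rough illustration of what descriptive_statistics and sort_data might do with pandas, here is a minimal sketch. It is not the repository's code: the function names summarize and sort_jobs, the sample records, and the choice of sort columns are all assumptions made for this example.

```python
import pandas as pd

# Hypothetical records; the keys mirror attributes named in this README.
jobs = [
    {"Job Title": "Software Developer", "Job Type": "Full-Time", "Job Location": "Tehran"},
    {"Job Title": "Data Analyst", "Job Type": "Part-Time", "Job Location": "Shiraz"},
    {"Job Title": "Accountant", "Job Type": "Full-Time", "Job Location": "Tehran"},
]


def summarize(records):
    """Return pandas descriptive statistics for every column, text columns included."""
    return pd.DataFrame(records).describe(include="all")


def sort_jobs(records, by=("Job Title", "Job Type", "Job Location")):
    """Return the records as a DataFrame sorted by the given columns (assumed order)."""
    frame = pd.DataFrame(records)
    return frame.sort_values(list(by)).reset_index(drop=True)


stats = summarize(jobs)
sorted_df = sort_jobs(jobs)

print(stats)
print(sorted_df.to_string(index=False))  # plain tabular view of the sorted rows
```

Passing include="all" matters here because most scraped attributes are text, and describe() skips non-numeric columns by default when numeric ones are present.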

🖨️ Output

Console Output: Displays scraped job data 📋, descriptive statistics 📊, and a sorted data matrix 🧮.
Text Files: Each job listing is saved as an individual text file 📄 in C:/Users/negin/jobinja_sorted_data with detailed job information.

📝 Example Output

URL: https://jobinja.ir/job/listing-url
Content Snippet: [Snippet of the job description]
Job Title: Software Developer
Job Type: Full-Time
Job Location: 📍 Tehran
Company Name: Example Co.
Contract Type: Permanent 📃
Work Experience: 3-5 Years 💼
Min Salary: 💰 40,000,000 IRR
Gender: 🚻 Female
Education Level: 🎓 Bachelor's Degree
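The "Key: value" layout above can be produced and written to one file per listing along these lines. This is a sketch under assumptions, not the script's actual logic: format_job, save_jobs, and the job_<n>.txt file naming are hypothetical, and a temporary directory stands in for the output folder.

```python
import tempfile
from pathlib import Path


def format_job(job):
    """Render one job record as 'Key: value' lines, in the record's own key order."""
    return "\n".join(f"{key}: {value}" for key, value in job.items())


def save_jobs(jobs, out_dir):
    """Write each record to job_<n>.txt under out_dir and return the written paths."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)  # create the folder if it is missing
    written = []
    for index, job in enumerate(jobs, start=1):
        target = out_path / f"job_{index}.txt"  # hypothetical naming scheme
        target.write_text(format_job(job) + "\n", encoding="utf-8")
        written.append(target)
    return written


# Sample record using values from the example output above.
sample_jobs = [
    {
        "URL": "https://jobinja.ir/job/listing-url",
        "Job Title": "Software Developer",
        "Job Location": "Tehran",
    }
]

with tempfile.TemporaryDirectory() as tmp:
    paths = save_jobs(sample_jobs, tmp)
    saved_text = paths[0].read_text(encoding="utf-8")

print(saved_text)
```

Writing with encoding="utf-8" keeps Persian text from listings intact regardless of the platform's default encoding.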

🗒️ Notes

The script includes error handling for SSL errors 🔒 and generic request errors ❗ to manage connectivity issues smoothly.
The script pauses for one second between requests (time.sleep(1)) ⏳ to avoid overwhelming the server.
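The two notes above could be combined roughly as follows. This is a minimal sketch, not the script's own code: fetch_pages, its parameters, the 10-second timeout, and the choice to record failed pages as None are assumptions for illustration.

```python
import time

import requests


def fetch_pages(urls, headers=None, delay=1.0, session=None):
    """Fetch each URL in turn, pausing between requests; failed pages map to None."""
    session = session or requests.Session()
    pages = {}
    for position, url in enumerate(urls):
        if position:
            time.sleep(delay)  # wait before every request except the first
        try:
            response = session.get(url, headers=headers, timeout=10)
            response.raise_for_status()  # turn 4xx/5xx responses into exceptions
            pages[url] = response.text
        except requests.exceptions.SSLError as error:
            # SSLError is caught first because it is a subclass of RequestException.
            print(f"SSL error for {url}: {error}")
            pages[url] = None
        except requests.exceptions.RequestException as error:
            print(f"Request failed for {url}: {error}")
            pages[url] = None
    return pages
```

Calling fetch_pages(["https://jobinja.ir/"], headers=headers) with the headers dictionary from the Usage section would return a dictionary mapping each URL to its HTML, or to None when the request failed.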

⚖️ License

This project is licensed under the MIT License.

🤝 Contributing

Feel free to submit a pull request 📥 if you have any improvements ✨ or bug fixes 🐛. All contributions are welcome 🤗.

👤 Author

Created by Negin Faal.
