PythonScraper

The "Search-Engine-and-Crawler" folder was cloned from divyanshch: https://github.com/divyanshch/Search-Engine-and-Crawler. The original crawler is in crawler.py. crawlerExpand.py separates the tasks into functions and adds logging, URL cleaning, and similar improvements. crawlerNoBS.py uses simple string searches instead of the BeautifulSoup library to find new links.
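To illustrate the string-search approach to link discovery, here is a minimal sketch. It is not the code from crawlerNoBS.py; the function name `extract_links` and the decision to skip fragment, `mailto:`, and `javascript:` links are assumptions made for this example.

```python
from urllib.parse import urljoin, urldefrag


def extract_links(html, base_url):
    """Collect href="..." values using plain string searches (no HTML parser).

    Hypothetical helper for illustration only; it handles double-quoted
    href attributes and nothing else.
    """
    links = []
    marker = 'href="'
    pos = 0
    while True:
        start = html.find(marker, pos)
        if start == -1:
            break
        start += len(marker)
        end = html.find('"', start)
        if end == -1:
            break
        raw = html[start:end].strip()
        pos = end + 1
        # Skip empty values, in-page anchors, and non-HTTP schemes.
        if not raw or raw.startswith(("#", "mailto:", "javascript:")):
            continue
        # Resolve relative links and drop the #fragment so duplicates collapse.
        absolute, _fragment = urldefrag(urljoin(base_url, raw))
        if absolute not in links:
            links.append(absolute)
    return links
```

A string search like this avoids the BeautifulSoup dependency, but it will miss single-quoted or unquoted attributes and can pick up `href` text that appears inside comments or scripts.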

The "Scraper" folder applies the same principles as the crawler but combines them with a string search across the pages of a single web domain. Instead of saving every page it encounters, it outputs the results of searching those pages for keywords.
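The idea can be sketched roughly as follows. This is not the code in the "Scraper" folder; `search_domain`, `fetch`, the `max_pages` limit, and the case-insensitive matching are all assumptions chosen for this example.

```python
import re
from collections import deque
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen


def fetch(url):
    """Download a page and return its text (hypothetical default fetcher)."""
    with urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def search_domain(start_url, keywords, fetch_page=fetch, max_pages=50):
    """Breadth-first crawl limited to start_url's domain.

    Returns {url: [keywords found on that page]} instead of saving pages.
    """
    domain = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    hits = {}
    pages = 0
    while queue and pages < max_pages:
        url = queue.popleft()
        try:
            html = fetch_page(url)
        except OSError:
            # Network errors (URLError is an OSError subclass): skip the page.
            continue
        pages += 1
        lowered = html.lower()
        found = [k for k in keywords if k.lower() in lowered]
        if found:
            hits[url] = found
        # Queue only links that stay on the starting domain.
        for href in re.findall(r'href="([^"#]+)', html):
            link = urljoin(url, href)
            if urlparse(link).netloc == domain and link not in seen:
                seen.add(link)
                queue.append(link)
    return hits
```

Passing the fetcher as a parameter keeps the crawl logic testable without network access; a call such as `search_domain("https://example.com/", ["python"])` would use the default `fetch`.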
