Proxy Scrapper scrapes proxies from ~40 sources and checks them (HTTP only). The code needs cleanup, and help fixing it is welcome.
-
HTTP(s)
# Original
wget https://raw.githubusercontent.com/ReCaree/proxy-scrapper/master/proxy/http.txt
# Duplicates Removed
wget https://raw.githubusercontent.com/ReCaree/proxy-scrapper/master/proxy/http-removed.txt
-
SOCKS4
# Original
wget https://raw.githubusercontent.com/ReCaree/proxy-scrapper/master/proxy/socks4.txt
# Duplicates Removed
wget https://raw.githubusercontent.com/ReCaree/proxy-scrapper/master/proxy/socks4-removed.txt
-
SOCKS5
# Original
wget https://raw.githubusercontent.com/ReCaree/proxy-scrapper/master/proxy/socks5.txt
# Duplicates Removed
wget https://raw.githubusercontent.com/ReCaree/proxy-scrapper/master/proxy/socks5-removed.txt
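The lists can also be fetched and deduplicated from Python instead of `wget`. The following is a minimal sketch, not the code in `scrapper.py`: the helper names `dedupe_proxies` and `fetch_proxy_list` are hypothetical, and it assumes each list holds one `host:port` entry per line.

```python
import urllib.request


def dedupe_proxies(lines):
    """Return unique, non-empty entries in first-seen order."""
    seen = set()
    unique = []
    for line in lines:
        entry = line.strip()  # drop surrounding whitespace and newlines
        if entry and entry not in seen:
            seen.add(entry)
            unique.append(entry)
    return unique


def fetch_proxy_list(url, timeout=10):
    """Download a proxy list and return its lines (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        text = response.read().decode("utf-8", errors="replace")
    return text.splitlines()


# Example usage (performs a network request):
# url = "https://raw.githubusercontent.com/ReCaree/proxy-scrapper/master/proxy/http.txt"
# proxies = dedupe_proxies(fetch_proxy_list(url))
# print(len(proxies), "unique proxies")
```

Keeping first-seen order rather than using `set()` directly means the output stays comparable to the original file.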
-
Clone this repository and install the requirements with:
pip install -r requirements.txt
-
Run the scrapper:
python scrapper.py
To do:
- Add SOCKS4/5 checker
- Better multithreading
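As a starting point for the multithreading item above, here is one possible approach using only the standard library. It is a sketch under assumptions, not the current implementation: `check_http_proxy` and `filter_working` are hypothetical names, proxies are assumed to be `host:port` strings, and `http://example.com/` is a placeholder test URL.

```python
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def check_http_proxy(proxy, test_url="http://example.com/", timeout=5):
    """Return True if a plain-HTTP request through `proxy` succeeds.

    `proxy` is assumed to be "host:port"; `test_url` is a placeholder.
    """
    handler = urllib.request.ProxyHandler({"http": f"http://{proxy}"})
    opener = urllib.request.build_opener(handler)
    try:
        with opener.open(test_url, timeout=timeout) as response:
            return response.status == 200
    except Exception:
        # Any failure (timeout, refused connection, bad reply) counts as dead.
        return False


def filter_working(proxies, check=check_http_proxy, max_workers=50):
    """Run `check` on every proxy in a thread pool, keeping input order."""
    proxies = list(proxies)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(check, proxies))
    return [proxy for proxy, ok in zip(proxies, results) if ok]
```

Because `check` is a parameter, a future SOCKS4/5 check function could be passed in without changing the pool logic.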
Feel free to contribute fixes or new sources.