fix: use better validation for new proxy list (#3)
* fix: use better validation for new proxy list

* chore: bump version
Justintime50 authored Dec 8, 2021
1 parent 941a2fa commit de750f3
Showing 9 changed files with 247 additions and 87 deletions.
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,9 @@
 # CHANGELOG
 
+## v0.1.1 (2021-12-07)
+
+* Overhauls the proxy list with different proxies (tested more thoroughly) as well as moves the list from a hardcoded constant to a text file
+
 ## v0.1.0 (2021-12-06)
 
 * Initial release allowing you to retrieve a random proxy or a list of proxies from a small initial list
6 changes: 5 additions & 1 deletion README.md
@@ -15,7 +15,11 @@ Retrieve proxy servers.
 
 Finding and storing a list of proxies can be taxing. Simply import `proxlist` and have it give you a rotating random proxy to run your requests through.
 
-The list of currently configured proxies have `SSL` support, were tested to be able to accept connections, and were able to serve requests within 10 seconds. This may change over time as proxies change and the list gets updated. These proxies come from all over the world and may not be performant, this package is intended for testing purposes and I make no guarantee about where the data sent through these proxies goes - this package should not (yet) be considered for production applications.
+The currently configured proxies have `SSL` support, were tested to be able to accept connections (3 independent tests to ensure consistency), and were able to serve requests within 15 seconds (your mileage may vary based on the content you are sending/receiving through the proxy and where you are located in the world; if you receive timeouts, simply bump the timeout up or try again). This may change over time as proxies change and the list gets updated.
+
+Proxies are returned in the form of strings (e.g. `ip:port`).
+
+These proxies come from all over the world and may not be performant. This package is intended for testing purposes, and I make no guarantee about where the data sent through these proxies goes; this package should not (yet) be considered for production applications.
 
 ## Install
 
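The README change above states that proxies come back as plain `ip:port` strings. A minimal sketch of turning such a string into the scheme-to-URL mapping that HTTP clients such as `requests` accept might look like the following. `build_proxy_mapping` is a hypothetical helper and not part of `proxlist`, and the address used is a documentation-range placeholder.

```python
from typing import Dict


def build_proxy_mapping(proxy: str) -> Dict[str, str]:
    """Turn an `ip:port` string into a scheme-to-proxy-URL mapping.

    Hypothetical helper for illustration; proxlist itself only returns the string.
    """
    host, _, port = proxy.partition(":")
    if not host or not port.isdecimal():
        raise ValueError(f"expected 'ip:port', got {proxy!r}")
    proxy_url = f"http://{host}:{port}"
    # The same HTTP proxy URL is used for both plain and TLS traffic,
    # mirroring the mapping built in scripts/validate_proxies.py.
    return {"http": proxy_url, "https": proxy_url}


if __name__ == "__main__":
    # "203.0.113.10:8080" is a placeholder, not a working proxy.
    print(build_proxy_mapping("203.0.113.10:8080"))
```

The resulting dictionary can be passed as the `proxies=` argument of `requests.get`, ideally together with a `timeout` of 15 seconds or more, in line with the validation criteria in this commit.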
100 changes: 100 additions & 0 deletions proxlist/data/proxies_to_validate.txt
@@ -0,0 +1,100 @@
46.229.187.169:53281
103.239.147.250:54623
103.152.238.82:8080
27.111.45.18:55443
45.42.177.30:3128
95.170.156.220:808
45.128.220.138:59394
203.207.52.206:8085
45.128.220.229:59394
96.9.77.71:8080
182.16.171.42:43188
45.128.220.194:59394
103.133.24.97:8181
118.201.86.149:3128
45.128.220.18:59394
182.253.171.223:8080
195.191.246.198:53281
123.200.5.210:45780
145.239.226.150:1080
77.40.252.172:8080
193.70.62.38:1080
180.193.216.208:8080
54.37.160.92:1080
190.90.24.14:999
1.117.100.196:7788
38.123.207.247:999
45.233.64.182:999
37.59.203.134:1080
119.82.240.46:6060
190.246.38.135:8080
200.215.249.2:999
202.169.255.12:8181
38.126.208.229:1080
78.138.99.78:1080
45.128.220.48:59394
202.43.190.10:53128
114.240.229.209:808
139.9.188.124:8118
139.255.194.98:8080
147.135.151.69:1080
181.196.205.250:38178
87.255.13.217:8080
61.7.138.87:8080
117.161.75.82:3128
139.255.10.234:8080
103.150.181.48:8080
116.62.127.66:8118
109.201.9.100:8080
47.254.75.151:8181
190.92.67.210:999
181.198.86.74:999
117.114.149.66:55443
181.78.15.105:999
154.66.109.209:8080
92.249.122.108:61778
41.65.224.80:1981
185.208.102.139:8080
151.106.1.68:1080
45.189.117.237:999
119.15.95.158:8080
103.147.52.237:3127
192.162.192.148:55443
103.55.38.27:8080
3.84.87.10:3128
103.53.77.234:8080
36.89.190.85:8080
201.82.2.141:3128
45.128.220.124:59394
88.255.64.70:1981
103.205.183.18:55443
119.81.189.194:80
177.250.173.155:8080
89.208.35.81:3128
120.52.73.44:18080
190.205.42.43:999
189.127.229.92:8080
103.145.34.9:55443
45.160.78.5:999
178.63.126.11:1080
61.8.75.186:3128
191.97.16.181:999
51.75.49.208:42312
103.41.212.227:44759
118.172.43.60:8080
176.121.1.80:8181
195.91.221.230:55443
211.24.105.248:41917
201.20.100.142:53281
138.185.190.46:8080
45.175.160.2:999
105.112.191.250:3128
212.109.219.71:3128
178.217.174.172:8080
187.62.67.166:45005
103.145.133.22:42325
139.59.233.24:3128
190.211.105.86:55443
180.112.219.38:8118
134.249.114.53:8080
203.76.124.35:8080
24 changes: 24 additions & 0 deletions proxlist/data/proxy_list.txt
@@ -0,0 +1,24 @@
103.147.52.237:3127
103.239.147.250:54623
117.161.75.82:3128
119.82.240.46:6060
134.249.114.53:8080
145.239.226.150:1080
147.135.151.69:1080
180.112.219.38:8118
181.198.86.74:999
190.246.38.135:8080
195.191.246.198:53281
202.169.255.12:8181
202.43.190.10:53128
211.24.105.248:41917
3.84.87.10:3128
37.59.203.134:1080
38.126.208.229:1080
45.128.220.194:59394
45.128.220.48:59394
45.160.78.5:999
51.75.49.208:42312
61.7.138.87:8080
89.208.35.81:3128
95.170.156.220:808
40 changes: 18 additions & 22 deletions proxlist/proxies.py
@@ -1,32 +1,28 @@
+import os
 import random
 from typing import List
 
-# The following proxy list was generated from https://www.sslproxies.org/ by ensuring:
-# 1. Each proxy could handle connections
-# 2. Each proxy had SSL (HTTPS) support
-# 3. Each proxy could service requests in under 10 seconds
-# NOTE: The items above may change without notice for each proxy as could the integrity of this list.
-# TODO: Make this configurable as a JSON file and allow users to import a custom list of proxies
-PROXY_LIST = [
-    "103.124.2.229:3128",
-    "18.183.102.198:8899",
-    "181.10.230.100:57148",
-    "181.52.85.249:36107",
-    "197.248.184.157:53281",
-    "200.69.79.220:55443",
-    "203.193.131.74:3128",
-    "221.139.11.208:8080",
-    "8.210.219.124:59394",
-    "85.195.104.71:80",
-    "89.189.181.161:55855",
-]
-
 
 def random_proxy() -> str:
-    random_proxy = random.choice(PROXY_LIST)
+    """Returns a random proxy (ip:port) from the currently configured list."""
+    proxy_list = _open_proxy_list()
+    random_proxy = random.choice(proxy_list)
 
     return random_proxy
 
 
 def list_proxies() -> List[str]:
-    return PROXY_LIST
+    """Lists all proxies from the currently configured list."""
+    proxy_list = _open_proxy_list()
+
+    return proxy_list
+
+
+def _open_proxy_list():
+    """Opens the current proxy list text file."""
+    proxy_filepath = os.path.join('proxlist', 'data', 'proxy_list.txt')
+    with open(proxy_filepath, 'r') as filename:
+        data = filename.readlines()
+        proxy_list = [line_item.replace('\n', '').strip() for line_item in data]
+
+    return proxy_list
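`_open_proxy_list` above builds its path relative to the current working directory, so it appears to resolve only when the process is started from the repository root. One possible alternative, shown here as a sketch rather than as what this commit does, anchors the path to the module's own location. `packaged_proxy_path` and `read_proxy_file` are hypothetical names, and the sketch assumes it lives in the same directory as the `data` folder.

```python
from pathlib import Path
from typing import List


def packaged_proxy_path() -> Path:
    """Locate data/proxy_list.txt next to this module, independent of the cwd.

    Assumption: this module sits in the same directory as the `data` folder.
    """
    return Path(__file__).resolve().parent / "data" / "proxy_list.txt"


def read_proxy_file(path: Path) -> List[str]:
    """Return the stripped, non-empty lines of a proxy list file."""
    with open(path, "r") as proxy_file:
        # Skipping blank lines keeps a trailing newline from producing
        # an empty "proxy" entry.
        return [line.strip() for line in proxy_file if line.strip()]
```

With these two pieces, `read_proxy_file(packaged_proxy_path())` would return the same kind of list as `_open_proxy_list()`, provided the text file is shipped with the package as the `package_data` change in `setup.py` intends.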
59 changes: 0 additions & 59 deletions proxlist/validate_proxies.py

This file was deleted.

77 changes: 77 additions & 0 deletions scripts/validate_proxies.py
@@ -0,0 +1,77 @@
import os
from threading import Thread
from typing import List, Optional

import requests

# This script can be used to test the validity of a list of proxies

# The following rules are the criteria for a proxy to make it into the library:
# 1. Each proxy can handle at least 3 separate connections (proves consistency)
# 2. Each proxy has SSL (HTTPS) support
# 3. Each proxy could service requests in under 15 seconds (performant, 15 seconds chosen for "around the world" buffer)
# NOTE: The items above may change without notice for each proxy as could the integrity of this list.


def main():
    """Print to console the proxies that pass the test.
    Each proxy is tested once per iteration of the range below; a proxy printed that many times is consistently working.
    Discard any proxy that was printed fewer times.
    """
    proxy_list = proxies_to_validate()
    for proxy in proxy_list:
        for i in range(3):
            Thread(
                target=test_proxy,
                args=(proxy,),
            ).start()


def proxies_to_validate() -> List[str]:
    """Return a list of proxies to validate from a text file.
    These can be procured from a website such as: https://www.sslproxies.org/
    """
    proxy_filepath = os.path.join('data', 'proxies_to_validate.txt')
    with open(proxy_filepath, 'r') as filename:
        data = filename.readlines()
        proxy_list = [line_item.replace('\n', '').strip() for line_item in data]

    return proxy_list


def test_proxy(proxy: str) -> Optional[str]:
    """Test that the proxy works by sending a request to an endpoint
    that returns the caller's IP address. If the connection succeeds,
    the response is the IP address of the proxy.
    """
    url = "http://api.ipify.org"
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:87.0) Gecko/20100101 Firefox/87.0",
        "Accept-Language": "en-US,en;q=0.5",
    }
    proxies = {
        "http": f"http://{proxy}",
        "https": f"http://{proxy}",
    }

    proxy_response = None  # Stays None unless the response passes the check below
    try:
        response = requests.get(url, proxies=proxies, headers=headers, timeout=15)
        # Only include proxies that are an actual proxy
        if response.text is not None and len(response.text) < 25 and len(response.text) > 12:
            ip_with_port = (
                proxy.split(":")[0] + ":" + proxy.split(":")[1]
            )  # Some redirect the IP here so we grab the original
            print(ip_with_port)
            proxy_response = response.text
    except Exception:
        # Couldn't connect to proxy, discard
        pass

    return proxy_response


if __name__ == "__main__":
    main()
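The script above prints a proxy once per successful attempt and leaves the counting to the reader. A rough sketch of automating that tally follows; it is not part of this commit. `consistent_proxies` is a hypothetical helper, and `check` stands in for any function that, like `test_proxy`, is assumed to return `None` on failure instead of raising.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, Optional


def consistent_proxies(
    proxies: Iterable[str],
    check: Callable[[str], Optional[str]],
    attempts: int = 3,
    max_workers: int = 20,
) -> List[str]:
    """Return the proxies for which every one of `attempts` checks succeeded."""
    # Drop duplicate inputs while preserving first-seen order.
    unique = list(dict.fromkeys(proxies))
    # Schedule each proxy `attempts` times so the checks run concurrently.
    jobs = [proxy for proxy in unique for _ in range(attempts)]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(check, jobs))  # results keep the order of `jobs`
    # A check counts as a pass when it returns anything other than None.
    passes = Counter(proxy for proxy, result in zip(jobs, results) if result is not None)
    return [proxy for proxy in unique if passes[proxy] == attempts]
```

Calling `consistent_proxies(proxies_to_validate(), test_proxy)` would then yield only the entries that passed all three attempts, which is the set the docstring of `main` asks the reader to pick out by hand.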
9 changes: 7 additions & 2 deletions setup.py
@@ -20,15 +20,20 @@
 
 setuptools.setup(
     name='proxlist',
-    version='0.1.0',
+    version='0.1.1',
     description='Your project description here',
     long_description=long_description,
     long_description_content_type="text/markdown",
     url='http://github.com/Justintime50/proxlist',
     author='Justintime50',
     license='MIT',
     packages=setuptools.find_packages(),
-    package_data={'proxlist': ['py.typed']},
+    package_data={
+        'proxlist': [
+            'py.typed',
+            'data/proxy_list.txt',
+        ],
+    },
     classifiers=[
         "Programming Language :: Python :: 3",
         "License :: OSI Approved :: MIT License",
15 changes: 12 additions & 3 deletions test/unit/test_proxies.py
@@ -2,12 +2,21 @@
 
 
 def test_random_proxy():
+    proxy_list = proxlist.proxies._open_proxy_list()
     random_proxy = proxlist.random_proxy()
 
-    assert random_proxy in proxlist.proxies.PROXY_LIST
+    assert random_proxy in proxy_list
 
 
 def test_list_proxies():
-    proxy_list = proxlist.list_proxies()
+    proxy_list = proxlist.proxies._open_proxy_list()
+    retrieved_proxy_list = proxlist.list_proxies()
 
-    assert proxy_list == proxlist.proxies.PROXY_LIST
+    assert retrieved_proxy_list == proxy_list
+
+
+def test_open_proxy_list():
+    proxy_list = proxlist.proxies._open_proxy_list()
+
+    assert type(proxy_list) == list
+    assert len(proxy_list) > 10
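The tests above confirm that the list loads and has more than 10 entries, but they do not check the shape of each entry. A format check along the following lines could complement them. It is a sketch only: `is_valid_proxy` is a hypothetical helper, and it assumes every entry is an IPv4 address followed by a port, which matches the entries in `proxy_list.txt`.

```python
import ipaddress


def is_valid_proxy(entry: str) -> bool:
    """Return True when `entry` looks like `IPv4:port` with a port in 1-65535."""
    host, sep, port = entry.rpartition(":")
    if not sep or not port.isdecimal():
        return False
    try:
        # Raises AddressValueError for anything that is not a dotted-quad IPv4 address.
        ipaddress.IPv4Address(host)
    except ipaddress.AddressValueError:
        return False
    return 1 <= int(port) <= 65535
```

A unit test could then assert `all(is_valid_proxy(p) for p in proxlist.list_proxies())` to catch stray blank lines or malformed entries when the text file is updated.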
