The code in this repository downloads the most recent version of the Padrão TISS documentation, which describes how health insurance plan data may be exchanged digitally.
One advantage of this code is that it also works with older versions of the ANS page. To try it, replace the value of the url variable with the link to this version from the Internet Archive (you will also need to comment out the banner-closing loops).
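As a rough illustration of pointing the scraper at an archived page, the sketch below builds a Wayback Machine address from a page URL and a snapshot timestamp. The helper name wayback_url and the example.org address are hypothetical placeholders, not values taken from web-scraper.py; substitute the real ANS page URL and the timestamp of the snapshot you want.

```python
def wayback_url(original_url, timestamp):
    """Build a Wayback Machine URL for a snapshot of original_url.

    timestamp is a string of digits in YYYYMMDDhhmmss form; the Wayback
    Machine also accepts shorter prefixes such as "2020" or "20200101".
    """
    if not (timestamp.isdigit() and 4 <= len(timestamp) <= 14):
        raise ValueError("timestamp must be 4 to 14 digits (YYYYMMDDhhmmss)")
    return "https://web.archive.org/web/{}/{}".format(timestamp, original_url)


# Hypothetical placeholder address, used only to show the resulting shape.
url = wayback_url("https://example.org/page", "20200101000000")
```

The resulting string can then be assigned to the url variable in place of the live address.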
To properly run web-scraper.py, you need to have the most recent version of Mozilla Firefox installed.
You also need Python 3.6 or later installed.
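If you are unsure which interpreter python3 points to, a quick check along these lines may help. It is only a sketch: the function name python_is_supported and the MIN_VERSION constant are illustrative and are not part of web-scraper.py.

```python
import sys

# Minimum interpreter version stated in this README.
MIN_VERSION = (3, 6)


def python_is_supported(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


if not python_is_supported():
    print("Python {}.{} or later is required.".format(*MIN_VERSION))
```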
Install the required Python modules by running the following in a bash terminal:
$ pip3 install -r requirements.txt
There is no need to run the executables located inside the Webdriver Proxy. They will run automatically.
After setting up the requirements above, open a bash terminal in the root folder of this repository and run:
$ python3 web-scraper.py
By default, the scraper runs in headless mode, i.e. without the Firefox graphical interface. To change this, set options.headless = False:
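from selenium import webdriver
from selenium.webdriver.firefox.options import Options

options = Options()
options.headless = False  # show the Firefox window
driver = webdriver.Firefox(options=options)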
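Newer Selenium 4 releases deprecated and later removed the headless attribute on the options object, so the assignment above may raise an error depending on the version listed in requirements.txt. In that case, passing Firefox's own -headless command-line flag through add_argument is the usual alternative. The sketch below assumes such a newer Selenium; the helper names headless_args and launch_firefox are hypothetical and do not appear in web-scraper.py.

```python
from typing import List


def headless_args(headless: bool) -> List[str]:
    """Return the Firefox command-line arguments for the requested mode."""
    # "-headless" is Firefox's own flag for running without a window.
    return ["-headless"] if headless else []


def launch_firefox(headless: bool = True):
    """Start Firefox through Selenium, with or without a visible window."""
    # Imported lazily so headless_args can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    options = Options()
    for arg in headless_args(headless):
        options.add_argument(arg)
    return webdriver.Firefox(options=options)
```

Calling launch_firefox(headless=False) opens a visible Firefox window; remember to call quit() on the returned driver when you are done.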
When the script finishes, a PDF file named Componente Organizacional.pdf containing all the content is saved to the project root folder.
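To confirm that the download succeeded, you could check for the file from Python. This is a minimal sketch, assuming the file name stated above; the function find_downloaded_pdf is illustrative and not part of the repository.

```python
from pathlib import Path

# File name stated in this README.
EXPECTED_NAME = "Componente Organizacional.pdf"


def find_downloaded_pdf(root="."):
    """Return the path to the downloaded PDF, or None if it is missing or empty."""
    candidate = Path(root) / EXPECTED_NAME
    if candidate.is_file() and candidate.stat().st_size > 0:
        return candidate
    return None
```

Run it from the project root folder, or pass the folder path as root.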