victoraavila/Web-Scraping-IC

Web-Scraping-IC

The code in this repository downloads the most recent version of the Padrão TISS documentation, which describes how health insurance plan data may be exchanged digitally.

One advantage of this code is that it also works with old versions of the ANS page. To check this, replace the value of the url variable with the link to this version from the Internet Archive (you will also need to comment out the banner-closing loops).
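For illustration, Internet Archive snapshot addresses follow the pattern `https://web.archive.org/web/<timestamp>/<original URL>`. The sketch below builds such an address. The function name `wayback_url`, the timestamp, and the `example.org` page address are hypothetical placeholders, not values taken from web-scraper.py.

```python
def wayback_url(original_url, timestamp):
    """Return the Internet Archive address for a snapshot of original_url.

    timestamp is a string of digits such as YYYYMMDDhhmmss.
    """
    if not timestamp.isdigit():
        raise ValueError("timestamp must contain only digits")
    return "https://web.archive.org/web/{}/{}".format(timestamp, original_url)


# Hypothetical values: substitute the real page address and snapshot timestamp.
url = wayback_url("https://example.org/padrao-tiss", "20200101000000")
print(url)
```

The resulting string is what would be assigned to the url variable in the script.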

Requirements

To run web-scraper.py, you need the most recent version of Mozilla Firefox installed.

You also need Python 3.6 or later.

Install the required Python modules by running the following in a bash terminal:

$   pip3 install -r requirements.txt

You do not need to run the executables located inside the Webdriver Proxy yourself. The script runs them automatically.
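The requirements above can be checked before the first run. The following is a minimal sketch, not part of web-scraper.py: the helper name `missing_requirements` is hypothetical, and it assumes Firefox is discoverable on PATH, which is often not the case on Windows or macOS. It also does not verify that Firefox is the most recent version.

```python
import shutil
import sys

MIN_PYTHON = (3, 6)  # minimum version stated in the requirements


def missing_requirements(version_info, firefox_path):
    """Return a list of human-readable problems; empty if none were found."""
    problems = []
    if tuple(version_info[:2]) < MIN_PYTHON:
        problems.append("Python 3.6 or later is required")
    if firefox_path is None:
        problems.append("Firefox was not found on PATH")
    return problems


if __name__ == "__main__":
    # shutil.which returns None when no "firefox" executable is on PATH.
    for problem in missing_requirements(sys.version_info, shutil.which("firefox")):
        print(problem)
```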

Running the program

After installing the requirements above, open a bash terminal in the root folder of this repository and run:

$   python3 web-scraper.py

By default, the scraper runs in headless mode, i.e. without the Firefox graphical interface. To change this, set options.headless = False:

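    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    options = Options()
    options.headless = False  # show the Firefox window
    driver = webdriver.Firefox(options=options)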
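If the installed Selenium release no longer accepts the `headless` attribute (it was deprecated during the 4.x series), the same effect can be obtained by passing Firefox's `-headless` command-line flag through `add_argument`. This is a sketch under that assumption. The helper names `firefox_arguments` and `make_driver` are hypothetical and do not appear in web-scraper.py.

```python
def firefox_arguments(headless=True):
    """Return the command-line arguments to pass to Firefox."""
    return ["-headless"] if headless else []


def make_driver(headless=True):
    """Create a Firefox WebDriver, with or without a visible window."""
    # Imported here so firefox_arguments can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    options = Options()
    for argument in firefox_arguments(headless):
        options.add_argument(argument)
    return webdriver.Firefox(options=options)
```

Calling `make_driver(headless=False)` would open the Firefox window, matching the behaviour of setting the attribute to False.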

When the run finishes, a PDF file named Componente Organizacional.pdf containing the documentation is saved in the project root folder.
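When the scraper is called from another script, it can be useful to confirm that the file has actually arrived. The sketch below polls for it. The helper `wait_for_file` and its default timeout are hypothetical, and treating a non-empty file as a finished download is a simplification. The script itself may already handle this differently.

```python
import time
from pathlib import Path


def wait_for_file(path, timeout=60.0, poll_interval=0.5):
    """Return True once path exists and is non-empty, False on timeout."""
    path = Path(path)
    deadline = time.monotonic() + timeout
    while True:
        if path.is_file() and path.stat().st_size > 0:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```

For example, `wait_for_file("Componente Organizacional.pdf")` run from the project root folder returns True as soon as the PDF is present.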
