hyperlink_crawler

This tool traverses the Web as a linked graph, starting from the page given by `--url`. On each page it finds all outgoing links (`<a>` tags), stores them for that URL, and then repeats the process for each of them until `--limit` URLs have been traversed. The output is a JSON file with all incoming and outgoing link information.
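
The traversal described above can be sketched as a breadth-first crawl. This is a minimal illustration using only the Python standard library, not the project's actual code: the names `LinkParser`, `fetch_html`, and `crawl`, the breadth-first visiting order, and the `{"outgoing": [...], "incoming": [...]}` layout are all assumptions, since the README does not specify them. In this sketch, incoming links are only recorded between URLs that were actually traversed.

```python
import json
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def fetch_html(url):
    """Download a page as text; swap this out to crawl offline or in tests."""
    with urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def crawl(start_url, limit, fetch=fetch_html):
    """Traverse up to `limit` URLs breadth-first and return the link graph."""
    graph = {}
    queue = deque([start_url])
    while queue and len(graph) < limit:
        url = queue.popleft()
        if url in graph:
            continue  # already traversed
        try:
            html = fetch(url)
        except Exception:
            html = ""  # unreachable pages are kept, with no outgoing links
        parser = LinkParser()
        parser.feed(html)
        outgoing = []
        for href in parser.hrefs:
            # Resolve relative links and drop "#fragment" parts.
            absolute = urldefrag(urljoin(url, href)).url
            if absolute.startswith(("http://", "https://")) and absolute not in outgoing:
                outgoing.append(absolute)
        graph[url] = {"outgoing": outgoing, "incoming": []}
        queue.extend(outgoing)
    # Derive incoming links by inverting the outgoing edges.
    for url, node in graph.items():
        for target in node["outgoing"]:
            if target in graph:
                graph[target]["incoming"].append(url)
    return graph


def save_graph(graph, path):
    """Write the graph to a JSON file."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(graph, fh, indent=2)
```

Passing `fetch` as a parameter keeps the traversal logic separate from the network, so `crawl("https://example.com/", 10)` hits the live site while a dictionary lookup can stand in for it during testing.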