Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement scrape_question method for scraping info from ROS Answers question URLs #5

Open
ChrisTimperley opened this issue Jun 29, 2020 · 0 comments
Labels
enhancement New feature or request high-priority

Comments

@ChrisTimperley
Copy link
Collaborator

ChrisTimperley commented Jun 29, 2020

Depends on the completion #4

You should use requests to fetch the contents of the web page, and lxml to parse the HTML contents into a DocumentTree. You can then use XPath queries to extract the information that you need from the page: https://docs.python-guide.org/scenarios/scrape/

Edit: We can use the AskBot API to programmatically scrape ROS Answers [https://github.com/ASKBOT/askbot-devel/blob/master/askbot/doc/source/api.rst]

def scrape_question(id: str) -> Dict[str, Any]:
    ...

Should use the following API end point: https://answers.ros.org/api/v1/questions/299232/

Requests should be used to interact with the AskBot API

@ChrisTimperley ChrisTimperley added enhancement New feature or request high-priority labels Jun 29, 2020
@ChrisTimperley ChrisTimperley changed the title Implement url_to_question method for scraping info from ROS Answers question URLs Implement scrape_question method for scraping info from ROS Answers question URLs Jul 2, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request high-priority
Projects
None yet
Development

No branches or pull requests

1 participant