Why scrape? #8
Comments
@DLu: does that also contain Q&A content? 170MB seems small for the entirety of ROS Answers? Edit: looks like it does.
Sorta, the database structure is here: https://github.com/DLu/ros_metrics/blob/main/data/answers.yaml The question title/summary is included. The answer text is not.
ah, hm. So that might still need scraping then. Would you know of a way to retrieve the answer bodies as well, without scraping? This must exist, right?
Been there, done that.
Hello everyone, I had no idea that this API existed, thank you so much @gavanderhoorn and @DLu! @DLu, I noticed that the database structure provides a summary of the question rather than its entire content, and the comments also seem to be missing. Is it possible to obtain this information using the API as well?
I think the field is just named summary, but it's actually the whole text. See https://answers.ros.org/api/v1/questions/408502/
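For anyone wanting to check this themselves, here is a minimal Python sketch of reading that field. The endpoint URL is the one linked above; that the response is a JSON object with a top-level `summary` key is an assumption based on this thread, and the function names are made up for illustration. The site may also no longer answer requests, so treat `fetch_question_text` as untested against the live API.

```python
import json
from urllib.request import urlopen

# Base URL taken from the link in this thread.
API_ROOT = "https://answers.ros.org/api/v1/questions/"


def question_url(question_id: int) -> str:
    """Build the per-question API URL (hypothetical helper)."""
    return f"{API_ROOT}{question_id}/"


def extract_text(payload: str) -> str:
    """Return the question text from a JSON payload.

    Assumption: the text lives in a top-level "summary" key.
    Returns an empty string if the key is absent.
    """
    data = json.loads(payload)
    return data.get("summary", "")


def fetch_question_text(question_id: int, timeout: float = 10.0) -> str:
    """Download one question and return its text (performs a network call)."""
    with urlopen(question_url(question_id), timeout=timeout) as response:
        return extract_text(response.read().decode("utf-8"))
```

Calling `fetch_question_text(408502)` would request the question linked above; `extract_text` can be tried offline on any saved response.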
Last I checked, no
I would have guessed the beginning of April. How off are the numbers you're getting?
Somewhat off-topic perhaps, but the following query (select id from answers where user_id == 5184) returns I also can't get the total karma to match what ROS Answers shows, but that's not really important.
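A rough Python sketch of running that kind of query against a local SQLite copy, in case it helps compare numbers. The table name `answers` and the columns `id` and `user_id` come from the query above; the in-memory demo table and its rows are invented for illustration and are not real ROS Answers data.

```python
import sqlite3


def answer_ids_for_user(conn: sqlite3.Connection, user_id: int) -> list:
    """Return the ids of all answers written by the given user, in id order."""
    rows = conn.execute(
        "SELECT id FROM answers WHERE user_id = ? ORDER BY id",
        (user_id,),
    )
    return [row[0] for row in rows]


def demo_connection() -> sqlite3.Connection:
    """Create an in-memory stand-in for the dump (made-up rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE answers (id INTEGER PRIMARY KEY, user_id INTEGER)")
    conn.executemany(
        "INSERT INTO answers (id, user_id) VALUES (?, ?)",
        [(1, 5184), (2, 42), (3, 5184)],
    )
    return conn
```

With a real dump, `sqlite3.connect("path/to/your/copy.db")` would replace `demo_connection()`; the path is a placeholder. Using `len(answer_ids_for_user(conn, 5184))` gives a count to compare against what the site shows.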
My local copy says
Hi guys. Interesting project.
I was curious as to why you're using web scraping to get ROS Answers content? IIRC, there is support for exporting/dumping the database (using a web API) in a relatively usable format. That would seem to allow more convenient processing of it.
The dump / API access was used by @DLu to create the ROS Answers section of metrics.ros.org (source).
Perhaps he could say something as to whether that could also be made available for scientific research purposes.