Twitter-User-Analysis

Python code to pull user data from Twitter.

Data files

The data files were generated on October 8, 2013, except where noted.

  1. twitter_user_datascience_data.csv: generated by pulling Twitter user accounts associated with the query 'datascience'. Only about 150 of these users are active; the records for the remaining users are quite sparse.
  2. twitter_user_data.csv: generated by pulling Twitter user accounts associated with the query 'nfl'. This data is full and complete.
  3. twitter_user_data_data.csv: generated by pulling Twitter user accounts associated with the query 'data'. This data is full and complete. (Note: this file was generated on October 10, 2013.)

About the Data files

Each row describes a different Twitter user account. The columns are:

  1. handle - Twitter username
  2. name - full name of the Twitter user
  3. age - number of days the account has existed on Twitter
  4. num_of_tweets - number of tweets this user has created (includes retweets)
  5. has_profile - 1 if the user has created a profile description, 0 otherwise
  6. has_pic - 1 if the user has set up a profile picture, 0 otherwise
  7. num_following - number of other Twitter users this user is following
  8. num_of_favorites - number of tweets the user has favorited
  9. num_of_lists - number of public lists this user has been added to
  10. num_of_followers - number of other users following this user
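
As a rough illustration of working with these columns, here is a minimal sketch that reads rows with Python's standard `csv` module and converts the count columns to integers. It assumes the CSV files have a header row whose names match the column list above, which is not confirmed here. The `load_users` helper and the sample rows are hypothetical, made up for the example and not taken from the real data files.

```python
import csv
import io

# Column names as listed above. That the CSV files carry a matching
# header row is an assumption, not something the README confirms.
NUMERIC_COLUMNS = [
    "age", "num_of_tweets", "has_profile", "has_pic", "num_following",
    "num_of_favorites", "num_of_lists", "num_of_followers",
]


def load_users(fileobj):
    """Read user rows from an open CSV file and convert count columns to int."""
    users = []
    for row in csv.DictReader(fileobj):
        for column in NUMERIC_COLUMNS:
            row[column] = int(row[column])
        users.append(row)
    return users


# Made-up sample rows for illustration only; not from the real data files.
SAMPLE = """handle,name,age,num_of_tweets,has_profile,has_pic,num_following,num_of_favorites,num_of_lists,num_of_followers
example_a,Example A,1200,3400,1,1,150,20,5,900
example_b,Example B,30,0,0,0,10,0,0,2
"""

users = load_users(io.StringIO(SAMPLE))

# Example query: handles of users who have written a profile description.
with_profile = [u["handle"] for u in users if u["has_profile"] == 1]
print(with_profile)
```

To try this on one of the real files, pass an open file handle instead of the in-memory sample, for example `load_users(open("twitter_user_data.csv", newline=""))`, and adjust the column names if the actual header differs.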

How to run the code

  1. git clone https://github.com/swGooF/Twitter-User-Analysis.git
  2. cd Twitter-User-Analysis
  3. Open getdata.py and enter your Twitter access_token_key, access_token_secret, consumer_key, and consumer_secret.
  4. From a command line, run 'python getdata.py datascience 3'.

What are the parameters

  1. A query string: 'datascience' in the example above.
  2. The number of pages to return: '3' in the example above. Each page returns 20 users and requires a new API call; the current Twitter API rate limit is 180 calls per 15 minutes.
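
The arithmetic behind the page parameter can be sketched as follows. This is a back-of-the-envelope estimate, assuming exactly one API call per page and that nothing else consumes the rate-limit budget. The `estimate` function is a hypothetical helper and is not part of getdata.py.

```python
import math

USERS_PER_PAGE = 20      # from the README: each page returns 20 users
CALLS_PER_WINDOW = 180   # from the README: 180 calls per 15 minutes
WINDOW_MINUTES = 15


def estimate(pages):
    """Return (max users, rate-limit windows needed, minimum minutes of waiting).

    Assumes one API call per page and no other calls against the limit.
    """
    users = pages * USERS_PER_PAGE
    windows = math.ceil(pages / CALLS_PER_WINDOW)
    # Waiting is only needed between windows, not before the first one.
    min_wait = (windows - 1) * WINDOW_MINUTES if pages > 0 else 0
    return users, windows, min_wait


print(estimate(3))    # the 'python getdata.py datascience 3' example
print(estimate(200))  # more pages than fit in a single window
```

Under these assumptions, the 3-page example returns at most 60 users within a single window, while a single 15-minute window covers at most 180 pages, or 3,600 users.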

Some Analysis

R Code For Numerous Models

Basic iPython Analysis