Making things with Twitter

Twitter is a rich source for procedural poetry. The text found on Twitter is constantly updating, ever-changing, and reflects the thoughts and points of view of millions of people around the world. As such, it's a very different source of text than the static texts we've been working with so far in class. It has different affordances and different opportunities.

Twitter makes available an "API", or "application programming interface," to developers who want to access the text on Twitter in their own computer programs. We're going to learn how to do two things with the Twitter API: first, we'll learn how to search for text on Twitter; then we'll learn how to read a particular user's timeline.

More on APIs

So what do we mean by "application programming interface"? Well, consider Twitter's normal search interface (which you should play around with a bit, if you haven't already). It looks a bit like this:

Twitter search results

The search interface allows us to find tweets whose contents match a particular string---sort of like a big grep for Twitter. Nice! But what if we wanted to make procedural poetry from those tweets? We'd need to find some way of getting them into our Python program.

You might think of a few solutions to this problem, like cutting-and-pasting the text of the tweets one-by-one into a text file. This is a fine solution, but it's very tedious, and limits the scale of what we can do with Twitter! The benefit of using Twitter, after all, is that we have access to billions of tweets, and we don't necessarily want our procedure to be limited by the amount of human labor we can expend in cutting-and-pasting tweets.

Fortunately, there's an easier way: Twitter provides a special version of the search interface that is just for computers. Instead of returning a web page with the search results, this version of the search interface returns a dictionary data structure with information about all of the tweets (including their text). The data structure is designed to be easily computer-readable, and it looks like this:

{
  "statuses": [
    {
      "coordinates": null,
      "favorited": false,
      "truncated": false,
      "created_at": "Mon Sep 24 03:35:21 +0000 2012",
      "id_str": "250075927172759552",
      "entities": {
        "urls": [
 
        ],
        "hashtags": [
          {
            "text": "freebandnames",
            "indices": [
              20,
              34
            ]
          }
        ],
        "user_mentions": [
 
        ]
      },
      "in_reply_to_user_id_str": null,
      "contributors": null,
      "text": "Aggressive Ponytail #freebandnames",
      "metadata": {
        "iso_language_code": "en",
        "result_type": "recent"
      },
      "retweet_count": 0,
      "in_reply_to_status_id_str": null,
      "id": 250075927172759552,
      "geo": null,
      "retweeted": false,
      "in_reply_to_user_id": null,
      "place": null,
      "user": {
        "profile_sidebar_fill_color": "DDEEF6",
        "profile_sidebar_border_color": "C0DEED",
        "profile_background_tile": false,
        "name": "Sean Cummings",
        "profile_image_url": "http://a0.twimg.com/profile_images/2359746665/1v6zfgqo8g0d3mk7ii5s_normal.jpeg",
        "created_at": "Mon Apr 26 06:01:55 +0000 2010",
        "location": "LA, CA",
        "follow_request_sent": null,
        "profile_link_color": "0084B4",
        "is_translator": false,
        "id_str": "137238150",
        "entities": {
          "url": {
            "urls": [
              {
                "expanded_url": null,
                "url": "",
                "indices": [
                  0,
                  0
                ]
              }
            ]
          },
          "description": {
            "urls": [
 
            ]
          }
        },
        "default_profile": true,
        "contributors_enabled": false,
        "favourites_count": 0,
        "url": null,
        "profile_image_url_https": "https://si0.twimg.com/profile_images/2359746665/1v6zfgqo8g0d3mk7ii5s_normal.jpeg",
        "utc_offset": -28800,
        "id": 137238150,
        "profile_use_background_image": true,
        "listed_count": 2,
        "profile_text_color": "333333",
        "lang": "en",
        "followers_count": 70,
        "protected": false,
        "notifications": null,
        "profile_background_image_url_https": "https://si0.twimg.com/images/themes/theme1/bg.png",
        "profile_background_color": "C0DEED",
        "verified": false,
        "geo_enabled": true,
        "time_zone": "Pacific Time (US & Canada)",
        "description": "Born 330 Live 310",
        "default_profile_image": false,
        "profile_background_image_url": "http://a0.twimg.com/images/themes/theme1/bg.png",
        "statuses_count": 579,
        "friends_count": 110,
        "following": null,
        "show_all_inline_media": false,
        "screen_name": "sean_cummings"
      },
      "in_reply_to_screen_name": null,
      "source": "Twitter for Mac",
      "in_reply_to_status_id": null
    },
    {
      "coordinates": null,
      "favorited": false,
      "truncated": false,
      "created_at": "Fri Sep 21 23:40:54 +0000 2012",
      "id_str": "249292149810667520",
      "entities": {
        "urls": [
 
        ],
        "hashtags": [
          {
            "text": "FreeBandNames",
            "indices": [
              20,
              34
            ]
          }
        ],
        "user_mentions": [
 
        ]
      },
      "in_reply_to_user_id_str": null,
      "contributors": null,
      "text": "Thee Namaste Nerdz. #FreeBandNames",
      "metadata": {
        "iso_language_code": "pl",
        "result_type": "recent"
      },
      "retweet_count": 0,
      "in_reply_to_status_id_str": null,
      "id": 249292149810667520,
      "geo": null,
      "retweeted": false,
      "in_reply_to_user_id": null,
      "place": null,
      "user": {
        "profile_sidebar_fill_color": "DDFFCC",
        "profile_sidebar_border_color": "BDDCAD",
        "profile_background_tile": true,
        "name": "Chaz Martenstein",
        "profile_image_url": "http://a0.twimg.com/profile_images/447958234/Lichtenstein_normal.jpg",
        "created_at": "Tue Apr 07 19:05:07 +0000 2009",
        "location": "Durham, NC",
        "follow_request_sent": null,
        "profile_link_color": "0084B4",
        "is_translator": false,
        "id_str": "29516238",
        "entities": {
          "url": {
            "urls": [
              {
                "expanded_url": null,
                "url": "http://bullcityrecords.com/wnng/",
                "indices": [
                  0,
                  32
                ]
              }
            ]
          },
          "description": {
            "urls": [
 
            ]
          }
        },
        "default_profile": false,
        "contributors_enabled": false,
        "favourites_count": 8,
        "url": "http://bullcityrecords.com/wnng/",
        "profile_image_url_https": "https://si0.twimg.com/profile_images/447958234/Lichtenstein_normal.jpg",
        "utc_offset": -18000,
        "id": 29516238,
        "profile_use_background_image": true,
        "listed_count": 118,
        "profile_text_color": "333333",
        "lang": "en",
        "followers_count": 2052,
        "protected": false,
        "notifications": null,
        "profile_background_image_url_https": "https://si0.twimg.com/profile_background_images/9423277/background_tile.bmp",
        "profile_background_color": "9AE4E8",
        "verified": false,
        "geo_enabled": false,
        "time_zone": "Eastern Time (US & Canada)",
        "description": "You will come to Durham, North Carolina. I will sell you some records then, here in Durham, North Carolina. Fun will happen.",
        "default_profile_image": false,
        "profile_background_image_url": "http://a0.twimg.com/profile_background_images/9423277/background_tile.bmp",
        "statuses_count": 7579,
        "friends_count": 348,
        "following": null,
        "show_all_inline_media": true,
        "screen_name": "bullcityrecords"
      },
      "in_reply_to_screen_name": null,
      "source": "web",
      "in_reply_to_status_id": null
    }
  ],
  "search_metadata": {
    "max_id": 250126199840518145,
    "since_id": 24012619984051000,
    "refresh_url": "?since_id=250126199840518145&q=%23freebandnames&result_type=mixed&include_entities=1",
    "next_results": "?max_id=249279667666817023&q=%23freebandnames&count=4&include_entities=1&result_type=mixed",
    "count": 4,
    "completed_in": 0.035,
    "since_id_str": "24012619984051000",
    "query": "%23freebandnames",
    "max_id_str": "250126199840518145"
  }
}

...okay, so "readable" doesn't seem like a good word to describe that, "computer-" or no. But hopefully you can recognize the contours of what you see above: it looks a bit like a Python dictionary, with some lists inside it, and some of those lists contain other dictionaries. Inside these dictionaries and lists are all of the pieces of information we're looking for: the task is just to figure out how to write Python expressions that access those bits of information.
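As a minimal sketch of what such expressions look like, here is a heavily trimmed copy of the structure above written as a Python literal, followed by a few expressions that dig into it. Most keys have been dropped, and the variable names (response, first_tweet, and so on) are just illustrative choices. The print calls use parentheses so that the sketch runs under either Python 2 or Python 3.

```python
# A trimmed-down copy of the response shown above. Only a handful of
# keys are kept; the real structure has many more.
response = {
    "statuses": [
        {
            "text": "Aggressive Ponytail #freebandnames",
            "entities": {"hashtags": [{"text": "freebandnames"}]},
            "user": {"screen_name": "sean_cummings"},
        },
        {
            "text": "Thee Namaste Nerdz. #FreeBandNames",
            "entities": {"hashtags": [{"text": "FreeBandNames"}]},
            "user": {"screen_name": "bullcityrecords"},
        },
    ],
    "search_metadata": {"count": 4},
}

# The value for "statuses" is a list, so it can be indexed like any list.
first_tweet = response["statuses"][0]

# Each item in that list is a dictionary describing one tweet.
first_text = first_tweet["text"]

# Dictionaries nest: the "user" value is itself a dictionary.
first_author = first_tweet["user"]["screen_name"]

# Lists nest too: "hashtags" is a list of dictionaries inside "entities".
first_hashtag = first_tweet["entities"]["hashtags"][0]["text"]

print(first_text)
print(first_author)
print(first_hashtag)
```

Reading each expression from left to right mirrors walking down into the structure: a key in square brackets steps into a dictionary, and a number in square brackets steps into a list.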

We're not going to talk about ALL of the information in this data structure; we're just going to look at a few patterns for getting the most interesting parts.

But first...

Obtaining Twitter API credentials

In order to use the Twitter API, you can't just use your normal username and password. Instead, you need four magical strings. We're not even going to discuss what these strings are, or what their names mean; for now, just know that these strings, together, act as a sort of "password" for the Twitter API.

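The four magical strings are called the API key, the API secret, the access token, and the access token secret.

In order to obtain these four magical strings, we need to create a Twitter application.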

This site has a good overview of the steps you need to perform in order to create a Twitter application. I'll demonstrate the process in class. You'll need to have already signed up for a Twitter account!

Using the Twitter API in Python

To access the Twitter API, we're going to use a Python library called Twython. I've already installed this library on the sandbox machine. (If you want to use this library on your own computer, come see me and I'll help you out.)

Here's a simple program that makes use of the Twitter API using Twython. There's a lot of strange stuff here, so don't be worried if some of it is confusing at first. I'll talk about the parts of the program that you can change below.

import sys
import twython

api_key, api_secret, access_token, token_secret = sys.argv[1:]

twitter = twython.Twython(api_key, api_secret, access_token, token_secret)

query = "sea rose"

response = twitter.search(q=query, result_type="recent", count=20)
for tweet in response['statuses']:
    print tweet['text']
Program: twitter_search.py

This program performs a Twitter search for whatever string is stored in the query variable. It then prints out the text of all matching tweets. Run it like so, replacing $API_KEY with your API key, $API_SECRET with your API secret, $ACCESS_TOKEN with your access token, and $TOKEN_SECRET with your access token secret:

$ python twitter_search.py $API_KEY $API_SECRET $ACCESS_TOKEN $TOKEN_SECRET
RT @evepaludan: £0.77 THE MAN WHO ROSE FROM THE SEA (#Angel #Detectives #2) #fantasy #timetravel #romance ►http://t.co/FiuRJlJZgg◄ RT http:…
RT @lisa_blake4: Lava rolling into the sea creating steam that rose so fast, it spawned multiple vortices! Photo by Bruce Omori. http://t.c…
RT @evepaludan: #99cents THE MAN WHO ROSE FROM THE SEA (#Angel #Detectives #2) #fantasy #timetravel #romance ►http://t.co/BchpcDqhlE◄ http:…
@naoscifra ESPERO QUE SEA MUY NOTORIO, FANSERVICE POR FAVOR
Sea dog skeleton from the Mary Rose, 1545. Portsmouth Historic Dockyard http://t.co/joLKQOMpMt
It's times like these when I really need some Barnegat Sea Scallops with a glass of Louis Larent Rose D'Anjou.
Russian photographer Alexander Semenov spends a lot of his time under the sea, capturing the alien-like beauty of... http://t.co/ALblkKRRi1
RT @evepaludan: #99cents THE MAN WHO ROSE FROM THE SEA (#Angel #Detectives #2) #fantasy #timetravel #romance ►http://t.co/BchpcDqhlE◄ http:…
RT @evepaludan: #99cents THE MAN WHO ROSE FROM THE SEA (#Angel #Detectives #2) #fantasy #timetravel #romance ►http://t.co/BchpcDqhlE◄ http:…
RT @evepaludan: #99cents THE MAN WHO ROSE FROM THE SEA (#Angel #Detectives #2) #fantasy #timetravel #romance ►http://t.co/BchpcDqhlE◄ http:…
RT @evepaludan: #99cents THE MAN WHO ROSE FROM THE SEA (#Angel #Detectives #2) #fantasy #timetravel #romance ►http://t.co/BchpcDqhlE◄ http:…
RT @evepaludan: #99cents THE MAN WHO ROSE FROM THE SEA (#Angel #Detectives #2) #fantasy #timetravel #romance ►http://t.co/BchpcDqhlE◄ http:…
RT @evepaludan: #99cents THE MAN WHO ROSE FROM THE SEA (#Angel #Detectives #2) #fantasy #timetravel #romance ►http://t.co/BchpcDqhlE◄ http:…
RT @evepaludan: #99cents THE MAN WHO ROSE FROM THE SEA (#Angel #Detectives #2) #fantasy #timetravel #romance ►http://t.co/BchpcDqhlE◄ http:…
RT @evepaludan: #99cents THE MAN WHO ROSE FROM THE SEA (#Angel #Detectives #2) #fantasy #timetravel #romance ►http://t.co/BchpcDqhlE◄ http:…
RT @evepaludan: £0.77 THE MAN WHO ROSE FROM THE SEA (#Angel #Detectives #2) #fantasy #timetravel #romance ►http://t.co/FiuRJlJZgg◄ RT http:…
RT @evepaludan: £0.77 THE MAN WHO ROSE FROM THE SEA (#Angel #Detectives #2) #fantasy #timetravel #romance ►http://t.co/FiuRJlJZgg◄ RT http:…
£0.77 THE MAN WHO ROSE FROM THE SEA (#Angel #Detectives #2) #fantasy #timetravel #romance ►http://t.co/FiuRJlJZgg◄ RT http://t.co/tcJOHlGfrb
#99cents THE MAN WHO ROSE FROM THE SEA (#Angel #Detectives #2) #fantasy #timetravel #romance ►http://t.co/BchpcDqhlE◄ http://t.co/KwH9C17V7o
#beauty #1: Ailiseu 100g Bath Dead Sea Salt - Champagne & Rose: Ailiseu 100g Bath Dead Sea Salt - Champagne & ... http://t.co/u1aSgah2nU

You should get a list of tweets that look like they contain either the word sea or the word rose.

Search example: breaking it down

Let's break down this example a little bit, to show what each line is doing.

import twython

This line "imports" the Twython library and makes it available in the program.

api_key, api_secret, access_token, token_secret = sys.argv[1:]

This line reads the Twitter credentials from the command line, using the sys.argv list. The slice [1:] skips the first item in that list, which is the name of the script itself.
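To see what that unpacking does without running the full program, here is a small sketch that uses a made-up list in place of sys.argv. The credential values are placeholders, not real keys, and the name fake_argv is invented for this illustration. Note that the unpacking only works when exactly four arguments follow the script name; otherwise Python raises a ValueError.

```python
# Stand-in for sys.argv: the script name comes first, then the arguments.
# These credential strings are placeholders for illustration only.
fake_argv = ["twitter_search.py", "KEY", "SECRET", "TOKEN", "TOKEN_SECRET"]

# [1:] skips the script name; the four remaining items are unpacked
# into four variables, in order.
api_key, api_secret, access_token, token_secret = fake_argv[1:]
print(api_key)
print(token_secret)

# With too few arguments the unpacking fails with a ValueError.
try:
    a, b, c, d = ["twitter_search.py", "KEY"][1:]
    unpack_failed = False
except ValueError:
    unpack_failed = True
print(unpack_failed)
```

So if the program stops with a ValueError on this line, the likely cause is that one of the four strings was left off the command line.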

twitter = twython.Twython(api_key, api_secret, access_token, token_secret)

This line "initializes" the library, and creates an object that gets assigned to a variable (called twitter here, but you could call it whatever). We'll primarily be interacting with the Twitter API by calling methods on this object.

response = twitter.search(q=query, result_type="recent", count=20)

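This is where the work of actually contacting the Twitter API happens. The .search() method opens up an Internet connection to Twitter's search server. The parameters in the method call have particular meanings:

q: the search query, i.e., the text you want to search for.
result_type: which kind of results to return. The value "recent" asks for the most recent matching tweets; the other options are "popular" and "mixed".
count: the maximum number of tweets to return.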

There are other parameters you can pass to this function, which you can read about here. These three, though, should be more than enough to get you started.

The .search() method evaluates to a Python dictionary, stored in the example above in a variable called response. You can print this variable if you'd like to see exactly what's in it, but the main item of interest is the key statuses, which is a list of dictionaries. From a high-level perspective, the structure looks something like this:

{
    'statuses': [
        {
            'text': 'tweet text!',
            'retweeted': False,
            'id_str': '123456789',
            [...other key/value pairs omitted...]
            'user': {
                'name': 'Fordham English',
                'screen_name': 'FordhamEnglish',
                [...other key/value pairs omitted...]
            }
        },
        {
            'text': 'another tweet!',
            'retweeted': False,
            'id_str': '123456788',
            [...other key/value pairs omitted...]
            'user': {
                'name': 'Fordham English',
                'screen_name': 'FordhamEnglish',
                [...other key/value pairs omitted...]
            }
        }
    ]
    [...other stuff omitted...]
}

That is: it's a dictionary, one of whose keys (statuses) has a list as its value. That list itself contains other dictionaries, whose key/value pairs describe information about individual tweets. Those dictionaries have yet another dictionary embedded inside them---specifically, the value for the user key, which is a dictionary containing information about the user who made the tweet.

So this line...

for tweet in response['statuses']:

... causes the program to loop over each "status" in the list. We're calling the temporary loop variable tweet, to emphasize that the information we're looking at is about a tweet. The variable tweet itself will contain each item of the list in succession. And each item in the list is a dictionary!

So, the line...

print tweet['text']

...will display the value for the key text, which contains the text of the tweet in question.
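The same pattern reaches the other keys as well. As a rough sketch, the loop below works on a hand-made response in the simplified shape shown earlier (a real one would come from twitter.search()) and pulls out the screen name from the embedded user dictionary alongside the text. The list name lines and the output format are arbitrary choices for this illustration.

```python
# A hand-made response in the simplified shape shown above; a real one
# would come from twitter.search().
response = {
    'statuses': [
        {'text': 'tweet text!',
         'user': {'name': 'Fordham English',
                  'screen_name': 'FordhamEnglish'}},
        {'text': 'another tweet!',
         'user': {'name': 'Fordham English',
                  'screen_name': 'FordhamEnglish'}},
    ]
}

lines = []
for tweet in response['statuses']:
    # tweet['user'] is a dictionary, so a second key lookup reaches inside it
    author = tweet['user']['screen_name']
    lines.append("@" + author + ": " + tweet['text'])

for line in lines:
    print(line)
```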

(to be continued!)

Using search creatively!

import sys
import twython

# read the four credentials from the command line
api_key, api_secret, access_token, token_secret = sys.argv[1:]
twitter = twython.Twython(api_key, api_secret, access_token, token_secret)

source_words = ["rose", "harsh", "rose", "marred", "and", "with", "stint",
        "of", "petal"]

for word in source_words:
    # ask for only the single most recent tweet matching this word
    response = twitter.search(q=word, result_type="recent", count=1)
    if len(response['statuses']) > 0:
        first_tweet = response['statuses'][0]
        tweet_text = first_tweet['text'].lower()
        # only use the tweet if the word actually appears in its text
        if word in tweet_text:
            # print the tweet from the first occurrence of the word onward
            pos = tweet_text.find(word)
            print tweet_text[pos:]
Program: twitter_elaborate.py
$ python twitter_elaborate.py $API_KEY $API_SECRET $ACCESS_TOKEN $TOKEN_SECRET <frost.txt
harsh. hahahaha
rose has ‘got to get out there and play’ http://t.co/kbqpan02rm
marred mind tho, no vex rt @jackdre02: wu asked for ur opinion?“druhgzz_: you myt be the enemy (cont) http://t.co/ptwvpr5kg6
and i'm feel like i am the one who have convocation today. haha
with anything that may work for you.
stint is over. got stuck under a truck after an hour of not saving -
of you non-violence preachers when people were being tear gassed and hit with rubber bullets for peacefully assembling? #foh
petal paisley http://t.co/2lcrseqrqr
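The string handling in twitter_elaborate.py can be tried on its own, without contacting Twitter. The sketch below pulls that step out into a helper function; the name from_word_onward is made up for this illustration, and the tweet texts are invented rather than real search results.

```python
def from_word_onward(word, tweet_text):
    """Return the lowercased tweet text starting at word, or None.

    Hypothetical helper: it mirrors the lower/find/slice steps in
    twitter_elaborate.py but is not part of that program.
    """
    lowered = tweet_text.lower()
    pos = lowered.find(word)  # find() returns -1 when word is absent
    if pos == -1:
        return None
    return lowered[pos:]

# Made-up tweet texts, for illustration only.
print(from_word_onward("petal", "So many PETAL patterns today"))
print(from_word_onward("stint", "nothing relevant here"))
```

The first call prints "petal patterns today", because the text is lowercased before searching; the second prints None, because the word never appears.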