Visualizing and measuring the behaviors of hashtag networks.
This project provides several programming tools for retrospectively visualizing communication through hashtags on Twitter. By querying successive time periods, these tools can be used to examine the collective behavior of Twitter users during a specific timeframe. The following interactive plots examine the behavior of three hashtags during the time period surrounding the Boston Marathon Bombing.
My interest in graphing communication networks was sparked in a class on Discrete Mathematics during my senior year at URI. In that class I wrote a research paper on the subject and began using Mathematica to organize and then graph Twitter networks. I’ve continued to learn about the field of social network analysis while further developing a toolkit in Mathematica.
Abstract of the research paper
Twitter users are inadvertent cohorts in the perpetual construction of a global dataset. This paper demonstrates that the structure of this living laboratory can be examined with computer-generated graph models. The collecting algorithms developed for this purpose have access to public archives which contain all of the nearly 58 million Tweets generated each day. The principal challenge in understanding this social network has been the development and refinement of these collecting algorithms. Theorems from graph theory have been applied to the generated graphs, from which rudimentary conclusions can be drawn. This project should serve as a building block for future analysis of the graph structures contained in the Twitter social network.
Measuring the crisis response of Twitter users
The following interactive plot shows the graph density of three hashtag networks during the 16 days surrounding the Boston Marathon Bombing. In this context, graph density represents how often two Tweets share two hashtags; a complete graph represents a network focused on a single subject. Hovering over a single point on the graph shows the hashtag networks that correspond to that point in time.
Each data point was created from one of the successive hour-long intervals in a 16-day period. Each hour-long interval consists of 63 queries, where each query results in two offspring hashtags (a tree graph with 5 generations of 2 offspring).
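The query tree described above can be sketched in Python. This is an illustration only, not the Mathematica code used for the project: the function name `crawl_hashtag_tree`, the `lookup` callback (a stand-in for one search query), and the sample data are all hypothetical. It also assumes that the 63 queries correspond to one query per node of a binary tree with a root and 5 generations, since 1 + 2 + 4 + 8 + 16 + 32 = 63.

```python
def crawl_hashtag_tree(root, lookup, generations=5, offspring=2):
    """Expand a hashtag breadth-first and return (query_count, edges).

    `lookup(tag)` is a hypothetical stand-in for one search query; it must
    return a list of hashtags that co-occur with `tag`.
    """
    edges = set()
    queries = 0
    frontier = [root]
    # Level 0 is the root; levels 1..generations are its descendants.
    for _ in range(generations + 1):
        next_frontier = []
        for tag in frontier:
            queries += 1
            for child in lookup(tag)[:offspring]:
                if child != tag:
                    # Undirected edge; repeated pairs collapse in the set.
                    edges.add(frozenset((tag, child)))
                next_frontier.append(child)
        frontier = next_frontier
    return queries, edges


# Made-up co-occurrence data, for illustration only.
FAKE_RESULTS = {
    "boston": ["bostonstrong", "bostonmarathon"],
    "bostonstrong": ["boston", "bostonmarathon"],
    "bostonmarathon": ["boston", "bostonstrong"],
}

queries, edges = crawl_hashtag_tree("boston", lambda tag: FAKE_RESULTS[tag])
print(queries)     # 63 = 1 + 2 + 4 + 8 + 16 + 32
print(len(edges))  # 3: the three hashtags form a complete graph
```

With this made-up data every query leads back to the same three hashtags, so the 63 queries collapse into a triangle. That is the kind of network a density of one corresponds to.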
In the graph there are three sets of data representing the hashtags #boston, #bostonstrong, and #bostonmarathon. Regression lines drawn from 7th-degree polynomials overlay the individual data points.
Hovering over each data point displays the network graphs drawn from that particular hour-long period. It is interesting to note that hovering near the time of the bombing shows the context in which #bostonstrong was first used. Similarly, hovering over the periods when graph density was one shows that at many times the three hashtags were all part of the same complete graph.
After reviewing Mathematica’s library of graph measuring tools, I concluded that graph density was the most informative measure. The manner in which graph density represents the focus of a hashtag network is exemplified by the graphs below. The graph on the left has no particular focus and a density of 1/7. The graph on the right is clearly reacting to a single event and is complete (a graph density of 1).
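For readers who want to check the two density values, here is a minimal Python sketch of the standard density of a simple undirected graph: the number of edges divided by the number of possible edges, n(n-1)/2. It is not the project's Mathematica code, and it assumes this is the same definition the plots use. The function name `graph_density` and both example graphs are hypothetical. In particular, the 14-vertex path is just one simple graph whose density is 1/7, not necessarily the graph pictured.

```python
from fractions import Fraction


def graph_density(num_vertices, edges):
    """Density of a simple undirected graph: |E| / (n * (n - 1) / 2)."""
    # Ignore self-loops and treat (a, b) and (b, a) as the same edge.
    unique = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    possible = num_vertices * (num_vertices - 1) // 2
    return Fraction(len(unique), possible) if possible else Fraction(0)


# A path on 14 vertices: 13 edges out of 91 possible (illustrative only).
path_14 = [(i, i + 1) for i in range(13)]

# A complete graph on 4 vertices: every pair of vertices is connected.
complete_4 = [(i, j) for i in range(4) for j in range(i + 1, 4)]

print(graph_density(14, path_14))    # 1/7
print(graph_density(4, complete_4))  # 1
```

Any tree on n vertices has n - 1 edges, so its density is 2/n. Sparse, branching networks therefore score low, while a network in which every hashtag appears alongside every other scores 1.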
After building the backend for pulling and organizing Tweets, I constructed this interface to probe the nature of hashtag networks more quickly. This tool gave me the ability to change search terms, adjust the search characteristics, and dynamically flip through Mathematica's graph drawing library. I had hoped to include this interface for use on this site, but was unable to extend my Mathematica license for that purpose. Those interested can freely clone this code from my GitHub.