Journal article. Author: Pensavalle, Alexander

Abstract

This article explores intersections between place, race/ethnicity, and gender amongst American Twitter users and makes an argument that studying the intensity of tweets provides insights into how and why particular groups tweet. Given recent events in American political life such as the shooting in Ferguson, Missouri and the reactions by young, urban African Americans on Twitter, understanding the role of race, place, gender, and age is important. We observed the time between tweets of urban American Twitter users and explored whether the medium may be providing traditionally marginalized groups, such as young Black men, with potential avenues for mobilizing communication and access to resources.

Historically, the most accessible large-scale social data to social scientists has been government data. With the emergence of computational social science, there is an opportunity to consider large-scale, and sometimes “messy,” detailed datasets of social interaction. These have sometimes been referred to as “Big Data.” Social media data have had a particular importance to Big Data work and present opportunities to apply new methods to study social behavior dynamics. One can do everything from detecting abusive responses to a political event to tracking and modeling the spread of health epidemics. In this article, we explored variation in behavior across cities. If there is variation, this may be due to scaling with population, demographic makeup, or cultural attributes. We use data from Twitter, a real-time information sharing and social networking tool that has now become a household name. Because Twitter allows users to share messages that include everything from what one had for breakfast to notifying the world of a human rights abuse, the medium chronicles many aspects of our lived social, economic, and political selves (Murthy, 2013).
This study seeks to identify and explore some of the possible city-level social metrics that can be derived from Twitter data and to investigate demographic trends within these data. The goal of this research is to understand some of the complex dynamics and demographic indicators of users derived from tweets. Additionally, we seek to understand why some populations are more likely to utilize Twitter or tweet more often. Though Twitter has been studied broadly within the social sciences, the focus has not generally been on demographics. Rather, analyses of retweets, hashtags, and the sentiment of tweets have been more popular research areas. Work that has examined demographic understandings of Twitter users has been varied in both methods and the types of conclusions discovered. Using Big Data computational methods, Sloan et al. (2015) have derived age, occupation, and social class from UK-based Twitter users. In terms of methods, studies generally have been based on small surveys, e.g., Hargittai's and Litt's (2011) survey of college students or samples of convenience, e.g., Marwick's and boyd's (2011) snowball sample which involved sending individual tweets to their followers. The first attempt at augmenting these methods was by Mislove et al. (2011), who examined demographics at the county level and used first name as a proxy to determine gender and last name to determine race/ethnicity. These trends highlight some of the macrolevel demographic changes in American Twitter usage. This article seeks to build on the work of Mislove et al. (2011) and augment the findings of Sloan et al. (2015), by focusing specifically on urban American Twitter users. Our rationale for this comes from the fact that American Twitter use may be declining as the percent of users on the medium, but it seems increasingly to be used by traditionally marginalized urban American groups who generally do not have a significant voice in American political and economic life. 
A motivation for our work is part of the larger question of whether Twitter has implications for voice, representation, visibility, and mobilization of marginalized groups. As we are particularly interested in urban American Twitter use, we explore Twitter activity from the most populated American cities. We focus closely on population and population density in order to explore differences in tweet intensity between urban American populations. Though the focus of this article is on understanding cultures of urban Twitter use and race, the combination of age, gender, race, and place is necessary to answer the larger questions we are posing. Ultimately, our study finds that cities with traditionally marginalized groups exhibit frequencies of Twitter use that could be associated with these groups perceiving Twitter to provide some level of voice. Given the response by urban African-Americans on Twitter to the shooting of Michael Brown, an unarmed 18-year-old Black man in Ferguson, Missouri in 2014, studying urban social media use within this context is particularly relevant. We posit that cultures of regular and frequent Twitter use may be part of larger trends towards gaining visibility on Twitter and other social media.

Twitter Demographics

Researching Twitter has been popular since the early days of the medium in 2006. In its infancy, Twitter provided complete data to researchers, enabling, for example, Kwak et al. (2010) to collect 1.47 billion social relations from 106 million tweets in 2009 and to map out broad trends of how Twitter was operating socially. Since then, access to Twitter data has been more limited, with most social researchers restricted to Twitter's free “Spritzer stream,” which delivers approximately 1% of all tweets (Boanjak, Oliveira, Martins, Mendes Rodrigues, & Sarmento, 2012). However, even with these limits, one can collect a larger quantity of data than Kwak et al.
(2010) did over the same period due to large jumps in recent social media use. The research that has sprouted around Twitter is also diverse. The Pew Internet and American Life Project's vanguard longitudinal work found that 15% of American Internet users are on Twitter (Smith & Brenner, 2012), up from 8% in 2010, and that teen Twitter use grew 8% from 2011 to 2012 (Madden et al., 2013). Pew also found that African-American teens are more likely to use Twitter than their White counterparts (39% vs 23%) (Madden et al., 2013). However, a limitation of the Pew surveys is that they do not offer detail in terms of understanding why these demographic differences exist. Other work has sought to bridge this gap. In terms of gender, women are also more likely than men to be active Twitter users (Hargittai & Litt, 2011) and some social media behaviors may be mediated by gender (Haferkamp, Eimler, Papadakis, & Kruck, 2012). Race and ethnicity have also been increasingly studied in the context of social media and race is considered to be an important object of study in Twitter (Sharma, 2013). Hargittai and Litt (2011) investigate variances in Twitter use amongst college students with regards to gender and race/ethnicity and use parent education as a proxy for socioeconomic status. They found that African-Americans have a higher likelihood of Twitter use than other racial groups and are overrepresented compared to the general American population (Hargittai & Litt, 2011). Brock (2012) develops from this by specifically investigating Black Twitter usage. He used a linguistics-based analysis to examine how Black Twitter “subverts mainstream expectations of Twitter demographics, discourses and utility.” Twitter has also been evaluated as a site of Black cultural production, with ‘Trending blacktags [black hashtags] exhibit[ing] high diffusion rates through the Twitter network with weak ties over a relatively short period of time’ (Sharma, 2013, p. 21).
This contributes to an argument that the frequency of tweeting behaviors by particular racial groups is important to study.

Methods

Our study follows from Bruns' and Liang's (2012) argument that ‘Big Data’ methods can be useful for discerning short- and long-term trends on Twitter given the large quantity of data Twitter returns for most data requests. As of September 2015, there are over 500 million tweets per day from over 316 million active users (Twitter Inc., 2015). Twitter maintains an Application Programming Interface (API) that returns detailed information on the user, geographical information, and time/date data amongst other things. However, Twitter data collection poses a number of challenges (Murthy & Bowman, 2014). For example, the use of Twitter data can introduce biases and problematic assumptions (boyd & Crawford, 2012). We followed best practice in terms of data collection and outline our data model and framework in this section.

Data Model and Collection Framework

The Twitter API provides two primary methods for obtaining data, the ‘Search’ and ‘Stream’ APIs. The latter allows for near real-time requests containing tweets that match one's defined criteria. The data we collected relies on the locations parameter in Twitter's Stream API, which requests tweets posted from within a bounding box defined by latitude and longitude points. This method has been successfully used for studying Twitter use within varied geographical contexts (Graham, Hale, & Gaffney, 2014). We compared 45 U.S. cities with the most tweeting activity (Twitter Grader, 2009) with the 50 most populated cities based on the 2010 U.S. census. From these lists, we chose 50 cities with a range of population sizes and geographic diversity, a method similar to Mahmud et al. (2012), who successfully collected and studied tweets from the 100 most populous American cities. One city, Baltimore, Maryland, was excluded due to issues with the Twitter API during the sampling period.
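The bounding-box collection described above can be illustrated with a minimal Python sketch. It is not the collection code used in the study: the `BoundingBox` type, its method names, and the Atlanta coordinates are hypothetical, and the longitude-first, south-west-corner-first ordering is stated here as our understanding of how the `locations` value is written.

```python
from typing import NamedTuple


class BoundingBox(NamedTuple):
    """Rectangular collection area in decimal degrees (illustrative type only)."""
    west: float   # minimum longitude
    south: float  # minimum latitude
    east: float   # maximum longitude
    north: float  # maximum latitude

    def contains(self, lon: float, lat: float) -> bool:
        """Return True when the point lies inside the rectangle, edges included."""
        return self.west <= lon <= self.east and self.south <= lat <= self.north

    def as_locations_value(self) -> str:
        """Serialize as 'west,south,east,north' (assumed parameter ordering)."""
        return ",".join(str(value) for value in self)


# Hypothetical box roughly around central Atlanta; NOT the coordinates used in the study.
ATLANTA_BOX = BoundingBox(west=-84.55, south=33.65, east=-84.29, north=33.89)
```

A simple rectangle like this is easy to request and to test against, but it will not follow a city's administrative boundary, which is why the census comparison later relies on a broader geographic unit.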
From December 2011 to March 2013, we collected more than 275 million tweets from these cities. The data collected in addition to tweet text includes time, the user's name and handle, application used, location, user profile, and number of friends and followers amongst other things. Because of the simple rectangular bounding boxes we used, there is no ‘perfect’ geographic scale at which to collect census data for comparison. However, Mitchell et al.'s (2013) method of using Metropolitan Statistical Areas (MSAs) to study Twitter was found to be effective. We therefore used MSAs, which not only include the city center, but also several of the close surrounding suburbs, to provide a more complete idea of the number of people in a given metro area. Extensive US government data for ethnicity/race, age, income, and education are readily available by MSA. We used population data from the 2012 Intercensal Estimates.

Coding Twitter Users

The rise in popularity of social networking sites like Facebook, LinkedIn, and Twitter has largely reversed trends of anonymous online identities (Trepte & Reinecke, 2013). On many of these sites, users are now required (or regularly nudged) to provide accurate and up-to-date information. Though there is a move towards increasing public disclosure, some users do disclose less due to privacy concerns or due to worries about cyber bullying or threats of physical violence (Bryce & Fraser, 2014). In the context of Twitter, some users opt for private, protected accounts with profiles that are not publicly visible. That being said, the trend amongst social media users is to disclose more identifying information such as race, gender, and age. Cheng et al. (2010) in their study of where users tweet from found that individuals ‘leak’ information the more they use Twitter. Users tend to include some sort of profile picture and possibly a link to a personal website or other social media profiles.
Images (including selfies) and video can also be embedded in tweets. All of this content has the potential to provide demographic information about a user.

Coding methods

Successful methods to code Twitter users by demographic attributes have been developed. For example, Hargittai and Litt (2011) implemented several useful methods to categorize types of Twitter users by age, gender, and race. Their approach places emphasis on a user's “Use and awareness of Twitter.” We extended their methods to code Twitter user profiles for race and gender. The categories we coded for are Age (Teen; College; 20s; 30s; 40s; 50s; 60+; Unknown), Gender (Female; Male; Unknown), Race (White; Black; Hispanic/Latino; Asian; Other; Unknown), and Other/Nonhuman Account (Bots; Businesses; Groups; Missing/Suspended). Due to acknowledged limitations in Mislove et al.'s (2011) method of using first name as a proxy for gender and last name as a proxy for race/ethnicity, we developed our own coding method, which uses qualitative methods for discerning age, gender, and race. Coding randomly selected Twitter users using teams of coders has been used effectively to quantitatively categorize demographic attributes. For example, Naaman et al. (2010) randomly selected 350 active Twitter users (from those who had at least 10 followers and had tweeted 10 times) and used a team of four coders to code 10 messages for each sampled user. They used profile pictures to categorize the gender of users. Following Naaman et al. (2010), we randomly sampled 100 users from each city in our data set, yielding a total of 4,900 users, a sample 14 times larger than Naaman et al. (2010). Each of our four coders coded between 1,000 and 1,300 users and the specific users evaluated by each coder were evenly distributed across cities. Coders were instructed to interpret content on a user's public Twitter profile. This included up to 100 recent tweets and image postings, 10 times more than Naaman et al. (2010).
They were also instructed to follow any links posted in a user's profile in order to search for additional points of information. Any form of self-identification (whether in the user's profile, tweets, or other social media profiles) was considered the most reliable indicator for coding. Secondary indicators included the user's profile picture, self-reported user name, and content where the user identifies her/himself as the subject. Any dates associated with high school or college graduations, as well as an estimated date of birth, were considered secondary indicators of age. Accounts containing strong, yet conflicting evidence or inconclusive evidence were always marked as unknown. In addition, an “other” category was used for coding Twitter accounts that did not represent individuals (i.e., for bots, businesses, and organizations, as well as for suspended or missing accounts). Coding with greater than 90% confidence was considered a reliable categorization, confidence levels between 75-89% were considered “uncertain,” and for confidence levels below 75%, coders were asked to discard their categorization in favor of marking the user as unknown for that category.

Coding ethics

We endeavored to follow best practices in this research. Given recent controversies in large-scale social media research such as Kramer, Guillory, & Hancock's (2014) Facebook study, it is important to make clear that our study did not manipulate any Twitter streams and did not involve contacting individuals or groups of users. Though all the data we studied was publicly posted on Twitter, we took a strong duty of care to responsibly code profiles, placing an emphasis on self-identification wherever possible and classifying users as “unknown” when there was doubt regarding the user's demographic attributes.
However, we acknowledge that any method of identifying these types of social characteristics of users is always going to present major ethical challenges and we proactively constructed coding rubrics to minimize biases and racial profiling as best as possible.

Measuring the Intensity of Activity on Twitter

In this section, we discuss the two key methods we employed when measuring Twitter activity at the city and individual levels. Activity on Twitter is generally measured on a spectrum ranging from seldom use to “all the time” (Bekafigo & McBride, 2013). Both methods are computational. First, we studied the distribution of tweet frequency in our sample. Previous work has found that aspects of Twitter use tend to follow a power law distribution (Mathiesen, Angheluta, Ahlgren, & Jensen, 2013). This method considers whether tweet intervals for individual users are power-law distributed. Second, drawing from Bekafigo's and McBride's (2013) spectrum of Twitter use, we studied the intensity of tweeting behavior. To capture this, we measured the average length of time between a user's tweets (i.e., the gap between consecutive tweets). The shorter the time between a user's tweets, the more intense their tweeting activity generally is.

Method 1: Power Law behavior

Power-law distributions are a class of distribution for which the mean and standard deviation do not give a good indication of the distribution of data points, because of the large amount of variation in the tail of the distribution. Power law distributions exhibit long tails: most data points occur at the smallest values, while a small number of much larger, widely spread values account for the large variance. In the case of Twitter, for example, the vast majority of users tweet five or fewer times a day and there are many fewer users who tweet between 6–10 times a day. Human activity often follows power law-like distributions and their characteristic bursts of activity and heavy-tailed intervals are often observed in diverse activities (Barabasi, 2005).
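To make the heavy-tail idea concrete, the sketch below estimates the exponent of a continuous power law by maximum likelihood, using the standard estimator alpha = 1 + n / sum(ln(x_i / x_min)). This is an illustration only and not the study's code: the function names are hypothetical, the data are synthetic, and the threshold `x_min` is assumed to be known rather than selected from the data. A full analysis would also need a goodness-of-fit test and comparisons against alternative distributions.

```python
import math
import random


def fit_power_law_exponent(intervals, x_min):
    """Maximum-likelihood exponent for a continuous power law on values >= x_min."""
    tail = [x for x in intervals if x >= x_min]
    if not tail:
        raise ValueError("no observations at or above x_min")
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum == 0:
        raise ValueError("all observations equal x_min; exponent is undefined")
    return 1.0 + len(tail) / log_sum


def sample_power_law(alpha, x_min, n, seed=0):
    """Draw n synthetic values from a power law by inverse-transform sampling."""
    rng = random.Random(seed)
    # 1 - u lies in (0, 1], so the negative power is always well defined.
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


# Synthetic intervals with a known exponent, used only to check the estimator.
synthetic_intervals = sample_power_law(alpha=2.5, x_min=1.0, n=20000, seed=42)
estimated_alpha = fit_power_law_exponent(synthetic_intervals, x_min=1.0)
```

On synthetic data with a known exponent, the estimate should land close to the value used for sampling, which is a quick sanity check before applying any such estimator to observed tweet intervals.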
Clauset et al. (2009) illustrate that it is difficult to conclusively verify power law behavior in real observed data, but they do present a clear methodology for performing these investigations. We extend and develop Clauset et al.'s (2009) methods by calculating a best-fit power law distribution for a sample of users from a selection of cities. Out of the sample of users evaluated, 96% were determined to be plausible for power law distribution. However, we were not able to exclude similar alternative distributions and found alternative distributions might be a better fit. Specifically, the results of our log-likelihood ratio tests favored Power Law with Exponential Cut-off (54.2%), Exponential (2.1%), Log Normal (25.0%), Poisson (0.0%), Weibull (18.8%), and Yule (34.0%). Given the strong fit of the power law with exponential cut-off distribution and the inconclusiveness of a stringent power law fit, it is highly likely that user tweeting behavior is best modeled by some distribution with finite first moment. This makes measures like mean intertweet interval potentially more useful in understanding the patterns of Twitter activity amongst users who are geographically collocated.

Method 2: Intertweet Interval (ITI)

As a strict power law fit was inconclusive, the distribution should have a finite first moment and thus a simple average intertweet interval (what we call ITI) should suffice to model different behaviors. The average time between tweets has also been successfully incorporated in other work as a measure of user intensity (Boanjak et al., 2012). Our implementation of an average intertweet interval was based on determining the first and last tweets for each user with at least two observed tweets. We determined average tweet interval length by dividing the difference between the two timestamps by the difference in the user's tweet count at the time of those tweets. We took the mean ITI of users from a city that met the above criteria to calculate a city's mean.
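The ITI calculation just described can be sketched as follows. This is our reading of the description, not the study's implementation: we assume each observed tweet carries the author's cumulative tweet count, and the names `ObservedTweet`, `statuses_count`, `user_iti_hours`, and `city_mean_iti_hours` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean


@dataclass
class ObservedTweet:
    created_at: datetime
    statuses_count: int  # assumed: the user's cumulative tweet count when this tweet was posted


def user_iti_hours(tweets):
    """Mean hours between tweets, from the first and last observed tweet of one user."""
    if len(tweets) < 2:
        return None  # at least two observed tweets are required
    ordered = sorted(tweets, key=lambda t: t.created_at)
    first, last = ordered[0], ordered[-1]
    elapsed_hours = (last.created_at - first.created_at).total_seconds() / 3600.0
    tweets_posted = last.statuses_count - first.statuses_count
    if elapsed_hours <= 0 or tweets_posted <= 0:
        return None  # cannot form a meaningful interval
    return elapsed_hours / tweets_posted


def city_mean_iti_hours(tweets_by_user):
    """Mean of per-user ITIs over users for whom an ITI could be computed."""
    values = []
    for tweets in tweets_by_user.values():
        iti = user_iti_hours(tweets)
        if iti is not None:
            values.append(iti)
    return mean(values) if values else None
```

Using the cumulative count rather than the number of tweets actually captured means that tweets missed by the sampled stream still contribute to the denominator, which is one plausible reason for defining the measure this way.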
ITI also allows for the consideration of a larger number of users and, as such, a minimum tweet count sample was not needed to produce a reliable average ITI. One drawback is that by reducing each user's behavior to a single number, we no longer had access to a detailed distribution. However, this compression allowed us to expand our observation window to three months and include more users overall. When using mean ITI, the strength of the average is based on the interval between the first and last observed tweet over a period of time. One of the key limitations of this model is the sensitivity to the period over which the user ITI was calculated. A number of Twitter users will be regularly observed throughout the period while other Twitter users may only have a few tweets observed over the course of a day or two; this is partially due to API sampling issues. It is clear that a user's ITI calculated over a short period of time will have a high variance with regard to “actual” user ITI. This has the potential to vastly overestimate a user's actual average ITI. The closer the observed interval between the first and last tweet for a given user is to the total observed time window, the better representation the mean ITI is of the user's true tweeting behavior.

Results

Twitter Behavior and City Attributes

We hypothesized that social mechanisms that affect the behavior of individuals in a social group (like a city) may draw more from close physical proximity to others rather than simply the size of the group in the case of Twitter. We primarily investigated population density of MSAs as an associative indicator for Twitter activity (though it should be noted our results were similar using traditional population metrics). Given that city populations tend to follow a power-law (Zipf) distribution (Gabaix, 1999), our method was to fit our data to a linear model log(mean ITI) ∼ log(population density).
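A log-log fit of this kind can be sketched with ordinary least squares, as below. The function names are hypothetical and the density and ITI values are synthetic, chosen only to show that a straight line in log-log space corresponds to a curve of the form e^a · x^b on the original scale. None of these numbers come from the study.

```python
import math


def fit_log_log(x_values, y_values):
    """Ordinary least squares fit of log(y) = a + b * log(x); returns (a, b, r_squared)."""
    log_x = [math.log(x) for x in x_values]
    log_y = [math.log(y) for y in y_values]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((u - mean_x) ** 2 for u in log_x)
    syy = sum((v - mean_y) ** 2 for v in log_y)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy ** 2) / (sxx * syy) if syy > 0 else 0.0
    return intercept, slope, r_squared


def predict(intercept, slope, x):
    """Back-transform the fitted line to the original scale: e^a * x^b."""
    return math.exp(intercept) * x ** slope


# Synthetic example: ITI (hours) generated exactly as 50 * density^-0.3.
densities = [500.0, 1000.0, 2000.0, 4000.0]
mean_itis = [50.0 * d ** -0.3 for d in densities]
a, b, r2 = fit_log_log(densities, mean_itis)
```

A negative slope in such a fit means that the mean interval between tweets shrinks as density rises, and the r-squared value indicates how much of the variation in log(mean ITI) the line accounts for.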
This model gave us a best fit line which corresponds to $f(x) = e^{a} x^{b}$. We also calculated the correlation between the values (see Figure 1). The results show a significant correlation between a decrease in mean ITI and increased population density ($r^2 = 0.25$, p < 0.001). Although the relationship is significant, it does not explain all variation in these data.

Figure 1. Mean Tweet Interval by Population Density

Despite a general correlation between increased population density and shorter ITI, this is not the case for cities with large Black populations. While White population density is about equal with the correlation for the general population, Black population density correlates more strongly with ITI. However, Hispanic and Latino population density does not have a strong relationship with mean ITI. It may be the case that total population of a city is generally correlated with both White and Black populations. However, our results suggest that a large, especially historical Black population contributes most to the mean ITI of the city. As Figure 1 illustrates, there is a cluster of cities in the bottom right quadrant with high populations and shorter ITI (i.e., faster tweeting rates). These cities (Atlanta, Washington DC, Philadelphia, and Detroit) are historically African-American cities. Black population and Black population-density have the strongest relationship with ITI.

Demographics, ITI and Twitter Activity in Cities

Considering our findings of mean tweet interval by population density, we would expect to see many city traits correspond with shorter ITI as long as those metrics themselves were correlated with population. Despite this, it may be the case that the Twitter behavior of some groups contributes more significantly to the overall population trends in cities.
To evaluate this possibility, we explored the demographics of randomly selected Twitter users. Specifically, we decided to compare the random sample with the most active Twitter users in each city. To do this, we coded an additional sample of the top 10 most active users from each city by ITI (n = 490) and compared it with the general sample of 100 random users from each city (n = 4,900). On average, coders reliably categorized age, race, and gender 68% of the time and were not confident 24% of the time. Other accounts gave no indication of the demographic information or they represented organizational accounts. We found that we were able to more reliably code Black and White users (90%) than Asian or Hispanic/Latino users (45–60%). The reason for this was that a large number of Black user accounts self-identified as Black and this was not the case with Asian or Hispanic/Latino users. Overall, we reliably coded age 30.7%, gender 93.9%, and race 78.3% of the time.

Age and Gender

Age has traditionally played a prominent role in understandings of social media given that teens have historically been early social media adopters (boyd, 2013). Though these media have changed significantly, Twitter users in our sample tended to be in their 20s and 30s and top tweeters are likely to be in their teens and 20s (see Figure 2). These findings are consistent with other surveys of social media, e.g., Smith & Brenner (2012).

Figure 2. Age of Twitter Users (Random Sample vs Top Tweeters)

Our data indicate that teens tweet once every 12 hours and college students tweet every 12.6 hours. There is a relatively smooth distribution where we see younger users tweeting most often. Despite the fact that more Twitter users are in their twenties and thirties, teen and college students are tweeting more often.
High school and college represent key phases of sociability in the US and this sociability increasingly occurs online. Another explanation for these age demographics is that teens are perhaps migrating to Twitter as Facebook has become so pervasive amongst their parents' generation (Miller, 2014). The platform is seen by some teens as less rigid than Facebook and some teens are less frightened, embarrassed, and upset by interactions and content on Twitter (Binns, 2014). Conversely, for working age people, Twitter and other social media use at work is thought of by some managers as ‘cyberloafing’ (Andreassen, Torsheim, & Pallesen, 2014), making it much harder to be as active on Twitter as teens are. Our data also indicate that the average Twitter user is more likely to be male than the general population (see Figure 3). These data suggest that there are more active urban male Twitter users than women. However, top Twitter users are more likely to be women (see Figure 3). Our data indicate that women tweet at a rate of approximately once every 20 hours, while men tweet at a rate of approximately once every 26 hours.

Figure 3. Gender of Twitter Users (Random Sample vs Top Tweeters)

Race

Like the general US population, White people are most populous in our sample of Twitter users. However, Hispanic/Latino individuals are the largest minority in the U.S. population (DeNavas-Walt, Proctor, & Smith, 2010), but, in our Twitter sample are underrepresented (see Figure 4). On the other hand, Black users are overrepresented in our sample. Importantly, almost 50 percent of top Twitter users in our sample are Black (see Figure 4). This represents a substantial 30-point swing between random and top Twitter users, representing a departure from Census data.
We also discovered a significant population of Hispanic Twitter users among top tweeters (see Figure 4), despite the lack of correlation between cities with large Hispanic populations and ITI. Populations with Hispanic users are bifurcated: either tweeting at relatively fast rates or very slow rates (but not at average rates).

Figure 4. Race of Twitter Users (Random Sample vs Top Tweeters)

The cities with the largest Black populations by Census also tend to have an overrepresentation of Black Twitter users (and an underrepresentation of White populations on Twitter). Thirty-three of the forty-nine cities sampled indicated the presence of more Black Twitter users than would be expected. Geographic proximity has been found to be an important independent variable to online interactions (Hampton & Wellman, 2003), and, in our case, we found intersections of urban location, demographics, and ITI. Specifically, American cities could be exhibiting general norms of tweet frequency (further categorized by race, age, and gender). We believe these types of city-level norms are especially pronounced for particular racial groups, e.g., young Black men. Importantly, we also found that Black population density is associated with city ITI (see Figure 1). When we explored the differences between top Twitter users and users from our random sample, we found that Black Twitter users have significantly shorter user ITIs than other users. Most other groups tweet once per day on average. Black users tweet more than 1.5 times a day or at least 50 percent more frequently than other groups. Hispanic/Latino users were well represented amongst top users, but were not associated with shorter ITI. What is more likely is a bifurcation amongst Hispanic and Latino Twitter users, where some tweet with higher frequencies and others with lower frequencies.
Subgroups by age within racial/ethnic categories also emerge (see Figure 5). Age is an important factor amongst Hispanic and Latino Twitter users, but age differences are more prominent amongst Black Twitter users. Figure 5 illustrates a clear multimodal distribution for Hispanic/Latino and Black populations. When the density function is broken down by broad age groups (see Figure 6), teens and college students were found to be tweeting at a more frequent rate than any other age group. The mean ITI of Black populations is so low because teens are, by far, the largest group making up our sample of Black users. Additionally, Black users tweet at a faster rate across all age groups. Despite the large difference in usage patterns between groups, Hispanic and Latino users were found to be balanced across age groups. Amongst White users, there is also a trend for teens to tweet more often. However, this difference is not as pronounced.

Figure 5. Distribution of all ITI

Figure 6. Distribution of ITI by Age

Limitations and Bias

Though only 1-2% of tweets are geolocated, the number of tweets delivered by Twitter via the Streaming API remains proportional to the total number of tweets at any given time. Our data favor users who elect to report location, which biases towards mobile users. Additionally, as smartphone usage has become increasingly ubiquitous, Twitter, which used to be predominantly accessed via the web, is now predominantly accessed via mobile devices and platforms (Murthy, Bowman, Gross, & McGarry, 2015). As reporting location becomes more pervasive due to smartphone use, we believe that the trends highlighted in our study are of current and future value to understanding Twitter users as a whole.
There are limitations to our methods of classifying race and ethnicity, categorizations that are always highly subjective. Our qualitative approach to classifying race/ethnicity was developed as an alternative approach to automated racial classification, which has its own limitations. Additionally, the much higher rate of self-identification amongst Black Twitter users compared to Hispanic/Latino and Asian users introduces biases when placing a premium on the robustness associated with self-identification. Another limitation is the validity of data gathered from Twitter profiles. We are aware that social media users selectively include/exclude elements from their profiles and this profile curation emphasizes particular aspects of their selves. We are also aware that our binary classifications exclude transgender, mixed-race, and many other diverse types of users. Though these are known limitations, we believe that a benefit of this study is that it opens up further lines of inquiry for better understanding these types of users. Specifically, future work, particularly qualitative work, is needed to address these limitations. Protected profiles are not included in our study, and it could be the case that Hispanic/Latino or Asian users are using the protected option more and that Black populations could be tweeting more publicly.

Discussion

We believe that engagement with Twitter indicates some level of perceived utility of the medium for the user. This study makes the argument that tweet intensity is part of a culture on Twitter. Our behavior on social media is affected by a variety of online and offline networks. For example, if one primarily engages with Twitter for professional rather than personal reasons, one's tweet intensity may be far more sporadic, concentrated around professional events, rather than exhibiting a more consistent intensity level. In this way, the rate of our Twitter use can be thought of as one factor contributing to cultures of Twitter use.
Our findings indicate significant differences in Twitter use by race. As we do not study the content of tweets and the relationships between users, we cannot discuss the nuances of why these differences exist. However, we feel that a key contribution of our mixed methods approach is that it opens up lines of inquiry, such as whether there are pressures to tweet in particular socioeconomic contexts. Additionally, there might be social pressure to disclose location information if such disclosure is expected amongst one's ethnoracial group. Ultimately, studying the intersections of race with age and gender indicates that discernible differences exist. Our measure of Twitter use, the intertweet interval (ITI), is highly relevant as the short intervals between tweets could indicate that Twitter is being used by some urban populations to continue established modes of cultural performance. Specifically, there is a rich Black history of “shout-outs” and “repping,” cultural practices of calling out to others that have been studied extensively in diverse contexts, from mentions in printed media to a series of name dropping in rap or hip-hop music (Alim, Ibrahim, & Pennycook, 2008). In tweet form, these messages are brief, but are part of a stream mentioning a range of other users. Users tweeting a cascade of shout-outs would have shorter ITIs than most other users. Though seemingly trivial, shout-outs and forms of call and response on Twitter can help enhance social presence (Dunlap & Lowenthal, 2009) and ultimately facilitate certain community networks. Race remains understudied on Twitter and we argue that race cannot be studied exclusively, but that its intersections with age and gender are fundamentally important. Twitter has been an important site of Black cultural production and this has been manifested through practices such as the circulation of Black hashtags, “Blacktags.” Ultimately, our data indicate that there is something happening on Twitter in U.S. 
cities with large populations of Black people. We believe there is a politics of Twitter use that needs to be unpacked and our results highlight the possibility that particular groups, such as young Black people, see Twitter as a space where they can be vocal. Our findings are particularly relevant in the shadow of Ferguson, where urban Black men turned to Twitter and other social media to express their outrage at the time of the shooting of an unarmed Black man, Michael Brown, and of the later decision not to indict the White police officer, Darren Wilson, who fatally shot him. Some news coverage and blogs have argued that this focused outrage on Twitter helped bring the case to national attention (Desmond-Harris, 2015). Specifically, the #BlackLivesMatter hashtag was used as a call to action after Trayvon Martin, an unarmed Black 17-year-old, was shot by neighborhood watch volunteer George Zimmerman; the hashtag was similarly used both to raise awareness regarding the shooting of Michael Brown and to highlight larger issues around racial inequalities in the US (Garza, 2014). Additionally, our results point to an argument that cultures of Twitter use are relevant. Historically, technology use relied on forms of elite social capital. With the ubiquity of smartphones, people may be learning about how to use Twitter (and how frequently to tweet) from peers at the bus stop or at school, rather than having to sit in front of a desktop with broadband. Additionally, our results open up arguments that this could be correlated with shared geography. Specifically, geography also has implications for attributes such as social capital. In terms of technology, traditionally marginalized populations may be accumulating social capital in shared urban geographical spaces. Assuming high levels of social capital/connectivity between people in a place can be problematic, as the assumption does not always hold and may bias results towards people who are central within the network. 
In this way, our measure of ITI is better for looking at certain group dynamics. Given contemporary events in American politics, the notion that young Black people who do not traditionally have a voice within the public sphere can go on social media to help mobilize resources of representation is significant. Though ITI does not inherently tell us anything about a community's communication and access to resources, it does indicate that users in cities with larger Black populations are tweeting with shorter ITIs. Additionally, ITI incorporates frequency and regularity, which are useful measures for discerning visibility on social media. Though our study cannot draw conclusions on whether Twitter is a space in which marginalized groups can speak and be heard, it opens up important questions of whether the medium could be used strategically by marginalized groups. In other words, having a low average ITI could be a manifestation of strategic use in cities where ethnic minorities are, or are near to, the majority. Additionally, our study helps make the case that we need to examine content during specific political events at the local, geo-located level to provide further understandings of what these populations are doing and whether they are using Twitter strategically. It may or may not be the case that low ITI averages are a way to mobilize or gain attention. Ultimately, the argument we are making is that a connection to place matters for ITI. Of course, it is only one aspect of cultures of Twitter use. However, if you are learning about things from people in your network who are near to you, Twitter could be encouraging high levels of interaction amongst those geographically colocated. This could provide real benefits for voice and self-representation amongst traditionally marginalized groups. Or put another way, those excluded from public spheres could be hyper-vocal on Twitter, leading to shorter ITIs. 
On the flipside, however, it could be that the bulk of their use is social coordination with friends, e.g., as a proxy for text messages.

Conclusion

The demographic composition of Twitter users has been understudied. Rather, scholarly and media attention has been more focused on the study of hashtags, @mentions, retweets, and sentiment analysis of tweets. Of course, these are all useful modes for studying Twitter. However, this article has sought to address gaps in the literature in terms of demographic understandings of Twitter users. Specifically, there is a need in the social sciences and beyond to understand not only who is tweeting, but how intensely different groups tweet. We believe this is particularly important in terms of race. There is substantial Twitter-based literature that makes claims regarding human dynamics and social communication. However, without understandings of who is behind these tweets, we may be making claims without sufficient context. Though previous work has mapped out tweet frequency in major world cities, a focus on American urban Twitter users has generally been lacking. This article has sought to bridge this gap by understanding urban American Twitter users at a granular level. We found evidence of intersections between urban location, demographics, and the intensity with which users tweet. Our findings support Brock's (2012) argument of a ‘Black Twitter’. However, our results extend beyond notions of racialized Twitter spaces by specifically exploring tweet intensity (through the measure of intervals between tweets, which we term ITI). This measure allowed us to evaluate whether traditionally “Black cities” such as Atlanta and Detroit have a faster rate of tweeting than their White counterparts. 
Brock (2012) makes the fascinating argument that Black Twitter is not representative of the Black community on Twitter, and we confirm this through our finding that it is a small cohort of Black users who tweet intensely, rather than the population as a whole (which is in line with power law distributions of Twitter use). That being said, Black users are overrepresented on the medium. This opens up arguments of whether Twitter gives voice and visibility to Black users and influencers, potentially providing a platform for mobilization. Our work also suggests that there is a core of influential, young Black Twitter users, but their influence does not necessarily translate into traditional media, government, business, or other sectors. Twitter may therefore be reproducing ghetto-like stereotypes, much like the cities of these top users, e.g., Chicago, Detroit, and Atlanta. Alternatively, Twitter may be facilitating ad-hoc urban Black collectives that are able to quickly mobilize, as has been the case with the shooting of Michael Brown in Ferguson. There is a need for future quantitative and qualitative demographic-oriented research on race, cities, and Twitter to answer questions such as these. Using our ITI measure of the time elapsed between tweets and human coding, we found that there are discernible urban American demographic trends. Specifically, we found that Black Twitter users are the most active on the medium. This supports the argument that race is an important factor in interpreting Twitter use. We also found that the more densely populated a US city is, the greater the overrepresentation of Black Twitter users. Though we explored the role of gender and age in Twitter use, we think that examining race requires understandings of these demographics as well. Our intersectional approach opens up other avenues of research regarding the various ways in which age, race/ethnicity, and gender affect social media use. 
Ultimately, we cannot get the types of information we have sought to discern in this study from qualitative interviews and content analysis. Teasing out connections of geography and habits of Twitter use is an important line of inquiry and we have sought to make this case.

Acknowledgments

The authors are grateful to the three anonymous reviewers for their insightful comments and to Ella McPherson and the University of Cambridge R(w)SM Group for comments on an earlier version.

References

Alim, H. S., Ibrahim, A., & Pennycook, A. (2008). Global linguistic flows: Hip hop cultures, youth identities, and the politics of language. Routledge.
Andreassen, C. S., Torsheim, T., & Pallesen, S. (2014). Predictors of use of social network sites at work: A specific type of cyberloafing. Journal of Computer-Mediated Communication, 19(4), 906–921.
Barabasi, A.-L. (2005). The origin of bursts and heavy tails in human dynamics. Nature, 435(7039), 207–211.
Bekafigo, M. A., & McBride, A. (2013). Who tweets about politics? Political participation of Twitter users during the 2011 gubernatorial elections. Social Science Computer Review, 31(5), 625–643. doi: 10.1177/0894439313490405
Binns, A. (2014). Twitter City and Facebook Village: Teenage girls' personas and experiences influenced by choice architecture in social networking sites. Journal of Media Practice, 15(2), 71–91.
Bošnjak, M., Oliveira, E., Martins, J., Mendes Rodrigues, E., & Sarmento, L. (2012). TwitterEcho: A distributed focused crawler to support open research with twitter data. Paper presented at the 21st International Conference Companion on World Wide Web.
boyd, d. (2013). White flight in networked publics? How race and class shaped American teen engagement with Myspace and Facebook. In L. Nakamura & P. A. Chow-White (Eds.), Race after the Internet (pp. 203–222). Routledge.
boyd, d., & Crawford, K. (2012). Critical questions for big data: Provocations for a cultural, technological, and scholarly phenomenon. Information, Communication & Society, 15(5), 662–679.
Brock, A. (2012). From the blackhand side: Twitter as a cultural conversation. Journal of Broadcasting & Electronic Media, 56(4), 529–549. doi: 10.1080/08838151.2012.732147
Bruns, A., & Liang, Y. E. (2012). Tools and methods for capturing Twitter data during natural disasters. First Monday, 17(4).
Bryce, J., & Fraser, J. (2014). The role of disclosure of personal information in the evaluation of risk and trust in young peoples' online interactions. Computers in Human Behavior, 30, 299–306. doi: 10.1016/j.chb.2013.09.012
Cheng, Z., Caverlee, J., & Lee, K. (2010). You are where you tweet: A content-based approach to geo-locating twitter users. Paper presented at the 19th ACM International Conference on Information and Knowledge Management, Toronto, ON, Canada.
Clauset, A., Shalizi, C. R., & Newman, M. E. (2009). Power-law distributions in empirical data. SIAM Review, 51(4), 661–703.
DeNavas-Walt, C., Proctor, B. D., & Smith, J. C. (2010). Income, poverty, and health insurance coverage in the United States: 2009. US Census Bureau, Current Population Reports, P60-238.
Desmond-Harris, J. (2015, January 14). Twitter forced the world to pay attention to Ferguson. Vox. Retrieved from http://www.vox.com/2015/1/14/7539649/ferguson-protests-twitter
Dunlap, J. C., & Lowenthal, P. R. (2009). Tweeting the night away: Using Twitter to enhance social presence. Journal of Information Systems Education, 20(2), 129–135.
Gabaix, X. (1999). Zipf's Law for Cities: An explanation. The Quarterly Journal of Economics, 114(3), 739–767. doi: 10.1162/003355399556133
Garza, A. (2014). A herstory of the #BlackLivesMatter movement. The Feminist Wire.
Graham, M., Hale, S. A., & Gaffney, D. (2014). Where in the world are you? Geolocation and language identification in Twitter. The Professional Geographer, 66(4), 568–578.
Haferkamp, N., Eimler, S. C., Papadakis, A. M., & Kruck, J. V. (2012). Men are from Mars, women are from Venus? Examining gender differences in self-presentation on social networking sites. Cyberpsychology, Behavior, and Social Networking, 15(2), 91–98. doi: 10.1089/cyber.2011.0151
Hampton, K., & Wellman, B. (2003). Neighboring in netville: How the Internet supports community and social capital in a wired suburb. City & Community, 2(4), 277–311. doi: 10.1046/j.1535-6841.2003.00057.x
Hargittai, E., & Litt, E. (2011). The tweet smell of celebrity success: Explaining variation in Twitter adoption among a diverse group of young adults. New Media & Society, 13(5), 824–842.
Kramer, A. D. I., Guillory, J. E., & Hancock, J. T. (2014). Experimental evidence of massive-scale emotional contagion through social networks. Proceedings of the National Academy of Sciences, 111(24), 8788–8790. doi: 10.1073/pnas.1320040111
Kwak, H., Lee, C., Park, H., & Moon, S. (2010). What is Twitter, a social network or a news media? Paper presented at the Proceedings of the 19th International Conference on World Wide Web.
Madden, M., Lenhart, A., Cortesi, S., Gasser, U., Duggan, M., & Smith, A. (2013). Teens, social media, and privacy. Pew Internet and American Life Project.
Mahmud, J., Nichols, J., & Drews, C. (2012). Where is this tweet from? Inferring home locations of Twitter users. Paper presented at the ICWSM.
Marwick, A. E., & boyd, d. (2011). I tweet honestly, I tweet passionately: Twitter users, context collapse, and the imagined audience. New Media & Society, 13(1), 114–133.
Mathiesen, J., Angheluta, L., Ahlgren, P. T., & Jensen, M. H. (2013). Excitable human dynamics driven by extrinsic events in massive communities. Proceedings of the National Academy of Sciences, 110(43), 17259–17262.
Miller, D. (2014). Global social media impact study. Retrieved from http://www.ucl.ac.uk/global-social-media
Mislove, A., Lehmann, S., Ahn, Y.-Y., Onnela, J.-P., & Rosenquist, J. N. (2011). Understanding the demographics of twitter users. Paper presented at the Fifth International AAAI Conference on Weblogs and Social Media (ICWSM'11), Barcelona, Spain.
Mitchell, L., Frank, M. R., Harris, K. D., Dodds, P. S., & Danforth, C. M. (2013). The geography of happiness: Connecting Twitter sentiment and expression, demographics, and objective characteristics of place. PLoS ONE, 8(5), e64417.
Murthy, D. (2013). Twitter: Social communication in the Twitter age. Cambridge: Polity.
Murthy, D., Bowman, S., Gross, A. J., & McGarry, M. (2015). Do we tweet differently from our mobile devices? A study of language differences on mobile and web-based Twitter platforms. Journal of Communication, 65(5), 816–837. doi: 10.1111/jcom.12176
Murthy, D., & Bowman, S. A. (2014). Big Data solutions on a small scale: Evaluating accessible high-performance computing for social research. Big Data and Society, 1(2). doi: 10.1177/2053951714559105
Naaman, M., Boase, J., & Lai, C.-H. (2010). Is it really about me? Message content in social awareness streams. Paper presented at the Proceedings of the 2010 ACM Conference on Computer Supported Cooperative Work, Savannah, Georgia, USA.
Sharma, S. (2013). Black Twitter?: Racial hashtags, networks and contagion. New Formations: A Journal of Culture/Theory/Politics, 78(1), 46–64.
Sloan, L., Morgan, J., Burnap, P., & Williams, M. (2015). Who tweets? Deriving the demographic characteristics of age, occupation and social class from Twitter user meta-data. PLoS ONE, 10(3), e0115545. doi: 10.1371/journal.pone.0115545
Smith, A., & Brenner, J. (2012). Twitter use 2012. Pew Internet & American Life Project.
Trepte, S., & Reinecke, L. (2013). The reciprocal effects of social network site use and the disposition for self-disclosure: A longitudinal study. Computers in Human Behavior, 29(3), 1102–1112. doi: 10.1016/j.chb.2012.10.002
Twitter Grader. (2009). Top Twitter Cities List. Retrieved August 10, 2010, from http://twitter.grader.com/top/cities
Twitter Inc. (2015). About Twitter. Retrieved March 15, 2015, from https://about.twitter.com/company

About the Authors

Dhiraj Murthy (http://www.dhirajmurthy.com/about/) is a Reader of Sociology at Goldsmiths, University of London. He is author of the book Twitter: Social Communication in the Twitter Age (Polity Press) and has published extensively about social media, Big Data, virtual organizations, and race/ethnicity. He can be contacted at: Department of Sociology, Goldsmiths, University of London, Lewisham Way, New Cross, London SE14 6NW, United Kingdom. E-mail: d.murthy@gold.ac.uk

Alexander Gross is a researcher in Intermedia Arts at University of Maine, Orono, specializing in natural language processing and computational social science.

Alexander Pensavalle is a former research fellow at the Social Network Innovation Lab at Bowdoin College, specializing in computational social science and data mining.

© 2015 International Communication Association TI - Urban Social Media Demographics: An Exploration of Twitter Use in Major American Cities JF - Journal of Computer-Mediated Communication DO - 10.1111/jcc4.12144 DA - 2016-01-01 UR - https://www.deepdyve.com/lp/oxford-university-press/urban-social-media-demographics-an-exploration-of-twitter-use-in-major-WxmaTccSdM SP - 33 EP - 49 VL - 21 IS - 1 DP - DeepDyve ER -