Russian Twitter trolls attacked Bernie too

(Foreign) Political influence campaigns on Twitter

You may have seen stories about Twitter accounts operated by Russians attempting to influence the 2016 election in the United States. Much of the reporting that I’ve seen described a Simple Narrative: Russians tried to help Trump and hurt Clinton, even supporting Bernie Sanders in order to attack Clinton. I’ve also seen plenty of Democrats on Twitter attacking Sanders over this. I have not seen any stories reporting the fact that many of these bots also attacked Sanders. (If you’re aware of such stories, I’d be happy to hear about them.)

Recently, FiveThirtyEight published a repository with roughly 3 million tweets from about 2,800 accounts that Twitter concluded were associated with the Russian effort.

About the data:

The data set is the work of two professors at Clemson University: Darren Linvill and Patrick Warren. Using advanced social media tracking software, they pulled the tweets from thousands of accounts that Twitter has acknowledged as being associated with the IRA. […] In the paper, Linvill and Warren divide the IRA’s trolling into five distinct categories, or roles: Right Troll, Left Troll, News Feed, Hashtag Gamer and Fearmonger. (These category codes are included in the data.)

And the Simple Narrative, repeated again:

Right Troll and Left Troll are the meat of the agency’s trolling campaign. Right Trolls behave like “bread-and-butter MAGA Americans, only all they do is talk about politics all day long,” Linvill said. Left Trolls often adopt the personae of Black Lives Matter activists, typically expressing support for Bernie Sanders and derision for Hillary Clinton, along with “clearly trying to divide the Democratic Party and lower voter turnout” (emphasis added).

Potential for abuse

The looming specter of foreign influence on our political system has clear potential for abuse, especially if facts are cherry-picked to support a certain narrative. As Adam Johnson writes in FAIR,

This narrative, fueled by center-left outlets like MSNBC, Center for American Progress and Mother Jones, has reached its inevitable, sleazy nadir: the smearing of a black activist by an NPR affiliate for the crime of going on a Russian government–funded radio station a handful of times. […] The piece ends with Kauffman narcing on Changa to one of her political allies, in an obvious attempt to create a chilling effect for others. WABE puts a microphone in front of congressional candidate Richard Winfield’s face and asks him what he thinks about Changa being associated with the stain of “Russian influence”

Today, Facebook announced that it had removed pages associated with “bad actors” who were “involved in coordinated inauthentic behavior.” They provided a sample of the content “designed to sow division” that these pages had posted, and boy does it look dangerous. Facebook was careful to avoid saying “[any] specific group or country is responsible,” but reporting on this story in various news media outlets has jumped to the conclusion that these pages were also part of a Russian plot.

The New York Times and Politico have both reported that these pages had some activity involving the hashtag #abolishICE:

More than 290,000 Facebook users followed the now-shuttered pages, which were created between March 2017 and May 2018. The most followed – titled “Aztlan Warriors,” “Black Elevation,” and “Mindful Being” – reached more than 290,000 users. The topics also included the hashtag #AbolishICE, a popular new rallying cry on the left following outrage over the Trump administration’s separation of immigrant families along the Mexican border.

In my analysis below, I focus on tweets about Bernie Sanders because of the Simple Narrative, but the broader picture is one in which the left is in constant danger of being smeared by association with a hostile foreign government.

Russians attacked Bernie too

I dug into the tweets published by FiveThirtyEight to see if facts have been cherry-picked to support the Simple Narrative.

library(tidyverse)
tweets <- read_csv("")

First, I subset their data to tweets that contain “Bernie” or “Sanders” but not “Sarah” or “Huckabee,” that were published on or before election day, and that came from accounts categorized as left trolls or right trolls (the large majority of accounts).

lrt <- tweets %>% filter(account_category %in% c("RightTroll", "LeftTroll")) %>%
  select(-external_author_id, -language, -harvested_date, -new_june_2018, -post_type, -account_type)
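# Parse the timestamp; as.Date() keeps only the calendar date
lrt$publish_date <- as.Date(lrt$publish_date, format = "%m/%d/%Y %H:%M")
# Keep tweets published on or before election day
lrt <- lrt %>% filter(publish_date <= as.Date("2016-11-08"))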
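
As a quick check of the date handling, here is a minimal sketch that parses a few timestamps in the same month/day/year layout and applies the same election-day cutoff. The three sample strings are made up for illustration and are not taken from the data.

```r
# Made-up timestamps in the "%m/%d/%Y %H:%M" layout used by publish_date
raw <- c("11/7/2016 23:59", "11/8/2016 9:30", "11/9/2016 0:05")

# as.Date() keeps only the calendar date and drops the time of day
parsed <- as.Date(raw, format = "%m/%d/%Y %H:%M")

# Keep only dates on or before election day (2016-11-08)
keep <- parsed <= as.Date("2016-11-08")

print(data.frame(raw, parsed, keep))
```

Because the time of day is dropped, every tweet published on election day itself passes the filter.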

How many Bernie tweets were there from each type of account?

lrt %>% group_by(account_category) %>%
  summarise(num_tweets = n())
account_category num_tweets
LeftTroll 1817
RightTroll 3042

How many unique accounts were there of each kind, and what was the total number of follows for each type of account? (Note: users who followed multiple such accounts are counted multiple times.)

# Count Bernie tweets per account, then keep one row per account
lrcount <- lrt %>% group_by(account_category, author) %>%
  mutate(count = n()) %>%
  distinct(author, .keep_all = TRUE)

lrcount %>%
  group_by(account_category) %>% distinct(author, .keep_all = TRUE) %>%
  summarise(unique_accounts = n(), total_follows = sum(followers))
account_category unique_accounts total_follows
LeftTroll 128 114939
RightTroll 213 217919
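
To make the counting explicit, here is a small sketch on made-up data. It collapses tweets to one row per account before summing followers, so an account’s followers are not added once per tweet. It uses `summarise()` with `first()` rather than the `mutate()` plus `distinct()` combination above; the toy authors and follower counts are assumptions for illustration only.

```r
library(dplyr)

# Toy data: one row per tweet; followers is the account's follower count
toy <- tibble(
  author = c("a", "a", "b", "c", "c", "c"),
  account_category = c("LeftTroll", "LeftTroll", "LeftTroll",
                       "RightTroll", "RightTroll", "RightTroll"),
  followers = c(100, 100, 50, 300, 300, 300)
)

# Collapse to one row per account: tweet count plus a single follower figure
per_account <- toy %>%
  group_by(account_category, author) %>%
  summarise(count = n(), followers = first(followers), .groups = "drop")

# Then count accounts and sum followers within each category
per_category <- per_account %>%
  group_by(account_category) %>%
  summarise(unique_accounts = n(), total_follows = sum(followers))

print(per_category)
```

Summing `followers` directly over the tweet-level rows would instead give 250 and 900 for the two toy categories, which is why the one-row-per-account step matters.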

Same question, but only for accounts that tweeted about Bernie at least 10 times:

lrcount %>% filter(count >= 10) %>%
  group_by(account_category) %>% distinct(author, .keep_all = TRUE) %>%
  summarise(unique_accounts = n(), total_follows = sum(followers))
account_category unique_accounts total_follows
LeftTroll 84 84418
RightTroll 52 172664

Of the top 20 accounts with the most tweets about Bernie, how many belong to each of the two categories, and how many total follows did they have?

top20 <- lrcount %>% ungroup() %>%
  top_n(n = 20, wt = count)
# Group by category so the summary has one row per account type
top20 %>%
  group_by(account_category) %>%
  summarize(each = n(), total_follows = sum(followers))
account_category each total_follows
LeftTroll 4 18238
RightTroll 16 110380

Let’s look at one randomly selected Bernie tweet from each of the top 10 accounts of each type (left/right) and judge for ourselves whether it is positive or negative.

top10bytype <- lrcount %>%
  group_by(account_category) %>%
  top_n(n = 10, wt = count)
lrt %>% filter(author %in% top10bytype$author) %>%
  group_by(author) %>%
  sample_n(1) %>%
  select(author, account_category, content) %>%
  arrange(account_category)
author account_category content
4MYSQUAD LeftTroll @SnoopDogg Hey, Snoop, are you going to vote for Bernie Sanders?’
BAOBAEHAM LeftTroll .@BernieSanders outspending @HillaryClinton campaign on advertising spend in black media three to one. #BlackPressDay
BLACKEYEBLOG LeftTroll @TomScharff @vicenews Hillary said Bernie Sanders lies about it, there are proofs that there is a lot of fossil fuel money in her campaign’
BLACKMATTERSUS LeftTroll Eric Garner’s daughter supports Bernie Sanders #BlackMattersUS
BLEEPTHEPOLICE LeftTroll @BernieSanders the US is rolling into the abyss, judging by how it treats ’most vulnerable people’’
CANNONSHER LeftTroll @LadySandersfarm @glaad #NoGaySCOTUS -what’s NATURE Telling u?-GAYs=4% of species-NATURE doesn’t give Dem family Rights-Bullying’
JANELPERKINSON LeftTroll The reality, for those who still value reality, is Hillary Clinton is simply beating Bernie Sanders-in popular vote, states won & delegates.
JOHNIEVOGUE LeftTroll Just saw a Bernie Sanders ad that claims Asher Edelman was “the” inspiration for Gordon Gekko. Ivan Boesky—or Oliver Stone—might disagree.
NEWYORKDEM LeftTroll PRESS RELEASE: Sanders Strongest Candidate to Beat Trump
TRAYNESHACOLE LeftTroll It’s official Bernie Sanders is a SAVAGE
AMELIEBALDWIN RightTroll Millennial Fail: Bernie Sanders holdouts no-show Hillary Clinton’s N.H. rally via @bostonherald
FORCEOFLIBERTY RightTroll Bernie Sanders became the new Democratic frontrunner. Eat that, Hillary! #FeelTheBern #BernieStrong
HYDDROX RightTroll .@healthandcents @LVNancy: To reach the undecided, & Bernie/Jill DemExiters, & independents, use hashtag #election2016.
JENN_ABRAMS RightTroll Sanders: she has entire establishment behind her Hillary: I can’t be establishment because I’m a woman #lDemDebate
MISSOURINEWSUS RightTroll I can’t see any True Bernie supporter ever voting for #CrookedHillary It’s just not possible.
PATRIOTBLAKE RightTroll ILoveBernie1: RT AndrovicAna: When i’m overwhelmed I watch the video of a bird landing on BernieSanders podium and it calms me so much I ca…
PIGEONTODAY RightTroll Sanders makes a public plea for Democratic superdelegates to switch allegiances
REDLANEWS RightTroll Bernie Sanders sold out to establishment, your only chance to oppose Wall Street is Trump #FeelTheBern #BernieOrBust
TEN_GOP RightTroll More #FeelTheBern ideology to grow American debt! #UniteBlue #tcot #ccot #WeAreBernie
THEFOUNDINGSON RightTroll Ever wonder why there are no protesters at a Hillary or Bernie rally? Because we are to busy working and teaching our kids not to be scums

You can judge for yourself, but my conclusion is that tweets both attacking and supporting Bernie can be found among both the “left troll” and “right troll” designated accounts. A more in-depth analysis of sentiment would probably require human labeling of hundreds of tweets; I am doubtful that automated sentiment analysis would work very well on this data.
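
One way such a labeling exercise could start is sketched below: draw an equal-sized random sample from each account category and add an empty column for a human rater to fill in. This is only a sketch. The toy data frame, the sample size of 25, the seed, and the `label` column are all assumptions, not part of the analysis above.

```r
library(dplyr)

set.seed(2016)  # arbitrary seed so the same sample can be redrawn

# Toy stand-in for the filtered tweets
toy <- tibble(
  account_category = rep(c("LeftTroll", "RightTroll"), times = c(40, 60)),
  content = paste("tweet", 1:100)
)

# Draw the same number of tweets from each category and add an empty
# column for a rater to fill in (e.g. "positive", "negative", "unclear")
to_label <- toy %>%
  group_by(account_category) %>%
  slice_sample(n = 25) %>%
  ungroup() %>%
  mutate(label = NA_character_)

print(count(to_label, account_category))
```

Sampling within each category keeps the smaller group from being swamped, and fixing the seed lets a second rater work from the same set of tweets.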

Here’s a plot of the number of Bernie tweets each day by account type (right/left):

# One row per category and day, so each point is drawn once
plotdf <- lrt %>%
  group_by(account_category, publish_date) %>%
  summarise(count = n())

ggplot(plotdf, aes(publish_date, count)) +
  geom_point(aes(color = account_category))

Conclusion: the Simple Narrative is cherry-picked

Examples of tweets attacking Bernie Sanders can be found among both the “left trolls” and the “right trolls” in this data. This fact doesn’t fit nicely within the Simple Narrative, so only those tweets supportive of Sanders are mentioned in reporting on this issue. Where are the articles asking if the Russian campaign possibly prevented Sanders from winning the primary because, as the stronger candidate in matchup polls against Trump, he could have prevented their plan to install Trump in the White House?

For more perspective, I think it’s important to remember that foreign campaigns are not the only ones where nefarious actors are trying to influence social media. Social media trolls had prominent roles in recent Mexican elections. As a supporter of Sanders in the 2016 primary, I recall there was at least one pro-Clinton super PAC which spent at least $1 million to

[launch] a paid army of “former reporters, bloggers, public affairs specialists, designers” and others to produce online counterattacks

The campaign was “meant to appear to be coming organically from people and their social media networks in a groundswell of activism, when in fact it [was] highly paid and highly tactical.”

Two takeaway messages from all this:

  • Russiagate can and will continue to be weaponized against the left. Evidence will be cherry-picked to support the Narrative.
  • When we interact with accounts on social media or see content there, we should remember there is a decent chance it was designed to appear “organic” when it isn’t really. It’s more likely to come from those sources that have more money to spend on it, and will be harder to detect when it’s domestic compared to foreign.

Reproducibility addendum

Here is code to read FiveThirtyEight’s data and subset to tweets about Bernie.

It keeps English-language tweets containing “Bernie” or “Sanders,” but not containing “Sarah” or “Huckabee.”

It takes a long time to run because it has to download over 900 MB of data.

If you don’t want to run this yourself, I also stored the output (which is much smaller) in a file you can find on GitHub here.

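library(tidyverse)

# Build the nine CSV file paths, then read and row-bind them
files <- paste0("", 1:9, ".csv")
df <- map_df(files, read_csv)
# Keep English-language tweets mentioning Bernie or Sanders
bdf <- df %>% filter(str_detect(content, "Bernie|Sanders"), language == "English")
# Drop tweets about Sarah Huckabee Sanders
bdfns <- bdf %>% filter(!str_detect(content, "Sarah|Huckabee"))
write.csv(bdfns, "bernie_tweets.csv", row.names = FALSE)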
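
To see what the two keyword filters keep and drop, here is a small sketch on four made-up tweets (the text is invented for illustration). Note that `str_detect()` patterns are case-sensitive by default, so a lowercase “bernie” is not matched by this filter.

```r
library(dplyr)
library(stringr)

# Made-up tweets covering the cases the filter has to handle
toy <- tibble(content = c(
  "Bernie Sanders rally tonight",
  "Sarah Huckabee Sanders press briefing",
  "bernie is trending",
  "Nothing about the primary here"
))

# Same two-step filter as above: include by keyword, then exclude
matched <- toy %>%
  filter(str_detect(content, "Bernie|Sanders")) %>%
  filter(!str_detect(content, "Sarah|Huckabee"))

print(matched$content)
```

Only the first toy tweet survives: the second is excluded by the “Sarah|Huckabee” pattern, and the third is missed because of capitalization.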