The Russia Connection

The 87% Solution to Fake News

By Paul Rosenzweig
Wednesday, May 3, 2017, 1:26 PM

The bad news is that I am only 87% likely to be a real human being.  The good news (as Ben Wittes pointed out to me) is that I am, nonetheless, 24 percentage points more likely than he is to be a real human being.

Democracy has come to face a challenge: the prevalence of fake news, created by malicious actors and then retweeted and amplified by networks of bots that create the false impression of widespread acceptance and agreement.  So long as Western nations embrace concepts of free speech and political debate, the former aspect of the problem (screening out false information) will be difficult.  I can't really imagine a way in which anyone, whether government or service provider, could label some content as "false" or "wrong" without great difficulty.  But that doesn't mean we cannot combat the problem.  One plausible way to get a handle on the spread of fake news is to identify the artificial networks responsible for its amplification.  And therein lies the technical challenge (and the contingent nature of my humanity).

A recent project by "Truthy" at Indiana University gives a good window into how the technology might work (and how difficult it may be to implement at scale).  The project, known as BotOrNot, is an academic effort partially funded by NSF and DoD.  The objective is a limited one: to assess whether the traffic from a particular Twitter account can be analyzed to determine whether the account is connected to a real human being or is controlled by a bot network.  Because the assessment is probabilistic rather than definitive, the "score" assigned to an account is a numerical percentage rather than an absolute "bot or not" determination.

And thus my contingent humanity.  The analytics of my own Twitter account (@RosenzweigP) assign me a score of 13% as a bot (or, reciprocally, 87% as a real human being).  The analytics rely on things like the timing of my tweets, their language structure, and my use of hashtags as a way of gauging my genuine nature.  Interestingly, the area in which I most resemble a bot seems to be sentiment analysis: apparently, my tweets have been angry of late (you can imagine why), and that is indicative, at some level, of artificiality.
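For readers who want to see how a percentage like that might arise, here is a minimal sketch, not BotOrNot's actual model.  It assumes each signal mentioned above (timing, language structure, hashtag use, sentiment) has already been reduced to a number between 0 and 1, and it combines them with a logistic function.  The feature names, weights, and bias are all hypothetical, chosen only for illustration.

```python
import math

# Hypothetical weights for the kinds of signals the post mentions.
# These numbers are invented for illustration; they are NOT BotOrNot's.
WEIGHTS = {
    "timing_regularity": 1.5,   # how machine-like the tweet timing looks
    "language_structure": 1.0,  # e.g. unusual part-of-speech ratios
    "hashtag_use": 1.2,         # heavy use of widely amplified hashtags
    "sentiment": 0.8,           # e.g. persistently angry tone
}
BIAS = -2.5  # assumed offset so that low feature values give a low bot score


def bot_score(features):
    """Return a bot probability in (0, 1) from feature values in [0, 1]."""
    z = BIAS + sum(w * features.get(name, 0.0) for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))  # logistic function


def human_score(features):
    """The reciprocal score: the probability the account is human."""
    return 1.0 - bot_score(features)


# A made-up account whose most bot-like signal is sentiment.
example = {
    "timing_regularity": 0.1,
    "language_structure": 0.2,
    "hashtag_use": 0.1,
    "sentiment": 0.6,
}
print(f"bot: {bot_score(example):.0%}, human: {human_score(example):.0%}")
```

The point of the sketch is only that the output is a probability, so the "bot" and "human" figures always sum to 100%, and nudging any single signal upward nudges the bot score upward with it.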

An interesting counterpoint is provided by Ben Wittes' Twitter account, which comes back with a score of 37% bot (or 63% human).  From what I can discern, Ben's more "bot-like" nature stems in large part from his tweeting under certain hashtags (like "#notesfromundertrump") that get lots of retweets and have an air of influence operations to them.  It seems that the mechanism has yet to get a real grip on the difference between artificial influence and legitimate persuasion.  This is confirmed by the analysis of the Lawfare Twitter account (@lawfareblog) which, oddly, comes in at 34% and is, therefore, more "human" than Ben is!

By contrast, the Washington Post correspondent Ellen Nakashima (@nakashimae) comes in at 30%, oddly, because the content of her tweets (literally the noun-to-verb ratio and the like) is more bot-like.  And @PhyrexiaNewborn (who bills him/herself as "Russian Bot #37786") also comes in with a score of 30%, suggesting that the screen name is deliberately ironic.  Meanwhile, @lao232 (a Justin Bieber-following bot) registers a solid 71% as a bot, not human.

So, overall, the system seems to have some value, even though it can't quite seem to distinguish between Ben and Lawfare.  The project is clearly in beta testing at this point.
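To put the scores quoted so far side by side, here is a small sketch that converts each reported bot score into its "human" counterpart and ranks the accounts.  The scores are the ones given in this post.  The 50% cutoff is my own assumption for illustration, not a threshold the project itself uses, and Ben's account is labeled by name because the post does not give his handle.

```python
# Bot scores (in percent) as quoted in this post.
reported_bot_scores = {
    "@RosenzweigP": 13,
    "Ben Wittes": 37,
    "@lawfareblog": 34,
    "@nakashimae": 30,
    "@PhyrexiaNewborn": 30,
    "@lao232": 71,
}

# Hypothetical cutoff, chosen only for illustration.
ASSUMED_BOT_CUTOFF = 50


def human_percent(bot_percent):
    """The reciprocal score: 13% bot is 87% human."""
    return 100 - bot_percent


def rank_most_human(scores):
    """Account names ordered from lowest to highest bot score."""
    return sorted(scores, key=lambda account: scores[account])


for account in rank_most_human(reported_bot_scores):
    bot = reported_bot_scores[account]
    label = "likely bot" if bot >= ASSUMED_BOT_CUTOFF else "likely human"
    print(f"{account:18s} bot {bot:3d}%  human {human_percent(bot):3d}%  {label}")

# Gap between two accounts, in percentage points of "humanity".
gap = human_percent(reported_bot_scores["@RosenzweigP"]) - human_percent(
    reported_bot_scores["Ben Wittes"]
)
print(f"gap: {gap} percentage points")
```

Under that assumed cutoff only @lao232 would be flagged, and the three-point difference between Ben and Lawfare shows how little separates accounts in the middle of the range.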

Indeed, if you want the best example of the indeterminacy of the system, consider that @realDonaldTrump scores 47% (i.e. only 53% human—or basically a coin flip).  Make of that what you will.