Walter Haydock recently posted a proposal that the FBI build and deploy its own troll army, social media bots that he refers to as Artificial Intelligence Targeting Personas (AITPs). These would automatically engage with people, looking for evidence of radicalization and violent tendencies, and report back to a human if such evidence was discovered. According to Haydock, the AITPs would save FBI time and resources, adding efficiency to terrorist investigations.
At first glance, the idea sounds as if it might have merit. But three problems make this proposal highly unsound. First, we don't really understand how these programs work. And that means that we are in the well-known computer situation of GIGO: garbage in, garbage out. Second, the effort would only have merit if the data on which the machine learning were based were unbiased and non-discriminatory. Unfortunately, that's simply not the case. Third and finally, such a system would have a chilling and anti-democratic effect on society.
Let's explain the problems with the proposal in a bit more detail.
To begin with, it places far too much faith in machine intelligence, or AI, a field that is not nearly as well developed as people might imagine. Consider the recent example of Microsoft's Tay-bot. This conversational Twitter bot was hastily taken down sixteen hours after its release. Tay had been carefully trained to avoid discussions of Eric Garner, but within a day the bot was spewing racist and anti-Semitic obscenities. According to Microsoft, this was because it was being trained through a "coordinated attack" by groups of Twitter users. And if you think we can get away with bots that don't "learn" from their interactions, think again: without a dynamic interactive component, the bots will be obviously not human. We should learn our lesson from the Tay-bot; AI is simply not very smart yet.
A second problem is that sending immature talking bots as undercover agents will create more problems than it solves. When a poorly trained algorithm scores each of us on a scale of 0 to 1 for the probability of being a terrorist, almost no one will be deemed a 0. More likely, the score will be little better than the output of a random number generator, which will mean far too many false positives. Even innocuous behaviors come under suspicion because we are measuring them with a crude sieve that, frankly, doesn't know the meaning of the word "innocuous."
To see how this might play out, recall that after 9/11 the NSA started sending "a steady stream of telephone numbers, e-mail addresses and names" to the FBI, which was searching for terrorists. But instead of terrorists, the program led to dead ends and innocent Americans, swamping the FBI with unproductive leads and thus wasting its investigative time.
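The arithmetic behind such a flood of unproductive leads can be sketched in a few lines. Every number here is an assumption chosen purely for illustration, not an estimate of any real system: a population of one million monitored accounts, ten genuine threats among them, a bot that catches 90 percent of real threats, and a false-positive rate of just 1 percent.

```python
def screening_outcomes(population, true_threats, sensitivity, false_positive_rate):
    """Expected results of screening a population in which real threats are rare.

    All inputs are hypothetical; the function only does base-rate arithmetic.
    """
    innocents = population - true_threats
    true_positives = true_threats * sensitivity        # real threats flagged
    false_positives = innocents * false_positive_rate  # innocent people flagged
    flagged = true_positives + false_positives
    precision = true_positives / flagged if flagged else 0.0
    return {
        "flagged": flagged,
        "false_positives": false_positives,
        "precision": precision,  # share of flagged accounts that are real threats
    }


# Illustrative assumptions only.
result = screening_outcomes(
    population=1_000_000,
    true_threats=10,
    sensitivity=0.90,
    false_positive_rate=0.01,
)
print(f"Accounts flagged: {result['flagged']:.0f}")
print(f"Share of flags that are real: {result['precision']:.4%}")
```

Under these assumptions, roughly ten thousand accounts get flagged and fewer than one in a thousand of them is a real threat. A bot that is far better than a random number generator still buries investigators in innocent people when the thing it is looking for is this rare.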
Compare AITP bots with a better use of data coming from sports. When baseball managers use batting, pitching, and fielding metrics to determine who should play where, they're using directly relevant statistics to make judgements, and moreover their models will be tested in real-life situations and corrected if they're wrong. In other words, they're creating a healthy "data feedback loop" in which good models, based on highly relevant data, get even better over time.
The analogous situation for terrorist-hunting bots would require them to successfully find dozens, hundreds or even thousands of proven terrorists, and to hone their techniques over time. Those numbers are, we hope, not even in the ballpark of plausibility. Instead, the bots would need to rely on weak proxies of "terrorism-like activity," which is where we get in trouble.
In other words, the AITP bots will rarely have highly relevant information on, say, how often a particular human has shot dozens of people, or set off a suicide bomb. They will instead be forced to rely on people talking vaguely about violence, which is to say they will look for keywords in speech. Given how easy it would be to fool an AITP bot into being suspicious, and how rarely terrorists are actually provably discovered, it would be nearly impossible to create a healthy data feedback loop under such conditions.
To sum up, a single bot will generate an enormous pile of difficult work for humans to resolve. When you scale up, replacing a single human with an army of bots, that problem skyrockets. You've replaced the limitations of a human with the limitations of multiple weak algorithms with highly correlated errors. This will give the humans more work sifting through the detritus than they had before, and will very likely have them looking in the wrong place to boot. It's much better to do a smarter search to begin with.
Beyond the proposal's technical flaws, we are concerned about its potential effect on our larger society. The AITP approach sets us all up for being investigated, lowering the bar too far for intrusive social media surveillance. That has a real effect. As one of us has already noted here recently, the Church Committee wrote:
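To make the weakness of keyword proxies concrete, here is a deliberately naive sketch of a keyword flagger. It is not how any proposed AITP would work (the proposal does not specify one); the keyword list, the function name `flag_message`, and the sample messages are all hypothetical.

```python
import re

# Hypothetical watch list; a real system would be larger but face the same issue.
SUSPICIOUS_KEYWORDS = {"bomb", "shoot", "attack", "kill"}


def flag_message(text):
    """Return the sorted list of watch-list words that appear in the text."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return sorted(words & SUSPICIOUS_KEYWORDS)


# Invented, entirely innocuous messages.
messages = [
    "That concert was the bomb, the opening act will kill it on tour",
    "Going to shoot some hoops after work",
    "My heart attack scare turned out to be indigestion",
    "See you at the library at noon",
]

flagged = [m for m in messages if flag_message(m)]
for message in flagged:
    print(flag_message(message), "<-", message)
```

Three of the four harmless messages are flagged. Anyone who wanted to provoke the bot could do so just as easily, and with no supply of confirmed cases to learn from, there is little to tell the system which of its flags were mistakes.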
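The point about correlated errors can be illustrated with a small calculation. Suppose, purely as an assumption, that each bot wrongly flags a given innocent person with probability 0.3, and that an investigator acts only when a majority of eleven bots agree. If the bots erred independently, pooling them would help; if they are copies of the same weak algorithm, they all make the same mistake together and pooling buys nothing.

```python
from math import comb


def majority_error_independent(n_bots, error_rate):
    """Probability that more than half of n independent bots err on the same case."""
    needed = n_bots // 2 + 1
    return sum(
        comb(n_bots, k) * error_rate**k * (1 - error_rate) ** (n_bots - k)
        for k in range(needed, n_bots + 1)
    )


def majority_error_identical(n_bots, error_rate):
    """Perfectly correlated bots err together, so the majority errs as often as one bot."""
    return error_rate


# Illustrative assumptions only.
N_BOTS = 11
ERROR_RATE = 0.3

independent = majority_error_independent(N_BOTS, ERROR_RATE)
identical = majority_error_identical(N_BOTS, ERROR_RATE)
print(f"Majority wrong, independent bots: {independent:.3f}")
print(f"Majority wrong, identical bots:   {identical:.3f}")
```

Real bots would fall somewhere between these two extremes, but an army built from the same training data and the same model sits close to the second one: more bots produce more flags without producing more reliable ones.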
Persons most intimidated may well not be those at the extremes of the political spectrum, but rather those nearer the middle. Yet voices of moderation are vital to balance public debate and avoid polarization of our society.
Using AITP techniques to increase the size of the pool to be investigated will have an inevitable outcome of silencing the middle of the political spectrum. That would be disastrous for democracy.
For that matter, we've already seen how surveillance and suspicion overreach can effectively isolate and disaffect a population. One of us helped with the data analysis that led to the court case examining whether New York City's stop-and-frisk program was constitutional (it wasn't). And we are seeing AI and machine learning techniques proliferate worldwide while producing highly inaccurate and distasteful responses.
When we deploy terrible AI bots, we will subject many in our population to crude profiling and suspicion, leading to silencing and victimization, while others might pick up on the bots, manipulate them, and rightly complain about them. Using such AITP bots to find terrorists would be the Tay-bot controversy times a thousand. At best, it would be useless and an embarrassment. At worst, it could do serious harm.