I was at the National Security Agency yesterday giving a Constitution Day speech, and I learned details of a shocking collection program: The government is bulk collecting all traffic on Twitter. Under a program menacingly called "Bulk Data in Social Media" and abbreviated---appropriately enough---as BDSM, Twitter has been providing all public traffic since 2010 for a massive government database that, as of early last year, contained 170 billion tweets. The goal of this program? To "collect the story of America" and to "acquire collections that will have research value" to analysts and others. Believe it or not, Twitter has been cooperating with the BDSM program since its inception in 2010 without any court order or FISA Court review. What's more, the government appears to use no minimization procedures in processing this material; U.S. person data and that of non-U.S. persons are intermixed. Even worse, the government is actively contemplating the use of the BDSM database to examine activity that is clearly protected by the First Amendment: "broad topics of interest . . . run from patterns in the rise of citizen journalism and elected officials' communications to tracking vaccination rates and predicting stock market activity."
Why would NSA do all this?
It wouldn't. The agency I'm talking about here is the Library of Congress. That said, the only part of the above description that isn't true is the name "Bulk Data in Social Media" and the abbreviation BDSM; the actual program is called, more prosaically, the "Twitter Archive at the Library of Congress." Ironically, while researchers all over---including you---will be able to search and collect tweets from 15-year-olds all over the world, NSA probably will not. After all, it just can't go willy-nilly through databases containing material on U.S. persons.
So here's the question: If you were shocked when you read the first paragraph of this post and relieved when you read that the agency doing all this collection is not NSA but the good guys over at the Library of Congress, and that the good guys are actually planning to make that data available widely, why did you have those reactions? And do those reactions make sense?
UPDATE/CORRECTION: The following Washington Post story was brought to my attention on---of course---Twitter. It makes clear that the Library of Congress is, in fact, purging deleted tweets:
There are other limitations. The library is not archiving tweets from those who opt for the strictest privacy settings, which allow Twitter users to approve or reject each potential follower. The library is also planning to scrub deleted tweets, meaning the public won’t have access to posts that were published but later removed. Dizard, citing privacy concerns, calls that decision “one of the more significant policy questions we face.”
In its terms of service, Twitter says that the default is “almost always to make the information you provide public for as long as you do not delete it from Twitter.”
Moody says it follows that deleted tweets are off-limits.
I have amended this post accordingly.