In the worlds of data protection and privacy, national security issues are too often decoupled from what might be termed non-national security issues, despite the clear interplay between the two realms. Over the past decade, U.S. adversaries have vacuumed up the personal data of many Americans, with one nation likely at the fore: the People’s Republic of China (PRC). The PRC was connected to the Office of Personnel Management and Equifax hacks, both of which provided massive troves of data the PRC has reportedly used to foil U.S. espionage and intelligence collection efforts abroad. What’s more, the collection of personal data did not stop with these hacks. In September 2020, an Australian security firm turned up evidence of an enormous trove of personal data on American, British, and Australian citizens collected and maintained by a PRC company, Zhenhua Data, with links to the country’s military and security services. It appears the data was scraped from public-facing websites. But the issue does not stop with techniques like these.
Policymakers tend to focus on access to personal data acquired illegally or in the service of espionage. This makes sense, for nothing focuses the data protection mind quite so much as a mammoth, public breach. And there’s no shortage of such exploitations, with Microsoft Exchange, Accellion and SolarWinds being the most recent examples. However, policymakers are not giving enough thought to the possible legal means by which the personal data of Americans and others may be obtained. To be sure, the Trump administration’s rationale for taking steps to ban ByteDance and WeChat in the United States was, in part, that users’ data would eventually make it to the PRC for processing in ways that threatened national security. But as critics pointed out at the time, if the massive industries related to the collection, processing, and selling or sharing of personal data abroad were a concern, there are threats as big, if not bigger, closer to home. Just this week, the Wall Street Journal detailed how one now-defunct defense contractor inadvertently discovered it could track U.S. troops convening in Syria for operations in 2016.
Western policymakers may be missing the forest for the trees by focusing only on hacks, exfiltrations, apps and software as threats to national security. Indeed, there is a universe of personal data that countries like the PRC may be accessing that is generally not part of this conversation: datasets obtained through data brokers.
To the extent data brokers are discussed in Washington, it is often in the context of data protection and data privacy legislation focused on the possible harms that the unfettered use and sharing of personal data may bring in commercial and privacy contexts. Scant consideration has been paid to the national security side of the bustling personal data trade, though that may be changing. It is possible, and in fact quite likely, that the security and intelligence services of many nations have used personal data that is normally confined to the commercial world. This post explores the national security implications of data brokers and discusses potential reforms, such as those proposed recently by Sen. Ron Wyden, to better protect American data.
California and Vermont have enacted laws requiring the registration of data brokers operating in those states, and legislation has been proposed in Congress to do the same. To be fair, the Vermont law bars some data-brokering activities, but some of these activities may have been illegal already under fraud statutes. Still, such registration would seem to do little to stem the flow of personal data to data brokers and their clients.
First, it is not even clear which entities would be considered data brokers, as there is some disagreement about how to best define the term. In a 2016 report, Upturn and the Open Society Foundation found “[t]here is no authoritative definition of ‘data broker’ on either side of the Atlantic.” Nonetheless, they proposed this working definition:
[a] company or business unit that earns its primary revenue by supplying data or inferences about people gathered mainly from sources other than the data subjects themselves.
This definition is better than the Federal Trade Commission’s (FTC’s) 2014 definition, mainly because it recognizes there are companies whose sole business is not data brokering (for example, LexisNexis or Thomson Reuters) that are major players in this space. Nonetheless, this is how the FTC defined this group of entities:
companies whose primary business is collecting personal information about consumers from a variety of sources and aggregating, analyzing, and sharing that information, or information derived from it, for purposes such as marketing products, verifying an individual’s identity, or detecting fraud.
Earlier this month, Justin Sherman discussed definitional problems and gaps with the California and Vermont statutes on Lawfare, arguing that getting these matters right in federal legislation is critical.
Perhaps policy discussions about the practices of amassing, analyzing and selling personal data would be improved by focusing less on identifying companies that qualify as data brokers and more on the activity of data brokering. An activity-based approach may better capture the breadth of the field, as illustrated by a recent Markup piece about the money that stakeholders in the data brokering field have spent on lobbying efforts in Washington. The article includes lobbying expenditures by major companies involved in the data broker industry, including Oracle, Deloitte, PwC and others.
Data brokering relies on both public and non-public data. The former is obtained legally from governments, often in the form of publicly available records, and from whatever information people make available online, often through their social media accounts. The non-public data comes from the torrent of information that laptops, smartphones, virtual voice assistants, smart TVs, cars and other devices constantly transmit about their users. One’s financial institution, retailer, and employer may also be submitting data to the brokering industry. In short, this little-understood industry is collecting and analyzing enormous amounts of personal information.
Though wide reaching, data brokering is a field largely invisible to most people. As Upturn and the Open Society Foundation explained:
The data brokerage industry is vast, varied, and complex. Data brokers count among their customers advertisers, merchants, employers, bankers, insurers, police departments, schools, hospitals, and others. They seek to meet the varied needs of their customers by collecting data from many different sources, and selling different types of products, ranging from simple lists to scores produced by proprietary actuarial models.
And much of this market is opaque to policymakers, in Washington and elsewhere. The organizations further found that data brokers are unwilling to help shine light on their business models and practices, often relying on the confidentiality clauses in their contracts with customers or other organizations.
In their recently published NATO Strategic Communications Centre of Excellence paper, Henrik Twetman and Gundars Bergmanis-Korats took “a closer look at data brokers and the data industry to investigate how the commercial availability of data can be exploited and lead to security issues for military organisations such as NATO and its Allies.” The study is significant for addressing the security implications of data brokers, a break from the norm: most attention to date has focused on the commercial uses of available data, typically marketing or targeted advertising, rather than on its availability to, and potential use by, security or intelligence services, especially those of U.S. adversaries or malign groups.
The authors of this paper ran an experiment to see if they could obtain data from brokers that would theoretically allow them to use the data for any purpose. In short, the authors “wanted to understand what data can be purchased in a smaller European country (in this case Latvia), how easy it is to purchase, what it would cost, and how the data would be delivered to us.” They found:
First, although limited in comparison to other countries, the data available for Latvia are still useful; interesting data sets can be purchased at the right price and from the right vendor. Second, although raw data may be difficult to obtain, it is available to persistent buyers, and even processed data can be used for exploitation.
Moreover, as the authors succinctly observe in a sidebar:
The potential for the exploitation of personal data is huge, and data brokers who routinely collect, aggregate, and store this type of data are naturally prime targets for malicious actors who can either purchase the data legally or obtain it illegally and then use it to achieve their objectives.
The authors also write that “the brokers [they] interacted with were diligent in screening potential buyers, and made inquiries about how the data would be used, stored, and processed.” This finding is in line with current regulations: there are some limits on who may use data brokering services and products, and under what circumstances. Under U.S. law, some data brokers, specifically those deemed consumer reporting agencies, may disclose consumer information only under limited circumstances. But the restrictions are not comprehensive: Not all data brokers or those in the data brokering business are consumer reporting agencies, so, as U.S. law is currently written, this bar would not apply to them.
And while it is reassuring that Twetman and Bergmanis-Korats found the data brokers with whom they dealt diligent in vetting possible sales, how difficult would it be for nefarious actors to set up a front operation or shell company to purchase datasets after sending all the right signals about why the data was wanted? Additionally, there may be companies in the data brokering world that are not terribly interested in the plausibility of a potential client’s stated intentions and will sell to anyone who can pay.
Ultimately, Twetman and Bergmanis-Korats found that nations like Latvia were at limited risk because of the paucity of personal data collection (which makes data more expensive) and limits on data collection and usage, due in part to Latvia’s membership in the European Union and the reach of the General Data Protection Regulation. However, the authors warned that in nations “where data is abundant and relatively accessible,” the risk is substantially higher, a concerning finding for nations like the U.S., where much more personal data is collected.
To this point, in December 2019, the New York Times obtained a dataset with fairly little trouble that was then used to track a Secret Service agent assigned to protect then-President Trump:
The Times Privacy Project obtained a dataset with more than 50 billion location pings from the phones of more than 12 million people in this country. It was a random sample from 2016 and 2017, but it took only minutes—with assistance from publicly available information—for us to deanonymize location data and track the whereabouts of President Trump.
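The technique the Times describes, joining “anonymous” location pings against a handful of publicly known movements, can be illustrated with a toy sketch. Everything below is invented for illustration (the device IDs, coordinates, and matching thresholds are assumptions, and real broker datasets are vastly larger and messier); the point is only that a device ID plus public context identifies a person.

```python
from datetime import datetime

# Hypothetical "anonymized" location pings: (device_id, lat, lon, timestamp).
# Device IDs carry no name, but a person's movement pattern is nearly unique.
pings = [
    ("dev-481516", 38.8977, -77.0365, datetime(2017, 2, 1, 9, 5)),   # known public site
    ("dev-481516", 38.9072, -77.0369, datetime(2017, 2, 1, 18, 40)), # private residence
    ("dev-232342", 38.8899, -77.0091, datetime(2017, 2, 1, 9, 10)),  # unrelated device
]

# Publicly reported sightings of the target (schedules, news photos, events).
known_visits = [
    (38.8977, -77.0365, datetime(2017, 2, 1, 9, 0)),
]

def near(a, b, tol=0.001):
    """Crude coordinate match within ~100 meters of latitude/longitude."""
    return abs(a - b) <= tol

def candidate_devices(pings, visits, window_minutes=30):
    """Keep only device IDs whose pings match every publicly known visit."""
    candidates = {dev for dev, _, _, _ in pings}
    for lat, lon, when in visits:
        hits = {
            dev
            for dev, plat, plon, pts in pings
            if near(plat, lat)
            and near(plon, lon)
            and abs((pts - when).total_seconds()) <= window_minutes * 60
        }
        candidates &= hits
    return candidates

suspects = candidate_devices(pings, known_visits)
# Once a single device survives the filter, every other ping from that
# device (home, office, travel) is effectively deanonymized.
trail = [p for p in pings if p[0] in suspects]
```

With even one or two corroborating public sightings, the candidate set collapses quickly, which is why “anonymized” location datasets offer so little protection in practice.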
If the Times figured out how to use these data, it is reasonable to assume U.S. enemies, adversaries, and perhaps even friendly nations and allies have used data brokers to acquire information. Why wouldn’t they? It is the raison d’être of intelligence agencies and security services to acquire information from all possible sources for analysis.
The U.S. intelligence community has admitted at least one adversarial nation is acquiring personal data through legal avenues. In February 2021, the National Counterintelligence and Security Center (NCSC) asserted:
For years, the People’s Republic of China (PRC) has collected large healthcare data sets from the U.S. and nations around the globe, through both legal and illegal means, for purposes only it can control. [Emphasis added.]
To be fair, the legal means include partnerships with U.S. firms and the funding of U.S. research, so it is not entirely clear the NCSC was referring to data brokering. But given how little we know about the field, it may well have been. The NCSC further contended that in regard to the health care and medical data of Americans—some of the most sensitive information about a person—“U.S. safeguards focus primarily on privacy, not national security, which creates a vulnerability for foreign actors to gain access to data on U.S. persons.”
But perhaps the most significant reason foreign intelligence services and others are using data brokers is because U.S. agencies are doing it, too. Significant evidence shows that U.S. agencies have acquired the location data of Americans from commercial entities for law enforcement and immigration purposes: According to reporting by the Wall Street Journal and Vice, Immigration and Customs Enforcement (ICE), the Internal Revenue Service (IRS), and Customs and Border Protection (CBP) have all bought location data from a company called Venntel. And it’s not just law enforcement agencies using these services. Earlier this year, the Defense Intelligence Agency admitted it has been buying commercial location data, and last year, it was reported that U.S. Special Operations Command was buying and using location data to aid special forces operations outside the United States. The Intercept reported that ICE signed a contract with LexisNexis for the company to provide the agency with a wide range of personal data. It is not a stretch to envision foreign agencies buying and using these data and much else.
And the data brokering world is not the only realm the U.S. needs to have a better grip on in order to safeguard its national security. The online advertising industry may be another source of information for foreign intelligence services. As six senators recently argued in letters to some of the largest advertising platforms, “[t]his information would be a goldmine for foreign intelligence services that could exploit it to inform and supercharge hacking, blackmail, and influence campaigns.” The senators specifically raised concern about the real-time bidding (RTB) system, a program by which advertisers can place their ads on websites. RTB often makes available to all bidders sensitive information about the people viewing or frequenting a website, regardless of whether their bid prevails. There are allegations that a number of companies use the RTB system to collect data that they then sell to government agencies, including U.S. law enforcement and security agencies.
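The exposure the senators describe can be sketched in miniature. The field names below loosely follow the public OpenRTB specification, but this is a simplified, hypothetical bid request, not a faithful rendering of any platform’s traffic; the point it illustrates is that every participant in the auction receives the user-identifying fields, whether or not its bid wins.

```python
import json

# Simplified bid request, loosely modeled on OpenRTB field names.
# Every bidder in the auction receives this payload, win or lose.
bid_request = {
    "id": "auction-7f3a",
    "site": {"page": "https://example.com/clinic/depression-screening"},
    "device": {
        "ifa": "6D92078A-8246-4BA4-AE5B-76104861E7DC",  # advertising ID
        "ip": "203.0.113.45",
        "geo": {"lat": 38.8977, "lon": -77.0365},
    },
    "user": {"id": "broker-profile-88412"},
}

# Fields even a losing bidder can retain and correlate across auctions,
# building a longitudinal profile keyed to the advertising ID.
harvestable = {
    "ad_id": bid_request["device"]["ifa"],
    "location": bid_request["device"]["geo"],
    "interest_signal": bid_request["site"]["page"],
}
print(json.dumps(harvestable, indent=2))
```

Because a bidder can participate in millions of auctions while winning almost none, RTB functions as a low-cost bulk collection channel for anyone willing to stand up a bidding endpoint.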
It also bears mention that many U.S. multinationals share selected personal data with entities under the jurisdiction of countries like Russia and the PRC, in some cases because of legal requirements. For example, in an April 1 PayPal agreement entitled “List of Third Parties (other than PayPal Customers) with Whom Personal Information May be Shared,” PayPal names a number of Russian and Chinese entities with whom users’ personal information may be shared. Ideally, the data is shared with, say, Russia’s VTB Bank only if there is a transaction with a Russian entity, but the disclosure does not say. Putting that issue aside, there is the possibility that when PayPal shares my personal data with VTB as I buy Fabergé eggs, the Russian government gets access to my data.
Turning to the regulatory landscape in the U.S., data brokers are regulated in various ways. The FTC regulates data brokers and a host of other entities under its powers granted by Section 5 of the FTC Act, which bars deceptive and unfair practices. The FTC and the Consumer Financial Protection Bureau (CFPB) share enforcement authority for the Fair Credit Reporting Act (FCRA), but as mentioned earlier, this statute provides for the regulation of only a class of data brokers (that is, consumer reporting agencies). In its 2014 report on data brokers, the FTC noted that the statute “generally governs the practices of entities that assemble or evaluate consumer information for use by creditors, employers, insurance companies, landlords, and others engaged in making certain eligibility determinations affecting consumers.” In the same document, the FTC “described three different categories of data brokers: (1) entities subject to the FCRA; (2) entities that maintain data for marketing purposes; and (3) non-FCRA covered entities that maintain data for non-marketing purposes falling outside of the FCRA, such as to detect fraud or locate people.” The latter two classes of data brokering fall outside the FCRA, and to the extent they would be policed, it would be under the FTC’s Section 5 powers or possibly the CFPB’s powers over unfair, deceptive and abusive practices for those data brokers trafficking in financial services data.
The present level of federal regulation of data brokers is clearly insufficient, and, importantly, some lawmakers are thinking about bolstering the current regime. If data brokers were to be regulated, it might make sense to impose restrictions within the context of federal privacy legislation that would place limits on the types of data that can be collected, processed, used, shared or sold. The limits could approximate those found in the European Union’s General Data Protection Regulation, which stresses proportionality and necessity and also calls for the destruction of data no longer being used.
For example, the necessary and proportionate concept has filtered into federal privacy legislation such as the Consumer Online Privacy Rights Act (COPRA) (S.2968). A number of bills have language that would establish data broker registries at the FTC, and depending on the consent regime envisioned, there may be some limits on the personal data available to data brokers. COPRA would, for instance, treat entities engaged in data brokering as “covered entities.” This designation carries obligations, such as requiring data brokers to heed the new rights people would have under the act. Under COPRA, data brokers would also have to publish privacy policies that explain whether they transfer “covered data,” the categories of third parties and service providers to which such data may be transferred, and the third parties to whom data is transferred except for “governmental entities.” This last term is not defined in the bill and may be credibly read to mean any governmental entity around the world. Most likely the drafters intended U.S. governmental entities, but this is a possible loophole that should be closed in this bill and others like it. COPRA would additionally require the covered entity to disclose the third parties to whom it transfers covered data—though arguably, this won’t mean a lot to the average person. People who access sites in the European Union, for instance, are often confronted with a mind-numbingly long list of entities with whom their personal data may be shared.
More relevant for national security concerns, under COPRA (and several other recently introduced privacy bills such as the SAFE DATA Act [S.4626] or the Information Transparency and Personal Data Control Act [H.R.1816]), people may opt out of data transfers of the sort that allows data brokering to thrive. Moreover, one would need to opt in to the collection, processing and transferring of sensitive personal data. The FTC would conduct a notice-and-comment rulemaking to flesh out this system. Yet notice-and-consent regimes probably won’t do much to stem the tide of personal data flowing out of the U.S. for the simple reason that such issues are not on most people’s minds. If COPRA or a similar bill were to be enacted, one may expect a minority of people to object to these practices by opting out (and not opting in) as necessary. I would venture to speculate that most people would continue to be unconcerned about privacy matters and continue to let their information be collected, processed and transferred. If so, the national security implications of data brokering would remain largely unaddressed.
One must also consider the resource constraints of the agency that would presumably police the compliance of covered entities. The resource and staffing limitations of the FTC have been noted, even though the agency received a $10 million bump for fiscal 2021. The FTC is unlikely to be able to thoroughly regulate the data brokering world based on its current funding, especially since it would be tasked with regulating the privacy and data protection practices of many more entities. And while state attorneys general would be able to regulate under many of the new privacy bills, they face many of the same resource constraints the FTC does. Absent more restrictive language and more appropriations for the FTC, the current notice-and-consent framework present in most privacy bills may not adequately confront the problem that data brokering presents for U.S. national security.
Policymakers may also consider an approach taken by the Trump administration in its waning days regarding cloud platforms. In its January 2021 Executive Order 13984, “Taking Additional Steps to Address the National Emergency With Respect to Significant Malicious Cyber-Enabled Activities,” the outgoing administration tasked the Department of Commerce with a rulemaking “that require[s] United States Infrastructure as a Service Product providers to verify the identity of a foreign person that obtains an Account.” A similar regime for data brokering would throw some much-needed light on this industry and provide the U.S. government with a more accurate view of which entities are selling what information to whom.
Slipping a provision into a privacy bill establishing in law a system similar to Executive Order 13984 for all those engaged in the brokering of Americans’ data may be feasible. Then, the question would become which agency would own the collection and enforcement of these requirements. It may make sense to hand this to the Department of Commerce, given its work on the “know your customer” requirements for cloud service providers. The FTC may be another candidate, but as mentioned above, staffing may be an issue. Of course, there could be extraterritorial issues to contend with, for the data brokering world is a global endeavor and Congress would need to be explicit in clarifying that it intends for any such statute to apply outside of the U.S. per Supreme Court precedent. But there would be limits, for it may be unreasonable for the U.S. government to assert jurisdiction in order to collect all information on customers further downstream in the data brokering sales chain. In any event, this may help U.S. policymakers better understand data brokering, but a means of stopping the flow of personal data to adversaries is still needed.
Ultimately, perhaps export controls are needed, for personal data functions much like a dual-use technology. While data may not be a technology per se, it is the sine qua non of the information age. Some liken data’s importance to states and national security to that of oil in the 20th century. If that’s the case, the U.S. needs a more aggressive approach than it is currently taking to regulate entities that deal in personal data. Congress could revisit export controls, and the Biden administration may already possess some of the authority it needs.
In the U.S. export control statute, technology is defined as “information, in tangible or intangible form, necessary for the development, production, or use of an item.” Given that definition, it may be a stretch to shoehorn personal data into the category, but there may still be a colorable argument. In the event that personal data is not deemed technology and cannot be swept into current export controls on technology, the president, acting through the secretary of commerce, has the authority under the statute to add items to the Export Administration Regulations and the Commerce Control List. The Biden administration could use this power, putting export controls on personal data in order to address data brokering that puts U.S. national security at risk.
Moreover, there may be an appetite in Congress for directing the president to implement such controls on the data brokering industry. Democratic Sen. Ron Wyden of Oregon recently released a discussion draft of the Protecting Americans’ Data from Foreign Surveillance Act, a bill he claims “would create new safeguards against exporting sensitive personal information to foreign countries if doing so could harm U.S. national security.” The bill would append a new section to the existing export control statute regarding the export of certain personal data. This bill would provide an impetus and framework for the Biden administration to begin addressing the unimpeded flow of U.S. personal data. Presumably this bill would bar data brokering for clients in or associated with nations such as the PRC, Russia, Iran, North Korea and others.
Under Wyden’s proposal, the Department of Commerce would need to establish an interagency process to determine (a) which categories of personal data would be covered by the export control system, (b) the threshold above which the export of specified categories of personal data would be controlled, (c) the nations for which one would need a license or other authorization to transfer, export, reexport, or conduct in-country transfers of covered data, and (d) a list of those nations for which no such license or authorization would be needed.
The bill further provides that this interagency process would focus on the categories of personal data that could be exploited to the detriment of national security and would need to name these categories within one year of enactment. Also within one year, these agencies would need to set a threshold between 10,000 and 1,000,000 U.S. residents above which an entity’s proposed transfer of covered categories of personal data may entail obtaining a license or authorization. As noted above, the agencies would create a list of nations to whom the export of covered personal data is likely to harm national security. And any proposed transfers to these nations would require the exporting party to make the case that national security would not be harmed (the interagency process must review all such applications).
Certain transfers would be exempted in Wyden’s bill. For example, a person sending her own personal data would not need an export license. Likewise, if a person is performing a service for another and the transfer of strictly necessary personal data is required, no license would be needed. (This language is tightly written, conditioning the exception on strict necessity so that it does not swallow the rule that an export license is required.) Moreover, the bill provides that if the personal data is encrypted to certain standards, an export license may not be needed. And to protect the data from foreign adversaries, the interagency process would also need to set the length of time each category of covered personal data must remain encrypted.
Violations would be punished under the current export control regime, and some people whose personal data is transferred in violation of the act would be able to sue. Notably, only those physically harmed, detained or imprisoned in a foreign jail as a result of the violation would have a private right of action. And five years after the bill’s enactment, unintentional transfers to nations identified as national security risks could be punished unless the data is encrypted or is delivered through a third party that certifies the data will not transit or end up in a prohibited nation.
Incidentally, Wyden and co-sponsors also introduced the Fourth Amendment Is Not for Sale Act this month, a bill that would largely bar data brokers from selling or sharing location data and other personal data to U.S. law enforcement and intelligence agencies unless approved by a court.
Another means to control the flow of Americans’ personal data is to expand the use of the Committee on Foreign Investment in the United States (CFIUS) process. This is not without precedent. In 2020, CFIUS ordered Chinese gaming company Beijing Kunlun Tech Co. to sell Grindr, a dating app, over concerns that the personal information of U.S. residents would be transmitted to China, where it could be used by PRC security services. The U.S. government may be able to address one source of data leakage the NCSC hinted at: the legal acquisition of personal data. In the recent overhaul of the CFIUS, Congress expressed concern about:
the extent to which a covered transaction is likely to expose, either directly or indirectly, personally identifiable information, genetic information, or other sensitive data of United States citizens to access by a foreign government or foreign person that may exploit that information in a manner that threatens national security.
Consequently, the CFIUS process was widened to encompass a new set of “covered transactions,” to include “[a]ny other investment … by a foreign person in any unaffiliated United States business that … maintains or collects sensitive personal data of United States citizens that may be exploited in a manner that threatens national security.” But this authority would only address instances in which a foreign entity is buying or investing in a U.S. entity that holds or controls personal data. It would not address many of the sundry data collection, processing and sharing practices common in the data brokering world. Having said that, a more vigorous CFIUS process to ensure foreign adversaries are not buying their way into U.S. data brokering operations is welcome and would help address security concerns. Perhaps instituting a “know your customer” requirement for data brokering would allow for easier CFIUS enforcement.
There may be other statutory and policy changes that address the national security concerns raised by unfettered, largely unregulated data brokering. Given the range of companies engaged in data brokering and the wider data collection, processing and sharing ecosystem, crafting a legislative solution will take care and input from stakeholders. However, this is a matter of crucial importance the Biden administration and like-minded nations need to address.