Federal Privacy Rules Must Get “Data Broker” Definitions Right

By Justin Sherman
Thursday, April 8, 2021, 11:00 AM

Corporate data collection is all over the headlines, between a recent Senate hearing on antitrust, the appointment of Tim Wu as the president’s special assistant for technology and competition policy, the nomination of Lina Khan for Federal Trade Commission (FTC) commissioner, and growing momentum on the Hill for federal privacy legislation. Much of this privacy conversation focuses on social media companies like Facebook and Twitter, and how they monetize data and feed their artificial intelligence algorithms, as well as market-dominant players like Amazon, and how they might internally use data for anti-competitive behavior.

Yet data brokerage (in broad terms, companies buying and selling data on consumers) remains heavily underdiscussed in these conversations on potential regulatory action. Those engaged in this practice include companies that sell large datasets that other firms use to microtarget online ads, and companies that aggregate information from public records and link it to individuals on their websites (those “click here to run a background check” ads you see on search engines). Some companies, such as LexisNexis, have transformed their traditional business into data brokerage, while others, such as traditional consumer reporting agencies and companies that help local governments digitize their records, have often made data brokerage just one component of what they do. That’s why the entity that directly and initially collects a consumer’s information is often only the first in a long chain that will acquire it. All of this means data brokerage must be a core component of federal privacy legislation and enforcement actions.

As with many data concepts, defining the practice of “data brokerage,” or the “data brokerage industry,” or what constitutes a “data broker” itself is complicated. But to strengthen their efforts in developing federal privacy rules, policymakers have some foundation. This post examines existing definitions and understandings of data brokerage in U.S. law and policy. It analyzes the scope of these definitions, surrounding context, and some of the costs and benefits of the various terms. Putting appropriate privacy and security controls on the practice of data brokerage is no easy task, but this could be one place to start.

Existing State Laws: California and Vermont Case Studies

A few states define data brokers and data brokerages in their respective laws. California Civil Code § 1798.99.80, which requires data brokers to register with the state, defines a data broker as “a business that knowingly collects and sells to third parties the personal information of a consumer with whom the business does not have a direct relationship.”

The code specifies several exemptions from this definition:

(1) A consumer reporting agency to the extent that it is covered by the federal Fair Credit Reporting Act (15 U.S.C. Sec. 1681 et seq.).

(2) A financial institution to the extent that it is covered by the Gramm-Leach-Bliley Act (Public Law 106-102) and implementing regulations.

(3) An entity to the extent that it is covered by the Insurance Information and Privacy Protection Act (Article 6.6 (commencing with Section 1791) of Chapter 1 of Part 2 of Division 1 of the Insurance Code).

Vermont Statute 9 V.S.A. § 2430—which similarly mandates data brokerage disclosure to the state, which then lists those firms in an online database—defines a data broker as “a business, or unit or units of a business, separately or together, that knowingly collects and sells or licenses to third parties the brokered personal information of a consumer with whom the business does not have a direct relationship.” It then provides the following exemptions for entities:

(i) developing or maintaining third-party e-commerce or application platforms;

(ii) providing 411 directory assistance or directory information services, including name, address, and telephone number, on behalf of or as a function of a telecommunications carrier;

(iii) providing publicly available information related to a consumer’s business or profession; or

(iv) providing publicly available information via real-time or near-real-time alert services for health or safety purposes.

First, the definitions themselves. California’s definition predicates the data broker qualification on a business knowingly collecting and selling personal information on consumers “with whom the business does not have a direct relationship.” This “direct relationship” clause is key, because it excludes a wide range of businesses from being classified as data brokers by drawing a distinction between an entity that is a data broker and a firm that engages in data brokerage. Facebook, for instance, would ostensibly not be deemed a data broker under this definition if it were to knowingly collect information on its own users and sell it to a third party, because those users have a direct relationship with the platform. Yet such a sale, particularly given Facebook’s extensive reach in the U.S. and abroad, would raise numerous concerns about privacy, democracy and national security.

Federal policymakers should consider whether this is a worthwhile distinction. If the company is selling data on consumers, does it really matter to the regulator if the company has a direct relationship with them? And how is a direct relationship defined, exactly? If I complain to Facebook about a user posting my Social Security number, does that interaction now constitute a direct relationship with the firm?

Perhaps the data broker and data brokerage distinction does not matter: One could argue that federal policy should control for the end case, the sale of the U.S. customer’s information. Or the distinction could reflect a policy aimed at imposing even greater restrictions on companies that sell data directly obtained from their own customers, under the argument that firms in this category have an even greater need to responsibly process and handle the data, because they are the direct collectors of said information. However, the California law did not use this distinction to impose such greater restrictions, which may suggest that companies lobbied to be excluded from a data broker classification to avoid potential future obligations, like downstream liability for how customers use the data.

Nonetheless, the data broker entity and data brokerage activity distinction under California law appears to become less meaningful with a firm like Equifax, which was recently reported as having sold user utility data to a third-party company that then sold the information to the Department of Homeland Security. Equifax is not listed on California’s data broker registry. The company’s business relationship with the average consumer is arguably more indirect than direct, but it buys and also sells their indirectly acquired and very personal financial and other information. This raises the question of whether Equifax should be classified as a data broker as a matter of public policy if it is selling consumer data to third parties, including federal law enforcement. California’s distinction between a data broker entity and one engaging in the practice of data brokerage, and its exclusion of myriad data-selling and data-sharing firms in the process, demonstrates a need for fundamental reconsideration when crafting federal law and policy over this practice and this industry.

Vermont’s definition of a data broker virtually mirrors California’s, but for the inclusion of the phrase “or licenses” when speaking about activities, in addition to collecting and selling, that fall under the defined activities of a data broker. Companies as a result do not need to technically sell information to a third party to qualify as data brokers if they meet the other requirements; licensing said information to a third party is enough. It’s a broader definition than that in California law, and a sensible one in light of complex data-sharing arrangements between private companies in the United States that may not technically qualify as a sale.

The exemptions from the California and Vermont definitions of a data broker are also worth noting. While California’s exemptions seem primarily architected to avoid conflicts with federal law, the scope of Vermont’s possible exemptions is far wider. Companies “developing or maintaining third-party e-commerce or application platforms” are not classified as data brokers in Vermont. Such an exemption seems unduly broad, and it again echoes the California law’s distinction: Some firms may engage in the general practice of data brokerage (collecting and selling and/or licensing consumers’ data to third parties), but they may not be classified as data brokers. Under Vermont’s exemption, Amazon could therefore be collecting data on all kinds of consumers with whom it does not have a direct business relationship, such as through third-party ad plug-ins on other websites, and not legally be labeled a data broker. It’s a troublingly broad carve-out.

Vermont also exempts those “providing 411 directory assistance or directory information services … on behalf of or as a function of a telecommunications carrier,” which is an interesting carve-out, as those firms could sell the information they compile on individuals without being classified as data brokers. The third exemption, “providing publicly available information related to a consumer’s business or profession,” sounds more sensible, as many companies provide public-facing information on businesses via job-sharing websites, but this raises the question of why the sale of that data would be exempt from a data broker classification. The public information exemption raises two risks: One, it may create regulatory loopholes for entities that provide much easier access to government records than was likely intended when those records were meant to be accessed physically. Two, this “publicly available information” exemption could be interpreted to cover anything that is on the internet, even if the individual to whom the data relates did not intend for that data to be published in such a fashion (for example, someone posting my Social Security number). Similar concerns apply to real-time public alert services for health and safety, which are also exempt under Vermont law from a data broker classification.

California’s and Vermont’s definitions have components that are in many ways sensible. Yet exempting businesses that sell data on their direct customers from being data brokers draws a distinction between being a data broker entity and engaging in the practice of data brokerage, with real policy implications. Firms like Facebook, ostensibly, could therefore sell information on their own customers and not be classified as data brokers. Perhaps this distinction could be used to create different obligations for those selling their own customers’ data versus that of noncustomers. But it could also be a problematic way to exempt companies from privacy restrictions on data brokerage, and given that the distinction is not used in California to impose even greater requirements on a firm like Facebook, it is concerning. Similarly, Vermont’s broad exemption of e-commerce platforms from classification as data brokers would ostensibly enable a company like Amazon to collect reams of data on consumers with whom it has no direct relationship (for example, via ad campaigns run elsewhere) and then sell that data to a third party without being classified as a data broker. The “publicly available information” exemption is similarly concerning as more government records are digitized at unprecedented speeds, revealing information that could enable activities like doxing or intimate partner violence. Dodging a data broker definition harms the privacy of American consumers when that classification brings other regulations with it, the least of which is some modicum of basic public transparency.

Existing Federal Documentation: Federal Trade Commission Case Study

The Federal Trade Commission (FTC) also defines data brokers in its 2014 report on the data brokerage industry as “companies that collect consumers’ personal information and resell or share that information with others.” This definition drew on its 2012 report, which broke data brokers into three categories:

(1) entities that maintain data for marketing purposes; (2) entities subject to the Fair Credit Reporting Act (FCRA); and (3) entities that maintain data for non-marketing purposes that fall outside of the FCRA, such as to detect fraud or locate people.

The definition in the FTC report is far broader than that in either state law: It does not make the distinction between an entity that is a data broker and the practice of data brokerage, and it does not exclude companies that sell data on their direct customers. Facebook, in this case, would thus be considered a data broker if it sold information on its direct customers to a third party. Amazon would also be considered a data broker if it sold information on consumers, even if that information was collected indirectly by its e-commerce platform. The FTC report’s definition of a data broker as a result includes firms that resell data on consumers, not just those that sell data after its initial collection. It also covers firms that sell consumer information collected both directly and indirectly; it does not distinguish between a company that collects the data directly and then sells it, and a company that acquires the data indirectly and then sells it.

Finally, it’s worth noting the FTC report defines a data broker as a firm not just selling consumer data but also sharing consumer data. This is significant. The data brokerage industry is incredibly opaque, but many firms have data-sharing pipelines and agreements with other companies that extend beyond the strict activity of a sale. Including other types of sharing and licensing agreements in the definition of a data broker is therefore a way to encompass a wider range of companies involved in moving large amounts of consumer data through the marketplace.

The FTC’s definition of a data broker is much broader and is probably a much stronger baseline from which to architect the definition of a data broker in federal legislation. It includes companies that sell information on their direct customers, regardless of whether it was collected directly or indirectly, and it also includes information that is publicly available. This last point is valuable for protecting consumers’ privacy. By not limiting the scope of a data broker classification to just data sales, the FTC definition captures the many firms and activities that “share” data in the data marketplace with other companies through data-sharing pipelines, licensing agreements and other practices. Again, perhaps there is value in the distinction between a data broker entity and the practice of data brokerage, but where California and Vermont exclude myriad data-selling companies from their definitions with this distinction, the FTC’s report on the data brokerage industry importantly does not.


Interestingly, these case studies of definitions of a data broker and understandings of data brokerage say little or nothing about to whom the data is being sold. While there are exemptions for certain kinds of sellers (for example, Vermont exempting e-commerce platforms from a data broker classification), there are no specifications listed about whether those data broker entities are selling the personal information of consumers and citizens to, say, a foreign company or even a foreign government. Potentially introducing controls when selling to or sharing with particular entities for particular use cases is another area that remains underdeveloped in these few laws and policy documents.

Many proposals are in the works for a federal privacy bill, and momentum on the issue is in many ways reaching a fever pitch in Washington. Yet for all the talk of large social media platforms or advertising companies that harvest consumer and citizen information, federal privacy legislation will not be sufficiently comprehensive without substantial attention to the data sales and transfers that underpin the data surveillance economy itself.

The author thanks David Hoffman and Jessica Edelson of the Duke Privacy and Democracy Project for their feedback on an earlier draft of this post.