Russia’s 2022 invasion of Ukraine has seen not only traditional conflict on land and sea but also an all-out information battle across social media. Russia has a fearsome reputation as one of the most prolific and effective spreaders of disinformation through TikTok, YouTube and Facebook. These operations have been both covert, as with the 2016 Internet Research Agency-driven efforts to interfere in the U.S. presidential election via Facebook ads, and overt, as in much of what RT and Sputnik have said about coronavirus vaccines during the course of the pandemic. Major social media companies have responded in the United States by prohibiting these media organizations from using their ad networks and by taking down some Russia-backed hacking and troll farm operations. The platforms earned early public praise for taking these bold steps to protect their users from Russia-backed disinformation. However, as the war in Ukraine stretches on, the inevitability of misinformation emanating from both sides of the conflict points to the need for increased transparency of online spaces to better understand and counter attempts to sway public opinion via disinformation.
Congress has been debating various proposals that would require or enable greater transparency from social media companies since the revelations of Russian meddling in the 2016 U.S. presidential election. Public trust in major platforms continues to diminish with each new scandal, and legislative proposals for further accountability have grown in kind. By my count, eight bills have been introduced in the past two years that include at least some major transparency requirements. The transparency mechanisms proposed, however, run the gamut from requiring high-level reporting of descriptions of algorithms to transparency of actual underlying data. There are also wide discrepancies in who would get more transparency, and the answer is, in many cases, not the public at large.
With that in mind, it’s worth reviewing some of the major transparency proposals to better understand how they would function across three dimensions: Who is this transparency for, what exactly would these bills make transparent, and how would they do so? We can explore these questions by reviewing four major transparency proposals: the Honest Ads Act, the Algorithmic Justice and Online Platform Transparency Act, the Platform Accountability and Transparency Act (PATA) and the Digital Services Oversight and Safety Act (DSOSA).
Who Gets Transparency?
The initial assumption when discussing transparency is that it will benefit the public, but that’s not necessarily the case with all platform transparency proposals. All four proposals call for at least some information to be made available broadly to the public. The Honest Ads Act, for example, calls for public transparency of online advertisements. But several proposals explicitly limit the availability of certain content or punt the decision on whether the public will have access to information (and what kinds of information) to a federal rulemaking process.
Both PATA and DSOSA have some transparency provisions that would be available only to “qualified researchers” at academic or certain nonprofit institutions. Both of these bills also provide legal protections, or “safe harbor,” for investigations of social media platforms in the public interest, but these protections would be available only to researchers and, in some cases, to journalists, not to members of the general public.
Often, these limitations on who gets access to data appear to be motivated by the sensitivity of the content that is being made transparent. Regarding PATA, some provisions explicitly envision that extremely sensitive, private data could be made transparent. In these cases, limiting transparency to vetted researchers appears to be a mechanism to allow some sharing of potentially sensitive data that simply wouldn’t be releasable to a public audience.
In other cases, it’s unclear why the provision is made for limited access to certain content. DSOSA calls for a dataset of advertisements to be made available to certified researchers and leaves it to the Federal Trade Commission (FTC) to determine what advertising information should be made available to the public through a user-facing library. The public record shows that platforms have been hostile to making certain information about advertising available to the public, so this ambiguity may be strategic.
Sometimes transparency is restricted to regulators. The Algorithmic Justice bill calls for platforms to maintain “detailed records describing their algorithmic processes for review for the Federal Trade Commission” but does not require public or researcher access to these records. The underlying assumption of this transparency provision is that regulators will use these records to verify compliance with other requirements in the bill, including prohibitions against algorithmic discrimination for protected classes, as well as the establishment of and adherence to algorithmic safety standards.
What Information Would Be Made Transparent?
Proposals also vary in what type of content they propose to make transparent. The types of content on the table fall into several broad categories:
- Paid advertising. All of these proposals require transparency of paid advertising on social media platforms. This would include information about how ads are targeted to specific audiences, information that currently is not generally available in voluntary disclosures by large platforms such as Meta.
- Reasonably public content. There are also transparency provisions for certain categories of public content. These typically include “high engagement” content, such as public posts that reach a certain level of virality; public posts in the largest public online forums, such as the subreddit r/wallstreetbets; and public content from accounts with very large audiences, such as those of so-called influencers. This category also typically includes public content from public figures such as government officials, regardless of audience size.
- Algorithmic reporting. This category of transparency encompasses company descriptions of algorithm processes or decision-making around take-downs of problematic content, as well as other types of platform moderation. The amount of detail provided varies greatly, from extremely general plain-text descriptions that could be fully displayed to users to detailed auditable trails of algorithmic decision-making.
- Sensitive information. These posts contain private information. This would include content where there’s a compelling public interest for research, such as potential harms to children, but also a compelling need for protecting privacy.
Of the four bills assessed, it’s worth noting that the two older bills are also the most narrowly targeted in terms of what they seek to make transparent. The Honest Ads Act dates to 2017 and is narrowly focused on advertising content. The Algorithmic Justice bill was proposed in early 2021 and is focused primarily on algorithmic reporting, although it mandates advertising transparency as well.
PATA and DSOSA, drafted most recently, are also the most comprehensive. Both PATA and DSOSA contain provisions requiring transparency of advertising, high engagement content, algorithms and sensitive data. In many cases, however, they delegate decisions about procedures and details of transparency to various federal agencies.
Providing public access or, in some limited cases, mediated access to platform data would allow research, investigations, and knowledge-sharing that would help in the collective development of new tools and procedures to cut down on problematic content and misinformation. Some categories of information appear to have wider support than others, however. Support for some kind of ad transparency appears in every major piece of transparency legislation, although details differ in important ways. Algorithmic reporting requirements also appear in three of the major transparency legislative proposals (albeit with some significant differences in what, exactly, they require); only the Honest Ads Act does not include some sort of algorithm transparency.
How Would Information Be Made Transparent?
Another crucial element to understanding various transparency proposals is how they make different types of content available. The mechanisms range from requirements that platforms post public transparency reports on websites, submit certain reports to regulators (such as the systemic risk reports that DSOSA mandates), provide public searchable interfaces and machine-readable formats, and create data clean-room environments, to legal protections for independent researchers and journalists.
These different modes of transparency would not be equally useful for all audiences. For example, requiring that data be available in machine-readable format would be tremendously useful for researchers and journalists with data analytic skills, but less so for a nontechnical audience. For the general public, it would be more helpful to have access to a searchable dashboard or library, a mode of access that would be nearly useless for anyone doing analyses on bulk data. Public platform reports describing algorithmic processes in plain language, or reporting cases of take-downs for certain types of problematic content, such as vaccine misinformation or posts promoting disordered eating aimed at teens, would increase functional transparency to the broader public. This type of transparency could also be useful to social science and legal researchers with expertise to read content closely and understand the nuances of small differences in word choices.
Legal protections, or a safe harbor, for independent collection of platform data (a version of which is in both DSOSA and PATA) would increase transparency by removing legal uncertainties that currently inhibit researchers from collecting data themselves about content on platforms. Academics are already on record about the chilling effect the current environment of legal uncertainty has had on funding of independent data collection projects, so such a step has the potential to unlock a wave of new research.
Lastly, mediated access for sensitive information, such as the clean-room proposal in PATA, or access for certified researchers in DSOSA, can be thought of as the heavy artillery of transparency: expensive, onerous and powerful. Researchers would apply to a mediating agency for access to specific data, in a process that would likely be highly competitive. However, for approved research proposals, platforms would be required to provide access to nearly any type of data that did not directly identify users.
Areas of Convergence Toward Comprehensive Transparency
While the major transparency proposals for social media platforms differ, there are some overlapping principles that seem to be emerging. Several of these bills recognize some form of tiered access to transparency. DSOSA calls this out most explicitly, but it is implied in both PATA and the Algorithmic Justice bill. This likely reflects a sense that there are multiple important audiences for transparency, with different levels of technical literacy, as well as reasons for seeking information about messaging on platforms.
Another important principle is establishing a balance between transparency and privacy protections, although not all bills weigh these considerations in the same way. There does appear to be a common understanding that broad transparency is needed so that the public can hold platforms accountable for their actions, or lack of action, but that greater transparency cannot come at the expense of exposing private, sensitive user information. Platforms argue that they can’t make anything transparent without risking user privacy, a claim that even the staunchest privacy advocates would dispute. Still, some of the data that platforms hold, which might be vital for research into reducing harm to vulnerable groups such as teens, is very sensitive and must be subject to strict access controls. Conversely, there is a great deal of data, either about public content such as advertising or about the functioning of the platform itself (such as algorithms), that could easily be made publicly transparent with no apparent privacy risks.
There also appears to be a growing consensus around certain types of content that are needed for transparency. For example, all four major proposals require transparency of advertising, or paid content. PATA and the Algorithmic Justice bill both call for such ad libraries to be public. DSOSA gives the FTC authority to determine which elements of ad transparency should be public, and calls for added elements for certified researchers. The Honest Ads Act requires public transparency but limits it to political ads. Other technology accountability proposals, such as the Kids Online Safety Act and the Social Media NUDGE Act, have also called for universal public ad transparency as part of their overall approach to increasing accountability.
Required reporting to the public of algorithmic processes is another concept that is present in several different proposals. This idea is the primary thrust of the Algorithmic Justice bill, along with additional provisions prohibiting discrimination. Algorithmic transparency is also required in some form in PATA and DSOSA. As with universal ad transparency, other proposals before Congress that are more focused on accountability than transparency, such as the Algorithmic Accountability Act, also include such provisions.
Finally, safe harbor, or mediated access to a wide range of sensitive platform data for vetted researchers, is called for by PATA and DSOSA. PATA extends this safe harbor to journalists as well as academic researchers. While this provision has not appeared in other proposals, it’s notable that it has appeared in the two most recent of the four major bills.
No doubt the future will bring challenges to the United States and the larger global community. The past several years alone have seen attacks on democratic systems, a global pandemic, and now a war that raises the nuclear threat level. But as people become more reliant on online platforms to communicate with each other, a major aspect of any of these crises will be online, whether via cyberattacks or information disorder. It will be necessary to continually evaluate approaches to mitigating harm in online spaces. To accomplish this, access to trustworthy information and data is essential. With major proposals before Congress, it’s possible to secure this needed transparency from online platforms. The sooner lawmakers and policymakers create mandates for platform transparency, the sooner researchers can get to work.