Is It Possible to Reconcile Encryption and Child Safety?
As more of the world's economies and lives move online, so does our collective reliance on the plethora of global social media and communication services, and with that reliance come certain privacy and security expectations from users. That reliance also engenders an expectation of user safety. As with any technology, these services are abused by malicious actors to cause harm to others—from cyberbullying to state-sponsored disinformation, from ransomware to online child sexual abuse. Child sexual abuse is a societal problem that was not created by the internet, and combating it requires an all-of-society response. However, online activity not only allows offenders to scale their activities but also enables entirely new online-only harms, the effects of which are just as catastrophic for the victims.
Today, we have released a paper that we hope will help the debate around combating child sexual abuse on end-to-end encrypted services. The paper describes the details and complexities of the problem (which is much more complex than other government needs, such as exceptional access) and provides a balanced framework for assessing the potential benefits and disbenefits of any solutions. The dual dystopian futures of “safe spaces for child abusers” and “insecurity by default” for all are neither necessary nor inevitable.
As members of the United Kingdom’s Government Communications Headquarters, we have spent many years working in the technical domains of cryptography and computer security, and we wrote this paper after also having spent many years working to combat child abuse. This paper is not U.K. government policy and is not intended to be a recipe for what governments could demand in the future. It is a genuine attempt to encourage debate and develop a common understanding of the problem, and the risks and benefits of any future technology changes.
Current State of Play
For many years, most mainstream social media and communication platforms have implemented technical mitigations that help protect some of their most vulnerable users, including children, from abuse taking place on the platforms and real-world abuse facilitated by the platforms. Many platforms perform analytics on user behavior and content to try to spot illegal actions, including looking for known child sexual abuse imagery using technologies such as PhotoDNA.
These technologies are used to detect potential child sexual abuse-related activity. Once flagged, the content is then often referred to a human moderator to confirm illegal content, before being passed to the relevant national body that is authorized to deal with such referrals. The majority of mainstream social media and communication platforms are U.S. based, so this body is usually the National Center for Missing & Exploited Children (NCMEC) via the center’s CyberTipline. NCMEC reviews the content and, if appropriate, reports it to the relevant authority. In the U.K., this is the National Crime Agency (NCA).
To identify what mitigations are appropriate, it is important to understand the scale of online child sexual abuse. One statistic often used to illustrate the breadth of the problem is the number of reports received by NCMEC from U.S. platforms, covering their global user base. This amounted to 29.4 million reports in 2021, up from 21.75 million in 2020. However, without context, this number—and the increase—provide little useful information and can be easily misinterpreted. In the same year, the NCA received 102,842 reports from NCMEC (accounting for the vast majority of reports from industry), but some of these were incomplete or, once investigated, not found to be child abuse. Of the 102,842 reports, 20,038 were referred to local police forces and started (or contributed to) investigations—in 2021 more than 6,500 individuals were arrested or made voluntary attendances due to offenses related to child abuse and more than 8,700 children were safeguarded.
These numbers more accurately illustrate the scale of the societal problem of child sexual abuse in the U.K., of which the online component is significant. We would like to be able to show the causal link between individual CyberTips and convictions. However, this is not currently possible—industry notifications may lead to a completely new investigation, provide new evidence to allow investigations into an existing suspect, or provide further evidence of the scale of an offender’s crimes to an existing prosecution—and we do not currently have the data to understand which of these outcomes has occurred in which cases. However, we hope to be able to provide more in-depth analysis over the coming years as data collection improves.
Recently, many of these same platforms have started to remove their ability to access a user’s content, through technologies including end-to-end encryption, ostensibly to provide privacy for their users. This shift fundamentally breaks most of the safety systems that protect users and that law enforcement relies on to help find and prosecute offenders. As a result, governments have vociferously raised the specter of “safe places” where child abusers can operate with impunity. At the same time, academics and privacy campaigners have raised the specter of a world of technology that is “insecure by default” where privacy and security are fundamentally impossible, with poor design choices justified through arguments starting with the exhortation “Think of the children!” Both potential futures are possible. We believe that neither is inevitable or desirable.
We have written previously on the subject of exceptional access—law enforcement and governments gaining access to user content with an appropriate lawful instrument—and provided simple principles that could be used in the design of any such system. Those principles are simple because the concept of exceptional access is also simple. Countering child sexual abuse online is as complex in principle as exceptional access is simple. Our exceptional access post repeated that "details matter," but any analysis of how to counter abuse on encrypted platforms requires the consideration of many more details than exceptional access does. This isn't a unitary problem—harms accrue in different ways and through different offender and victim behaviors. We believe that one of the challenges with this particular policy debate is that governments and law enforcement have never clearly laid out the totality of the problem being tackled. In the absence of that, people infer a model that turns out to be incomplete or, in some cases, incorrect. In writing this post, we hope to correct that information asymmetry and engender a more informed debate.
There are existing typologies and taxonomies of child sexual abuse, but none that focuses on the specific ways in which the offenses manifest online. This narrower framing defines how these harms could be mitigated. Based on our experience, and in consultation with law enforcement experts, we have created “harm archetypes” to try to frame the problem in a new way:
Consensual peer-to-peer indecent image sharing. When two children or young people voluntarily exchange nude or explicit images with each other. This is still an exchange of illegal imagery, but the outcomes resulting from the identification of this type of content must be very different from the other harm archetypes.
Viral image sharing. When people (usually) without a sexual interest in children share child sexual abuse images or videos in disgust or misplaced humor. These can go viral, causing significant further harm to the victims. This harm archetype is the source of many of the reports to NCMEC, contributing approximately 3 million of the 29.4 million reports in 2021, but many of these are filtered out before reaching the NCA, so they are not included in the more than 100,000 referrals.
Offender-to-offender indecent image/video sharing. When offenders share illegal child abuse content with those offenders they’ve already had contact with.
Offender-to-victim grooming. When offenders attempt to contact children online and convince them to meet in the real world—possibly leading to contact abuse—or to send explicit, illegal images of themselves, which often escalates to blackmail and more extreme demands from the offender. The second pathway of harm is relatively new, enabled by the ubiquity of commodity social media and communication platforms. It is worth noting that offenders will often use existing explicit images of children as a "trust token" with their victims, to try to convince the victim they're legitimate.
Offender-to-offender communication, bilaterally and in groups. When offenders discover and communicate with each other to normalize their behavior, share tradecraft and techniques, and even plan real-world abuse.
Streaming of on-demand contact abuse. When offenders, usually Western, pay to watch and direct the live abuse of children often located elsewhere in the world.
We separate these archetypes because each has different technical characteristics for detection and mitigation.
Whether or not the above behaviors take place on a given online platform will depend on several factors, including:
- The platform’s functionality
- The abuser’s technical capabilities
- How potential victims interact with the platform
For example, if a platform does not allow users to discover other, unknown users with particular characteristics, this is of limited use to an offender who wants to discover and contact children. By examining the technical characteristics and constraints of a platform, one can better understand how offenders exploit them and how to best combat each harm. Even when similar harms exist on similar platforms, they rarely manifest identically, meaning that detection and mitigation must be platform specific. In our paper, we describe the harm archetypes in more detail and examine the service characteristics we believe are important in understanding the specific mechanisms of harm on a given platform and, therefore, what may be necessary to combat that harm.
Many service providers go to significant lengths to try to prevent, detect, and refer child sexual abuse behavior on their platforms. Many have access to the content of their users’ messages, as well as significant volumes of complex metadata, which is often used for targeting advertising for those services with that business model. This access allows for relatively simple ways of combating child sexual abuse behavior online, since images can be checked by server infrastructure using techniques like PhotoDNA (to ensure they are not known child abuse images), text can be analyzed for risky language, and so on.
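To make that server-side workflow concrete, here is a minimal Python sketch. Everything in it is invented for illustration: a plain SHA-256 stands in for a real perceptual hash such as PhotoDNA (whose algorithm is not public and which, unlike SHA-256, tolerates resizing and re-encoding), and the hash values and decision strings are hypothetical.

```python
import hashlib

def image_hash(pixel_bytes: bytes) -> str:
    # Stand-in for a perceptual hash like PhotoDNA. A real perceptual
    # hash is robust to re-encoding; SHA-256 is not, and is used here
    # only to show the matching workflow, not the hashing technology.
    return hashlib.sha256(pixel_bytes).hexdigest()

# Server-side blocklist of hashes of known illegal images, as supplied
# by a child protection body (values here are invented).
KNOWN_BAD_HASHES = {image_hash(b"example-known-image")}

def check_upload(pixel_bytes: bytes) -> str:
    """Return a moderation decision for an uploaded image."""
    if image_hash(pixel_bytes) in KNOWN_BAD_HASHES:
        # Matches are queued for human review, never auto-reported.
        return "refer-to-human-moderator"
    return "allow"
```

In production systems, matching is a thresholded distance between perceptual hashes rather than the exact equality shown here, and a match triggers human moderation rather than an automatic referral.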
The metadata available to each platform provider varies significantly, with some holding vast swathes of complex metadata about each user and each interaction. Others hold almost nothing. Analysis of metadata may detect some types of behavior that could suggest child sexual abuse. For example, a new account contacting many accounts that appear to be children and having a high rate of rejection may indicate initial grooming contact. But it could equally be spam. If service providers are willing to terminate these accounts quickly, then it can be argued that the harm is averted. However, offenders are often persistent, and in most of the harm archetypes listed above, the risky behavior can be harder to spot with confidence—such as a child sending an explicit image to an offender. In these cases, more robust mitigations must be employed.
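The kind of metadata heuristic described above might look like the following sketch, in which every field name, threshold, and the helper itself are hypothetical; a real system would tune the thresholds per platform and combine many more signals.

```python
from dataclasses import dataclass

@dataclass
class AccountActivity:
    account_age_days: int
    contacts_initiated: int   # new conversations started by the account
    contacts_to_minors: int   # recipients whose profiles appear to be children
    rejection_rate: float     # fraction of contact requests declined or blocked

def flag_for_review(a: AccountActivity) -> bool:
    """Crude metadata-only heuristic for possible initial grooming contact.

    A hit might equally be spam rather than grooming, so it triggers
    review of the account, not automatic enforcement.
    """
    return (
        a.account_age_days < 7
        and a.contacts_initiated > 50
        and a.contacts_to_minors / max(a.contacts_initiated, 1) > 0.8
        and a.rejection_rate > 0.5
    )
```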
As service providers make design choices to include end-to-end encryption, the techniques in use today become less useful and some (such as servers checking whether images are known child sexual abuse material) simply cease to work. However, we do not think this rules out all opportunities for safe, private, comprehensive, and effective systems that ensure user safety and child safety in particular on social media and communication platforms.
In our paper, we explore a range of techniques that could be employed to help reduce harm on a given platform where the service provider does not have access to user content. Again, this is intended to be a menu of potential mitigations, rather than a checklist of things that must be implemented (since the technology the involved platforms use matters, as do the harm archetypes that will be prevalent on each one).
Researchers rightly point out that poor designs for safety systems could have catastrophic effects on user safety and security. However, we do not believe that the techniques necessary to provide user safety will inevitably lead to these outcomes.
For example, one of the approaches we propose is to have language models running entirely locally on the client to detect language associated with grooming. If the model suggests that a conversation is heading toward a risky outcome, the potential victim is warned and nudged to report the conversation for human moderation. Since the models can be tested and the user is involved in the provider’s access to content, we do not believe this sort of approach attracts the same vulnerabilities as others.
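A minimal sketch of that control flow follows, with a keyword score standing in for a trained on-device model; the phrases, threshold, and function names are all invented for illustration.

```python
# Toy stand-in for an on-device grooming-language classifier. A real
# deployment would run a trained model locally; the keyword score here
# only illustrates the control flow: score locally, warn the potential
# victim, and let the *user* decide whether to report.
RISK_PHRASES = {"our secret", "don't tell", "send a photo"}

def risk_score(message: str) -> float:
    text = message.lower()
    hits = sum(phrase in text for phrase in RISK_PHRASES)
    return hits / len(RISK_PHRASES)

def handle_incoming(message: str, threshold: float = 0.3) -> str:
    if risk_score(message) >= threshold:
        # Nothing leaves the device unless the user chooses to report.
        return "warn-user-and-offer-report"
    return "deliver-normally"
```

The key property is that the scoring happens entirely on the client and the only path to human moderation runs through the user's own decision to report.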
Even so, there may be risks to this approach that would need to be mitigated. The system will need to be designed and built properly, but that is true of the rest of the service. The more subtle sociotechnical risks (for example, the effect of a misclassification on the relationship between the parties) will need to be researched more to be fully understood and mitigated. However, we believe that a robust evidence-based approach to this problem can lead to balanced solutions that ensure privacy and safety for all. We also believe that a framework for evaluating the benefits and disbenefits is needed. We don’t provide one in this paper but note that the U.K.’s national Research Centre on Privacy, Harm Reduction and Adversarial Influence Online (REPHRAIN) is doing so as part of the U.K. government’s Safety Tech Challenge Fund, although this will require interpretation in the context of national data protection laws and, in the U.K., guidance from the Information Commissioner’s Office.
Intuitively, artificial intelligence (AI) systems acting on the “behavior” of accounts seem to be a solution to the problem at hand. AI systems can, in some circumstances, achieve highly accurate predictions and can work from data that will still be available once access to content is removed. However, in most of the harm archetypes, AI approaches that use only metadata are severely limited (with the possible exception of viral image sharing). In the paper, we explore the various reasons why AI techniques alone are unlikely to be the solution to this problem and also explore why access to verified illegal content is critical to law enforcement action. The alternatives being proposed do not give content to law enforcement and, instead, use metadata about a user’s account to establish a likelihood of offending. This means that law enforcement agencies will be expected to act on a tip that basically says, “Our AI says Person X is probably involved in something dodgy, but we can’t explain why and we can’t give you any evidence to back it up.” Any next steps that law enforcement could take—surveillance, arrest, and so on—are highly intrusive and so have a high threshold for authorization, which this approach almost certainly wouldn’t meet. Down this road lies the dystopian future depicted in the film “Minority Report.”
Addressing Client-Side Scanning
In the summer of 2021, Apple announced a feature, built on a perceptual hashing technology called NeuralHash, that sought to detect known child sexual abuse images. It was designed to run only on the user's device, where the images are unencrypted, rather than on the company's servers, like most PhotoDNA implementations. This sort of technique is known as client-side scanning and has received significant attention from industry, academic researchers, and the media, even though it is only one of the many techniques needed in the future to provide user safety at scale. Intuitively, the removal of the service provider's ability to detect known child sexual exploitation images on its servers can be mitigated by performing the scanning on the user's device, or client. However, this relatively simple change (of where a technique is run) fundamentally changes the security and privacy properties of the approach, including how an adversary might exploit it. As usual, security researchers sought to understand the system and its potential weaknesses and published their results. In our paper, we explore at length how these techniques could be made safe, but it is instructive to consider three key issues identified in those works, which broadly apply to any client-side scanning technique.
The first is that it is relatively simple to create completely benign images that generate false positives and are identified as potential child sexual exploitation images. False positives are a problem with any image classification technique, and the real question is therefore “what is the impact of an adversary exploiting this weakness?” The actual impact depends on the harm archetype being discussed. For example, offenders often send existing sexually explicit images of children to potential victims to try to engender trust (hoping that victims reciprocate by sending explicit images of themselves). In this case, there is no benefit whatsoever in an offender creating an image that is classified as child abuse material (but is not), since they are trying to affect the victim, not the system. This weakness could also be exploited by sending false-positive images to a target, hoping they are somehow investigated or tracked. This is mitigated by the reality of how the moderation and reporting process works, with multiple independent checks before any referral to law enforcement.
The second issue is that there is no way of proving which images a client-side scanning algorithm is seeking to detect, leaving the possibility of "mission creep," where other types of images (those not related to child sexual abuse) are also detected. We believe this is relatively simple to fix through a small change to how the global child protection non-governmental organizations operate: a consistent list of known-bad images, with cryptographic assurances that the databases contain only child sexual abuse images, attested to publicly and audited privately. These legitimate privacy concerns can be mitigated technically; the legal and policy challenges are likely harder, but we believe they are soluble.
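One way such a cryptographic assurance could work is a Merkle tree over the image-hash list: the participating child protection bodies co-publish a single signed root, and anyone holding the list can recompute the root to confirm no entry has been swapped in or out. The sketch below illustrates that idea only; it is not a description of how these organizations operate today, and the hash values are invented.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves: list[bytes]) -> bytes:
    """Root of a Merkle tree whose leaves are known-image hashes."""
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:  # duplicate the last node on odd-sized levels
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

# The child protection bodies publish (and independently sign) this root.
published_root = merkle_root([b"image-hash-1", b"image-hash-2", b"image-hash-3"])

# A client given a candidate list recomputes the root before using it,
# so any swapped or added entry is detectable.
candidate = [b"image-hash-1", b"image-hash-2", b"image-hash-3"]
assert merkle_root(candidate) == published_root
```

Public attestation of the root establishes that everyone is looking at the same list; confirming that each entry genuinely is child sexual abuse material would still rely on the private auditing described above.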
Finally, the issue of robustness is raised. That is, how easy is it for a motivated adversary to disable the detection on their device? Again, the impact of this varies between the harm archetypes. If used to combat the offender-to-offender image sharing archetype, this is indeed an issue, since we expect offenders to use modified clients that do not report honestly. But it is not an issue in the offender-to-victim grooming archetype, since the potential victim is highly unlikely to use a modified application that has the detection disabled. Even in the first use case, there are mechanisms to make real-world exploitation harder.
Through our research, we have found no reason why client-side scanning techniques cannot be implemented safely in many of the situations society will encounter. That is not to say that more work is not needed, but there are clear paths to implementation that would seem to have the requisite effectiveness, privacy, and security properties.
In response to the challenge to "put your money where your mouth is," we provide an analysis of two of the harm archetypes on a hypothetical but realistic service. While there is certainly more work to do on the techniques we describe, including a more complete security and privacy analysis, we believe that these examples demonstrate that it should be possible to provide strong user safety protections while ensuring that privacy and security are maintained for all. We have suggested avenues of further work that we believe are necessary and hope that this paper motivates such work and a more inclusive and informed debate.