
Artificial Intelligence – A Counterintelligence Perspective: Part II

By Jim Baker
Wednesday, September 5, 2018, 9:58 AM

In the first part of this series on the counterintelligence implications of artificial intelligence (AI), I discussed AI and counterintelligence at a high level and described some features of each that I think are particularly relevant to understanding the intersection between the two fields. That general discussion leads naturally to one particular counterintelligence question related to AI: How do we identify, understand and protect our most valuable AI assets? To do that, it is important to remember that AI systems operate as part of a much larger digital ecosystem. My focus here is on AI assets in general rather than particular applications of AI. Obviously, certain AI systems, such as those used in military, intelligence and critical-infrastructure settings, require special attention from a counterintelligence perspective, but I won’t focus on those specifically in this post.

Here are a few key points about where AI fits in that larger digital ecosystem:

First, Digital = Real. We can no longer think of the digital world as somehow separate from the “real” world. The two are inextricably linked. To many people, this may seem obvious. But a person’s appreciation of this point depends, I think, on various demographic factors, such as the nature of their employment, their age, their socio-economic status and their geographic location. I don’t think everyone understands this idea well enough, or the implications that it already has, and will continue to have, for society.

What I mean by digital = real is that almost every person and organization exists today in the context of a complex and multifaceted global technology ecosystem. Nearly everyone depends on technology in some way, even those who don’t have a smartphone or internet access. For almost everyone, access to food, electricity, water and other staples requires technology in some way. We rely on electronic networks to transmit, store and process our most important communications and data. This is true no matter who you are.

Second, Digital = Identity. This one may still be a bit of an exaggeration, but it is less of an exaggeration with each passing day. Over the course of human history, our identities have been linked tightly with our physical bodies and the actions that we take (the work we do, the relationships we build, the art we produce, the harm we cause). Today and in the future, for an increasing number of people, our identities are closely linked with the digital devices with which we continually interact—especially smartphones and the networked applications and services that we use on those devices. I’m not the first to observe our close relationship with technology, but I still think most people don’t understand the concept (or reality) well enough. This personal interrelationship with digital devices will grow significantly in the near future as 5G technology enters wide use and facilitates an explosion of Internet of Things (IoT) devices. Increasingly, especially in developed countries, many of the things that people encounter will collect data about those interactions and will share that data with others via the internet, where it will be stored and analyzed to better understand you and others like you as fundamentally as possible.

The U.S. Supreme Court, among others, seems to recognize this phenomenon, and some of the language in the Jones, Riley and Carpenter decisions reflects a growing concern among at least some justices that digital information that is deeply revealing of our identities as human beings requires heightened constitutional protection. Foreign intelligence services undoubtedly recognize this too.

Third, Big Data is key to AI. As I discussed before, in order to think clearly about AI, you first need to see it as a system with various component parts. AI developers write software code called algorithms. Those algorithms are programs that tell computers what to do and how to behave. In the field of AI, one of the things that developers want the programs to do is learn. AI systems can learn in various ways, and explaining how all of that works is beyond the scope of this post and probably beyond my own capabilities. There are, however, a few points worth making.

Some instances of AI are programmed to learn rapidly in controlled settings with clearly defined rules. For example, given the rules, AI can play a game against itself many times and learn the best moves possible. In a short period, the program can play the game significantly more times than any individual human ever could and it can rapidly learn a multitude of strategies for winning. Many folks have heard about how effective this type of learning has been in connection with programs that have learned how to play chess and Go. While impressive, these systems do only one thing really, really well in a limited context. They don’t necessarily apply that learning in other areas of cognition.

Another, and probably more important, way that machines learn is through interacting with data, especially very large data sets—what is often referred to as Big Data. The more data these AI algorithms are exposed to, the more they can learn, and the more they learn, the more effective they will be at analyzing still more data. One of AI’s strengths is that it can make sense out of large volumes of different types of data; look for patterns and trends; and thus understand individuals, groups and organizations in deep ways through advanced data analytics. Because AI developers in this context are trying to create software that can learn and adapt, they need to expose it to data, including data derived from “experiences” from which the programs can extract information through various sensors, such as driving around on the road and encountering different situations.

As I understand it, even algorithms that are not particularly sophisticated or elegant can excel if they have access to vast amounts of data. This may be one of the most significant things that you need to know about AI for counterintelligence purposes: that important AI systems depend on Big Data to improve.

Why is the link between AI and Big Data so important for counterintelligence purposes? Because it helps tell us what our adversaries likely want to obtain and thus what we need to protect. This link may explain why adversaries such as China are so voracious in stealing not only our intellectual property but also as much of our data as they can get their hands on. Let’s consider the data breach at the Office of Personnel Management (OPM), of which I was a victim. I use this example not to attribute the breach to a malicious cyber actor in particular (I can’t) but because the nexus between AI and Big Data may explain why such a large data theft is important beyond the immediate damage caused by compromising the identity of everyone in the U.S. government who has a security clearance. The data stolen from OPM was deep and rich information about the lives of many people who work for the government as well as their families, friends and associates. It represents a trove of data from which an AI system can learn about U.S. government officials and Americans more generally. It will help whoever stole it to understand how that subset of the population thinks and acts. That is hugely important for a variety of reasons, including understanding U.S. intelligence officials, operatives and defense contractors and identifying the behavioral patterns and weaknesses of those people.

Fourth, other really important AI and AI-related assets need protection as well. As I have explained, AI has many sub-disciplines and is closely related to several other fields and technologies, such as high-speed computing, robotics and sensor technology. AI systems can process data at a faster rate on high-speed computers (with the related programming that runs those computers). As a result, advances in high-speed computing (such as quantum computing) in the next few years are likely to spur further AI advances. Today, AI is increasingly deployed in robots, such as cars and drones, utilizing a variety of sensors that enable the AI to understand and act in an operational environment. The connection to sensor and robotics technology expands the AI ecosystem even further.

What this means from a counterintelligence perspective is that AI, advanced computer technologies, sensors and robotics are linked in ways that make them highly lucrative targets for adversaries—who will want to steal not only AI algorithms and Big Data but also our technology regarding high-speed computing, sensors and robotics.

Moreover, the people who are experts in all of these areas are also highly valuable. A limited number of very smart people create and operate AI systems. Countries and private organizations compete heavily to hire AI experts, who command huge salaries. It is essential that the United States protect those rare human assets from compromise. We should also implement education and immigration policies that facilitate our ability to develop home-grown talent and to attract and retain top talent from abroad. Once they are here, those people too need to be protected.

Adversaries will target AI experts in a variety of ways. They may try to hire them directly or through cut-outs, such as front companies or organizations, so that the AI experts don’t really know who they are working for. Adversaries likely will also seek to compromise or plant insiders in companies or organizations somewhere in the AI ecosystem. Organizations regularly face a range of internal and external threats to their assets, including their people, and should know how to deal with such issues to some degree. Given the transformational nature of AI, however, the threats to AI assets will probably be more intense than what is faced today with other high-value assets.

The bottom line is that adversaries will target all of those assets—human and technical—that are important to the development and implementation of AI, and counterintelligence officials will need to identify, understand and mitigate those threats.

Finally, fifth, our poor cybersecurity posture poses an existential threat. The United States and its allies face significant cyber risks from a broad range of exceptionally capable and highly motivated malicious cyber actors that we have failed to mitigate adequately. Our critical infrastructure faces risks. The networks and systems that the military relies on to protect us face risks. We face risks associated with botnets and ransomware. And we continue to hemorrhage intellectual property (IP) and data to malicious cyber actors at an alarming rate. The loss of data about the identities and activities of organizations, groups and individuals is a problem for lots of reasons; for the moment, I will focus on the cybersecurity issue.

As I have explained, given the importance of the relationship between AI and Big Data, we simply must do a better job of protecting our data from theft. Right now, developed countries such as the United States and its key allies are more dependent on, and connected to, technology than some adversaries. That means that we generate much more digital information about ourselves than people in many other countries do. Of course, there are so many people in China that the PRC government increasingly can make advances in AI by collecting the digital information that its own population generates. But the Chinese government also wants to know how we think and act and the extent to which that is or may be different from how people in China think and act. So it wants our data.

If the United States and its allies do not do a better job of protecting our data, we will be facilitating our adversaries’ development of AI. And that AI will be used to their political, economic, military and intelligence advantage—and, obviously, to our disadvantage. This is a problem today. But of longer-term concern is the possibility that their gobbling up of our data will enable them to reach Artificial General Intelligence (AGI)—that is, AI that is roughly as smart as humans across all dimensions of cognition—before we do. As I explained in part one of this series, although some experts believe that AGI is not possible to achieve anytime soon, if it were to happen, AGI would be a game-changer and could give the nation-state that develops it first a permanent advantage over all other competitors. AGI in the hands of an adversary would pose an existential threat to the United States and its allies. And that is on top of the existential threat already inherent in the vulnerability of U.S. critical infrastructure to cyberattacks intended to disrupt or destroy those systems. Even if AGI is not possible, the quest for it will drive behavior toward accumulating more and more data, and that will benefit other, less comprehensive but still important, AI systems. All of this means that we must get our act together and improve cyber defenses.

Many people, myself included, have warned about the problems associated with our poor cybersecurity status for years. Yet the U.S. and its allies are still failing at what we need to do to protect ourselves. Because of my work at the Justice Department and the FBI, some would say that I have been involved in efforts to undermine cybersecurity by advocating backdoors and “golden keys” that would break encryption. That is false. I recognize the benefits of encryption with respect to privacy, civil liberties, human rights, innovation, economic competitiveness, cybersecurity and AI. Without revisiting the whole encryption debate here, suffice it to say that I have always thought that we need to find a way to protect public safety and safeguard our communications and data at the same time. Anything less is not a solution.

Government and corporate leaders must understand what AI ecosystem assets their organizations have, how they use them, what threats they face, and how best to protect them. They need to think of AI assets broadly—focusing on the entire AI ecosystem, especially Big Data systems. And then they must put into place effective protective mechanisms commensurate with the huge value of the AI. This is consistent with the best practices that organizations must use to protect their people, intellectual property, funds, classified information and other valuable assets. The key is to understand what AI ecosystem technology and experts you actually have, that those assets are extremely valuable, and that others, including nation-states, will be particularly aggressive in trying to steal AI assets or corrupt them surreptitiously (such as by messing with AI programming or compromising the data used to teach the AI).

For the law enforcement and intelligence agencies of the United States and its allies, it is especially important to identify and protect key AI ecosystem assets, including Big Data. Governmental authorities must understand who is working on AI and AI-enabling technology in the public and private sectors in their geographic areas of responsibility, and prioritize the protection of those assets accordingly. This will not be easy, as companies understandably will want to zealously guard their intellectual property and data—and so they should.


Jim Baker is a visiting fellow at the Lawfare Institute, a visiting fellow in governance studies at the Brookings Institution, and a lecturer on law at Harvard Law School. He is also a former general counsel of the FBI. The views expressed here are his own and not necessarily those of any current or former employer.