At the end of November, the New York Times reported that President Donald Trump had suggested to aides that he did not in fact make the offensive comments about women on the infamous “Access Hollywood” recording. The media storm that followed focused mainly on the concern that this was the latest sign the president may be “cracking up,” and it included a letter from 27 psychiatrists warning about the president’s psychological instability, given signs that he is losing touch with reality. Little analyzed, however, is what the story can teach us about the future of misinformation online.
The fake news debate to date has focused on misleading or blatantly false articles made to look like credible news reports that go viral online. The next frontier promises even more disruption. As the technology develops, the next wave of misinformation will involve not only written lies, but manipulated audio and visual content. The three-minute “Access Hollywood” recording from 2005 is precisely the kind of content that technology will be able to convincingly manufacture in a short timeframe, and frameworks for dealing with misinformation need to take this risk into account.
The Future of Fake News
Fake stories based on images created using basic photo manipulation software are already relatively common, such as the lie that Hillary Clinton met bin Laden or the hoax that there were sharks in Houston following Hurricane Harvey. But technology will soon allow the relatively cheap and easy creation of fake but convincing audio and video content as well. This kind of content is often instinctively more trusted because it appears to replicate the primary source of the information rather than merely recount it. As Claire Wardle and Hossein Derakhshan state in a recent report for the Council of Europe, “the way we understand imagery is fundamentally different to how we understand text. Our brains process images at an incredible speed when compared with text. As a result, our critical reasoning skills are less likely to engage with what we’re seeing.”
The most effective tools for creating such content use machine-learning techniques. “Generative adversarial networks” (GANs) are a type of machine-learning algorithm that learns the properties of a certain audio source (such as a particular voice) and then reproduces those properties in a different context. As the Economist suggests, “Putting words into the mouth of Mr. Trump, say, or of any other public figure, is a matter of feeding recordings of his speeches into the algorithmic hopper and then telling the trained software what you want that person to say.” Some recent attempts have made waves online, such as a faked video of an address by President Barack Obama that University of Washington researchers created and the popular podcast Radiolab mainstreamed, but these attempts have remained unconvincing. However, Ian Goodfellow, who developed the first GANs in 2014, estimates that convincing YouTube content could be generated within three years. Wardle and Derakhshan call this “the biggest challenge” for people working on solutions to the disinformation threat.
Solving the Problems of Tomorrow
Trump’s purported disavowals of the “Access Hollywood” comments have not been taken seriously because, shortly after the recording surfaced, he acknowledged it was his voice by issuing an apology. (There were also eight eyewitnesses who have not contested its veracity.) However, if fake audio and visual content becomes more widespread, those plagued by inconvenient tapes in the future may not be so quick to admit fault. And they will no doubt often be right to refrain from doing so. Based on our current information environment, public figures of all stripes will likely be the target of faked recordings in attempts to damage them.
New technology must be developed to help identify this kind of false content. Wardle and Derakhshan suggest that the sharing of metadata between trusted partners might help verification processes. Currently, many images and videos are stripped of metadata to protect privacy and conserve data, but this can complicate the verification process. If companies that are responsible for the dissemination or promotion of content are trusted with this information, it could facilitate better fact-checking. For the moment, verification often relies on looking at shadows or seeing if audio syncs perfectly.
While better verification tools are being developed, regulatory responses to the fake news crisis need to be forward-looking. The current information pollutants are largely text-based, but this will not be the case for much longer. Technology will soon give an ominous new meaning to the old joke: “Who are you going to believe, me or your lying eyes?” In reaction to the spread of Russian propaganda and other fake news stories during the 2016 election, there are increasing calls for platforms to regulate content that appears online. Germany has just passed a law requiring social media companies to remove reported unlawful content within 24 hours. The EU and U.K. are both conducting public consultations with a view to increased regulation. Regulatory engagement with the problems of disinformation is essential, but it is also important to ensure that responses are not knee-jerk reactions to the most recent problems that fail to anticipate the next ones. Calls for platforms to bear the responsibility for their products’ information hygiene need to acknowledge that this also makes them arbiters of truth in certain ways. There is no easy answer to the question of how content should be verified, but the answer needs to grapple with the fact that more and more of reality will be contested.
Correction: This piece previously referred to GANs as “generative audio networks.” Rather, GAN stands for “generative adversarial network.”