The National Vetting Enterprise: Artificial Intelligence and Immigration Enforcement
In May 2018, facing widespread outrage, the Department of Homeland Security (DHS) backed away from a proposal for machine learning technology to monitor immigrants continuously. Instead, the department reimagined the vendor solicitation as a labor contract: Months later, in August, it awarded General Dynamics a contract valued at $113 million to carry out the Visa Lifecycle Vetting Initiative (VLVI), formerly known as the Extreme Vetting Initiative. But while DHS did retract its request for machine learning technology in the face of criticism from technologists, rights organizations and members of Congress, it did not cite or acknowledge these concerns. Instead, the department concluded that the technology to automate its vetting functions did not exist yet.
The VLVI is a critical component of a larger shift in immigration enforcement carried out under a new framework, erected over the course of various presidential directives but ultimately dubbed the National Vetting Enterprise (NVE) in February 2018. Housed in the National Vetting Center (NVC), the NVE is ushering in new technologies and data sources to carry out its mission, including social media web scraping, algorithm-driven analytics, centralized databases and automated information sharing across agencies. This amounts to a moment-by-moment monitoring of immigrant activities during the lifecycle of their interactions with the United States, as opposed to the previous system of checkpoint-based immigration enforcement.
While immigration has always been intertwined with both law enforcement and national security goals, the NVE introduces an unprecedented level of collaboration between immigration and national security agencies. The heads of the government bodies charged with immigration, law enforcement, foreign affairs, national security and intelligence now work side by side in the NVC. The mission of the NVC is similar to that of the National Targeting Center, but the latter limits its vetting mission to foreigners attempting to cross the U.S. border, whereas the former will screen all who “seek a visa, visa waiver, or an immigration benefit, or a protected status.” Collectively, each year, the United States hosts over 13 million permanent residents, two million non-immigrant visitors and 750,000 newly naturalized citizens. Each will be affected by the shift in immigration enforcement policy.
Although DHS has, for the time being, forfeited its dream of using AI for immigrant vetting, the remaining NVE apparatus effectuates a new enforcement policy that gives cause for concern. DHS rhetoric and directives point to a persistent interest in incorporating machine learning technology into its immigration vetting functions in the future.
The National Vetting Enterprise
Rather than a centralized agency, the NVE is a novel enforcement apparatus built piecemeal through various presidential directives and their implementation by government enforcement agencies. Though the president used the term “National Vetting Enterprise” for the first time in February, the enforcement apparatus was constructed through various presidential orders, the first of which was issued just days after President Trump’s inauguration.
The first piece of what would become the NVE came from Executive Order 13769, better known as the first travel ban, in January 2017. The order required the implementation of a uniform immigrant screening program to “evaluate the applicant’s likelihood of becoming a positively contributing member of society and the applicant’s ability to make contributions to the national interest” and “to assess whether or not the applicant has the intent to commit criminal or terrorist acts after entering the United States.” It also advocated for the creation of a central repository of biographic and biometric information. In March 2017, Executive Order 13780, or the second travel ban, required the collection of all information that could aid “a rigorous evaluation of all grounds of inadmissibility or grounds for the denial of other immigration benefits,” which was interpreted to include “contextual data,” such as social media information, in a 2008 national security directive. The system was further fleshed out in a September 2017 proclamation, the third iteration of the travel ban, in which the president called for a comprehensive worldwide review of countries from which entry into the United States should be suspended or limited.
Finally, in a February 2018 proclamation, the president mandated screening “on a recurrent basis, so as to identify activities, associations with known or suspected threat actors, and other relevant indicators that inform adjudications and determinations.” Additionally, he called for the analysis of “associations” and interactions between immigrants and known or suspected threat actors to identify threats to national security. It was in this proclamation that the president first referred to the “national vetting enterprise” as such and directed the establishment of the National Vetting Center.
While the NVC is an interagency body, each presidential directive contributing to the NVE specifically charged DHS with interpreting and implementing the executive’s policy. Three core components within DHS focus on immigration, and each plays its own role in implementing the NVE: U.S. Citizenship and Immigration Services (USCIS) grants visas and citizenship; Customs and Border Protection (CBP) secures the nation’s borders from dangerous persons and materials; and Immigration and Customs Enforcement (ICE) conducts Enforcement and Removal Operations (ERO) and Homeland Security Investigations (HSI) missions domestically and overseas.
Each component’s role in carrying out the NVE is informed by how the information provided by continuous monitoring can aid these broader goals. Essentially, USCIS is the judge, using the new contextual sources of information to decide on administrative immigration issues such as entry, removal, benefits and relief on an individual level. CBP is the gatekeeper, the first line of defense against any threat at points of entry, and will rely on the centralized trove of information to create intelligence reports based on trend analysis of national security threats. ICE is the police force, and will use the new information and technology to identify specific individuals within the United States who are suspected of violating civil, criminal and immigration laws, or posing a threat to national security or interests.
USCIS: Using Social Media for Individualized Decision-making
Relying on the president’s call for the use of new “contextual” sources of information alongside the use of more types of information in making immigration decisions, USCIS will collect and retain private and public social media information to assist in making individual benefits decisions, including entry, removal and changes in immigrant status.
USCIS published a notice in the Federal Register in September 2017 announcing a policy that “expand[ed] the categories of records” collected to include, among other things, “social media handles, aliases, associated identifiable information, and search results,” which it will be gathering from “publicly available information obtained from the internet,” “commercial data providers” and “information obtained and disclosed pursuant to information sharing agreements.” Though the notice did not cite any executive order or proclamation explicitly, it carries out the stated mission of the NVE in collecting contextual information for use in vetting. Notably, it went into effect on the same day as the third travel ban.
Previously, USCIS collected primarily biometric and biographic data in response to questions asked during the immigration process or provided by law enforcement agencies, should the individual in question have a criminal record or be suspected of a crime. Legislators have long recognized the potential intelligence value of social media at an individual level; for example, after law enforcement discovered that the social media profiles of the San Bernardino attackers contained warning signs of extremism, senators requested that DHS invest in “social media background checks.” Now, however, USCIS will collect social media information and search results from all immigrants entering the U.S. All of this information will be kept by DHS for 100 years in every immigrant’s Alien Files (A-Files)—that is, a centralized record of all their personal information received during interactions with the immigration arms of the government—including those of permanent residents and naturalized citizens.
While USCIS social media scraping would be limited to publicly available information, new visa policies require some immigrants to expose their private communications as well by handing over their account credentials. The State Department has used its authority in overseeing the visa application process, with the advice of DHS as required by the Homeland Security Act, to require the collection of social media credentials in the following ways. First, all individuals subject to increased scrutiny for being a member of a State Department-identified “risky population” (a designation based on the worldwide review mandated by the March travel ban) or for having traveled to an Islamic State-controlled area must hand over five years of phone number, email and social media account history as a condition of their visa application. Second, the State Department is currently proposing a policy change to expand this requirement to include all visa applicants. Third, as a matter of practice, consular officers and CBP officers have required individuals to hand over account credentials on a case-by-case basis in order to further investigate online activity beyond what is publicly available. Fourth, early in the administration, former Secretary of Homeland Security John Kelly indicated that he hoped to implement a policy change requiring immigrants to list identifiers and passwords for their online accounts as a condition of entry to the country, and DHS has moved forward with the proposal to require collection of social media handles despite his departure. The proposal does not require applicants to list passwords, but consular officers can still demand the password any time an immigrant is at a port of entry.
Combining both sources of data, USCIS is now equipped with a deluge of new information it is required to use in deciding questions of benefits, status applications or appeals. For example, a recent leaked DHS proposal would create a negative presumption against individuals who receive any public benefit, including non-cash benefits such as school lunches, as likely to become public charges. The proposal calls for caseworkers to review a “totality of the circumstances” in making their determination, which requires the use of all available and potentially relevant information, including social media.
CBP: Using Social Media for Intelligence Reporting
CBP’s intelligence-focused mission will use new contextual data sources to aid its existing analytic programs and ready its infrastructure for the introduction of machine learning, satisfying the presidential directive calling for continuous monitoring and improved identification of “associations” between immigrants for threat intelligence. As with the USCIS notice, these initiatives do not overtly mention the NVE or cite specific presidential directives, but they effectuate the goals therein.
In October 2017, DHS announced the development of a new system, CBP Intelligence Records System (CIRS), to aggregate immigration, law enforcement, national security and publicly available data—including social media—in a central database. The database will sit in the CBP Office of Intelligence (OI). These new data sources will be analyzed using existing predictive tools such as the Analytical Framework for Intelligence (AFI) and the Intelligence Reporting System (IRS) to identify relationships between individuals, entities, threats and events in an automated fashion, and to save manual resources for DHS, other government agencies and the intelligence community.
These analyses were previously limited to biographic, biometric, criminal and investigatory records, and other such structured government documents. Social media information provides exponentially more raw data to these algorithms to identify associations between individuals, because social networks are fundamentally premised on user relationships. For that reason, social media can be especially useful in identifying undocumented immigrants based on their interactions with known legal immigrants.
Currently, the AFI and IRS are not machine learning technologies, but CBP has indicated that it hopes to incorporate deeper artificial intelligence capabilities into these systems to improve their ability to make associations. The CBP OI, where the CIRS will be housed, is shifting to a cloud-based shared services model specifically to adopt machine learning technologies.
ICE: Using Social Media for Immigrant Monitoring and Enforcement
ICE is tasked with carrying out the Visa Lifecycle Vetting Initiative (VLVI), the most robust component of the NVE. The new VLVI automates the visa application review process. It also enhances that process by creating a new centralized database to combine all federal intelligence and law enforcement databases with new “contextual” sources of information, such as court documents, license plate tracking data and publicly available data, including social media. This database will enable the executive branch’s effort to continuously monitor aliens domestically and abroad, during the lifecycle of their interaction with the country, to identify violators of law and threats to the country as leads for enforcement action. This initiative enacts the president’s call for “recurrent” review, which ICE interpreted as the continuous monitoring of immigrants in its Statement of Objectives (SOO) for its original solicitation for AI vetting technology. In January 2018 Senate testimony, DHS Secretary Kirstjen Nielsen described this approach as ensuring that immigrants are “continuously vetted against intelligence and criminal databases.”
As a preliminary step, ICE must first compile the contextual information that enables continuous monitoring, both to feed its lead-generation process and to populate the CBP CIRS database and relevant A-Files. While ICE can purchase access to some contextual information, like commercial databases, it must solicit a custom web-crawler to utilize publicly available information in the same way. ICE indicated that it wants to pull information from “media, blogs, public hearings, conferences, academic websites, social media websites such as Twitter, Facebook, and LinkedIn, radio, television, press, geospatial sources, internet sites, and specialized publications with intent to extract pertinent information regarding targets,” but only if the information is “derived only from free and publicly available sources through unattributed computers.” In a Q&A with industry vendors, ICE confirmed that it did not want to be “restrictive” and limit collection to “certain datasets.”
ICE is also tasked with automating its lead-generation functions in a manner consistent with the president’s goal of continuous monitoring of immigrants. Previously, immigration enforcement focused on preventing the entry of illegal immigrants, or exterior removals, and the deportation of visa overstays or serious criminals, or interior removals. With limited resources, this involved a policy of exempting certain classes from enforcement, such as immigrants protected from deportation under the Deferred Action for Childhood Arrivals (DACA) program, to pursue more egregious violators. Implementing the new presidential priorities, ICE reported in 2017 that it “no longer exempts any category of removable aliens from potential enforcement.” Now, ICE has developed a new focus on reopening cold cases against immigrants in the interior, including those who are otherwise legally in the country but violate the terms of their stay, and using predictive analytics to identify threats to the country before they manifest in actual action. Under these new policies, the agency reported historically low numbers of deportations at the border but historically high numbers of deportations from the interior, bringing the overall number of removals 37 percent higher than in 2016.
Currently, DHS relies primarily on the Pre-Adjudicated Threat Recognition and Intelligence Operations Team (PATRIOT) system, a rules-based system that cross-references a visa applicant against various government databases for derogatory information and returns as an output either a red light (a recommendation to deny entry based on derogatory information) or a green light (no derogatory information unearthed) for the applicant. An HSI officer always reviews PATRIOT outputs before conveying them to consular officers for final decisions.
Until May 2018, ICE was soliciting machine-learning technology from industry to harness the power of social media big data to conduct continuous monitoring of immigrants, replacing PATRIOT. The system would have provided the same “red” or “green” light conclusions for entry applicants on the basis of the new presidential vetting criteria, including whether immigrants were likely to be positive contributors to society or a threat to general welfare. This would require algorithms to identify the characteristics, or data features, that best predict an immigrant’s value or risk to society, based on a trial-and-error learning process guided by an individual confirming whether the model accurately or inaccurately identified an immigrant as a red or green light. In a letter opposing the initiative, 52 technologists questioned the ability of an algorithm to make accurate and ethical non-rules-based determinations. ICE did not address these concerns directly, but withdrew its request for this technology regardless, stating that it could not find any “out of the box” solution that provided the quality of monitoring the agency wanted.
While ICE has moved away from its request for AI, it is likely that, by suggesting the need for these capabilities, the agency signaled demand for their development. DHS continues to invest in artificial intelligence capabilities elsewhere, such as funding a Silicon Valley-backed machine learning prototype to screen air travelers or exploring a facial-recognition machine learning system to identify persons of interest.
Now, ICE has retracted its vendor request for a single system to automate its visa lifecycle functions and has shifted to a labor contract. The labor contract, recently won by General Dynamics for $113 million, calls for contractor support in the form of 180 analysts, along with money for training and management, to screen the social media activity of leads after DHS identifies them. Essentially, ICE is not asking for technology it can house and use itself, but a contractor that will automate the functions with which ICE seeks assistance. Even without machine learning, ICE is introducing algorithms into its processes that can serve to increase its available computing power, centralize all federal databases available to it, design models that cross-reference these databases and provide push mechanisms to send information to other arms of the government. This will increase the efficiency and productivity of the immigration enforcement agencies.
The NVE’s Impact
Voices in civil society have raised concerns about the effectiveness, privacy and fairness of the new NVE apparatus. According to a February 2017 DHS Inspector General report, previous attempts by DHS to use social media data to aid immigrant screening were largely unsuccessful, due to the inconsistent nature of the underlying data. All four pilot programs were found to have a poor ability to identify derogatory information, generate prioritized leads or even confidently find the accounts tied to the targeted applicants. The coalition letter from computer scientists suggests that this may be in part because the predictive value of aggregate social media is reduced by the lack of standardization in user-generated content, along with the inherent presence of misleading or inaccurate information on informal platforms. Although social media data has been used to detect crises or natural disasters, immigrant vetting uniquely involves subjective psychological or sociological characteristics about humans. Along these lines, a coalition of rights organizations voiced worries to Acting DHS Secretary Elaine Duke that interpreting aggregate trends in social media risks engendering biased assumptions about the content.
These concerns are particularly pointed because of the broad scope of the proposed changes; at multiple points in the immigration enforcement process, social media information could tip the balance against entry or for deportation. Social media information could be used to create the modicum of doubt needed to conclude that a non-immigrant does not meet baseline admission criteria, barring their entry or creating grounds for removal. Social media could help reopen “cold cases” abandoned due to insufficient evidence. In immigration proceedings, an individual is not protected by statutes of limitation and can be retroactively prosecuted or automatically deported on a showing of “reasonable, substantial, and probative evidence.” This lower bar can be met by introducing social media content, even if illegally obtained, during administrative hearings in which exclusionary rules of evidence do not apply and a right to counsel is not triggered. Limits on the exclusionary rule in removal hearings evolved with government raids on physical locations in mind, not social media, which may caution against the use of this evidence regardless of its acquisition process. Additionally, although the extent to which Fourth Amendment protections attach to non-citizens remains unsettled, DHS said in an industry Q&A that it hopes to circumvent the legal issues the FBI encountered in a similar program by only targeting non-citizens and only collecting publicly available information. Finally, DHS is free to pass leads on to law enforcement instead of pursuing immigration enforcement, including handing off the case files they generate that include social media information and threat profiles that did not require legal process to obtain.
The executive typically enjoys broad discretion in immigration enforcement due to judicial restraint, deference to the political branches and Congress’s expansive delegation of power to the president. National security and immigration databases, such as the new CIRS, are generally exempted from several reporting requirements and restraints designed by Congress for better accountability.
But despite the high level of deference historically granted to the executive’s immigration enforcement decisions, Congress has recently withdrawn some of its support. In March of this year, the Congressional Black Caucus and three Democrats on the House Committee on Homeland Security sent letters explicitly opposing the VLVI’s use of social media data and algorithms for lead generation. Where congressional demands once drove the adoption of social media surveillance, the legislature’s approval has now diminished.
While Congress may not approve of enforcement behavior with the same fervor it once had, it has yet to effectively curtail the administration’s enforcement in this area—though this may change now that the Democratic Party has taken control of the House of Representatives. There has been more activity at the state level, particularly in California, where state legislators have introduced a bill that would require public debate and compel law enforcement to obtain the permission of locally elected leaders before acquiring new surveillance technology. At the municipal level, the city of Richmond, Calif. adopted an ordinance blacklisting all vendors contracting with ICE to support extreme vetting in direct response to the proposed use of machine learning in the VLVI. The cities of Alameda, Berkeley and Oakland are considering similar proposals, even after the federal policy of using AI was rescinded.
The VLVI is not the first time that technology companies have become entwined with government initiatives. The lead-monitoring element of ICE’s enforcement strategy will be contracted out to one of these vendors, and while their involvement with ICE may have once been overlooked, these contracts now face heightened public scrutiny. In light of the call for AI-driven social media surveillance by immigration enforcement, civil rights groups urged IBM to opt out of bidding on the associated solicitation from vendors. In recent congressional hearings, Facebook CEO Mark Zuckerberg asserted that his platform would not cooperate with ICE requests related to “extreme vetting.”
Although strides have been made to improve the accuracy of these technologies, some argue that they are still not ready for use by the government. However, some companies have begun to address the ethical and accuracy concerns underpinning the public’s opposition to the employment of artificial intelligence technology by law enforcement. Microsoft has worked on technology to detect bias in algorithmic outputs proactively and IBM hopes to improve facial recognition machine learning with a better curated, more diverse dataset that counters race and gender biases. Government piloting of transformational technologies such as computers, the Internet or drones, even in their nascent stages, helped hone them over time, reducing their error rate exponentially as most systems improve on a steep learning curve. As technology improves, the government may once again seek to incorporate AI into immigration enforcement even more aggressively.