NIST AI Model Testing: CAISI, Google, Microsoft

Published June 28, 2026 · 10:15 UTC

The United States Government's AI Security Center will assess frontier models developed by Google, Microsoft and xAI before they are deployed, to evaluate whether the models' advanced capabilities present any cybersecurity concerns.

The assessment, announced as a "Pre-Deployment Evaluation" process under NIST's newly created Center for AI Standards & Innovation (CAISI), marks the first time the US government has attempted to stay one step ahead of potential security threats from very powerful AI systems.

"The independent, rigorous measurements required to understand the impact of frontier AI on national security is fundamental," said CAISI Director Chris Fall, in a statement. "Our partnership with these industries enables us to perform this work for the benefit of the nation during a particularly important time."

According to NIST, the partnerships will let the government and the companies share information and collaborate on voluntary product improvements, and will give the government a clear understanding of each model's capabilities. Additionally, an interagency task force within CAISI will enable multiple agencies to collectively test these models, some of which may have been tested in classified environments.

Microsoft's Chief Responsible AI Officer Natasha Crampton stated in a LinkedIn post that tech companies cannot complete evaluations related to national security and public safety independently.

"They require collaborative efforts between industry and governments with an extensive technical and security knowledge base," she noted. She added that Microsoft will take learnings from the evaluations "and directly integrate them into our development, testing and deployment of AI, as well as share best practice methods to enhance broader AI testing."

The agreement represents a notable shift away from the approach previously advocated by the Trump Administration, which had eliminated AI security review measures it deemed too burdensome.

The White House started revisiting its relatively non-regulatory stance on AI after Anthropic announced it would not make its latest model, Claude Opus, available to the public because of its alarming propensity to identify potentially severe software vulnerabilities. In addition to CAISI's new voluntary evaluation process, the Trump Administration is also exploring mandatory government evaluations of every new AI model.

So far, NIST has provided no guidance on the specific testing protocols or standards that CAISI will use in its pre-deployment evaluations. Establishing what constitutes a trusted and safe result may be difficult, according to Devin Lynch, formerly the director for cyber policy and strategy implementation at the White House Office of the National Cyber Director.

"Certainly, assessments of capability are only as good as the threat models used to create them," Lynch posted on LinkedIn. "Therefore, CAISI will need to clearly define and publish what it is testing for, and not simply who it is testing with."

The newly announced plan marks a significant step in government oversight of advanced AI systems.