3 matches found
Mythos and the Unverified Cage: Z3-Based Pre-Deployment Verification for Frontier-Model Sandbox Infrastructure
The April 2026 Claude Mythos sandbox escape exposed a critical weakness in frontier AI containment: the infrastructure surrounding advanced models remains susceptible to formally characterizable arithmetic vulnerabilities. Anthropic has not publicly characterized the escape vector; some secondary...
System Card: Claude Mythos Preview
This System Card describes Claude Mythos Preview, a large language model from Anthropic. Mythos Preview is their most capable frontier model to date, and shows a striking leap in scores on many evaluation benchmarks compared to their previous frontier model, Claude Opus 4.6. This System Card...
Evaluating the Critical Risks of Amazon's Nova Premier under the Frontier Model Safety Framework
Nova Premier is Amazon's most capable multimodal foundation model and teacher for model distillation. It processes text, images, and video with a one-million-token context window, enabling analysis of large codebases, 400-page documents, and 90-minute videos in a single prompt. We present the fir...