FORTRESS: Frontier Risk Evaluation for National Security and Public Safety
The rapid advancement of large language models (LLMs) introduces dual-use capabilities that could both threaten and bolster national security and public safety (NSPS). Models implement safeguards to protect against potential misuse relevant to NSPS while allowing benign users to receive helpful...