3 matches found
Breaking to Build: A Threat Model of Prompt-Based Attacks for Securing LLMs
The proliferation of Large Language Models (LLMs) has introduced critical security challenges, where adversarial actors can manipulate input prompts to cause significant harm and circumvent safety alignments. These prompt-based attacks exploit vulnerabilities in a model's design, training, and...
When LLMs Copy to Think: Uncovering Copy-Guided Attacks in Reasoning LLMs
Large Language Models (LLMs) have become integral to automated code analysis, enabling tasks such as vulnerability detection and code comprehension. However, their integration introduces novel attack surfaces. In this paper, we identify and investigate a new class of prompt-based attacks, termed...
Security Assessment of DeepSeek and GPT Series Models against Jailbreak Attacks
The widespread deployment of large language models (LLMs) has raised critical concerns over their vulnerability to jailbreak attacks, i.e., adversarial prompts that bypass alignment mechanisms and elicit harmful or policy-violating outputs. While proprietary models like GPT-4 have undergone extensi...