4 matches found
Cisco Finds Open-Weight AI Models Easy to Exploit in Long Chats
Cisco’s new research shows that open-weight AI models, while driving innovation, face serious security risks as multi-turn attacks, including conversational persistence, can bypass safeguards and expose data...
STAC: When Innocent Tools Form Dangerous Chains to Jailbreak LLM Agents
As LLMs advance into autonomous agents with tool-use capabilities, they introduce security challenges that extend beyond traditional content-based LLM safety concerns. This paper introduces Sequential Tool Attack Chaining STAC, a novel multi-turn attack framework that exploits agent tool use. STA...
Between a Rock and a Hard Place: Exploiting Ethical Reasoning to Jailbreak LLMs
Large language models LLMs have undergone safety alignment efforts to mitigate harmful outputs. However, as LLMs become more sophisticated in reasoning, their intelligence may introduce new security risks. While traditional jailbreak attacks relied on singlestep attacks, multi-turn jailbreak...
Echo Chamber Jailbreak Tricks LLMs Like OpenAI and Google into Generating Harmful Content
Cybersecurity researchers are calling attention to a new jailbreaking method called Echo Chamber that could be leveraged to trick popular large language models LLMs into generating undesirable responses, irrespective of the safeguards put in place. "Unlike traditional jailbreaks that rely on...