3 matches found
AutoDAN-Reasoning: Enhancing Strategies Exploration Based Jailbreak Attacks with Test-Time Scaling
Recent advancements in jailbreaking large language models LLMs, such as AutoDAN-Turbo, have demonstrated the power of automated strategy discovery. AutoDAN-Turbo employs a lifelong learning agent to build a rich library of attack strategies from scratch. While highly effective, its test-time...
ARMOR: Aligning Secure and Safe Large Language Models Via Meticulous Reasoning
Large Language Models LLMs have demonstrated remarkable generative capabilities. However, their susceptibility to misuse has raised significant safety concerns. While post-training safety alignment methods have been widely adopted, LLMs remain vulnerable to malicious instructions that can bypass...
ReCopilot: Reverse Engineering Copilot in Binary Analysis
Binary analysis plays a pivotal role in security domains such as malware detection and vulnerability discovery, yet it remains labor-intensive and heavily reliant on expert knowledge. General-purpose large language models LLMs perform well in programming analysis on source code, while...