2 matches found
AgentVisor: Defending LLM Agents against Prompt Injection Via Semantic Virtualization
Large Language Model LLM agents are increasingly used to automate complex workflows, but integrating untrusted external data with privileged execution exposes them to severe security risks, particularly direct and indirect prompt injection. Existing defenses face significant challenges in balanci...
PoTS: Proof-Of-Training-Steps for Backdoor Detection in Large Language Models
As Large Language Models LLMs gain traction across critical domains, ensuring secure and trustworthy training processes has become a major concern. Backdoor attacks, where malicious actors inject hidden triggers into training data, are particularly insidious and difficult to detect. Existing...