2 matches found
Sentinel: SOTA Model to Protect against Prompt Injections
Large Language Models LLMs are increasingly powerful but remain vulnerable to prompt injection attacks, where malicious inputs cause the model to deviate from its intended instructions. This paper introduces Sentinel, a novel detection model, qualifire/prompt-injection-sentinel, based on the...
MADCAT: Combating Malware Detection under Concept Drift with Test-Time Adaptation
We present MADCAT, a self-supervised approach designed to address the concept drift problem in malware detection. MADCAT employs an encoder-decoder architecture and works by test-time training of the encoder on a small, balanced subset of the test-time data using a self-supervised objective. Duri...