
5 matches found

Packet Storm News
added 2026/06/01 12:00 a.m., 7 views

Patcher: Post-Hoc Patching of Backdoored Large Language Models

Large language models remain vulnerable to jailbreak backdoor attacks, where adversaries poison safety alignment data to embed hidden triggers that bypass safety mechanisms. Existing defenses often require comprehensive attack information or multiple triggered examples, making them impractical wh...

AI score: 5.8
Exploits: 0
Packet Storm News
added 2025/06/11 12:00 a.m., 5 views

Learning Obfuscations of LLM Embedding Sequences: Stained Glass Transform

The high cost of ownership of AI compute infrastructure and the challenges of robustly serving large language models (LLMs) have led to a surge in managed Model-as-a-service deployments. Even when enterprises choose on-premises deployments, the compute infrastructure is typically shared across many tea...

AI score: 6.6
Exploits: 0
Packet Storm News
added 2025/06/02 12:00 a.m., 6 views

CSVAR: Enhancing Visual Privacy in Federated Learning Via Adaptive Shuffling against Overfitting

Although federated learning preserves training data within local privacy domains, the aggregated model parameters may still reveal private characteristics. This vulnerability stems from clients' limited training data, which predisposes models to overfitting. Such overfitting enables models to...

AI score: 6.6
Exploits: 0
Packet Storm News
added 2025/05/23 12:00 a.m., 4 views

A Critical Evaluation of Defenses against Prompt Injection Attacks

Large Language Models (LLMs) are vulnerable to prompt injection attacks, and several defenses have recently been proposed, often claiming to mitigate these attacks successfully. However, we argue that existing studies lack a principled approach to evaluating these defenses. In this paper, we argue...

AI score: 7.5
Exploits: 0
Packet Storm News
added 2025/05/07 12:00 a.m., 4 views

OBLIVIATE: Robust and Practical Machine Unlearning for Large Language Models

Large language models (LLMs) trained over extensive corpora risk memorizing sensitive, copyrighted, or toxic content. To address this, we propose OBLIVIATE, a robust unlearning framework that removes targeted data while preserving model utility. The framework follows a structured process: extractin...

AI score: 7.1
Exploits: 0