3 matches found
Swiss-Bench 003: Evaluating LLM Reliability and Adversarial Security for Swiss Regulatory Contexts
The deployment of large language models LLMs in Swiss financial and regulatory contexts demands empirical evidence of both production reliability and adversarial security, dimensions not jointly operationalized in existing Swiss-focused evaluation frameworks. This paper introduces Swiss-Bench 003...
Whose Narrative Is It Anyway? A KV Cache Manipulation Attack
The Key ValueKV cache is an important component for efficient inference in autoregressive Large Language Models LLMs, but its role as a representation of the model's internal state makes it a potential target for integrity attacks. This paper introduces "History Swapping," a novel block-level...
Black-Box Guardrail Reverse-Engineering Attack
Large language models LLMs increasingly employ guardrails to enforce ethical, legal, and application-specific constraints on their outputs. While effective at mitigating harmful responses, these guardrails introduce a new class of vulnerabilities by exposing observable decision patterns. In this...