7 matches found
RobPI: Robust Private Inference against Malicious Client
The increased deployment of machine learning inference in various applications has sparked privacy concerns. In response, private inference PI protocols have been created to allow parties to perform inference without revealing their sensitive data. Despite recent advances in the efficiency of PI,...
LLM Jailbreak Detection for (Almost) Free!
Large language models LLMs enhance security through alignment when widely used, but remain susceptible to jailbreak attacks capable of producing inappropriate content. Jailbreak detection methods show promise in mitigating jailbreak attacks through the assistance of other models or multiple model...
Busting the Paper Ballot: Voting Meets Adversarial Machine Learning
We show the security risk associated with using machine learning classifiers in United States election tabulators. The central classification task in election tabulation is deciding whether a mark does or does not appear on a bubble associated to an alternative in a contest on the ballot. Barrett...
An End-To-End Model for Logits Based Large Language Models Watermarking
The rise of LLMs has increased concerns over source tracing and copyright protection for AIGC, highlighting the need for advanced detection technologies. Passive detection methods usually face high false positives, while active watermarking techniques using logits or sampling manipulation offer...
From Trade-Off to Synergy: a Versatile Symbiotic Watermarking Framework for Large Language Models
The rise of Large Language Models LLMs has heightened concerns about the misuse of AI-generated text, making watermarking a promising solution. Mainstream watermarking schemes for LLMs fall into two categories: logits-based and sampling-based. However, current schemes entail trade-offs among...
Diverging Towards Hallucination: Detection of Failures in Vision-Language Models Via Multi-Token Aggregation
Vision-language models VLMs now rival human performance on many multimodal tasks, yet they still hallucinate objects or generate unsafe text. Current hallucination detectors, e.g., single-token linear probing SLP and PTrue, typically analyze only the logit of the first generated token or just its...
Allocation of Resources Without Limits or Throttling
Overview vllm is an A high-throughput and memory-efficient inference and serving engine for LLMs Affected versions of this package are vulnerable to Allocation of Resources Without Limits or Throttling in outlineslogitsprocessors.py module, which uses a local cache with unbounded size by default...