Can You Keep a Secret? Involuntary Information Leakage in Language Model Writing
Language models are deployed in settings that require compartmentalization: system prompts should not be disclosed, chain-of-thought reasoning is hidden from users, and sensitive data passes through shared contexts. We test whether models can keep prompted information out of their writing. We giv...
How to Disable Google's Gemini in Chrome
Chrome users were caught off guard by a 4-GB Google AI model baked into Chrome, sparking privacy concerns. The good news: You can easily uninstall it. The bad? You might not want to...
Mind the Gap: Evaluating LLMs for High-Level Malicious Package Detection Vs. Fine-Grained Indicator Identification
The prevalence of malicious packages in open-source repositories, such as PyPI, poses a critical threat to the software supply chain. While Large Language Models (LLMs) have emerged as a promising tool for automated security tasks, their effectiveness in detecting malicious packages and indicators...
Accuracy and Efficiency Trade-Offs in LLM-Based Malware Detection and Explanation: A Comparative Study of Parameter Tuning Vs. Full Fine-Tuning
This study examines whether Low-Rank Adaptation (LoRA) fine-tuned Large Language Models (LLMs) can approximate the performance of fully fine-tuned models in generating human-interpretable decisions and explanations for malware classification. Achieving trustworthy malware detection, particularly when...
RLCracker: Exposing the Vulnerability of LLM Watermarks with Adaptive RL Attacks
Large Language Model (LLM) watermarking has shown promise in detecting AI-generated content and mitigating misuse, with prior work claiming robustness against paraphrasing and text editing. In this paper, we argue that existing evaluations are not sufficiently adversarial, obscuring critical...
AiXamine: Simplified LLM Safety and Security
Evaluating Large Language Models (LLMs) for safety and security remains a complex task, often requiring users to navigate a fragmented landscape of ad hoc benchmarks, datasets, metrics, and reporting formats. To address this challenge, we present aiXamine, a comprehensive black-box evaluation...