408 matches found
exploit-labs
exploit-labs Companion code for the Windows-security blog at...
Quality-Diversity Evolution for Discovering Diverse Vulnerabilities in LLM Safety
Current approaches to LLM adversarial testing suffer from coverage gaps: manual red-teaming does not scale, LLM-as-attacker methods exhibit mode collapse, and gradient-based approaches produce uninterpretable gibberish. We introduce a quality-diversity evolutionary framework that operates at the...
How to Compare the Security of Code Written by Humans to LLM-Generated Code
Large language models LLMs are rapidly transforming how software is created and maintained. Comparing LLM-generated code against human-written standards is essential to determine whether these new tools uphold or erode the security baselines established by professional developers. Yet, we lack a...
The Fault in Our Drafts: Vulnerabilities in RPKI Specification and Software
The Resource Public Key Infrastructure RPKI secures the Internet's routing system by defining a complex trust and validation framework for certificates, Route Origin Authorizations ROAs, manifests, and Certificate Revocation Lists CRLs. These mechanisms are specified across dozens of RFCs. This...
Unlocking Apple's Private Cloud Compute: An Analysis of Privacy-Preserving Artificial Intelligence
Many existing Artificial Intelligence AI solutions on mobile devices rely on an extensive collection of sensitive data, raising privacy concerns and often requiring storage for both context and model improvement. Apple's Private Cloud Compute PCC aims to address this by emphasizing mobile device...
FuzzingBrain V2: A Multi-Agent LLM System for Automated Vulnerability Discovery and Reproduction
Software vulnerabilities pose critical security threats, with nearly 50,000 CVEs reported in 2025. While Large Language Models LLMs show promise for automated vulnerability detection, three key challenges remain. First, LLM-generated vulnerability reports suffer from high false positive rates and...
LITMUS: Benchmarking Behavioral Jailbreaks of LLM Agents in Real OS Environments
The rapid proliferation of LLM-based autonomous agents in real operating system environments introduces a new category of safety risk beyond content safety: behavior jailbreak, where an adversary induces an agent to execute dangerous OS-level operations with irreversible consequences. Existing...
Beyond Collection: Measuring the Detection Efficacy of Modern Security Logging Standards
Effective security logging is crucial for the timely and accurate detection of cyber threats; however, the relative effectiveness of various industry-standard logging frameworks remains understudied. This paper addresses this critical gap by presenting the first systematic evaluation of modern...
Exploit for Incorrect Resource Transfer Between Spheres in Linux Linux_Kernel
Copy Fail ARM64 Research CVE-2026-31431 Analysis and ARM64...
Observability for Post-Quantum TLS Readiness: A Multi-Surface Evidence Framework
Post-quantum migration in Transport Layer Security TLS requires evidence-aware measurements that distinguish session negotiation, endpoint capability, certificate-chain evidence, and the provenance of missing observations. This distinction is essential under TLS 1.3 encryption, resumption, mutual...
OpenSOC-AI: Democratizing Security Operations with Parameter Efficient LLM Log Analysis
Small and medium sized businesses SMBs face an escalating cybersecurity threat landscape, yet most lack the resources to staff full Security Operations Centers SOCs or deploy enterprise grade detection platforms. This paper presents OpenSOC-AI, a lightweight log analysis framework that uses...
A Ground-Truth-Based Evaluation of Vulnerability Detection across Multiple Ecosystems
Automated vulnerability detection tools are widely used to identify security vulnerabilities in software dependencies. However, the evaluation of such tools remains challenging due to the heterogeneous structure of vulnerability data sources, inconsistent identifier schemes, and ambiguities in...
NetSecBed: A Container-Native Testbed for Reproducible Cybersecurity Experimentation
Cybersecurity research increasingly depends on reproducible evidence, such as traffic traces, logs, and labeled datasets, yet most public datasets remain static and offer limited support for controlled re-execution and traceability, especially in heterogeneous multi-protocol environments. This...
ChainFuzzer: Greybox Fuzzing for Workflow-Level Multi-Tool Vulnerabilities in LLM Agents
Tool-augmented LLM agents increasingly rely on multi-step, multi-tool workflows to complete real tasks. This design expands the attack surface, because data produced by one tool can be persisted and later reused as input to another tool, enabling exploitable source-to-sink dataflows that only...
CAM-LDS: Cyber Attack Manifestations for Automatic Interpretation of System Logs and Security Alerts
Log data are essential for intrusion detection and forensic investigations. However, manual log analysis is tedious due to high data volumes, heterogeneous event formats, and unstructured messages. Even though many automated methods for log analysis exist, they usually still rely on domain-specif...
Jailbreak Foundry: From Papers to Runnable Attacks for Reproducible Benchmarking
Jailbreak techniques for large language models LLMs evolve faster than benchmarks, making robustness estimates stale and difficult to compare across papers due to drift in datasets, harnesses, and judging protocols. We introduce JAILBREAK FOUNDRY JBF, a system that addresses this gap via a...
VibeCode-injectproof
🛡️ VibeCode-InjectProof Deep SQLi verification engine for...
A Unified Evaluation of Learning-Based Similarity Techniques for Malware Detection
Cryptographic digests e.g., MD5, SHA-256 are designed to provide exact identity. Any single-bit change in the input produces a completely different hash, which is ideal for integrity verification but limits their usefulness in many real-world tasks like threat hunting, malware analysis and digita...
WiFiPenTester: Advancing Wireless Ethical Hacking with Governed GenAI
Wireless ethical hacking relies heavily on skilled practitioners manually interpreting reconnaissance results and executing complex, time-sensitive sequences of commands to identify vulnerable targets, capture authentication handshakes, and assess password resilience; a process that is inherently...
Azure Linux 3.0 Security Update: gnutls (CVE-2024-28834)
The version of gnutls installed on the remote Azure Linux 3.0 host is prior to tested version. It is, therefore, affected by a vulnerability as referenced in the CVE-2024-28834 advisory. - A flaw was found in GnuTLS. The Minerva attack is a cryptographic vulnerability that exploits deterministic...