14 matches found
Keras 3.13.0 HDF5 Shape Fuzzing for Robustness Testing
This script performs fuzz testing against Keras version 3.13.0 on randomly generated tensor shapes using NumPy and HDF5 to evaluate stability and error handling in file creation workflows...
DNG File Fuzzer for Robustness
This Python script is a mutation-based fuzzing tool designed to test the robustness of DNG Digital Negative / TIFF-based file parsers by generating large numbers of corrupted or semi-valid image files. It works by starting from a minimal valid DNG structure, then applying random mutations to...
VisualLeakBench: Auditing the Fragility of Large Vision-Language Models against PII Leakage and Social Engineering
As Large Vision-Language Models LVLMs are increasingly deployed in agent-integrated workflows and other deployment-relevant settings, their robustness against semantic visual attacks remains under-evaluated -- alignment is typically tested on explicit harmful content rather than privacy-critical...
Extending the OWASP Multi-Agentic System Threat Modeling Guide: Insights from Multi-Agent Security Research
We propose an extension to the OWASP Multi-Agentic System MAS Threat Modeling Guide, translating recent anticipatory research in multi-agent security MASEC into practical guidance for addressing challenges unique to large language model LLM-driven multi-agent architectures. Although OWASP's...
Prompt Injection Vulnerability of Consensus Generating Applications in Digital Democracy
Large Language Models LLMs are gaining traction as a method to generate consensus statements and aggregate preferences in digital democracy experiments. Yet, LLMs may introduce critical vulnerabilities in these systems. Here, we explore the impact of prompt-injection attacks targeting consensus...
Probing the Robustness of Large Language Models Safety to Latent Perturbations
Safety alignment is a key requirement for building reliable Artificial General Intelligence. Despite significant advances in safety alignment, we observe that minor latent shifts can still trigger unsafe responses in aligned models. We argue that this stems from the shallow nature of existing...
Mal-D2GAN: Double-Detector Based GAN for Malware Generation
Machine learning ML has been developed to detect malware in recent years. Most researchers focused their efforts on improving the detection performance but ignored the robustness of the ML models. In addition, many machine learning algorithms are very vulnerable to intentional attacks. To solve...
Alignment under Pressure: the Case for Informed Adversaries When Evaluating LLM Defenses
Large language models LLMs are rapidly deployed in real-world applications ranging from chatbots to agentic systems. Alignment is one of the main approaches used to defend against attacks such as prompt injection and jailbreaks. Recent defenses report near-zero Attack Success Rates ASR even again...
Evaluating the Robustness of Adversarial Defenses in Malware Detection Systems
Machine learning is a key tool for Android malware detection, effectively identifying malicious patterns in apps. However, ML-based detectors are vulnerable to evasion attacks, where small, crafted changes bypass detection. Despite progress in adversarial defenses, the lack of comprehensive...
AGATE: Stealthy Black-Box Watermarking for Multimodal Model Copyright Protection
Recent advancement in large-scale Artificial Intelligence AI models offering multimodal services have become foundational in AI systems, making them prime targets for model theft. Existing methods select Out-of-Distribution OoD data as backdoor watermarks and retrain the original model for...
FCGHunter: Towards Evaluating Robustness of Graph-Based Android Malware Detection
Graph-based detection methods leveraging Function Call Graphs FCGs have shown promise for Android malware detection AMD due to their semantic insights. However, the deployment of malware detectors in dynamic and hostile environments raises significant concerns about their robustness. While recent...
Radamsa - A General-Purpose Fuzzer
Radamsa is a test case generator for robustness testing, a.k.a. a fuzzer. It is typically used to test how well a program can withstand malformed and potentially malicious inputs. It works by reading sample files of valid data and generating interestringly different outputs from them. The main...
What the Fuzz: Radamsa
What the Fuzz: Radamsa Radamsa is a test case generator for robustness testing, a.k.a. a fuzzer. It is typically used to test how well a program can withstand malformed and potentially malicious inputs. It works by reading sample files of valid data and generating interestingly different outputs...
General Purpose Fuzzer: Radamsa
Radamsa is a test case generator for robustness testing, a.k.a. a fuzzer. It is typically used to test how well a program can withstand malformed and potentially malicious inputs. It works by reading sample files of valid data and generating interestringly different outputs from them. The main...