Agent Security Is a Systems Problem
We take the position that agent security must be approached as a systems problem: the AI model powering the agent must be treated as an untrusted component, and security invariants must be enforced at the system level. Through this lens, efforts to increase model robustness, the dominant viewpoint...
A Systematic Literature Review on LLM Defenses against Prompt Injection and Jailbreaking: Expanding NIST Taxonomy
The rapid advancement and widespread adoption of generative artificial intelligence (GenAI) and large language models (LLMs) has been accompanied by the emergence of new security vulnerabilities and challenges, such as jailbreaking and other prompt injection attacks. These maliciously crafted inputs...
Leveraging Trustworthy AI for Automotive Security in Multi-Domain Operations: Towards a Responsive Human-AI Multi-Domain Task Force for Cyber Social Security
Multi-Domain Operations (MDOs) emphasize cross-domain defense against complex and synergistic threats, with civilian infrastructures like smart cities and Connected Autonomous Vehicles (CAVs) emerging as primary targets. As dual-use assets, CAVs are vulnerable to Multi-Surface Threats (MSTs),...
Quantum Support Vector Regression for Robust Anomaly Detection
Anomaly Detection (AD) is critical in data analysis, particularly within the domain of IT security. In recent years, Machine Learning (ML) algorithms have emerged as a powerful tool for AD in large-scale data. In this study, we explore the potential of quantum ML approaches, specifically quantum kern...
Bypassing Prompt Injection and Jailbreak Detection in LLM Guardrails
Large Language Models (LLMs) guardrail systems are designed to protect against prompt injection and jailbreak attacks. However, they remain vulnerable to evasion techniques. We demonstrate two approaches for bypassing LLM prompt injection and jailbreak detection systems via traditional character...
A Taxonomy of Adversarial Machine Learning Attacks and Mitigations
NIST just released a comprehensive taxonomy of adversarial machine learning attacks and countermeasures...
Microsoft AI Red Team building future of safer AI
An essential part of shipping software securely is red teaming. It broadly refers to the practice of emulating real-world adversaries and their tools, tactics, and procedures to identify risks, uncover blind spots, validate assumptions, and improve the overall security posture of systems. Microso...
Security Risks of AI
Stanford and Georgetown have a new report on the security risks of AI--particularly adversarial machine learning--based on a workshop they held on the topic. Jim Dempsey, one of the workshop organizers, wrote a blog post on the report: As a first step, our report recommends the inclusion of AI...
WAF-A-MoLE - A Guided Mutation-Based Fuzzer For ML-based Web Application Firewalls
A guided mutation-based fuzzer for ML-based Web Application Firewalls, inspired by AFL and based on the FuzzingBook by Andreas Zeller et al. Given an input SQL injection query, it tries to produce a semantic invariant query that is able to bypass the target WAF. You can use this tool for assessin...
The Supreme Court Narrowed the CFAA
In a 6-3 ruling, the Supreme Court just narrowed the scope of the Computer Fraud and Abuse Act: In a ruling delivered today, the court sided with Van Buren and overturned his 18-month conviction. In a 37-page opinion written and delivered by Justice Amy Coney Barrett, the court explained that the...
New Framework Released to Protect Machine Learning Systems From Adversarial Attacks
Microsoft, in collaboration with MITRE, IBM, NVIDIA, and Bosch, has released a new open framework that aims to help security analysts detect, respond to, and remediate adversarial attacks against machine learning (ML) systems. Called the Adversarial ML Threat Matrix, the initiative is an attempt to...
Adversarial Machine Learning and the CFAA
I just co-authored a paper on the legal risks of doing machine learning research, given the current state of the Computer Fraud and Abuse Act: Abstract: Adversarial Machine Learning is booming with ML researchers increasingly targeting commercial ML systems such as those used in Facebook, Tesla,...
Fooling NLP Systems Through Word Swapping
MIT researchers have built a system that fools natural-language processing systems by swapping words with synonyms: The software, developed by a team at MIT, looks for the words in a sentence that are most important to an NLP classifier and replaces them with a synonym that a human would find...
Introduction and Application of Model Hacking
By Steve Povolny · February 19, 2020. Catherine Huang, Ph.D., and Shivangee Trivedi contributed to this blog. The term “Adversarial Machine Learning” (AML) is a mouthful! The term describes a research field regarding the study and design o...
Model Hacking ADAS to Pave Safer Roads for Autonomous Vehicles
By Steve Povolny · February 19, 2020. The last several years have been fascinating for those of us who have been eagerly observing the steady move towards autonomous driving. While semi-autonomous vehicles have existed for many...
Fooling Automated Surveillance Cameras with Patchwork Color Printout
Nice bit of adversarial machine learning. The image from this news article is most of what you need to know, but here's the research paper...
Adversarial Machine Learning against Tesla's Autopilot
Researchers have been able to fool Tesla's autopilot in a variety of ways, including convincing it to drive into oncoming traffic. It requires the placement of stickers on the road. Abstract: Keen Security Lab has maintained the security research work on Tesla vehicle and shared our research...
Cybersecurity for the Public Interest
The Crypto Wars have been waging off-and-on for a quarter-century. On one side is law enforcement, which wants to be able to break encryption, to access devices and communications of terrorists and criminals. On the other are almost every cryptographer and computer security expert, repeatedly...
Metasploit for Machine Learning: Deep-Pwning
Deep-pwning is a lightweight framework for experimenting with machine learning models with the goal of evaluating their robustness against a motivated adversary. Note that deep-pwning in its current state is nowhere close to maturity or completion. It is meant to be experimented with, expanded...