685 matches found
Prompt Injection Attacks on Large Language Models
This is a good survey on prompt injection attacks on large language models like ChatGPT. Abstract: We are currently witnessing dramatic advances in the capabilities of Large Language Models LLMs. They are already being adopted in practice and integrated into many systems, including integrated...
Jailbreaking ChatGPT and other large language models while we can
The introduction of ChatGPT launched an arms race between tech giants. The rush to be the first to incorporate a similar large language model LLM into their own offerings read: search engines may have left a lot of opportunities to bypass the active restrictions such as bias, privacy concerns, an...
New Study Uncovers Text-to-SQL Model Vulnerabilities Allowing Data Theft and DoS Attacks
A group of academics has demonstrated novel attacks that leverage Text-to-SQL models to produce malicious code that could enable adversaries to glean sensitive information and stage denial-of-service DoS attacks. "To better interact with users, a wide range of database applications employ AI...
Extracting Personal Information from Large Language Models Like GPT-2
Researchers have been able to find all sorts of personal information within GPT-2. This information was part of the training data, and can be extracted with the right sorts of queries. Paper: "Extracting Training Data from Large Language Models." Abstract: It has become common to publish large...
Attention is All They Need: Combatting Social Media Information Operations With Neural Language Models
Information operations have flourished on social media in part because they can be conducted cheaply, are relatively low risk, have immediate global reach, and can exploit the type of viral amplification incentivized by platforms. Using networks of coordinated accounts, social media-driven...