2 matches found
Scaling Patterns in Adversarial Alignment: Evidence from Multi-LLM Jailbreak Experiments
Large language models LLMs increasingly operate in multi-agent and safety-critical settings, raising open questions about how their vulnerabilities scale when models interact adversarially. This study examines whether larger models can systematically jailbreak smaller ones - eliciting harmful or...
Evaluation Empirique De La Sécurisation Et De L'Alignement De ChatGPT Et Gemini: Analyse Comparative Des Vulnérabilités Par Expérimentations De Jailbreaks
Large Language models LLMs are transforming digital usage, particularly in text generation, image creation, information retrieval and code development. ChatGPT, launched by OpenAI in November 2022, quickly became a reference, prompting the emergence of competitors such as Google's Gemini. However...