KG-DF: A Black-Box Defense Framework against Jailbreak Attacks Based on Knowledge Graphs
With the widespread application of large language models (LLMs) in various fields, the security challenges they face have become increasingly prominent, especially jailbreak attacks. These attacks use crafted inputs to induce the model to generate erroneous or uncontrolled outputs, threatenin...