ChineseHarm-Bench: A Chinese Harmful Content Detection Benchmark
Large language models (LLMs) have been increasingly applied to automated harmful content detection tasks, assisting moderators in identifying policy violations and improving the overall efficiency and accuracy of content review. However, existing resources for harmful content detection are...