On the Feasibility of Poisoning Text-To-Image AI Models Via Adversarial Mislabeling
Today's text-to-image generative models are trained on millions of images sourced from the Internet, each paired with a detailed caption produced by Vision-Language Models VLMs. This part of the training pipeline is critical for supplying the models with large volumes of high-quality image-captio...