vLLM uses Python 3.12 built-in hash() which leads to predictable hash collisions in prefix cache
Summary Maliciously constructed prompts can lead to hash collisions, resulting in prefix cache reuse, which can interfere with subsequent responses and cause unintended behavior. Details vLLM's prefix caching makes use of Python's built-in hash function. As of Python 3.12, the behavior of hashNon...