2 matches found
Honeypot Protocol
Trusted monitoring, the standard defense in AI control, is vulnerable to adaptive attacks, collusion, and strategic attack selection. All of these exploit the fact that monitoring is passive: it observes model behavior but never probes whether the model would behave differently under different...
OS Floor 19H1
Evaluates to true if client machine is running Win10 OS with build number 17764 and greater...