Can In-Context Reinforcement Learning Recover from Reward Poisoning Attacks?
We study the corruption-robustness of in-context reinforcement learning (ICRL), focusing on the Decision-Pretrained Transformer (DPT, Lee et al., 2023). To address the challenge of reward poisoning attacks targeting the DPT, we propose a novel adversarial training framework, called Adversarially...