Reinforcement learning based on self-play has enabled AI agents to surpass human expert-level performance in the popular computer game Dota and in board games such as chess and Go. Despite these strong results, recent studies suggest that self-play may not be as robust as previously thought. A question naturally arises: Are such self-play agents vulnerable to adversarial attacks?
In the new paper Adversarial Policies Beat Professional-Level Go AIs, a research team from MIT, UC Berkeley, and FAR AI uses a novel adversarial policy to attack the state-of-the-art Go AI system KataGo. The team believes theirs is the first successful end-to-end attack against a Go AI system playing at the level of a human professional.
The team summarizes their main contributions as follows:
- We propose a new attack method, which hybridizes the attack of Gleave et al. (2020) and AlphaZero-style training (Silver et al., 2018).
- We demonstrate the existence of adversarial policies against the state-of-the-art Go AI system, KataGo.
- We find that the adversary pursues a simple strategy that tricks the victim into predicting victory, causing it to pass prematurely.
This work focuses on exploiting professional-level Go AI policies with a discrete action space. The team attacked the strongest publicly available Go AI system, KataGo, although not in its full-strength configuration. Unlike KataGo, which is trained through self-play games, the team trained their agent on games played against a fixed victim agent, using only data from turns where it was the adversary's move. This “victim-play” training approach encourages the model to exploit the victim rather than imitate it.
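The data-selection rule described above can be sketched in a few lines. This is a toy illustration, not the team's code: the `State` and `Policy` types, the function name `victim_play_game`, and the move encoding are all hypothetical stand-ins, and the "game" is just an append-only move list rather than Go.

```python
from typing import Callable, List, Tuple

State = Tuple[int, ...]          # toy stand-in for a board position (hypothetical)
Policy = Callable[[State], int]  # maps a position to a move (hypothetical)


def victim_play_game(
    adversary: Policy,
    victim: Policy,
    num_turns: int = 10,
    adversary_moves_first: bool = True,
) -> List[Tuple[State, int]]:
    """Play one toy game and return training samples from adversary turns only."""
    state: State = ()
    samples: List[Tuple[State, int]] = []
    for turn in range(num_turns):
        adversary_to_move = (turn % 2 == 0) == adversary_moves_first
        if adversary_to_move:
            move = adversary(state)
            # Keep this turn: it is the adversary's decision.
            samples.append((state, move))
        else:
            # The victim's move advances the game but is not recorded,
            # so the adversary is never trained to imitate the victim.
            move = victim(state)
        state = state + (move,)
    return samples


if __name__ == "__main__":
    data = victim_play_game(lambda s: 1, lambda s: 0, num_turns=6)
    print(len(data), "samples, all from adversary turns")
```

In self-play, by contrast, both sides' turns would feed the same learner; dropping the victim's turns is what keeps the training signal pointed at exploiting a fixed opponent.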
The team also introduces two distinct families of Adversarial Monte Carlo Tree Search (A-MCTS), Sample (A-MCTS-S) and Recursive (A-MCTS-R), so that the agent does not model its opponent's moves with its own policy network. Rather than starting from a random initialization, the team uses a curriculum that trains the agent against successively stronger versions of the victim.
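A curriculum of this kind might be organized as in the sketch below. The advancement rule (move to the next victim once the measured win rate reaches a threshold), the default threshold of 0.5, and the names `run_curriculum` and `evaluate_win_rate` are assumptions made for illustration; the article does not specify how the team decides when to switch victims.

```python
from typing import Callable, List, Sequence


def run_curriculum(
    victims: Sequence[str],
    evaluate_win_rate: Callable[[str, int], float],
    threshold: float = 0.5,   # assumed advancement threshold
    max_rounds: int = 100,
) -> List[str]:
    """Return the victim faced in each training round, weakest first."""
    schedule: List[str] = []
    stage = 0
    for round_idx in range(max_rounds):
        victim = victims[stage]
        schedule.append(victim)
        # In a real system this would train for a while and then measure
        # the adversary's win rate; here it is a caller-supplied stub.
        win_rate = evaluate_win_rate(victim, round_idx)
        if win_rate >= threshold:
            if stage == len(victims) - 1:
                break          # strongest victim beaten: stop
            stage += 1         # graduate to the next, stronger victim
    return schedule


if __name__ == "__main__":
    # Stub: the adversary "wins" on every odd-numbered round.
    stub = lambda victim, r: 1.0 if r % 2 == 1 else 0.0
    print(run_curriculum(["weak", "medium", "strong"], stub))
```

The point of the staging is that the adversary always faces an opponent it has some chance of beating, so it keeps receiving a useful learning signal.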
In their empirical study, the team used their adversarial policy to attack KataGo without search (playing at the level of a top-100 European player) and KataGo with 64 visits (“close to superhuman level”). The proposed policy achieved a win rate of more than 99 percent against KataGo without search and more than 50 percent against KataGo with 64 visits.
While this work suggests that self-play is not as robust as expected and that adversarial policies can be used to defeat top Go AI systems, the results have been questioned by the machine learning and Go communities. Discussions on Reddit involving the paper's authors and KataGo's developers have focused on the particularities of the Tromp-Taylor scoring rules used in the experiments: while the proposed agent gets its wins by “tricking KataGo into ending the game prematurely,” it has been argued that this tactic would lead to devastating losses under more common Go rules.
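To see why the choice of ruleset matters, here is a minimal sketch of Tromp-Taylor area scoring on a toy board. It assumes the standard definition (a player scores one point per stone plus one per empty point whose region touches only that player's stones) and assumes that the relevant particularity is that no stones are removed as dead before counting. The board encoding and the function name `tromp_taylor_score` are illustrative choices, not taken from the paper or from KataGo.

```python
from typing import Dict, List


def tromp_taylor_score(board: List[str]) -> Dict[str, int]:
    """Score a board of 'B', 'W' and '.' characters; no dead-stone removal."""
    rows, cols = len(board), len(board[0])
    score = {"B": 0, "W": 0}
    seen = set()
    for r in range(rows):
        for c in range(cols):
            cell = board[r][c]
            if cell in score:
                score[cell] += 1          # every stone on the board counts
            elif (r, c) not in seen:
                # Flood-fill this empty region and note which colors it touches.
                region_size, borders = 0, set()
                stack = [(r, c)]
                seen.add((r, c))
                while stack:
                    y, x = stack.pop()
                    region_size += 1
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < rows and 0 <= nx < cols:
                            neighbor = board[ny][nx]
                            if neighbor in score:
                                borders.add(neighbor)
                            elif (ny, nx) not in seen:
                                seen.add((ny, nx))
                                stack.append((ny, nx))
                if len(borders) == 1:     # region reaches only one color
                    score[borders.pop()] += region_size
    return score


if __name__ == "__main__":
    clean = ["..BW.", "..BW.", "..BW."]
    stray = ["..BW.", "W.BW.", "..BW."]   # one white stone left inside Black's area
    print(tromp_taylor_score(clean))      # {'B': 9, 'W': 6}
    print(tromp_taylor_score(stray))      # {'B': 3, 'W': 7}
```

In the second position, a single uncaptured white stone turns Black's six-point area into neutral ground. Under rulesets where both players agree to remove dead stones before counting, that stone would typically be taken off the board, which is consistent with the argument that the tactic would fare badly there.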
The open-source implementation is available on GitHub, and example games are available on the project’s webpage. The paper Adversarial Policies Beat Professional-Level Go AIs is on arXiv.
Author: Hecate She | Editor: Michael Sarazen