Adversarial Policies Beat Professional-Level Go AIs

When dealing with evil robots or rogue AI, one of the most common science fiction plot devices is for the human hero to do or think of something that the machine couldn’t understand – something only a human would do. It’s about as cliché and formulaic as cliché and formulaic gets.

Back in 2016, DeepMind’s AlphaGo changed the world by beating some of the world’s best Go players. (Go is an ancient Chinese game that, up to that time, was considered unwinnable by a machine. Have a look at AlphaGo vs. You: Not a Fair Fight for reference.) Today, you can run KataGo, an open-source Go-playing AI, and easily beat top-ranking human Go players.

Here’s a plot twist: according to a recently published paper, Adversarial Policies Beat Professional-Level Go AIs, thinking like a human (albeit a stupid human) can be a winning strategy against an otherwise invincible AI.

Here’s the interesting part. Researchers showed that by playing unexpected moves outside of KataGo’s training set, a much weaker adversarial Go-playing program (one that even amateur humans can defeat) can trick KataGo into losing. In other words, you can play like a stupid human to win. This strategy is right out of the Star Trek playbook, and I love it!

Adversarial Policies Beat Professional-Level Go AIs

The adversarial policy beats the KataGo victim by playing a counterintuitive strategy: staking out a minority territory in the corner, allowing KataGo to stake the complement, and placing weak stones in KataGo’s stake. KataGo predicts a high win probability for itself and, in a way, it’s right—it would be simple to capture most of the adversary’s stones in KataGo’s stake, achieving a decisive victory. However, KataGo plays a pass move before it has finished securing its territory, allowing the adversary to pass in turn and end the game. This results in a win for the adversary under the Tromp-Taylor ruleset for computer Go (Tromp, 2014) that KataGo was trained and configured to use (see Appendix A). Specifically, the adversary gets points for its corner territory (devoid of victim stones) whereas the victim does not receive points for its territory because of the presence of the adversary’s stones.
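To see why those stray stones matter, here is a minimal sketch of Tromp-Taylor area scoring in Python. It is my own illustration, not code from the paper or from KataGo. The 5x5 boards are made up, komi is ignored, and the function name tromp_taylor_score is hypothetical. Under these rules a player scores one point for each of their stones, plus one point for each empty point in a region that touches only their color.

```python
from collections import deque

def tromp_taylor_score(board):
    """Score a finished position under Tromp-Taylor area scoring (no komi).

    board is a list of equal-length strings using 'B', 'W', and '.' (empty).
    This is an illustrative sketch, not KataGo's implementation.
    """
    rows, cols = len(board), len(board[0])
    score = {"B": 0, "W": 0}
    seen = set()
    for r in range(rows):
        for c in range(cols):
            cell = board[r][c]
            if cell in score:
                score[cell] += 1  # every stone on the board is worth a point
                continue
            if (r, c) in seen:
                continue
            # Flood-fill this empty region and record which colors it touches.
            size, borders = 0, set()
            queue = deque([(r, c)])
            seen.add((r, c))
            while queue:
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < rows and 0 <= nx < cols:
                        neighbor = board[ny][nx]
                        if neighbor == ".":
                            if (ny, nx) not in seen:
                                seen.add((ny, nx))
                                queue.append((ny, nx))
                        else:
                            borders.add(neighbor)
            # An empty region counts only if it touches exactly one color.
            if len(borders) == 1:
                score[borders.pop()] += size
    return score

# Black (the adversary) holds a small corner; White (the victim) holds the rest.
clean = ["..BW.", "..BW.", "BBBW.", "WWWW.", "....."]
# Same position, plus one weak Black stone left sitting in White's area.
contested = ["..BW.", "..BW.", "BBBW.", "WWWW.", "....B"]

print(tromp_taylor_score(clean))      # {'B': 9, 'W': 16}
print(tromp_taylor_score(contested))  # {'B': 10, 'W': 7}
```

In the second position, a single uncaptured Black stone means White's empty points touch both colors, so White gets credit only for its stones. Capturing that stone before passing would restore White's lead, which is exactly the step KataGo skips.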

ABOUT THE AUTHOR

Shelly Palmer is the Professor of Advanced Media in Residence at Syracuse University’s S.I. Newhouse School of Public Communications, co-founder of Metacademy, and the CEO of The Palmer Group, a consulting practice that helps Fortune 500 companies with technology, media and marketing. Named LinkedIn’s “Top Voice in Technology,” he covers tech and business for Good Day New York, is a regular commentator on CNN and CNBC and writes a popular daily business blog. He’s the Co-Host of the award-winning podcast Techstream with Shelly Palmer & Seth Everett and his latest book, Blockchain – Cryptocurrency, NFTs & Smart Contracts: An executive guide to the world of decentralized finance, is an Amazon #1 Bestseller. Follow @shellypalmer or visit shellypalmer.com.

More Information:

https://arxiv.org/abs/2211.00241?mc_cid=5b2057f9cc&mc_eid=b86366b1a9

https://www.shellypalmer.com/2022/11/in-sci-fi-as-in-life/
