Enhancing reinforcement learning one game at a time

September 20, 2022

Competitive games have been major proving grounds for artificial intelligence research since the beginning of the field. Chess was an early challenge, and IBM Deep Blue's successful match against chess grandmaster Garry Kasparov was widely covered in the popular media. It can also be credited with bringing artificial intelligence into the consciousness of business executives and the public at large.

Chess, although it seems complex, has a fixed set of rules and only 64 squares to move on. Attention therefore turned to more complex games like Go and poker. In recent years, a number of breakthroughs in AI have been made in these domains by combining deep reinforcement learning (RL) with self-play, achieving superhuman performance at Go and poker. The UK-based artificial intelligence company DeepMind, which was acquired by Google, has been at the forefront of this reinforcement learning research. Traditional machine learning needs a large training data set, which is feasible for some applications like image recognition and classification but is not a way forward for more complex decision-making scenarios.

In January 2019, DeepMind demonstrated AlphaStar, which scored impressive wins against professional players of the wildly popular real-time strategy game StarCraft II. The game was considered a consensus challenge for AI research, one that the new learning technique from DeepMind helped master. After this, DeepMind turned its research efforts to soccer! Robot soccer, including simulated leagues, has been a longstanding challenge in AI; it has been tackled with machine learning techniques but not yet mastered by end-to-end reinforcement learning.

In a paper released last week on arXiv, DeepMind tries to address this problem. Soccer requires several players to cooperate towards a shared goal, which creates far more complex interactions than a single agent pursuing a goal on its own. In continuous control domains, competitive games possess a natural curriculum property, where complex behaviours have the potential to emerge in simple environments as a result of competition between agents, rather than due to the increasing difficulty of manually designed tasks. Challenging collaborative-competitive multi-agent environments have only recently been addressed using end-to-end RL, and this is the first time the approach has been applied to soccer.

The headline result was the emergence of cooperative behaviours in reinforcement learning agents placed in a challenging competitive multi-agent soccer environment with continuous simulated physics. DeepMind researchers combined Stochastic Value Gradients, a reinforcement learning algorithm for continuous control, with population-based training, a method for optimizing hyper-parameters across a population of simultaneously learning agents. Ten different simulated robot soccer teams were generated, each trained with 25 billion frames of learning experience. Researchers then simulated one million tournament matches between the ten squads. As training progressed, the agents' behaviour moved from simple ball chasing to, eventually, showing evidence of cooperation.
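The exploit-and-explore idea behind population-based training can be illustrated with a toy example. The sketch below is not DeepMind's implementation: the `evaluate` fitness function, the single `lr` hyper-parameter, the perturbation factors and the quartile cutoff are all assumptions made for illustration, and a real system would also copy network weights and score agents by match results rather than by a formula.

```python
import random


def evaluate(member):
    """Toy fitness (an assumption): best when the hypothetical 'lr' equals 0.1."""
    return -abs(member["lr"] - 0.1)


def pbt_step(population, rng, perturb=(0.8, 1.2)):
    """Run one exploit/explore round over the population, in place."""
    ranked = sorted(population, key=evaluate, reverse=True)
    cutoff = max(1, len(ranked) // 4)
    top, bottom = ranked[:cutoff], ranked[-cutoff:]
    for member in bottom:
        parent = rng.choice(top)
        # Exploit: copy the hyper-parameter of a stronger member.
        # Explore: perturb the copy so the population keeps searching.
        member["lr"] = parent["lr"] * rng.choice(perturb)
    return population


def run_pbt(size=8, rounds=20, seed=0):
    """Build a random population, run several rounds, return the best member."""
    rng = random.Random(seed)
    population = [{"lr": 10 ** rng.uniform(-4, 0)} for _ in range(size)]
    for _ in range(rounds):
        pbt_step(population, rng)
    return max(population, key=evaluate)


if __name__ == "__main__":
    best = run_pbt()
    print("best lr:", best["lr"], "fitness:", evaluate(best))
```

Because the top performers are never overwritten in a round, the best fitness in the population cannot get worse from one round to the next, while the weakest members are repeatedly replaced by perturbed copies of stronger ones.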

This is a major step forward in multi-agent reinforcement learning, and further research could lead to breakthroughs in other fields of artificial intelligence. Of course, everybody is still waiting for the EA Sports FIFA match first!

To dive deep into global innovation in artificial intelligence and learn more about its applications in your industry, test drive WhatNext now!
