Artificial Intelligence Playing Games
Neuroevolution is hardly a new idea in the artificial intelligence landscape, but it had become something of a forgotten one. While deep learning has been stealing headlines for years, neuroevolution is making a comeback, and major AI researchers are now using it to create more powerful machine learning tools.
In AI, neuroevolution remains a beautiful concept that can spur exciting research and results. It is striking that the power of this subfield is only now, in 2020, being realized, but several mainstream demonstrations already show how neuroevolution could change the way we live. It is certainly shaping how the games of the future will be played.
If you're unfamiliar with neuroevolution, it is a machine learning (ML) technique that applies evolutionary processes (mutation, selection, and reproduction) to neural networks. It translates the biological idea of improvement through variation and natural selection to computers.
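To make that loop concrete, here is a minimal sketch of a generational evolutionary loop in Python. The toy fitness function and population sizes are placeholders of my own; a real neuroevolution system would score each candidate by letting its network play the game.

```python
import numpy as np

rng = np.random.default_rng(0)

def fitness(weights):
    # Stand-in for "score achieved in the game"; a real system would run
    # the network these weights define through an episode and score it.
    return -np.sum((weights - 3.0) ** 2)

POP_SIZE, NUM_PARAMS, NUM_ELITE = 50, 10, 10
population = rng.normal(size=(POP_SIZE, NUM_PARAMS))

for generation in range(100):
    scores = np.array([fitness(w) for w in population])
    elite = population[np.argsort(scores)[-NUM_ELITE:]]      # keep the fittest
    parents = elite[rng.integers(NUM_ELITE, size=POP_SIZE)]  # reproduce
    population = parents + 0.1 * rng.normal(size=population.shape)  # mutate

print(population.mean(axis=0))  # drifts toward the optimum at 3.0
```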
As we move deeper into an AI world, the idea of evolving models makes sense. Building one-and-done AI solutions is neither cost-effective nor time-effective. Building models that can improve over time is a clear way to extend the lifespan and value of ML agents.
Interestingly, the world of gaming provides a perfect snapshot of neuroevolution in action. In recent years, programmers have built AI models on neuroevolution principles to play and complete popular classic video games. Below we will look at some of those models and how each functions in its game of choice.
Note: It's important to point out that AI models that can play and beat games are not new. However, there is a fundamental difference with neuroevolution models: these machine learning programs teach themselves to play games rather than being programmed to play them.
MarI/O – Mario
Perhaps the most famous game of all time, the original Mario is as true a classic as it is possible to get. YouTuber SethBling developed a program known as "MarI/O" that could teach itself to play Mario. By evolving neural networks (it is built on the NEAT algorithm), the program tests which buttons to press to perform actions and then learns from the results.
Furthermore, the model learns to recognize obstacles on the map and avoid them by jumping. MarI/O uses a fitness function to score how "fit" Mario is at the end of each run or level, and only the networks with the highest fitness are carried into the next generation. The model failed a lot early on, but it learned from every attempt and soon perfected the first level.
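A MarI/O-style fitness function is easy to sketch. The weights and field names below are illustrative, not SethBling's actual code; the idea is simply to reward progress through the level and penalize wasted time.

```python
from dataclasses import dataclass

@dataclass
class Run:
    max_x: int   # farthest horizontal position Mario reached
    frames: int  # frames elapsed before the run ended

def fitness(run: Run) -> float:
    # Reward distance traveled, lightly penalize slowness (illustrative weights).
    return run.max_x - 0.1 * run.frames

# A run that got far quickly outscores one that stalled near the start.
print(fitness(Run(max_x=3186, frames=1200)))  # 3066.0
print(fitness(Run(max_x=900, frames=3000)))   # 600.0
```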
Sonic the Hedgehog 2 Level Completion
While the next neuroevolution AI does not have a name, it too has tackled a platforming classic: Sonic. Created as part of an OpenAI contest, the ML model was designed to complete Sonic levels in the fastest time possible. The developers argue this is how humans actually play games, rarely seeking level perfection.
The AI agent does have some pre-training: it trained on other levels before tackling the Aquatic Ruin Zone (if you have played Sonic 2, you will know this is the third zone in the game). Building on that pre-training, the AI then spent two hours practicing on the level after seeing it for the first time.
The project is built on what's known as joint Proximal Policy Optimization (PPO), a reinforcement learning algorithm that lets an agent learn online from experience while training jointly across many levels.
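At PPO's core is a clipped surrogate objective that keeps each policy update close to the policy that collected the data. Here is a minimal sketch of that loss in PyTorch; the function name and arguments are my own, not the contest code.

```python
import torch

def ppo_clip_loss(new_logp, old_logp, advantages, clip_eps=0.2):
    """Clipped surrogate loss from the PPO paper (Schulman et al., 2017)."""
    ratio = torch.exp(new_logp - old_logp)  # pi_new(a|s) / pi_old(a|s)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * advantages
    # Taking the elementwise minimum removes any incentive to push the
    # policy far from the one that gathered the experience.
    return -torch.min(unclipped, clipped).mean()
```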
Multi-Agent Hide and Seek
Another OpenAI project, Hide and Seek is an amazing training ground for neuroevolution AI agents. It takes a multi-agent approach, meaning ML models effectively play against each other. Through repeated play, the models discover more complex ways to play the game, such as using tools and exploiting hiding places.
OpenAI admits the agents performed above expectations, progressing through six distinct strategies and counterstrategies for Hide-and-Seek. Amazingly, the researchers say they did not even know some of these strategies were possible in the environment. This points to the future ability of such AI to produce extremely complex behavior.
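Self-play is the engine behind that escalation: each agent's improvement becomes the other's training signal. The snippet below is a toy illustration of two learners co-adapting via multiplicative-weights updates on rock-paper-scissors; it is not OpenAI's setup, just the simplest case of strategies and counterstrategies emerging from play.

```python
import numpy as np

# Rock-paper-scissors payoffs for the row player; the column player gets the negation.
PAYOFF = np.array([[0, -1, 1],
                   [1, 0, -1],
                   [-1, 1, 0]], dtype=float)

def hedge(weights, payoffs, lr=0.05):
    # Multiplicative-weights update: boost actions that would have paid off.
    return weights * np.exp(lr * payoffs)

p, q = np.ones(3), np.ones(3)
avg_p, avg_q = np.zeros(3), np.zeros(3)
STEPS = 5000

for _ in range(STEPS):
    p_probs, q_probs = p / p.sum(), q / q.sum()
    avg_p += p_probs / STEPS
    avg_q += q_probs / STEPS
    p = hedge(p, PAYOFF @ q_probs)     # expected payoff of each row action
    q = hedge(q, -(p_probs @ PAYOFF))  # column player's payoff is the negation

print(avg_p, avg_q)  # both average strategies approach the uniform equilibrium
```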
You Shall Not Pass
AI agents trained to play You Shall Not Pass show how machine learning models trained through self-play can win under most circumstances. In this adversarial setting, an attacker cannot directly modify what another AI agent is doing or seeing.
However, in a shared environment one agent can behave in ways designed purely to feed the other misleading observations, distracting it into failure. It is an amazing development in AI capabilities that suggests agents can figure out ways to win games using tactics they were never taught.
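A tiny sketch captures the core idea of an adversarial policy: freeze the victim's behavior and optimize the attacker against it. The real research trains a reinforcement learning attacker against frozen simulated wrestlers; the matrix and numbers below are made up purely to show the skeleton.

```python
import numpy as np

# Payoff matrix for the attacker (rows) against the victim (columns).
PAYOFF = np.array([[1.0, -1.0],
                   [-1.0, 1.0]])

victim = np.array([0.7, 0.3])  # frozen victim policy with an exploitable bias

expected = PAYOFF @ victim     # attacker's expected payoff for each action
best_response = int(np.argmax(expected))
print(best_response, expected[best_response])  # action 0 wins 0.4 on average
```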
Steal a Banana
TairaGames demonstrates neuroevolution learning by challenging a dumb AI to steal a banana. The AI is dumb because it has no prior knowledge of the best way to reach the banana and must learn how to do it. There are also rules governing the movements the AI can make and what happens when it hits a boundary wall.
What's interesting is that the AI does not take a guns-blazing approach that would produce many instant failures, such as running straight ahead into the wall. Instead, the unevolved starting AI takes tiny micro-steps, 81,000 of them in fact, and needs over 1.5 hours to reach the banana for the first time. This non-evolved AI also fails to hold onto the banana once it grabs it.
After 800,000 steps of learning, the bots go from knowing nothing to moving toward the banana far more directly. They have also learned to hold onto the banana once they pick it up.
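The trial-and-error loop behind this is simple to sketch. Below is a toy tabular Q-learning agent on a small grid with a boundary penalty and a "banana" goal; it illustrates the learn-by-failing pattern in the video, not TairaGames' actual code.

```python
import numpy as np

rng = np.random.default_rng(0)
SIZE, GOAL = 5, (4, 4)                      # the "banana" sits in a far corner
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
Q = np.zeros((SIZE, SIZE, len(MOVES)))      # one value per state-action pair

def step(state, action):
    r, c = state
    nr, nc = r + MOVES[action][0], c + MOVES[action][1]
    if not (0 <= nr < SIZE and 0 <= nc < SIZE):
        return state, -1.0, False           # hit the boundary wall: penalized
    if (nr, nc) == GOAL:
        return (nr, nc), 10.0, True         # grabbed the banana
    return (nr, nc), -0.1, False            # small cost for every wasted step

for episode in range(500):
    state, done = (0, 0), False
    for _ in range(200):                    # cap the episode length
        # Mostly exploit what is known so far, sometimes explore at random.
        action = int(rng.integers(4)) if rng.random() < 0.1 else int(np.argmax(Q[state]))
        nxt, reward, done = step(state, action)
        target = reward + 0.9 * np.max(Q[nxt]) * (not done)
        Q[state][action] += 0.5 * (target - Q[state][action])
        state = nxt
        if done:
            break
```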