Canadian researchers at Maluuba have set a new Ms. Pac-Man world record — well, sort of.
Using what they call a “divide-and-conquer” method, the team built an artificial intelligence capable of learning Ms. Pac-Man well enough to achieve the maximum possible score of 999,990.
The approach resembles the way some psychologists believe the human brain solves problems: with different neural agents competing for priority.
Maluuba calls its method “Hybrid Reward Architecture.”
“The method… used more than 150 agents, each of which worked in parallel with the other agents to master Ms. Pac-Man,” reads an excerpt from a media release. “For example, some agents got rewarded for successfully finding one specific pellet, while others were tasked with staying out of the way of ghosts.”
In addition to the 163 agents that govern the on-screen rewards (154 for pellets, four for ghosts, four more for edible blue ghosts, and one for fruit), Maluuba implemented a “top agent” that governed the movement of Ms. Pac-Man herself.
This top agent kept track of each individual agent’s priorities and weighted Ms. Pac-Man’s movement according to the immediacy of each priority.
“For example, if 100 agents wanted to go right because that was the best path to their pellet, but three wanted to go left because there was a deadly ghost to the right, it would give more weight to the one who had noticed the ghost and go left,” reads another excerpt.
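The weighting described in that excerpt can be illustrated with a small sketch. This is not Maluuba's code: the `AgentVote` class, the `choose_action` function, and all of the scores and weights below are hypothetical, chosen only to show how a top agent might sum weighted preferences so that a few urgent agents outvote many indifferent ones.

```python
from dataclasses import dataclass

ACTIONS = ("up", "down", "left", "right")


@dataclass(frozen=True)
class AgentVote:
    """One low-level agent's opinion (hypothetical structure)."""
    scores: dict          # action -> preference, higher is better
    weight: float = 1.0   # assumed urgency multiplier


def choose_action(votes):
    """Top agent: sum the weighted scores per action, pick the largest total."""
    totals = {action: 0.0 for action in ACTIONS}
    for vote in votes:
        for action, score in vote.scores.items():
            totals[action] += vote.weight * score
    best = max(ACTIONS, key=lambda action: totals[action])
    return best, totals


# 100 pellet agents mildly prefer going right (illustrative numbers).
pellet_votes = [AgentVote({"right": 1.0}) for _ in range(100)]
# 3 ghost agents strongly object to going right and mildly prefer left.
ghost_votes = [AgentVote({"right": -10.0, "left": 1.0}, weight=5.0)
               for _ in range(3)]

action, totals = choose_action(pellet_votes + ghost_votes)
print(action, totals)
```

With these made-up numbers, the three ghost agents contribute -150 to "right" against +100 from the pellet agents, so the top agent moves left even though it is heavily outnumbered by agents that wanted to go right.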
According to Harm Van Seijen — a research manager at Maluuba and the lead author of the paper published about the company’s results — the best results were achieved when each agent acted in its own favour.
Alone, this egotism would probably have caused an individual agent to kill Ms. Pac-Man quite quickly, but because the top agent was able to weigh all of the agents’ priorities against one another, the program was able to navigate the game successfully.
“There’s this nice interplay between how they have to, on the one hand, cooperate based on the preferences of all the agents, but at the same time each agent cares only about one particular problem,” said Van Seijen. “It benefits the whole.”
The team chose Ms. Pac-Man over the original Pac-Man because the former was designed to be less predictable.
Van Seijen now believes that Maluuba’s platform can be used in many situations where a top-level boss is tasked with prioritizing the needs of a collection of lower-level agents.
Of course, Van Seijen also believes that the divide-and-conquer approach can be used to further AI development.
Maluuba is a Quebec-based AI research firm. The company launched in 2011, out of the University of Waterloo.
Microsoft acquired the company in 2017.
Image credit: Wikimedia Commons