How an artificial intelligence learned to play Minecraft from 70,000 hours of YouTube videos

Will an artificial intelligence one day be able to play Minecraft as well as, or even better than, a human? And most importantly, will it soon learn much faster thanks to simple videos posted on the web? That, at least, is the goal of OpenAI, which has just presented the results of promising initial research.

Mastering Minecraft is a major challenge for artificial intelligence, because Mojang’s title is far harder for a computer to grasp than chess or Go (games where AI now crushes us). It is a very open game with flexible rules, and that is what makes it so charming! It gives the player almost total freedom to explore, craft, dig and build at will. In short, the player chains together extremely diverse and complex actions that are difficult to teach to hapless computer programs.

But that’s exactly what a team of OpenAI researchers did. They trained a model from the ground up that manages to play Minecraft “properly”. Better still, their model plays strictly like a human, that is, it uses the user interface and the traditional controls of the game: keystrokes and mouse movements.

Let’s be clear: let loose in Minecraft, OpenAI’s artificial intelligence is incapable of building a dream home from scratch, let alone reproducing King’s Landing. It settles for much humbler tasks, but its performance is far from ridiculous. It manages to build a simple shelter, craft tools, and explore a village to open chests. On a few occasions, it even managed to craft a diamond pickaxe, which according to OpenAI is a world first. It is indeed a far from easy tool to obtain, requiring many complex steps of finding, crafting and combining objects.

An AI fed on YouTube

How did the OpenAI researchers achieve such a result? The answer fits in almost one word: YouTube. They capitalized on the incredible richness and variety of Minecraft videos on the internet to “feed” their model, which then drew on what it saw to learn to play.

Well, of course it’s not quite that simple. Far from it. Their technique, which they called VPT (for Video PreTraining), consisted first of all in collecting 70,000 hours (!) of gameplay videos aimed specifically at beginners: tutorials, guides, etc. That was no easy task: the researchers had to rely on workers recruited through Amazon Mechanical Turk, who used a screenshot of each video to check that the selected content was actually usable for the project. The 70,000 hours were drawn from an initial corpus of 270,000 hours!
For example, it was necessary to set aside videos shot in creative mode, as well as those containing logos or artifacts that could have made them harder for the machine to understand.
Finally, a small refinement: to make it easier for their fledgling model to get started, the researchers also extracted from these thousands of videos a subset containing only the start of the game.
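
The screening described above can be pictured as a simple filter over annotated clips. The sketch below is only an illustration in Python, not OpenAI’s actual pipeline: the `Clip` record, its field names and the five-minute “early game” cutoff are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class Clip:
    """One candidate video segment (hypothetical record, for illustration)."""
    video_id: str
    hours: float
    creative_mode: bool      # shot in creative mode
    has_overlay: bool        # logos or other visual artifacts
    starts_at_minute: float  # offset of the clip within its game session

def is_clean(clip: Clip) -> bool:
    """Keep only footage that is not creative mode and has no overlays."""
    return not clip.creative_mode and not clip.has_overlay

def early_game(clips, cutoff_minutes=5.0):
    """Subset covering only the start of a game (the cutoff is an assumption)."""
    return [c for c in clips if c.starts_at_minute < cutoff_minutes]

# Made-up candidates standing in for the initial corpus.
candidates = [
    Clip("a", 1.5, creative_mode=False, has_overlay=False, starts_at_minute=0.0),
    Clip("b", 2.0, creative_mode=True, has_overlay=False, starts_at_minute=0.0),
    Clip("c", 0.5, creative_mode=False, has_overlay=True, starts_at_minute=12.0),
    Clip("d", 3.0, creative_mode=False, has_overlay=False, starts_at_minute=40.0),
]

clean = [c for c in candidates if is_clean(c)]
clean_hours = sum(c.hours for c in clean)   # hours kept after screening
early = early_game(clean)                   # early-game subset
```

In this toy run, clips "a" and "d" survive the screening and only "a" lands in the early-game subset.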

The researchers did not stop there: they also trained a model (the Inverse Dynamics Model, or IDM) on a second set of Minecraft videos. It is a much smaller corpus (only 2,000 hours) but has the advantage of being very precise, because all of these games were played specifically for the experiment by several seasoned Minecraft players. The researchers were thus able to record all of their mouse movements and every key they pressed.

The first training phases // Credit: OpenAI

In essence, they then applied this model to the 70,000 hours of video they had previously collected from the internet. This allowed their AI to “guess” the mouse and keyboard actions being performed in those videos, and to learn from them. Clever.
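
To make the two-stage idea concrete, here is a deliberately tiny sketch in Python. It is not OpenAI’s implementation (the real models are neural networks working on game footage): the one-dimensional “world”, the function names and the lookup-table models are all assumptions chosen so that the example fits in a few lines. A small labeled corpus teaches an inverse dynamics model to recover the action from what changed on screen; that model then labels a larger unlabeled corpus, and a policy is learned by imitating those guessed actions.

```python
from collections import Counter, defaultdict

# Toy world: a "frame" is an integer position on a line, and the hidden
# action is a key press that moves the player left, right, or not at all.

def train_idm(labeled):
    """Learn frame change -> action from (before, after, action) triples."""
    votes = defaultdict(Counter)
    for before, after, action in labeled:
        votes[after - before][action] += 1
    return {delta: c.most_common(1)[0][0] for delta, c in votes.items()}

def pseudo_label(idm, frames):
    """Guess the action behind each consecutive pair of unlabeled frames."""
    return [(a, idm[b - a]) for a, b in zip(frames, frames[1:])]

def behavior_clone(pairs):
    """Policy: in each state, repeat the action seen most often there."""
    votes = defaultdict(Counter)
    for state, action in pairs:
        votes[state][action] += 1
    return {s: c.most_common(1)[0][0] for s, c in votes.items()}

# Small, precisely labeled corpus (stands in for the 2,000 recorded hours).
labeled = [(0, 1, "right"), (3, 2, "left"), (5, 5, "stay"), (2, 3, "right")]
idm = train_idm(labeled)

# Larger unlabeled corpus (stands in for the 70,000 hours of web video):
# only the frames are visible, not the keys that were pressed.
web_videos = [[0, 1, 2, 3, 3], [4, 3, 3], [1, 2, 3]]
pairs = [p for frames in web_videos for p in pseudo_label(idm, frames)]
policy = behavior_clone(pairs)
```

Here `policy` ends up moving right from positions 0 to 2, staying put at 3 and moving left from 4, even though none of the web "videos" came with recorded key presses.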

Once trained in this way, the AI is already capable of performing some basic in-game tasks, e.g. chopping trees to get logs, turning the logs into planks, and then making a crafting table from four planks. A trivial step for a human player, of course, but according to OpenAI it is almost impossible to achieve with simple reinforcement learning.

It’s good, but not enough

Fair enough, but that was not enough for our researchers. So they fine-tuned their AI to try to give it more capabilities, and succeeded with two different techniques. First, they asked their human experts to play 10-minute Minecraft sessions in which the goal was to build a small house out of simple materials.
By incorporating these sessions into the model’s training, they found that the AI could not only build a humble shelter for itself, but could also go much further in crafting complex objects (such as a stone pickaxe).

Finally, they also used reinforcement learning, asking their AI to obtain a diamond pickaxe on a fresh map within ten minutes of gameplay, and rewarding it for its efforts. Not so easy, because this involves collecting, combining and crafting a long chain of successive objects, some of them hard to find. But it still managed it a few times: in 2.5% of the 10-minute games played.
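
One way to picture “rewarding the AI for its efforts” is a reward paid the first time each intermediate item is obtained. The Python sketch below is an illustration under stated assumptions, not the reward OpenAI actually used: the item list follows the usual crafting order in the game (simplified), while the reward values and the 40 made-up episodes are invented for the example.

```python
# Milestones on the way to a diamond pickaxe, in the usual crafting order.
# The reward values are hypothetical, chosen only to illustrate the idea.
MILESTONES = [
    ("log", 1.0),
    ("planks", 1.0),
    ("crafting_table", 1.0),
    ("wooden_pickaxe", 2.0),
    ("cobblestone", 2.0),
    ("stone_pickaxe", 4.0),
    ("furnace", 4.0),
    ("iron_ore", 8.0),
    ("iron_ingot", 8.0),
    ("iron_pickaxe", 16.0),
    ("diamond", 32.0),
    ("diamond_pickaxe", 64.0),
]
REWARD = dict(MILESTONES)

def episode_return(items_obtained):
    """Sum rewards, paying each milestone only the first time it is reached."""
    seen = set()
    total = 0.0
    for item in items_obtained:
        if item in REWARD and item not in seen:
            seen.add(item)
            total += REWARD[item]
    return total

def success_rate(episodes, goal="diamond_pickaxe"):
    """Fraction of episodes in which the goal item was obtained."""
    return sum(goal in ep for ep in episodes) / len(episodes)

# 40 made-up episodes, one of which reaches the goal: 1 / 40 = 2.5%.
full_run = [name for name, _ in MILESTONES]
episodes = [["log", "planks", "log"]] * 39 + [full_run]
rate = success_rate(episodes)
```

Paying each milestone only once keeps an agent from collecting reward by chopping the same kind of log over and over, and a 2.5% success rate simply means one successful game out of every forty.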

But why teach an AI Minecraft?

Of course, OpenAI didn’t do all of this just to create a “superhuman” Minecraft player, although its researchers estimate they could gather a million hours of gameplay to perfect their model. No, above all VPT could pave the way for a new way of teaching an AI to “act” like a human, step by step.

“The results presented in this paper help pave the way for exploiting the abundance of unlabeled data on the Web for sequential decision domains,” we can read in the conclusion of OpenAI’s scientific paper.

To better understand this somewhat dry sentence, let’s leave Minecraft and take a look at Photoshop. One can well imagine a similar artificial intelligence, trained with VPT on the thousands of Photoshop tutorials on the web, that would learn to move through the application’s menus, click, apply filters, retouch a photo… It would make the work of some graphic designers a lot easier!

Source: OpenAI
