Action Learning Test

The goal of an agent in this test is to learn to recognize an action from two pictures, “before” and “after”. The estimated difficulty of the test corresponds to the level of a 6-7-year-old child.

 

The test consists of several steps. At each step the agent is presented with two pictures that show a situation as it develops over time.

Picture 1 shows a ball falling onto a surface.

 

After showing the two-part picture, the platform asks the agent “What is it?”. The agent must either name the action it recognizes or say “I don’t know”. If the agent names the action correctly, it is given one point. If the guess is incorrect, the platform responds with “No, it is <right action>”. If the agent says “I don’t know”, the platform simply names the right action.
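The scoring rule for a single step can be sketched in Python as follows. This is only an illustration, not the platform's actual code. The function name score_step, the DONT_KNOW constant, and the assumption that the agent's answer arrives as a bare action name compared case-insensitively are all hypothetical.

```python
from typing import Tuple

# Assumed canonical form of the agent's "no answer" reply (hypothetical).
DONT_KNOW = "I don't know"


def score_step(answer: str, right_action: str) -> Tuple[int, str]:
    """Return (points, platform reply) for one step of the test."""
    normalized = answer.strip().lower()
    if normalized == DONT_KNOW.lower():
        # No guess was made: no point, the platform names the action.
        return 0, f"It's {right_action}"
    if normalized == right_action.lower():
        # Correct guess: one point.
        return 1, "Correct!"
    # Wrong guess: no point, the platform corrects the agent.
    return 0, f"No, it is {right_action}"
```

Under these assumptions, score_step("falling", "bouncing") gives no point and the reply “No, it is bouncing”.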

 

A test session covers a number of different actions, for example 20 actions distributed over 50 steps. In each session the actions and their variations are randomly sampled from an overall set of 100 actions, so an agent is practically never given the same session twice. This prevents a developer from fine-tuning the agent’s algorithm to a specific test set.
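One way such a session could be drawn is sketched below. The sampling scheme is an assumption: the text only says that actions are sampled at random and distributed over the steps. Here every chosen action appears at least once and the remaining steps are filled uniformly at random. The name sample_session and its parameters are hypothetical, and variations of an action are not modelled.

```python
import random
from typing import List, Optional, Sequence


def sample_session(
    action_pool: Sequence[str],
    n_actions: int = 20,
    n_steps: int = 50,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Return the list of right actions, one per step, for a new session."""
    if n_steps < n_actions:
        raise ValueError("n_steps must be at least n_actions")
    rng = rng or random.Random()
    # Pick the distinct actions used in this session.
    chosen = rng.sample(list(action_pool), n_actions)
    # Each chosen action appears once; the rest of the steps repeat them at random.
    steps = chosen + [rng.choice(chosen) for _ in range(n_steps - n_actions)]
    rng.shuffle(steps)
    return steps


# Example with a placeholder pool of 100 action names.
pool = [f"action_{i}" for i in range(100)]
session = sample_session(pool)
print(len(session), len(set(session)))
```

Passing a seeded random.Random makes a session reproducible for debugging, while the default draws a fresh session each time.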

 

Here is a 10-step demo run of the test:

Step 1:

 

Platform: What is it?

Agent: I don’t know

Platform: it’s falling

 

 

Step 2:

 

Platform: What is it?

Agent: I don’t know

Platform: it’s toppling

 

 

Step 3:

 

Platform: What is it?

Agent: It's falling

Platform: correct!

 

 

Step 4:

 

Platform: What is it?

Agent: It's falling

Platform: no, it's bouncing

 

 

Step 5:

 

Platform: What is it?

Agent: It's toppling

Platform: correct!

 

 

Step 6:

 

Platform: What is it?

Agent: I don't know

Platform: it's explosion

 

 

Step 7:

 

Platform: What is it?

Agent: It's bouncing

Platform: correct!

 

 

Step 8:

 

Platform: What is it?

Agent: It's explosion

Platform: correct!

 

 

Step 9:

 

Platform: What is it?

Agent: I don't know

Platform: it's rolling down

 

 

Step 10:

 

Platform: What is it?

Agent: I don't know

Platform: it's recovery from explosion

 

 

In this session the agent scored 4 points, since it named the action correctly 4 times (steps 3, 5, 7 and 8). The goal is to earn as many points as possible.
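The tally for the demo run can be checked with a few lines of Python. The list below transcribes the ten steps as (agent answer, right action) pairs, with the “It's” prefix dropped from the agent's answers for simplicity. This representation is an assumption made for the example.

```python
# (agent answer, right action) for each of the ten demo steps.
demo = [
    ("I don't know", "falling"),
    ("I don't know", "toppling"),
    ("falling", "falling"),
    ("falling", "bouncing"),
    ("toppling", "toppling"),
    ("I don't know", "explosion"),
    ("bouncing", "bouncing"),
    ("explosion", "explosion"),
    ("I don't know", "rolling down"),
    ("I don't know", "recovery from explosion"),
]

# One point for every step where the named action matches the right one.
score = sum(1 for answer, right in demo if answer == right)
print(score)  # 4
```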