fchollet / ARC
The Abstraction and Reasoning Corpus
License: Apache License 2.0
My reasoning.
At this point I looked at the expected result.
I'm quite surprised to see that the water flows into the hole, but not around the entire structure. It seems like a mistake: there is no example among the train pairs that shows this behavior.
Suggestion: tweak one of the train pairs so it demonstrates what to do in similar scenarios.
In 80af3007.json, all train grids are 18x16, but the test grid is 19x17. Is that intentional?
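A quick way to check such dimension mismatches is to print the shape of every grid in a task. A minimal sketch (the helper names are my own, and a toy task dict stands in for the real 80af3007.json file):

```python
def grid_shape(grid):
    """Return (rows, cols) of an ARC grid (a list of equal-length rows)."""
    return (len(grid), len(grid[0]) if grid else 0)

def report_shapes(task):
    """Print the input/output shape of every train and test pair in a task dict."""
    for split in ("train", "test"):
        for i, pair in enumerate(task[split]):
            print(split, i, grid_shape(pair["input"]), "->", grid_shape(pair["output"]))

# Toy task standing in for the real 80af3007.json (not loaded here):
toy = {
    "train": [{"input": [[0] * 16] * 18, "output": [[0] * 16] * 18}],
    "test":  [{"input": [[0] * 17] * 19, "output": [[0] * 17] * 19}],
}
report_shapes(toy)  # the test grid's shape differs from every train grid's
```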
Perhaps I'm simply misunderstanding the paper, but on page 46 it says the following regarding this repository:
- Focus on measuring developer-aware generalization, rather than task-specific skill, by
only featuring novel tasks in the evaluation set
It also says:
All tasks are unique, and the set of test tasks and the set of training tasks are disjoint.
I'm not entirely sure what is meant by "disjoint", "novel" or "unique" here (unique in the sense that the input/output pairs differ, or in the sense that the implicit goal differs?), but I'm guessing the latter.
If so, then I think I found 6 tasks that have the same implicit goal.
They are the following:
data/training/ff805c23.json
data/training/dc0a314f.json
data/training/9ecd008a.json
data/evaluation/67b4a34d.json
data/evaluation/f4081712.json
data/evaluation/e66aafb8.json
The implicit goal can be described as something like "fill the largest homogeneously-colored rectangle with the pattern that is symmetrically opposite to its location", i.e. basically what is shown on figure 4 of the paper.
I found two of those tasks manually by chance, and to confirm that there was repetition I wrote a program that implements this particular algorithm and applied it to all the tasks in the public training and evaluation sets.
The program I wrote only tries to solve a task if certain preconditions hold.
This relatively simple skipping criterion (which seems to be required anyway, since otherwise the program would surely fail on the test input unless it matched by pure chance) successfully skips all the publicly available tasks that are not meant to be solved with this algorithm, as the program never fails on the test inputs that it does try to solve.
If you want to, you can find the actual source code and instructions for compiling and running it here: https://github.com/wizeman/ARC/tree/solve/ocaml
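For reference, the core of such a symmetry-fill solver can be sketched in a few lines of Python. This is my own illustrative sketch, not the OCaml code linked above; it assumes the occluded rectangle is painted in a single known color and the underlying pattern really is mirror-symmetric:

```python
def restore_by_symmetry(grid, hole_color):
    """Fill cells painted hole_color using the grid's mirror symmetries.

    For each occluded cell, try the horizontally, vertically, and doubly
    mirrored positions and copy the first non-hole value found.
    """
    h, w = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for r in range(h):
        for c in range(w):
            if out[r][c] != hole_color:
                continue
            for rr, cc in ((r, w - 1 - c), (h - 1 - r, c), (h - 1 - r, w - 1 - c)):
                if grid[rr][cc] != hole_color:
                    out[r][c] = grid[rr][cc]
                    break
    return out

# A tiny left-right-symmetric grid with one cell occluded by color 9:
print(restore_by_symmetry([[9, 2, 2, 1], [1, 2, 2, 1]], 9))
```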
Important note: the data/evaluation/e66aafb8.json file seems to be the same task that is shown in figure 4 of the paper, so if you end up deciding to delete any of these files for not being unique, I guess you wouldn't want to delete this one!
Could you please explain the rule used to generate the output in the following task: https://github.com/fchollet/ARC/blob/master/data/training/ef135b50.json ?
Demonstrations 1 and 2 suggest "Paint all black fields lying between two red fields dark red", but then demonstration 3 would look different, and the solution to the actual task would look different as well.
I have found the Neural Abstract Reasoner paper: https://arxiv.org/pdf/2011.09860.pdf. They claim about 80% accuracy on this dataset. @fchollet, do you think they really got 80% accuracy on such a hard dataset? It looks suspicious to me. I can't find code, submissions on Kaggle, or references in their paper.
Not exactly an issue, but most of the tasks in the dataset seem to have a single unambiguous solution, while this one and a couple of others don't.
According to the rule, we have to combine shapes in a way such that we get a complete rectangle. Rotation is allowed. With that in mind, a solution like this should also be correct.
The reasoning seems to be like this:
Following the reasoning, the prediction is:
However the expected output is supposed to be the following. I have highlighted the problem area.
Suggestion: replace the test output with my prediction, so that this task becomes solvable. Otherwise I doubt this task is solvable.
The shape has to be horizontally symmetric.
However, in train output 3, it's asymmetric due to a single pixel.
Credit to user a on the lab42 Discord for reporting this.
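A single-pixel symmetry break like this is easy to locate programmatically. A hedged sketch (the function name is my own) that lists the mismatched cell pairs of a grid that is supposed to be left-right symmetric:

```python
def asymmetric_cells(grid):
    """Return the cell pairs where a grid breaks left-right symmetry.

    Each result is ((r, c), (r, mirror_c)) for a mismatched pair, listing
    each pair once. Useful for spotting single-pixel errors like the one
    reported in train output 3.
    """
    w = len(grid[0])
    bad = []
    for r, row in enumerate(grid):
        for c in range(w // 2):
            if row[c] != row[w - 1 - c]:
                bad.append(((r, c), (r, w - 1 - c)))
    return bad

print(asymmetric_cells([[1, 2, 2, 1], [3, 0, 0, 3]]))  # symmetric: no mismatches
print(asymmetric_cells([[1, 2, 2, 3]]))                # one mismatched pair
```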
@fchollet as people are now coding against this, and as "we intend to keep adding new tasks to ARC in the future", can we please version/tag this repo, ideally with release notes listing ids of new or modified tasks.
A number of the examples imply motion in the 2D plane. I wonder if it would be useful to include simple examples that imply motion in 3D. That is, imagine an input grid with an object, where the output contains the same object scaled up (each pixel of the input object becomes n×n in the output, or vice versa) to simulate motion toward or away from the observer.
This could also be combined with 2D motion.
I can't find any research on this.
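The scaling operation described above is straightforward to implement. A minimal sketch, assuming grids are lists of lists of color ints as in the ARC JSON files (the function name is my own):

```python
def scale_grid(grid, n):
    """Scale each pixel of a grid to an n x n block (n >= 1), simulating
    motion toward the observer; n = 1 returns an unchanged copy."""
    out = []
    for row in grid:
        scaled_row = [cell for cell in row for _ in range(n)]
        out.extend([scaled_row[:] for _ in range(n)])
    return out

print(scale_grid([[1, 2]], 2))  # each pixel becomes a 2x2 block
```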
I used 2 attempts at solving this task. By making it more explicit, I think it can be reduced to 1 attempt.
At first glance, it seems the top-right object should be colored red, like this. But that's incorrect.
My second take: objects with identical shapes get colored blue, and odd shapes get colored red. This is correct.
To make it less ambiguous, I suggest rotating the first training pair by 180 degrees, so it looks like this. That rules out the failed attempt 1.
This issue is also reported here: volotat/ARC-Game#4
The output repeats the shape, but the repetition is inconsistent in the first training pair: the light blue pixel at the bottom of the light blue shape is misplaced by 1 pixel.
This task has many training pairs to learn from, so no worries. The real world is rarely perfect.
I was unable to solve this task.
My reasoning.
My mistake was that I assumed the output maintains the same repeating tile, and thus that the red pixels would wrap around the edge.
Suggestion: add a train pair that demonstrates that the tiles are non-repeating.
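A validator could check the repeating-tile assumption directly. A sketch (names are my own) that tests whether a grid is a periodic tiling with a given period:

```python
def is_tiling(grid, tile_h, tile_w):
    """Check whether grid is a periodic tiling with period (tile_h, tile_w),
    i.e. every cell equals the corresponding cell of the top-left tile."""
    h, w = len(grid), len(grid[0])
    return all(
        grid[r][c] == grid[r % tile_h][c % tile_w]
        for r in range(h)
        for c in range(w)
    )

print(is_tiling([[1, 2, 1, 2], [3, 4, 3, 4]], 2, 2))  # a clean 2x2 tiling
```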
In the app, would it be possible to let us create our own tasks?
I guess such a tool was used to create the existing tasks; could it be possible to share it?
I think it'd be a great idea to encourage people to take the test, and showcase this link somewhere: http://arc-editor.lab42.global
The Kaggle competition to solve this dataset has now finished, but it contains a wealth of knowledge about different approaches for trying to solve it. It is probably worth including a link in the README.md.
The ARC testing interface does not load a file if it was loaded before selecting a random task.
To replicate, perform the following twice:
The last step should load the selected file, but instead does nothing.
Hi There!
This is a really cool corpus, @fchollet :-)
I'm wondering if a version of this task could be adapted for this ICLR workshop, which centers on building a big set of sequence-to-sequence benchmarks:
https://iclr.cc/Conferences/2021/Schedule?showEvent=2147
https://github.com/google/BIG-bench/
I think I could adapt it in a few hours, and it could be a good candidate for addition, but I wanted to check to see if that would be okay with you.
Jack
While porting the ARC testing framework to an OpenAI Gym environment, my random operation testing revealed an issue with the fill operation. Given a certain combination of operations, the environment runs into a recursion error.
This has been validated both in my Python implementation (which was ported straight from the JS environment) and in the JS web interface of ARC testing.
The commands on each line correspond to either fill or edit, followed by the row and col (indexed from 0) and the color selected from the bottom palette (also indexed from 0).
EDIT 0 2 1
EDIT 0 2 2
FILL 0 1 5
EDIT 2 1 5
EDIT 0 1 6
FILL 2 1 3
FILL 2 1 4
FILL 2 1 2
EDIT 0 2 3
FILL 0 0 6
EDIT 0 0 3
FILL 0 1 5
EDIT 2 0 3
> maximum recursion depth exceeded in comparison
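A "maximum recursion depth exceeded" error in a fill operation typically points to a recursive flood fill. The usual fix is to use an explicit stack instead of recursion; a minimal Python sketch (my own illustration, not the actual ARC interface code):

```python
def flood_fill(grid, r, c, new_color):
    """Iterative 4-neighbour flood fill using an explicit stack, which avoids
    Python's recursion limit on large or snaking regions."""
    old = grid[r][c]
    if old == new_color:
        return grid  # nothing to do; also prevents an infinite loop
    stack = [(r, c)]
    while stack:
        y, x = stack.pop()
        if 0 <= y < len(grid) and 0 <= x < len(grid[0]) and grid[y][x] == old:
            grid[y][x] = new_color
            stack.extend([(y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)])
    return grid

print(flood_fill([[0, 0], [0, 1]], 0, 0, 5))  # fills the connected 0-region
```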
My model told me that:
evaluation/20818e16.json ['train'][2]['output'] - size (9,8) => should be (8,8)
evaluation/b0f4d537.json ['train'][0]['output'] - the middle line needs to move 1px to the right
It seems to be true =)
@fchollet An intelligent agent should be able to recognize focal objects versus background, even if background color might vary among training examples.
There are various examples of deep-learned systems that recognize B&W digits well but fail spectacularly when colors are simply inverted. I would like to see an intelligent ARC-solver that is robust to this.
All the ARC examples I have seen so far have black backgrounds. Thus, I propose that ARC be augmented with examples with different background colors: sometimes yellow, sometimes white, and so on.
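Such an augmentation is cheap to implement on the existing JSON grids. A sketch, assuming color 0 is the background as in the current tasks (the function name is my own):

```python
def remap_background(grid, new_bg, old_bg=0):
    """Swap the background color: cells equal to old_bg become new_bg and
    vice versa, leaving all other colors untouched. A simple way to augment
    ARC tasks with varied backgrounds."""
    swap = {old_bg: new_bg, new_bg: old_bg}
    return [[swap.get(cell, cell) for cell in row] for row in grid]

# Turn a black (0) background yellow (4), keeping the foreground colors:
print(remap_background([[0, 1], [4, 0]], 4))
```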
I've created a gh-pages branch for this repo at:
https://github.com/iRyanBell/ARC/tree/gh-pages
GitHub Pages: https://iryanbell.github.io/ARC/
Hi,
Is there a plan to make software that would convert real-world scenes/tasks to/from ARC tasks?
I guess this is not trivial to do, but it would be worthwhile: AI researchers would be much more interested in ARC if they saw that, via such a converter, ARC lets different general AI algorithms be tested on real-world usage.
Best regards,
Lajos
Clicking on apps/testing_interface.html just shows me the HTML source as a GitHub page.
Since this repo seems unmaintained at the moment, with quite a few pending PRs and open issues relating to bugs in the tasks -- is anyone aware of another repo which applies all of these fixes?
It seems that dc433765's train 5 == test 0:
import json

filename = "../ARC/data/training/dc433765.json"
with open(filename, "r") as jsonFile:
    data = json.load(jsonFile)

print(data['train'][5]['input'] == data['test'][0]['input'])
print(data['train'][5]['output'] == data['test'][0]['output'])
produces
True
True
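The same check generalizes to the whole dataset by applying a small helper to every loaded task dict. A sketch (the function name is my own):

```python
def duplicate_pairs(task):
    """Return (train_index, test_index) tuples for train pairs that are
    identical to a test pair in one ARC task dict. Applying this to every
    loaded JSON file would flag duplicates like the one in dc433765."""
    dups = []
    for i, tr in enumerate(task["train"]):
        for j, te in enumerate(task["test"]):
            if tr["input"] == te["input"] and tr["output"] == te["output"]:
                dups.append((i, j))
    return dups

# Toy task: the second train pair duplicates the only test pair.
toy = {
    "train": [{"input": [[1]], "output": [[2]]},
              {"input": [[3]], "output": [[4]]}],
    "test":  [{"input": [[3]], "output": [[4]]}],
}
print(duplicate_pairs(toy))
```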