fchollet / ARC
The Abstraction and Reasoning Corpus
License: Apache License 2.0
My reasoning.
At this point I looked at the expected result.
I'm quite surprised to see that the water flows into the hole, but not around the entire structure. It seems like a mistake: there is no example among the train pairs that shows this behavior.
Suggestion: tweak one of the train pairs so it demonstrates what to do in similar scenarios.
In 80af3007.json, all train grids are 18x16, but the test grid is 19x17. Is that intentional?
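A quick way to check such dimension mismatches is to print the shape of every grid in a task. A minimal sketch (the helper names are my own, and a toy task dict stands in for the real 80af3007.json file):

```python
def grid_shape(grid):
    """Return (rows, cols) of an ARC grid (a list of equal-length rows)."""
    return (len(grid), len(grid[0]) if grid else 0)

def report_shapes(task):
    """Print the input/output shape of every train and test pair in a task dict."""
    for split in ("train", "test"):
        for i, pair in enumerate(task[split]):
            print(split, i, grid_shape(pair["input"]), "->", grid_shape(pair["output"]))

# Toy task standing in for the real 80af3007.json (not loaded here):
toy = {
    "train": [{"input": [[0] * 16] * 18, "output": [[0] * 16] * 18}],
    "test":  [{"input": [[0] * 17] * 19, "output": [[0] * 17] * 19}],
}
report_shapes(toy)  # the test grid's shape differs from every train grid's
```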
Perhaps I'm simply misunderstanding the paper, but on page 46 it says the following regarding this repository:
- Focus on measuring developer-aware generalization, rather than task-specific skill, by
only featuring novel tasks in the evaluation set
It also says:
All tasks are unique, and the set of test tasks and the set of training tasks are disjoint.
I'm not entirely sure what is meant by "disjoint", "novel" or "unique" here (unique in the sense that the input/output pairs differ, or in the sense that the implicit goal differs?), but I'm guessing the latter.
If so, then I think I found 6 tasks that have the same implicit goal.
They are the following:
data/training/ff805c23.json
data/training/dc0a314f.json
data/training/9ecd008a.json
data/evaluation/67b4a34d.json
data/evaluation/f4081712.json
data/evaluation/e66aafb8.json
The implicit goal can be described as something like "fill the largest homogeneously-colored rectangle with the pattern that is symmetrically opposite to its location", i.e. basically what is shown on figure 4 of the paper.
I found two of those tasks manually by chance, and to confirm that there was repetition I wrote a program that implements this particular algorithm and applied it to all the tasks in the public training and evaluation sets.
The program I wrote only tries to solve a task if certain preconditions hold.
This relatively simple skipping criterion (which seems to be required anyway, since otherwise the program would surely fail on the test input unless it matched by pure chance) successfully skips all the publicly available tasks that are not meant to be solved with this algorithm, as the program never fails on the test inputs that it does try to solve.
If you want to, you can find the actual source code and instructions for compiling and running it here: https://github.com/wizeman/ARC/tree/solve/ocaml
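For reference, the core of such a symmetry-fill solver can be sketched in a few lines of Python. This is my own illustrative sketch, not the OCaml code linked above; it assumes the occluded rectangle is painted in a single known color and the underlying pattern really is mirror-symmetric:

```python
def restore_by_symmetry(grid, hole_color):
    """Fill cells painted hole_color using the grid's mirror symmetries.

    For each occluded cell, try the horizontally, vertically, and doubly
    mirrored positions and copy the first non-hole value found.
    """
    h, w = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for r in range(h):
        for c in range(w):
            if out[r][c] != hole_color:
                continue
            for rr, cc in ((r, w - 1 - c), (h - 1 - r, c), (h - 1 - r, w - 1 - c)):
                if grid[rr][cc] != hole_color:
                    out[r][c] = grid[rr][cc]
                    break
    return out

# A tiny left-right-symmetric grid with one cell occluded by color 9:
print(restore_by_symmetry([[9, 2, 2, 1], [1, 2, 2, 1]], 9))
```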
Important note: the data/evaluation/e66aafb8.json file seems to be the same task that is shown in figure 4 of the paper, so if you end up deciding to delete any of these files for not being unique, I guess you wouldn't want to delete this one!
Could you please explain the rule used to generate the output in the following task: https://github.com/fchollet/ARC/blob/master/data/training/ef135b50.json ?
Demonstrations 1 and 2 suggest "Paint all black fields lying between two red fields dark red", but then demonstration 3 would look different, and the solution to the actual task would look different as well.
I have found the Neural Abstract Reasoner paper: https://arxiv.org/pdf/2011.09860.pdf. They claim about 80% accuracy on this dataset. @fchollet, do you think they really got 80% accuracy on such a hard dataset? It looks suspicious to me. I can't find code, submissions on Kaggle, or references in their paper.
Not exactly an issue, but most of the tasks in the dataset seem to have a single unambiguous solution, while this one and a couple of others don't.
According to the rule, we have to combine shapes in a way such that we get a complete rectangle. Rotation is allowed. With that in mind, a solution like this should also be correct.
The reasoning seems to be like this:
Following the reasoning, the prediction is:
However the expected output is supposed to be the following. I have highlighted the problem area.
Suggestion: replace the test output with my prediction, so that this task becomes solvable. Otherwise I doubt this task is solvable.
The shape has to be horizontally symmetric.
However, in train output 3, it's asymmetric due to a single pixel.
Credit to user a on the lab42 Discord for reporting this.
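A single-pixel symmetry break like this is easy to locate programmatically. A hedged sketch (the function name is my own) that lists the mismatched cell pairs of a grid that is supposed to be left-right symmetric:

```python
def asymmetric_cells(grid):
    """Return the cell pairs where a grid breaks left-right symmetry.

    Each result is ((r, c), (r, mirror_c)) for a mismatched pair, listing
    each pair once. Useful for spotting single-pixel errors like the one
    reported in train output 3.
    """
    w = len(grid[0])
    bad = []
    for r, row in enumerate(grid):
        for c in range(w // 2):
            if row[c] != row[w - 1 - c]:
                bad.append(((r, c), (r, w - 1 - c)))
    return bad

print(asymmetric_cells([[1, 2, 2, 1], [3, 0, 0, 3]]))  # symmetric: no mismatches
print(asymmetric_cells([[1, 2, 2, 3]]))                # one mismatched pair
```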
@fchollet as people are now coding against this, and as "we intend to keep adding new tasks to ARC in the future", can we please version/tag this repo, ideally with release notes listing ids of new or modified tasks.
A number of the examples imply motion in the 2D plane. I wonder if it would be useful to include simple examples that imply motion in 3D. That is, imagine an input grid with an object, where the output contains the same object scaled up (each pixel of the input object becomes n×n in the output, or vice versa) to simulate motion toward or away from the observer.
This could also be combined with 2D motion.
I can't find any research on this.
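The scaling operation described above is straightforward to implement. A minimal sketch, assuming grids are lists of lists of color ints as in the ARC JSON files (the function name is my own):

```python
def scale_grid(grid, n):
    """Scale each pixel of a grid to an n x n block (n >= 1), simulating
    motion toward the observer; n = 1 returns an unchanged copy."""
    out = []
    for row in grid:
        scaled_row = [cell for cell in row for _ in range(n)]
        out.extend([scaled_row[:] for _ in range(n)])
    return out

print(scale_grid([[1, 2]], 2))  # each pixel becomes a 2x2 block
```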
I used 2 attempts at solving this task. By making it more explicit, I think it can be reduced to 1 attempt.
At first glance, it seems the top-right object should be colored red, like this. But that's incorrect.
My second take: objects with identical shapes get colored blue, and odd shapes get colored red. This is correct.
To make it less ambiguous, I suggest rotating the first training pair by 180 degrees, so it looks like this. That rules out the failed attempt 1.
This issue is also reported here: volotat/ARC-Game#4
The output repeats the shape, but the repetition is inconsistent in the first training pair: the light blue pixel at the bottom of the light blue shape is misplaced by 1 pixel.
This task has many training pairs to learn from, so no worries. The real world is rarely perfect.
I was unable to solve this task.
My reasoning.
My mistake was that I assumed the output maintains the same repeating tile, and thus that the red pixels would wrap around the edge.
Suggestion: add a train pair that demonstrates that the tiles are non-repeating.
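A validator could check the repeating-tile assumption directly. A sketch (names are my own) that tests whether a grid is a periodic tiling with a given period:

```python
def is_tiling(grid, tile_h, tile_w):
    """Check whether grid is a periodic tiling with period (tile_h, tile_w),
    i.e. every cell equals the corresponding cell of the top-left tile."""
    h, w = len(grid), len(grid[0])
    return all(
        grid[r][c] == grid[r % tile_h][c % tile_w]
        for r in range(h)
        for c in range(w)
    )

print(is_tiling([[1, 2, 1, 2], [3, 4, 3, 4]], 2, 2))  # a clean 2x2 tiling
```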
In the app, would it be possible to let us create our own tasks?
I guess such a tool was used to create the existing tasks; could it be possible to share it?
I think it'd be a great idea to encourage people to take the test, and showcase this link somewhere: http://arc-editor.lab42.global
The Kaggle competition to solve this dataset has now finished, but it contains a wealth of knowledge about different approaches for trying to solve it. It is probably worth including a link in the README.md.
The ARC testing interface does not load a file if it was loaded before selecting a random task.
To replicate, perform the following twice:
The last step should load the selected file, but instead does nothing.
Hi There!
This is a really cool corpus, @fchollet :-)
I'm wondering if a version of this task could be adapted for this ICLR workshop, which centers on building a big set of sequence-to-sequence benchmarks:
https://iclr.cc/Conferences/2021/Schedule?showEvent=2147
https://github.com/google/BIG-bench/
I think I could adapt it in a few hours, and it could be a good candidate for addition, but I wanted to check to see if that would be okay with you.
Jack
While porting the ARC testing framework to an OpenAI Gym environment, my random operation testing revealed an issue with the fill operation. Given a certain combination of operations, the environment runs into a recursion error.
This has been validated both in my Python implementation (which was ported straight from the JS environment) and in the JS web interface of ARC testing.
The commands on each line correspond to either fill or edit, followed by the row and col (indexed from 0) and the color selected from the bottom palette (also indexed from 0).
EDIT 0 2 1
EDIT 0 2 2
FILL 0 1 5
EDIT 2 1 5
EDIT 0 1 6
FILL 2 1 3
FILL 2 1 4
FILL 2 1 2
EDIT 0 2 3
FILL 0 0 6
EDIT 0 0 3
FILL 0 1 5
EDIT 2 0 3
> maximum recursion depth exceeded in comparison
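A "maximum recursion depth exceeded" error in a fill operation typically points to a recursive flood fill. The usual fix is to use an explicit stack instead of recursion; a minimal Python sketch (my own illustration, not the actual ARC interface code):

```python
def flood_fill(grid, r, c, new_color):
    """Iterative 4-neighbour flood fill using an explicit stack, which avoids
    Python's recursion limit on large or snaking regions."""
    old = grid[r][c]
    if old == new_color:
        return grid  # nothing to do; also prevents an infinite loop
    stack = [(r, c)]
    while stack:
        y, x = stack.pop()
        if 0 <= y < len(grid) and 0 <= x < len(grid[0]) and grid[y][x] == old:
            grid[y][x] = new_color
            stack.extend([(y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)])
    return grid

print(flood_fill([[0, 0], [0, 1]], 0, 0, 5))  # fills the connected 0-region
```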
My model told me that:
evaluation/20818e16.json ['train'][2]['output'] - size (9,8) => should be (8,8)
evaluation/b0f4d537.json ['train'][0]['output'] - the middle line needs to move 1px to the right
It seems to be true =)
@fchollet An intelligent agent should be able to recognize focal objects versus background, even if background color might vary among training examples.
There are various examples of deep-learned systems that recognize B&W digits well but fail spectacularly when colors are simply inverted. I would like to see an intelligent ARC-solver that is robust to this.
All the ARC examples I have seen so far have black backgrounds. Thus, I propose that ARC be augmented with examples with different background colors: sometimes yellow, sometimes white, and so on.
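Such an augmentation is cheap to implement on the existing JSON grids. A sketch, assuming color 0 is the background as in the current tasks (the function name is my own):

```python
def remap_background(grid, new_bg, old_bg=0):
    """Swap the background color: cells equal to old_bg become new_bg and
    vice versa, leaving all other colors untouched. A simple way to augment
    ARC tasks with varied backgrounds."""
    swap = {old_bg: new_bg, new_bg: old_bg}
    return [[swap.get(cell, cell) for cell in row] for row in grid]

# Turn a black (0) background yellow (4), keeping the foreground colors:
print(remap_background([[0, 1], [4, 0]], 4))
```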
I've created a gh-pages branch for this repo at:
https://github.com/iRyanBell/ARC/tree/gh-pages
GitHub Pages: https://iryanbell.github.io/ARC/
Hi,
Is there a plan to make software that would convert real-world scenes/tasks to/from ARC tasks?
I guess this is not trivial to do, but it would be worthwhile: AI researchers would be much more interested in ARC if they saw that, via such a converter, ARC lets different general AI algorithms be tested on real-world usage.
Best regards,
Lajos
Clicking on apps/testing_interface.html just shows me the HTML source as a GitHub page.
Since this repo seems unmaintained at the moment, with quite a few pending PRs and open issues relating to bugs in the tasks -- is anyone aware of another repo which applies all of these fixes?
It seems that dc433765's train 5 == test 0:
import json

filename = "../ARC/data/training/dc433765.json"
with open(filename, "r") as jsonFile:
    data = json.load(jsonFile)

print(data['train'][5]['input'] == data['test'][0]['input'])
print(data['train'][5]['output'] == data['test'][0]['output'])
produces
True
True
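The same check generalizes to the whole dataset by applying a small helper to every loaded task dict. A sketch (the function name is my own):

```python
def duplicate_pairs(task):
    """Return (train_index, test_index) tuples for train pairs that are
    identical to a test pair in one ARC task dict. Applying this to every
    loaded JSON file would flag duplicates like the one in dc433765."""
    dups = []
    for i, tr in enumerate(task["train"]):
        for j, te in enumerate(task["test"]):
            if tr["input"] == te["input"] and tr["output"] == te["output"]:
                dups.append((i, j))
    return dups

# Toy task: the second train pair duplicates the only test pair.
toy = {
    "train": [{"input": [[1]], "output": [[2]]},
              {"input": [[3]], "output": [[4]]}],
    "test":  [{"input": [[3]], "output": [[4]]}],
}
print(duplicate_pairs(toy))
```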