kzl / universal-computation

Stars: 242 | Watchers: 14 | Forks: 30 | Size: 86 KB

Official codebase for Pretrained Transformers as Universal Computation Engines.

Home Page: https://arxiv.org/abs/2103.05247

License: MIT License

Languages: Python 69.10%, Jupyter Notebook 30.90%

universal-computation's Issues

Not converging in xor calculation task

When I run the demo.py notebook provided in this repository, the XOR task converges.
But when I create and run a custom notebook, it does not converge and is almost 2x slower.

CIFAR10 evaluation

I was wondering if you could give more details about the CIFAR10 evaluation. I looked into the Trainer class and noticed that the test evaluation is based on the test_step parameter (line 81). I would like to know how you obtained the results in Table 1 with this code (i.e., which script you ran). Additionally, the CIFAR10 class has a small bug: drop_last (line 42) should be False for the test set; otherwise, the evaluation does not cover the original test set.
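To illustrate the drop_last point, here is a minimal sketch in plain Python (no PyTorch required) of how many test images are scored when the last incomplete batch is dropped. The helper name `evaluated_examples` and the batch sizes are hypothetical and not taken from the repository; the only fixed fact is that the CIFAR-10 test set has 10,000 images.

```python
def evaluated_examples(dataset_size, batch_size, drop_last):
    """Count how many examples a batched loader yields in one full pass."""
    full_batches, remainder = divmod(dataset_size, batch_size)
    if drop_last:
        # The trailing incomplete batch is discarded.
        return full_batches * batch_size
    # The trailing incomplete batch is kept, so every example is seen.
    return full_batches * batch_size + remainder


TEST_SET_SIZE = 10_000  # size of the standard CIFAR-10 test split

# Hypothetical batch sizes, chosen only for illustration.
for batch_size in (16, 64, 128, 256):
    kept = evaluated_examples(TEST_SET_SIZE, batch_size, drop_last=True)
    print(f"batch_size={batch_size}: {kept} evaluated, "
          f"{TEST_SET_SIZE - kept} silently skipped")
```

Whenever the batch size does not divide 10,000 evenly, a few test images are never scored, so the reported accuracy is computed on a slightly different set than the official one.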

Remote Homology Dataset size

In the datasets/remote_homology.py notebook, it says that there are 236224 examples (out of 242560, i.e. 97.39%). The dataset size according to the paper is 12312. I am a bit confused about where the number 236224 comes from.
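As a quick sanity check of the numbers quoted above, the following sketch only redoes the arithmetic. It does not explain the discrepancy; the variable names are made up for illustration.

```python
kept = 236224         # examples reported as kept
total = 242560        # examples reported in total
paper_size = 12312    # dataset size stated in the paper

dropped = total - kept
kept_pct = 100 * kept / total
ratio_to_paper = kept / paper_size

print(f"dropped: {dropped}")
print(f"kept: {kept_pct:.2f}%")
print(f"kept / paper size: {ratio_to_paper:.2f}x")
```

The percentage matches the 97.39% in the code output, and the kept count is roughly 19 times the size given in the paper, so the two figures are probably not counting the same thing.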

ListOps brackets are not tokenized

Hi,

A ListOps input of "[MAX 4 3 [ MIN 2 3 ] 1 0 ]" will get encoded as "MAX 4 3 MIN 2 3 1 0", so all brackets are removed, which makes the task unsolvable.
This is also described in google-research/long-range-arena#20.

How I became aware of this: in the paper, page 3 under ListOps, you write "models are fed 512 tokens of dimension 15".
However, there are 4 operations, 2 brackets, and 10 numbers, which would require dimension 16.
Checking the dataset code, there is one unused UNK token, 10 numbers, and 4 operations, which gives a vocabulary size of 15.
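To make the problem concrete, here is a small self-contained sketch. It is not the repository's tokenizer: the `tokenize` function and the vocabulary lists are hypothetical, and the operator names are the four standard ListOps operations. It shows that two expressions with different answers collapse to the same token sequence once brackets are dropped.

```python
OPS = ["MAX", "MIN", "MED", "SM"]       # the 4 ListOps operations
DIGITS = [str(d) for d in range(10)]    # the 10 numbers
BRACKETS = ["[", "]"]                   # the 2 brackets


def tokenize(expr, keep_brackets):
    """Split a ListOps string into tokens, optionally dropping brackets."""
    # Pad brackets with spaces so "[MAX" splits into "[" and "MAX".
    spaced = expr.replace("[", " [ ").replace("]", " ] ")
    tokens = spaced.split()
    if not keep_brackets:
        tokens = [t for t in tokens if t not in BRACKETS]
    return tokens


# Vocabulary as counted in the dataset code vs. what the task needs.
vocab_without_brackets = ["UNK"] + DIGITS + OPS   # 1 + 10 + 4 = 15
vocab_with_brackets = OPS + BRACKETS + DIGITS     # 4 + 2 + 10 = 16

a = "[MAX 1 [MIN 5 0 ] ]"   # max(1, min(5, 0)) = 1
b = "[MAX 1 [MIN 5 ] 0 ]"   # max(1, min(5), 0) = 5

print(tokenize(a, keep_brackets=False))
print(tokenize(b, keep_brackets=False))
print(tokenize(a, keep_brackets=False) == tokenize(b, keep_brackets=False))
print(tokenize(a, keep_brackets=True) == tokenize(b, keep_brackets=True))
```

Without brackets, both inputs look identical to the model even though their labels differ, so no model can tell them apart.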

Your code correctly reproduces the ~38% ListOps accuracy reported in the paper.

Best

Vision pretraining

Thanks for sharing your code! I was wondering if you could share the settings for reproducing the results with the ViT initialization mentioned in Section 3.2.
