kzl / universal-computation
Official codebase for Pretrained Transformers as Universal Computation Engines.
Home Page: https://arxiv.org/abs/2103.05247
License: MIT License
When I run demo.py, the notebook provided in this repository, the XOR task converges.
But when I create and run a custom notebook, it does not converge and it is almost 2x slower.
I was wondering if you could give more details about the evaluation on CIFAR10. I looked into the Trainer class and noticed that the test evaluation is based on the parameter test_step (line 81). I would like to know how you obtained your results in Table 1 based on this code (i.e., the running script). Additionally, the CIFAR10 class has a small bug: drop_last (line 42) should be False on the testing set; otherwise, it is not the original testing set.
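To illustrate the drop_last point, here is a minimal sketch in plain Python of how many test examples get evaluated when the last partial batch is dropped. The function name and the batch size of 64 are assumptions for illustration only, not values taken from the repository; the CIFAR-10 test set size of 10,000 is the standard one.

```python
def examples_evaluated(dataset_size: int, batch_size: int, drop_last: bool) -> int:
    """Count the examples a batched loader yields in one full pass.

    With drop_last=True, a final batch smaller than batch_size is discarded,
    so the tail of the dataset is never evaluated.
    """
    full_batches, remainder = divmod(dataset_size, batch_size)
    if drop_last:
        return full_batches * batch_size
    return full_batches * batch_size + remainder


CIFAR10_TEST_SIZE = 10_000  # standard CIFAR-10 test split
BATCH_SIZE = 64             # hypothetical batch size, chosen only for this example

kept = examples_evaluated(CIFAR10_TEST_SIZE, BATCH_SIZE, drop_last=True)
full = examples_evaluated(CIFAR10_TEST_SIZE, BATCH_SIZE, drop_last=False)
print(f"drop_last=True:  {kept} examples ({full - kept} skipped)")
print(f"drop_last=False: {full} examples")
```

Under these assumptions, any batch size that does not divide the test set evenly leaves some test images out of the reported accuracy.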
In the datasets/remote_homology.py notebook, it says that there are 236224 examples (out of 242560 -- 97.39%). The dataset size according to the paper is 12312. I am a bit confused about where the number 236224 comes from.
Hi,
A ListOps input of "[MAX 4 3 [ MIN 2 3 ] 1 0 ]" will get encoded as "MAX 4 3 MIN 2 3 1 0", so all brackets are removed, which makes the task unsolvable.
This is also described in google-research/long-range-arena#20.
How I became aware of this: in the paper, page 3, under ListOps, you write "models are fed 512 tokens of dimension 15".
However, there are 4 operations, 2 brackets and 10 numbers, which would require dimension 16.
Checking the dataset code, there is one unused UNK token, 10 numbers and 4 operations, which gives a vocabulary size of 15.
Your code correctly reproduces the ~38% ListOps accuracy reported in the paper.
Best
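A small sketch of why removing the brackets loses information: two expressions with different nesting and different answers map to the same token sequence once brackets are dropped. The tokenizer and evaluator below are hypothetical helpers written for this illustration, not the repository's dataset code, and only MAX and MIN are implemented.

```python
# Subset of the ListOps operators, enough to show the ambiguity.
OPS = {"MAX": max, "MIN": min}


def tokenize(expr: str, keep_brackets: bool) -> list[str]:
    """Split an expression into tokens, optionally discarding brackets."""
    tokens = expr.replace("[", " [ ").replace("]", " ] ").split()
    if keep_brackets:
        return tokens
    return [t for t in tokens if t not in ("[", "]")]


def evaluate(tokens: list[str]) -> int:
    """Evaluate a bracketed token list with a recursive-descent parser."""

    def parse(pos: int) -> tuple[int, int]:
        # tokens[pos] is "[" and tokens[pos + 1] is the operator name.
        op = OPS[tokens[pos + 1]]
        pos += 2
        args = []
        while tokens[pos] != "]":
            if tokens[pos] == "[":
                value, pos = parse(pos)
            else:
                value, pos = int(tokens[pos]), pos + 1
            args.append(value)
        return op(args), pos + 1  # skip the closing bracket

    value, _ = parse(0)
    return value


a = "[MIN 5 [MAX 1 9] 3]"
b = "[MIN 5 [MAX 1 9 3]]"

print(evaluate(tokenize(a, keep_brackets=True)))  # label for a
print(evaluate(tokenize(b, keep_brackets=True)))  # label for b
print(tokenize(a, keep_brackets=False) == tokenize(b, keep_brackets=False))
```

With brackets kept, the two inputs have different labels; with brackets stripped, the model sees identical inputs, so no model can predict both labels correctly.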
Thanks for sharing your code! I was wondering if you could share the settings for reproducing the results with ViT initialization mentioned in Section 3.2.