aks2203 / deep-thinking
A centralized place for deep thinking code and experiments
License: MIT License
The calls to eval() were removed, so adding problems, datasets, and models is done slightly differently now. This should be reflected in the README.
Hi, congrats on the NeurIPS pub! I was fiddling around with the codebase, and it appears that there is an assert which blocks running the model with more test iterations than max_iters, so the model can't be run for more iterations than it was trained for.
Additionally, it's present only for the feedforward_1D or 2D nets; DTNet seems fine. Do you have any idea why this is there? I assume it would prevent reproducing the OOD generalization results.
How I found it: I was testing a modified 2D model in a Jupyter notebook, and load_model_from_checkpoint raised the following error:
RuntimeError: Error(s) in loading state_dict for DTNet:
Missing key(s) in state_dict: "projection.0.weight", "recur_block.0.0.conv1.weight", "recur_block.0.0.conv2.weight", "recur_block.0.1.conv1.weight", "recur_block.0.1.conv2.weight", "head.0.weight", "head.2.weight", "head.4.weight".
Unexpected key(s) in state_dict: "module.projection.0.weight", "module.recur_block.0.0.conv1.weight", "module.recur_block.0.0.conv2.weight", "module.recur_block.0.1.conv1.weight", "module.recur_block.0.1.conv2.weight", "module.head.0.weight", "module.head.2.weight", "module.head.4.weight".
We looked into it, and believe this is relevant: https://discuss.pytorch.org/t/missing-keys-unexpected-keys-in-state-dict-when-loading-self-trained-model/22379
We resolved it by modifying load_model_from_checkpoint. This may also be an issue with the other models.
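In the error above, every unexpected key is a missing key with "module." prepended, which is the pattern usually seen when a checkpoint was saved from a model wrapped in torch.nn.DataParallel and is loaded into an unwrapped one. A minimal sketch of one way to handle it follows; strip_module_prefix is a hypothetical helper, not the change actually made to load_model_from_checkpoint.

```python
from collections import OrderedDict
from typing import Any, Mapping


def strip_module_prefix(state_dict: Mapping[str, Any],
                        prefix: str = "module.") -> "OrderedDict[str, Any]":
    """Return a copy of state_dict with a leading `prefix` removed from each key.

    Hypothetical helper: keys without the prefix are copied unchanged,
    so it is safe to call on checkpoints saved with or without a wrapper.
    """
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        new_key = key[len(prefix):] if key.startswith(prefix) else key
        cleaned[new_key] = value
    return cleaned


# Assumed usage (the checkpoint layout is an assumption, not taken from the repo):
#   net.load_state_dict(strip_module_prefix(saved_state_dict))
```

Only a leading prefix is removed, so a key that merely contains "module." somewhere in the middle is left alone.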
The Python function eval() is used in common.py, training_utils.py, and testing_utils.py. This should be changed.
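One common replacement for eval() is an explicit registry that maps a name from the config to a constructor. The sketch below only illustrates that pattern; MODEL_REGISTRY, register_model, get_model, and toy_net are all hypothetical names and do not describe how the repository resolves models.

```python
from typing import Any, Callable, Dict

# Hypothetical registry: maps a model name (e.g. from a config file) to a factory.
MODEL_REGISTRY: Dict[str, Callable[..., Any]] = {}


def register_model(name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Decorator that records a factory function under `name`."""
    def decorator(factory: Callable[..., Any]) -> Callable[..., Any]:
        MODEL_REGISTRY[name] = factory
        return factory
    return decorator


@register_model("toy_net")
def toy_net(width: int) -> Dict[str, Any]:
    # Stand-in for a real model constructor.
    return {"name": "toy_net", "width": width}


def get_model(name: str, **kwargs: Any) -> Any:
    """Look the name up in the registry instead of evaluating it as code."""
    try:
        factory = MODEL_REGISTRY[name]
    except KeyError:
        raise ValueError(
            f"Unknown model '{name}'. Available: {sorted(MODEL_REGISTRY)}"
        ) from None
    return factory(**kwargs)
```

Unlike eval(), an unknown or malformed name fails with a clear error and can never execute arbitrary code.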
deep-thinking/deepthinking/utils/training.py
Lines 42 to 47 in 6b8aa99
Subtle bug, but you sample two random variables, n and k. The distribution of the sum, n + k, will NOT be uniform.
For example, for max_iters=10, the distribution of the sum is heavily skewed, making the model worse at generalizing iteration-wise, especially for harder tasks which require learning harder dynamics.
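To make the skew concrete, the exact distribution of n + k can be computed by enumeration. The sketch below assumes n is drawn uniformly from 0..max_iters-1 and k uniformly from 1..max_iters-n; these ranges are an assumption about the referenced lines, not quoted from them, and sum_distribution is a hypothetical helper.

```python
from fractions import Fraction
from typing import Dict


def sum_distribution(max_iters: int) -> Dict[int, Fraction]:
    """Exact P(n + k = s) under the ASSUMED sampling scheme:
    n uniform on {0, ..., max_iters - 1}, k uniform on {1, ..., max_iters - n}.
    """
    dist = {s: Fraction(0) for s in range(1, max_iters + 1)}
    p_n = Fraction(1, max_iters)
    for n in range(max_iters):
        num_k = max_iters - n  # number of admissible values of k for this n
        for k in range(1, num_k + 1):
            dist[n + k] += p_n * Fraction(1, num_k)
    return dist


if __name__ == "__main__":
    for total, prob in sum_distribution(10).items():
        print(f"n + k = {total:2d}: {float(prob):.4f}")
```

Under that assumption, a sum of 1 can only arise from n = 0, k = 1, while a sum of max_iters is reachable from every n, so probability mass piles up at the largest totals.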
One solution might be to skew the sampling of k s.t. the sum is closer to a uniform distribution:
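    from random import randrange, uniform
    from typing import Tuple

    def get_skewed_n_and_k(max_iters: int) -> Tuple[int, int]:
        uniform_random = uniform(0, 1)
        skew = randrange(10, 50)  # skew exponent, resampled on every call
        n = randrange(0, max_iters)
        # Apply skewing transformation: a large exponent pushes k towards 1
        skewed_k = 1 + (max_iters - n) * uniform_random ** skew
        return n, int(skewed_k)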
Which yields:
Additionally, because the skew is also randomly sampled, the distribution shifts a little bit each time so as to provide better coverage for all iterations.
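To inspect the resulting distribution of n + k empirically, the proposed sampler can be run many times and the sums tallied. This is a minimal sketch: histogram_of_sums is a hypothetical helper, the sampler is repeated so the block is self-contained, and no particular output is claimed.

```python
from collections import Counter
from random import randrange, seed, uniform
from typing import Tuple


def get_skewed_n_and_k(max_iters: int) -> Tuple[int, int]:
    # Same sampler as proposed in the issue, repeated here for self-containment.
    uniform_random = uniform(0, 1)
    skew = randrange(10, 50)
    n = randrange(0, max_iters)
    skewed_k = 1 + (max_iters - n) * uniform_random ** skew
    return n, int(skewed_k)


def histogram_of_sums(max_iters: int, num_samples: int = 100_000) -> Counter:
    """Count how often each total n + k occurs over num_samples draws."""
    counts: Counter = Counter()
    for _ in range(num_samples):
        n, k = get_skewed_n_and_k(max_iters)
        counts[n + k] += 1
    return counts


if __name__ == "__main__":
    seed(0)  # fixed seed so repeated runs are comparable
    samples = 100_000
    hist = histogram_of_sums(10, samples)
    for total in sorted(hist):
        print(f"n + k = {total:2d}: {hist[total] / samples:.4f}")
```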
LMK if you want a PR @aks2203!
Have a great day!
Neel
Context: I'm trying to produce plots for prefix sums. I successfully created a pivot table (after installing tabulate) by running
python deepthinking/data_analysis/make_table.py results/output_default/
However, running the command
python deepthinking/data_analysis/make_schoop.py results/output_default/
produces an error.
It looks like there is a missing column test_acc_sem; I'm not sure why the pivot table code isn't adding it.