Comments (17)
Sure! This is the link to this particular run:
https://wandb.ai/thomaswang/lambo_replicate/runs/3shmruby?workspace=user-thomaswang
from lambo.
thanks for sharing. there are three things that account for the discrepancy here:
- For consistency with PyMOO I followed the convention in the code that all objectives are minimized, so you need to account for the sign difference for maximized properties like penalized logP.
- The `candidates/obj_val_*` field shows the best objective value within each query batch over time, so to transform it into a plot like the one shown in the paper you need to apply a `cummin` transform to show the best-so-far as a function of time.
- I recall there being pretty substantial variance in performance across seeds (which is why I plotted quantiles), so you'd want to run at least 5 trials, apply the `cummin` transform, and compute the quantiles to reproduce the plot in the paper.
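The post-processing described above can be sketched in NumPy. The array below is made-up stand-in data; in practice you would pull the logged `candidates/obj_val_*` values from the wandb API, and the shapes here (5 seeds × 50 rounds) are illustrative assumptions, not the actual export format:

```python
import numpy as np

# Hypothetical per-batch best obj_val values for 5 seeds over 50 rounds.
# obj_val follows the minimization convention (negative penalized logP).
rng = np.random.default_rng(0)
obj_val = -rng.uniform(0.0, 10.0, size=(5, 50))

best_so_far = np.minimum.accumulate(obj_val, axis=1)   # cummin over rounds
pen_logp = -best_so_far                                # undo the sign flip
quantiles = np.quantile(pen_logp, [0.4, 0.6, 0.8], axis=0)  # across seeds
```

Each row of `best_so_far` is non-increasing by construction, so `pen_logp` gives the best penalized logP found so far, and `quantiles` has one row per requested quantile.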
I will see what I can do about getting you a notebook to reproduce, but it may have to wait a bit while I deal with other work on my plate. In any case, I'm delighted you're taking the time to reproduce these experiments; if you have any further questions, don't hesitate to ask :)
The reason I raise this issue is that I tried to train a single-objective LaMBO model with the exact command from README:
python scripts/black_box_opt.py optimizer=lambo optimizer.encoder_obj=lanmt task=chem_lsbo tokenizer=selfies surrogate=single_task_svgp acquisition=ei encoder=lanmt_cnn
but this is the wandb logging I get for the penalized logP metric:
The black-box evaluations totaled 64 × 50 = 3,200, but the best score was just above 6, which differs from the results in Figure 10, so I wonder if I have missed some extra processing steps needed to reproduce the results. Thanks!
Sorry for the delayed response, would you mind sharing the link to the wandb data for your run?
Thank you for your suggestion! I will first fix these differences.
sounds good. note that the seed is fixed in the config, so you'll want to be sure to override it, e.g.
python scripts/black_box_opt.py -m optimizer=lambo optimizer.encoder_obj=lanmt task=chem_lsbo tokenizer=selfies surrogate=single_task_svgp acquisition=ei encoder=lanmt_cnn seed=1,2,3,4
one more thing: `obj_val_0` is actually the negative penalized logP, so you'll want to either apply `cummin` and negate, or negate and then apply `cummax`. I've edited my previous response to reflect this.
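The two orderings are equivalent, which a tiny NumPy check confirms (the trace values below are made up for illustration):

```python
import numpy as np

# Toy trace of obj_val_0 (negative penalized logP); values are invented.
obj_val_0 = np.array([-2.0, -1.5, -4.0, -3.0, -6.5])

a = -np.minimum.accumulate(obj_val_0)   # cummin, then negate
b = np.maximum.accumulate(-obj_val_0)   # negate, then cummax
assert np.array_equal(a, b)             # both give best penalized logP so far
```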
https://github.com/samuelstanton/lambo/blob/main/lambo/tasks/chem/chem.py#L105
Sure, thanks for the reminder! The experiments are still running. I also wonder whether it's reasonable that the single-objective experiments need 1 day 12 hours to finish, while the multi-objective experiments need just 5 hours. Intuitively, shouldn't the single-objective runs be faster than the multi-objective ones?
fair question. the single-objective experiment collects bigger batches of data over more rounds than the multi-objective experiments, so using exact GP inference would require a lot of GPU memory and would likely be numerically unstable. Instead, for this task I use a variational GP, which has a constant memory footprint and is more numerically stable for large datasets. Unfortunately variational GPs are fairly slow to train, which leads to the dramatic increase in runtime. There is probably room for optimization here; the current training recipe is tuned more for stability than speed.
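A back-of-the-envelope calculation illustrates the memory argument. Exact GP inference materializes an n × n covariance matrix over all n observations, while a sparse variational GP only needs an m × m covariance over m inducing points. The n ≈ 3,200 below comes from the evaluation budget discussed in this thread; m = 500 is an illustrative choice, not the value used in the lambo configs:

```python
# Rough single-matrix memory comparison (float64 entries assumed).
n, m, bytes_per_float = 3200, 500, 8

exact_gp_bytes = n * n * bytes_per_float  # n x n covariance for exact inference
svgp_bytes = m * m * bytes_per_float      # m x m inducing covariance for SVGP

print(f"exact GP: {exact_gp_bytes / 1e6:.0f} MB")  # ~82 MB for one matrix
print(f"SVGP:     {svgp_bytes / 1e6:.0f} MB")      # ~2 MB, independent of n
```

The constants multiply further for Cholesky factors and gradients, but the key point is that the SVGP footprint stays fixed as the dataset grows.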
Thanks for the reply! It makes sense now.
However, as I re-ran the experiments with seeds 1, 2, 3, 4, I found that the optimization performance is still below expectations. The wandb logs are the runs with ids 12, 13, 14, 15 in the project:
https://wandb.ai/thomaswang/lambo_replicate/groups/test/table?workspace=user-thomaswang
I did not apply the `cummin` transform to the logged outputs, but we can see that the minimum values of `obj_val_0` are around -7 in all runs, which corresponds to a penalized logP of about 7.
I wonder if there is some problem with the default configuration of this setting? Would it be possible for you to double-check the configuration? On my side, I will also double-check whether there is something wrong with my reproduction.
Thanks very much!
hm ok I'll take a look, thanks for raising the issue
Hi! I am also interested in the single-objective use case for LaMBO. Is there any update on reproducing the published numbers?
@samuelstanton I'm also having some trouble reproducing the results, I ran the following line:
python scripts/black_box_opt.py -m optimizer=lambo optimizer.encoder_obj=lanmt task=chem_lsbo tokenizer=selfies surrogate=single_task_svgp acquisition=ei encoder=lanmt_cnn seed=1,2,3,4
and, while the script is still running, I'm getting results similar to @Thomaswbt's above (in fact slightly worse).
It would be great to get an update on this, thank you!
Thank you all for your patience. I've determined that some of the default hyperparameters were indeed misconfigured and have updated the command in the README. That being said, the results I'm getting now are not quite what I expect, and I will continue to investigate. Here's what I'm getting now:
[Plot: 40%, 60%, and 80% quantiles across 5 seeds (0-4)]
While this is much better than the results you were seeing, and the algorithm does "solve" the problem for 3/5 seeds (i.e. it learns to output long hydrocarbon chains), this is not as good as what I was seeing before and is more sensitive to the random seed than I'd like. In any case, I wanted to share an update while I continue looking into this. I've also pushed the notebook I used to create these plots to `notebooks/plot_lsbo_comparison.ipynb`.
The major hypers that have been corrected are:
- `optimizer.window_size=1` --> `optimizer.window_size=8`: this hyperparameter controls how many corruptions are made to the seed sequence and can have a major effect when the optimal solution requires large increases to the sequence length.
- `surrogate.bs=32` --> `surrogate.bs=256`: with a larger dataset, increasing the batch size decreases the run time significantly; I was seeing about 6 hours per seed on an A100 after this change.
- `optimizer.resampling_weight=1.0` --> `optimizer.resampling_weight=0.5`: this change makes the optimizer sample "good" seeds more aggressively when constructing batches of candidates.
Increasing the max context length to 256 (`task.max_len=256`) improves performance on this benchmark, as I noted in the paper, but variance across seeds is still an issue.
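Putting the corrected overrides together with the original multirun command gives something like the following sketch. This assumes the updated README hasn't already folded the new values into the defaults; if it has, the explicit overrides are redundant but harmless:

```shell
python scripts/black_box_opt.py -m optimizer=lambo optimizer.encoder_obj=lanmt \
    task=chem_lsbo tokenizer=selfies surrogate=single_task_svgp acquisition=ei \
    encoder=lanmt_cnn optimizer.window_size=8 surrogate.bs=256 \
    optimizer.resampling_weight=0.5 task.max_len=256 seed=0,1,2,3,4
```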
Sorry for the late response. Thank you for your effort! Previously I also found that the choice of starting sequences matters a lot for the final results. I think I will close the issue.