
Comments (17)

Thomaswbt commented on June 26, 2024

Sure! This is the link to this particular run:
https://wandb.ai/thomaswang/lambo_replicate/runs/3shmruby?workspace=user-thomaswang

from lambo.

samuelstanton commented on June 26, 2024

thanks for sharing. there are three things that account for the discrepancy here:

  1. For consistency with PyMOO I followed the convention in the code that all objectives are minimized, so you need to account for the sign difference for maximized properties like penalized logP.

  2. the candidates/obj_val_* field shows the best objective value within each query batch over time, so to produce a plot like the one shown in the paper you need to apply a cummin transform to show the best-so-far value as a function of time.

  3. I recall there being pretty substantial variance in performance across seeds (which is why I plotted quantiles), so you'd want to run for at least 5 trials, apply the cummin transform, and compute the quantiles to reproduce the plot in the paper.

I will see what I can do about getting you a notebook to reproduce, but it may have to wait a bit while I deal with other work on my plate. In any case, I'm delighted you're taking the time to reproduce these experiments; if you have any further questions, don't hesitate to ask :)
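The three steps above can be sketched with numpy (the objective values here are made up for illustration, not taken from the actual runs):

```python
import numpy as np

# Hypothetical per-round best obj_val_0 values for 3 seeds over 5 rounds.
# Following the pymoo convention, obj_val_0 is *minimized* (negative penalized logP).
obj_val_0 = np.array([
    [-2.0, -1.5, -3.0, -2.5, -4.0],
    [-1.0, -2.5, -2.0, -3.5, -3.0],
    [-3.0, -2.0, -4.5, -4.0, -5.0],
])

best_so_far = np.minimum.accumulate(obj_val_0, axis=1)            # step 2: cummin over rounds
penalized_logp = -best_so_far                                     # step 1: flip the sign back
quantiles = np.quantile(penalized_logp, [0.4, 0.6, 0.8], axis=0)  # step 3: quantiles over seeds
```

Plotting each row of `quantiles` against the round index should then give curves comparable to the ones in the paper.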


Thomaswbt commented on June 26, 2024

The reason I raised this issue is that I tried to train a single-objective LaMBO model with the exact command from the README:
python scripts/black_box_opt.py optimizer=lambo optimizer.encoder_obj=lanmt task=chem_lsbo tokenizer=selfies surrogate=single_task_svgp acquisition=ei encoder=lanmt_cnn

but this is the wandb logging I get for the penalized logP metric:
[Screenshot: wandb penalized logP logging, 2023-03-07]

The black-box evaluations totaled 64 × 50 = 3.2k, but the best score was just above 6, which differs from the results in Figure 10, so I wonder if I have missed some extra processing steps needed to reproduce the results. Thanks!


samuelstanton commented on June 26, 2024

Sorry for the delayed response, would you mind sharing the link to the wandb data for your run?


Thomaswbt commented on June 26, 2024

Thank you for your suggestion! I will first fix these differences.


samuelstanton commented on June 26, 2024

sounds good. note that the seed is fixed in the config, so you'll want to be sure to override it, e.g.

python scripts/black_box_opt.py -m optimizer=lambo optimizer.encoder_obj=lanmt task=chem_lsbo tokenizer=selfies surrogate=single_task_svgp acquisition=ei encoder=lanmt_cnn seed=1,2,3,4


samuelstanton commented on June 26, 2024

one more thing: obj_val_0 is actually the negative penalized logP, so you'll want to either apply cummin and then negate, or negate and then apply cummax. I've edited my previous response to reflect this.

https://github.com/samuelstanton/lambo/blob/main/lambo/tasks/chem/chem.py#L105
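To see that the two orderings agree, a quick numpy check with dummy values:

```python
import numpy as np

obj_val_0 = np.array([-1.0, -0.5, -2.0, -1.5, -3.0])  # dummy minimized objective values

cummin_then_negate = -np.minimum.accumulate(obj_val_0)  # cummin, then flip sign
negate_then_cummax = np.maximum.accumulate(-obj_val_0)  # flip sign, then cummax
assert np.array_equal(cummin_then_negate, negate_then_cummax)
```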


Thomaswbt commented on June 26, 2024

Sure, thanks for the reminder! The experiments are still running. I also wonder whether it's reasonable that the single-objective experiments need 1 day 12 hours to finish, while the multi-objective experiments need just 5 hours. Intuitively, shouldn't the single-objective runs be faster than the multi-objective ones?


samuelstanton commented on June 26, 2024

fair question. The single-objective experiment collects bigger batches of data over more rounds than the multi-objective experiments, so using exact GP inference would require a lot of GPU memory and would likely be numerically unstable. Instead, for this task I use a variational GP, which has a constant memory footprint and is more numerically stable for large datasets. Unfortunately, variational GPs are fairly slow to train, which leads to the dramatic increase in runtime. There is probably room for optimization here; the current training recipe is tuned more for stability than speed.
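For a rough sense of the memory trade-off, here is a back-of-the-envelope sketch; the inducing-point count is a hypothetical number, not pulled from the repo's config:

```python
# Rough memory estimate: dense kernel matrix for exact GP inference vs. the
# inducing-point covariance of a variational (SVGP) model. Numbers are illustrative.
n = 64 * 50        # total black-box evaluations in the single-objective run
m = 64             # hypothetical number of inducing points
bytes_per_float = 8

exact_gp_mb = n * n * bytes_per_float / 1e6   # n x n kernel matrix
svgp_mb = m * m * bytes_per_float / 1e6       # m x m inducing covariance

print(f"exact GP kernel: ~{exact_gp_mb:.1f} MB, SVGP: ~{svgp_mb:.3f} MB")
```

The exact-GP matrix also has to be refactorized as every new batch arrives (cubic time in n), while the SVGP footprint stays fixed; the price is many stochastic gradient steps over the data, which is where the extra wall-clock time goes.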


Thomaswbt commented on June 26, 2024

Thanks for the reply! It makes sense now.

However, as I re-ran the experiments with seeds 1, 2, 3, and 4, I found that the optimization performance is still below expectations. The wandb logs are the runs with IDs 12, 13, 14, and 15 in the project:

https://wandb.ai/thomaswang/lambo_replicate/groups/test/table?workspace=user-thomaswang

I did not apply the cummin operation to the logged outputs, but we can see that the minimum values of obj_val_0 are around -7 in all runs, i.e. around 7 for penalized logP.

I wonder if there are some problems with the default configuration of this setting. Would it be possible for you to double-check the configuration? On my side, I will also double-check whether there is something wrong with my reproduction.

Thanks very much!


samuelstanton commented on June 26, 2024

hm ok I'll take a look, thanks for raising the issue


jasonkyuyim commented on June 26, 2024

Hi! I am also interested in the single-objective use case for LaMBO. Is there any update on reproducing the published numbers?


kirjner commented on June 26, 2024

@samuelstanton I'm also having some trouble reproducing the results. I ran the following line:
python scripts/black_box_opt.py -m optimizer=lambo optimizer.encoder_obj=lanmt task=chem_lsbo tokenizer=selfies surrogate=single_task_svgp acquisition=ei encoder=lanmt_cnn seed=1,2,3,4
and, while the script is still running, I'm getting results similar to @Thomaswbt's above (in fact slightly worse):
[Screenshot: wandb results, 2023-03-27]

It would be great to get an update on this, thank you!


samuelstanton commented on June 26, 2024

Thank you all for your patience. I've determined that some of the default hyperparameters were indeed misconfigured and have updated the command in the README. That being said, the results I'm getting now are not quite what I expect, and I will continue to investigate. Here's what I'm getting now:

[Image: 40%, 60%, and 80% quantiles across 5 seeds (0-4)]

[Image: performance by seed]

While this is much better than the results you were seeing, and the algorithm does "solve" the problem for 3/5 seeds (i.e. it learns to output long hydrocarbon chains), this is not as good as what I was seeing before and is more sensitive to the random seed than I'd like. In any case, I wanted to share an update while I continue looking into this. I've also pushed the notebook I used to create these plots to notebooks/plot_lsbo_comparison.ipynb.


samuelstanton commented on June 26, 2024

The major hypers that have been corrected are:

  • optimizer.window_size=1 --> optimizer.window_size=8: this hyperparameter controls how many corruptions are made to the seed sequence and can have a major effect when the optimal solution requires large increases to the sequence length.
  • surrogate.bs=32 --> surrogate.bs=256: with a larger dataset, increasing the batch size decreases the run time significantly; I was seeing about 6 hours per seed on an A100 after this change.
  • optimizer.resampling_weight=1.0 --> optimizer.resampling_weight=0.5: this change makes the optimizer sample "good" seeds more aggressively when constructing batches of candidates.
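Assuming these overrides just extend the earlier multirun command (the updated README is the authoritative source, and the seed sweep here is illustrative), the corrected invocation would look something like:

```shell
python scripts/black_box_opt.py -m optimizer=lambo optimizer.encoder_obj=lanmt \
    task=chem_lsbo tokenizer=selfies surrogate=single_task_svgp acquisition=ei \
    encoder=lanmt_cnn optimizer.window_size=8 surrogate.bs=256 \
    optimizer.resampling_weight=0.5 seed=0,1,2,3,4
```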


samuelstanton commented on June 26, 2024

Increasing the max context length to 256 (task.max_len=256) improves performance on this benchmark, as I noted in the paper, but variance across seeds is still an issue.

[Image: results with task.max_len=256]


Thomaswbt commented on June 26, 2024

Sorry for the late response, and thank you for your effort! Previously I also found that the choice of starting sequences matters a lot for the final results. I'll close the issue now.

