Comments (7)
Can you tell me more about your setup? Are you running the code on a machine with more than one GPU?
from lambo.
Hi,
I followed the steps in the README to install it and ran it on a machine with only one GPU.
And after debugging, I found that in the place where the error was reported:
File "/home/bcell/anaconda3/envs/lambo-env/lib/python3.8/site-packages/torch/distributions/lkj_cholesky.py", line 117, in log_prob
unnormalized_log_pdf = torch.sum(order * diag_elems.log(), dim=-1)
The device of `order` is cpu and the device of `diag_elems` is cuda:0. I think that's the problem.
Thank you.
Can you post the full stack trace?
Sure, the following is the full stack trace:
logger:
_target_: upcycle.logging.DataFrameLogger
log_dir: data/experiments/test/vibrant-flower-23/2022-06-17_09-42-45
task:
_target_: lambo.tasks.regex.RegexTask
regex_list:
- (?=AV)
- (?=VC)
- (?=CA)
obj_dim: 3
log_prefix: regex
min_len: 32
max_len: 36
num_start_examples: 512
batch_size: 16
max_num_edits: null
max_ngram_size: 1
allow_len_change: true
acquisition:
_target_: lambo.acquisitions.ehvi.NoisyEHVI
num_samples: 2
batch_size: 16
encoder:
_target_: lambo.models.lm_elements.LanguageModel
name: mlm_cnn
model:
_target_: lambo.models.shared_elements.mCNN
tokenizer:
_target_: lambo.utils.ResidueTokenizer
max_len: 36
embed_dim: 64
latent_dim: 16
out_dim: 16
kernel_size: 5
p: 0.0
layernorm: true
max_len_delta: 0
batch_size: 32
num_epochs: 128
patience: 32
lr: 0.001
max_shift: 0
mask_ratio: 0.125
optimizer:
_target_: lambo.optimizers.pymoo.ModelBasedGeneticOptimizer
_recursive_: false
num_rounds: 64
num_gens: 32
seed: 0
concentrate_pool: 1
residue_sampler: uniform
resampling_weight: 1.0
encoder_obj: mll
algorithm:
_target_: pymoo.algorithms.soo.nonconvex.ga.GA
pop_size: 16
n_offsprings: null
sampling:
_target_: lambo.optimizers.sampler.BatchSampler
batch_size: 16
crossover:
_target_: lambo.optimizers.crossover.BatchCrossover
prob: 0.25
prob_per_query: 0.25
mutation:
_target_: lambo.optimizers.mutation.LocalMutation
prob: 1.0
eta: 16
safe_mut: false
eliminate_duplicates: true
tokenizer:
_target_: lambo.utils.ResidueTokenizer
surrogate:
_target_: lambo.models.gp_models.MultiTaskExactGP
max_shift: 0
mask_size: 0
bootstrap_ratio: null
min_num_train: 128
task_noise_init: 0.25
gp_lr: 0.005
enc_lr: 0.005
bs: 32
eval_bs: 16
num_epochs: 256
holdout_ratio: 0.2
early_stopping: true
patience: 32
eval_period: 2
out_dim: 3
feature_dim: 16
encoder_wd: 0.0001
rank: null
task_covar_prior:
_target_: gpytorch.priors.LKJCovariancePrior
'n': 3
eta: 2.0
sd_prior:
_target_: gpytorch.priors.SmoothedBoxPrior
a: 0.0001
b: 1.0
data_covar_module:
_target_: gpytorch.kernels.MaternKernel
ard_num_dims: 16
lengthscale_prior:
_target_: gpytorch.priors.NormalPrior
loc: 0.7
scale: 0.01
likelihood:
_target_: gpytorch.likelihoods.MultitaskGaussianLikelihood
num_tasks: 3
has_global_noise: false
noise_constraint:
_target_: gpytorch.constraints.GreaterThan
lower_bound: 0.0001
seed: 0
trial_id: 0
project_name: lambo
version: v0.2.1
data_dir: data/experiments
exp_name: test
job_name: vibrant-flower-23
timestamp: 2022-06-17_09-42-45
log_dir: data/experiments/test
wandb_mode: online
GPU available: True
| | round_idx | hypervol_abs | hypervol_rel | num_bb_evals | time_elapsed |
|---:|------------:|---------------:|---------------:|---------------:|---------------:|
| 0 | 0 | 2.048 | 1 | 0 | 0.00963354 |
best candidates
| | obj_val_0 | obj_val_1 | obj_val_2 |
|---:|------------:|------------:|------------:|
| 0 | -2.0000 | -2.0000 | -2.0000 |
active set contracted to 4 pareto points
active set augmented with 12 random points
402 train, 56 val, 54 test
---- preparing checkpoint ----
starting val NLL: 1.6021
---- fitting all params ----
[2022-06-17 10:09:25,165][root][ERROR] - Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
Traceback (most recent call last):
File "/home/bcell/home/lambo/scripts/black_box_opt.py", line 55, in main
metrics = optimizer.optimize(
File "/home/bcell/home/lambo/lambo/optimizers/pymoo.py", line 189, in optimize
problem = self._create_inner_task(
File "/home/bcell/home/lambo/lambo/optimizers/pymoo.py", line 389, in _create_inner_task
records = self.surrogate_model.fit(
File "/home/bcell/home/lambo/lambo/models/gp_models.py", line 321, in fit
return fit_gp_surrogate(**fit_kwargs)
File "/home/bcell/home/lambo/lambo/models/gp_utils.py", line 208, in fit_gp_surrogate
enc_sup_loss = fit_encoder_only(
File "/home/bcell/home/lambo/lambo/models/gp_utils.py", line 76, in fit_encoder_only
loss = gp_train_step(surrogate, optimizer, inputs, targets, mll)
File "/home/bcell/home/lambo/lambo/models/gp_utils.py", line 60, in gp_train_step
loss = -mll(output, targets).mean()
File "/home/bcell/anaconda3/envs/lambo-env/lib/python3.8/site-packages/gpytorch/module.py", line 30, in __call__
outputs = self.forward(*inputs, **kwargs)
File "/home/bcell/anaconda3/envs/lambo-env/lib/python3.8/site-packages/gpytorch/mlls/exact_marginal_log_likelihood.py", line 63, in forward
res = self._add_other_terms(res, params)
File "/home/bcell/anaconda3/envs/lambo-env/lib/python3.8/site-packages/gpytorch/mlls/exact_marginal_log_likelihood.py", line 43, in _add_other_terms
res.add_(prior.log_prob(closure(module)).sum())
File "/home/bcell/anaconda3/envs/lambo-env/lib/python3.8/site-packages/gpytorch/priors/lkj_prior.py", line 105, in log_prob
log_prob_corr = self.correlation_prior.log_prob(correlations)
File "/home/bcell/anaconda3/envs/lambo-env/lib/python3.8/site-packages/gpytorch/priors/lkj_prior.py", line 62, in log_prob
return super().log_prob(X_cholesky)
File "/home/bcell/anaconda3/envs/lambo-env/lib/python3.8/site-packages/gpytorch/priors/prior.py", line 27, in log_prob
return super(Prior, self).log_prob(self.transform(x))
File "/home/bcell/anaconda3/envs/lambo-env/lib/python3.8/site-packages/torch/distributions/lkj_cholesky.py", line 117, in log_prob
unnormalized_log_pdf = torch.sum(order * diag_elems.log(), dim=-1)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
I just tested things again starting with a fresh conda env, following the installation instructions in the README, and I wasn't able to reproduce the issue. Most likely this is a problem with your Python virtual environment. What versions of `pytorch`, `gpytorch`, and `cudatoolkit` are installed? You can use `conda list` to see what versions you're using. What version of CUDA do you have installed? What OS are you running on?
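If it helps, the Python-side versions can also be collected from inside the environment. This is only a rough sketch using the standard library: the distribution names `torch` and `gpytorch` are assumptions about how the packages are registered in your env, and `cudatoolkit` is a conda package that this lookup will not see, so `conda list` is still needed for that one.

```python
import platform
from importlib import metadata

# Assumed distribution names; a conda install may register them differently.
PACKAGES = ("torch", "gpytorch")


def installed_versions(names=PACKAGES):
    """Map each distribution name to its version string, or None if not found."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # OS and interpreter details, followed by one line per package.
    print("OS:", platform.platform())
    print("Python:", platform.python_version())
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not found'}")
```

Run it with the `lambo-env` interpreter so it reports the same packages the failing script sees.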
Closing due to inactivity. Feel free to reopen if you have further questions.
@yuyang-0825 you might be interested to know I was able to reproduce your error while investigating #7, and I can confirm the device error occurs if you have `gpytorch>=1.7` installed. If you install gpytorch using the pinned commit hash in `requirements.txt`, the program should run as intended.
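For anyone who wants a quick guard before launching a run, here is a minimal sketch of a version check. The `1.7` threshold is taken from the comment above; the helper names are made up for illustration, and the version string reported by the pinned commit is not known here, so treat the result as a hint rather than a guarantee.

```python
import re


def major_minor(version):
    """Extract (major, minor) from a version string such as '1.6.0.dev20'."""
    match = re.match(r"\s*(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def may_hit_device_error(gpytorch_version):
    """True if the version falls in the range reported to trigger the error (>= 1.7)."""
    return major_minor(gpytorch_version) >= (1, 7)


if __name__ == "__main__":
    # Example version strings only; substitute the one from your environment.
    for candidate in ("1.6.0", "1.7.0", "1.10.1"):
        print(candidate, may_hit_device_error(candidate))
```

Comparing integer tuples avoids the string-comparison trap where `"1.10"` sorts before `"1.7"`.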