yandex-research / tabular-dl-tabr
The implementation of "TabR: Unlocking the Power of Retrieval-Augmented Tabular Deep Learning"
Home Page: https://arxiv.org/abs/2307.14338
License: MIT License
Hi, thanks for sharing this repo.
I tried to set up an environment by following the instructions in the README: micromamba create -f environment.yaml
However, I got the errors below.
I was able to resolve the cudatoolkit, panel and bokeh conflicts by changing their versions, but not the pytorch one.
Could you help me address this issue?
nvidia/linux-64 No change
nvidia/noarch No change
conda-forge/noarch No change
conda-forge/linux-64 No change
pytorch/noarch No change
pytorch/linux-64 No change
pyviz/linux-64 No change
pyviz/noarch No change
warning libmamba Problem type not implemented SOLVER_RULE_STRICT_REPO_PRIORITY
warning libmamba Problem type not implemented SOLVER_RULE_STRICT_REPO_PRIORITY
warning libmamba Problem type not implemented SOLVER_RULE_STRICT_REPO_PRIORITY
warning libmamba Problem type not implemented SOLVER_RULE_STRICT_REPO_PRIORITY
warning libmamba Problem type not implemented SOLVER_RULE_STRICT_REPO_PRIORITY
warning libmamba Problem type not implemented SOLVER_RULE_STRICT_REPO_PRIORITY
warning libmamba Problem type not implemented SOLVER_RULE_STRICT_REPO_PRIORITY
warning libmamba Problem type not implemented SOLVER_RULE_STRICT_REPO_PRIORITY
warning libmamba Problem type not implemented SOLVER_RULE_STRICT_REPO_PRIORITY
warning libmamba Problem type not implemented SOLVER_RULE_STRICT_REPO_PRIORITY
warning libmamba Problem type not implemented SOLVER_RULE_STRICT_REPO_PRIORITY
warning libmamba Problem type not implemented SOLVER_RULE_STRICT_REPO_PRIORITY
warning libmamba Problem type not implemented SOLVER_RULE_STRICT_REPO_PRIORITY
warning libmamba Problem type not implemented SOLVER_RULE_STRICT_REPO_PRIORITY
warning libmamba Problem type not implemented SOLVER_RULE_STRICT_REPO_PRIORITY
warning libmamba Problem type not implemented SOLVER_RULE_STRICT_REPO_PRIORITY
warning libmamba Problem type not implemented SOLVER_RULE_STRICT_REPO_PRIORITY
error libmamba Could not solve for environment specs
The following packages are incompatible
├─ bokeh 3.0.3** is requested and can be installed;
├─ cudatoolkit 11.8.0** is not installable because it conflicts with any installable versions previously reported;
├─ panel 0.10.3** is not installable because there are no viable options
│ ├─ panel 0.10.3 would require
│ │ └─ bokeh >=2.2,<2.3 , which conflicts with any installable versions previously reported;
│ └─ panel 0.10.3 conflicts with any installable versions previously reported;
└─ pytorch 1.13.1* is not installable because it conflicts with any installable versions previously reported.
critical libmamba Could not solve for environment specs
‘context_y_emb = self.label_encoder(candidate_y[context_idx][..., None])’ is W(yi) of Step-1 (adding context labels). Here candidate_y[context_idx] are the labels of the retrieved samples.
For a classification task (n_classes > 1), the values of candidate_y[context_idx] are integers,
so self.label_encoder is:
else nn.Sequential(
nn.Embedding(n_classes, d_main), delu.nn.Lambda(lambda x: x.squeeze(-2))
)
I would like to know how to make this nn.Sequential operate on non-integer labels.
For example, suppose the labels change from {Tensor:(512,96,)}=tensor([[7, 7, 7, ..., 8, 0, 7],
[8, 8, 8, ..., 8, 8, 8],
[2, 2, 2, ..., 6, 2, 1],
...,
[5, 5, 5, ..., 5, 5, 5],
[5, 5, 5, ..., 5, 5, 5],
[5, 5, 5, ..., 5, 5, 5]], device='cuda:0') to
{Tensor:(512,96,)}=tensor([[7.2188, 7.2188, 7.2188, ..., 7.7451, 0.0000, 7.2188],
[7.7451, 7.7451, 7.7451, ..., 7.7451, 7.7451, 7.7451],
[1.9530, 1.9530, 1.9530, ..., 6.3834, 1.9530, 0.9778],
...,
[5.5154, 5.5154, 5.5154, ..., 5.5154, 5.5154, 5.5154],
[5.5154, 5.5154, 5.5154, ..., 5.5154, 5.5154, 5.5154],
[5.5154, 5.5154, 5.5154, ..., 5.5154, 5.5154, 5.5154]],
device='cuda:0').
This raises the error: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got torch.cuda.FloatTensor instead (while checking arguments for embedding)
Thanks~
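One possible way to handle real-valued labels, sketched here under assumptions: nn.Embedding is a lookup table and only accepts integer indices, so a float label has to be projected rather than looked up. The snippet below contrasts the two cases using a plain nn.Linear(1, d_main) for float labels. The sizes and variable names are hypothetical, and this is an illustration of the idea, not a confirmed fix from the authors.

```python
import torch
from torch import nn

d_main = 8                      # hypothetical embedding size
batch_size, context_size = 4, 3
n_classes = 10

# Integer class labels: an embedding lookup works.
int_encoder = nn.Embedding(n_classes, d_main)
int_labels = torch.randint(0, n_classes, (batch_size, context_size))
int_emb = int_encoder(int_labels)                    # shape (4, 3, 8)

# Real-valued labels: nn.Embedding rejects float indices, so project the
# scalar label with a linear layer instead (an assumption, not the repo's code).
float_encoder = nn.Linear(1, d_main)
float_labels = torch.rand(batch_size, context_size) * n_classes
float_emb = float_encoder(float_labels[..., None])   # shape (4, 3, 8)
```

Both encoders produce one d_main-dimensional vector per context label, so the rest of the model would see the same shapes in either case.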
Could you please share the code that converts a new dataset into X_num_train.npy, X_num_val.npy, X_num_test.npy, ...?
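A minimal sketch of such a conversion, assuming the features are already numeric and held in NumPy arrays. The X_num_{part}.npy names come from the question above. The Y_{part}.npy names, the split sizes and the random placeholder data are assumptions, and the repository may expect additional files (for example dataset metadata) that are not covered here.

```python
import tempfile
from pathlib import Path

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5)).astype(np.float32)  # placeholder numeric features
y = rng.normal(size=100).astype(np.float32)       # placeholder targets

# Shuffle once, then cut into train/val/test index sets (64/16/20 here).
idx = rng.permutation(len(X))
parts = {"train": idx[:64], "val": idx[64:80], "test": idx[80:]}

# A temporary directory stands in for the real dataset directory.
out_dir = Path(tempfile.mkdtemp())
for part, part_idx in parts.items():
    np.save(out_dir / f"X_num_{part}.npy", X[part_idx])
    np.save(out_dir / f"Y_{part}.npy", y[part_idx])  # file name is an assumption

saved = sorted(p.name for p in out_dir.iterdir())
```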
Hello,
My understanding is as follows:
When tune.py finishes, it produces checkpoint.pt, DONE, report.json and summary.json, but evaluate.py only uses report.json and DONE (which is empty), which provide the lr, weight_decay and dropout parameters. The checkpoint.pt is not used.
Usually a training script would take all those parameters and run a single trial with early stopping, rather than 100 trials.
So could you please explain the relationship between tune.py, evaluate.py and ensemble.py in more detail?
Thanks very much for this great work.
I am trying to understand the code and use it in my research.
I encountered an error and don't know how to fix it. Any suggestions would be greatly appreciated.
data = {
    "X_num": {
        "train": X_train,
        "val": X_test
    },
    "Y": {
        "train": y_train,
        "val": y_test
    }
}
dataset = Dataset(
    data=data,
    task_type=TaskType.REGRESSION,
    score='rmse',
    y_info=None,
    _Y_numpy=None
)
seed = 42
model = {
    'num_embeddings': None,  # Example embedding configuration
    'd_main': 64,
    'd_multiplier': 1.0,
    'encoder_n_blocks': 2,
    'predictor_n_blocks': 2,
    'mixer_normalization': False,
    'context_dropout': 0.1,
    'dropout0': 0.1,
    'dropout1': 0.1,
    'normalization': 'BatchNorm1d',
    'activation': 'ReLU'
}
config = Config(
    seed=seed,
    data=dataset,
    model=model,
    context_size=5,
    optimizer={'type': 'Adam', 'lr': 0.001},
    batch_size=64,
    patience=10,
    n_epochs=10,
)
output_path = "./output"
force = True
report = main(config, output_path, force=force)
RuntimeError Traceback (most recent call last)
File /Users/hjyu/Library/Mobile Documents/comappleCloudDocs/Code/Transfer_Learning_Tabular/TabR/tabr_test.py:4
2 output_path = "./output"
3 force = True
----> 4 report = main(config, output_path, force=force)
File ~/Library/Mobile Documents/comappleCloudDocs/Code/Transfer_Learning_Tabular/TabR/bin/tabr.py:508, in main(config, output, force)
503 epoch_losses = []
504 for batch_idx in tqdm(
505 lib.make_random_batches(train_size, C.batch_size, device),
506 desc=f'Epoch {epoch}',
507 ):
--> 508 loss, new_chunk_size = lib.train_step(
509 optimizer,
510 lambda idx: loss_fn(apply_model('train', idx, True), Y_train[idx]),
511 batch_idx,
512 chunk_size or C.batch_size,
513 )
514 epoch_losses.append(loss.detach())
515 if new_chunk_size and new_chunk_size < (chunk_size or C.batch_size):
File ~/Library/Mobile Documents/comappleCloudDocs/Code/Transfer_Learning_Tabular/TabR/lib/deep.py:447, in train_step(optimizer, step_fn, batch, chunk_size)
445 optimizer.zero_grad()
446 if batch_size <= chunk_size:
--> 447 loss = step_fn(batch)
448 loss.backward()
449 else:
File ~/Library/Mobile Documents/comappleCloudDocs/Code/Transfer_Learning_Tabular/TabR/bin/tabr.py:510, in main.<locals>.<lambda>(idx)
503 epoch_losses = []
504 for batch_idx in tqdm(
505 lib.make_random_batches(train_size, C.batch_size, device),
506 desc=f'Epoch {epoch}',
507 ):
508 loss, new_chunk_size = lib.train_step(
509 optimizer,
--> 510 lambda idx: loss_fn(apply_model('train', idx, True), Y_train[idx]),
511 batch_idx,
512 chunk_size or C.batch_size,
513 )
514 epoch_losses.append(loss.detach())
515 if new_chunk_size and new_chunk_size < (chunk_size or C.batch_size):
File ~/Library/Mobile Documents/comappleCloudDocs/Code/Transfer_Learning_Tabular/TabR/bin/tabr.py:436, in main.<locals>.apply_model(part, idx, training)
428 candidate_indices = candidate_indices[~torch.isin(candidate_indices, idx)]
429 candidate_x, candidate_y = get_Xy(
430 'train',
431 # This condition is here for historical reasons, it could be just
432 # the unconditional candidate_indices.
433 None if candidate_indices is train_indices else candidate_indices,
434 )
--> 436 return model(
437 x_=x,
438 y=y if is_train else None,
439 candidate_x_=candidate_x,
440 candidate_y=candidate_y,
441 context_size=C.context_size,
442 is_train=is_train,
443 ).squeeze(-1)
File ~/anaconda3/envs/tabr/lib/python3.9/site-packages/torch/nn/modules/module.py:1194, in Module._call_impl(self, *input, **kwargs)
1190 # If we don't have any hooks, we want to skip the rest of the logic in
1191 # this function, and just call forward.
1192 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1193 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1194 return forward_call(*input, **kwargs)
1195 # Do not call functions when jit is used
1196 full_backward_hooks, non_full_backward_hooks = [], []
File ~/Library/Mobile Documents/comappleCloudDocs/Code/Transfer_Learning_Tabular/TabR/bin/tabr.py:243, in Model.forward(self, x_, y, candidate_x_, candidate_y, context_size, is_train)
212 def forward(
213 self,
214 *,
(...)
221 ) -> Tensor:
222 # >>>
223 with torch.set_grad_enabled(
224 torch.is_grad_enabled() and not self.memory_efficient
225 ):
(...)
240 # performed without gradients.
241 # Later, it is recomputed with gradients only for the context objects.
242 candidate_k = (
--> 243 self.encode(candidate_x)[1]
244 if self.candidate_encoding_batch_size is None
245 else torch.cat(
246 [
247 self.encode(x)[1]
248 for x in delu.iter_batches(
249 candidate_x, self.candidate_encoding_batch_size
250 )
251 ]
252 )
253 )
254 x, k = self.encode(x)
255 if is_train:
256 # NOTE: here, we add the training batch back to the candidates after the
257 # function apply_model removed them. The further code relies
258 # on the fact that the first batch_size candidates come from the
259 # training batch.
File ~/Library/Mobile Documents/comappleCloudDocs/Code/Transfer_Learning_Tabular/TabR/bin/tabr.py:206, in Model._encode(failed resolving arguments)
203 assert x # assert that the list x is not empty, presumably to validate the input data
204 x = torch.cat(x, dim=1)
--> 206 x = self.linear(x)
207 for block in self.blocks0:
208 x = x + block(x)
File ~/anaconda3/envs/tabr/lib/python3.9/site-packages/torch/nn/modules/module.py:1194, in Module._call_impl(self, *input, **kwargs)
1190 # If we don't have any hooks, we want to skip the rest of the logic in
1191 # this function, and just call forward.
1192 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1193 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1194 return forward_call(*input, **kwargs)
1195 # Do not call functions when jit is used
1196 full_backward_hooks, non_full_backward_hooks = [], []
File ~/anaconda3/envs/tabr/lib/python3.9/site-packages/torch/nn/modules/linear.py:114, in Linear.forward(self, input)
113 def forward(self, input: Tensor) -> Tensor:
--> 114 return F.linear(input, self.weight, self.bias)
RuntimeError: mat1 and mat2 must have the same dtype
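This error usually means the input tensor and the layer weights have different floating-point types. A likely cause, stated here as an assumption about the data above, is that X_train and y_train are float64 (the NumPy and pandas default) while PyTorch layers default to float32. A minimal sketch of casting the arrays before building the dataset, with a hypothetical helper name and placeholder arrays:

```python
import numpy as np

# Placeholder arrays standing in for X_train / y_train; NumPy creates float64.
X_train = np.random.default_rng(0).normal(size=(8, 3))
y_train = np.arange(8, dtype=np.float64)

def to_float32(array):
    """Return the array as float32, the default dtype of PyTorch layers."""
    return np.asarray(array, dtype=np.float32)

X_train32 = to_float32(X_train)
y_train32 = to_float32(y_train)
```

The same cast would apply to the validation arrays before they are placed in the data dictionary.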
Thanks for your interesting work, which gives us new hope for tabular data research!
I have some questions:
(1) In deep.py there is the line (params_with_wd if needs_wd else params_with_wd)['params'].append(parameter). Is this a mistake? Why do both branches of the if...else select params_with_wd?
(2) The following bug occurs during training, after trial_0 has completed and while trial_1 is in progress. I observed that GPU memory usage increases with each trial; is this normal? Is it possible to release the GPU memory after trial_0 completes and before trial_1 starts? We currently have only 12 GB of GPU memory.
[...] tmp22mqfk7j_trial_1/output | 0:03:04.846372
Epoch 39: 100%|███████████████████████████████████████████████████████████████████| 52/52 [00:04<00:00, 12.
(val) -0.426 (test) -0.414 (loss) 0.11597█████████████████████████████████████████| 52/52 [00:04<00:00, 13.
[W 2023-10-24 17:37:27,179] Trial 1 failed with parameters: {'model.d_main': 353, 'model.context_dropout': 76563006176, 'model.dropout0': 0.2300649112954666, 'optimizer.lr': 0.0004817508474772368, '?optimizer.weigh': True, 'optimizer.weight_decay': 7.098936257405907e-05} because of the following error: AttributeError("module 'torch.cuda' has no attribute 'OutOfMemoryError'").
I now have the first results for TabR on my custom dataset. Thanks for the repo!
However, I still have a problem with the current implementation of TabR.
This is what the paper states:
"Figure 4: A simplified illustration of the retrieval module R, introduced in Figure 2. For the target object’s representation x˜, the module takes the m nearest neighbors among the candidates {x˜i} according to the similarity module S and aggregates their values produced by the value module V"
This approach works well for datasets without time dependence, such as the Titanic dataset, where each element is independent of the others.
However, we have data from an auto-completion use case where the date column "CREATIONDATE" strongly affects the results. Information from future dates leaks into elements of the past. This is why we split train and test as follows:
df_train = df[(df.CREATIONDATE >= '20190101') & (df.CREATIONDATE <= '20191231')]
df_test = df[(df.CREATIONDATE >= '20200101') & (df.CREATIONDATE <= '20200229')]
As you can see, the test set lies strictly after the train set on the timeline. Unless the model learns this during training, the test results are not very good.
We also need to enforce this inside the train set during training. That is, when predicting the class of one row at training time, only train-set elements from the past should be available as candidates (CREATIONDATE_candidates < CREATIONDATE_train_element_we_want_to_predict).
Where in the code would be the best place to change this logic?
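The constraint described above can be sketched as a mask over the target-to-candidate similarity matrix: any candidate that is not strictly older than the target gets a similarity of negative infinity, so a subsequent top-k selection cannot pick it. This is only an illustration in NumPy with hypothetical function names and made-up dates. Where exactly such a mask would fit into the repository's retrieval code is not confirmed here.

```python
import numpy as np

def past_candidates_mask(target_dates, candidate_dates):
    """Boolean (n_targets, n_candidates): True where the candidate is strictly earlier."""
    return candidate_dates[None, :] < target_dates[:, None]

def mask_similarities(similarities, target_dates, candidate_dates):
    """Set the similarity of non-past candidates to -inf so top-k skips them."""
    allowed = past_candidates_mask(target_dates, candidate_dates)
    return np.where(allowed, similarities, -np.inf)

# Made-up dates for three candidates and two targets.
candidate_dates = np.array(
    ["2019-01-05", "2019-03-01", "2019-06-10"], dtype="datetime64[D]"
)
target_dates = np.array(["2019-02-01", "2019-07-01"], dtype="datetime64[D]")
similarities = np.zeros((2, 3))
masked = mask_similarities(similarities, target_dates, candidate_dates)
```

A target with fewer past candidates than the context size would need separate handling, for example padding or skipping, which this sketch does not address.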
Thank you for your interesting work; it inspires me a lot. A couple of questions have been bugging me.
The work initially uses the entire training set as the fixed set of candidates for all objects.
1. The function 'apply_model' removes the current batch from the candidates.
2. However, when computing the forward output, the model adds the current batch back to the candidate set and then predicts the output. Should the current batch instead be added to the candidate set after predicting the output?
3. Additionally, once the batch is in the candidate set, retrieving the context samples is guaranteed to return the target sample itself. The corresponding index is then removed from the retrieved indices.
I can't understand the role of these three operations or how they are connected. Could you offer some explanation?
Hi! The following line in the method make_parameter_groups
looks very much like a typo:
params_with_wd if needs_wd else params_with_wd
Because of it, nothing is ever added to params_without_wd,
which defeats the purpose of zero_weight_decay_condition.
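For illustration, here is a minimal sketch of what the intended grouping presumably looks like, with the second branch selecting params_without_wd. The function signature is simplified and hypothetical (it takes any iterable of (name, parameter) pairs, such as model.named_parameters()), and the real function in lib/deep.py may differ in its details.

```python
def make_parameter_groups(named_parameters, zero_weight_decay_condition):
    """Split parameters into a group with weight decay and a group without it."""
    params_with_wd = {"params": []}
    params_without_wd = {"params": [], "weight_decay": 0.0}
    for name, parameter in named_parameters:
        needs_wd = not zero_weight_decay_condition(name, parameter)
        # The fix: the else-branch must select params_without_wd.
        (params_with_wd if needs_wd else params_without_wd)["params"].append(parameter)
    return [params_with_wd, params_without_wd]

# Toy usage with strings standing in for real parameters.
toy_parameters = [("linear.weight", "W"), ("linear.bias", "b"), ("norm.weight", "g")]
groups = make_parameter_groups(
    toy_parameters,
    lambda name, parameter: name.endswith("bias") or name.startswith("norm"),
)
```

The returned list has the shape that PyTorch optimizers accept as per-parameter options, so the second group would be trained with weight_decay set to 0.0.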
Hello,
From your code, it is clear that n_trials=100, but I find that the result of the last trial is not the best one, and the paper doesn't seem to go into detail about this. How are these 100 trial results used to obtain the results reported in the paper: the best one or the average?
Also, what are the 15 random seeds used for testing, and are they not used for training?
Thanks~
After each training epoch, evaluation uses eval_batch_size=32768, meaning that the whole validation set is treated as one batch. When evaluating on the otto dataset, the GPU runs out of memory. How can this be improved? The error is:
RuntimeError: CUDA out of memory. Tried to allocate 2.25 GiB (GPU 0; 11.17 GiB total capacity; 5.74 GiB already allocated; 2.04 GiB free; 7.13 GiB reserved in total by PyTorch)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/anaconda3/envs/torch/lib/python3.9/site-packages/optuna/study/_optimize.py", line 200, in _run_trial
value_or_values = func(trial)
File "/bin/tune.py", line 160, in objective
report = function(raw_config, Path(tmp) / 'output') #the objective function, in turn, calls the function = "bin.tabr.main"
File "/bin/tabr.py", line 592, in main
metrics, predictions, eval_batch_size = evaluate(
File "/anaconda3/envs/torch/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/bin/tabr.py", line 537, in evaluate
if not lib.is_oom_exception(err):
File "/lib/util.py", line 493, in is_oom_exception
return isinstance(err, torch.cuda.OutOfMemoryError) or any(
AttributeError: module 'torch.cuda' has no attribute 'OutOfMemoryError'
[W 2023-10-31 20:35:10,691] Trial 0 failed with value None.
thanks~
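The AttributeError in the traceback above masks the original out-of-memory error: the installed PyTorch has no torch.cuda.OutOfMemoryError class, which suggests it is older than the pytorch 1.13.1 pinned in environment.yaml. Besides installing the pinned version, a version-tolerant check is possible. The sketch below is an assumption about how such a check could look, not the repository's implementation. The oom_type argument and the FakeOOM class are hypothetical and exist only to keep the example self-contained.

```python
def is_oom_exception(err, oom_type=None):
    """Return True if err looks like a CUDA out-of-memory error.

    oom_type is an optional exception class. With PyTorch installed it could be
    obtained via getattr(torch.cuda, "OutOfMemoryError", None), which yields
    None on versions that lack the class instead of raising AttributeError.
    """
    if oom_type is not None and isinstance(err, oom_type):
        return True
    # Fallback for versions that raise a plain RuntimeError with this message.
    return isinstance(err, RuntimeError) and "CUDA out of memory" in str(err)

class FakeOOM(Exception):
    """Stand-in for a dedicated OOM exception class (illustration only)."""

legacy_error = RuntimeError("CUDA out of memory. Tried to allocate 2.25 GiB")
legacy_is_oom = is_oom_exception(legacy_error)
typed_is_oom = is_oom_exception(FakeOOM(), oom_type=FakeOOM)
other_is_oom = is_oom_exception(ValueError("unrelated"))
```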
Hello, I trained your model on my dataset; thank you for this brilliant work.
But I don't understand how to make predictions on my X_test without y_test. (I used 50% of the validation set in place of the real X_test.)
Hi! Thank you for your interesting work.
I faced some problems because of this function (Line 117 in d628ec7). Using
torch.atleast_2d(torch.as_tensor(value)).to(device)
instead of torch.as_tensor(value).to(device)
solved the problem. At least a fit/predict example would be great.
Hi, bothering you again~
For example, on the wine quality dataset the ensemble performance is 0.620±0.007, and go.py produces three sets of scores, so how is this ±0.007 obtained?
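A value reported as mean±deviation is commonly the mean and standard deviation over repeated runs. Assuming that is the case here, and that the three ensemble scores are the repeated runs, the computation could be sketched as follows. The scores are made up for illustration, and whether the population or the sample standard deviation applies is also an assumption.

```python
import statistics

# Made-up scores standing in for the three ensemble results.
ensemble_scores = [0.61, 0.62, 0.63]

mean = statistics.mean(ensemble_scores)
# Population standard deviation; statistics.stdev would give the sample version.
std = statistics.pstdev(ensemble_scores)
summary = f"{mean:.3f}±{std:.3f}"
```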
If I use this work as a backbone network, add my own modules, and then publish an academic paper, does that constitute copyright infringement? Of course, I would cite your work.
Hi. Does OneHotEncoder work better with the suggested architecture? Did you test others (OrdinalEncoder, etc.)?
When running a classification task (n_classes > 1), calling self.label_encoder in tabr.py raises an error:
ValueError: fn must be a function from torch or a method of torch.Tensor, but ...
How do I fix this?
import os

import torch

os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "3,4"
torch.cuda.empty_cache()
print('os.environ["CUDA_VISIBLE_DEVICES"]', os.environ["CUDA_VISIBLE_DEVICES"])
print('Free gpu memory: torch.cuda.empty_cache()')
When I train tabr on the otto dataset with 2 GPUs, the following bug occurs.
How can I debug it?
PS: Training with one GPU works; the bug only appears with two GPUs in parallel.
The error is:
...
Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [6055,0,0], thread: [22,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [6055,0,0], thread: [23,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [6055,0,0], thread: [24,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [6055,0,0], thread: [25,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [6055,0,0], thread: [26,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [6055,0,0], thread: [27,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [6055,0,0], thread: [28,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [6055,0,0], thread: [29,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [6055,0,0], thread: [30,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [6055,0,0], thread: [31,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
Faiss assertion 'err == cudaSuccess' failed in virtual void faiss::gpu::StandardGpuResourcesImpl::deallocMemory(int, void*) at /project/faiss/faiss/gpu/StandardGpuResources.cpp:518; details: Failed to cudaFree pointer 0x420b046000 (error 59 device-side assert triggered)
Aborted (core dumped)
Thanks~
Hello,
Using your code, I checked the evaluation report.json of MLP on regression-cat-medium-0-OnlineNewsPopularity. The metrics at the best epoch are as follows:
"n_parameters": 495793,
"prediction_type": null,
"best_epoch": 26,
"metrics": {
    "train": {
        "rmse": 0.8142614908186779,
        "mae": 0.5985163409301961,
        "r2": 0.23417308630867428,
        "score": -0.8142614908186779
    },
    "val": {
        "rmse": 0.844946381250874,
        "mae": 0.6250255374955493,
        "r2": 0.15331082859100664,
        "score": -0.844946381250874
    },
    "test": {
        "rmse": 0.8618776869166989,
        "mae": 0.6317802140393205,
        "r2": 0.14868952248971512,
        "score": -0.8618776869166989
    }
}
Generally, lower RMSE and MAE values are desirable, and an R² closer to 1 indicates better explanatory power. Based on these results, the model performs relatively poorly on the validation and test sets, and the R² values suggest limited explanatory capability. Is further optimization of the model, or some alternative improvement strategy, still necessary? In addition, TensorBoard is provided in the project. How can we analyze this model with it?
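To make the three metrics concrete, the sketch below computes them with their standard definitions on a tiny made-up example. It also mirrors the pattern visible in the report above, where "score" equals the negated RMSE so that higher is better. The function name is hypothetical and this is not the repository's metric code.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Compute RMSE, MAE and R² with their standard definitions."""
    err = y_pred - y_true
    rmse = float(np.sqrt(np.mean(err ** 2)))
    mae = float(np.mean(np.abs(err)))
    # R² = 1 - (residual sum of squares) / (total sum of squares).
    ss_res = float(np.sum(err ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot
    return {"rmse": rmse, "mae": mae, "r2": r2, "score": -rmse}

# Made-up targets and predictions: three exact hits and one miss by 2.
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.0, 2.0, 3.0, 2.0])
metrics = regression_metrics(y_true, y_pred)
```

An R² near 0 means the predictions are barely better than always predicting the mean of the targets, which is one way to read values of about 0.15 on the validation and test sets.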
Thanks~
The changes I made are as follows:
cp -r exp/tabr/why/regression-num-medium-2-wine_quality/ exp/tabr/why/regression-num-medium-3-wine_quality/
cp -r data/regression-num-medium-2-wine_quality/ data/regression-num-medium-3-wine_quality/
cp exp/tabr/why/regression-num-medium-3-wine_quality/0-tuning.toml exp/tabr/why/regression-num-medium-3-wine_quality/1-tuning.toml
In 'data/regression-num-medium-3-wine_quality/info.json', change 'regression-num-medium-2-wine_quality' to 'regression-num-medium-3-wine_quality'
In 'exp/tabr/why/regression-num-medium-3-wine_quality/1-tuning.toml', change 'data/regression-num-medium-2-wine_quality' to 'data/regression-num-medium-3-wine_quality'
python bin/tune.py exp/tabr/why/regression-num-medium-2-wine_quality/1-tuning.toml
python bin/tune.py exp/tabr/why/regression-num-medium-3-wine_quality/1-tuning.toml
Then 'python bin/tune.py exp/tabr/why/regression-num-medium-2-wine_quality/1-tuning.toml' runs fine, but 'python bin/tune.py exp/tabr/why/regression-num-medium-3-wine_quality/1-tuning.toml' fails with:
[W 2023-11-17 18:09:18,793] Trial 0 failed with value None.
0%| | 0/100 [00:48<?, ?it/s]
Traceback (most recent call last):
File "~/bin/tune.py", line 216, in <module>
lib.run_Function_cli(main)
File "~/tabR_lzd/lib/util.py", line 276, in run_Function_cli
function(
File "~/tabR_lzd/bin/tune.py", line 202, in main
study.optimize(
File "~/anaconda3/envs/torch/lib/python3.9/site-packages/optuna/study/study.py", line 442, in optimize
_optimize(
File "~/anaconda3/envs/torch/lib/python3.9/site-packages/optuna/study/_optimize.py", line 66, in _optimize
_optimize_sequential(
File "~/anaconda3/envs/torch/lib/python3.9/site-packages/optuna/study/_optimize.py", line 163, in _optimize_sequential
frozen_trial = _run_trial(study, func, catch)
File "~/anaconda3/envs/torch/lib/python3.9/site-packages/optuna/study/_optimize.py", line 251, in _run_trial
raise func_err
File "~/anaconda3/envs/torch/lib/python3.9/site-packages/optuna/study/_optimize.py", line 200, in _run_trial
value_or_values = func(trial)
File "~/tabR_lzd/bin/tune.py", line 161, in objective
report = function(raw_config, Path(tmp) / 'output') #the objective function, in turn, calls the function = "bin.tabr.main"
File "~/tabR_lzd/bin/tabr.py", line 595, in main
metrics, predictions, eval_batch_size = evaluate(
File "~/anaconda3/envs/torch/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "~/tabR_lzd/bin/tabr.py", line 549, in evaluate
dataset.calculate_metrics(predictions, report['prediction_type'])
File "~/tabR_lzd/lib/data.py", line 235, in calculate_metrics
metrics = {
File "~/tabR_lzd/lib/data.py", line 236, in <dictcomp>
part: calculate_metrics_(
File "~/tabR_lzd/lib/metrics.py", line 58, in calculate_metrics
'rmse': sklearn.metrics.mean_squared_error(y_true, y_pred) ** 0.5 * y_std,
File "~/anaconda3/envs/torch/lib/python3.9/site-packages/sklearn/utils/_param_validation.py", line 211, in wrapper
return func(*args, **kwargs)
File "~/anaconda3/envs/torch/lib/python3.9/site-packages/sklearn/metrics/_regression.py", line 474, in mean_squared_error
y_type, y_true, y_pred, multioutput = _check_reg_targets(
File "~/anaconda3/envs/torch/lib/python3.9/site-packages/sklearn/metrics/_regression.py", line 101, in _check_reg_targets
y_pred = check_array(y_pred, ensure_2d=False, dtype=dtype)
File "~/anaconda3/envs/torch/lib/python3.9/site-packages/sklearn/utils/validation.py", line 951, in check_array
raise ValueError(
ValueError: Found array with dim 3. None expected <= 2.
How can I debug this?
Thanks!