
lumia-group / rasat


This project is a fork of servicenow/picard.


The official implementation of the paper "RASAT: Integrating Relational Structures into Pretrained Seq2Seq Model for Text-to-SQL" (EMNLP 2022)

Home Page: https://arxiv.org/abs/2205.06983

License: Apache License 2.0

Python 66.17% Haskell 31.32% Makefile 0.53% Thrift 0.27% Dockerfile 1.64% Shell 0.08%

rasat's People

Contributors

jiexingqi, shreyas90999, tscholak


rasat's Issues

Error message "size mismatch for relation_k_emb.weight" when trying to load a trained model using t5-small

I am running RASAT on two consumer-grade graphics cards. The pretrained model I am using is t5-small. I successfully executed the following command:

CUDA_VISIBLE_DEVICES="0,1" python3 -m torch.distributed.launch --nnodes=1 --nproc_per_node=2 seq2seq/run_seq2seq.py configs/spider/train_spider_rasat_small.json

<__array_function__ internals>:5: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
***** eval metrics *****
  epoch                   =    3071.95
  eval_exact_match        =     0.5348
  eval_exec               =     0.5387
  eval_loss               =     0.7128
  eval_runtime            = 0:02:24.19
  eval_samples            =       1034
  eval_samples_per_second =      7.171
100% 65/65 [02:22<00:00,  2.20s/it]

However, when evaluating, I set the evaluation model path to "./experiment/train_spider_rasat_small", the output directory named in the training configuration file.

I encountered an error when executing the evaluation command:
python3 seq2seq/eval_run_seq2seq.py configs/spider/eval_spider_rasat_4160.json

The error message is:

Dataset name: spider
Mode: dev
Databases has been preprocessed. Use cache.
Dataset has been preprocessed. Use cache.
Dataset: spider
Mode: dev
Match Questions...
100%|█████████████████████████████████████████████████████████████████████████████████████████████| 1034/1034 [00:01<00:00, 606.60it/s]Question match errors: 0/1034
Match Table, Columns, DB Contents...
1034it [00:01, 614.75it/s]
DB match errors: 0/1034
Generate Relations...
100%|██████████████████████████████████████████████████████████████████████████████████████████████| 1034/1034 [00:10<00:00, 95.10it/s]Edge match errors: 0/2340638
06/28/2023 20:30:11 - WARNING - datasets.arrow_dataset -   Loading cached processed dataset at ./transformers_cache/spider/spider/1.0.0/a9000e8b37ea883ad113d628d95c9067385cc1105e2641a44bfa3090483dbb9b/cache-21e2b8bdcac7ddca.arrow
===================================================
Num of relations uesd in RASAT is :  45
===================================================
Use relation model.
./experiment/train_spider_rasat_small
Traceback (most recent call last):
  File "seq2seq/eval_run_seq2seq.py", line 320, in <module>
    main()
  File "seq2seq/eval_run_seq2seq.py", line 208, in main
    model = nn.DataParallel(model_cls_wrapper(T5ForConditionalGeneration).from_pretrained(
  File "/opt/conda/lib/python3.8/site-packages/transformers/modeling_utils.py", line 1453, in from_pretrained
    model, missing_keys, unexpected_keys, mismatched_keys, error_msgs = cls._load_state_dict_into_model(
  File "/opt/conda/lib/python3.8/site-packages/transformers/modeling_utils.py", line 1607, in _load_state_dict_into_model
    raise RuntimeError(f"Error(s) in loading state_dict for {model.__class__.__name__}:\n\t{error_msg}")
RuntimeError: Error(s) in loading state_dict for T5ForConditionalGeneration:
        size mismatch for relation_k_emb.weight: copying a param with shape torch.Size([49, 64]) from checkpoint, the shape in current model is torch.Size([46, 64]).
        size mismatch for relation_v_emb.weight: copying a param with shape torch.Size([49, 64]) from checkpoint, the shape in current model is torch.Size([46, 64]).
        size mismatch for encoder.relation_k_emb.weight: copying a param with shape torch.Size([49, 64]) from checkpoint, the shape in current model is torch.Size([46, 64]).
        size mismatch for encoder.relation_v_emb.weight: copying a param with shape torch.Size([49, 64]) from checkpoint, the shape in current model is torch.Size([46, 64]).

wandb: Waiting for W&B process to finish, PID 310089... (failed 1). Press ctrl-c to abort syncing.

Could you please check and see where the error occurred? Thank you.
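A hedged reading of the logs above (an editor's note, not the maintainers' answer): the relation-embedding tables are sized by the number of relation types a run is configured with, and the training and evaluation configs here appear to disagree — the checkpoint stores 49 x 64 tables, while the eval run, which prints "Num of relations uesd in RASAT is : 45", builds 46 x 64 ones. Flags such as use_coref and use_dependency change that count, so they must match between training and evaluation. A quick way to inspect what the checkpoint actually contains:

```python
import torch

# Hedged diagnostic: print the relation-embedding shapes stored in the
# checkpoint and compare them with what the eval config builds. The directory
# is the one from the issue; the weight filename assumes a standard
# Hugging Face save.
state = torch.load("./experiment/train_spider_rasat_small/pytorch_model.bin",
                   map_location="cpu")
for key in ("relation_k_emb.weight", "relation_v_emb.weight"):
    print(key, tuple(state[key].shape))  # e.g. (49, 64) vs. the expected (46, 64)
```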

converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor

usr/local/lib/python3.7/site-packages/transformers/tokenization_utils_base.py:705: UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor. (Triggered internally at ../torch/csrc/utils/tensor_new.cpp:230.)
tensor = as_tensor(value)

I am trying to load the model from a previously trained checkpoint file. It takes a long time to skip the previous data and produces the warning above. How can I solve this?
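For reference, the fix the warning itself suggests is to stack the list of equal-length arrays into a single ndarray before converting to a tensor. A minimal sketch with toy data (not the repo's actual code path):

```python
import numpy as np
import torch

# Toy stand-in for a list of per-example feature arrays of equal length.
arrays = [np.zeros(512), np.ones(512)]

slow = torch.as_tensor(arrays)            # list of ndarrays: triggers the UserWarning
fast = torch.as_tensor(np.array(arrays))  # one stacked ndarray: no warning, much faster
assert torch.equal(slow, fast)
```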

File missing when starting training

So the problem I encountered is that the file named train_0125_example.json, as stated in the readme, does not exist. I changed the file name to train_sparc_rasat_small.json hoping that would circumvent the problem, but another error arose: No module named 'third_party'. (A possible workaround is sketched below.)
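A common cause of "No module named 'third_party'" (an assumption, not a confirmed fix) is launching the script from outside the repository root, so the third_party/ directory is not importable:

```python
import os
import sys

# Assumed workaround: make the rasat checkout (which contains third_party/)
# importable before the seq2seq modules are loaded.
REPO_ROOT = os.path.abspath(".")  # adjust if not running from the repo root
sys.path.insert(0, REPO_ROOT)
```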

Run time

[screenshot of the reported training run time]
Is this normal? Does it really take that long?

Model weight missing.

Hello author, I found that the model weights were missing when I tried to reproduce your results directly using Docker. Could you share the model weights with me?

Could you share the training config of sparc_add_coref_t5_3b_order_0514_ckpt-4224?

Hi @JiexingQi,
I'm trying to reproduce RASAT-T5-3b on SParC. I have trained the model for 4736 steps and only got a best QEM of 63.7% and IEM of 45.0%, at step 3776. There is still a 1.3% QEM gap compared to sparc_add_coref_t5_3b_order_0514_ckpt-4224 without PICARD. Is there anything wrong with my training config? Could you share the training config of sparc_add_coref_t5_3b_order_0514_ckpt-4224?

My training config is modified from the given train_sparc_rasat_small.json. I have 8 GPUs, so I modified per_device_train_batch_size and gradient_accumulation_steps to reach the recommended total_batch_size of 2048 (see the arithmetic check after the list below).

model_name_or_path: t5-small -> t5-3b
dataset: sparc+spider -> sparc
per_device_train_batch_size: 16 -> 2
per_device_eval_batch_size: 16 -> 2
gradient_accumulation_steps: 32 -> 128
use_coref: false -> true
use_dependency: true -> false
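A quick arithmetic check of the effective batch size implied by the values above (assuming 8 GPUs, as described):

```python
# Effective batch size for the config above (values from the issue).
per_device_train_batch_size = 2
num_gpus = 8
gradient_accumulation_steps = 128
total = per_device_train_batch_size * num_gpus * gradient_accumulation_steps
assert total == 2048, total
```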

My full training config:

{
    "run_name": "train_sparc_rasat_3b",
    "model_name_or_path": "t5-3b",
    "use_rasat": true,
    "dataset": "sparc",
    "wandb_project_name": "rasat_experiment",
    "source_prefix": "",
    "schema_serialization_type": "custom",
    "schema_serialization_randomized": false,
    "schema_serialization_with_db_id": true,
    "schema_serialization_with_db_content": false,
    "normalize_query": true,
    "target_with_db_id": true,
    "output_dir": "./experiment/train_sparc_rasat_3b",
    "cache_dir": "./transformers_cache",
    "do_train": true,
    "do_eval": true,
    "fp16": false,
    "num_train_epochs": 3072,
    "per_device_train_batch_size": 2,
    "per_device_eval_batch_size": 2,
    "gradient_accumulation_steps": 128,
    "label_smoothing_factor": 0.0,
    "learning_rate": 1e-4,
    "adafactor": true,
    "adam_eps": 1e-6,
    "lr_scheduler_type": "constant",
    "warmup_ratio": 0.0,
    "warmup_steps": 0,
    "weight_decay": 0,
    "seed": 1,
    "report_to": ["wandb"],
    "logging_strategy": "steps",
    "logging_first_step": true,
    "logging_steps": 3,
    "load_best_model_at_end": true,
    "metric_for_best_model": "exact_match",
    "greater_is_better": true,
    "save_total_limit": 2,
    "save_steps": 64,
    "evaluation_strategy": "steps",
    "eval_steps": 64,
    "predict_with_generate": true,
    "num_beams": 4,
    "num_beam_groups": 1,
    "edge_type": "Default",
    "use_coref": true, 
    "use_dependency": false,
    "use_picard": false,
    "overwrite_output_dir": true,
    "dataloader_num_workers": 8,
    "group_by_length": true,
    "gradient_checkpointing":true
}

problem with train/eval

Hi. I'm trying to run the project and reproduce the results myself, but I keep getting this error on the train/eval commands.

Traceback (most recent call last):
  File "/content/rasat/seq2seq/run_seq2seq.py", line 292, in <module>
    main()
  File "/content/rasat/seq2seq/run_seq2seq.py", line 158, in main
    metric, dataset_splits = load_dataset(
  File "/content/rasat/seq2seq/utils/dataset_loader.py", line 181, in load_dataset
    sparc_dataset_splits = prepare_splits(
  File "/content/rasat/seq2seq/utils/dataset.py", line 346, in prepare_splits
    train_split = _prepare_train_split(
  File "/content/rasat/seq2seq/utils/dataset.py", line 261, in _prepare_train_split
    relation_matrix_l = preprocess_by_dataset(
  File "/content/rasat/seq2seq/preprocess/choose_dataset.py", line 14, in preprocess_by_dataset
    _, relations = preprocessing_lgerels2t5rels_changeOrder(data_base_dir, dataset_name, t5_processed, mode, edge_type, use_coref, use_dependency)
  File "/content/rasat/seq2seq/preprocess/lgerels2t5rels_changeOrder.py", line 476, in preprocessing_lgerels2t5rels_changeOrder
    match_table_and_column(dataset_lgesql, table_lgesql, t5_tokenizer)
  File "/content/rasat/seq2seq/preprocess/lgerels2t5rels_changeOrder.py", line 136, in match_table_and_column
    raise e
  File "/content/rasat/seq2seq/preprocess/lgerels2t5rels_changeOrder.py", line 133, in match_table_and_column
    lge_table = table_lgesql[db_name]['table_names']
KeyError: 'sqlite_sequence:name,seq'

Does anybody have a solution?
If you have a working notebook that runs the train/eval commands without problems, it would be kind of you to share it.

Thank you
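Not an official fix, but the failing key looks like sqlite_sequence, SQLite's internal bookkeeping table for AUTOINCREMENT columns, leaking into the schema dump. Schema enumeration normally filters it out, e.g.:

```python
import sqlite3

# sqlite_sequence is an internal table SQLite creates for AUTOINCREMENT
# columns; schema preprocessing usually has to skip it. "example.db" is a
# placeholder for one of the Spider/SParC database files.
conn = sqlite3.connect("example.db")
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' "
    "AND name != 'sqlite_sequence'")]
print(tables)
```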

Eval process issue

Thanks for reading this issue. When I'm already in Docker and run CUDA_VISIBLE_DEVICES="2" python3 seq2seq/eval_run_seq2seq.py configs/cosql/eval_cosql_rasat_576.json, I get this error:

Truncation was not explicitly activated but max_length is provided a specific value, please use truncation=True to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to truncation.
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 6.38ba/s]
01/07/2024 08:31:51 - WARNING - stanza - Can not find mwt: default from official model list. Ignoring it.
Traceback (most recent call last):
  File "seq2seq/eval_run_seq2seq.py", line 310, in <module>
    main()
  File "seq2seq/eval_run_seq2seq.py", line 177, in main
    tokenizer=tokenizer,
  File "/app/seq2seq/utils/dataset_loader.py", line 123, in load_dataset
    **_prepare_splits_kwargs,
  File "/app/seq2seq/utils/dataset.py", line 360, in prepare_splits
    pre_process_function=pre_process_function,
  File "/app/seq2seq/utils/dataset.py", line 324, in _prepare_eval_split
    use_dependency=data_training_args.use_dependency
  File "/app/seq2seq/preprocess/choose_dataset.py", line 12, in preprocess_by_dataset
    preprocessing_generate_lgerels(data_base_dir, dataset_name, mode, use_coref, use_dependency)
  File "/app/seq2seq/preprocess/process_dataset.py", line 81, in preprocessing_generate_lgerels
    processor = Preprocessor(dataset_name, db_dir=db_dir, db_content=True)
  File "/app/seq2seq/preprocess/common_utils.py", line 146, in __init__
    self.nlp_tokenize = stanza.Pipeline('en', processors='tokenize,mwt,pos,lemma,depparse', tokenize_pretokenized = False, use_gpu=True)#, use_gpu=False)
  File "/home/toolkit/.local/lib/python3.7/site-packages/stanza/pipeline/core.py", line 107, in __init__
    self.load_list = add_dependencies(resources, lang, self.load_list) if lang in resources else []
  File "/home/toolkit/.local/lib/python3.7/site-packages/stanza/resources/common.py", line 245, in add_dependencies
    default_dependencies = resources[lang]['default_dependencies']
KeyError: 'default_dependencies'

Thanks for any solution. That will be really important for me.
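One hedged guess: KeyError: 'default_dependencies' usually indicates missing or stale stanza resources in the container (the earlier "Can not find mwt" warning points the same way). Re-downloading the English models before constructing the pipeline may help:

```python
import stanza

# Fetch (or refresh) resources.json plus the default English packages, then
# build the same pipeline the preprocessing code uses.
stanza.download("en")
nlp = stanza.Pipeline("en", processors="tokenize,mwt,pos,lemma,depparse",
                      tokenize_pretokenized=False, use_gpu=True)
```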

Picard Client Launch Issue

Hi

I am executing the eval code on the Spider data with PICARD, using the Docker image:

  1. gave 777 permissions to the two mentioned directories
  2. ran 'make eval'
  3. inside Docker, installed the stanza package
  4. python3 seq2seq/eval_run_seq2seq.py configs/spider/eval_spider_rasat_4160.json

After preprocessing and generating relations, I got the following error:

Use relation model.
Traceback (most recent call last):
  File "seq2seq/eval_run_seq2seq.py", line 309, in <module>
    main()
  File "seq2seq/eval_run_seq2seq.py", line 197, in main
    model = model_cls_wrapper(T5ForConditionalGeneration).from_pretrained(
  File "seq2seq/eval_run_seq2seq.py", line 184, in <lambda>
    model_cls=model_cls, picard_args=picard_args, tokenizer=tokenizer, schemas=dataset_splits.schemas
  File "/app/seq2seq/utils/custom_picard_model_wrapper.py", line 482, in with_picard
    asyncio.run(_init_picard(), debug=False)
  File "/opt/conda/lib/python3.7/asyncio/runners.py", line 43, in run
    return loop.run_until_complete(main)
  File "/opt/conda/lib/python3.7/asyncio/base_events.py", line 587, in run_until_complete
    return future.result()
  File "/app/seq2seq/utils/custom_picard_model_wrapper.py", line 126, in _init_picard
    await _register_schema(db_id=db_id, db_info=db_info, picard_client=client)
  File "/app/seq2seq/utils/custom_picard_model_wrapper.py", line 132, in _register_schema
    await picard_client.registerSQLSchema(db_id, sql_schema)
thrift.py3.exceptions.TransportError: (<TransportErrorType.UNKNOWN: 0>, 'Channel is !good()', 0, <TransportOptions.0: 0>)

Environment: MacBook
It would be really helpful if you could advise on this: @Monstarrr @JiexingQi @hantek
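For what it's worth, 'Channel is !good()' from a Thrift client typically means the PICARD server process was never reachable. A minimal connectivity probe, assuming the default picard port 9090 (adjust if your setup differs):

```python
import socket

# Probe the Thrift endpoint before schema registration to separate
# "server not running" from genuine protocol errors.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    s.settimeout(2)
    try:
        s.connect(("127.0.0.1", 9090))  # assumed default picard port
        print("picard server reachable")
    except OSError as exc:
        print("picard server not reachable:", exc)
```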

Hello, I have a small question for you. What is the meaning of the sequence number after the model configuration file?


[screenshot of configuration file names ending in step numbers]

In the Spider dataset there are also such suffixes on the configuration file names.
[screenshot of Spider configuration file names]

I would like to ask what these numbers mean, and if I train with t5-small, which configuration file should I use for evaluation?

Wishing you good health during the Dragon Boat Festival and success in your work. Best regards.

share the predicted SQL file

Hello, could you share the predicted SQL files that you finally ran on the validation set? Replicating your work is too time-consuming for me, as my server does not have a GPU.
[screenshot]

Out of memory with default configs/train.json on 4*24GB GPU

Hi @JiexingQi, I found you asked a similar question here: ServiceNow#29. I tried to train t5-3b with CUDA_VISIBLE_DEVICES="0,1,2,3" python3 -m torch.distributed.launch --nnodes=1 --nproc_per_node=4 seq2seq/run_seq2seq.py configs/train.json, even with a config like this:
"per_device_train_batch_size": 1,
"per_device_eval_batch_size": 1,
"gradient_accumulation_steps": 1,
"gradient_checkpointing": true,
But I still get an out-of-memory error, and all four GPUs' memory is used up (about 22 GB on each GPU).
I think you must have had similar experiences when using the PICARD code. Could you show me how you solved this annoying out-of-memory problem? Thank you!
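For a rough sense of why 24 GB cards struggle here, a back-of-envelope estimate (assumed numbers, not a measurement): t5-3b has roughly 2.85 billion parameters, and plain data parallelism keeps a full replica plus gradients on every GPU:

```python
# Rough fp32 memory estimate per GPU for t5-3b under plain data parallelism
# (assumed parameter count; activations and optimizer state excluded).
params = 2.85e9          # approximate t5-3b parameter count
bytes_per_param = 4      # fp32
weights_gb = params * bytes_per_param / 2**30
print(f"weights: ~{weights_gb:.1f} GB, weights + gradients: ~{2 * weights_gb:.1f} GB")
# ~10.6 GB for weights and ~21.2 GB with gradients -- close to the observed
# 22 GB before activations are even counted, which is why batch size 1 still OOMs.
```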

Question about Question Dependency Structure relation

Hello there,

First of all, great work you guys! I'm really excited to understand and evaluate your model on other datasets.

My question is simply: how is the Question Dependency Structure relation different from regular self-attention? According to Figure 2 in the paper, this relation seems to serve the same purpose as vanilla self-attention. Could you expand on this relation concept?
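For context, one editor's reading of the RAT-SQL-style relation-aware self-attention that RASAT builds on (hedged, not an authoritative answer from the authors): learned relation embeddings $r^K_{ij}$ and $r^V_{ij}$ bias the attention scores and values, and vanilla self-attention is the special case where they are zero:

$$
e_{ij} = \frac{x_i W^Q \left( x_j W^K + r^K_{ij} \right)^{\top}}{\sqrt{d_z}},
\qquad
z_i = \sum_j \operatorname{softmax}_j(e_{ij}) \left( x_j W^V + r^V_{ij} \right)
$$

Under this reading, the Question Dependency Structure relation differs from plain self-attention in that $r_{ij}$ is tied to the parsed syntactic edge between question tokens $i$ and $j$, rather than leaving the model to infer that structure from content alone.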

TypeError: issubclass() arg 1 must be a class

Getting an error while running bash run_corefer_processing.sh:

Start coref process
2023-06-13 08:13:23 INFO: Loading these models for language: en (English):

| Processor | Package  |
| tokenize  | combined |

2023-06-13 08:13:23 INFO: Use device: cpu
2023-06-13 08:13:23 INFO: Loading: tokenize
2023-06-13 08:13:23 INFO: Done loading processors!
Traceback (most recent call last):
  File "/home/prathamesh/code/T2S_model_training/rasat/get_coref.py", line 303, in <module>
    get_coref_by_path(input_path = args.input_path, output_path = args.output_path, dataset_name = args.dataset_name, mode = args.mode)
  File "/home/prathamesh/code/T2S_model_training/rasat/get_coref.py", line 285, in get_coref_by_path
    nlp = init_nlp()
  File "/home/prathamesh/code/T2S_model_training/rasat/get_coref.py", line 38, in init_nlp
    nlp = spacy.load('en_core_web_trf')
  File "/home/prathamesh/anaconda3/envs/coreferee/lib/python3.9/site-packages/spacy/__init__.py", line 51, in load
    return util.load_model(
  File "/home/prathamesh/anaconda3/envs/coreferee/lib/python3.9/site-packages/spacy/util.py", line 321, in load_model
    return load_model_from_package(name, **kwargs)
  File "/home/prathamesh/anaconda3/envs/coreferee/lib/python3.9/site-packages/spacy/util.py", line 354, in load_model_from_package
    return cls.load(vocab=vocab, disable=disable, exclude=exclude, config=config)
  File "/home/prathamesh/anaconda3/envs/coreferee/lib/python3.9/site-packages/en_core_web_trf/__init__.py", line 10, in load
    return load_model_from_init_py(__file__, **overrides)
  File "/home/prathamesh/anaconda3/envs/coreferee/lib/python3.9/site-packages/spacy/util.py", line 514, in load_model_from_init_py
    return load_model_from_path(
  File "/home/prathamesh/anaconda3/envs/coreferee/lib/python3.9/site-packages/spacy/util.py", line 389, in load_model_from_path
    nlp = load_model_from_config(config, vocab=vocab, disable=disable, exclude=exclude)
  File "/home/prathamesh/anaconda3/envs/coreferee/lib/python3.9/site-packages/spacy/util.py", line 426, in load_model_from_config
    nlp = lang_cls.from_config(
  File "/home/prathamesh/anaconda3/envs/coreferee/lib/python3.9/site-packages/spacy/language.py", line 1715, in from_config
    nlp.add_pipe(
  File "/home/prathamesh/anaconda3/envs/coreferee/lib/python3.9/site-packages/spacy/language.py", line 777, in add_pipe
    pipe_component = self.create_pipe(
  File "/home/prathamesh/anaconda3/envs/coreferee/lib/python3.9/site-packages/spacy/language.py", line 661, in create_pipe
    resolved = registry.resolve(cfg, validate=validate)
  File "/home/prathamesh/anaconda3/envs/coreferee/lib/python3.9/site-packages/thinc/config.py", line 746, in resolve
    resolved, _ = cls._make(
  File "/home/prathamesh/anaconda3/envs/coreferee/lib/python3.9/site-packages/thinc/config.py", line 795, in _make
    filled, _, resolved = cls._fill(
  File "/home/prathamesh/anaconda3/envs/coreferee/lib/python3.9/site-packages/thinc/config.py", line 850, in _fill
    filled[key], validation[v_key], final[key] = cls._fill(
  File "/home/prathamesh/anaconda3/envs/coreferee/lib/python3.9/site-packages/thinc/config.py", line 849, in _fill
    promise_schema = cls.make_promise_schema(value, resolve=resolve)
  File "/home/prathamesh/anaconda3/envs/coreferee/lib/python3.9/site-packages/thinc/config.py", line 1057, in make_promise_schema
    return create_model("ArgModel", **sig_args)
  File "pydantic/main.py", line 990, in pydantic.main.create_model
  File "pydantic/main.py", line 299, in pydantic.main.ModelMetaclass.__new__
  File "pydantic/fields.py", line 411, in pydantic.fields.ModelField.infer
  File "pydantic/fields.py", line 342, in pydantic.fields.ModelField.__init__
  File "pydantic/fields.py", line 451, in pydantic.fields.ModelField.prepare
  File "pydantic/fields.py", line 550, in pydantic.fields.ModelField._type_analysis
  File "/home/prathamesh/anaconda3/envs/coreferee/lib/python3.9/typing.py", line 847, in __subclasscheck__
    return issubclass(cls, self.__origin__)
TypeError: issubclass() arg 1 must be a class

the version of transformers

[screenshot of the error]
I tried to train, but the following error occurred. It seems the transformers version is too high; I tried to install transformers==4.13.0 but failed...

Size mismatch error when training on the Spider dataset

When I ran CUDA_VISIBLE_DEVICES="0" python3 seq2seq/run_seq2seq.py configs/spider/train_spider_rasat_small.json
I get the following error:
Traceback (most recent call last):
  File "/home/yaoy/convertsql2tree/RASAT/seq2seq/run_seq2seq.py", line 292, in <module>
    main()
  File "/home/yaoy/convertsql2tree/RASAT/seq2seq/run_seq2seq.py", line 237, in main
    train_result = trainer.train(resume_from_checkpoint=checkpoint)
  File "/home/yaoy/miniconda3/envs/rasat/lib/python3.9/site-packages/transformers/trainer.py", line 1325, in train
    tr_loss_step = self.training_step(model, inputs)
  File "/home/yaoy/miniconda3/envs/rasat/lib/python3.9/site-packages/transformers/trainer.py", line 1884, in training_step
    loss = self.compute_loss(model, inputs)
  File "/home/yaoy/miniconda3/envs/rasat/lib/python3.9/site-packages/transformers/trainer.py", line 1916, in compute_loss
    outputs = model(**inputs)
  File "/home/yaoy/miniconda3/envs/rasat/lib/python3.9/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/yaoy/convertsql2tree/RASAT/seq2seq/model/t5_relation_model.py", line 1742, in forward
    encoder_outputs = self.encoder(
  File "/home/yaoy/miniconda3/envs/rasat/lib/python3.9/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/yaoy/convertsql2tree/RASAT/seq2seq/model/t5_relation_model.py", line 1135, in forward
    layer_outputs = checkpoint(
  File "/home/yaoy/miniconda3/envs/rasat/lib/python3.9/site-packages/torch/utils/checkpoint.py", line 177, in checkpoint
    return CheckpointFunction.apply(function, preserve, *args)
  File "/home/yaoy/miniconda3/envs/rasat/lib/python3.9/site-packages/torch/utils/checkpoint.py", line 75, in forward
    outputs = run_function(*args)
  File "/home/yaoy/convertsql2tree/RASAT/seq2seq/model/t5_relation_model.py", line 1131, in custom_forward
    return tuple(module(*inputs, use_cache, output_attentions))
  File "/home/yaoy/miniconda3/envs/rasat/lib/python3.9/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/yaoy/convertsql2tree/RASAT/seq2seq/model/t5_relation_model.py", line 755, in forward
    self_attention_outputs = self.layer[0](
  File "/home/yaoy/miniconda3/envs/rasat/lib/python3.9/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/yaoy/convertsql2tree/RASAT/seq2seq/model/t5_relation_model.py", line 658, in forward
    attention_output = self.SelfAttention(
  File "/home/yaoy/miniconda3/envs/rasat/lib/python3.9/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/yaoy/convertsql2tree/RASAT/seq2seq/model/t5_relation_model.py", line 587, in forward
    scores = relative_attention_logits(query_states, key_states, relation_k_states)  # [batch, heads, num queries, num kvs]
  File "/home/yaoy/convertsql2tree/RASAT/seq2seq/model/t5_relation_model.py", line 512, in relative_attention_logits
    q_tr_t_matmul = torch.matmul(q_t, r_t)
RuntimeError: The size of tensor a (472) must match the size of tensor b (468) at non-singleton dimension 1
I try to print the size of the q_t and r_t, and get the following result:
q_t shape: torch.Size([8, 472, 8, 64])
r_t shape: torch.Size([8, 468, 64, 468])

I would have expected the second dimension of both to be 512. Does anyone have an idea what went wrong and how to fix it?
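A minimal reproduction of the shape contract behind this error (toy tensors only): the relation tensor's sequence dimensions must match the tokenized input length, so a 472-token batch cannot be combined with a 468 x 468 relation matrix:

```python
import torch

# Shapes copied from the issue: q_t is [batch, seq, heads, head_dim] and
# r_t is [batch, seq, head_dim, seq]; the two seq dims disagree (472 vs 468),
# so matmul's batch-dimension broadcasting fails.
q_t = torch.randn(8, 472, 8, 64)
r_t = torch.randn(8, 468, 64, 468)
try:
    torch.matmul(q_t, r_t)
except RuntimeError as err:
    print(err)  # the same size-mismatch message as in the traceback
```

One hedged interpretation: the cached relation matrices were built under a different tokenization than the current run (e.g. a different tokenizer version or a stale preprocessing cache), so regenerating the cache may matter more than any fixed length of 512.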

Why is table alignment necessary?

align_tables.py seems to swap column names for certain databases.

For example, for the Spider dataset, the column names of the store_1 database are swapped. I understand that this is because of annotation issues in the original Spider dataset. However, I don't understand why a simple swap solves the issue.

Training process bug

I have followed all the steps mentioned in the repo, and training went smoothly enough until this point. Most probably there is a bug in rasat/seq2seq/utils/spider.py. Here is the error description:

ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (1034,) + inhomogeneous part.

Can you please help me to fix the issue?

[screenshot of the traceback]
