
sqlova's Introduction

SQLova

  • SQLova is a neural semantic parser that translates natural language utterances into SQL queries. The name originates from the name of our department: Search & QLova (Search & Clova).

Authors

Abstract

  • We present a new state-of-the-art semantic parsing model that translates a natural language (NL) utterance into a SQL query.
  • The model is evaluated on WikiSQL, a semantic parsing dataset consisting of 80,654 (NL, SQL) pairs over 24,241 tables from Wikipedia.
  • We achieve 83.6% logical form accuracy and 89.6% execution accuracy on the WikiSQL test set.

The model in a nutshell

Results (Updated at Jan 12, 2019)

Model     | Dev logical form accuracy | Dev execution accuracy | Test logical form accuracy | Test execution accuracy
SQLova    | 81.6 (+5.5)^              | 87.2 (+3.2)^           | 80.7 (+5.3)^               | 86.2 (+2.5)^
SQLova-EG | 84.2 (+8.2)*              | 90.2 (+3.0)*           | 83.6 (+8.2)*               | 89.6 (+2.5)*
  • ^: Compared to current SOTA models that do not use execution-guided decoding.
  • *: Compared to the current SOTA.
  • The order of where conditions is ignored when measuring logical form accuracy in our model.

Source code

Requirements

  • Python 3.6 or higher.
  • PyTorch 0.4.0 or higher.
  • CUDA 9.0
  • Python libraries: babel, matplotlib, defusedxml, tqdm (a quick import check is sketched after this list).
  • Example
    • Install Miniconda
    • conda install pytorch torchvision -c pytorch
    • conda install -c conda-forge records==0.5.2
    • conda install babel
    • conda install matplotlib
    • conda install defusedxml
    • conda install tqdm
  • The code has been tested on a Tesla M40 GPU running Ubuntu 16.04.4 LTS.
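
A minimal sanity check of the environment described above (illustrative only; it just verifies that the listed packages import and reports the PyTorch/CUDA status):

    # Quick environment check for the requirements listed above (illustrative only).
    import torch
    import babel, matplotlib, defusedxml, tqdm, records  # noqa: F401 -- just verify they import

    print('PyTorch version:', torch.__version__)        # expect 0.4.0 or higher
    print('CUDA available :', torch.cuda.is_available())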

Running code

  • Type python3 train.py --seed 1 --bS 16 --accumulate_gradients 2 --bert_type_abb uS --fine_tune --lr 0.001 --lr_bert 0.00001 --max_seq_leng 222 in a terminal.
    • --seed 1: Set the seed of the random generator. The accuracy changes by a few percent depending on the seed.
    • --bS 16: Set the batch size to 16.
    • --accumulate_gradients 2: Make the effective batch size 16 * 2 = 32 (a minimal gradient-accumulation sketch follows this list).
    • --bert_type_abb uS: Use the Uncased-Base BERT model. Use uL to use Uncased-Large BERT.
    • --fine_tune: Train BERT. Without this, only the sequence-to-SQL module is trained.
    • --lr 0.001: Set the learning rate of the sequence-to-SQL module to 0.001.
    • --lr_bert 0.00001: Set the learning rate of the BERT module to 0.00001.
    • --max_seq_leng 222: Set the maximum input token length for BERT.
  • The model should show ~79% logical form accuracy (lx) on the dev set after ~12 hrs (~10 epochs). Higher accuracy can be obtained with longer training, a different seed, the Uncased-Large BERT model, or execution-guided decoding.
  • Add the --EG argument when running train.py to use execution-guided decoding.
  • Whenever a higher logical form accuracy is achieved on the dev set, the following three files are saved in the current folder:
    • model_best.pt: the checkpoint of the sequence-to-SQL module.
    • model_bert_best.pt: the checkpoint of the BERT module.
    • results_dev.jsonl: JSON file for the official evaluation.
  • Shallow-Layer and Decoder-Layer models can be trained similarly (train_shallow_layer.py, train_decoder_layer.py).
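
As referenced in the --accumulate_gradients bullet above, here is a minimal sketch of how gradient accumulation yields the larger effective batch size. It uses a toy model and toy data and is illustrative only; it is not the training loop in train.py.

    import torch
    import torch.nn as nn

    # Illustrative sketch of gradient accumulation (not the code in train.py).
    # With a per-step batch of 16 and accumulate_gradients = 2, the optimizer is
    # stepped once every 2 mini-batches, i.e. with an effective batch size of 32.
    bS, accumulate_gradients = 16, 2

    model = nn.Linear(8, 1)                                   # toy stand-in for the real model
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    batches = [(torch.randn(bS, 8), torch.randn(bS, 1)) for _ in range(4)]  # toy data

    optimizer.zero_grad()
    for iB, (x, y) in enumerate(batches):
        loss = nn.functional.mse_loss(model(x), y)
        (loss / accumulate_gradients).backward()              # average gradients over the effective batch
        if (iB + 1) % accumulate_gradients == 0:
            optimizer.step()
            optimizer.zero_grad()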

Evaluation on WikiSQL DEV set

  • To calculate logical form and execution accuracies on the dev set using the official evaluation script:
    • Download the original WikiSQL dataset.
    • tar xvf data.tar.bz2
    • Move the extracted files under $HOME/data/WikiSQL-1.1/data.
    • Set the paths in evaluation_ws.py. This file is the original evaluation.py script with the path information added; alternatively, use the original evaluation.py and set the paths to the files yourself.
    • Type python3 evaluation_ws.py in a terminal.

Evaluation on WikiSQL TEST set

  • Uncomment lines 550-557 of train.py to load test_loader and test_table.
  • In the test(...) function, use test_loader and test_table instead of dev_loader and dev_table.
  • Save the output of test(...) with the save_for_evaluation(...) function.
  • Evaluate with evaluation_ws.py as before.

Load pre-trained SQLova parameters.

  • Pre-trained SQLova model parameters are uploaded in the release. To start from them, uncomment lines 562-565 and set the paths. A checkpoint-inspection sketch follows.
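
A minimal sketch for inspecting the released checkpoint files before wiring them into train.py. The exact wrapping of each state dict is not asserted here, so print the top-level keys and pass the appropriate object to load_state_dict():

    import torch

    # Hedged sketch: inspect the released checkpoints (file names as saved during training).
    # Whether each file is a bare state_dict or wraps it under a key is checked by printing.
    for fname in ('model_best.pt', 'model_bert_best.pt'):
        ckpt = torch.load(fname, map_location='cpu')
        top = list(ckpt.keys()) if isinstance(ckpt, dict) else [type(ckpt).__name__]
        print(fname, '->', top[:5])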

Code base

  • Pre-trained BERT models were downloaded from the official repository.
  • BERT code is from huggingface-pytorch-pretrained-BERT.
  • The sequence-to-SQL model started from the source code of SQLNet and was significantly rewritten while maintaining SQLNet's basic column-attention and sequence-to-set structure.

Data

  • The data is annotated using annotate_ws.py, which is based on annotate.py from the WikiSQL repository. It annotates the tokens of the natural-language query and the start and end indices of where-condition values over those tokens.
  • Pre-trained BERT parameters can be downloaded from the official BERT repository and converted to a PyTorch checkpoint file using the following script. You need to install both PyTorch and TensorFlow and change BERT_BASE_DIR to your data directory.
    cd sqlova
    export BERT_BASE_DIR=data/uncased_L-12_H-768_A-12
    python bert/convert_tf_checkpoint_to_pytorch.py \
        --tf_checkpoint_path $BERT_BASE_DIR/bert_model.ckpt \
        --bert_config_file    $BERT_BASE_DIR/bert_config.json \
        --pytorch_dump_path     $BERT_BASE_DIR/pytorch_model.bin 
  • bert/convert_tf_checkpoint_to_pytorch.py is from a previous version of huggingface-pytorch-pretrained-BERT; the current version of pytorch-pretrained-BERT is not compatible with the BERT model used in this repo due to a difference in variable names (in LayerNorm). See this for details.
  • For convenience, the annotated WikiSQL data and the PyTorch-converted pre-trained BERT parameters are available here.

License

Copyright 2019-present NAVER Corp.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

sqlova's People

Contributors

0xflotus, guotong1988, hanrelan, paulfitz, wenfengand, whwang299


sqlova's Issues

File Not found issue

Hi
I found your paper and GitHub code and am very interested in this topic.
I've tried to test your current results, but I am facing errors related to missing files.
From here, I tried to run python3 train.py --seed 1 --bS 16 --accumulate_gradients 2 --bert_type_abb uS --fine_tune --lr 0.001 --lr_bert 0.00001 --max_seq_leng 222.
But I got the error like:
python train.py --seed 1 --bS 16 --accumulate_gradients 2 --bert_type_abb uS --fine_tune --lr 0.001 --lr_bert 0.00001 --max_seq_leng 222
BERT-type: uncased_L-12_H-768_A-12
Traceback (most recent call last):
  File "train.py", line 552, in <module>
    train_data, train_table, dev_data, dev_table, train_loader, dev_loader = get_data(path_wikisql, args)
  File "train.py", line 183, in get_data
    train_data, train_table, dev_data, dev_table, _, _ = load_wikisql(path_wikisql, args.toy_model, args.toy_size, no_w2i=True, no_hs_tok=True)
  File "/home/ubuntu/work/torch-sqlova/sqlova/sqlova/utils/utils_wikisql.py", line 29, in load_wikisql
    train_data, train_table = load_wikisql_data(path_wikisql, mode='train', toy_model=toy_model, toy_size=toy_size, no_hs_tok=no_hs_tok, aug=aug)
  File "/home/ubuntu/work/torch-sqlova/sqlova/sqlova/utils/utils_wikisql.py", line 58, in load_wikisql_data
    with open(path_sql) as f:
FileNotFoundError: [Errno 2] No such file or directory: '/home/wonseok/data/wikisql_tok/train_tok.jsonl'

I think you have additional data that is not included in this repository.
What is this additional data?
Thank you.

Difference between the PyTorch-converted pre-trained BERT parameters released on Google Drive and the one obtained using HuggingFace conversion script

I tried to get the Pytorch pre-trained BERT checkpoint using the conversion script provided by HuggingFace. The script executed without any problem and I was able to obtain a binary converted file.

However, I noticed a few differences between this file compared with the PyTorch-converted pre-trained BERT parameters released on Google Drive.

First, the two files have different variable naming. The HuggingFace-converted file has the prefix bert. for each variable and cannot be used by SQLova directly.

RuntimeError: Error(s) in loading state_dict for BertModel:

Missing key(s) in state_dict: "embeddings.word_embeddings.weight", "embeddings.position_embeddings.weight", "embeddings.token_type_embeddings.weight", "embeddings.LayerNorm.gamma", "embeddings.LayerNorm.beta", "encoder.layer.0.attention.self.query.weight", "encoder.layer.0.attention.self.query.bias", "encoder.layer.0.attention.self.key.weight", "encoder.layer.0.attention.self.key.bias", "encoder.layer.0.attention.self.value.weight", "encoder.layer.0.attention.self.value.bias", "encoder.layer.0.attention.output.dense.weight", "encoder.layer.0.attention.output.dense.bias", "encoder.layer.0.attention.output.LayerNorm.gamma", "encoder.layer.0.attention.output.LayerNorm.beta", "encoder.layer.0.intermediate.dense.weight", "encoder.layer.0.intermediate.dense.bias", "encoder.layer.0.output.dense.weight", "encoder.layer.0.output.dense.bias", "encoder.layer.0.output.LayerNorm.gamma", "encoder.layer.0.output.LayerNorm.beta", "encoder.layer.1.attention.self.query.weight"...

Unexpected key(s) in state_dict: "bert.embeddings.word_embeddings.weight", "bert.embeddings.position_embeddings.weight", "bert.embeddings.token_type_embeddings.weight", "bert.embeddings.LayerNorm.weight", "bert.embeddings.LayerNorm.bias", "bert.encoder.layer.0.attention.self.query.weight", "bert.encoder.layer.0.attention.self.query.bias", "bert.encoder.layer.0.attention.self.key.weight", "bert.encoder.layer.0.attention.self.key.bias", "bert.encoder.layer.0.attention.self.value.weight", "bert.encoder.layer.0.attention.self.value.bias", "bert.encoder.layer.0.attention.output.dense.weight", "bert.encoder.layer.0.attention.output.dense.bias", "bert.encoder.layer.0.attention.output.LayerNorm.weight", "bert.encoder.layer.0.attention.output.LayerNorm.bias", "bert.encoder.layer.0.intermediate.dense.weight", "bert.encoder.layer.0.intermediate.dense.bias", "bert.encoder.layer.0.output.dense.weight", "bert.encoder.layer.0.output.dense.bias", "bert.encoder.layer.0.output.LayerNorm.weight", "bert.encoder.layer.0.output.LayerNorm.bias"...

I was able to map most variables in these two files by manipulating the naming and verify their equivalence, but I cannot find a mapping of the following tensors in the HuggingFace conversion to the Google Drive release, most of them related to layer normalization.

bert.embeddings.LayerNorm.weight
bert.embeddings.LayerNorm.bias
bert.encoder.layer.0.attention.output.LayerNorm.weight
bert.encoder.layer.0.attention.output.LayerNorm.bias
bert.encoder.layer.0.output.LayerNorm.weight
bert.encoder.layer.0.output.LayerNorm.bias
bert.encoder.layer.1.attention.output.LayerNorm.weight
bert.encoder.layer.1.attention.output.LayerNorm.bias
bert.encoder.layer.1.output.LayerNorm.weight
bert.encoder.layer.1.output.LayerNorm.bias
bert.encoder.layer.2.attention.output.LayerNorm.weight
bert.encoder.layer.2.attention.output.LayerNorm.bias
bert.encoder.layer.2.output.LayerNorm.weight
bert.encoder.layer.2.output.LayerNorm.bias
bert.encoder.layer.3.attention.output.LayerNorm.weight
bert.encoder.layer.3.attention.output.LayerNorm.bias
bert.encoder.layer.3.output.LayerNorm.weight
bert.encoder.layer.3.output.LayerNorm.bias
bert.encoder.layer.4.attention.output.LayerNorm.weight
bert.encoder.layer.4.attention.output.LayerNorm.bias
bert.encoder.layer.4.output.LayerNorm.weight
bert.encoder.layer.4.output.LayerNorm.bias
bert.encoder.layer.5.attention.output.LayerNorm.weight
bert.encoder.layer.5.attention.output.LayerNorm.bias
bert.encoder.layer.5.output.LayerNorm.weight
bert.encoder.layer.5.output.LayerNorm.bias
bert.encoder.layer.6.attention.output.LayerNorm.weight
bert.encoder.layer.6.attention.output.LayerNorm.bias
bert.encoder.layer.6.output.LayerNorm.weight
bert.encoder.layer.6.output.LayerNorm.bias
bert.encoder.layer.7.attention.output.LayerNorm.weight
bert.encoder.layer.7.attention.output.LayerNorm.bias
bert.encoder.layer.7.output.LayerNorm.weight
bert.encoder.layer.7.output.LayerNorm.bias
bert.encoder.layer.8.attention.output.LayerNorm.weight
bert.encoder.layer.8.attention.output.LayerNorm.bias
bert.encoder.layer.8.output.LayerNorm.weight
bert.encoder.layer.8.output.LayerNorm.bias
bert.encoder.layer.9.attention.output.LayerNorm.weight
bert.encoder.layer.9.attention.output.LayerNorm.bias
bert.encoder.layer.9.output.LayerNorm.weight
bert.encoder.layer.9.output.LayerNorm.bias
bert.encoder.layer.10.attention.output.LayerNorm.weight
bert.encoder.layer.10.attention.output.LayerNorm.bias
bert.encoder.layer.10.output.LayerNorm.weight
bert.encoder.layer.10.output.LayerNorm.bias
bert.encoder.layer.11.attention.output.LayerNorm.weight
bert.encoder.layer.11.attention.output.LayerNorm.bias
bert.encoder.layer.11.output.LayerNorm.weight
bert.encoder.layer.11.output.LayerNorm.bias
cls.predictions.bias
cls.predictions.transform.dense.weight
cls.predictions.transform.dense.bias
cls.predictions.transform.LayerNorm.weight
cls.predictions.transform.LayerNorm.bias
cls.predictions.decoder.weight
cls.seq_relationship.weight
cls.seq_relationship.bias

May I ask what causes the above differences? Is layer normalization removed from the BERT architecture on purpose? Thanks.
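
For reference, if only the naming conventions differ (the bert. prefix, LayerNorm weight/bias vs. gamma/beta, plus the unused cls.* heads), a remapping sketch such as the one below can bring a HuggingFace-converted checkpoint into the naming expected by the BERT code bundled with this repo. This is an assumption based on the key lists above, not code from the repository:

    import torch
    from collections import OrderedDict

    # Hedged sketch (not repo code): rename keys of a HuggingFace-converted BERT
    # checkpoint to the older pytorch-pretrained-BERT naming used by this repo.
    src = torch.load('pytorch_model.bin', map_location='cpu')   # HuggingFace-converted file

    dst = OrderedDict()
    for k, v in src.items():
        if k.startswith('cls.'):                    # pre-training heads, unused by SQLova
            continue
        if k.startswith('bert.'):
            k = k[len('bert.'):]
        k = k.replace('LayerNorm.weight', 'LayerNorm.gamma')
        k = k.replace('LayerNorm.bias', 'LayerNorm.beta')
        dst[k] = v

    torch.save(dst, 'pytorch_model_renamed.bin')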

HOW TO SOLVE THIS PROBLEM in train

When running "train.py --seed 1 --bS 16 --accumulate_gradients 2 --bert_type_abb uS --fine_tune --lr 0.001 --lr_bert 0.00001 --max_seq_leng 222",

how can I solve this problem?
This is the error:
/home/quh/.conda/envs/pytorch/lib/python3.7/site-packages/torch/nn/functional.py:1386: UserWarning: nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.
warnings.warn("nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.")
Error closing cursor
Traceback (most recent call last):
File "/home/quh/.conda/envs/pytorch/lib/python3.7/site-packages/sqlalchemy/engine/result.py", line 1268, in fetchone
row = self._fetchone_impl()
File "/home/quh/.conda/envs/pytorch/lib/python3.7/site-packages/sqlalchemy/engine/result.py", line 1148, in _fetchone_impl
return self.cursor.fetchone()
sqlite3.ProgrammingError: Cannot operate on a closed database.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/quh/.conda/envs/pytorch/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1333, in _safe_close_cursor
cursor.close()
sqlite3.ProgrammingError: Cannot operate on a closed database.
Traceback (most recent call last):
File "/home/quh/.conda/envs/pytorch/lib/python3.7/site-packages/sqlalchemy/engine/result.py", line 1268, in fetchone
row = self._fetchone_impl()
File "/home/quh/.conda/envs/pytorch/lib/python3.7/site-packages/sqlalchemy/engine/result.py", line 1148, in _fetchone_impl
return self.cursor.fetchone()
sqlite3.ProgrammingError: Cannot operate on a closed database.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "train.py", line 605, in
dset_name='train')
File "train.py", line 310, in train
cnt_x1_list, g_ans, pr_ans = get_cnt_x_list(engine, tb, g_sc, g_sa, sql_i, pr_sc, pr_sa, pr_sql_i)
File "/home/quh/pythonwork/nl2sql/nl2sql_baseline/sqlova/sqlova/utils/utils_wikisql.py", line 1652, in get_cnt_x_list
g_ans1 = engine.execute(tb[b]['id'], g_sc[b], g_sa[b], g_sql_i[b]['conds'])
File "/home/quh/pythonwork/nl2sql/nl2sql_baseline/sqlova/sqlnet/dbengine.py", line 29, in execute
table_info = self.db.query('SELECT sql from sqlite_master WHERE tbl_name = :name', name=table_id).all()[0].sql.replace('\n','')
File "/home/quh/.conda/envs/pytorch/lib/python3.7/site-packages/records.py", line 195, in all
rows = list(self)
File "/home/quh/.conda/envs/pytorch/lib/python3.7/site-packages/records.py", line 126, in iter
yield next(self)
File "/home/quh/.conda/envs/pytorch/lib/python3.7/site-packages/records.py", line 136, in next
nextrow = next(self._rows)
File "/home/quh/.conda/envs/pytorch/lib/python3.7/site-packages/records.py", line 365, in
row_gen = (Record(cursor.keys(), row) for row in cursor)
File "/home/quh/.conda/envs/pytorch/lib/python3.7/site-packages/sqlalchemy/engine/result.py", line 946, in iter
row = self.fetchone()
File "/home/quh/.conda/envs/pytorch/lib/python3.7/site-packages/sqlalchemy/engine/result.py", line 1276, in fetchone
e, None, None, self.cursor, self.context
File "/home/quh/.conda/envs/pytorch/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1466, in _handle_dbapi_exception
util.raise_from_cause(sqlalchemy_exception, exc_info)
File "/home/quh/.conda/envs/pytorch/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 399, in raise_from_cause
reraise(type(exception), exception, tb=exc_tb, cause=cause)
File "/home/quh/.conda/envs/pytorch/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 153, in reraise
raise value.with_traceback(tb)
File "/home/quh/.conda/envs/pytorch/lib/python3.7/site-packages/sqlalchemy/engine/result.py", line 1268, in fetchone
row = self._fetchone_impl()
File "/home/quh/.conda/envs/pytorch/lib/python3.7/site-packages/sqlalchemy/engine/result.py", line 1148, in _fetchone_impl
return self.cursor.fetchone()
sqlalchemy.exc.ProgrammingError: (sqlite3.ProgrammingError) Cannot operate on a closed database.
(Background on this error at: http://sqlalche.me/e/f405)

Train lf accuracy higher than execution accuracy

Hi,

When running the script, it is surprising to see that after several iterations (<3), the training logical form accuracy becomes higher than the training execution accuracy, which confuses me a lot.

May I ask what the reason behind it is? I have been confused for several days and could not figure it out. Thank you for any help in advance!

Loading Pretrained SQLova Parameters

I'm having an error when trying to load the pretrained parameters. It seems like the file may be missing some keys?

Any help you could provide will be much appreciated.

self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for FT_Scalar_1:
Unexpected key(s) in state_dict: "wcp.enc_h.weight_ih_l0", "wcp.enc_h.weight_hh_l0", "wcp.enc_h.bias_ih_l0", "wcp.enc_h.bias_hh_l0", "wcp.enc_h.weight_ih_l0_reverse", "wcp.enc_h.weight_hh_l0_reverse", "wcp.enc_h.bias_ih_l0_reverse", "wcp.enc_h.bias_hh_l0_reverse", "wcp.enc_h.weight_ih_l1", "wcp.enc_h.weight_hh_l1", "wcp.enc_h.bias_ih_l1", "wcp.enc_h.bias_hh_l1", "wcp.enc_h.weight_ih_l1_reverse", "wcp.enc_h.weight_hh_l1_reverse", "wcp.enc_h.bias_ih_l1_reverse", "wcp.enc_h.bias_hh_l1_reverse", "wcp.enc_n.weight_ih_l0", "wcp.enc_n.weight_hh_l0", "wcp.enc_n.bias_ih_l0", "wcp.enc_n.bias_hh_l0", "wcp.enc_n.weight_ih_l0_reverse", "wcp.enc_n.weight_hh_l0_reverse", "wcp.enc_n.bias_ih_l0_reverse", "wcp.enc_n.bias_hh_l0_reverse", "wcp.enc_n.weight_ih_l1", "wcp.enc_n.weight_hh_l1", "wcp.enc_n.bias_ih_l1", "wcp.enc_n.bias_hh_l1", "wcp.enc_n.weight_ih_l1_reverse", "wcp.enc_n.weight_hh_l1_reverse", "wcp.enc_n.bias_ih_l1_reverse", "wcp.enc_n.bias_hh_l1_reverse", "wcp.W_att.weight", "wcp.W_att.bias", "wcp.W_c.weight", "wcp.W_c.bias", "wcp.W_hs.weight", "wcp.W_hs.bias", "wcp.W_out.1.weight", "wcp.W_out.1.bias", "scp.enc_h.weight_ih_l0", "scp.enc_h.weight_hh_l0", "scp.enc_h.bias_ih_l0", "scp.enc_h.bias_hh_l0", "scp.enc_h.weight_ih_l0_reverse", "scp.enc_h.weight_hh_l0_reverse", "scp.enc_h.bias_ih_l0_reverse", "scp.enc_h.bias_hh_l0_reverse", "scp.enc_h.weight_ih_l1", "scp.enc_h.weight_hh_l1", "scp.enc_h.bias_ih_l1", "scp.enc_h.bias_hh_l1", "scp.enc_h.weight_ih_l1_reverse", "scp.enc_h.weight_hh_l1_reverse", "scp.enc_h.bias_ih_l1_reverse", "scp.enc_h.bias_hh_l1_reverse", "scp.enc_n.weight_ih_l0", "scp.enc_n.weight_hh_l0", "scp.enc_n.bias_ih_l0", "scp.enc_n.bias_hh_l0", "scp.enc_n.weight_ih_l0_reverse", "scp.enc_n.weight_hh_l0_reverse", "scp.enc_n.bias_ih_l0_reverse", "scp.enc_n.bias_hh_l0_reverse", "scp.enc_n.weight_ih_l1", "scp.enc_n.weight_hh_l1", "scp.enc_n.bias_ih_l1", "scp.enc_n.bias_hh_l1", "scp.enc_n.weight_ih_l1_reverse", "scp.enc_n.weight_hh_l1_reverse", "scp.enc_n.bias_ih_l1_reverse", "scp.enc_n.bias_hh_l1_reverse", "scp.W_att.weight", "scp.W_att.bias", "scp.W_c.weight", "scp.W_c.bias", "scp.W_hs.weight", "scp.W_hs.bias", "scp.sc_out.1.weight", "scp.sc_out.1.bias", "sap.enc_h.weight_ih_l0", "sap.enc_h.weight_hh_l0", "sap.enc_h.bias_ih_l0", "sap.enc_h.bias_hh_l0", "sap.enc_h.weight_ih_l0_reverse", "sap.enc_h.weight_hh_l0_reverse", "sap.enc_h.bias_ih_l0_reverse", "sap.enc_h.bias_hh_l0_reverse", "sap.enc_h.weight_ih_l1", "sap.enc_h.weight_hh_l1", "sap.enc_h.bias_ih_l1", "sap.enc_h.bias_hh_l1", "sap.enc_h.weight_ih_l1_reverse", "sap.enc_h.weight_hh_l1_reverse", "sap.enc_h.bias_ih_l1_reverse", "sap.enc_h.bias_hh_l1_reverse", "sap.enc_n.weight_ih_l0", "sap.enc_n.weight_hh_l0", "sap.enc_n.bias_ih_l0", "sap.enc_n.bias_hh_l0", "sap.enc_n.weight_ih_l0_reverse", "sap.enc_n.weight_hh_l0_reverse", "sap.enc_n.bias_ih_l0_reverse", "sap.enc_n.bias_hh_l0_reverse", "sap.enc_n.weight_ih_l1", "sap.enc_n.weight_hh_l1", "sap.enc_n.bias_ih_l1", "sap.enc_n.bias_hh_l1", "sap.enc_n.weight_ih_l1_reverse", "sap.enc_n.weight_hh_l1_reverse", "sap.enc_n.bias_ih_l1_reverse", "sap.enc_n.bias_hh_l1_reverse", "sap.W_att.weight", "sap.W_att.bias", "sap.sa_out.0.weight", "sap.sa_out.0.bias", "sap.sa_out.2.weight", "sap.sa_out.2.bias", "wnp.enc_h.weight_ih_l0", "wnp.enc_h.weight_hh_l0", "wnp.enc_h.bias_ih_l0", "wnp.enc_h.bias_hh_l0", "wnp.enc_h.weight_ih_l0_reverse", "wnp.enc_h.weight_hh_l0_reverse", "wnp.enc_h.bias_ih_l0_reverse", "wnp.enc_h.bias_hh_l0_reverse", "wnp.enc_h.weight_ih_l1", "wnp.enc_h.weight_hh_l1", "wnp.enc_h.bias_ih_l1", 
"wnp.enc_h.bias_hh_l1", "wnp.enc_h.weight_ih_l1_reverse", "wnp.enc_h.weight_hh_l1_reverse", "wnp.enc_h.bias_ih_l1_reverse", "wnp.enc_h.bias_hh_l1_reverse", "wnp.enc_n.weight_ih_l0", "wnp.enc_n.weight_hh_l0", "wnp.enc_n.bias_ih_l0", "wnp.enc_n.bias_hh_l0", "wnp.enc_n.weight_ih_l0_reverse", "wnp.enc_n.weight_hh_l0_reverse", "wnp.enc_n.bias_ih_l0_reverse", "wnp.enc_n.bias_hh_l0_reverse", "wnp.enc_n.weight_ih_l1", "wnp.enc_n.weight_hh_l1", "wnp.enc_n.bias_ih_l1", "wnp.enc_n.bias_hh_l1", "wnp.enc_n.weight_ih_l1_reverse", "wnp.enc_n.weight_hh_l1_reverse", "wnp.enc_n.bias_ih_l1_reverse", "wnp.enc_n.bias_hh_l1_reverse", "wnp.W_att_h.weight", "wnp.W_att_h.bias", "wnp.W_hidden.weight", "wnp.W_hidden.bias", "wnp.W_cell.weight", "wnp.W_cell.bias", "wnp.W_att_n.weight", "wnp.W_att_n.bias", "wnp.wn_out.0.weight", "wnp.wn_out.0.bias", "wnp.wn_out.2.weight", "wnp.wn_out.2.bias", "wop.enc_h.weight_ih_l0", "wop.enc_h.weight_hh_l0", "wop.enc_h.bias_ih_l0", "wop.enc_h.bias_hh_l0", "wop.enc_h.weight_ih_l0_reverse", "wop.enc_h.weight_hh_l0_reverse", "wop.enc_h.bias_ih_l0_reverse", "wop.enc_h.bias_hh_l0_reverse", "wop.enc_h.weight_ih_l1", "wop.enc_h.weight_hh_l1", "wop.enc_h.bias_ih_l1", "wop.enc_h.bias_hh_l1", "wop.enc_h.weight_ih_l1_reverse", "wop.enc_h.weight_hh_l1_reverse", "wop.enc_h.bias_ih_l1_reverse", "wop.enc_h.bias_hh_l1_reverse", "wop.enc_n.weight_ih_l0", "wop.enc_n.weight_hh_l0", "wop.enc_n.bias_ih_l0", "wop.enc_n.bias_hh_l0", "wop.enc_n.weight_ih_l0_reverse", "wop.enc_n.weight_hh_l0_reverse", "wop.enc_n.bias_ih_l0_reverse", "wop.enc_n.bias_hh_l0_reverse", "wop.enc_n.weight_ih_l1", "wop.enc_n.weight_hh_l1", "wop.enc_n.bias_ih_l1", "wop.enc_n.bias_hh_l1", "wop.enc_n.weight_ih_l1_reverse", "wop.enc_n.weight_hh_l1_reverse", "wop.enc_n.bias_ih_l1_reverse", "wop.enc_n.bias_hh_l1_reverse", "wop.W_att.weight", "wop.W_att.bias", "wop.W_c.weight", "wop.W_c.bias", "wop.W_hs.weight", "wop.W_hs.bias", "wop.wo_out.0.weight", "wop.wo_out.0.bias", "wop.wo_out.2.weight", "wop.wo_out.2.bias", "wvp.enc_h.weight_ih_l0", "wvp.enc_h.weight_hh_l0", "wvp.enc_h.bias_ih_l0", "wvp.enc_h.bias_hh_l0", "wvp.enc_h.weight_ih_l0_reverse", "wvp.enc_h.weight_hh_l0_reverse", "wvp.enc_h.bias_ih_l0_reverse", "wvp.enc_h.bias_hh_l0_reverse", "wvp.enc_h.weight_ih_l1", "wvp.enc_h.weight_hh_l1", "wvp.enc_h.bias_ih_l1", "wvp.enc_h.bias_hh_l1", "wvp.enc_h.weight_ih_l1_reverse", "wvp.enc_h.weight_hh_l1_reverse", "wvp.enc_h.bias_ih_l1_reverse", "wvp.enc_h.bias_hh_l1_reverse", "wvp.enc_n.weight_ih_l0", "wvp.enc_n.weight_hh_l0", "wvp.enc_n.bias_ih_l0", "wvp.enc_n.bias_hh_l0", "wvp.enc_n.weight_ih_l0_reverse", "wvp.enc_n.weight_hh_l0_reverse", "wvp.enc_n.bias_ih_l0_reverse", "wvp.enc_n.bias_hh_l0_reverse", "wvp.enc_n.weight_ih_l1", "wvp.enc_n.weight_hh_l1", "wvp.enc_n.bias_ih_l1", "wvp.enc_n.bias_hh_l1", "wvp.enc_n.weight_ih_l1_reverse", "wvp.enc_n.weight_hh_l1_reverse", "wvp.enc_n.bias_ih_l1_reverse", "wvp.enc_n.bias_hh_l1_reverse", "wvp.W_att.weight", "wvp.W_att.bias", "wvp.W_c.weight", "wvp.W_c.bias", "wvp.W_hs.weight", "wvp.W_hs.bias", "wvp.W_op.weight", "wvp.W_op.bias", "wvp.wv_out.0.weight", "wvp.wv_out.0.bias", "wvp.wv_out.2.weight", "wvp.wv_out.2.bias".

Error when testing

I ran the command in Readme.md, and after a long time of running, an error was thrown:

Traceback (most recent call last):
  File "train.py", line 614, in <module>
    dset_name='dev', EG=args.EG)
  File "train.py", line 495, in test
    cnt_x1_list, g_ans, pr_ans = get_cnt_x_list(engine, tb, g_sc, g_sa, sql_i, pr_sc, pr_sa, pr_sql_i)
  File "/home/sqlova-master/sqlova/utils/utils_wikisql.py", line 1651, in get_cnt_x_list
    g_ans1 = engine.execute(tb[b]['id'], g_sc[b], g_sa[b], g_sql_i[b]['conds'])
  File "/home/sqlova-master/sqlnet/dbengine.py", line 29, in execute
    table_info = self.db.query('SELECT sql from sqlite_master WHERE tbl_name = :name', name=table_id).all()[0].sql.replace('\n','')
IndexError: list index out of range

Is it because no result is returned? Do you have any idea how to solve this issue? Thank you in advance!

Where can I get wikisql_tok?

It throws an error

(py36) one@test:~/sqlova$ python3 train.py --seed 1 --bS 16 --accumulate_gradients 2 --bert_type_abb uS --fine_tune --lr 0.001 --lr_bert 0.00001 --max_seq_leng 222
BERT-type: uncased_L-12_H-768_A-12
Traceback (most recent call last):
  File "train.py", line 552, in <module>
    train_data, train_table, dev_data, dev_table, train_loader, dev_loader = get_data(path_wikisql, args)
  File "train.py", line 183, in get_data
    train_data, train_table, dev_data, dev_table, _, _ = load_wikisql(path_wikisql, args.toy_model, args.toy_size, no_w2i=True, no_hs_tok=True)
  File "/home/one/sqlova/sqlova/utils/utils_wikisql.py", line 29, in load_wikisql
    train_data, train_table = load_wikisql_data(path_wikisql, mode='train', toy_model=toy_model, toy_size=toy_size, no_hs_tok=no_hs_tok, aug=aug)
  File "/home/one/sqlova/sqlova/utils/utils_wikisql.py", line 58, in load_wikisql_data
    with open(path_sql) as f:
FileNotFoundError: [Errno 2] No such file or directory: '/home/one/data/wikisql_tok/train_tok.jsonl'

RuntimeError: CUDA out of memory.

BERT-type: uncased_L-12_H-768_A-12
Batch_size = 32
BERT parameters:
learning rate: 1e-05
Fine-tune BERT: True
vocab size: 30522
hidden_size: 768
num_hidden_layer: 12
num_attention_heads: 12
hidden_act: gelu
intermediate_size: 3072
hidden_dropout_prob: 0.1
attention_probs_dropout_prob: 0.1
max_position_embeddings: 512
type_vocab_size: 2
initializer_range: 0.02
Load pre-trained parameters.
Seq-to-SQL: the number of final BERT layers to be used: 2
Seq-to-SQL: the size of hidden dimension = 100
Seq-to-SQL: LSTM encoding layer size = 2
Seq-to-SQL: dropout rate = 0.3
Seq-to-SQL: learning rate = 0.001

Traceback (most recent call last):
  File "train.py", line 603, in <module>
    dset_name='train')
  File "train.py", line 239, in train
    num_out_layers_n=num_target_layers, num_out_layers_h=num_target_layers)
  File "/data4/tong.guo/sqlova-master/sqlova/utils/utils_wikisql.py", line 817, in get_wemb_bert
    nlu_tt, t_to_tt_idx, tt_to_t_idx = get_bert_output(model_bert, tokenizer, nlu_t, headers, max_seq_length)
  File "/data4/tong.guo/sqlova-master/sqlova/utils/utils_wikisql.py", line 751, in get_bert_output
    all_encoder_layer, pooled_output = model_bert(all_input_ids, all_segment_ids, all_input_mask)
  File "/data4/tong.guo/Py36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/data4/tong.guo/sqlova-master/bert/modeling.py", line 396, in forward
    all_encoder_layers = self.encoder(embedding_output, extended_attention_mask)
  File "/data4/tong.guo/Py36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/data4/tong.guo/sqlova-master/bert/modeling.py", line 326, in forward
    hidden_states = layer_module(hidden_states, attention_mask)
  File "/data4/tong.guo/Py36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/data4/tong.guo/sqlova-master/bert/modeling.py", line 311, in forward
    attention_output = self.attention(hidden_states, attention_mask)
  File "/data4/tong.guo/Py36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/data4/tong.guo/sqlova-master/bert/modeling.py", line 272, in forward
    self_output = self.self(input_tensor, attention_mask)
  File "/data4/tong.guo/Py36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/data4/tong.guo/sqlova-master/bert/modeling.py", line 226, in forward
    attention_scores = torch.matmul(query_layer, key_layer.transpose(-1, -2))
RuntimeError: CUDA out of memory. Tried to allocate 10.50 MiB (GPU 0; 11.17 GiB total capacity; 10.59 GiB already allocated; 5.69 MiB free; 257.72 MiB cached)

Typos in ReadMe.md

While going through the readme file, I found a few typos:


  1. guery - query
  2. coverted - converted

Please rectify them.

What does "wvi_corenlp" mean?

{"table_id":"1-1000181-1","phase":1,"question":"What is the current series where the new series began in June 2011?","question_tok":["What","is","the","current","series","where","the","new","series","began","in","June","2011","?"],"sql":{"sel":4,"conds":[[5,0,"New series began in June 2011"]],"agg":0},"query":{"sel":4,"conds":[[5,0,"New series began in June 2011"]],"agg":0},"wvi_corenlp":[[7,12]]}

In your annotated training file "train_tok.jsonl", what does "wvi_corenlp":[[7,12]] mean?
How do I produce this file "train_tok.jsonl" for a new training set? Thanks.
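
Reading off the example above, [[7, 12]] appears to be the inclusive (start, end) token indices of each where-condition value inside question_tok; a quick check:

    # Slice question_tok with the inclusive indices from "wvi_corenlp": [[7, 12]].
    question_tok = ["What", "is", "the", "current", "series", "where", "the",
                    "new", "series", "began", "in", "June", "2011", "?"]
    start, end = 7, 12
    print(" ".join(question_tok[start:end + 1]))
    # -> new series began in June 2011   (matches the condition value, up to case)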

RuntimeError: Buy new RAM!0

Hi,
I'm trying to run train.py but I'm getting this error. Should I really buy new RAM?
Thanks

Traceback (most recent call last):
File "train.py", line 604, in
dset_name='train')
File "train.py", line 240, in train
num_out_layers_n=num_target_layers, num_out_layers_h=num_target_layers)
File "C:\Users\Admin\Desktop\Projects\University\Software\Trainer+Evaluator\sqlova-master\sqlova\utils\utils_wikisql.py", line 820, in get_wemb_bert
nlu_tt, t_to_tt_idx, tt_to_t_idx = get_bert_output(model_bert, tokenizer, nlu_t, hds, max_seq_length)
File "C:\Users\Admin\Desktop\Projects\University\Software\Trainer+Evaluator\sqlova-master\sqlova\utils\utils_wikisql.py", line 754, in get_bert_output
all_encoder_layer, pooled_output = model_bert(all_input_ids, all_segment_ids, all_input_mask)
File "C:\Users\Admin\Miniconda3\lib\site-packages\torch\nn\modules\module.py", line 493, in call
result = self.forward(*input, **kwargs)
File "C:\Users\Admin\Desktop\Projects\University\Software\Trainer+Evaluator\sqlova-master\bert\modeling.py", line 396, in forward
all_encoder_layers = self.encoder(embedding_output, extended_attention_mask)
File "C:\Users\Admin\Miniconda3\lib\site-packages\torch\nn\modules\module.py", line 493, in call
result = self.forward(*input, **kwargs)
File "C:\Users\Admin\Desktop\Projects\University\Software\Trainer+Evaluator\sqlova-master\bert\modeling.py", line 326, in forward
hidden_states = layer_module(hidden_states, attention_mask)
File "C:\Users\Admin\Miniconda3\lib\site-packages\torch\nn\modules\module.py", line 493, in call
result = self.forward(*input, **kwargs)
File "C:\Users\Admin\Desktop\Projects\University\Software\Trainer+Evaluator\sqlova-master\bert\modeling.py", line 311, in forward
attention_output = self.attention(hidden_states, attention_mask)
File "C:\Users\Admin\Miniconda3\lib\site-packages\torch\nn\modules\module.py", line 493, in call
result = self.forward(*input, **kwargs)
File "C:\Users\Admin\Desktop\Projects\University\Software\Trainer+Evaluator\sqlova-master\bert\modeling.py", line 272, in forward
self_output = self.self(input_tensor, attention_mask)
File "C:\Users\Admin\Miniconda3\lib\site-packages\torch\nn\modules\module.py", line 493, in call
result = self.forward(*input, **kwargs)
File "C:\Users\Admin\Desktop\Projects\University\Software\Trainer+Evaluator\sqlova-master\bert\modeling.py", line 236, in forward
attention_probs = self.dropout(attention_probs)
File "C:\Users\Admin\Miniconda3\lib\site-packages\torch\nn\modules\module.py", line 493, in call
result = self.forward(*input, **kwargs)
File "C:\Users\Admin\Miniconda3\lib\site-packages\torch\nn\modules\dropout.py", line 58, in forward
return F.dropout(input, self.p, self.training, self.inplace)
File "C:\Users\Admin\Miniconda3\lib\site-packages\torch\nn\functional.py", line 830, in dropout
else _VF.dropout(input, p, training))
RuntimeError: [enforce fail at ..\c10\core\CPUAllocator.cpp:62] data. DefaultCPUAllocator: not enough memory: you tried to allocate %dGB. Buy new RAM!0

about the B and C modules in your paper

In your paper, what are the roles of the Decoder-Layer and the NL2SQL-Layer, respectively, and in what order are they executed? I don't quite understand your meaning.
Thanks.

How to fine tune sequence-to-SQL?

I want to fine-tune sequence-to-SQL, which turns natural language sentences into a program and gets the arithmetic and the nth value.
What should I do?

pickle problem in train

Hi guys:
Everything before training goes well. However, when I get into the epochs, the problem below occurs. Have I done something wrong?

Microsoft Windows [Version 10.0.17134.556]
(c) 2018 Microsoft Corporation. All rights reserved.

(venv) C:\PycharmProjects\sqlova>python train.py --seed 1 --bS 16 --accumulate_gradients 2 --bert_type_abb uS --fine_tune --lr 0.001 --lr_bert 0.00001 --max_seq_le
ng 222
BERT-type: uncased_L-12_H-768_A-12
Batch_size = 32
BERT parameters:
learning rate: 1e-05
Fine-tune BERT: True
vocab size: 30522
hidden_size: 768
num_hidden_layer: 12
num_attention_heads: 12
hidden_act: gelu
intermediate_size: 3072
hidden_dropout_prob: 0.1
attention_probs_dropout_prob: 0.1
max_position_embeddings: 512
type_vocab_size: 2
initializer_range: 0.02
Load pre-trained parameters.
Seq-to-SQL: the number of final BERT layers to be used: 2
Seq-to-SQL: the size of hidden dimension = 100
Seq-to-SQL: LSTM encoding layer size = 2
Seq-to-SQL: dropout rate = 0.3
Seq-to-SQL: learning rate = 0.001
Traceback (most recent call last):
File "train.py", line 591, in
dset_name='train')
File "train.py", line 211, in train
for iB, t in enumerate(train_loader):
File "C:\PycharmProjects\sqlova\venv\lib\site-packages\torch\utils\data\dataloader.py", line 822, in iter
return _DataLoaderIter(self)
File "C:\PycharmProjects\sqlova\venv\lib\site-packages\torch\utils\data\dataloader.py", line 563, in init
w.start()
File "C:\Python36\lib\multiprocessing\process.py", line 105, in start
self._popen = self._Popen(self)
File "C:\Python36\lib\multiprocessing\context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "C:\Python36\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "C:\Python36\lib\multiprocessing\popen_spawn_win32.py", line 65, in init
reduction.dump(process_obj, to_child)
File "C:\Python36\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'get_loader_wikisql..'

(venv) C:\PycharmProjects\sqlova>Traceback (most recent call last):
File "", line 1, in
File "C:\Python36\lib\multiprocessing\spawn.py", line 105, in spawn_main
exitcode = _main(fd)
File "C:\Python36\lib\multiprocessing\spawn.py", line 115, in _main
self = reduction.pickle.load(from_parent)
EOFError: Ran out of input

(venv) C:\PycharmProjects\sqlova>
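
For reference, a common workaround for this Windows pickling failure (an assumption, not something the repository documents) is to build the DataLoader with num_workers=0, so the local collate function defined inside get_loader_wikisql never has to be pickled by spawn-based multiprocessing. A self-contained sketch with toy data:

    from torch.utils.data import DataLoader

    # Hedged sketch: with num_workers=0 the DataLoader runs in the main process,
    # avoiding "AttributeError: Can't pickle local object ..." on Windows.
    toy_data = [{'question': 'q%d' % i} for i in range(8)]   # stand-in for train_data

    train_loader = DataLoader(
        dataset=toy_data,
        batch_size=4,
        shuffle=True,
        num_workers=0,             # key change for Windows
        collate_fn=lambda x: x,    # placeholder; the repo supplies its own collate_fn
    )
    for batch in train_loader:
        print(batch)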

How to use trained model for sample questions?

Hi,
I am new to Deep Learning. Your solution for seq2SQL is very impressive. Could you please provide a prediction.py file so that we can use the trained model on sample questions and get a sense of it?

Appreciate your help.

Thanks,
Prasad.P

how to test?

Excuse me, I'm not sure how to test. I already have test.jsonl and test.table.jsonl. How can I load the trained model, then load test.jsonl and test.table.jsonl into the model,
make predictions with the model, and output the predicted SQL statements?
Because I want to make a manual comparison between the predicted SQL statements and the correct SQL statements.
Thanks again

train_shallow_layer.py doesn't train correctly

I'm trying to train the shallow-layer model, and after 4-5 epochs I'm still seeing acc_lx close to zero. Is that normal? If you have an example training run log and the associated losses, that would be great. I want to make sure that something isn't broken before letting it train for a couple of days.

The loss actually doesn't seem to change at all between epochs, so I think training isn't happening, but I haven't modified the source code other than the paths.

Including my training configuration and output log below. I have an 11GB GPU so I had to change the batch size and gradient accumulation to prevent out of memory errors.

python train_shallow_layer.py --seed 1 --bS 8 --accumulate_gradients 4 --bert_type_abb uS --fine_tune --lr 0.001 --lr_bert 0.00001 --max_seq_leng 222

BERT-type: uncased_L-12_H-768_A-12
Batch_size = 32
BERT parameters:
learning rate: 1e-05
Fine-tune BERT: True
vocab size: 30522
hidden_size: 768
num_hidden_layer: 12
num_attention_heads: 12
hidden_act: gelu
intermediate_size: 3072
hidden_dropout_prob: 0.1
attention_probs_dropout_prob: 0.1
max_position_embeddings: 512
type_vocab_size: 2
initializer_range: 0.02
Load pre-trained parameters.
Seq-to-SQL: the number of final BERT layers to be used: 1
Seq-to-SQL: the size of hidden dimension = 100
Seq-to-SQL: LSTM encoding layer size = 2
Seq-to-SQL: dropout rate = 0.3
Seq-to-SQL: learning rate = 0.001


train results ------------
 Epoch: 0, ave loss: 6.216941231456922, acc_sc: 0.163, acc_sa: 0.717, acc_wn: 0.590,         acc_wc: 0.092, acc_wo: 0.547, acc_wvi: 0.016, acc_wv: 0.016, acc_lx: 0.000, acc_x: 0.001
dev results ------------
 Epoch: 0, ave loss: 6.288717828444157, acc_sc: 0.174, acc_sa: 0.715, acc_wn: 0.683,         acc_wc: 0.143, acc_wo: 0.658, acc_wvi: 0.016, acc_wv: 0.027, acc_lx: 0.000, acc_x: 0.001
 Best Dev lx acc: 0.00023750148438427741 at epoch: 0
train results ------------
 Epoch: 1, ave loss: 6.191903309470515, acc_sc: 0.166, acc_sa: 0.720, acc_wn: 0.692,         acc_wc: 0.113, acc_wo: 0.668, acc_wvi: 0.028, acc_wv: 0.028, acc_lx: 0.000, acc_x: 0.001
dev results ------------
 Epoch: 1, ave loss: 6.2836473057494, acc_sc: 0.168, acc_sa: 0.715, acc_wn: 0.683,         acc_wc: 0.148, acc_wo: 0.658, acc_wvi: 0.007, acc_wv: 0.014, acc_lx: 0.000, acc_x: 0.000
 Best Dev lx acc: 0.00023750148438427741 at epoch: 0
train results ------------
 Epoch: 2, ave loss: 6.187300067725954, acc_sc: 0.167, acc_sa: 0.720, acc_wn: 0.693,         acc_wc: 0.113, acc_wo: 0.669, acc_wvi: 0.033, acc_wv: 0.033, acc_lx: 0.000, acc_x: 0.001
dev results ------------
 Epoch: 2, ave loss: 6.283452489599453, acc_sc: 0.169, acc_sa: 0.715, acc_wn: 0.683,         acc_wc: 0.152, acc_wo: 0.658, acc_wvi: 0.000, acc_wv: 0.000, acc_lx: 0.000, acc_x: 0.001
 Best Dev lx acc: 0.00023750148438427741 at epoch: 0

Pre-Trained model

Can I get access to any pre-trained model which I can load and test?

Annotated queries not lower-cased

Hi, may I ask if all the queries are supposed to be lowercased during annotation? In your code, the uncased model (uS) is the default setting for BERT; however, when annotating queries, they are not being lowercased.

Thank you very much for the help!

new error when run train.py

This occurs the first time I run train.py in PyCharm on Win10.
Can you tell me why, and how to fix it? Thanks.

train.py runs forever... 15 hrs on a GTX 960 that I had to abort

➜ sqlova git:(master) python3 train.py --seed 1 --bS 16 --accumulate_gradients 2 --bert_type_abb uS --fine_tune --lr 0.001 --lr_bert 0.00001 --max_seq_leng 222

BERT-type: uncased_L-12_H-768_A-12
Batch_size = 32
BERT parameters:
learning rate: 1e-05
Fine-tune BERT: True
vocab size: 30522
hidden_size: 768
num_hidden_layer: 12
num_attention_heads: 12
hidden_act: gelu
intermediate_size: 3072
hidden_dropout_prob: 0.1
attention_probs_dropout_prob: 0.1
max_position_embeddings: 512
type_vocab_size: 2
initializer_range: 0.02
Load pre-trained parameters.
Seq-to-SQL: the number of final BERT layers to be used: 2
Seq-to-SQL: the size of hidden dimension = 100
Seq-to-SQL: LSTM encoding layer size = 2
Seq-to-SQL: dropout rate = 0.3
Seq-to-SQL: learning rate = 0.001
/home/leftnoteasy/miniconda3/lib/python3.7/site-packages/torch/nn/functional.py:1386: UserWarning: nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.
warnings.warn("nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.")

^C
Traceback (most recent call last):
File "train.py", line 605, in
dset_name='train')
File "train.py", line 241, in train
num_out_layers_n=num_target_layers, num_out_layers_h=num_target_layers)
File "/home/leftnoteasy/borde/sqlova/sqlova/utils/utils_wikisql.py", line 817, in get_wemb_bert
nlu_tt, t_to_tt_idx, tt_to_t_idx = get_bert_output(model_bert, tokenizer, nlu_t, hds, max_seq_length)
File "/home/leftnoteasy/borde/sqlova/sqlova/utils/utils_wikisql.py", line 751, in get_bert_output
all_encoder_layer, pooled_output = model_bert(all_input_ids, all_segment_ids, all_input_mask)
File "/home/leftnoteasy/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in call
result = self.forward(*input, **kwargs)
File "/home/leftnoteasy/borde/sqlova/bert/modeling.py", line 396, in forward
all_encoder_layers = self.encoder(embedding_output, extended_attention_mask)
File "/home/leftnoteasy/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in call
result = self.forward(*input, **kwargs)
File "/home/leftnoteasy/borde/sqlova/bert/modeling.py", line 326, in forward
hidden_states = layer_module(hidden_states, attention_mask)
File "/home/leftnoteasy/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in call
result = self.forward(*input, **kwargs)
File "/home/leftnoteasy/borde/sqlova/bert/modeling.py", line 311, in forward
attention_output = self.attention(hidden_states, attention_mask)
File "/home/leftnoteasy/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in call
result = self.forward(*input, **kwargs)
File "/home/leftnoteasy/borde/sqlova/bert/modeling.py", line 272, in forward
self_output = self.self(input_tensor, attention_mask)
File "/home/leftnoteasy/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in call
result = self.forward(*input, **kwargs)
File "/home/leftnoteasy/borde/sqlova/bert/modeling.py", line 215, in forward
mixed_query_layer = self.query(hidden_states)
File "/home/leftnoteasy/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in call
result = self.forward(*input, **kwargs)
File "/home/leftnoteasy/miniconda3/lib/python3.7/site-packages/torch/nn/modules/linear.py", line 92, in forward
return F.linear(input, self.weight, self.bias)
File "/home/leftnoteasy/miniconda3/lib/python3.7/site-packages/torch/nn/functional.py", line 1408, in linear
output = input.matmul(weight.t())
KeyboardInterrupt
^C

What's the intuition behind the segment id labels?

Dear authors, I'm new to BERT and hope to ask a question regarding the segment id setting.

tokens: ['[CLS]', 'how', 'many', 'table', 'singer', 'do', 'we', 'have', '?', '[SEP]', 'stadium', '[SEP]', 'singer', '[SEP]', 'concert', '[SEP]', 'singer', 'in', 'concert', '[SEP]']

segment id: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1]

Are there any particular thoughts behind setting all of the "[SEP]" tokens to 0 (and the last one to 1)? I think segment ids are used to separate sentences. Thanks in advance for answering my question!

about *.jsonl

Hello, I've run through the whole program and made predictions, but the files used for prediction are test.jsonl, test.table.jsonl, test.db and test_tok.jsonl.
Now I want to use my own data for prediction. I see your add_csv.py and add_question.py files, and I have two questions about them:

  1. For the "sql" key in *.jsonl, how do I fill in the values inside, such as "sel" and "agg"? Do I need to understand the database and fill them in myself?
  2. You do not provide the code to generate *_tok.jsonl, but the program needs this file. How do I create it?
    I hope to get your help. Thank you.

Not getting proper output with dates

Hi,
I'm using the same model on my custom datasets, and I'm running queries like "tickets available after 21 July 2019", but I'm not getting proper output. Can anyone please suggest how to handle this?

weird error occurs when running predict.py

Sorry, execution of evaluate_ws.py failed with an error.
Can you guess the cause?

Thank you in advance.

algorithm | dataset | Accuracy
gloom     | chaos   | 5
gloom     | violin  | 6
gloom     | tooth   | 5
brave     | chaos   | 4
brave     | violin  | 8
brave     | tooth   | 9

question:
Is the accuracy of the algorithm gloom in the dataset chaos smaller than the accuracy of the algorithm brave in the dataset tooth?
error:
invalid argument 0: Sizes of tensors must match except in dimension 2. Got 3 and 4 in dimension 1 at /Users/soumith/code/builder/wheel/pytorch-src/aten/src/TH/generic/THTensorMath.cpp:3616

assert problem

In sqlova/utils/utils_wikisql.py, in def get_bert_output, lines 728-735:

    while len(input_ids1) < max_seq_length:
        input_ids1.append(0)
        input_mask1.append(0)
        segment_ids1.append(0)

    assert len(input_ids1) == max_seq_length
    assert len(input_mask1) == max_seq_length
    assert len(segment_ids1) == max_seq_length
This will raise an error if --max_seq_leng is smaller than the actual sequence length.

On the actual situation

Thank you for your great work. In practical application, if I just give a "question", can SQLova output a predicted SQL query?

Multiple Where Clauses

Although the WikiSQL test dataset has only one where clause, if we use the shallow-layer model on our own training and test sets, can SQLova deal with multiple where clauses? I believe the functions get_wc1, get_wo1, and get_wv1 handle this such that you can have multiple condition 'lists'. I would love to get clarification on whether that's true.

What about the case of multiple select columns? Is this something that is possible already, or would I need to change the code to accommodate training data with multiple selected columns?

ProgrammingError: (sqlite3.ProgrammingError) Cannot operate on a closed database.

Hi, I've tried to run train.py and I am getting this error. What should I do to solve it?
Traceback (most recent call last):
File "train.py", line 607, in
dset_name='train')
File "train.py", line 310, in train
cnt_x1_list, g_ans, pr_ans = get_cnt_x_list(engine, tb, g_sc, g_sa, sql_i, pr_sc, pr_sa, pr_sql_i)
File "/home/ubuntu/hengan.liao/sqlova-master/sqlova/utils/utils_wikisql.py", line 1652, in get_cnt_x_list
g_ans1 = engine.execute(tb[b]['id'], g_sc[b], g_sa[b], g_sql_i[b]['conds'])
File "/home/ubuntu/hengan.liao/sqlova-master/sqlnet/dbengine.py", line 29, in execute
table_info = self.db.query('SELECT sql from sqlite_master WHERE tbl_name = :name', name=table_id).all()[0].sql.replace('\n','')
File "/home/ubuntu/anaconda3/envs/wcyEnv/lib/python3.6/site-packages/records.py", line 195, in all
rows = list(self)
File "/home/ubuntu/anaconda3/envs/wcyEnv/lib/python3.6/site-packages/records.py", line 126, in iter
yield next(self)
File "/home/ubuntu/anaconda3/envs/wcyEnv/lib/python3.6/site-packages/records.py", line 136, in next
nextrow = next(self._rows)
File "/home/ubuntu/anaconda3/envs/wcyEnv/lib/python3.6/site-packages/records.py", line 365, in
row_gen = (Record(cursor.keys(), row) for row in cursor)
File "/home/ubuntu/anaconda3/envs/wcyEnv/lib/python3.6/site-packages/sqlalchemy/engine/result.py", line 946, in iter
row = self.fetchone()
File "/home/ubuntu/anaconda3/envs/wcyEnv/lib/python3.6/site-packages/sqlalchemy/engine/result.py", line 1276, in fetchone
e, None, None, self.cursor, self.context
File "/home/ubuntu/anaconda3/envs/wcyEnv/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1466, in _handle_dbapi_exception
util.raise_from_cause(sqlalchemy_exception, exc_info)
File "/home/ubuntu/anaconda3/envs/wcyEnv/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 383, in raise_from_cause
reraise(type(exception), exception, tb=exc_tb, cause=cause)
File "/home/ubuntu/anaconda3/envs/wcyEnv/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 128, in reraise
raise value.with_traceback(tb)
File "/home/ubuntu/anaconda3/envs/wcyEnv/lib/python3.6/site-packages/sqlalchemy/engine/result.py", line 1268, in fetchone
row = self._fetchone_impl()
File "/home/ubuntu/anaconda3/envs/wcyEnv/lib/python3.6/site-packages/sqlalchemy/engine/result.py", line 1148, in _fetchone_impl
return self.cursor.fetchone()
sqlalchemy.exc.ProgrammingError: (sqlite3.ProgrammingError) Cannot operate on a closed database.
(Background on this error at: http://sqlalche.me/e/f405)

Prediction uses select column from the ground truth

Hi Wonseok:
Thanks for your great work here. The issue is as follows:
https://github.com/naver/sqlova/blob/master/train.py#L406
g_wvi_corenlp = get_g_wvi_corenlp(t)
This uses the "conds" in dev, and if one wants to predict from a plain English sentence, there is no way to get something like "{"conds":[[0,0,"1998"]], "sel":1, "agg":0}".

The following URL describes a similar problem in SQLNet:
xiaojunxu/SQLNet#12

Have I missed something, or do you have a similar snippet of code?

Thanks a lot.

How do I use predict.py for custom data?

I have CSV data with strings and numbers. How should I convert it into a DB and the other jsonl files so I can run inference on my custom data? add_csv.py isn't accepting string values in the CSV.

Thanks,
Bill.

RuntimeError: CUDA error: out of memory

BERT-type: uncased_L-12_H-768_A-12
Batch_size = 8
BERT parameters:
learning rate: 1e-05
Fine-tune BERT: True
vocab size: 30522
hidden_size: 768
num_hidden_layer: 12
num_attention_heads: 12
hidden_act: gelu
intermediate_size: 3072
hidden_dropout_prob: 0.1
attention_probs_dropout_prob: 0.1
max_position_embeddings: 512
type_vocab_size: 2
initializer_range: 0.02
Load pre-trained parameters.
Traceback (most recent call last):
File "train.py", line 576, in
model, model_bert, tokenizer, bert_config = get_models(args, BERT_PT_PATH)
File "train.py", line 156, in get_models
args.no_pretraining)
File "train.py", line 126, in get_bert
model_bert.to(device)
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 386, in to
return self._apply(convert)
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 193, in _apply
module._apply(fn)
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 193, in _apply
module._apply(fn)
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 199, in _apply
param.data = fn(param.data)
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 384, in convert
return t.to(device, dtype if t.is_floating_point() else None, non_blocking)
RuntimeError: CUDA error: out of memory

Testing models with predict.py does not give me any results file

I can't test your project using predict.py. I ran the "python predict.py" command with the required parameters, and it is still stuck at the step below without giving me the results file (in my case results_dev.jsonl).
I tested on both Windows and Ubuntu (CPU only, without CUDA).

XXXX@DESKTOP-YYYY:/mnt/c/users/administrateur/desktop/sqlova-master$ python3 predict.py --bert_type_abb uL --model_file models/model_best.pt --bert_model_file models/model_bert_best.pt --bert_path data --result_path result --data_path data --split dev
BERT-type: uncased_L-24_H-1024_A-16
Batch_size = 32
BERT parameters:
learning rate: 1e-05
Fine-tune BERT: False
vocab size: 30522
hidden_size: 1024
num_hidden_layer: 24
num_attention_heads: 16
hidden_act: gelu
intermediate_size: 4096
hidden_dropout_prob: 0.1
attention_probs_dropout_prob: 0.1
max_position_embeddings: 512
type_vocab_size: 2
initializer_range: 0.02
Seq-to-SQL: the number of final BERT layers to be used: 2
Seq-to-SQL: the size of hidden dimension = 100
Seq-to-SQL: LSTM encoding layer size = 2
Seq-to-SQL: dropout rate = 0.3
Seq-to-SQL: learning rate = 0.001

Missing Positional Requirements

When I run python3 train.py --seed 1 --bS 16 --accumulate_gradients 2 --bert_type_abb uS --fine_tune --lr 0.001 --lr_bert 0.00001 --max_seq_leng 222,
I get the error:
Traceback (most recent call last):
  File "train.py", line 582, in <module>
    model, model_bert, tokenizer, bert_config = get_models(args, BERT_PT_PATH, trained=True, path_model_bert=path_model_bert, path_model=path_model)
  File "train.py", line 156, in get_models
    args.no_pretraining)
  File "train.py", line 120, in get_bert
    model_bert = BertModel(bert_config)
TypeError: __init__() missing 2 required positional arguments: 'is_training' and 'input_ids'
I am using Python 3.6.7

Missing quotes in Where clause

I modified the code to make predictions for a user query. But in the generated SQL query, 'text' values do not have quotes around them in the WHERE clause.

Basically, I get

'SELECT avg(sentiScore) FROM review_data WHERE product = pizza'

Instead of generating

'SELECT avg(sentiScore) FROM review_data WHERE product = "pizza" '

Is there something I'm missing? Or is this a limitation of the DL approach?
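
One possible post-processing fix is sketched below, under the assumption that the predicted (sel, agg, conds) structure and the WikiSQL column types are available; none of the names below come from the repo, and the WikiSQL aggregation/operator orderings are assumed. The idea is simply to quote (and escape) condition values for text-typed columns when assembling the SQL string:

    # Hedged sketch: build the SQL string from the predicted (sel, agg, conds)
    # structure, quoting values of text-typed columns in the WHERE clause.
    agg_ops = ['', 'MAX', 'MIN', 'COUNT', 'SUM', 'AVG']
    cond_ops = ['=', '>', '<', 'OP']

    def to_sql(table_name, headers, types, sel, agg, conds):
        col = headers[sel]
        select = f"{agg_ops[agg]}({col})" if agg_ops[agg] else col
        where = []
        for ci, oi, val in conds:
            if types[ci] == 'text':
                val = "'" + str(val).replace("'", "''") + "'"   # quote and escape text values
            where.append(f"{headers[ci]} {cond_ops[oi]} {val}")
        sql = f"SELECT {select} FROM {table_name}"
        if where:
            sql += " WHERE " + " AND ".join(where)
        return sql

    print(to_sql("review_data", ["product", "sentiScore"], ["text", "real"],
                 1, 5, [[0, 0, "pizza"]]))
    # -> SELECT AVG(sentiScore) FROM review_data WHERE product = 'pizza'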

RuntimeError: cannot join current thread

I started the stanford-corenlp-full-2018-10-05 server at localhost and ran annotate_ws.py.
The error log is:

C:\Python36\python.exe C:/Users/gt/Desktop/sqlova-master/annotate_ws.py
annotating ./data\train.jsonl
loading tables
 99%|█████████▉| 18370/18585 [00:00<00:00, 17084.73it/s]loading examples
100%|██████████| 18585/18585 [00:01<00:00, 18542.01it/s]
 15%|█▌        | 8460/56355 [01:54<17:39, 45.21it/s]Traceback (most recent call last):
  File "C:\Python36\lib\site-packages\urllib3\connection.py", line 141, in _new_conn
    (self.host, self.port), self.timeout, **extra_kw)
  File "C:\Python36\lib\site-packages\urllib3\util\connection.py", line 83, in create_connection
    raise err
  File "C:\Python36\lib\site-packages\urllib3\util\connection.py", line 73, in create_connection
    sock.connect(sa)
OSError: [WinError 10048] Only one usage of each socket address (protocol/network address/port) is normally permitted.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Python36\lib\site-packages\urllib3\connectionpool.py", line 601, in urlopen
    chunked=chunked)
  File "C:\Python36\lib\site-packages\urllib3\connectionpool.py", line 357, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "C:\Python36\lib\http\client.py", line 1239, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "C:\Python36\lib\http\client.py", line 1285, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "C:\Python36\lib\http\client.py", line 1234, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "C:\Python36\lib\http\client.py", line 1026, in _send_output
    self.send(msg)
  File "C:\Python36\lib\http\client.py", line 964, in send
    self.connect()
  File "C:\Python36\lib\site-packages\urllib3\connection.py", line 166, in connect
    conn = self._new_conn()
  File "C:\Python36\lib\site-packages\urllib3\connection.py", line 150, in _new_conn
    self, "Failed to establish a new connection: %s" % e)
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x0000014AC11179E8>: Failed to establish a new connection: [WinError 10048] Only one usage of each socket address (protocol/network address/port) is normally permitted.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Python36\lib\site-packages\requests\adapters.py", line 440, in send
    timeout=timeout
  File "C:\Python36\lib\site-packages\urllib3\connectionpool.py", line 639, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "C:\Python36\lib\site-packages\urllib3\util\retry.py", line 388, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=9000): Max retries exceeded with url: /?properties=%7B%27annotators%27%3A+%27ssplit%2Ctokenize%27%2C+%27outputFormat%27%3A+%27serialized%27%2C+%27serializer%27%3A+%27edu.stanford.nlp.pipeline.ProtobufAnnotationSerializer%27%7D (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x0000014AC11179E8>: Failed to establish a new connection: [WinError 10048] Only one usage of each socket address (protocol/network address/port) is normally permitted.',))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:/Users/gt/Desktop/sqlova-master/annotate_ws.py", line 188, in <module>
    a = annotate_example_ws(d, tables[d['table_id']])
  File "C:/Users/gt/Desktop/sqlova-master/annotate_ws.py", line 118, in annotate_example_ws
    _wv_ann1 = annotate(str(conds11[2]))
  File "C:/Users/gt/Desktop/sqlova-master/annotate_ws.py", line 23, in annotate
    for s in client.annotate(sentence):
  File "C:\Python36\lib\site-packages\stanza\nlp\corenlp.py", line 119, in annotate
    doc_pb = self.annotate_proto(text, annotators)
  File "C:\Python36\lib\site-packages\stanza\nlp\corenlp.py", line 99, in annotate_proto
    r = self._request(text, properties)
  File "C:\Python36\lib\site-packages\stanza\nlp\corenlp.py", line 58, in _request
    r = requests.post(self.server, params={'properties': str(properties)}, data=text.encode('utf-8'))
  File "C:\Python36\lib\site-packages\requests\api.py", line 112, in post
    return request('post', url, data=data, json=json, **kwargs)
  File "C:\Python36\lib\site-packages\requests\api.py", line 58, in request
    return session.request(method=method, url=url, **kwargs)
  File "C:\Python36\lib\site-packages\requests\sessions.py", line 508, in request
    resp = self.send(prep, **send_kwargs)
  File "C:\Python36\lib\site-packages\requests\sessions.py", line 618, in send
    r = adapter.send(request, **kwargs)
  File "C:\Python36\lib\site-packages\requests\adapters.py", line 508, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=9000): Max retries exceeded with url: /?properties=%7B%27annotators%27%3A+%27ssplit%2Ctokenize%27%2C+%27outputFormat%27%3A+%27serialized%27%2C+%27serializer%27%3A+%27edu.stanford.nlp.pipeline.ProtobufAnnotationSerializer%27%7D (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x0000014AC11179E8>: Failed to establish a new connection: [WinError 10048] Only one usage of each socket address (protocol/network address/port) is normally permitted.',))
Exception ignored in: <bound method tqdm.__del__ of  15%|█▌        | 8460/56355 [01:54<17:39, 45.21it/s]>
Traceback (most recent call last):
  File "C:\Python36\lib\site-packages\tqdm\_tqdm.py", line 931, in __del__
    self.close()
  File "C:\Python36\lib\site-packages\tqdm\_tqdm.py", line 1133, in close
    self._decr_instances(self)
  File "C:\Python36\lib\site-packages\tqdm\_tqdm.py", line 496, in _decr_instances
    cls.monitor.exit()
  File "C:\Python36\lib\site-packages\tqdm\_monitor.py", line 52, in exit
    self.join()
  File "C:\Python36\lib\threading.py", line 1053, in join
    raise RuntimeError("cannot join current thread")
RuntimeError: cannot join current thread

Process finished with exit code 1
