
sqlova's Introduction

SQLova

  • SQLova is a neural semantic parser that translates natural language utterances into SQL queries. The name originates from the name of our department: Search & QLova (Search & Clova).

Authors

Abstract

  • We present a new state-of-the-art semantic parsing model that translates a natural language (NL) utterance into a SQL query.
  • The model is evaluated on WikiSQL, a semantic parsing dataset consisting of 80,654 (NL, SQL) pairs over 24,241 tables from Wikipedia.
  • We achieve 83.6% logical form accuracy and 89.6% execution accuracy on the WikiSQL test set.

The model in a nutshell

Results (Updated at Jan 12, 2019)

Model     | Dev logical form accuracy | Dev execution accuracy | Test logical form accuracy | Test execution accuracy
SQLova    | 81.6 (+5.5)^              | 87.2 (+3.2)^           | 80.7 (+5.3)^               | 86.2 (+2.5)^
SQLova-EG | 84.2 (+8.2)*              | 90.2 (+3.0)*           | 83.6 (+8.2)*               | 89.6 (+2.5)*
  • ^: Compared to current SOTA models that do not use execution-guided decoding.
  • *: Compared to the current SOTA.
  • The order of where conditions is ignored when measuring logical form accuracy in our model.

Source code

Requirements

  • Python 3.6 or higher.
  • PyTorch 0.4.0 or higher.
  • CUDA 9.0
  • Python libraries: babel, matplotlib, defusedxml, tqdm (a quick import check is sketched after this list).
  • Example
    • Install Miniconda
    • conda install pytorch torchvision -c pytorch
    • conda install -c conda-forge records==0.5.2
    • conda install babel
    • conda install matplotlib
    • conda install defusedxml
    • conda install tqdm
  • The code has been tested on a Tesla M40 GPU running Ubuntu 16.04.4 LTS.
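
A minimal sanity check of the environment described above (illustrative only; it just verifies that the listed packages import and reports the PyTorch/CUDA status):

    # Quick environment check for the requirements listed above (illustrative only).
    import torch
    import babel, matplotlib, defusedxml, tqdm, records  # noqa: F401 -- just verify they import

    print('PyTorch version:', torch.__version__)        # expect 0.4.0 or higher
    print('CUDA available :', torch.cuda.is_available())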

Running code

  • Type python3 train.py --seed 1 --bS 16 --accumulate_gradients 2 --bert_type_abb uS --fine_tune --lr 0.001 --lr_bert 0.00001 --max_seq_leng 222 in a terminal.
    • --seed 1: Set the seed of the random generator. The accuracy changes by a few percent depending on the seed.
    • --bS 16: Set the batch size to 16.
    • --accumulate_gradients 2: Make the effective batch size 16 * 2 = 32 (a minimal gradient-accumulation sketch follows this list).
    • --bert_type_abb uS: Use the Uncased-Base BERT model. Use uL to use Uncased-Large BERT.
    • --fine_tune: Train BERT. Without this, only the sequence-to-SQL module is trained.
    • --lr 0.001: Set the learning rate of the sequence-to-SQL module to 0.001.
    • --lr_bert 0.00001: Set the learning rate of the BERT module to 0.00001.
    • --max_seq_leng 222: Set the maximum input token length for BERT.
  • The model should show ~79% logical form accuracy (lx) on the dev set after ~12 hrs (~10 epochs). Higher accuracy can be obtained with longer training, a different seed, the Uncased-Large BERT model, or execution-guided decoding.
  • Add the --EG argument when running train.py to use execution-guided decoding.
  • Whenever a higher logical form accuracy is achieved on the dev set, the following three files are saved in the current folder:
    • model_best.pt: the checkpoint of the sequence-to-SQL module.
    • model_bert_best.pt: the checkpoint of the BERT module.
    • results_dev.jsonl: JSON file for the official evaluation.
  • Shallow-Layer and Decoder-Layer models can be trained similarly (train_shallow_layer.py, train_decoder_layer.py).
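
As referenced in the --accumulate_gradients bullet above, here is a minimal sketch of how gradient accumulation yields the larger effective batch size. It uses a toy model and toy data and is illustrative only; it is not the training loop in train.py.

    import torch
    import torch.nn as nn

    # Illustrative sketch of gradient accumulation (not the code in train.py).
    # With a per-step batch of 16 and accumulate_gradients = 2, the optimizer is
    # stepped once every 2 mini-batches, i.e. with an effective batch size of 32.
    bS, accumulate_gradients = 16, 2

    model = nn.Linear(8, 1)                                   # toy stand-in for the real model
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    batches = [(torch.randn(bS, 8), torch.randn(bS, 1)) for _ in range(4)]  # toy data

    optimizer.zero_grad()
    for iB, (x, y) in enumerate(batches):
        loss = nn.functional.mse_loss(model(x), y)
        (loss / accumulate_gradients).backward()              # average gradients over the effective batch
        if (iB + 1) % accumulate_gradients == 0:
            optimizer.step()
            optimizer.zero_grad()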

Evaluation on WikiSQL DEV set

  • To calculate logical form and execution accuracies on the dev set using the official evaluation script:
    • Download the original WikiSQL dataset.
    • tar xvf data.tar.bz2
    • Move the extracted files under $HOME/data/WikiSQL-1.1/data.
    • Set the paths in evaluation_ws.py. This file is the original evaluation.py script with the path information added; alternatively, use the original evaluation.py and set the paths to the files yourself.
    • Type python3 evaluation_ws.py in a terminal.

Evaluation on WikiSQL TEST set

  • Uncomment lines 550-557 of train.py to load test_loader and test_table.
  • In the test(...) function, use test_loader and test_table instead of dev_loader and dev_table.
  • Save the output of test(...) with the save_for_evaluation(...) function.
  • Evaluate with evaluation_ws.py as before.

Load pre-trained SQLova parameters.

  • Pre-trained SQLova model parameters are uploaded in the release. To start from them, uncomment lines 562-565 and set the paths. A checkpoint-inspection sketch follows.
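
A minimal sketch for inspecting the released checkpoint files before wiring them into train.py. The exact wrapping of each state dict is not asserted here, so print the top-level keys and pass the appropriate object to load_state_dict():

    import torch

    # Hedged sketch: inspect the released checkpoints (file names as saved during training).
    # Whether each file is a bare state_dict or wraps it under a key is checked by printing.
    for fname in ('model_best.pt', 'model_bert_best.pt'):
        ckpt = torch.load(fname, map_location='cpu')
        top = list(ckpt.keys()) if isinstance(ckpt, dict) else [type(ckpt).__name__]
        print(fname, '->', top[:5])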

Code base

  • Pre-trained BERT models were downloaded from the official repository.
  • BERT code is from huggingface-pytorch-pretrained-BERT.
  • The sequence-to-SQL model started from the source code of SQLNet and was significantly rewritten while maintaining SQLNet's basic column-attention and sequence-to-set structure.

Data

  • The data is annotated using annotate_ws.py, which is based on annotate.py from the WikiSQL repository. It annotates the tokens of the natural-language query and the start and end indices of where-condition values over those tokens.
  • Pre-trained BERT parameters can be downloaded from the official BERT repository and converted to a PyTorch checkpoint file using the following script. You need to install both PyTorch and TensorFlow and change BERT_BASE_DIR to your data directory.
    cd sqlova
    export BERT_BASE_DIR=data/uncased_L-12_H-768_A-12
    python bert/convert_tf_checkpoint_to_pytorch.py \
        --tf_checkpoint_path $BERT_BASE_DIR/bert_model.ckpt \
        --bert_config_file    $BERT_BASE_DIR/bert_config.json \
        --pytorch_dump_path     $BERT_BASE_DIR/pytorch_model.bin 
  • bert/convert_tf_checkpoint_to_pytorch.py is from a previous version of huggingface-pytorch-pretrained-BERT; the current version of pytorch-pretrained-BERT is not compatible with the BERT model used in this repo due to a difference in variable names (in LayerNorm). See this for details.
  • For convenience, the annotated WikiSQL data and the PyTorch-converted pre-trained BERT parameters are available here.

License

Copyright 2019-present NAVER Corp.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

sqlova's People

Contributors

0xflotus, guotong1988, hanrelan, paulfitz, wenfengand, whwang299


sqlova's Issues

File Not found issue

Hi
I found your paper and GitHub code and am very interested in this topic.
I've tried to test your current results, but I am facing errors related to missing files.
From here, I tried to run python3 train.py --seed 1 --bS 16 --accumulate_gradients 2 --bert_type_abb uS --fine_tune --lr 0.001 --lr_bert 0.00001 --max_seq_leng 222.
But I got the error like:
python train.py --seed 1 --bS 16 --accumulate_gradients 2 --bert_type_abb uS --fine_tune --lr 0.001 --lr_bert 0.00001 --max_seq_leng 222
BERT-type: uncased_L-12_H-768_A-12
Traceback (most recent call last):
  File "train.py", line 552, in <module>
    train_data, train_table, dev_data, dev_table, train_loader, dev_loader = get_data(path_wikisql, args)
  File "train.py", line 183, in get_data
    train_data, train_table, dev_data, dev_table, _, _ = load_wikisql(path_wikisql, args.toy_model, args.toy_size, no_w2i=True, no_hs_tok=True)
  File "/home/ubuntu/work/torch-sqlova/sqlova/sqlova/utils/utils_wikisql.py", line 29, in load_wikisql
    train_data, train_table = load_wikisql_data(path_wikisql, mode='train', toy_model=toy_model, toy_size=toy_size, no_hs_tok=no_hs_tok, aug=aug)
  File "/home/ubuntu/work/torch-sqlova/sqlova/sqlova/utils/utils_wikisql.py", line 58, in load_wikisql_data
    with open(path_sql) as f:
FileNotFoundError: [Errno 2] No such file or directory: '/home/wonseok/data/wikisql_tok/train_tok.jsonl'

I think you have additional data that is not included in this repository.
What is this additional data?
Thank you.

Difference between the PyTorch-converted pre-trained BERT parameters released on Google Drive and the one obtained using HuggingFace conversion script

I tried to get the Pytorch pre-trained BERT checkpoint using the conversion script provided by HuggingFace. The script executed without any problem and I was able to obtain a binary converted file.

However, I noticed a few differences between this file compared with the PyTorch-converted pre-trained BERT parameters released on Google Drive.

First, the two files have different variable naming. The HuggingFace-converted file has the prefix bert. for each variable and cannot be used by SQLova directly.

RuntimeError: Error(s) in loading state_dict for BertModel:

Missing key(s) in state_dict: "embeddings.word_embeddings.weight", "embeddings.position_embeddings.weight", "embeddings.token_type_embeddings.weight", "embeddings.LayerNorm.gamma", "embeddings.LayerNorm.beta", "encoder.layer.0.attention.self.query.weight", "encoder.layer.0.attention.self.query.bias", "encoder.layer.0.attention.self.key.weight", "encoder.layer.0.attention.self.key.bias", "encoder.layer.0.attention.self.value.weight", "encoder.layer.0.attention.self.value.bias", "encoder.layer.0.attention.output.dense.weight", "encoder.layer.0.attention.output.dense.bias", "encoder.layer.0.attention.output.LayerNorm.gamma", "encoder.layer.0.attention.output.LayerNorm.beta", "encoder.layer.0.intermediate.dense.weight", "encoder.layer.0.intermediate.dense.bias", "encoder.layer.0.output.dense.weight", "encoder.layer.0.output.dense.bias", "encoder.layer.0.output.LayerNorm.gamma", "encoder.layer.0.output.LayerNorm.beta", "encoder.layer.1.attention.self.query.weight"...

Unexpected key(s) in state_dict: "bert.embeddings.word_embeddings.weight", "bert.embeddings.position_embeddings.weight", "bert.embeddings.token_type_embeddings.weight", "bert.embeddings.LayerNorm.weight", "bert.embeddings.LayerNorm.bias", "bert.encoder.layer.0.attention.self.query.weight", "bert.encoder.layer.0.attention.self.query.bias", "bert.encoder.layer.0.attention.self.key.weight", "bert.encoder.layer.0.attention.self.key.bias", "bert.encoder.layer.0.attention.self.value.weight", "bert.encoder.layer.0.attention.self.value.bias", "bert.encoder.layer.0.attention.output.dense.weight", "bert.encoder.layer.0.attention.output.dense.bias", "bert.encoder.layer.0.attention.output.LayerNorm.weight", "bert.encoder.layer.0.attention.output.LayerNorm.bias", "bert.encoder.layer.0.intermediate.dense.weight", "bert.encoder.layer.0.intermediate.dense.bias", "bert.encoder.layer.0.output.dense.weight", "bert.encoder.layer.0.output.dense.bias", "bert.encoder.layer.0.output.LayerNorm.weight", "bert.encoder.layer.0.output.LayerNorm.bias"...

I was able to map most variables in these two files by manipulating the naming and verify their equivalence, but I cannot find a mapping of the following tensors in the HuggingFace conversion to the Google Drive release, most of them related to layer normalization.

bert.embeddings.LayerNorm.weight
bert.embeddings.LayerNorm.bias
bert.encoder.layer.0.attention.output.LayerNorm.weight
bert.encoder.layer.0.attention.output.LayerNorm.bias
bert.encoder.layer.0.output.LayerNorm.weight
bert.encoder.layer.0.output.LayerNorm.bias
bert.encoder.layer.1.attention.output.LayerNorm.weight
bert.encoder.layer.1.attention.output.LayerNorm.bias
bert.encoder.layer.1.output.LayerNorm.weight
bert.encoder.layer.1.output.LayerNorm.bias
bert.encoder.layer.2.attention.output.LayerNorm.weight
bert.encoder.layer.2.attention.output.LayerNorm.bias
bert.encoder.layer.2.output.LayerNorm.weight
bert.encoder.layer.2.output.LayerNorm.bias
bert.encoder.layer.3.attention.output.LayerNorm.weight
bert.encoder.layer.3.attention.output.LayerNorm.bias
bert.encoder.layer.3.output.LayerNorm.weight
bert.encoder.layer.3.output.LayerNorm.bias
bert.encoder.layer.4.attention.output.LayerNorm.weight
bert.encoder.layer.4.attention.output.LayerNorm.bias
bert.encoder.layer.4.output.LayerNorm.weight
bert.encoder.layer.4.output.LayerNorm.bias
bert.encoder.layer.5.attention.output.LayerNorm.weight
bert.encoder.layer.5.attention.output.LayerNorm.bias
bert.encoder.layer.5.output.LayerNorm.weight
bert.encoder.layer.5.output.LayerNorm.bias
bert.encoder.layer.6.attention.output.LayerNorm.weight
bert.encoder.layer.6.attention.output.LayerNorm.bias
bert.encoder.layer.6.output.LayerNorm.weight
bert.encoder.layer.6.output.LayerNorm.bias
bert.encoder.layer.7.attention.output.LayerNorm.weight
bert.encoder.layer.7.attention.output.LayerNorm.bias
bert.encoder.layer.7.output.LayerNorm.weight
bert.encoder.layer.7.output.LayerNorm.bias
bert.encoder.layer.8.attention.output.LayerNorm.weight
bert.encoder.layer.8.attention.output.LayerNorm.bias
bert.encoder.layer.8.output.LayerNorm.weight
bert.encoder.layer.8.output.LayerNorm.bias
bert.encoder.layer.9.attention.output.LayerNorm.weight
bert.encoder.layer.9.attention.output.LayerNorm.bias
bert.encoder.layer.9.output.LayerNorm.weight
bert.encoder.layer.9.output.LayerNorm.bias
bert.encoder.layer.10.attention.output.LayerNorm.weight
bert.encoder.layer.10.attention.output.LayerNorm.bias
bert.encoder.layer.10.output.LayerNorm.weight
bert.encoder.layer.10.output.LayerNorm.bias
bert.encoder.layer.11.attention.output.LayerNorm.weight
bert.encoder.layer.11.attention.output.LayerNorm.bias
bert.encoder.layer.11.output.LayerNorm.weight
bert.encoder.layer.11.output.LayerNorm.bias
cls.predictions.bias
cls.predictions.transform.dense.weight
cls.predictions.transform.dense.bias
cls.predictions.transform.LayerNorm.weight
cls.predictions.transform.LayerNorm.bias
cls.predictions.decoder.weight
cls.seq_relationship.weight
cls.seq_relationship.bias

May I ask what causes the above differences? Is layer normalization removed from the BERT architecture on purpose? Thanks.
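
For reference, if only the naming conventions differ (the bert. prefix, LayerNorm weight/bias vs. gamma/beta, plus the unused cls.* heads), a remapping sketch such as the one below can bring a HuggingFace-converted checkpoint into the naming expected by the BERT code bundled with this repo. This is an assumption based on the key lists above, not code from the repository:

    import torch
    from collections import OrderedDict

    # Hedged sketch (not repo code): rename keys of a HuggingFace-converted BERT
    # checkpoint to the older pytorch-pretrained-BERT naming used by this repo.
    src = torch.load('pytorch_model.bin', map_location='cpu')   # HuggingFace-converted file

    dst = OrderedDict()
    for k, v in src.items():
        if k.startswith('cls.'):                    # pre-training heads, unused by SQLova
            continue
        if k.startswith('bert.'):
            k = k[len('bert.'):]
        k = k.replace('LayerNorm.weight', 'LayerNorm.gamma')
        k = k.replace('LayerNorm.bias', 'LayerNorm.beta')
        dst[k] = v

    torch.save(dst, 'pytorch_model_renamed.bin')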

HOW TO SOLVE THIS PROBLEM in train

When running "train.py --seed 1 --bS 16 --accumulate_gradients 2 --bert_type_abb uS --fine_tune --lr 0.001 --lr_bert 0.00001 --max_seq_leng 222",

how can I solve this problem?
This is the error:
/home/quh/.conda/envs/pytorch/lib/python3.7/site-packages/torch/nn/functional.py:1386: UserWarning: nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.
warnings.warn("nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.")
Error closing cursor
Traceback (most recent call last):
File "/home/quh/.conda/envs/pytorch/lib/python3.7/site-packages/sqlalchemy/engine/result.py", line 1268, in fetchone
row = self._fetchone_impl()
File "/home/quh/.conda/envs/pytorch/lib/python3.7/site-packages/sqlalchemy/engine/result.py", line 1148, in _fetchone_impl
return self.cursor.fetchone()
sqlite3.ProgrammingError: Cannot operate on a closed database.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/quh/.conda/envs/pytorch/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1333, in _safe_close_cursor
cursor.close()
sqlite3.ProgrammingError: Cannot operate on a closed database.
Traceback (most recent call last):
File "/home/quh/.conda/envs/pytorch/lib/python3.7/site-packages/sqlalchemy/engine/result.py", line 1268, in fetchone
row = self._fetchone_impl()
File "/home/quh/.conda/envs/pytorch/lib/python3.7/site-packages/sqlalchemy/engine/result.py", line 1148, in _fetchone_impl
return self.cursor.fetchone()
sqlite3.ProgrammingError: Cannot operate on a closed database.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "train.py", line 605, in
dset_name='train')
File "train.py", line 310, in train
cnt_x1_list, g_ans, pr_ans = get_cnt_x_list(engine, tb, g_sc, g_sa, sql_i, pr_sc, pr_sa, pr_sql_i)
File "/home/quh/pythonwork/nl2sql/nl2sql_baseline/sqlova/sqlova/utils/utils_wikisql.py", line 1652, in get_cnt_x_list
g_ans1 = engine.execute(tb[b]['id'], g_sc[b], g_sa[b], g_sql_i[b]['conds'])
File "/home/quh/pythonwork/nl2sql/nl2sql_baseline/sqlova/sqlnet/dbengine.py", line 29, in execute
table_info = self.db.query('SELECT sql from sqlite_master WHERE tbl_name = :name', name=table_id).all()[0].sql.replace('\n','')
File "/home/quh/.conda/envs/pytorch/lib/python3.7/site-packages/records.py", line 195, in all
rows = list(self)
File "/home/quh/.conda/envs/pytorch/lib/python3.7/site-packages/records.py", line 126, in iter
yield next(self)
File "/home/quh/.conda/envs/pytorch/lib/python3.7/site-packages/records.py", line 136, in next
nextrow = next(self._rows)
File "/home/quh/.conda/envs/pytorch/lib/python3.7/site-packages/records.py", line 365, in
row_gen = (Record(cursor.keys(), row) for row in cursor)
File "/home/quh/.conda/envs/pytorch/lib/python3.7/site-packages/sqlalchemy/engine/result.py", line 946, in iter
row = self.fetchone()
File "/home/quh/.conda/envs/pytorch/lib/python3.7/site-packages/sqlalchemy/engine/result.py", line 1276, in fetchone
e, None, None, self.cursor, self.context
File "/home/quh/.conda/envs/pytorch/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1466, in _handle_dbapi_exception
util.raise_from_cause(sqlalchemy_exception, exc_info)
File "/home/quh/.conda/envs/pytorch/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 399, in raise_from_cause
reraise(type(exception), exception, tb=exc_tb, cause=cause)
File "/home/quh/.conda/envs/pytorch/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 153, in reraise
raise value.with_traceback(tb)
File "/home/quh/.conda/envs/pytorch/lib/python3.7/site-packages/sqlalchemy/engine/result.py", line 1268, in fetchone
row = self._fetchone_impl()
File "/home/quh/.conda/envs/pytorch/lib/python3.7/site-packages/sqlalchemy/engine/result.py", line 1148, in _fetchone_impl
return self.cursor.fetchone()
sqlalchemy.exc.ProgrammingError: (sqlite3.ProgrammingError) Cannot operate on a closed database.
(Background on this error at: http://sqlalche.me/e/f405)

Train lf accuracy higher than execution accuracy

Hi,

When running the script, it is surprising to see that after several iterations (<3), the training logical form accuracy becomes higher than the training execution accuracy, which confuses me a lot.

May I ask what the reason behind it is? I have been confused for several days and could not figure it out. Thank you for any help in advance!

Loading Pretrained SQLova Parameters

I'm having an error when trying to load the pretrained parameters. It seems like the file may be missing some keys?

Any help you could provide will be much appreciated.

self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for FT_Scalar_1:
Unexpected key(s) in state_dict: "wcp.enc_h.weight_ih_l0", "wcp.enc_h.weight_hh_l0", "wcp.enc_h.bias_ih_l0", "wcp.enc_h.bias_hh_l0", "wcp.enc_h.weight_ih_l0_reverse", "wcp.enc_h.weight_hh_l0_reverse", "wcp.enc_h.bias_ih_l0_reverse", "wcp.enc_h.bias_hh_l0_reverse", "wcp.enc_h.weight_ih_l1", "wcp.enc_h.weight_hh_l1", "wcp.enc_h.bias_ih_l1", "wcp.enc_h.bias_hh_l1", "wcp.enc_h.weight_ih_l1_reverse", "wcp.enc_h.weight_hh_l1_reverse", "wcp.enc_h.bias_ih_l1_reverse", "wcp.enc_h.bias_hh_l1_reverse", "wcp.enc_n.weight_ih_l0", "wcp.enc_n.weight_hh_l0", "wcp.enc_n.bias_ih_l0", "wcp.enc_n.bias_hh_l0", "wcp.enc_n.weight_ih_l0_reverse", "wcp.enc_n.weight_hh_l0_reverse", "wcp.enc_n.bias_ih_l0_reverse", "wcp.enc_n.bias_hh_l0_reverse", "wcp.enc_n.weight_ih_l1", "wcp.enc_n.weight_hh_l1", "wcp.enc_n.bias_ih_l1", "wcp.enc_n.bias_hh_l1", "wcp.enc_n.weight_ih_l1_reverse", "wcp.enc_n.weight_hh_l1_reverse", "wcp.enc_n.bias_ih_l1_reverse", "wcp.enc_n.bias_hh_l1_reverse", "wcp.W_att.weight", "wcp.W_att.bias", "wcp.W_c.weight", "wcp.W_c.bias", "wcp.W_hs.weight", "wcp.W_hs.bias", "wcp.W_out.1.weight", "wcp.W_out.1.bias", "scp.enc_h.weight_ih_l0", "scp.enc_h.weight_hh_l0", "scp.enc_h.bias_ih_l0", "scp.enc_h.bias_hh_l0", "scp.enc_h.weight_ih_l0_reverse", "scp.enc_h.weight_hh_l0_reverse", "scp.enc_h.bias_ih_l0_reverse", "scp.enc_h.bias_hh_l0_reverse", "scp.enc_h.weight_ih_l1", "scp.enc_h.weight_hh_l1", "scp.enc_h.bias_ih_l1", "scp.enc_h.bias_hh_l1", "scp.enc_h.weight_ih_l1_reverse", "scp.enc_h.weight_hh_l1_reverse", "scp.enc_h.bias_ih_l1_reverse", "scp.enc_h.bias_hh_l1_reverse", "scp.enc_n.weight_ih_l0", "scp.enc_n.weight_hh_l0", "scp.enc_n.bias_ih_l0", "scp.enc_n.bias_hh_l0", "scp.enc_n.weight_ih_l0_reverse", "scp.enc_n.weight_hh_l0_reverse", "scp.enc_n.bias_ih_l0_reverse", "scp.enc_n.bias_hh_l0_reverse", "scp.enc_n.weight_ih_l1", "scp.enc_n.weight_hh_l1", "scp.enc_n.bias_ih_l1", "scp.enc_n.bias_hh_l1", "scp.enc_n.weight_ih_l1_reverse", "scp.enc_n.weight_hh_l1_reverse", "scp.enc_n.bias_ih_l1_reverse", "scp.enc_n.bias_hh_l1_reverse", "scp.W_att.weight", "scp.W_att.bias", "scp.W_c.weight", "scp.W_c.bias", "scp.W_hs.weight", "scp.W_hs.bias", "scp.sc_out.1.weight", "scp.sc_out.1.bias", "sap.enc_h.weight_ih_l0", "sap.enc_h.weight_hh_l0", "sap.enc_h.bias_ih_l0", "sap.enc_h.bias_hh_l0", "sap.enc_h.weight_ih_l0_reverse", "sap.enc_h.weight_hh_l0_reverse", "sap.enc_h.bias_ih_l0_reverse", "sap.enc_h.bias_hh_l0_reverse", "sap.enc_h.weight_ih_l1", "sap.enc_h.weight_hh_l1", "sap.enc_h.bias_ih_l1", "sap.enc_h.bias_hh_l1", "sap.enc_h.weight_ih_l1_reverse", "sap.enc_h.weight_hh_l1_reverse", "sap.enc_h.bias_ih_l1_reverse", "sap.enc_h.bias_hh_l1_reverse", "sap.enc_n.weight_ih_l0", "sap.enc_n.weight_hh_l0", "sap.enc_n.bias_ih_l0", "sap.enc_n.bias_hh_l0", "sap.enc_n.weight_ih_l0_reverse", "sap.enc_n.weight_hh_l0_reverse", "sap.enc_n.bias_ih_l0_reverse", "sap.enc_n.bias_hh_l0_reverse", "sap.enc_n.weight_ih_l1", "sap.enc_n.weight_hh_l1", "sap.enc_n.bias_ih_l1", "sap.enc_n.bias_hh_l1", "sap.enc_n.weight_ih_l1_reverse", "sap.enc_n.weight_hh_l1_reverse", "sap.enc_n.bias_ih_l1_reverse", "sap.enc_n.bias_hh_l1_reverse", "sap.W_att.weight", "sap.W_att.bias", "sap.sa_out.0.weight", "sap.sa_out.0.bias", "sap.sa_out.2.weight", "sap.sa_out.2.bias", "wnp.enc_h.weight_ih_l0", "wnp.enc_h.weight_hh_l0", "wnp.enc_h.bias_ih_l0", "wnp.enc_h.bias_hh_l0", "wnp.enc_h.weight_ih_l0_reverse", "wnp.enc_h.weight_hh_l0_reverse", "wnp.enc_h.bias_ih_l0_reverse", "wnp.enc_h.bias_hh_l0_reverse", "wnp.enc_h.weight_ih_l1", "wnp.enc_h.weight_hh_l1", "wnp.enc_h.bias_ih_l1", 
"wnp.enc_h.bias_hh_l1", "wnp.enc_h.weight_ih_l1_reverse", "wnp.enc_h.weight_hh_l1_reverse", "wnp.enc_h.bias_ih_l1_reverse", "wnp.enc_h.bias_hh_l1_reverse", "wnp.enc_n.weight_ih_l0", "wnp.enc_n.weight_hh_l0", "wnp.enc_n.bias_ih_l0", "wnp.enc_n.bias_hh_l0", "wnp.enc_n.weight_ih_l0_reverse", "wnp.enc_n.weight_hh_l0_reverse", "wnp.enc_n.bias_ih_l0_reverse", "wnp.enc_n.bias_hh_l0_reverse", "wnp.enc_n.weight_ih_l1", "wnp.enc_n.weight_hh_l1", "wnp.enc_n.bias_ih_l1", "wnp.enc_n.bias_hh_l1", "wnp.enc_n.weight_ih_l1_reverse", "wnp.enc_n.weight_hh_l1_reverse", "wnp.enc_n.bias_ih_l1_reverse", "wnp.enc_n.bias_hh_l1_reverse", "wnp.W_att_h.weight", "wnp.W_att_h.bias", "wnp.W_hidden.weight", "wnp.W_hidden.bias", "wnp.W_cell.weight", "wnp.W_cell.bias", "wnp.W_att_n.weight", "wnp.W_att_n.bias", "wnp.wn_out.0.weight", "wnp.wn_out.0.bias", "wnp.wn_out.2.weight", "wnp.wn_out.2.bias", "wop.enc_h.weight_ih_l0", "wop.enc_h.weight_hh_l0", "wop.enc_h.bias_ih_l0", "wop.enc_h.bias_hh_l0", "wop.enc_h.weight_ih_l0_reverse", "wop.enc_h.weight_hh_l0_reverse", "wop.enc_h.bias_ih_l0_reverse", "wop.enc_h.bias_hh_l0_reverse", "wop.enc_h.weight_ih_l1", "wop.enc_h.weight_hh_l1", "wop.enc_h.bias_ih_l1", "wop.enc_h.bias_hh_l1", "wop.enc_h.weight_ih_l1_reverse", "wop.enc_h.weight_hh_l1_reverse", "wop.enc_h.bias_ih_l1_reverse", "wop.enc_h.bias_hh_l1_reverse", "wop.enc_n.weight_ih_l0", "wop.enc_n.weight_hh_l0", "wop.enc_n.bias_ih_l0", "wop.enc_n.bias_hh_l0", "wop.enc_n.weight_ih_l0_reverse", "wop.enc_n.weight_hh_l0_reverse", "wop.enc_n.bias_ih_l0_reverse", "wop.enc_n.bias_hh_l0_reverse", "wop.enc_n.weight_ih_l1", "wop.enc_n.weight_hh_l1", "wop.enc_n.bias_ih_l1", "wop.enc_n.bias_hh_l1", "wop.enc_n.weight_ih_l1_reverse", "wop.enc_n.weight_hh_l1_reverse", "wop.enc_n.bias_ih_l1_reverse", "wop.enc_n.bias_hh_l1_reverse", "wop.W_att.weight", "wop.W_att.bias", "wop.W_c.weight", "wop.W_c.bias", "wop.W_hs.weight", "wop.W_hs.bias", "wop.wo_out.0.weight", "wop.wo_out.0.bias", "wop.wo_out.2.weight", "wop.wo_out.2.bias", "wvp.enc_h.weight_ih_l0", "wvp.enc_h.weight_hh_l0", "wvp.enc_h.bias_ih_l0", "wvp.enc_h.bias_hh_l0", "wvp.enc_h.weight_ih_l0_reverse", "wvp.enc_h.weight_hh_l0_reverse", "wvp.enc_h.bias_ih_l0_reverse", "wvp.enc_h.bias_hh_l0_reverse", "wvp.enc_h.weight_ih_l1", "wvp.enc_h.weight_hh_l1", "wvp.enc_h.bias_ih_l1", "wvp.enc_h.bias_hh_l1", "wvp.enc_h.weight_ih_l1_reverse", "wvp.enc_h.weight_hh_l1_reverse", "wvp.enc_h.bias_ih_l1_reverse", "wvp.enc_h.bias_hh_l1_reverse", "wvp.enc_n.weight_ih_l0", "wvp.enc_n.weight_hh_l0", "wvp.enc_n.bias_ih_l0", "wvp.enc_n.bias_hh_l0", "wvp.enc_n.weight_ih_l0_reverse", "wvp.enc_n.weight_hh_l0_reverse", "wvp.enc_n.bias_ih_l0_reverse", "wvp.enc_n.bias_hh_l0_reverse", "wvp.enc_n.weight_ih_l1", "wvp.enc_n.weight_hh_l1", "wvp.enc_n.bias_ih_l1", "wvp.enc_n.bias_hh_l1", "wvp.enc_n.weight_ih_l1_reverse", "wvp.enc_n.weight_hh_l1_reverse", "wvp.enc_n.bias_ih_l1_reverse", "wvp.enc_n.bias_hh_l1_reverse", "wvp.W_att.weight", "wvp.W_att.bias", "wvp.W_c.weight", "wvp.W_c.bias", "wvp.W_hs.weight", "wvp.W_hs.bias", "wvp.W_op.weight", "wvp.W_op.bias", "wvp.wv_out.0.weight", "wvp.wv_out.0.bias", "wvp.wv_out.2.weight", "wvp.wv_out.2.bias".

Error when testing

I ran the command in Readme.md, and after a long time of running, an error was thrown:

Traceback (most recent call last):
  File "train.py", line 614, in <module>
    dset_name='dev', EG=args.EG)
  File "train.py", line 495, in test
    cnt_x1_list, g_ans, pr_ans = get_cnt_x_list(engine, tb, g_sc, g_sa, sql_i, pr_sc, pr_sa, pr_sql_i)
  File "/home/sqlova-master/sqlova/utils/utils_wikisql.py", line 1651, in get_cnt_x_list
    g_ans1 = engine.execute(tb[b]['id'], g_sc[b], g_sa[b], g_sql_i[b]['conds'])
  File "/home/sqlova-master/sqlnet/dbengine.py", line 29, in execute
    table_info = self.db.query('SELECT sql from sqlite_master WHERE tbl_name = :name', name=table_id).all()[0].sql.replace('\n','')
IndexError: list index out of range

Is it because no result is returned? Do you have any idea how to solve this issue? Thank you in advance!

Where can I get wikisql_tok?

It throws an error

(py36) one@test:~/sqlova$ python3 train.py --seed 1 --bS 16 --accumulate_gradients 2 --bert_type_abb uS --fine_tune --lr 0.001 --lr_bert 0.00001 --max_seq_leng 222
BERT-type: uncased_L-12_H-768_A-12
Traceback (most recent call last):
  File "train.py", line 552, in <module>
    train_data, train_table, dev_data, dev_table, train_loader, dev_loader = get_data(path_wikisql, args)
  File "train.py", line 183, in get_data
    train_data, train_table, dev_data, dev_table, _, _ = load_wikisql(path_wikisql, args.toy_model, args.toy_size, no_w2i=True, no_hs_tok=True)
  File "/home/one/sqlova/sqlova/utils/utils_wikisql.py", line 29, in load_wikisql
    train_data, train_table = load_wikisql_data(path_wikisql, mode='train', toy_model=toy_model, toy_size=toy_size, no_hs_tok=no_hs_tok, aug=aug)
  File "/home/one/sqlova/sqlova/utils/utils_wikisql.py", line 58, in load_wikisql_data
    with open(path_sql) as f:
FileNotFoundError: [Errno 2] No such file or directory: '/home/one/data/wikisql_tok/train_tok.jsonl'

RuntimeError: CUDA out of memory.

BERT-type: uncased_L-12_H-768_A-12
Batch_size = 32
BERT parameters:
learning rate: 1e-05
Fine-tune BERT: True
vocab size: 30522
hidden_size: 768
num_hidden_layer: 12
num_attention_heads: 12
hidden_act: gelu
intermediate_size: 3072
hidden_dropout_prob: 0.1
attention_probs_dropout_prob: 0.1
max_position_embeddings: 512
type_vocab_size: 2
initializer_range: 0.02
Load pre-trained parameters.
Seq-to-SQL: the number of final BERT layers to be used: 2
Seq-to-SQL: the size of hidden dimension = 100
Seq-to-SQL: LSTM encoding layer size = 2
Seq-to-SQL: dropout rate = 0.3
Seq-to-SQL: learning rate = 0.001

Traceback (most recent call last):
  File "train.py", line 603, in <module>
    dset_name='train')
  File "train.py", line 239, in train
    num_out_layers_n=num_target_layers, num_out_layers_h=num_target_layers)
  File "/data4/tong.guo/sqlova-master/sqlova/utils/utils_wikisql.py", line 817, in get_wemb_bert
    nlu_tt, t_to_tt_idx, tt_to_t_idx = get_bert_output(model_bert, tokenizer, nlu_t, headers, max_seq_length)
  File "/data4/tong.guo/sqlova-master/sqlova/utils/utils_wikisql.py", line 751, in get_bert_output
    all_encoder_layer, pooled_output = model_bert(all_input_ids, all_segment_ids, all_input_mask)
  File "/data4/tong.guo/Py36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/data4/tong.guo/sqlova-master/bert/modeling.py", line 396, in forward
    all_encoder_layers = self.encoder(embedding_output, extended_attention_mask)
  File "/data4/tong.guo/Py36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/data4/tong.guo/sqlova-master/bert/modeling.py", line 326, in forward
    hidden_states = layer_module(hidden_states, attention_mask)
  File "/data4/tong.guo/Py36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/data4/tong.guo/sqlova-master/bert/modeling.py", line 311, in forward
    attention_output = self.attention(hidden_states, attention_mask)
  File "/data4/tong.guo/Py36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/data4/tong.guo/sqlova-master/bert/modeling.py", line 272, in forward
    self_output = self.self(input_tensor, attention_mask)
  File "/data4/tong.guo/Py36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/data4/tong.guo/sqlova-master/bert/modeling.py", line 226, in forward
    attention_scores = torch.matmul(query_layer, key_layer.transpose(-1, -2))
RuntimeError: CUDA out of memory. Tried to allocate 10.50 MiB (GPU 0; 11.17 GiB total capacity; 10.59 GiB already allocated; 5.69 MiB free; 257.72 MiB cached)

Typos in ReadMe.md

While going through the readme file, I found a few typos:


  1. guery - query
  2. coverted - converted

Please rectify them.

What does "wvi_corenlp" mean?

{"table_id":"1-1000181-1","phase":1,"question":"What is the current series where the new series began in June 2011?","question_tok":["What","is","the","current","series","where","the","new","series","began","in","June","2011","?"],"sql":{"sel":4,"conds":[[5,0,"New series began in June 2011"]],"agg":0},"query":{"sel":4,"conds":[[5,0,"New series began in June 2011"]],"agg":0},"wvi_corenlp":[[7,12]]}

In your annotated training file "train_tok.jsonl", what does "wvi_corenlp":[[7,12]] mean?
How do I produce this file "train_tok.jsonl" for a new training set? Thanks.
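
Reading off the example above, [[7, 12]] appears to be the inclusive (start, end) token indices of each where-condition value inside question_tok; a quick check:

    # Slice question_tok with the inclusive indices from "wvi_corenlp": [[7, 12]].
    question_tok = ["What", "is", "the", "current", "series", "where", "the",
                    "new", "series", "began", "in", "June", "2011", "?"]
    start, end = 7, 12
    print(" ".join(question_tok[start:end + 1]))
    # -> new series began in June 2011   (matches the condition value, up to case)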

RuntimeError: Buy new RAM!0

Hi,
I'm trying to run train.py but I'm getting this error. Should I really buy new RAM?
Thanks

Traceback (most recent call last):
File "train.py", line 604, in
dset_name='train')
File "train.py", line 240, in train
num_out_layers_n=num_target_layers, num_out_layers_h=num_target_layers)
File "C:\Users\Admin\Desktop\Projects\University\Software\Trainer+Evaluator\sqlova-master\sqlova\utils\utils_wikisql.py", line 820, in get_wemb_bert
nlu_tt, t_to_tt_idx, tt_to_t_idx = get_bert_output(model_bert, tokenizer, nlu_t, hds, max_seq_length)
File "C:\Users\Admin\Desktop\Projects\University\Software\Trainer+Evaluator\sqlova-master\sqlova\utils\utils_wikisql.py", line 754, in get_bert_output
all_encoder_layer, pooled_output = model_bert(all_input_ids, all_segment_ids, all_input_mask)
File "C:\Users\Admin\Miniconda3\lib\site-packages\torch\nn\modules\module.py", line 493, in call
result = self.forward(*input, **kwargs)
File "C:\Users\Admin\Desktop\Projects\University\Software\Trainer+Evaluator\sqlova-master\bert\modeling.py", line 396, in forward
all_encoder_layers = self.encoder(embedding_output, extended_attention_mask)
File "C:\Users\Admin\Miniconda3\lib\site-packages\torch\nn\modules\module.py", line 493, in call
result = self.forward(*input, **kwargs)
File "C:\Users\Admin\Desktop\Projects\University\Software\Trainer+Evaluator\sqlova-master\bert\modeling.py", line 326, in forward
hidden_states = layer_module(hidden_states, attention_mask)
File "C:\Users\Admin\Miniconda3\lib\site-packages\torch\nn\modules\module.py", line 493, in call
result = self.forward(*input, **kwargs)
File "C:\Users\Admin\Desktop\Projects\University\Software\Trainer+Evaluator\sqlova-master\bert\modeling.py", line 311, in forward
attention_output = self.attention(hidden_states, attention_mask)
File "C:\Users\Admin\Miniconda3\lib\site-packages\torch\nn\modules\module.py", line 493, in call
result = self.forward(*input, **kwargs)
File "C:\Users\Admin\Desktop\Projects\University\Software\Trainer+Evaluator\sqlova-master\bert\modeling.py", line 272, in forward
self_output = self.self(input_tensor, attention_mask)
File "C:\Users\Admin\Miniconda3\lib\site-packages\torch\nn\modules\module.py", line 493, in call
result = self.forward(*input, **kwargs)
File "C:\Users\Admin\Desktop\Projects\University\Software\Trainer+Evaluator\sqlova-master\bert\modeling.py", line 236, in forward
attention_probs = self.dropout(attention_probs)
File "C:\Users\Admin\Miniconda3\lib\site-packages\torch\nn\modules\module.py", line 493, in call
result = self.forward(*input, **kwargs)
File "C:\Users\Admin\Miniconda3\lib\site-packages\torch\nn\modules\dropout.py", line 58, in forward
return F.dropout(input, self.p, self.training, self.inplace)
File "C:\Users\Admin\Miniconda3\lib\site-packages\torch\nn\functional.py", line 830, in dropout
else _VF.dropout(input, p, training))
RuntimeError: [enforce fail at ..\c10\core\CPUAllocator.cpp:62] data. DefaultCPUAllocator: not enough memory: you tried to allocate %dGB. Buy new RAM!0

about the B and C modules in your paper

In your paper, what are the roles of the Decoder-Layer and the NL2SQL-Layer, respectively, and in what order are they executed? I don't quite understand your meaning.
Thanks.

How to fine tune sequence-to-SQL?

I want to fine-tune sequence-to-SQL, which turns natural language sentences into a program and gets the arithmetic and the nth value.
What should I do?

pickle problem in train

Hi guys:
Everything before training goes well. However, when I get into the epochs, the problem below occurs. Have I done something wrong?

Microsoft Windows [Version 10.0.17134.556]
(c) 2018 Microsoft Corporation. All rights reserved.

(venv) C:\PycharmProjects\sqlova>python train.py --seed 1 --bS 16 --accumulate_gradients 2 --bert_type_abb uS --fine_tune --lr 0.001 --lr_bert 0.00001 --max_seq_le
ng 222
BERT-type: uncased_L-12_H-768_A-12
Batch_size = 32
BERT parameters:
learning rate: 1e-05
Fine-tune BERT: True
vocab size: 30522
hidden_size: 768
num_hidden_layer: 12
num_attention_heads: 12
hidden_act: gelu
intermediate_size: 3072
hidden_dropout_prob: 0.1
attention_probs_dropout_prob: 0.1
max_position_embeddings: 512
type_vocab_size: 2
initializer_range: 0.02
Load pre-trained parameters.
Seq-to-SQL: the number of final BERT layers to be used: 2
Seq-to-SQL: the size of hidden dimension = 100
Seq-to-SQL: LSTM encoding layer size = 2
Seq-to-SQL: dropout rate = 0.3
Seq-to-SQL: learning rate = 0.001
Traceback (most recent call last):
File "train.py", line 591, in
dset_name='train')
File "train.py", line 211, in train
for iB, t in enumerate(train_loader):
File "C:\PycharmProjects\sqlova\venv\lib\site-packages\torch\utils\data\dataloader.py", line 822, in iter
return _DataLoaderIter(self)
File "C:\PycharmProjects\sqlova\venv\lib\site-packages\torch\utils\data\dataloader.py", line 563, in init
w.start()
File "C:\Python36\lib\multiprocessing\process.py", line 105, in start
self._popen = self._Popen(self)
File "C:\Python36\lib\multiprocessing\context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "C:\Python36\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "C:\Python36\lib\multiprocessing\popen_spawn_win32.py", line 65, in init
reduction.dump(process_obj, to_child)
File "C:\Python36\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'get_loader_wikisql..'

(venv) C:\PycharmProjects\sqlova>Traceback (most recent call last):
File "", line 1, in
File "C:\Python36\lib\multiprocessing\spawn.py", line 105, in spawn_main
exitcode = _main(fd)
File "C:\Python36\lib\multiprocessing\spawn.py", line 115, in _main
self = reduction.pickle.load(from_parent)
EOFError: Ran out of input

(venv) C:\PycharmProjects\sqlova>
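
For reference, a common workaround for this Windows pickling failure (an assumption, not something the repository documents) is to build the DataLoader with num_workers=0, so the local collate function defined inside get_loader_wikisql never has to be pickled by spawn-based multiprocessing. A self-contained sketch with toy data:

    from torch.utils.data import DataLoader

    # Hedged sketch: with num_workers=0 the DataLoader runs in the main process,
    # avoiding "AttributeError: Can't pickle local object ..." on Windows.
    toy_data = [{'question': 'q%d' % i} for i in range(8)]   # stand-in for train_data

    train_loader = DataLoader(
        dataset=toy_data,
        batch_size=4,
        shuffle=True,
        num_workers=0,             # key change for Windows
        collate_fn=lambda x: x,    # placeholder; the repo supplies its own collate_fn
    )
    for batch in train_loader:
        print(batch)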

How to use trained model for sample questions?

Hi,
I am new to Deep Learning. Your solution for seq2SQL is very impressive. Could you please provide a prediction.py file so that we can use the trained model on sample questions and get a sense of it?

Appreciate your help.

Thanks,
Prasad.P

how to test?

Excuse me, I'm not sure how to test. I already have test.jsonl and test.table.jsonl. How can I load the trained model, then load test.jsonl and test.table.jsonl into the model,
make predictions with the model, and output the predicted SQL statements?
Because I want to make a manual comparison between the predicted SQL statements and the correct SQL statements.
Thanks again

train_shallow_layer.py doesn't train correctly

I'm trying to train the shallow-layer model, and after 4-5 epochs I'm still seeing acc_lx close to zero. Is that normal? If you have an example training run log and the associated losses, that would be great. I want to make sure that something isn't broken before letting it train for a couple of days.

The loss actually doesn't seem to change at all between epochs, so I think training isn't happening, but I haven't modified the source code other than the paths.

Including my training configuration and output log below. I have an 11GB GPU so I had to change the batch size and gradient accumulation to prevent out of memory errors.

python train_shallow_layer.py --seed 1 --bS 8 --accumulate_gradients 4 --bert_type_abb uS --fine_tune --lr 0.001 --lr_bert 0.00001 --max_seq_leng 222

BERT-type: uncased_L-12_H-768_A-12
Batch_size = 32
BERT parameters:
learning rate: 1e-05
Fine-tune BERT: True
vocab size: 30522
hidden_size: 768
num_hidden_layer: 12
num_attention_heads: 12
hidden_act: gelu
intermediate_size: 3072
hidden_dropout_prob: 0.1
attention_probs_dropout_prob: 0.1
max_position_embeddings: 512
type_vocab_size: 2
initializer_range: 0.02
Load pre-trained parameters.
Seq-to-SQL: the number of final BERT layers to be used: 1
Seq-to-SQL: the size of hidden dimension = 100
Seq-to-SQL: LSTM encoding layer size = 2
Seq-to-SQL: dropout rate = 0.3
Seq-to-SQL: learning rate = 0.001


train results ------------
 Epoch: 0, ave loss: 6.216941231456922, acc_sc: 0.163, acc_sa: 0.717, acc_wn: 0.590,         acc_wc: 0.092, acc_wo: 0.547, acc_wvi: 0.016, acc_wv: 0.016, acc_lx: 0.000, acc_x: 0.001
dev results ------------
 Epoch: 0, ave loss: 6.288717828444157, acc_sc: 0.174, acc_sa: 0.715, acc_wn: 0.683,         acc_wc: 0.143, acc_wo: 0.658, acc_wvi: 0.016, acc_wv: 0.027, acc_lx: 0.000, acc_x: 0.001
 Best Dev lx acc: 0.00023750148438427741 at epoch: 0
train results ------------
 Epoch: 1, ave loss: 6.191903309470515, acc_sc: 0.166, acc_sa: 0.720, acc_wn: 0.692,         acc_wc: 0.113, acc_wo: 0.668, acc_wvi: 0.028, acc_wv: 0.028, acc_lx: 0.000, acc_x: 0.001
dev results ------------
 Epoch: 1, ave loss: 6.2836473057494, acc_sc: 0.168, acc_sa: 0.715, acc_wn: 0.683,         acc_wc: 0.148, acc_wo: 0.658, acc_wvi: 0.007, acc_wv: 0.014, acc_lx: 0.000, acc_x: 0.000
 Best Dev lx acc: 0.00023750148438427741 at epoch: 0
train results ------------
 Epoch: 2, ave loss: 6.187300067725954, acc_sc: 0.167, acc_sa: 0.720, acc_wn: 0.693,         acc_wc: 0.113, acc_wo: 0.669, acc_wvi: 0.033, acc_wv: 0.033, acc_lx: 0.000, acc_x: 0.001
dev results ------------
 Epoch: 2, ave loss: 6.283452489599453, acc_sc: 0.169, acc_sa: 0.715, acc_wn: 0.683,         acc_wc: 0.152, acc_wo: 0.658, acc_wvi: 0.000, acc_wv: 0.000, acc_lx: 0.000, acc_x: 0.001
 Best Dev lx acc: 0.00023750148438427741 at epoch: 0

Pre-Trained model

Can I get access to any pre-trained model which I can load and test?

Annotated queries not lower-cased

Hi, may I ask if all the queries are supposed to be lowercased during annotation? In your code, the uncased model (uS) is the default setting for BERT; however, when annotating queries, they are not being lowercased.

Thank you very much for the help!

new error when run train.py

This occurs the first time I run train.py in PyCharm on Win10.
Can you tell me why, and how to fix it? Thanks.

train.py runs forever... 15 hrs on a GTX 960 that I had to abort

➜ sqlova git:(master) python3 train.py --seed 1 --bS 16 --accumulate_gradients 2 --bert_type_abb uS --fine_tune --lr 0.001 --lr_bert 0.00001 --max_seq_leng 222

BERT-type: uncased_L-12_H-768_A-12
Batch_size = 32
BERT parameters:
learning rate: 1e-05
Fine-tune BERT: True
vocab size: 30522
hidden_size: 768
num_hidden_layer: 12
num_attention_heads: 12
hidden_act: gelu
intermediate_size: 3072
hidden_dropout_prob: 0.1
attention_probs_dropout_prob: 0.1
max_position_embeddings: 512
type_vocab_size: 2
initializer_range: 0.02
Load pre-trained parameters.
Seq-to-SQL: the number of final BERT layers to be used: 2
Seq-to-SQL: the size of hidden dimension = 100
Seq-to-SQL: LSTM encoding layer size = 2
Seq-to-SQL: dropout rate = 0.3
Seq-to-SQL: learning rate = 0.001
/home/leftnoteasy/miniconda3/lib/python3.7/site-packages/torch/nn/functional.py:1386: UserWarning: nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.
warnings.warn("nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.")

^C
Traceback (most recent call last):
File "train.py", line 605, in
dset_name='train')
File "train.py", line 241, in train
num_out_layers_n=num_target_layers, num_out_layers_h=num_target_layers)
File "/home/leftnoteasy/borde/sqlova/sqlova/utils/utils_wikisql.py", line 817, in get_wemb_bert
nlu_tt, t_to_tt_idx, tt_to_t_idx = get_bert_output(model_bert, tokenizer, nlu_t, hds, max_seq_length)
File "/home/leftnoteasy/borde/sqlova/sqlova/utils/utils_wikisql.py", line 751, in get_bert_output
all_encoder_layer, pooled_output = model_bert(all_input_ids, all_segment_ids, all_input_mask)
File "/home/leftnoteasy/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in call
result = self.forward(*input, **kwargs)
File "/home/leftnoteasy/borde/sqlova/bert/modeling.py", line 396, in forward
all_encoder_layers = self.encoder(embedding_output, extended_attention_mask)
File "/home/leftnoteasy/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in call
result = self.forward(*input, **kwargs)
File "/home/leftnoteasy/borde/sqlova/bert/modeling.py", line 326, in forward
hidden_states = layer_module(hidden_states, attention_mask)
File "/home/leftnoteasy/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in call
result = self.forward(*input, **kwargs)
File "/home/leftnoteasy/borde/sqlova/bert/modeling.py", line 311, in forward
attention_output = self.attention(hidden_states, attention_mask)
File "/home/leftnoteasy/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in call
result = self.forward(*input, **kwargs)
File "/home/leftnoteasy/borde/sqlova/bert/modeling.py", line 272, in forward
self_output = self.self(input_tensor, attention_mask)
File "/home/leftnoteasy/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in call
result = self.forward(*input, **kwargs)
File "/home/leftnoteasy/borde/sqlova/bert/modeling.py", line 215, in forward
mixed_query_layer = self.query(hidden_states)
File "/home/leftnoteasy/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in call
result = self.forward(*input, **kwargs)
File "/home/leftnoteasy/miniconda3/lib/python3.7/site-packages/torch/nn/modules/linear.py", line 92, in forward
return F.linear(input, self.weight, self.bias)
File "/home/leftnoteasy/miniconda3/lib/python3.7/site-packages/torch/nn/functional.py", line 1408, in linear
output = input.matmul(weight.t())
KeyboardInterrupt
^C

What's the intuition behind the segment id labels?

Dear authors, I'm new to BERT and hope to ask a question regarding the segment id setting.

tokens: ['[CLS]', 'how', 'many', 'table', 'singer', 'do', 'we', 'have', '?', '[SEP]', 'stadium', '[SEP]', 'singer', '[SEP]', 'concert', '[SEP]', 'singer', 'in', 'concert', '[SEP]']

segment id: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1]

Are there any particular thoughts behind setting all of the "[SEP]" tokens to 0 (and the last one to 1)? I think segment ids are used to separate sentences. Thanks in advance for answering my question!

about *.jsonl

Hello, I've run through the whole program and made predictions, but the files used for prediction are test.jsonl, test.table.jsonl, test.db and test_tok.jsonl.
Now I want to use my own data for prediction. I see your add_csv.py and add_question.py files, and I have two questions about them:

  1. For the "sql" key in *.jsonl, how do I fill in the values inside, such as "sel" and "agg"? Do I need to understand the database and fill them in myself?
  2. You do not provide the code to generate *_tok.jsonl, but the program needs this file. How do I create it?
    I hope to get your help. Thank you.

Not getting proper output with dates

Hi,
I'm using the same model on my custom datasets, and I'm running queries like "tickets available after 21 July 2019", but I'm not getting proper output. Can anyone please suggest how to handle this?

weird error occurs when running predict.py

Sorry, execution of evaluate_ws.py failed with an error.
Can you guess the cause?

Thank you in advance.

algorithm | dataset | Accuracy
gloom     | chaos   | 5
gloom     | violin  | 6
gloom     | tooth   | 5
brave     | chaos   | 4
brave     | violin  | 8
brave     | tooth   | 9

question:
Is the accuracy of the algorithm gloom in the dataset chaos smaller than the accuracy of the algorithm brave in the dataset tooth?
error:
invalid argument 0: Sizes of tensors must match except in dimension 2. Got 3 and 4 in dimension 1 at /Users/soumith/code/builder/wheel/pytorch-src/aten/src/TH/generic/THTensorMath.cpp:3616

assert problem

In sqlova/utils/utils_wikisql.py, in def get_bert_output, lines 728-735:

    while len(input_ids1) < max_seq_length:
        input_ids1.append(0)
        input_mask1.append(0)
        segment_ids1.append(0)

    assert len(input_ids1) == max_seq_length
    assert len(input_mask1) == max_seq_length
    assert len(segment_ids1) == max_seq_length
This will raise an error if --max_seq_leng is smaller than the actual sequence length.

On the actual situation

Thank you for your great work. In practical application, if I just give a "question", can SQLova output a predicted SQL query?

Multiple Where Clauses

Although the WikiSQL test dataset has only one where clause, if we use the shallow-layer model on our own training and test sets, can SQLova deal with multiple where clauses? I believe the functions get_wc1, get_wo1, and get_wv1 handle this such that you can have multiple condition 'lists'. I would love to get clarification on whether that's true.

What about the case of multiple select columns? Is this something that is possible already, or would I need to change the code to accommodate training data with multiple selected columns?

ProgrammingError: (sqlite3.ProgrammingError) Cannot operate on a closed database.

Hi, I've tried to run train.py and I am getting this error. What should I do to solve it?
Traceback (most recent call last):
File "train.py", line 607, in
dset_name='train')
File "train.py", line 310, in train
cnt_x1_list, g_ans, pr_ans = get_cnt_x_list(engine, tb, g_sc, g_sa, sql_i, pr_sc, pr_sa, pr_sql_i)
File "/home/ubuntu/hengan.liao/sqlova-master/sqlova/utils/utils_wikisql.py", line 1652, in get_cnt_x_list
g_ans1 = engine.execute(tb[b]['id'], g_sc[b], g_sa[b], g_sql_i[b]['conds'])
File "/home/ubuntu/hengan.liao/sqlova-master/sqlnet/dbengine.py", line 29, in execute
table_info = self.db.query('SELECT sql from sqlite_master WHERE tbl_name = :name', name=table_id).all()[0].sql.replace('\n','')
File "/home/ubuntu/anaconda3/envs/wcyEnv/lib/python3.6/site-packages/records.py", line 195, in all
rows = list(self)
File "/home/ubuntu/anaconda3/envs/wcyEnv/lib/python3.6/site-packages/records.py", line 126, in iter
yield next(self)
File "/home/ubuntu/anaconda3/envs/wcyEnv/lib/python3.6/site-packages/records.py", line 136, in next
nextrow = next(self._rows)
File "/home/ubuntu/anaconda3/envs/wcyEnv/lib/python3.6/site-packages/records.py", line 365, in
row_gen = (Record(cursor.keys(), row) for row in cursor)
File "/home/ubuntu/anaconda3/envs/wcyEnv/lib/python3.6/site-packages/sqlalchemy/engine/result.py", line 946, in iter
row = self.fetchone()
File "/home/ubuntu/anaconda3/envs/wcyEnv/lib/python3.6/site-packages/sqlalchemy/engine/result.py", line 1276, in fetchone
e, None, None, self.cursor, self.context
File "/home/ubuntu/anaconda3/envs/wcyEnv/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1466, in _handle_dbapi_exception
util.raise_from_cause(sqlalchemy_exception, exc_info)
File "/home/ubuntu/anaconda3/envs/wcyEnv/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 383, in raise_from_cause
reraise(type(exception), exception, tb=exc_tb, cause=cause)
File "/home/ubuntu/anaconda3/envs/wcyEnv/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 128, in reraise
raise value.with_traceback(tb)
File "/home/ubuntu/anaconda3/envs/wcyEnv/lib/python3.6/site-packages/sqlalchemy/engine/result.py", line 1268, in fetchone
row = self._fetchone_impl()
File "/home/ubuntu/anaconda3/envs/wcyEnv/lib/python3.6/site-packages/sqlalchemy/engine/result.py", line 1148, in _fetchone_impl
return self.cursor.fetchone()
sqlalchemy.exc.ProgrammingError: (sqlite3.ProgrammingError) Cannot operate on a closed database.
(Background on this error at: http://sqlalche.me/e/f405)

Prediction uses select column from the ground truth

Hi Wonseok:
Thanks for your great work here. The issue is as follows:
https://github.com/naver/sqlova/blob/master/train.py#L406
g_wvi_corenlp = get_g_wvi_corenlp(t)
This uses the "conds" in dev, and if one wants to predict from a plain English sentence, there is no way to get something like "{"conds":[[0,0,"1998"]], "sel":1, "agg":0}".

The following URL describes a similar problem in SQLNet:
xiaojunxu/SQLNet#12

Have I missed something, or do you have a similar snippet of code?

Thanks a lot.

How do I use predict.py for custom data?

I have CSV data with strings and numbers. How should I convert it into a DB and the other jsonl files so I can run inference on my custom data? add_csv.py isn't accepting string values in the CSV.

Thanks,
Bill.

RuntimeError: CUDA error: out of memory

BERT-type: uncased_L-12_H-768_A-12
Batch_size = 8
BERT parameters:
learning rate: 1e-05
Fine-tune BERT: True
vocab size: 30522
hidden_size: 768
num_hidden_layer: 12
num_attention_heads: 12
hidden_act: gelu
intermediate_size: 3072
hidden_dropout_prob: 0.1
attention_probs_dropout_prob: 0.1
max_position_embeddings: 512
type_vocab_size: 2
initializer_range: 0.02
Load pre-trained parameters.
Traceback (most recent call last):
File "train.py", line 576, in
model, model_bert, tokenizer, bert_config = get_models(args, BERT_PT_PATH)
File "train.py", line 156, in get_models
args.no_pretraining)
File "train.py", line 126, in get_bert
model_bert.to(device)
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 386, in to
return self._apply(convert)
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 193, in _apply
module._apply(fn)
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 193, in _apply
module._apply(fn)
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 199, in _apply
param.data = fn(param.data)
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 384, in convert
return t.to(device, dtype if t.is_floating_point() else None, non_blocking)
RuntimeError: CUDA error: out of memory

Testing models with predict.py does not give me any results file

I can't test your project using predict.py. I ran the "python predict.py" command with the required parameters, and it is still stuck at the step below without giving me the results file (in my case results_dev.jsonl).
I tested on both Windows and Ubuntu (CPU only, without CUDA).

XXXX@DESKTOP-YYYY:/mnt/c/users/administrateur/desktop/sqlova-master$ python3 predict.py --bert_type_abb uL --model_file models/model_best.pt --bert_model_file models/model_bert_best.pt --bert_path data --result_path result --data_path data --split dev
BERT-type: uncased_L-24_H-1024_A-16
Batch_size = 32
BERT parameters:
learning rate: 1e-05
Fine-tune BERT: False
vocab size: 30522
hidden_size: 1024
num_hidden_layer: 24
num_attention_heads: 16
hidden_act: gelu
intermediate_size: 4096
hidden_dropout_prob: 0.1
attention_probs_dropout_prob: 0.1
max_position_embeddings: 512
type_vocab_size: 2
initializer_range: 0.02
Seq-to-SQL: the number of final BERT layers to be used: 2
Seq-to-SQL: the size of hidden dimension = 100
Seq-to-SQL: LSTM encoding layer size = 2
Seq-to-SQL: dropout rate = 0.3
Seq-to-SQL: learning rate = 0.001

Missing Positional Requirements

When I run python3 train.py --seed 1 --bS 16 --accumulate_gradients 2 --bert_type_abb uS --fine_tune --lr 0.001 --lr_bert 0.00001 --max_seq_leng 222,
I get the error:
Traceback (most recent call last):
  File "train.py", line 582, in <module>
    model, model_bert, tokenizer, bert_config = get_models(args, BERT_PT_PATH, trained=True, path_model_bert=path_model_bert, path_model=path_model)
  File "train.py", line 156, in get_models
    args.no_pretraining)
  File "train.py", line 120, in get_bert
    model_bert = BertModel(bert_config)
TypeError: __init__() missing 2 required positional arguments: 'is_training' and 'input_ids'
I am using Python 3.6.7

Missing quotes in Where clause

I modified the code to make predictions for a user query. But in the generated SQL query, 'text' values do not have quotes around them in the WHERE clause.

Basically, I get

'SELECT avg(sentiScore) FROM review_data WHERE product = pizza'

Instead of generating

'SELECT avg(sentiScore) FROM review_data WHERE product = "pizza" '

Is there something I'm missing? Or is this a limitation of the DL approach?
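
One possible post-processing fix is sketched below, under the assumption that the predicted (sel, agg, conds) structure and the WikiSQL column types are available; none of the names below come from the repo, and the WikiSQL aggregation/operator orderings are assumed. The idea is simply to quote (and escape) condition values for text-typed columns when assembling the SQL string:

    # Hedged sketch: build the SQL string from the predicted (sel, agg, conds)
    # structure, quoting values of text-typed columns in the WHERE clause.
    agg_ops = ['', 'MAX', 'MIN', 'COUNT', 'SUM', 'AVG']
    cond_ops = ['=', '>', '<', 'OP']

    def to_sql(table_name, headers, types, sel, agg, conds):
        col = headers[sel]
        select = f"{agg_ops[agg]}({col})" if agg_ops[agg] else col
        where = []
        for ci, oi, val in conds:
            if types[ci] == 'text':
                val = "'" + str(val).replace("'", "''") + "'"   # quote and escape text values
            where.append(f"{headers[ci]} {cond_ops[oi]} {val}")
        sql = f"SELECT {select} FROM {table_name}"
        if where:
            sql += " WHERE " + " AND ".join(where)
        return sql

    print(to_sql("review_data", ["product", "sentiScore"], ["text", "real"],
                 1, 5, [[0, 0, "pizza"]]))
    # -> SELECT AVG(sentiScore) FROM review_data WHERE product = 'pizza'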

RuntimeError: cannot join current thread

I started the stanford-corenlp-full-2018-10-05 server at localhost and ran annotate_ws.py.
The error log is:

C:\Python36\python.exe C:/Users/gt/Desktop/sqlova-master/annotate_ws.py
annotating ./data\train.jsonl
loading tables
 99%|█████████▉| 18370/18585 [00:00<00:00, 17084.73it/s]loading examples
100%|██████████| 18585/18585 [00:01<00:00, 18542.01it/s]
 15%|█▌        | 8460/56355 [01:54<17:39, 45.21it/s]Traceback (most recent call last):
  File "C:\Python36\lib\site-packages\urllib3\connection.py", line 141, in _new_conn
    (self.host, self.port), self.timeout, **extra_kw)
  File "C:\Python36\lib\site-packages\urllib3\util\connection.py", line 83, in create_connection
    raise err
  File "C:\Python36\lib\site-packages\urllib3\util\connection.py", line 73, in create_connection
    sock.connect(sa)
OSError: [WinError 10048] Only one usage of each socket address (protocol/network address/port) is normally permitted.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Python36\lib\site-packages\urllib3\connectionpool.py", line 601, in urlopen
    chunked=chunked)
  File "C:\Python36\lib\site-packages\urllib3\connectionpool.py", line 357, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "C:\Python36\lib\http\client.py", line 1239, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "C:\Python36\lib\http\client.py", line 1285, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "C:\Python36\lib\http\client.py", line 1234, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "C:\Python36\lib\http\client.py", line 1026, in _send_output
    self.send(msg)
  File "C:\Python36\lib\http\client.py", line 964, in send
    self.connect()
  File "C:\Python36\lib\site-packages\urllib3\connection.py", line 166, in connect
    conn = self._new_conn()
  File "C:\Python36\lib\site-packages\urllib3\connection.py", line 150, in _new_conn
    self, "Failed to establish a new connection: %s" % e)
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x0000014AC11179E8>: Failed to establish a new connection: [WinError 10048] Only one usage of each socket address (protocol/network address/port) is normally permitted.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Python36\lib\site-packages\requests\adapters.py", line 440, in send
    timeout=timeout
  File "C:\Python36\lib\site-packages\urllib3\connectionpool.py", line 639, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "C:\Python36\lib\site-packages\urllib3\util\retry.py", line 388, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=9000): Max retries exceeded with url: /?properties=%7B%27annotators%27%3A+%27ssplit%2Ctokenize%27%2C+%27outputFormat%27%3A+%27serialized%27%2C+%27serializer%27%3A+%27edu.stanford.nlp.pipeline.ProtobufAnnotationSerializer%27%7D (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x0000014AC11179E8>: Failed to establish a new connection: [WinError 10048] Only one usage of each socket address (protocol/network address/port) is normally permitted.',))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:/Users/gt/Desktop/sqlova-master/annotate_ws.py", line 188, in <module>
    a = annotate_example_ws(d, tables[d['table_id']])
  File "C:/Users/gt/Desktop/sqlova-master/annotate_ws.py", line 118, in annotate_example_ws
    _wv_ann1 = annotate(str(conds11[2]))
  File "C:/Users/gt/Desktop/sqlova-master/annotate_ws.py", line 23, in annotate
    for s in client.annotate(sentence):
  File "C:\Python36\lib\site-packages\stanza\nlp\corenlp.py", line 119, in annotate
    doc_pb = self.annotate_proto(text, annotators)
  File "C:\Python36\lib\site-packages\stanza\nlp\corenlp.py", line 99, in annotate_proto
    r = self._request(text, properties)
  File "C:\Python36\lib\site-packages\stanza\nlp\corenlp.py", line 58, in _request
    r = requests.post(self.server, params={'properties': str(properties)}, data=text.encode('utf-8'))
  File "C:\Python36\lib\site-packages\requests\api.py", line 112, in post
    return request('post', url, data=data, json=json, **kwargs)
  File "C:\Python36\lib\site-packages\requests\api.py", line 58, in request
    return session.request(method=method, url=url, **kwargs)
  File "C:\Python36\lib\site-packages\requests\sessions.py", line 508, in request
    resp = self.send(prep, **send_kwargs)
  File "C:\Python36\lib\site-packages\requests\sessions.py", line 618, in send
    r = adapter.send(request, **kwargs)
  File "C:\Python36\lib\site-packages\requests\adapters.py", line 508, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=9000): Max retries exceeded with url: /?properties=%7B%27annotators%27%3A+%27ssplit%2Ctokenize%27%2C+%27outputFormat%27%3A+%27serialized%27%2C+%27serializer%27%3A+%27edu.stanford.nlp.pipeline.ProtobufAnnotationSerializer%27%7D (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x0000014AC11179E8>: Failed to establish a new connection: [WinError 10048] Only one usage of each socket address (protocol/network address/port) is normally permitted.',))
Exception ignored in: <bound method tqdm.__del__ of  15%|█▌        | 8460/56355 [01:54<17:39, 45.21it/s]>
Traceback (most recent call last):
  File "C:\Python36\lib\site-packages\tqdm\_tqdm.py", line 931, in __del__
    self.close()
  File "C:\Python36\lib\site-packages\tqdm\_tqdm.py", line 1133, in close
    self._decr_instances(self)
  File "C:\Python36\lib\site-packages\tqdm\_tqdm.py", line 496, in _decr_instances
    cls.monitor.exit()
  File "C:\Python36\lib\site-packages\tqdm\_monitor.py", line 52, in exit
    self.join()
  File "C:\Python36\lib\threading.py", line 1053, in join
    raise RuntimeError("cannot join current thread")
RuntimeError: cannot join current thread

Process finished with exit code 1
