beachwang / dail-sql Goto Github PK

View Code? Open in Web Editor NEW

336.0 336.0 53.0 2.02 MB

A efficient and effective few-shot NL2SQL method on GPT-4.

License: Apache License 2.0

Python 95.84% Shell 4.16%

dail-sql's People

Contributors

Stargazers

Watchers

dail-sql's Issues

How are foreign keys added to the schema?

Hello,

I just read your work and find that you add foreign keys to BS_p, OD_p, TR_p, and AS_p prompt formats. I'm curious about these formats with foreign keys, but I can't find details of these formats in your paper. Did I missed something or could you just give some examples about these formats?

Thank you.

Error during data_preprocess

Hello!

Thanks for great work! when I followed the instruction I got stuck when I ran the data_preprocess.py， it told me that 'Error while loading a tagger model (probably missing model file). Why did this situation occur? How to resolve this problem?

Thanks a lot!

About test question in Chinese.

Thanks for sharing the DAIl-SQL codes. I would like to ask when the question is in Chinese, even if you create the Chinese-English mapping table in advance, there will be problems with table positioning and column selection. Do you have any suggestions for this problem?

About preliminary model graphix

I have read your paper and replicated your experiment, but there's something I don't quite understand. Perhaps I missed it, but could you please point out to me how the preliminary model graphix is introduced in your code? I would greatly appreciate it!

Are there any finetuned open-source models that the paper mentioned?🤗

Question on Pre-Predicted s'

Hello, I just read your paper and am taking a look at the code. Thank you for your contributions.

There was one element in the paper that I did not fully understand which I hope you can assist me with.

Where does the pre-predicted s' come from? Does this arrive from a separate, preliminary model, and is this model within the codebase?

python generate_question.py \
--data_type spider \
--split test \
--tokenizer gpt-3.5-turbo \
--max_seq_len 4096 \
--selector_type EUCDISMASKPRESKLSIMTHR \
--pre_test_result [your_pre_generated_queries_file] \
--prompt_repr SQL \
--k_shot 9 \
--example_type QA

I saw that in an example run setup for considering both question similarity and query similarity, the "pre_test_result" flag was included. I am unsure where these pre generated queries arrive from, and would greatly appreciate an explanation. Thank you.

stanford core nlp link on readme page not working

Can't download

Stanford core nlp

Can you provide code for calculating the EM accuracy and EX accuracy of prediction results?

Can you provide the code for assessment?

Can you provide the code for evaluation?

error when generate SQL by GPT-4 for EUCDISMASKPRESKLSIMTHR as the pre-generated SQL query

error when generate SQL by GPT-4 for EUCDISMASKPRESKLSIMTHR as the pre-generated SQL query
python ask_llm.py
--openai_api_key $1
--model gpt-4
--question ./dataset/process/SPIDER-TEST_SQL_9-SHOT_EUCDISQUESTIONMASK_QA-EXAMPLE_CTX-200_ANS-4096/
--end_index 2

question.json + pre_test_result [your_pre_generated_queries_file]

[UPDATE] Nevermind, your_pre_generated_queries_file is already included at ./results/DAIL-SQL+GPT-4.txt

I hope all is well, I'm attempting to re-implement this research paper, and I was wondering if there is any chance that you have a pre_generated_queries file that I could use to do some testing of the functionality. If not, no worries. Thanks and the paper is awesome, great work!

python generate_question.py
--data_type spider
--split test
--tokenizer gpt-3.5-turbo
--max_seq_len 4096
--selector_type EUCDISMASKPRESKLSIMTHR
--pre_test_result [your_pre_generated_queries_file]
--prompt_repr SQL
--k_shot 9
--example_type QA

pre_test_results BIRD-SQL

Hi,

Like BugMaker-Boyan, I'm super grateful that you have added the BIRD-SQL results -- it's definitely the best benchmark in Text-to-SQL right now.

I'm strapped for compute at the moment (aren't we all), and don't think I can afford to run the pre_test_result generation phase.

I want to check that the url below, contains the pre_test_results you used to Bird-SQL.

https://github.com/AlibabaResearch/DAMO-ConvAI/blob/main/bird/llm/exp_result/turbo_output_kg/predict_dev.json

If they are not the correct pre_test_results, I was wondering if you could release the pre_test_results you originally generated, similar to the results/graphix_result.txt for Spider dev?

I'm excited to see what you and your team do next!

About processed data

Hello, it's an honor to read your paper, which has inspired me deeply. When downloading your preprocessed data, I found the following four files, representing examples of how they were processed. Can you help me explain them?

Help wanted! Question about the `run_for_bird.sh`.

In the run_for_bird.py, I find two pieces of code that may have similar functionality, what is the difference between them?

BTW, If the following section code is used, will the 02_ask_chatgpt.py code file be open source?

Detailed README

Hello, I just read your paper and am looking at the code. I appreciate your efforts for this new method. I want to use this method on my database and I am not getting it after reading this how can I do this or it is just for spider data? I would be grateful if you could guide us on how can we use on own data.
Thanks !!

the process of generating the jsonl file

Hello, first of all I would like to congratulate you for the great work! I have been focusing on the code but I am having difficulties in understanding the process and steps in order to generate the jsonl file of spider data of test_schema_linking and train_schema_linking, could you help me in this ?

Did you use Spider-dev as the examples set when evaluate Spider-test

hi，thanks for sharing DAIl-SQL codes.
We notice the improvements by DAIL demo selection(EuclideanDistanceQuestionMaskPreSkeletonSimilarThresholdSelector)

Here is my question:
When evaluating at Spider-Test, did you add Spider-dev into the demo sets?

Results and prompts for the test set

Hi!

In the scripts provided the schema linking is only for the dev and train_spider sets, which have the same databases. To get the results reported in the paper for the test set, which has a different set of tables, how do I do the schema linking? Do I include every table in the database folder and in the tables file (having training and dev), or this will lower the results? Or do I alter the script in preprocess to do a separate schema link for the test?

Thanks
Anton

dev.json

Hello, I saw that in your code, "train_spider_and_others.json" and "train_others.json" were merged to get train_spider_and_others. Would you like to ask if you used all the spider data sets in your paper, or just the dev.json data set?

I see that the default data in generate_question.py uses dev.json

Did you train the model using train_spider_and_others and then test it using dev.json?

Script or code for evaluation?

Hi,
I would appreciate it if you could provide me with the evaluation script since there is no information about it in the repository. Although I am aware that you use some other packages for evaluation, I am not sure how you utilized them to assess the results.

"target" doesn't have contribute "pre_skeleton" , how to fix it ?

The error comes from here:
if jaccard_similarity(train_json[index]["pre_skeleton"], target["pre_skeleton"]) < self.threshold:

the details as below:
File "/home/jiangshan/code/DAIL-SQL/prompt/ExampleSelectorTemplate.py", line 353, in get_examples
if jaccard_similarity(train_json[index]["pre_skeleton"], target["pre_skeleton"]) < self.threshold:
File "/home/jiangshan/code/DAIL-SQL/prompt/PromptICLTemplate.py", line 51, in format
examples = self.get_examples(target, self.NUM_EXAMPLE * scope_factor, cross_domain=cross_domain)
File "/home/jiangshan/code/DAIL-SQL/generate_question.py", line 87, in
question_format = prompt.format(target=question_json,
KeyError: 'pre_skeleton'

index = 2039

train_json[index]:
{'db_id': 'party_people', 'query': 'SELECT count() FROM region', 'query_toks': ['SELECT', 'count', '(', '', ')', 'FROM', 'region'], 'query_toks_no_value': ['select', 'count', '(', '*', ')', 'from', 'region'], 'question': 'How many regions do we have?', 'question_toks': ['How', 'many', 'regions', 'do', 'we', 'have', '?'], 'sql': {'from': {...}, 'select': [...], 'where': [...], 'groupBy': [...], 'having': [...], 'orderBy': [...], 'limit': None, 'intersect': None, 'union': None, 'except': None}, 'tables': [{...}, {...}, {...}, {...}], 'query_skeleton': 'select count ( _ ) from _', 'path_db': '/home/jiangshan/data/datasets/llm/spider/database/party_people/party_people.sqlite', 'sc_link': {'q_col_match': {...}, 'q_tab_match': {...}}, 'cv_link': {'num_date_match': {}, 'cell_match': {}}, 'question_for_copying': ['how', 'many', 'regions', 'do', 'we', 'have', '?'], 'column_to_table': {'0': None, '1': 0, '2': 0, '3': 0, '4': 0, '5': 0, '6': 0, '7': 1, '8': 1, '9': 1, '10': 1, '11': 1, '12': 1, '13': 2, '14': 2, '15': 2, '16': 2, '17': 3, '18': 3, ...}, 'table_names_original': ['region', 'party', 'member', 'party_events'], 'question_pattern': 'how many _ do we have ?', 'pre_skeleton': 'select count ( _ ) from _'}

The target is :
{'db_id': 'concert_singer', 'query': 'SELECT count() FROM singer', 'query_toks': ['SELECT', 'count', '(', '', ')', 'FROM', 'singer'], 'query_toks_no_value': ['select', 'count', '(', '*', ')', 'from', 'singer'], 'question': 'How many singers do we have?', 'question_toks': ['How', 'many', 'singers', 'do', 'we', 'have', '?'], 'sql': {'from': {...}, 'select': [...], 'where': [...], 'groupBy': [...], 'having': [...], 'orderBy': [...], 'limit': None, 'intersect': None, 'union': None, 'except': None}, 'tables': [{...}, {...}, {...}, {...}], 'query_skeleton': 'select count ( _ ) from _', 'path_db': '/home/jiangshan/data/datasets/llm/spider/database/concert_singer/concert_singer.sqlite', 'sc_link': {'q_col_match': {...}, 'q_tab_match': {...}}, 'cv_link': {'num_date_match': {}, 'cell_match': {}}, 'question_for_copying': ['how', 'many', 'singers', 'do', 'we', 'have', '?'], 'column_to_table': {'0': None, '1': 0, '2': 0, '3': 0, '4': 0, '5': 0, '6': 0, '7': 0, '8': 1, '9': 1, '10': 1, '11': 1, '12': 1, '13': 1, '14': 1, '15': 2, '16': 2, '17': 2, '18': 2, ...}, 'table_names_original': ['stadium', 'singer', 'concert', 'singer_in_concert'], 'question_pattern': 'how many _ do we have ?'}

The debug args are:
"args": [
"--data_type", "spider",
"--split", "test",
"--tokenizer", "/home/schinta/data/model/llm/pre_train/THUDM/chatglm3-6b",
"--max_seq_len", "4096",
"--selector_type", "EUCDISMASKPRESKLSIMTHR",
"--prompt_repr", "SQL",
"--k_shot", "9",
"--example_type", "QA"
]

The pip list show as below:
Package Version

accelerate 0.28.0
aiofiles 23.2.1
aiohttp 3.8.4
aiosignal 1.3.1
altair 5.2.0
annotated-types 0.6.0
annoy 1.17.1
anyio 3.7.0
async-timeout 4.0.2
attrs 23.1.0
bpemb 0.3.5
certifi 2024.2.2
charset-normalizer 3.1.0
click 8.1.7
cmake 3.26.3
contourpy 1.1.1
corenlp-protobuf 3.8.0
cpm-kernels 1.0.11
cycler 0.12.1
dataclasses-json 0.5.7
distro 1.9.0
exceptiongroup 1.1.1
ffmpy 0.3.2
filelock 3.12.0
fonttools 4.50.0
frozenlist 1.3.3
fsspec 2023.5.0
gensim 4.3.2
greenlet 2.0.2
h11 0.14.0
httpcore 1.0.4
httpx 0.27.0
huggingface-hub 0.23.0
idna 3.7
importlib_resources 6.4.0
Jinja2 3.1.2
joblib 1.2.0
jsonschema 4.21.1
jsonschema-specifications 2023.12.1
kiwisolver 1.4.5
latex2mathml 3.77.0
lit 16.0.6
Markdown 3.6
markdown-it-py 3.0.0
MarkupSafe 2.1.2
marshmallow 3.19.0
marshmallow-enum 1.5.1
matplotlib 3.7.5
mdtex2html 1.3.0
mdurl 0.1.2
mpmath 1.3.0
multidict 6.0.4
mypy-extensions 1.0.0
nemoguardrails 0.3.0
networkx 3.1
nltk 3.8.1
numexpr 2.8.4
numpy 1.24.4
nvidia-cublas-cu11 11.10.3.66
nvidia-cublas-cu12 12.1.3.1
nvidia-cuda-cupti-cu11 11.7.101
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu11 11.7.99
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu11 11.7.99
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu11 8.5.0.96
nvidia-cudnn-cu12 8.9.2.26
nvidia-cufft-cu11 10.9.0.58
nvidia-cufft-cu12 11.0.2.54
nvidia-curand-cu11 10.2.10.91
nvidia-curand-cu12 10.3.2.106
nvidia-cusolver-cu11 11.4.0.1
nvidia-cusolver-cu12 11.4.5.107
nvidia-cusparse-cu11 11.7.4.91
nvidia-cusparse-cu12 12.1.0.106
nvidia-nccl-cu11 2.14.3
nvidia-nccl-cu12 2.20.5
nvidia-nvjitlink-cu12 12.4.127
nvidia-nvtx-cu11 11.7.91
nvidia-nvtx-cu12 12.1.105
openai 1.30.1
openapi-schema-pydantic 1.2.4
orjson 3.9.15
packaging 24.0
pandas 2.0.3
pillow 10.2.0
pip 24.0
pkgutil_resolve_name 1.3.10
protobuf 3.20.3
psutil 5.9.8
pydantic 2.7.1
pydantic_core 2.18.2
pydub 0.25.1
Pygments 2.17.2
pyparsing 3.1.2
python-multipart 0.0.9
pytz 2024.1
PyYAML 6.0
referencing 0.34.0
regex 2023.5.5
requests 2.31.0
rfc3986 1.5.0
rich 13.7.1
rpds-py 0.18.0
ruff 0.3.4
safetensors 0.3.1
scikit-learn 1.2.2
scipy 1.10.1
semantic-version 2.10.0
sentence-transformers 2.2.2
sentencepiece 0.1.99
setuptools 65.5.1
shellingham 1.5.4
simpleeval 0.9.13
six 1.16.0
smart-open 7.0.4
sniffio 1.3.0
sql_metadata 2.11.0
SQLAlchemy 2.0.17
sqlparse 0.5.0
stanford-corenlp 3.9.2
sympy 1.12
threadpoolctl 3.1.0
tokenizers 0.13.3
tomlkit 0.12.0
toolz 0.12.1
torch 2.3.0
torchtext 0.18.0
torchvision 0.18.0
tqdm 4.65.0
transformers 4.27.1
triton 2.3.0
typer 0.10.0
typing_extensions 4.11.0
typing-inspect 0.9.0
tzdata 2024.1
urllib3 2.2.1
uvicorn 0.22.0
websockets 11.0.3
wheel 0.43.0
wrapt 1.16.0
yarl 1.9.2
zipp 3.18.1

Can you provide the pre-generated SQL query dataset？

ChatGPT4 API calls are expensive, can you provide the pre-generated SQL query dataset which is generated by GPT-4 for EUCDISMASKPRESKLSIMTHR？

open-source for BIRD dataset?

Thanks for your nice work.

Can you open source code & dev-predicted-sqls for re-implement BIRD dataset? I cannot find related code in this repo.

Thanks.

NameError in data_preprocess.py

DAIL-SQL/data_preprocess.py

Line 112 in d78ce34

if data_type == "spider":

Need to add data_type = args.data_type in the previous line to avoid the following error:

Traceback (most recent call last):
File "data_preprocess.py", line 112, in
if data_type == "spider":
NameError: name 'data_type' is not defined

Help wanted! What is the significance of extracting the skeleton?

I am very interested in your articles and code. When I read the code, I found that you use sentence-transformers to calculate Euclidean distance to extract examples. At the same time, I also noticed that you have extracted the question skeleton and SQL skeleton and they were used when generating question.json.

    def __init__(self, tokenizer: str, *args, **kwargs):
        self.tokenizer = get_tokenizer(tokenizer)
        self.example_qualities = []
        self.pattern_similarities = []

    def record_example_quality(self, examples, target):
        quality_list = []
        for example in examples:
            quality_list.append(jaccard_similarity(example["query_skeleton"], target["query_skeleton"]))
        self.example_qualities.append(quality_list)

    def get_example_quality(self):
        if self.example_qualities:
            return np.mean([num for row in self.example_qualities for num in row])
        else:
            return 1

    def get_example_quality_for_each(self):
        if self.example_qualities:
            return [np.mean(row) for row in self.example_qualities]
        else:
            return []

    def record_pattern_similarity(self, examples, target):
        similarity_list = []
        for example in examples:
            similarity_list.append(jaccard_similarity(example["question_pattern"], target["question_pattern"]))
        self.pattern_similarities.append(similarity_list)

    def get_pattern_similarity(self):
        if self.pattern_similarities:
            return np.mean([num for row in self.pattern_similarities for num in row])
        else:
            return 1

However, to my surprise, you only used the skeleton to calculate the average for output and did not use them in ask_llm.py.

So I'm curious about what is the purpose of extracting skeletons?

I am very interested in your work, could you please help me clarify my doubts? Thank you.

zero-shot results for bird

Dear All,

Would it be possible to share zero-shot results for bird dataset?

Thank you!

Why the execution accuracy in spider-dev dataset is different from the paper?

I eval the spider-dev using official test-suit-sql-eval scripts, with the GPT-4 results in your repo, but the execution accuracy is different.

The paper result: 83.5%

But I get: 76.2

run cmd:

python3 evaluation.py --gold ./my_test/gold_sqls.txt --pred ./my_test/DAIL-SQL+GPT-4+self-consistency.txt --db ./database/database/ --etype exec

                 easy                 medium               hard                 extra                all

count 248 446 174 166 1034
===================== EXECUTION ACCURACY =====================
execution 0.903 0.818 0.661 0.506 0.762

Error During Data Processing

Hi!

Thanks for the great work! I followed the instructions in the readme but got no lock while running python data_preprocess.py

It stucks at "test section linking: " at 0%. Any clues on what could be the possible reason?

Thanks a lot!

Database schema

Hello, I would like to ask, in the text2sql experiment,
is your database schema bound with natural language (referring to the question based on the database schema),
or is the natural language question independent of the database schema (according to the natural language question to find the matching database schema into the corresponding prompt)?

support qwen from vllm?

Error during generate questions

thanks for your good work！But I have a question when generate questions.How can I fix it

What is the result on the bird dataset with GPT-3.5-turbo

I would like to understand the performance of DIAL-SQL on the bird dataset under GPT-3.5-turbo. May I ask if you have any results?

issue with generate_question.py

Hi there!
I was initially implementing DAIL-SQL, and the version I pulled at the beginning of December worked perfectly fine. However, with the new commits and the updated version, I'm encountering the following error when I run this command:

python generate_question.py \
-- data_type spider \
--split test \
--tokenizer gpt-3.5-turbo \
--max_seq_len 4096 \
--prompt_repr SQL \
--k_shot 9 \
--example_type QA \
--selector_type  EUCDISQUESTIONMASK

Error:

File "generate_question.py", line 29, in <module>
    REPR_TYPE.OPENAI_DEMOSTRATION_WFK,
AttributeError: type object 'REPR_TYPE has no attribute  'OPENAI_DEMOSTRATION_WFK'

Help wanted! Question about the bird `Question-Schema Linking`

Looking at the source code, it seems that I only see Spider using RAT SQL to calculate q-s linking code, but I do not see bird using RAT SQL to calculate q-s linking code.In the exposed code, bird is more like a few shot without q-s linking:

I would like to ask whether the result on the list uses RAT to calculate q-s linking? Thank you.

System prompt to be used with OpenAI models

Hi!

Thanks for the paper and this code. Really insightful for Text2SQL work!

I want to clarify if my understanding of the setup as described below is correct:

Given the parameter messages in a format as shown here: https://platform.openai.com/docs/api-reference/chat/create,

In your work for the role: system, the content is set as the default value of "You are a helpful assistant." and the content corresponding to role: user is what is modified using the different prompt templates like ODp and others.

Do you change the content corresponding to role: system to some other value?

What is the purpose of "data_preprocess.py"

I am a little dizzy about "data_preprocess.py". Could u please tell me roughly how it works？

Questions about execution accuracy in spider-dev dataset and DAIL-SQL

I am interested in your research and admire your state-of-the-art results, but I have two questions:

First, I tested with "--selector_type EUCDISMASKPRESKLSIMTHR" and gpt-3.5-turbo. These are the results. I don’t understand why the execution accuracy is so low (72.3%). Could you please help me with this problem?

PS C:\Users\86158\Desktop\text2sql\test-suite-sql-eval-master\test-suite-sql-eval-master> python evaluation.py --gold dev_gold.txt --pred RESULTS_MODEL-gpt-3.5-turbo.txt --db C:\Users\86158\Desktop\text2sql\test-suite-sql-eval-master\test-suite-sql-eval-master\database --etype exec
OK
easy medium hard extra all
count 248 446 174 166 1034
===================== EXECUTION ACCURACY =====================
execution 0.883 0.771 0.661 0.422 0.723

Second, in the paper titled "DAIL-SQL," you use cosine similarity, but the algorithm in the corresponding code (EUCDISMASKPRESKLSIMTHR) uses Euclidean distance. Am I choosing the wrong algorithm?

module 'torchtext' has no attribute 'vocab'

在我运行python data_preprocess.py命令时，得到了如下报错：

Traceback (most recent call last):
File "data_preprocess.py", line 131, in
schema_linking_producer(spider_dev, spider_train, spider_table, spider_db, spider_dir)
File "data_preprocess.py", line 35, in schema_linking_producer
word_emb = GloVe(kind='42B', lemmatize=True)
File "/Users/minstrel./PythonProjects/DAIL-SQL-main/utils/pretrained_embeddings.py", line 46, in init
self.glove = torchtext.vocab.GloVe(name=kind, cache=cache)
AttributeError: module 'torchtext' has no attribute 'vocab'

请问要怎么解决？

beachwang / dail-sql Goto Github PK

dail-sql's People

Contributors

Stargazers

Watchers

Forkers

dail-sql's Issues

Recommend Projects

Recommend Topics

Recommend Org