Robustness of Text-to-SQL Models

This repository contains the data and code in the following paper:

Towards Robustness of Text-to-SQL Models Against Natural and Realistic Adversarial Table Perturbation
Xinyu Pi*, Bing Wang*, Yan Gao, Jiaqi Guo, Zhoujun Li, Jian-Guang Lou
ACL 2022 Long Papers

Introduction

This repository is the official implementation of our paper Towards Robustness of Text-to-SQL Models Against Natural and Realistic Adversarial Table Perturbation. In this paper, we curate ADVETA, the first robustness evaluation benchmark featuring natural and realistic adversarial table perturbation. To defend against this perturbation, we build CTA, a systematic adversarial training example generation framework tailored for better contextualization of tabular data.

ADVETA

We manually curate the ADVErsarial Table perturbAtion (ADVETA) benchmark based on three mainstream Text-to-SQL datasets: Spider, WikiSQL, and WTQ (WikiTableQuestions). For each table in the original development sets, we conduct RPL and ADD annotation separately, perturbing only table columns. We release our data in the adveta_1.0.zip file.
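To see what the release contains before using it, the archive can be listed and unpacked with the Python standard library. This is a minimal sketch: the helper names `list_archive` and `extract_archive` and the target directory `adveta_1.0` are illustrative assumptions, and nothing is assumed here about the file layout inside the archive.

```python
import zipfile
from pathlib import Path


def list_archive(zip_path):
    """Return the sorted member names of a zip archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return sorted(archive.namelist())


def extract_archive(zip_path, target_dir):
    """Extract every member of the archive into target_dir and return that path."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    return target


if __name__ == "__main__":
    # Assumption: the script is run from the repository root,
    # where adveta_1.0.zip is located.
    archive_path = Path("adveta_1.0.zip")
    if archive_path.exists():
        for name in list_archive(archive_path):
            print(name)
        # "adveta_1.0" is a hypothetical output directory name.
        extract_archive(archive_path, "adveta_1.0")
    else:
        print(f"{archive_path} not found; run this from the repository root")
```

Listing the members first is a cheap way to confirm the download is intact before extracting.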

CTA

Requirements

python: 3.8
cuda: 10.1
torch: 1.7.1

Install dependencies:

conda create -n cta python=3.8  -y
conda activate cta
conda install pytorch==1.7.1  cudatoolkit=10.1 -c pytorch -y
python -m spacy download en_core_web_sm
pip install -r requirements.txt
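After installation, it can help to confirm that the active environment matches the versions listed above. The following is a small sketch, not part of the repository: the helper names `version_matches` and `collect_versions` are assumptions for illustration. It relies only on `sys.version_info`, `torch.__version__`, and `torch.version.cuda`, and reports torch as missing instead of failing if it is not installed.

```python
import importlib
import sys

# Versions taken from the requirements listed in this README.
EXPECTED = {"python": "3.8", "torch": "1.7.1", "cuda": "10.1"}


def version_matches(actual, expected):
    """True if `actual` begins with the dot-separated components of `expected`.

    A local build suffix such as "+cu101" is ignored.
    """
    if actual is None:
        return False
    core = actual.split("+")[0]
    wanted = expected.split(".")
    return core.split(".")[: len(wanted)] == wanted


def collect_versions():
    """Gather the Python, torch and CUDA versions of the current environment."""
    found = {
        "python": "{}.{}".format(*sys.version_info[:2]),
        "torch": None,
        "cuda": None,
    }
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        return found  # torch is not installed; leave its entries as None
    found["torch"] = torch.__version__
    found["cuda"] = torch.version.cuda  # None for CPU-only builds
    return found


if __name__ == "__main__":
    found = collect_versions()
    for name, expected in EXPECTED.items():
        status = "ok" if version_matches(found[name], expected) else "MISMATCH"
        print(f"{name}: expected {expected}, found {found[name]} [{status}]")
```

A mismatch is not necessarily fatal, but these are the versions this README lists, so other combinations are untested here.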

Introduction

The Contextualized Table Augmentation (CTA) framework is an adversarial training example generation approach tailored for tabular data. Before running pipeline.ipynb, download the data files and checkpoints from Google Drive.

Notes:

We download the Numberbatch word embeddings from here and save them as ./data/nb_emb.txt.
We pre-compute the processed WDC tables using the Tapas dense retrieval models and store the output in ./wdc/wdc_dense_A.txt and ./wdc/wdc_dense_B.txt (Tapas has two encoders).
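Since the notebook depends on these files, a quick pre-flight check can catch a missing download before a long run starts. This sketch is an assumption-laden convenience, not part of the repository: the helper `missing_files` is hypothetical, and the paths are assumed to be relative to the directory from which pipeline.ipynb is run.

```python
from pathlib import Path

# Paths named in the notes above, assumed relative to the notebook's directory.
REQUIRED_FILES = [
    "data/nb_emb.txt",
    "wdc/wdc_dense_A.txt",
    "wdc/wdc_dense_B.txt",
]


def missing_files(root, required=REQUIRED_FILES):
    """Return the entries of `required` that are not regular files under `root`."""
    root = Path(root)
    return [rel for rel in required if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_files(".")
    if missing:
        print("Missing files (download them before running pipeline.ipynb):")
        for rel in missing:
            print(f"  {rel}")
    else:
        print("All required data files are present.")
```

The check only confirms that the files exist; it does not validate their contents.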

Run

Run pipeline.ipynb and have fun.

Cite

@inproceedings{pi-etal-2022-towards,
    title = "Towards Robustness of Text-to-{SQL} Models Against Natural and Realistic Adversarial Table Perturbation",
    author = "Pi, Xinyu  and Wang, Bing  and Gao, Yan  and Guo, Jiaqi  and Li, Zhoujun  and Lou, Jian-Guang",
    month = may,
    year = "2022",
    address = "Dublin, Ireland",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.acl-long.142",
    pages = "2007--2022"
}
