SonglinLyu · SonglinLyu · Jun 24, 2024 · Jun 28, 2024 · Jul 15, 2024 · Jul 23, 2024
diff --git a/.gitignore b/.gitignore
@@ -14,19 +14,36 @@ data/eval
 output_pred/
 wandb/
 src/dbgpt-hub-sql/dbgpt_hub_sql/data/*
+src/dbgpt-hub-gql/dbgpt_hub_gql/data/*
+src/dbgpt-hub-sql/codellama/*
+src/dbgpt-hub-gql/codellama/*
+src/dbgpt-hub-sql/wandb/*
+src/dbgpt-hub-gql/wandb/*
 # But track the data/eval_data folder itself
 !src/dbgpt-hub-sql/dbgpt_hub_sql/data/eval_data/
 !src/dbgpt-hub-sql/dbgpt_hub_sql/data/dataset_info.json
 !src/dbgpt-hub-sql/dbgpt_hub_sql/data/example_text2sql.json
+!src/dbgpt-hub-gql/dbgpt_hub_gql/data/tugraph-db-example
+!src/dbgpt-hub-gql/dbgpt_hub_gql/data/dataset_info.json
+!src/dbgpt-hub-gql/dbgpt_hub_gql/data/example_text2sql.json
 
 # Ignore everything under dbgpt_hub_sql/output/ except the adapter directory
+src/dbgpt-hub-sql/dbgpt_hub_sql/output/
 src/dbgpt-hub-sql/dbgpt_hub_sql/output/adapter/*
 !src/dbgpt-hub-sql/dbgpt_hub_sql/output/adapter/.gitkeep
 src/dbgpt-hub-sql/dbgpt_hub_sql/output/logs/*
 !src/dbgpt-hub-sql/dbgpt_hub_sql/output/logs/.gitkeep
 src/dbgpt-hub-sql/dbgpt_hub_sql/output/pred/*
 !src/dbgpt-hub-sql/dbgpt_hub_sql/output/pred/.gitkeep
 
+src/dbgpt-hub-gql/dbgpt_hub_gql/output/
+src/dbgpt-hub-gql/dbgpt_hub_gql/output/adapter/*
+!src/dbgpt-hub-gql/dbgpt_hub_gql/output/adapter/.gitkeep
+src/dbgpt-hub-gql/dbgpt_hub_gql/output/logs/*
+!src/dbgpt-hub-gql/dbgpt_hub_gql/output/logs/.gitkeep
+src/dbgpt-hub-gql/dbgpt_hub_gql/output/pred/*
+!src/dbgpt-hub-gql/dbgpt_hub_gql/output/pred/.gitkeep
+
 # Ignore NLU output
 src/dbgpt-hub-nlu/output
 src/dbgpt-hub-nlu/data
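
A note on the `output/` rules in the hunk above: gitignore cannot re-include a file whose parent directory is excluded. Once `src/dbgpt-hub-gql/dbgpt_hub_gql/output/` is ignored as a directory, the later `!…/.gitkeep` negations may have no effect for placeholder files that are not already tracked. One possible alternative is sketched below; it is not part of this PR. It ignores the directory's contents rather than the directory itself. The three `!…/` directory lines are assumptions added for illustration, and the same shape would apply to the `dbgpt-hub-sql` paths.

```gitignore
# Ignore the contents of output/, not the directory itself
src/dbgpt-hub-gql/dbgpt_hub_gql/output/*
# Re-include the three subdirectories (assumed additions, not in the PR)
!src/dbgpt-hub-gql/dbgpt_hub_gql/output/adapter/
!src/dbgpt-hub-gql/dbgpt_hub_gql/output/logs/
!src/dbgpt-hub-gql/dbgpt_hub_gql/output/pred/
# Ignore what is inside each subdirectory, but keep the placeholder file
src/dbgpt-hub-gql/dbgpt_hub_gql/output/adapter/*
!src/dbgpt-hub-gql/dbgpt_hub_gql/output/adapter/.gitkeep
src/dbgpt-hub-gql/dbgpt_hub_gql/output/logs/*
!src/dbgpt-hub-gql/dbgpt_hub_gql/output/logs/.gitkeep
src/dbgpt-hub-gql/dbgpt_hub_gql/output/pred/*
!src/dbgpt-hub-gql/dbgpt_hub_gql/output/pred/.gitkeep
```

Running `git check-ignore -v` on one of the `.gitkeep` paths is a quick way to see which rule ends up applying.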

diff --git a/README.md b/README.md
@@ -24,9 +24,17 @@
   </p>
 
 
+
 [**简体中文**](README.zh.md) | [**Discord**](https://discord.gg/7uQnPuveTY) | [**Wechat**](https://github.com/eosphoros-ai/DB-GPT/blob/main/README.zh.md#%E8%81%94%E7%B3%BB%E6%88%91%E4%BB%AC) | [**Huggingface**](https://huggingface.co/eosphoros) | [**Community**](https://github.com/eosphoros-ai/community)
+[**Text2SQL**](README.md) | [**Text2GQL**](src/dbgpt-hub-gql/README.zh.md) | [**Text2NLU**](src/dbgpt-hub-nlu/README.zh.md)
+
 </div>
 
+
+## 🔥🔥🔥 News
+- Support [Text2NLU](src/dbgpt-hub-nlu/README.zh.md) fine-tuning to improve semantic understanding accuracy.
+- Support [Text2GQL](src/dbgpt-hub-gql/README.zh.md) fine-tuning to generate graph queries from natural language.
+
 ## Baseline
 - update time: 2023/12/08
 - metric: execution accuracy (ex)
@@ -675,14 +683,16 @@ Our work is primarily based on the foundation of numerous open-source contributi
 Thanks to all the contributors, especially @[JBoRu](https://github.com/JBoRu), who raised the [issue](https://github.com/eosphoros-ai/DB-GPT-Hub/issues/119) that prompted us to add a promising new evaluation method, Test Suite. As the paper 《SQL-PALM: IMPROVED LARGE LANGUAGE MODEL ADAPTATION FOR TEXT-TO-SQL》 notes, "We consider two commonly-used evaluation metrics: execution accuracy (EX) and test-suite accuracy (TS). EX measures whether the SQL execution outcome matches ground truth (GT), whereas TS measures whether the SQL passes all EX evaluations for multiple tests, generated by database augmentation. Since EX contains false positives, we consider TS as a more reliable evaluation metric".
 
 ## 7. Citation
-Please consider citing our project if you find it useful:
+If you find `DB-GPT-Hub` useful for your research or development, please cite the following <a href="https://arxiv.org/abs/2406.11434" target="_blank">paper</a>:
 
 ```bibtex
-@software{db-gpt-hub,
-    author = {DB-GPT-Hub Team},
-    title = {{DB-GPT-Hub}},
-    url = {https://github.com/eosphoros-ai/DB-GPT-Hub},
-    year = {2023}
+@misc{zhou2024dbgpthub,
+      title={DB-GPT-Hub: Towards Open Benchmarking Text-to-SQL Empowered by Large Language Models}, 
+      author={Fan Zhou and Siqiao Xue and Danrui Qi and Wenhui Shi and Wang Zhao and Ganglin Wei and Hongyang Zhang and Caigai Jiang and Gangwei Jiang and Zhixuan Chu and Faqiang Chen},
+      year={2024},
+      eprint={2406.11434},
+      archivePrefix={arXiv},
+      primaryClass={cs.DB}
 }
 ```
 

diff --git a/README.zh.md b/README.zh.md
@@ -23,10 +23,16 @@
   </p>
 
 
+
 [**English**](README.md) | [**Discord**](https://discord.gg/7uQnPuveTY) | [**Wechat**](https://github.com/eosphoros-ai/DB-GPT/blob/main/README.zh.md#%E8%81%94%E7%B3%BB%E6%88%91%E4%BB%AC) | [**Huggingface**](https://huggingface.co/eosphoros) | [**Community**](https://github.com/eosphoros-ai/community)
+[**Text2SQL**](README.zh.md) | [**Text2GQL**](src/dbgpt-hub-gql/README.zh.md) | [**Text2NLU**](src/dbgpt-hub-nlu/README.zh.md)
 </div>
 
 
+## 🔥🔥🔥 News
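+- Support [Text2NLU](src/dbgpt-hub-nlu/README.zh.md) fine-tuning to improve natural language understanding accuracy.
+- Support [Text2GQL](src/dbgpt-hub-gql/README.zh.md) fine-tuning to generate graph query statements from natural language.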
+
 ## Baseline
 - update time: 2023/12/08
 - metric: execution accuracy (ex)
@@ -648,14 +654,16 @@ poetry run python dbgpt_hub_sql/eval/evaluation.py --plug_value --input  Your_mo
  **20231104**: Special thanks to @[JBoRu](https://github.com/JBoRu) for raising the [issue](https://github.com/eosphoros-ai/DB-GPT-Hub/issues/119), which pointed out the shortcomings of our earlier evaluation approach based on the 95M database from the official website. As the paper 《SQL-PALM: IMPROVED LARGE LANGUAGE MODEL ADAPTATION FOR TEXT-TO-SQL》 notes, "We consider two commonly-used evaluation metrics: execution accuracy (EX) and test-suite accuracy (TS) [32]. EX measures whether SQL execution outcome matches ground truth (GT), whereas TS measures whether the SQL passes all EX evaluation for multiple tests, generated by database-augmentation. Since EX contains false positives, we consider TS as a more reliable evaluation metric".
 
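 ## 7. Citation
-If you find our project helpful for your research or production work, please consider citing `DB-GPT-Hub` in your references:
+If you find `DB-GPT-Hub` useful for your research or development, please cite the following <a href="https://arxiv.org/abs/2406.11434" target="_blank">paper</a>: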
 
 ```bibtex
-@software{db-gpt-hub,
-    author = {DB-GPT-Hub Team},
-    title = {{DB-GPT-Hub}},
-    url = {https://github.com/eosphoros-ai/DB-GPT-Hub},
-    year = {2023}
+@misc{zhou2024dbgpthub,
+      title={DB-GPT-Hub: Towards Open Benchmarking Text-to-SQL Empowered by Large Language Models}, 
+      author={Fan Zhou and Siqiao Xue and Danrui Qi and Wenhui Shi and Wang Zhao and Ganglin Wei and Hongyang Zhang and Caigai Jiang and Gangwei Jiang and Zhixuan Chu and Faqiang Chen},
+      year={2024},
+      eprint={2406.11434},
+      archivePrefix={arXiv},
+      primaryClass={cs.DB}
 }
 ```