diff --git a/README.md b/README.md
index 3da2982..55f0c5f 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,6 @@
 # Reward-SQL: Boosting Text-to-SQL via Stepwise Reasoning and Process-Supervised Rewards
 
-## :thought_balloon: Introduction
+## :thought_balloon:Introduction
 
 This repository contains the code for our paper "[Reward-SQL: Boosting Text-to-SQL via Stepwise Reasoning and Process-Supervised Rewards](https://arxiv.org/pdf/2505.04671)".
 
@@ -11,6 +11,15 @@ RewardSQL enhances Text-to-SQL generation through a comprehensive process-level
 
 ![Overview](overview.jpg)
 
+## :inbox_tray: Downloads
+| **Model and Dataset** | **Download Latest** |
+|-----------|------------------|
+| Bird-Schema-Data | [🤖 Modelscope](https://www.modelscope.cn/datasets/QIANME/bird_schema_data), [🤗 HuggingFace](https://huggingface.co/datasets/QIAN-ME/bird_schema_data) |
+| CoCTE SFT Model| [🤖 Modelscope](https://www.modelscope.cn/models/QIANME/CTE_SFT_Model), [🤗 HuggingFace](https://huggingface.co/QIAN-ME/CoCTE_SFT_Model) |
+| Process Reward Model| [🤖 Modelscope](https://www.modelscope.cn/models/QIANME/PRM_Model), [🤗 HuggingFace](https://huggingface.co/QIAN-ME/PRM_Model) |
+| GRPO Trained Model | [🤖 Modelscope](https://www.modelscope.cn/models/QIANME/GRPO_Model), [🤗 HuggingFace](https://huggingface.co/QIAN-ME/GRPO_Model) |
+
+
 ## :open_file_folder: Data Preparation
 
 We provide all necessary datasets in our Google Drive repository.
@@ -60,9 +69,9 @@ mkdir -p results
 ## :zap: Quick Start
 
 ### Download pre-trained models
-- [CoCTE SFT Model](https://drive.google.com/file/d/1hP8FO_VA7Lf9wwqHz_Uqvs3ccrSP_x66/view?usp=sharing): Put it under `checkpoints/cocte_model`.
-- [Process Reward Model](https://drive.google.com/file/d/1hP8FO_VA7Lf9wwqHz_Uqvs3ccrSP_x66/view?usp=sharing): Put it under `checkpoints/prm_model`.
-- [GRPO Trained Model](https://drive.google.com/file/d/1hP8FO_VA7Lf9wwqHz_Uqvs3ccrSP_x66/view?usp=sharing): Put it under `checkpoints/grpo_model`.
+- [CoCTE SFT Model](https://huggingface.co/QIAN-ME/CoCTE_SFT_Model): Put it under `checkpoints/cocte_model`.
+- [Process Reward Model](https://huggingface.co/QIAN-ME/PRM_Model): Put it under `checkpoints/prm_model`.
+- [GRPO Trained Model](https://huggingface.co/QIAN-ME/GRPO_Model): Put it under `checkpoints/grpo_model`.
 
 ### Text-to-SQL inference
 
@@ -130,4 +139,4 @@ We implement our reinforcement learning algorithm extending from [veRL](https://
 - [ ] Models used in the paper
 - [ ] Evaluation code
 - [ ] Datasets
-- [ ] GRPO training code -->
\ No newline at end of file
+- [ ] GRPO training code -->