Most modern Text2SQL systems prompt large language models (LLMs) with the entire schema -- mostly column information -- alongside the user's question. While effective on small databases, this approach fails on real-world schemas that exceed LLM context limits, even for commercial models. The recent Spider 2.0 benchmark exemplifies this challenge: its databases contain hundreds of tables and tens of thousands of columns, on which existing systems often break. Current mitigations either rely on costly multi-step prompting pipelines or filter columns by ranking each one against the user's question independently, ignoring inter-column structure. To scale existing systems, we introduce \toolname, an open-source, LLM-efficient schema filtering framework that compacts Text2SQL prompts by (i) ranking columns with a query-aware LLM encoder enriched with values and metadata, (ii) reranking inter-connected columns via a lightweight graph transformer over functional dependencies, and (iii) selecting a connectivity-preserving sub-schema with a Steiner-tree heuristic. Experiments on real datasets show that \toolname achieves near-perfect recall and higher precision than CodeS, SchemaExP, Qwen rerankers, and embedding retrievers, while maintaining sub-second median latency and scaling to schemas with 23,000+ columns. Our source code is available at https://github.com/thanhdath/grast-sql.
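To illustrate step (iii), the sketch below shows one classic Steiner-tree heuristic for connectivity-preserving sub-schema selection: grow a tree from the top-ranked "terminal" columns by repeatedly attaching the next terminal via a shortest join path over the schema graph. This is a minimal, hypothetical stand-in for \toolname's selection step, assuming columns as nodes and foreign-key/functional-dependency links as edges; the function and graph names are illustrative, not the paper's API.

```python
from collections import deque

def steiner_subschema(edges, terminals):
    """Select a connected sub-schema containing all terminal columns.

    edges: iterable of (column, column) links (e.g. FK / functional
           dependencies); terminals: ranked columns that must be kept.
    Heuristic: attach each terminal to the growing tree along a BFS
    shortest path, keeping every column on that path.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    terminals = list(terminals)
    tree = {terminals[0]}               # seed with the first terminal
    for t in terminals[1:]:
        if t in tree:
            continue
        # BFS outward from t until we touch the current tree.
        parent = {t: None}
        queue = deque([t])
        hit = None
        while queue:
            u = queue.popleft()
            if u in tree:
                hit = u
                break
            for v in adj.get(u, ()):
                if v not in parent:
                    parent[v] = u
                    queue.append(v)
        # Add every column on the connecting path (no-op if unreachable).
        while hit is not None:
            tree.add(hit)
            hit = parent[hit]
    return tree
```

For two terminals joined through two intermediate key columns, the heuristic keeps the bridging columns as well, so the returned sub-schema stays joinable when rendered back into the prompt.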