feat: complete after error syntax #334

Open
wants to merge 3 commits into
base: next

@liuxy0551 (Collaborator) commented Jul 25, 2024

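Autocompletion after SQL that contains a syntax error

Current behavior

  1. A syntax error in the preceding SQL prevents keywords such as INSERT from being suggested correctly at the cursor position: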
SELECT FROM tb1;
I|


  2. The SELECT * FROM that follows the malformed statement is parsed as multiple statements, so accurate autocompletion is not possible:
SELECT FROM tb1;
SELECT * FROM |


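Expected behavior

  1. When statements are separated by a semicolon, the position right after the semicolon becomes the left boundary, the right boundary stays unchanged, and the content within that range is passed to antlr4-c3 for parsing. Keywords such as INSERT are then expected to be suggested:
SELECT FROM tb1;
I|
  2. Without a semicolon, there is no way to tell that the SQL statement on the first line has ended, so accurate autocompletion is not possible: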
SELECT FROM tb1
I|

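Approach

The left and right boundaries mentioned here are described in dt-sql-parser #231.

The input is split on the delimiter (usually ;). The existing strategy of finding the smallest suitable range is kept and refined further: the parse range is narrowed in two additional ways.

  1. Within the range already found, start at the cursor and search left for the tokenIndex of a ;, and use it as the left boundary;
  2. Within the range already found, start at the cursor and search right for the tokenIndex of a ;, and use it as the right boundary;

When writing SQL, people rarely type the ; of the current statement first, so the right boundary usually does not change again. If no ; is found to the left or right, the boundaries are left unchanged.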
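The two narrowing steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `Tok`, `Range`, and `narrowByDelimiter` are hypothetical names, tokens are modeled as plain objects rather than ANTLR tokens, and the left boundary is assumed to be the token right after the `;`, as in the expected-behavior example.

```typescript
// Hypothetical minimal token shape; the real parser works on ANTLR tokens.
interface Tok {
    text: string;
    tokenIndex: number;
}

// Inclusive tokenIndex range handed to the completion engine.
interface Range {
    start: number;
    stop: number;
}

/**
 * Narrow an already-selected range around the caret using the nearest
 * delimiter on each side. A boundary is left unchanged when no delimiter
 * is found on that side.
 */
function narrowByDelimiter(
    tokens: Tok[],
    range: Range,
    caretTokenIndex: number,
    delimiter = ';'
): Range {
    let { start, stop } = range;

    // Search left of the caret; the new left boundary is the token
    // immediately after the delimiter (assumption, see lead-in).
    for (let i = caretTokenIndex - 1; i >= range.start; i--) {
        if (tokens[i]?.text === delimiter) {
            start = i + 1;
            break;
        }
    }

    // Search from the caret to the right; the delimiter itself becomes
    // the right boundary.
    for (let i = caretTokenIndex; i <= range.stop; i++) {
        if (tokens[i]?.text === delimiter) {
            stop = i;
            break;
        }
    }

    return { start, stop };
}

// "SELECT FROM tb1; I|" as a hand-written token list (whitespace omitted).
const demoTokens: Tok[] = ['SELECT', 'FROM', 'tb1', ';', 'I'].map(
    (text, tokenIndex) => ({ text, tokenIndex })
);
const narrowed = narrowByDelimiter(demoTokens, { start: 0, stop: 4 }, 4);
// narrowed is { start: 4, stop: 4 }: only "I" is passed on for parsing.
```

With no `;` in the range, the function returns the input range as-is, which corresponds to the second expected-behavior case where accurate completion is not possible.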

Result

[image: 2024-09-27 15 40 13]

@liuxy0551 liuxy0551 force-pushed the feat_complete branch 2 times, most recently from f6b1a3d to 56e4d0d Compare July 30, 2024 16:41
@liuxy0551 (Collaborator, Author) commented Jul 31, 2024

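Problems encountered along the way

I tried splitting on keywords that begin a standalone statement (e.g. SELECT, INSERT), but ran into several problems:

  1. Certain PostgreSQL syntax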
REVOKE SELECT (co_name) ON table_name |FROM PUBLIC;

GRANT SELECT (column_name) ON table_name TO |role_specification;

MERGE INTO wines w USING wine_stock_changes s ON s.winename = w.winename 
WHEN NOT MATCHED AND stock_delta > 0 
THEN INSERT (col_name) |VALUES(s.winename, s.stock_delta);

WITH with_query_name (col_name) AS (SELECT id FROM table_expression) SEARCH DEPTH 
FIRST BY column_name SET column_name 
CYCLE col_name SET col_name 
USING col_name SELECT|;

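In the statements above, SELECT, ON, and similar keywords do not behave as they do in ordinary standalone statements: they do not begin a statement of their own. Fragments obtained by splitting on them still cannot be autocompleted correctly.

  2. Subqueries (complex)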
SELECT c.customer_id, c.customer_name, c.email, total_orders.total_amount, total_orders.order_count
FROM customers c
JOIN (
  SELECT o.customer_id, SUM(o.total_amount) AS total_amount, COUNT(o.order_id) AS order_count
  FROM orders o
  WHERE o.order_date BETWEEN '2024-08-01' AND '2024-08-31'
  GROUP BY o.customer_id
  HAVING COUNT(o.order_id) > 5
) AS total_orders
ON c.customer_id = total_orders.customer_id
WHERE| c.status = 'active'
ORDER BY total_orders.total_amount DESC;

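The statement above contains nested subqueries. If the cursor sits after a subquery and is not at the same nesting level as that subquery, the split goes visibly wrong. The result is shown below: even the subquery's parentheses are incomplete, let alone correct autocompletion.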

  SELECT o.customer_id, SUM(o.total_amount) AS total_amount, COUNT(o.order_id) AS order_count
  FROM orders o
  WHERE o.order_date BETWEEN '2024-08-01' AND '2024-08-31'
  GROUP BY o.customer_id
  HAVING COUNT(o.order_id) > 5
) AS total_orders
ON c.customer_id = total_orders.customer_id
WHERE|

For these reasons, splitting on statement-leading keywords was abandoned; splitting is done only on the delimiter (usually ;).
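The subquery failure can be reproduced with a deliberately naive keyword splitter. The sketch below is illustrative only: `LEADING_KEYWORDS`, `naiveKeywordSlice`, and `parenBalance` are hypothetical helpers, the query is a shortened version of the example above, and the parenthesis count ignores string literals and comments.

```typescript
// Keywords assumed, for illustration, to begin a standalone statement.
const LEADING_KEYWORDS = ['SELECT', 'INSERT'];

/** Slice from the last leading keyword before the caret up to the caret. */
function naiveKeywordSlice(sql: string, caret: number): string {
    const before = sql.slice(0, caret);
    let cut = 0;
    for (const kw of LEADING_KEYWORDS) {
        const re = new RegExp(`\\b${kw}\\b`, 'gi');
        let m: RegExpExecArray | null;
        // Keep the right-most match position across all keywords.
        while ((m = re.exec(before)) !== null) {
            cut = Math.max(cut, m.index);
        }
    }
    return before.slice(cut);
}

/** Net parenthesis depth; a negative value means a ')' lost its '('. */
function parenBalance(text: string): number {
    let depth = 0;
    for (let i = 0; i < text.length; i++) {
        if (text[i] === '(') depth++;
        else if (text[i] === ')') depth--;
    }
    return depth;
}

const nestedSql =
    'SELECT c.id FROM customers c JOIN (SELECT o.id FROM orders o) AS t ON c.id = t.id WHERE';
const fragment = naiveKeywordSlice(nestedSql, nestedSql.length);
// fragment === 'SELECT o.id FROM orders o) AS t ON c.id = t.id WHERE'
// parenBalance(fragment) === -1: the opening '(' of the subquery is cut off.
```

Because the caret is at the outer level while the last SELECT belongs to the subquery, the fragment starts inside the parentheses and can never be parsed as a complete statement.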

@liuxy0551 liuxy0551 force-pushed the feat_complete branch 2 times, most recently from b610cb3 to d841e3c Compare August 26, 2024 08:54
@liuxy0551 liuxy0551 force-pushed the feat_complete branch 8 times, most recently from 944a97f to 417c063 Compare September 27, 2024 07:57
@liuxy0551 liuxy0551 marked this pull request as ready for review September 27, 2024 07:58
@liuxy0551 liuxy0551 changed the title from "test: complete after error syntax" to "feat: complete after error syntax" Sep 27, 2024
@liuxy0551 (Collaborator, Author) commented

A beta package has been published and verified in the offline product; the behavior matches expectations. [email protected], [email protected]

src/parser/flink/index.ts (resolved)
src/parser/common/basicSQL.ts (outdated, resolved)
@liuxy0551 liuxy0551 force-pushed the feat_complete branch 2 times, most recently from 9ce9722 to 6cebe8e Compare October 15, 2024 14:14
@openai0229 commented

So both the left and right boundaries are determined by semicolons?
