SQLGraphExplorer‐4

0. Weekly summary

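Wrote a fair amount of breadth-wise code, but not much depth-wise code.
Read up on ER diagrams, but they turned out not to be useful (~~well, somewhat useful: they can go into the background and related-work sections~~).
Fixed quite a few bugs, all of them caused by the early framework design.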

1. Current overall architecture

snt

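2. Graph improvements

Distinguish tables from columns (by color and arrow style)
Fetch the source text verbatim so that tokens do not run together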

// old-version
private final MultiMap<String, String> edges = new MultiMap<>();
// new-version
private final MultiMap<Node, Node> edges = new MultiMap<>();

public class Node {
    public NodeType nodeType;
    public String name;
    
    @Override
    public boolean equals(Object obj) {...}
    @Override
    public int hashCode() {...}
}

public enum NodeType {
    JOIN,
    WHERE,
    GROUP_BY,
    ORDER_BY,
    FUNCTION,
    UNION,
    CASE,
    MERGE,
    MERGE_UPDATE,
    MERGE_INSERT,
    TABLE,
    QUOTED_STRING,
    COLUMN;
}

String attribute = "[shape=\"%s\", color=\"%s\", style=filled]";
...
certainAttribute = switch (node.nodeType){
    case JOIN -> String.format(attribute, "parallelogram", "lightblue2");
    case WHERE -> String.format(attribute, "diamond", "lightblue2");
    case GROUP_BY -> String.format(attribute, "trapezium", "lightblue2");
    case ORDER_BY -> String.format(attribute, "house", "lightblue2");
    case FUNCTION -> String.format(attribute, "cds", "lightyellow2");
    case UNION ->  String.format(attribute, "circle", "lightblue2");
    case CASE ->  String.format(attribute, "rectangle", "lightyellow2");
    case MERGE ->  String.format(attribute, "polygon", "lightblue2");
    case MERGE_UPDATE ->  String.format(attribute, "fivepoverhang", "lightblue2");
    case MERGE_INSERT ->  String.format(attribute, "primersite", "lightblue2");
    default -> "";
};
...
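
Because Node now serves as the key of the edge map, equals and hashCode decide whether two mentions of the same table collapse into one vertex. Below is a minimal sketch of that contract, assuming equality on the pair (nodeType, name). GraphNode, Kind, and addEdge are hypothetical stand-ins for illustration, and a plain LinkedHashMap replaces MultiMap to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;

final class NodeKeySketch {
    // Hypothetical, shortened stand-in for NodeType.
    enum Kind { TABLE, COLUMN, WHERE }

    // Hypothetical stand-in for Node: equal when both kind and name are equal.
    static final class GraphNode {
        final Kind kind;
        final String name;

        GraphNode(Kind kind, String name) {
            this.kind = kind;
            this.name = name;
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (!(obj instanceof GraphNode)) return false;
            GraphNode other = (GraphNode) obj;
            return kind == other.kind && Objects.equals(name, other.name);
        }

        @Override
        public int hashCode() {
            // Must use the same fields as equals, or map lookups break.
            return Objects.hash(kind, name);
        }
    }

    // Appends dst to the list of targets stored under src.
    static void addEdge(Map<GraphNode, List<GraphNode>> edges, GraphNode src, GraphNode dst) {
        edges.computeIfAbsent(src, key -> new ArrayList<>()).add(dst);
    }
}
```

With this definition, a table named A and a column named A stay separate vertices, while two separately constructed TABLE nodes named A share one entry.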

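3. Visitor progress

3.1 Overall progress

There are too many graphs to paste here.

Data-driven development pushed me to look for more SQL files, because the original ones contained no order by, group by, from multi-tables, or select *.

These SQL files are the example data shipped with the official ANTLR 4 PL/SQL grammar, but they are all small. I wanted to find real SQL files used by banks, but could not find any... Link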

3.2 Tables

create
merge
from multi-tables
order by
group by
with … as

3.3 Columns

select *
select table.*
numeric_function

3.4 Rows

A little of the underlying Graph code is written, but I ran into some problems, which are discussed in detail later.

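4. ER Diagram in SQL

In 1975, Peter Pin-Shan Chen published "The Entity-Relationship Model-Toward a Unified View of Data", which introduced the entity-relationship model for data modeling. The paper first discusses information retrieval requests based on the entity-relationship model, then describes in detail how entities and relationships are organized and the information structures associated with them. It then compares the three major data models of the time, analyzes their respective strengths and weaknesses, and proposes the entity-relationship model as a way to unify the three. Finally, it presents a method and steps for database design using the entity-relationship model.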

Software to create database diagram - SSMS

4.1 Related products

SSMS Diagram tool
dbForge Studio’s SQL Designer

4.2 Features

Database Diagram Visualization Tools
Tracking logical relations between tables
Creating and editing database objects on a diagram

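4.3 Gaps

These tools mostly show SQL table relationships at a coarse granularity, and ER diagrams are mainly used by product managers to sort out the relationships among business objects while working through business logic.
Part of what they do is forward VQL and part is reverse QV, but the QV part is handled vaguely, so it is of little reference value to me...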

5. Bugs

5.1 Grammar problem

Expected graph: case -> SFGG ; case -> NBJGH

  -- buggy
  CASE
    WHEN A.A01AN IN ('01', '02', '03', '04', '05', '06', '07', '09') THEN '是'
    ELSE '否'
  END SFGG,
  -- works
  CASE
    WHEN A.B0110 = '0101' THEN B.B010A
    ELSE B.B01AA
  END AS NBJGH,

select_list_elements
    : tableview_name '.' ASTERISK
    | expression column_alias?
    ;

column_alias
    : AS? (identifier | quoted_string)
    | AS
    ;

// expression -> ... -> simple_case_statement
simple_case_statement
    : label_name? ck1=CASE expression simple_case_when_part+  case_else_part? END CASE? label_name?
    ;

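ANTLR resolves this ambiguity in favor of the leftmost match: whatever matches first is taken. Here the optional trailing label_name of simple_case_statement consumes the identifier before column_alias gets a chance.

SFGG -> label_name

NBJGH -> column_alias

This one gave me a headache for quite a while:

Changing the Visitor would just be building on the mistake, because it is the grammar file that matches incorrectly, and the change would also be hard to make.
Changing the grammar file: I don't dare...
In the end I took the brute-force route and changed the SQL file. (Hoping that people write their SQL a bit more carefully...)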
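
The manual edit could in principle be scripted as a preprocessing pass. The sketch below is a hypothetical helper, not part of the project: it inserts an explicit AS when a bare identifier follows END and is itself followed by a comma or FROM. It is purely textual, so it assumes the pattern never occurs inside string literals or comments, and the class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

final class CaseAliasNormalizer {
    // END, whitespace, then an identifier that is not AS/CASE/FROM,
    // followed (after optional whitespace) by a comma or the FROM keyword.
    private static final Pattern BARE_ALIAS_AFTER_END = Pattern.compile(
        "(?i)\\b(END)(\\s+)(?!(?:AS|CASE|FROM)\\b)"
            + "([A-Za-z_][A-Za-z0-9_$#]*)(?=\\s*(?:,|\\bFROM\\b))");

    // Rewrites "END alias" to "END AS alias"; other text is left unchanged.
    static String addExplicitAs(String sql) {
        return BARE_ALIAS_AFTER_END.matcher(sql).replaceAll("$1$2AS $3");
    }

    public static void main(String[] args) {
        System.out.println(addExplicitAs("CASE WHEN x = 1 THEN 'a' ELSE 'b' END SFGG, y"));
    }
}
```

After this pass, both CASE expressions from the example take the column_alias path, so the Visitor and the grammar file stay untouched.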

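5.2 Stack management

I said earlier that I would maintain a table stack and a column stack, but while writing the code I found the stack handling had become quite messy, so I tidied up the relevant code. For example: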

Stack<Node> tableSrc = new Stack<>();
@Override
public String visitQuery_block(PlSqlParser.Query_blockContext ctx) {
    Node base = tableSrc.peek();
    int tableRefsOldSize = tableRefs.size();
    int tableSrcOldSize = tableSrc.size();

    // From -> Where? -> GroupBy? -> OrderBy? -> dstTable/union(out of select)
    if(ctx.order_by_clause() != null) { // OrderBy -> dstTable/union
        this.tableSrc.add(this.graph.addOrderBy(visitOrder_by_clause(ctx.order_by_clause())));
        tableSrcPopDst();
    }
    if(ctx.group_by_clause() != null) { // GroupBy -> OrderBy
        this.tableSrc.add(this.graph.addGroupBy(visitGroup_by_clause(ctx.group_by_clause())));
        tableSrcPopDst();
    }
    if (ctx.where_clause() != null) { // Where -> GroupBy
        this.tableSrc.add(this.graph.addWhere(visitWhere_clause(ctx.where_clause())));
        tableSrcPopDst();
    }
    // From -> Where
    Node dst = this.tableSrc.peek();
    visitFrom_clause(ctx.from_clause());
    while(this.tableSrc.size() > tableSrcOldSize)
        this.graph.addEdge(this.tableSrc.pop(),dst);
    // restore base
    this.tableSrc.pop();
    this.tableSrc.add(base);
    if(base.nodeType == NodeType.TABLE)
        this.curDstTableName = tableSrc.peek().name;

    if (ctx.selected_list() != null) {
        this.tableRefsSize.put(ctx, tableRefsOldSize);
        visitSelected_list(ctx.selected_list());
        this.tableRefsSize.remove(ctx);
    }

    while(tableRefs.size() > tableRefsOldSize) tableRefs.pop();
    return ctx.getText();
}
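
The recurring pattern in this method is "remember the stack depth on entry, pop back to it on exit". A minimal sketch of isolating that pattern in one helper follows. ScopedStack, mark, and popTo are hypothetical names, not part of the project, and a list-backed stack is assumed in place of java.util.Stack.

```java
import java.util.ArrayList;
import java.util.List;

final class ScopedStack<T> {
    private final List<T> items = new ArrayList<>();

    void push(T item) {
        items.add(item);
    }

    T peek() {
        if (items.isEmpty()) throw new IllegalStateException("stack is empty");
        return items.get(items.size() - 1);
    }

    int size() {
        return items.size();
    }

    // Records the current depth; call this on entering a visit method.
    int mark() {
        return items.size();
    }

    // Pops everything pushed since the given mark, most recent first.
    List<T> popTo(int mark) {
        if (mark < 0 || mark > items.size()) {
            throw new IllegalArgumentException("invalid mark: " + mark);
        }
        List<T> popped = new ArrayList<>();
        while (items.size() > mark) {
            popped.add(items.remove(items.size() - 1));
        }
        return popped;
    }
}
```

In a visit method, the value returned by mark() plays the role of tableSrcOldSize, and each element returned by popTo(mark) can be linked to the destination node, so the loop condition and the cleanup cannot drift apart.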

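5.3 Row-level lineage is still hard

Last time we cleared up one question: row-level relationships do not need dynamic tracing, i.e. there is no need to determine precisely, from the data currently in the tables, where a given row comes from.
For example: suppose the target table has several rows and we want to find where one column of one row comes from. That depends on the table and column relationships, which are already shown in the graph, so it can be looked up by hand. Automatically producing the origin of row data would unavoidably require a SQL execution engine.
My view: a SQL file only contains table-level and column-level relationships. If I try to derive row-level relationships, I unavoidably depend on the data in the underlying database. In other words, I would need to maintain row data.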
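
The manual lookup described above is a backward walk over the static table and column edges, and it needs no row data. A minimal sketch follows, assuming the graph is available as a map from each node to its direct sources. The string node names and the upstream method are hypothetical and used only for illustration.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class StaticLineageSketch {
    // Returns every node that feeds into target, directly or transitively,
    // given a map from each node to its direct sources.
    static Set<String> upstream(Map<String, List<String>> sourcesOf, String target) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> todo = new ArrayDeque<>();
        todo.add(target);
        while (!todo.isEmpty()) {
            String current = todo.poll();
            List<String> sources = sourcesOf.getOrDefault(current, Collections.emptyList());
            for (String source : sources) {
                // Set.add returns false for nodes already visited, which also guards against cycles.
                if (seen.add(source)) {
                    todo.add(source);
                }
            }
        }
        return seen;
    }
}
```

This answers "which columns can this column come from" for every row at once. Deciding which branch actually produced a particular row is the part that would need the data, and therefore an execution engine.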
