Skip to content

Direct Dataflow

This article introduces some SQL elements that will generate direct dataflow.

1. Select

1
2
3
SELECT a.empName "eName"
FROM scott.emp a
Where sal > 1000

the data of target column "eName" comes from scott.emp.empName so we have a direct dataflow like this:

1
scott.emp.empName -> direct -> RS-1."eName"

the resultset RS-1 generated by the select list is a relation, which includes columns and rows.

dataflow in XML

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<dlineage>
    <table id="2" schema="scott" name="scott.emp" alias="a" type="table" coordinate="[2,6,0],[2,17,0]">
        <column id="3" name="empName" coordinate="[1,8,0],[1,17,0]"/>
    </table>
    <resultset id="5" name="RS-1" type="select_list" coordinate="[1,8,0],[1,25,0]">
        <column id="6" name=""eName"" coordinate="[1,8,0],[1,25,0]"/>
    </resultset>
    <relation id="1" type="fdd" effectType="select">
        <target id="6" column=""eName"" parent_id="5" parent_name="RS-1" coordinate="[1,8,0],[1,25,0]"/>
        <source id="3" column="empName" parent_id="2" parent_name="scott.emp" coordinate="[1,8,0],[1,17,0]"/>
    </relation>
</dlineage>

The relation represents a dataflow from source column with id=3 to the target column with id=6

diagram

2. Function

During the dataflow analyzing, function plays a key role. It accepts columns as arguments and generate result which maybe a scalar value or a set value.

1
select round(salary) as sal from scott.emp

A direct dataflow is generated from column salary to the round function in the above SQL :

1
scott.emp.salary -> direct -> round(salary) -> direct -> sal

dataflow in xml

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<dlineage>
    <table id="2" schema="scott" name="scott.emp" type="table" coordinate="[1,34,0],[1,43,0]">
        <column id="3" name="salary" coordinate="[1,14,0],[1,20,0]"/>
    </table>
    <resultset id="5" name="RS-1" type="select_list" coordinate="[1,8,0],[1,28,0]">
        <column id="6" name="sal" coordinate="[1,8,0],[1,28,0]"/>
    </resultset>
    <resultset id="8" name="FUNCTION-1" type="function" coordinate="[1,8,0],[1,21,0]">
        <column id="9" name="round" coordinate="[1,8,0],[1,13,0]"/>
    </resultset>
    <relation id="1" type="fdd" effectType="select">
        <target id="6" column="sal" parent_id="5" parent_name="RS-1" coordinate="[1,8,0],[1,28,0]"/>
        <source id="9" column="round" parent_id="8" parent_name="FUNCTION-1" coordinate="[1,8,0],[1,13,0]"/>
    </relation>
    <relation id="2" type="fdd" effectType="function">
        <target id="9" column="round" parent_id="8" parent_name="FUNCTION-1" coordinate="[1,8,0],[1,13,0]"/>
        <source id="3" column="salary" parent_id="2" parent_name="scott.emp" coordinate="[1,14,0],[1,20,0]"/>
    </relation>
</dlineage>

diagram

if you turn off the show function setting with /if option, the result is:

3. References

  1. xml code used in this article is generated by DataFlowAnalyzer tools
  2. digram used in this article is generated by the Gudu SQLFlow Cloud version