I am working with PostgreSQL queries and want to extract the individual components of a SQL query, for example:

sql = " select d_year, s_nation, p_category, sum(lo_revenue - lo_supplycost) as profit from DATES, CUSTOMER, SUPPLIER, PART, LINEORDER where lo_custkey  =  c_custkey and lo_suppkey  =  s_suppkey and lo_partkey  =  p_partkey and lo_orderdate  =  d_datekey and c_region  =  'AFRICA' and s_region  =  'AFRICA' and (d_year  =  1996 or d_year  =  1997) and (p_mfgr  =  'MFGR#2' or p_mfgr  =  'MFGR#4') group by d_year, s_nation, p_category order by d_year, s_nation, p_category "

I want to get all the tables involved, all selection predicates, all join predicates, and the GROUP BY and ORDER BY parts.

I used sqlparse and found a way to get only the tables involved. Are there any examples of how to extract the rest of this information?

You can use ANTLR to parse SQL statements and extract the AST. Commented Nov 2, 2019 at 9:58

1 Answer

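This approach uses sqlparse. It walks the top-level tokens of the parsed statement, tracks which keyword (SELECT, FROM, WHERE, GROUP BY, ORDER BY) was seen last, and prints the elements that follow it.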

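import sqlparse
from sqlparse.sql import Comparison, Identifier, IdentifierList, Parenthesis, Where
from sqlparse.tokens import DML, Keyword

# `sql` is the query string from the question.
parsed = sqlparse.parse(sql)
stmt = parsed[0]
# Each flag records which clause the loop is currently inside.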
from_seen = False
select_seen = False
where_seen = False
groupby_seen = False
orderby_seen = False

for token in stmt.tokens:
    if select_seen:
        if isinstance(token, IdentifierList):
            for identifier in token.get_identifiers():
                print("{} {}\n".format("Attr = ", identifier))
        elif isinstance(token, Identifier):
            print("{} {}\n".format("Attr = ", token))
    if from_seen:
        if isinstance(token, IdentifierList):
            for identifier in token.get_identifiers():
                print("{} {}\n".format("TAB = ", identifier))
        elif isinstance(token, Identifier):
            print("{} {}\n".format("TAB = ", token))
    if orderby_seen:
        if isinstance(token, IdentifierList):
            for identifier in token.get_identifiers():
                print("{} {}\n".format("ORDERBY att = ", identifier))
        elif isinstance(token, Identifier):
            print("{} {}\n".format("ORDERBY att = ", token))
    if groupby_seen:
        if isinstance(token, IdentifierList):
            for identifier in token.get_identifiers():
                print("{} {}\n".format("GROUPBY att = ", identifier))
        elif isinstance(token, Identifier):
            print("{} {}\n".format("GROUPBY att = ", token))

    if isinstance(token, Where):
        select_seen = False
        from_seen = False
        where_seen = True
        groupby_seen = False
        orderby_seen = False
        for where_tokens in token:
            if isinstance(where_tokens, Comparison):
                print("{} {}\n".format("Comparison = ", where_tokens))
            elif isinstance(where_tokens, Parenthesis):
                print("{} {}\n".format("Parenthesis = ", where_tokens))
                # tables.append(token)
    if token.ttype is Keyword and token.value.upper() == "GROUP BY":
        select_seen = False
        from_seen = False
        where_seen = False
        groupby_seen = True
        orderby_seen = False
    if token.ttype is Keyword and token.value.upper() == "ORDER BY":
        select_seen = False
        from_seen = False
        where_seen = False
        groupby_seen = False
        orderby_seen = True
    if token.ttype is Keyword and token.value.upper() == "FROM":
        select_seen = False
        from_seen = True
        where_seen = False
        groupby_seen = False
        orderby_seen = False
    if token.ttype is DML and token.value.upper() == "SELECT":
        select_seen = True
        from_seen = False
        where_seen = False
        groupby_seen = False
        orderby_seen = False
3 Comments

`if isinstance(token, Where):` what is Where here?
@RakeshV Where is a class, used the same way as Identifier and IdentifierList in this program: isinstance(token, Where) checks whether the token is a WHERE-clause group. In an IDE it is highlighted like any other class name.
@RakeshV Where comes from sqlparse.sql, i.e. sqlparse.sql.Where.
