I am working on a SQL code parser in Python for further code analysis. I decided to use the "sqlparse" library, and after parsing the text I would like to know the type of each token, such as "Identifier", "Keyword", and so on.

After running:

 import sqlparse

 raw = 'select * from foo join fuu on foo.id = fuu.id where id in (select id from bar);'
 statements = sqlparse.parse(raw)
 print(statements[0].tokens)

I can see:

[<DML 'select' at 0x298851E3DC0>, <Whitespace ' ' at 0x2988523C400>, <Wildcard '*' at 0x2988523C100>, <Whitespace ' ' at 0x2988523C160>, <Keyword 'from' at 0x2988523C1C0>, <Whitespace ' ' at 0x2988523CB20>,

The output shows whether each token is "Whitespace" or some other type, but only when the whole list is printed. How can I find out in code whether a token is Whitespace, Keyword, or something else?

I tried several functions from the library but still cannot get this to work. (Documentation of the library: https://buildmedia.readthedocs.org/media/pdf/sqlparse/latest/sqlparse.pdf)

There is an attribute called ttype, but I could not get it to work. Each token is of type <class 'sqlparse.sql.Token'>.

Any suggestions?

1 Answer

I'm still new to sqlparse myself, but here is my solution. If you know how to improve it, please leave a comment!

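import sqlparse


def get_token_type(sql_query: str = None, tokens: list = None, type_list: list = None) -> list:
    # Parse the query on the first call; recursive calls pass the child tokens directly.
    statements = sqlparse.parse(sql_query.strip())[0].tokens if tokens is None else tokens
    # Compare against None so an empty list passed down the recursion is reused, not replaced.
    type_list = [] if type_list is None else type_list
    for token in statements:
        if token.ttype is not None:
            # Leaf token: ttype is a value such as Token.Keyword or Token.Text.Whitespace.
            type_list.append(token.ttype)
        else:
            # Grouped token (Identifier, Where, Parenthesis, ...): ttype is None, so recurse.
            get_token_type(tokens=token.tokens, type_list=type_list)
    return type_list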