I know that a prepared statement lets us set column values. What I want is different: I already have a list of queries written against the same table, but with different column values. E.g.

select * from tableName as t1 where t1.tableColumnId=4 and t1.tableColumnName='test' inner join tableName2 as t2 on t1.tableColumnId=t2.tableColumnId

select * from tableName as t1 where t1.tableColumnId=6 and t1.tableColumnName='test' inner join tableName2 as t2 on t1.tableColumnId=t2.tableColumnId

As you can see, both queries are almost the same except for the tableColumnId value. I want to save them in a collection as

select * from tableName as t1 where t1.tableColumnId=? and t1.tableColumnName='test' inner join tableName2 as t2 on t1.tableColumnId=t2.tableColumnId

That way I won't have duplicate queries (where the values are not considered).

How can I do this?

  • use preparedStatement method Commented Jul 25, 2017 at 13:53
  • By using prepared statements we can pass the values, but here I want to replace the values that already exist in the query with a placeholder character and store the result in a collection. Please tell me how, if this can be achieved using preparedStatement. Commented Jul 25, 2017 at 14:32

2 Answers

Three approaches come to mind:

  1. Define a subset of the SQL grammar that is sufficient to parse your queries, then write a parser for that grammar.
  2. Compare the queries textually and find the parts that are identical and those that differ.
  3. Locate the literal values like 4, 6, 'test' in your queries, build a (flat) syntax tree for each query, and compare the trees to one another to identify the literal values that may differ from one query to another.
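The second approach (plain textual comparison) can be sketched as a common-prefix/common-suffix split. This is only an illustration: the class name QueryDiff and the method split are hypothetical, and the sketch assumes the two queries differ in a single contiguous region.

```java
final class QueryDiff {
    /**
     * Splits two strings into
     * {common prefix, middle of a, middle of b, common suffix}.
     * Purely character-based: it knows nothing about SQL tokens.
     */
    static String[] split(String a, String b) {
        int max = Math.min(a.length(), b.length());
        int prefix = 0;
        while (prefix < max && a.charAt(prefix) == b.charAt(prefix)) {
            prefix++;
        }
        int suffix = 0;
        // Do not let the suffix overlap the prefix in the shorter string.
        while (suffix < max - prefix
                && a.charAt(a.length() - 1 - suffix) == b.charAt(b.length() - 1 - suffix)) {
            suffix++;
        }
        return new String[] {
            a.substring(0, prefix),
            a.substring(prefix, a.length() - suffix),
            b.substring(prefix, b.length() - suffix),
            a.substring(a.length() - suffix)
        };
    }
}
```

Because the comparison works on characters rather than tokens, it can cut a literal in half: for id=14 and id=16 the differing parts come out as 4 and 6, with the 1 absorbed into the common prefix. Locating whole literals first avoids that.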

UPDATE

To parse the SQL, you could use a parser generator like ANTLR or JavaCC. ANTLR and JavaCC grammars exist for SQL, and you could start with one of them.

That said, I think this approach would be overkill in this instance; I would rather use the third.

UPDATE 2: (third method)

To locate literal strings and numbers, you can use a regexp:

// Group 1: the character before the literal; group 2: a number or a quoted string.
private static final Pattern CONST_PATTERN
        = Pattern.compile("([^0-9a-zA-Z])((?:[0-9]+(?:\\.[0-9]*)?|[0-9]*\\.[0-9]+)"
                + "(?:[Ee][+-]?[0-9]+)?"
                + "|(?:\\'[^']*\\')+)", Pattern.CASE_INSENSITIVE);
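To see what such a pattern picks up, here is a small sketch that lists the literals found in a query. The class LiteralFinder and the method literals are hypothetical names. The pattern is a variant of the one above, and it assumes the exponent part is meant to read [Ee], an optional sign, then digits.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LiteralFinder {
    // Group 1: the non-alphanumeric character just before the literal,
    //          which keeps digits inside identifiers (t1, tableName2) out.
    // Group 2: a numeric literal or a single-quoted string ('' is an
    //          escaped quote, hence the repetition).
    static final Pattern LITERAL = Pattern.compile(
            "([^0-9a-zA-Z])((?:[0-9]+(?:\\.[0-9]*)?|[0-9]*\\.[0-9]+)"
            + "(?:[Ee][+-]?[0-9]+)?"
            + "|(?:'[^']*')+)");

    /** Returns the literals of the query in order of appearance. */
    static List<String> literals(String sql) {
        List<String> found = new ArrayList<>();
        Matcher m = LITERAL.matcher(sql);
        while (m.find()) {
            found.add(m.group(2));
        }
        return found;
    }
}
```

Note that a literal at the very first character of the string is not matched, since group 1 needs a preceding character. That should not matter for queries that start with a keyword.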

You can parse a query while generating the following structure:

private static class ParameterizedQuery {
    final String sql;
    final Parameter[] params;

    ParameterizedQuery(String sql, Parameter[] params) {
        this.sql = sql;
        this.params = params.clone();
    }
}

private static class Parameter {
    final int position;
    final String value;

    Parameter(int position, String value) {
        this.position = position;
        this.value = value;
    }
}

The resulting sql query is the input query with all the literals replaced with question marks. The parsing is done as follows:

private static ParameterizedQuery parse(String query) {
    List<Parameter> parms = new ArrayList<>();
    Matcher matcher = CONST_PATTERN.matcher(query);
    int start = 0;
    StringBuilder buf = new StringBuilder();
    while (matcher.find()) {
        int pos = matcher.start();
        buf.append(query, start, pos)
                .append(matcher.group(1))
                .append("?");
        parms.add(new Parameter(buf.length()-1,matcher.group(2)));
        start = matcher.end();
    }
    buf.append(query, start, query.length());
    return new ParameterizedQuery(
            buf.toString(), parms.toArray(new Parameter[parms.size()]));
}

Now, if you have a list of queries and want to keep as parameters only the literals that differ between them, parse all the queries into an array of ParameterizedQuery, and then simplify that array:

private static ParameterizedQuery[] simplify(ParameterizedQuery[] queries) {
    if (queries.length == 0) {
        return queries;
    }
    ParameterizedQuery prev = null;
    boolean[] diff = null;
    for (ParameterizedQuery cur: queries) {
        if (prev == null) {
            diff = new boolean[cur.params.length];
        } else {
            if (!cur.sql.equals(prev.sql)) {
                throw new RuntimeException(
                        "Queries are too different: [" + prev.sql
                        + "] and [" + cur.sql + "]");
            } else if (cur.params.length != prev.params.length) {
                throw new RuntimeException(
                        "Different number of parameters: ["
                        + prev.params.length
                        + "] and [" + cur.params.length + "]");
            }
            for (int i = 0; i < diff.length; ++i) {
                if (!cur.params[i].value.equals(prev.params[i].value)) {
                    diff[i] = true;
                }
            }
        }
        prev = cur;
    }
    if (and(diff)) {
        return queries;
    }
    ParameterizedQuery[] result = new ParameterizedQuery[queries.length];
    result[0] = expandQuery(queries[0].sql, queries[0].params, diff);
    for (int i = 1; i < queries.length; ++i) {
        result[i] = new ParameterizedQuery(result[0].sql,
                keep(queries[i].params, result[0].params, diff));
    }
    return result;
}

private static boolean and(boolean[] arr) {
    for (boolean b: arr) {
        if (!b) {
            return false;
        }
    }
    return true;
}

private static ParameterizedQuery expandQuery(String query,
        Parameter[] params, boolean[] diff) {
    int count = 0;
    for (boolean b: diff) {
        if (b) {
            ++count;
        }
    }
    Parameter[] result = new Parameter[count];
    int r = 0;
    int start = 0;
    StringBuilder buf = new StringBuilder();
    for (int i = 0; i < diff.length; ++i) {
        Parameter parm = params[i];
        if (!diff[i]) {
            // expand param
            buf.append(query, start, parm.position);
            buf.append(parm.value);
            start = parm.position+1;
        } else {
            buf.append(query, start, parm.position);
            result[r++] = new Parameter(buf.length(), parm.value);
            start = parm.position;
        }
    }
    buf.append(query, start, query.length());
    return new ParameterizedQuery(buf.toString(), result);
}

private static Parameter[] keep(Parameter[] params, Parameter[] ref,
        boolean[] diff) {
    Parameter[] result = new Parameter[ref.length];
    int j = 0;
    for (int i = 0; i < params.length; ++i) {
        if (diff[i]) {
            result[j] = new Parameter(ref[j].position, params[i].value);
            ++j;
        }
    }
    return result;
}

Here's the program that resolves your example:

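import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {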
    private static final String[] QUERIES = {
        "select * from tableName as t1 where t1.tableColumnId=4 and t1.tableColumnName='test' inner join tableName2 as t2 on t1.tableColumnId=t2.tableColumnId",
        "select * from tableName as t1 where t1.tableColumnId=6 and t1.tableColumnName='test' inner join tableName2 as t2 on t1.tableColumnId=t2.tableColumnId",
    };
    // Group 1: the character before the literal; group 2: a number or a quoted string.
    private static final Pattern CONST_PATTERN
            = Pattern.compile("([^0-9a-zA-Z])((?:[0-9]+(?:\\.[0-9]*)?|[0-9]*\\.[0-9]+)"
                    + "(?:[Ee][+-]?[0-9]+)?"
                    + "|(?:\\'[^']*\\')+)", Pattern.CASE_INSENSITIVE);

    private static class ParameterizedQuery {
        final String sql;
        final Parameter[] params;

        ParameterizedQuery(String sql, Parameter[] params) {
            this.sql = sql;
            this.params = params.clone();
        }
    }

    private static class Parameter {
        final int position;
        final String value;

        Parameter(int position, String value) {
            this.position = position;
            this.value = value;
        }
    }

    public static void main(String[] args) {
        ParameterizedQuery[] queries = new ParameterizedQuery[QUERIES.length];
        for (int i = 0; i < QUERIES.length; ++i) {
            queries[i] = parse(QUERIES[i]);
        }
        for (ParameterizedQuery cur: queries) {
            System.out.println(cur.sql);
            int i = 0;
            for (Parameter parm: cur.params) {
                System.out.println("    " + (++i) + ": " + parm.value);
            }
        }
        queries = simplify(queries);
        for (ParameterizedQuery cur: queries) {
            System.out.println(cur.sql);
            int i = 0;
            for (Parameter parm: cur.params) {
                System.out.println("    " + (++i) + ": " + parm.value);
            }
        }
    }

    private static ParameterizedQuery parse(String query) {
        List<Parameter> parms = new ArrayList<>();
        Matcher matcher = CONST_PATTERN.matcher(query);
        int start = 0;
        StringBuilder buf = new StringBuilder();
        while (matcher.find()) {
            int pos = matcher.start();
            buf.append(query, start, pos)
                    .append(matcher.group(1))
                    .append("?");
            parms.add(new Parameter(buf.length()-1,matcher.group(2)));
            start = matcher.end();
        }
        buf.append(query, start, query.length());
        return new ParameterizedQuery(
                buf.toString(), parms.toArray(new Parameter[parms.size()]));
    }

    private static ParameterizedQuery[] simplify(ParameterizedQuery[] queries) {
        if (queries.length == 0) {
            return queries;
        }
        ParameterizedQuery prev = null;
        boolean[] diff = null;
        for (ParameterizedQuery cur: queries) {
            if (prev == null) {
                diff = new boolean[cur.params.length];
            } else {
                if (!cur.sql.equals(prev.sql)) {
                    throw new RuntimeException(
                            "Queries are too different: [" + prev.sql
                            + "] and [" + cur.sql + "]");
                } else if (cur.params.length != prev.params.length) {
                    throw new RuntimeException(
                            "Different number of parameters: ["
                            + prev.params.length
                            + "] and [" + cur.params.length + "]");
                }
                for (int i = 0; i < diff.length; ++i) {
                    if (!cur.params[i].value.equals(prev.params[i].value)) {
                        diff[i] = true;
                    }
                }
            }
            prev = cur;
        }
        if (and(diff)) {
            return queries;
        }
        ParameterizedQuery[] result = new ParameterizedQuery[queries.length];
        result[0] = expandQuery(queries[0].sql, queries[0].params, diff);
        for (int i = 1; i < queries.length; ++i) {
            result[i] = new ParameterizedQuery(result[0].sql,
                    keep(queries[i].params, result[0].params, diff));
        }
        return result;
    }

    private static boolean and(boolean[] arr) {
        for (boolean b: arr) {
            if (!b) {
                return false;
            }
        }
        return true;
    }

    private static ParameterizedQuery expandQuery(String query,
            Parameter[] params, boolean[] diff) {
        int count = 0;
        for (boolean b: diff) {
            if (b) {
                ++count;
            }
        }
        Parameter[] result = new Parameter[count];
        int r = 0;
        int start = 0;
        StringBuilder buf = new StringBuilder();
        for (int i = 0; i < diff.length; ++i) {
            Parameter parm = params[i];
            if (!diff[i]) {
                // expand param
                buf.append(query, start, parm.position);
                buf.append(parm.value);
                start = parm.position+1;
            } else {
                buf.append(query, start, parm.position);
                result[r++] = new Parameter(buf.length(), parm.value);
                start = parm.position;
            }
        }
        buf.append(query, start, query.length());
        return new ParameterizedQuery(buf.toString(), result);
    }

    private static Parameter[] keep(Parameter[] params, Parameter[] ref,
            boolean[] diff) {
        Parameter[] result = new Parameter[ref.length];
        int j = 0;
        for (int i = 0; i < params.length; ++i) {
            if (diff[i]) {
                result[j] = new Parameter(ref[j].position, params[i].value);
                ++j;
            }
        }
        return result;
    }
}

The output is:

select * from tableName as t1 where t1.tableColumnId=? and t1.tableColumnName=? inner join tableName2 as t2 on t1.tableColumnId=t2.tableColumnId
    1: 4
    2: 'test'
select * from tableName as t1 where t1.tableColumnId=? and t1.tableColumnName=? inner join tableName2 as t2 on t1.tableColumnId=t2.tableColumnId
    1: 6
    2: 'test'
select * from tableName as t1 where t1.tableColumnId=? and t1.tableColumnName='test' inner join tableName2 as t2 on t1.tableColumnId=t2.tableColumnId
    1: 4
select * from tableName as t1 where t1.tableColumnId=? and t1.tableColumnName='test' inner join tableName2 as t2 on t1.tableColumnId=t2.tableColumnId
    1: 6
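
To store the result in a collection without duplicates, as the question asks, one option is a map keyed by the parameterized SQL. The sketch below is self-contained and uses hypothetical names (QueryTemplates, group). Unlike simplify, it replaces every literal with a question mark, including those that are the same in all queries, and it assumes the same literal pattern with an optional exponent sign.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class QueryTemplates {
    // Group 2 is the literal; group 1 is the character before it.
    static final Pattern LITERAL = Pattern.compile(
            "([^0-9a-zA-Z])((?:[0-9]+(?:\\.[0-9]*)?|[0-9]*\\.[0-9]+)"
            + "(?:[Ee][+-]?[0-9]+)?"
            + "|(?:'[^']*')+)");

    /**
     * Maps each template (every literal replaced by ?) to the lists of
     * literal values of the queries that produced it, in input order.
     */
    static Map<String, List<List<String>>> group(List<String> queries) {
        Map<String, List<List<String>>> byTemplate = new LinkedHashMap<>();
        for (String q : queries) {
            Matcher m = LITERAL.matcher(q);
            StringBuilder template = new StringBuilder();
            List<String> values = new ArrayList<>();
            int start = 0;
            while (m.find()) {
                // Copy everything up to the literal, then the placeholder.
                template.append(q, start, m.start(2)).append('?');
                values.add(m.group(2));
                start = m.end();
            }
            template.append(q, start, q.length());
            byTemplate.computeIfAbsent(template.toString(), k -> new ArrayList<>())
                    .add(values);
        }
        return byTemplate;
    }
}
```

With this shape, each distinct template is stored once, and the value lists can later be bound to a PreparedStatement created from the template.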

1 Comment

Thanks for the suggestion. Can you please tell me how to define the grammar in Java?

jOOQ has a SQL parser / translator that can do these kinds of things. You can access it over the jOOQ website, for example: https://www.jooq.org/translate.

  1. Fix your SQL query as it uses a wrong syntactic order of WHERE and INNER JOIN
  2. Specify "Bind Variables" to be "Force Indexed", which parses literals and turns them into bind variables (all of them, not just specific ones)
  3. Paste your query in the input textarea

You'll get:

Input

select * 
from tableName as t1 
inner join tableName2 as t2 
on t1.tableColumnId=t2.tableColumnId 
where t1.tableColumnId=4 and t1.tableColumnName='test'

Output

select *
from tableName as t1
  join tableName2 as t2
    on t1.tableColumnId = t2.tableColumnId
where (
  t1.tableColumnId = ?
  and t1.tableColumnName = ?
)

You can also use this API programmatically using:

import org.jooq.*;
import org.jooq.conf.*;
import org.jooq.impl.*;

public class Test {
    public static void main(String[] args) throws Exception {
        DSLContext ctx = DSL.using(
            new DefaultConfiguration().set(
                new Settings().withParamType(ParamType.FORCE_INDEXED)));

        Parser parser = ctx.parser();
        Query query = parser.parseQuery(
            "select * "
          + "from tableName as t1 "
          + "inner join tableName2 as t2 "
          + "on t1.tableColumnId=t2.tableColumnId "
          + "where t1.tableColumnId=4 and t1.tableColumnName='test'");

        System.out.println(ctx
            .renderContext()
            .paramType(ParamType.FORCE_INDEXED)
            .visit(query)
            .render());
    }
}

Disclaimer: I work for the vendor.

2 Comments

Can the JOOQ parser be used for free in a commercial software?
@humbleCoder: Yes, it has a lot of support for free features, see the manual: jooq.org/doc/latest/manual/sql-building/sql-parser. Some features (e.g. to extend the grammar with custom parser logic) are commercial only, but most is available in the jOOQ Open Source Edition
