I have the following dataframe:
dataframe1
+-----------------------+
|ID |
+-----------------------+
|[10,80,60,] |
|[20,40,] |
+-----------------------+
And another dataframe:
dataframe2
+------------------+----------------+
|ID_2 | name |
+------------------+----------------+
|40 | XYZZ |
|200 | vbb |
+------------------+----------------+
I want the following output:
+------------------+----------------+
|ID_2 | name |
+------------------+----------------+
|40 | XYZZ |
+------------------+----------------+
I'm using the following code to select the rows of the second dataframe whose ID_2 appears in ID.
for (java.util.Iterator<Row> iter = dataframe1.toLocalIterator(); iter.hasNext();) {
    String item = (iter.next()).get(0).toString();
    dataframe2.registerTempTable("data2");
    Dataset<Row> res = sparkSession.sql("select * from data2 where ID_2 IN ("+item+")");
    res.show();
}
But I get the following exception:
Exception in thread "main" org.apache.spark.sql.catalyst.parser.ParseException:
mismatched input 'from' expecting <EOF>(line 1, pos 9)
== SQL ==
select * from data2 where ID_2 IN ([10,80,60,])
---------^^^
at org.apache.spark.sql.catalyst.parser.ParseException.withCommand(ParseDriver.scala:241)
at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parse(ParseDriver.scala:117)
at org.apache.spark.sql.execution.SparkSqlParser.parse(SparkSqlParser.scala:48)
at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parsePlan(ParseDriver.scala:69)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)
at factory.Geofencing_Alert.check(Geofencing_Alert.java:84)
at factory.Geofencing_Alert.main(Geofencing_Alert.java:158)
How can I fix this?
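Instead of ...get(0).toString(), try ...get(0).mkString(","). Calling toString() on the array column keeps the surrounding brackets, which is why the generated SQL reads IN ([10,80,60,]) and fails to parse; mkString(",") joins the elements without the brackets. Note that from Java, get(0) returns Object, so you will probably need a typed accessor such as getSeq(0) before calling mkString.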
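The idea behind that suggestion can be sketched in plain Java, without Spark. This is only an illustration under a few assumptions: the array elements have already been read from the row into a java.util.List<String> (for example with Row.getList(0)), and the trailing comma shown in [10,80,60,] comes from an empty last element, which is dropped here. The names InClause, joinIds and buildQuery are hypothetical helpers, not part of Spark.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

// Hypothetical helper (not a Spark API): builds the IN (...) list from array elements.
final class InClause {
    private InClause() {}

    // Joins the non-empty ids with commas, e.g. ["10","80","60",""] -> "10,80,60".
    static String joinIds(List<String> ids) {
        return ids.stream()
                .filter(Objects::nonNull)      // skip null elements
                .map(String::trim)
                .filter(s -> !s.isEmpty())     // skip the assumed empty trailing element
                .collect(Collectors.joining(","));
    }

    // Builds the query text; table and column names are passed in by the caller.
    static String buildQuery(String table, String column, List<String> ids) {
        String joined = joinIds(ids);
        if (joined.isEmpty()) {
            // Avoid emitting an empty "IN ()" list; match no rows instead.
            return "select * from " + table + " where 1 = 0";
        }
        return "select * from " + table + " where " + column + " IN (" + joined + ")";
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("10", "80", "60", "");
        System.out.println(buildQuery("data2", "ID_2", ids));
        // prints: select * from data2 where ID_2 IN (10,80,60)
    }
}
```

In the loop from the question, the resulting string would be passed to sparkSession.sql(...) in place of the concatenation with item. Because the ids are spliced directly into the SQL text, this only makes sense when they are known to be plain numbers.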