747 questions
0
votes
1
answer
49
views
How to emit keyed records for a compacted topic (SimpleStringSchema ClassCastException)?
I'm upgrading a PyFlink job to 2.0 and want to write to a compacted Kafka topic using the new KafkaSink. The stream produces (key, value) tuples (the key is a string, the value is a JSON payload). I configure ...
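A minimal PyFlink sketch of the kind of sink setup this question describes (the topic name and brokers are placeholders, not taken from the question). Note that both the key and the value serialization schemas are applied to the whole stream element, which is one common way a plain SimpleStringSchema on a (key, value) tuple ends up with a ClassCastException:

from pyflink.common.serialization import SimpleStringSchema
from pyflink.datastream.connectors.kafka import (
    KafkaRecordSerializationSchema,
    KafkaSink,
)

# Both schemas below receive the entire stream element; keyed records
# therefore need elements the schemas can actually serialize (e.g. plain
# strings, or a custom serialization schema that extracts the key).
record_serializer = (
    KafkaRecordSerializationSchema.builder()
    .set_topic("compacted-topic")                        # placeholder topic
    .set_key_serialization_schema(SimpleStringSchema())
    .set_value_serialization_schema(SimpleStringSchema())
    .build()
)

sink = (
    KafkaSink.builder()
    .set_bootstrap_servers("localhost:9092")             # placeholder brokers
    .set_record_serializer(record_serializer)
    .build()
)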
0
votes
1
answer
42
views
High CPU usage from RowData serialization in Flink Table API despite ObjectReuse optimization
I have a Table API pipeline that does a 1-minute tumbling count aggregation over a set of 15 columns. A flame graph shows that most of the CPU (~40%) goes into serializing each row, despite using ...
0
votes
1
answer
59
views
Flink SQL with mini-batch seems to trigger only on checkpoint
I have the following config set for my job:
'table.exec.sink.upsert-materialize': 'NONE',
'table.exec.mini-batch.enabled': true,
'table.exec.mini-batch.allow-latency'...
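For reference, a minimal sketch of how these options are typically applied through the Table API configuration; the values below are placeholders, not the asker's:

from pyflink.table import EnvironmentSettings, TableEnvironment

t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())
config = t_env.get_config()

# Mini-batch buffers input rows and flushes them when the allowed latency
# elapses or when the batch size is reached.
config.set("table.exec.mini-batch.enabled", "true")
config.set("table.exec.mini-batch.allow-latency", "5 s")   # placeholder value
config.set("table.exec.mini-batch.size", "1000")           # placeholder value
config.set("table.exec.sink.upsert-materialize", "NONE")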
0
votes
0
answers
62
views
Flink SQL Job: com.starrocks.data.load.stream.exception.StreamLoadFailException: Could not get load state because
I'm encountering a Flink job failure and would appreciate any input on what might be misconfigured:
2025-07-28 17:30:52
org.apache.flink.runtime.JobException: Recovery is suppressed by ...
2
votes
1
answer
57
views
Files stuck as .inprogress, not rolling into final Parquet files
I'm running a Flink streaming job using the Table API, which reads from Kafka and writes to S3 (for now I'm using a local path to simulate S3). The job uses a filesystem connector to write data in ...
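A minimal, self-contained sketch of the usual prerequisite in this setup: with bulk formats such as Parquet, in-progress part files are only finalized on checkpoints, so checkpointing has to be enabled (the interval, path, and schema below are assumptions):

from pyflink.datastream import StreamExecutionEnvironment
from pyflink.table import StreamTableEnvironment

env = StreamExecutionEnvironment.get_execution_environment()
env.enable_checkpointing(60_000)  # assumed 1-minute checkpoint interval
t_env = StreamTableEnvironment.create(env)

# Bulk formats roll on checkpoint; without completed checkpoints the part
# files stay as .inprogress.
t_env.execute_sql("""
    CREATE TABLE parquet_sink (
        id STRING,
        ts TIMESTAMP(3)
    ) WITH (
        'connector' = 'filesystem',
        'path' = 'file:///tmp/out',   -- assumed local path standing in for S3
        'format' = 'parquet'
    )
""")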
0
votes
0
answers
89
views
PyFlink Python UDF Fails in Remote Cluster from Jupyter Notebook – Connection Refused from Python Harness
I'm trying to develop a PyFlink streaming job using a Python UDF. When executing from Jupyter Notebook (local) with execution.target = remote, the job fails at the Python environment initialization ...
0
votes
1
answer
65
views
Flink SQL - window aggregation
I'm doing Flink SQL stream processing, trying to do some windowed aggregation ... but the job stops emitting new aggregations after some time (or after it catches up with all the ...
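Without asserting this is the cause here, one setting that is often checked when event-time windows stop firing is the source idle timeout, which keeps watermarks advancing when some Kafka partitions temporarily receive no data (the 30 s value is an assumption):

from pyflink.table import EnvironmentSettings, TableEnvironment

t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())
# Mark source splits as idle after 30 s of silence so the overall
# watermark can still advance and windows can close.
t_env.get_config().set("table.exec.source.idle-timeout", "30 s")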
0
votes
1
answer
39
views
Unable to run Flink Table API locally in version 1.19.2
I am trying to set up a local Flink job that uses the Table API to fetch data from a Kafka source and print it. Below is a snippet of the Flink job:
public static void main(String[] args) {
// ...
0
votes
1
answer
53
views
Flink SQL resulting in chained joins, which is causing state to bloat
I have this SQL:
select `from all tables`
FROM table1 i
LEFT JOIN table2 ip
ON i.tenantId = ip.tenantId AND i.id = ip.id
LEFT JOIN table2 t
ON i.tenantId = t.tenantId AND i.id = t.id
LEFT ...
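A common knob when multi-way streaming joins accumulate unbounded state is a state TTL; a minimal sketch, with the 12 h retention being an assumption (expired state means late updates for those keys are no longer joined correctly):

from pyflink.table import EnvironmentSettings, TableEnvironment

t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())
# Keep join state for at most 12 hours per key; a trade-off between
# state size and correctness for records arriving after the TTL.
t_env.get_config().set("table.exec.state.ttl", "12 h")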
0
votes
0
answers
73
views
Apache Iceberg table partitioning based on ID
Can I partition an Iceberg table on an ID that ranges into the millions, or is bucketing the better option?
I am pushing 40-50 million records from SQL, which has an ID identity column, using PyFlink. And then I want to ...
0
votes
1
answer
30
views
Always missing the latest record when using COUNT over partition
For this Flink Kafka SQL source definition:
CREATE TEMPORARY TABLE person
(
payload STRING,
`headers` MAP<STRING, BYTES> METADATA,
`record_time` TIMESTAMP_LTZ(3) METADATA FROM 'timestamp',
...
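A self-contained sketch of an event-time COUNT(*) OVER (PARTITION BY ...), using a datagen source as a stand-in for the Kafka table in the question (the column set and the watermark definition are assumptions). With event-time ordering, a row is only emitted once a later watermark passes its timestamp, which is one way the newest record can appear to be missing:

from pyflink.table import EnvironmentSettings, TableEnvironment

t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

t_env.execute_sql("""
    CREATE TEMPORARY TABLE person (
        payload STRING,
        record_time TIMESTAMP_LTZ(3),
        WATERMARK FOR record_time AS record_time - INTERVAL '5' SECOND
    ) WITH (
        'connector' = 'datagen',       -- stand-in for the Kafka source
        'rows-per-second' = '1'
    )
""")

t_env.execute_sql("""
    SELECT
        payload,
        COUNT(*) OVER (
            PARTITION BY payload
            ORDER BY record_time
            RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
        ) AS cnt
    FROM person
""").print()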
0
votes
0
answers
33
views
Flink SQL TaskExecutor Error: No Allocated Slots Despite Slot and Memory Configurations
I’ve been trying to create a table using the sqlserver-cdc connector in Flink with the following query:
CREATE TABLE files (
Id INT,
FileName STRING,
FileContent STRING,
CreatedAt TIMESTAMP(3),...
1
vote
0
answers
82
views
Getting an exception after submitting a PyFlink job
I am new to PyFlink. I am trying to submit a simple Python job in YARN application mode, but I get an error saying it cannot find the Python file word_count.py. Below are my environment and the exception log. ...
0
votes
1
answer
68
views
Unable to start a PyFlink job from a savepoint
I'm using Flink 1.20.0, and I'm trying to submit a PyFlink job and start it from an existing savepoint. I execute on the command line:
flink run --fromSavepoint s3a://.../1a4e1e73910e5d953183b8eb1cd6eb84/chk-1 -...
1
vote
1
answer
44
views
How do sqlExecute queries run in Apache Flink when triggered via a ProcessFunction? How are the SQL tasks managed?
Context:
So, I am trying to build a Flink application that runs rules dynamically. I have a rule stream to which SQL rules are written, which Flink reads from and executes. I have connected the ...
0
votes
2
answers
145
views
Flink SQL repeatedly parsing JSON problem
I have a Flink SQL query like this:
select
json_value(json_str, '$.key1') as key1,
json_value(json_str, '$.key2') as key2,
json_value(json_str, '$.key3') as key3,
json_value(json_str, '$....
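One hedged alternative to repeated JSON_VALUE calls (each of which re-parses the string): a Python scalar UDF that parses the payload once and returns a ROW with the needed keys. The function name and key names are assumptions for illustration.

import json

from pyflink.common import Row
from pyflink.table import DataTypes
from pyflink.table.udf import udf


@udf(result_type=DataTypes.ROW([
    DataTypes.FIELD("key1", DataTypes.STRING()),
    DataTypes.FIELD("key2", DataTypes.STRING()),
    DataTypes.FIELD("key3", DataTypes.STRING()),
]))
def parse_once(json_str: str) -> Row:
    # Parse the JSON document a single time and project the fields.
    d = json.loads(json_str)
    return Row(d.get("key1"), d.get("key2"), d.get("key3"))

After registering it with t_env.create_temporary_function("parse_once", parse_once), the returned row can be computed once in a subquery and its fields selected from there.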
0
votes
0
answers
23
views
Why does Flink throw parser error when adding additional elements to JSON_OBJECT?
I am trying to create a VIEW on a table:
CREATE TEMPORARY VIEW `transform_3` AS
SELECT
`first_name` AS `first_name`,
`id` AS `id`,
`event_time` AS `...
0
votes
0
answers
81
views
Flink and LAG function
I have created a table that reads from a Kafka topic. What I want is to sort by eventTime and add a new field that represents the previous value using the LAG function.
The problem comes when two ...
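A self-contained sketch of the LAG-over-event-time pattern described above, with a datagen stand-in for the Kafka table (all names are assumptions); in Flink SQL the OVER window must be ordered by a time attribute:

from pyflink.table import EnvironmentSettings, TableEnvironment

t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

t_env.execute_sql("""
    CREATE TEMPORARY TABLE readings (
        sensor_id STRING,
        reading DOUBLE,
        eventTime TIMESTAMP(3),
        WATERMARK FOR eventTime AS eventTime - INTERVAL '5' SECOND
    ) WITH ('connector' = 'datagen', 'rows-per-second' = '1')
""")

t_env.execute_sql("""
    SELECT
        sensor_id,
        reading,
        LAG(reading) OVER (PARTITION BY sensor_id ORDER BY eventTime) AS prev_reading
    FROM readings
""").print()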
1
vote
1
answer
48
views
How to read state generated by Flink SQL code
I have this table:
/** mode('streaming')*/
CREATE OR REPLACE TABLE eoj_table (
`tenantId` string,
`id` string,
`name` string,
`headers` MAP<STRING, BYTES> METADATA ,
`hard_deleted` ...
0
votes
1
answer
78
views
Time Attribute Type for a TUMBLE with Apache Flink
I am getting the following exception in Flink:
The window function requires the timecol is a time attribute type, but is TIMESTAMP(3)
A little research on the internet tells me this problem is caused ...
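For reference, a self-contained sketch of the usual shape of a time attribute: the TUMBLE descriptor column needs a watermark (or must be a processing-time column), otherwise it is just a plain TIMESTAMP(3). All names here are assumptions:

from pyflink.table import EnvironmentSettings, TableEnvironment

t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

t_env.execute_sql("""
    CREATE TEMPORARY TABLE events (
        id STRING,
        event_time TIMESTAMP(3),
        -- the watermark turns event_time into an event-time attribute
        WATERMARK FOR event_time AS event_time - INTERVAL '5' SECOND
    ) WITH ('connector' = 'datagen', 'rows-per-second' = '1')
""")

t_env.execute_sql("""
    SELECT window_start, window_end, COUNT(*) AS cnt
    FROM TABLE(
        TUMBLE(TABLE events, DESCRIPTOR(event_time), INTERVAL '1' MINUTE))
    GROUP BY window_start, window_end
""").print()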
0
votes
0
answers
52
views
Apache Flink SQL API Temporary Table connector type
I am experimenting a little with the Flink SQL API; basically what I am trying to do is working, but I am stuck at one point.
I am reading some Kafka topics, and from the data I receive I have to do some ...
-1
votes
1
answer
42
views
Is key uniqueness enforced within partitions or across all partitions?
Question:
I am working with Apache Flink (Flink SQL) to manage Hudi tables, and I noticed that Hudi supports multiple index types. According to the official documentation on Index Types in Hudi, these ...
3
votes
0
answers
182
views
How to handle an array of objects in apache flink with the Table API
I'm consuming a Kafka topic with Flink using the Table API, pretty much like this:
CREATE TEMPORARY TABLE data_topic (
`hash` STRING,
`date` STRING,
`values` ARRAY<STRING>,
...
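A self-contained sketch of flattening the `values` array with CROSS JOIN UNNEST, using a datagen stand-in for the Kafka table in the question (the connector and data are assumptions; column names follow the excerpt):

from pyflink.table import EnvironmentSettings, TableEnvironment

t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

t_env.execute_sql("""
    CREATE TEMPORARY TABLE data_topic (
        `hash` STRING,
        `date` STRING,
        `values` ARRAY<STRING>
    ) WITH ('connector' = 'datagen', 'rows-per-second' = '1')
""")

t_env.execute_sql("""
    SELECT d.`hash`, d.`date`, v
    FROM data_topic AS d
    CROSS JOIN UNNEST(d.`values`) AS t (v)
""").print()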
0
votes
0
answers
54
views
BQ query to Flink SQL
WITH shopping_items_agg AS (
SELECT
order_number,
SUM(s.quantity) AS total_quantity
FROM
`booking`,
UNNEST(shopping_items) AS s
WHERE TIMESTAMP_TRUNC(event_timestamp, DAY) = ...
0
votes
0
answers
74
views
Unable to insert data into a Kafka topic in Apache Flink using the Upsert Kafka connector in Python
I am working on building an Apache Flink pipeline where the source is SQL Server CDC and the sink is Upsert Kafka. Everything looks fine until, when executing the insert, I get the error:
Caused by: java.lang....
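For reference, a minimal sketch of an upsert-kafka sink definition in PyFlink; the schema, topic, and brokers are assumptions. The connector requires a PRIMARY KEY ... NOT ENFORCED plus explicit key.format and value.format options:

from pyflink.table import EnvironmentSettings, TableEnvironment

t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

t_env.execute_sql("""
    CREATE TABLE kafka_upsert_sink (
        id INT,
        name STRING,
        PRIMARY KEY (id) NOT ENFORCED     -- becomes the Kafka record key
    ) WITH (
        'connector' = 'upsert-kafka',
        'topic' = 'my-topic',                              -- assumed topic
        'properties.bootstrap.servers' = 'localhost:9092', -- assumed brokers
        'key.format' = 'json',
        'value.format' = 'json'
    )
""")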