
We're using DSE 6.0.18 (Cassandra version 3.11) and our application is both read and write heavy.

I have a situation where I need to fire N read queries per API call (one per partition key) using session.executeAsync(...) in a loop. N differs per API based on the use case: most APIs fire fewer than 30 queries, but roughly 20% of them fire far more, somewhere between 50 and 300. For the APIs that fire fewer queries, the overall Cassandra response time is at most 1 to 1.5 seconds, but for the APIs that fire many queries the response time jumps drastically, to 20 to 30 seconds.

I'm executing individual SELECT queries per partition key (e.g., SELECT * FROM table WHERE id = ?) using executeAsync().
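
To illustrate, the pattern is roughly the following (a simplified sketch against the DataStax Java driver 3.x API that ships with DSE 6; the keyspace/table names, the readAll wrapper and the partitionKeys argument are placeholders, not the real code):

    import com.datastax.driver.core.PreparedStatement;
    import com.datastax.driver.core.ResultSet;
    import com.datastax.driver.core.ResultSetFuture;
    import com.datastax.driver.core.Session;

    import java.util.ArrayList;
    import java.util.List;

    public class UnboundedAsyncReads {

        // Current pattern: fire one async SELECT per partition key, then block
        // while collecting all the futures. Names below are placeholders; the
        // statement is prepared once in the real code.
        public static List<ResultSet> readAll(Session session, List<String> partitionKeys) {
            PreparedStatement ps =
                    session.prepare("SELECT * FROM my_keyspace.my_table WHERE id = ?");

            List<ResultSetFuture> futures = new ArrayList<>(partitionKeys.size());
            for (String id : partitionKeys) {              // N ranges from under 30 up to ~300
                futures.add(session.executeAsync(ps.bind(id)));
            }

            List<ResultSet> results = new ArrayList<>(futures.size());
            for (ResultSetFuture f : futures) {
                results.add(f.getUninterruptibly());       // wait for each query to complete
            }
            return results;
        }
    }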

My questions:

  1. Why does firing more async reads in parallel degrade performance so drastically?
  2. Is there a limit on in-flight or async requests from the Cassandra Java driver?
  3. How can I throttle or batch async reads efficiently without losing parallelism?

I read in another post about tuning Native Transport Requests (NTR), e.g. native_transport_max_threads and max_queued_native_transport_requests. We are running with the default values for these parameters; would tuning them help us?

Additional information:

  1. We have a 12-node cluster with over 15 TB of data.
  2. Data is not evenly distributed, and the cluster has some known issues such as large partitions, uneven data distribution, and older versions. We are working on moving the data to newer hardware and will fix these later.
  3. No custom throttling is currently implemented.
  4. On average, 70M to 110M reads happen per day.
  5. The table has a large number of columns, around 400, and each query fetches roughly 300 of them.

1 Answer

Async execution of queries does not necessarily imply true parallelism. While executeAsync() is non-blocking, spawning 300 queries at once can overwhelm the system: not all of them are serviced simultaneously, and many end up waiting in queues. This queuing likely explains the drastic increase in response time for APIs that execute large numbers of queries.

I would recommend checking CPU utilization, thread pool stats (via nodetool tpstats), and capturing a thread dump to confirm thread contention or queuing bottlenecks on the Cassandra nodes.

Additionally, Cassandra does not support batch reads (see: Batch select in Cassandra). Upgrading hardware may not help much if the application logic and query patterns are inefficient. It would be more effective to optimize the queries, avoid wide rows and fetching ~300 columns unless necessary, and limit concurrency with a bounded async execution strategy rather than spawning hundreds of concurrent queries without control.
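
As an illustration, one simple way to bound concurrency on the client side is a semaphore that caps how many executeAsync() calls are in flight at any moment. This is only a sketch against the DataStax Java driver 3.x API bundled with DSE 6; MAX_IN_FLIGHT, the class/method names and the partition key list are illustrative and should be tuned to your workload:

    import com.datastax.driver.core.PreparedStatement;
    import com.datastax.driver.core.ResultSet;
    import com.datastax.driver.core.ResultSetFuture;
    import com.datastax.driver.core.Session;
    import com.google.common.util.concurrent.FutureCallback;
    import com.google.common.util.concurrent.Futures;
    import com.google.common.util.concurrent.MoreExecutors;

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.Semaphore;

    public class BoundedAsyncReads {

        // Illustrative cap on concurrent reads; tune it against your cluster's
        // capacity and the driver's connection pooling settings.
        private static final int MAX_IN_FLIGHT = 32;

        public static List<ResultSetFuture> readAll(Session session,
                                                    PreparedStatement ps,
                                                    List<String> partitionKeys)
                throws InterruptedException {
            final Semaphore permits = new Semaphore(MAX_IN_FLIGHT);
            List<ResultSetFuture> futures = new ArrayList<>(partitionKeys.size());

            for (String id : partitionKeys) {
                permits.acquire();                          // block while MAX_IN_FLIGHT queries are pending
                ResultSetFuture future = session.executeAsync(ps.bind(id));
                Futures.addCallback(future, new FutureCallback<ResultSet>() {
                    @Override public void onSuccess(ResultSet rs) { permits.release(); }
                    @Override public void onFailure(Throwable t)  { permits.release(); }
                }, MoreExecutors.directExecutor());
                futures.add(future);
            }
            return futures;                                 // callers still gather results asynchronously
        }
    }

Releasing the permit from the completion callback means backpressure follows actual query completion rather than submission, so slow nodes naturally slow the producer down instead of letting requests pile up in the driver's and the coordinator's queues.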


1 Comment

Well, the problem is on our clients' side; we have suggested that they select fewer columns unless absolutely necessary. Changing the application logic and data model will have to happen in the future, but they are expecting a quick fix, like some config change that might help. I'll check the CPU utilization and tpstats and investigate further.
