I have the following setup:

  • Cassandra 4.1.7 running in Docker on an Ubuntu host, single node
  • Java client with DataStax driver 3.11.5, JDK 11.0.25

The code:

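package org.example;

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.QueryOptions;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Session;
// Only needed when workaround (2) below is uncommented.
import com.datastax.driver.core.ServerSideTimestampGenerator;
import com.datastax.driver.core.utils.UUIDs;

import java.util.UUID;

public class Main {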

    public static void main(String[] args) throws Exception {
        try (Cluster cluster = connect()) {
            Session session = cluster.connect();
            session.execute("DROP KEYSPACE IF EXISTS lwt_test");
            session.execute("CREATE KEYSPACE lwt_test WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 } AND DURABLE_WRITES = true");
            session.execute("CREATE TABLE lwt_test.lwt (key timeuuid, dummy text, value text, PRIMARY KEY(key))");
            session.execute("USE lwt_test");

            final UUID key = UUIDs.timeBased();
            final String dummy = "ABC";
            final String value = "DEF";
            session.execute("insert into lwt(key, dummy, value) values(?, ?, ?)", key, dummy, value);

            final String value2 = "XYZ";
            session.execute("update lwt set value=? where key=?", value2, key);
            String actualValue = session.execute("select value from lwt where key=?", key).one().getString("value");
            if (!value2.equals(actualValue)) {
                throw new RuntimeException("Should be " + value2 + " but was " + actualValue);
            }

            // (1) uncomment next line to make the problem go away
//            Thread.sleep(1000);

            ResultSet rs = session.execute("update lwt set value='MUHAHA' where key=? if value=?", key, value2);
            if (rs.wasApplied()) {
                actualValue = session.execute("select value from lwt where key=?", key).one().getString("value");
                if (!"MUHAHA".equals(actualValue)) {
                    throw new RuntimeException("Should be MUHAHA but was " + actualValue);
                }
                System.out.println("SUCCESS: actual value is " + actualValue);
            }
        }
    }

    private static Cluster connect() {
        return Cluster.builder()
                .addContactPoints("remote-cluster")
                .withPort(9042)
                .withQueryOptions(new QueryOptions().setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM))
                // (2) uncomment next line to make the problem go away
//                .withTimestampGenerator(ServerSideTimestampGenerator.INSTANCE)
                .build();
    }

}

When I run this code on Windows 10, it fails with an exception:

Exception in thread "main" java.lang.RuntimeException: Should be MUHAHA but was XYZ
    at org.example.Main.main(Main.java:37)

This means the LWT update did not take effect, even though it was executed later and wasApplied() returned true.

The same code runs without any error on macOS.

The workaround is either to wait before executing the LWT update (1) or to use server-side timestamps (2), which could introduce other problems in a multi-node cluster according to the discussion in https://issues.apache.org/jira/browse/CASSANDRA-6178

I also tried other Cassandra versions (3.11 and 5.0.2) and Java driver 4.17.0; the problem persists in all combinations.

This does not look like expected behaviour, because the outcome differs between Windows and macOS, and it looks more like a driver issue than a server issue.

Is there a way to make it work on Windows without the sleep, using the default (driver-side) timestamp generator?

1 Answer

The issue you described appears to be due to the combined use of conditional writes (UPDATE ... IF ...) with regular queries and is a variation of the problem reported in this post.

In @Andy-Tolbert's response, he explained that the default behaviour for the Java driver is to use client-side timestamps, where the write timestamp is generated by the driver. For conditional writes (LWTs), the write timestamp is generated on the server side by the Paxos mechanism.

If there is a discrepancy between the clock on the client(s) and the clock on the cluster node(s), query results can be unpredictable. The fact that adding an artificial delay (with sleep()) makes the problem go away supports this explanation.
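
To make the mechanism concrete, here is a minimal, self-contained sketch in plain Java. It is a simplified model, not Cassandra's or the driver's actual code: the Cell class, the method names and the clock values are all hypothetical, and tie-breaking on equal timestamps is ignored. It only illustrates how a conditional write can report that it was applied and still lose to an earlier regular write whose client-generated timestamp is ahead of the server clock.

```java
public class LwtTimestampSketch {

    /** One column value together with the write timestamp (in microseconds) that produced it. */
    static final class Cell {
        String value = null;
        long timestampMicros = Long.MIN_VALUE;
    }

    /** Last-write-wins: a write replaces the stored value only if its timestamp is higher. */
    static void write(Cell cell, String value, long timestampMicros) {
        if (timestampMicros > cell.timestampMicros) {
            cell.value = value;
            cell.timestampMicros = timestampMicros;
        }
    }

    /**
     * Simplified conditional write: the condition is checked against the current value,
     * and the write itself is stamped with the server's clock, not the client's.
     * Returns true when the condition held, mirroring wasApplied().
     */
    static boolean conditionalWrite(Cell cell, String expected, String newValue, long serverMicros) {
        if (!expected.equals(cell.value)) {
            return false;
        }
        write(cell, newValue, serverMicros);
        return true;
    }

    /** Replays the sequence from the question and returns the value a later read would see. */
    public static String finalValue(long clientAheadMicros) {
        long serverNow = 1_000_000L;                     // hypothetical server clock
        long clientNow = serverNow + clientAheadMicros;  // hypothetical client clock

        Cell cell = new Cell();
        write(cell, "DEF", clientNow);       // regular insert, client timestamp
        write(cell, "XYZ", clientNow + 1);   // regular update, client timestamp

        // Conditional update: condition sees "XYZ", write is stamped by the server.
        boolean applied = conditionalWrite(cell, "XYZ", "MUHAHA", serverNow + 2);

        System.out.println("skew=" + clientAheadMicros + "us applied=" + applied
                + " value=" + cell.value);
        return cell.value;
    }

    public static void main(String[] args) {
        finalValue(0L);        // clocks agree
        finalValue(500_000L);  // client is 0.5 s ahead of the server
    }
}
```

In this model, with no skew the final value is MUHAHA. With the client half a second ahead, the conditional write is still reported as applied, but the read returns XYZ, which matches the symptom in the question. Waiting before the conditional write lets the server clock catch up past the client timestamp, which would explain why the sleep hides the problem.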

Check the clocks on all the servers and clients to make sure they are in sync and that you've configured NTP, since synchronized clocks are critical to Cassandra's distributed architecture. Cheers!
