15

I'm working on a project that runs in a clustered environment, where there are many nodes and a single database. The project uses Spring-data-JPA (1.9.0) and Hibernate (5.0.1). I'm having trouble working out how to prevent duplicate rows.

For the sake of example, here's a simple table:

@Entity
@Table(name = "scheduled_updates")
public class ScheduledUpdateData {
    public enum UpdateType {
        TYPE_A,
        TYPE_B
    }

    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    @Column(name = "id")
    private UUID id;

    @Column(name = "type", nullable = false)
    @Enumerated(EnumType.STRING)
    private UpdateType type;

    @Column(name = "source", nullable = false)
    private UUID source;
}

The important part is that there is a UNIQUE(type, source) constraint.

And of course, a matching example repository:

@Repository
public interface ScheduledUpdateRepository extends JpaRepository<ScheduledUpdateData, UUID> {
    ScheduledUpdateData findOneByTypeAndSource(final UpdateType type, final UUID source);

    //...
}

The idea for this example is that parts of the system can insert rows to be scheduled for something that runs periodically, any number of times between said runs. When that something actually runs, it doesn't have to worry about operating on the same thing twice.

How can I write a service method that would conditionally insert into this table? A few things I've tried that don't work are:

  1. Find > Act - The service method would use the repository to see if an entry already exists, and then either update the found entry or save a new one as needed. This does not work.
  2. Try insert > Update if fail - The service method would try to insert, catch the exception due to the unique constraint, and then do an update instead. This does not work since the transaction will already be in a rolled-back state and no further operations can be done in it.
  3. Native query with "INSERT INTO ... WHERE NOT EXISTS ..." - The repository has a new native query:

    @Repository
    public interface ScheduledUpdateRepository extends JpaRepository<ScheduledUpdateData, UUID> {
        // ...
    
        @Modifying
        @Query(nativeQuery = true, value = "INSERT INTO scheduled_updates (type, source)" +
                                           " SELECT :type, :src" +
                                           " WHERE NOT EXISTS (SELECT * FROM scheduled_updates WHERE type = :type AND source = :src)")
        void insertUniquely(@Param("type") final String type, @Param("src") final UUID source);
    }
    

    This unfortunately also does not work, as Hibernate appears to perform the SELECT used by the WHERE clause on its own first - which means in the end multiple inserts are tried, causing a unique constraint violation.
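For illustration, the failure behind approach 1 can be reproduced without a database. The sketch below is an in-memory analogy only, not Hibernate code: a concurrent set stands in for the table with its UNIQUE(type, source) constraint, two threads stand in for two nodes, and a barrier forces both to finish the "find" step before either performs the "act" step. The class name FindThenActRace and the key format are made up for the example.

```java
import java.util.Set;
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

// In-memory analogy of the find-then-act race (hypothetical names throughout).
final class FindThenActRace {
    // Stand-in for the table: add() returns false for a duplicate key,
    // much like the unique constraint rejecting a second insert.
    private final Set<String> rows = ConcurrentHashMap.newKeySet();
    private final AtomicInteger violations = new AtomicInteger();

    int rowCount() { return rows.size(); }
    int violationCount() { return violations.get(); }

    static FindThenActRace run() {
        FindThenActRace db = new FindThenActRace();
        CyclicBarrier bothHaveLooked = new CyclicBarrier(2);
        String key = "TYPE_A|source-1";

        Runnable node = () -> {
            // "Find": both nodes look before either has inserted.
            boolean found = db.rows.contains(key);
            try {
                bothHaveLooked.await();
            } catch (InterruptedException | BrokenBarrierException e) {
                throw new IllegalStateException(e);
            }
            // "Act": both believe the row is missing, so both try to insert.
            if (!found && !db.rows.add(key)) {
                db.violations.incrementAndGet();
            }
        };

        Thread first = new Thread(node);
        Thread second = new Thread(node);
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return db;
    }
}
```

Calling FindThenActRace.run() leaves one row in the set and records one rejected insert, which corresponds to the constraint violation seen on the losing node.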

I definitely don't know a lot of the finer points of JTA, JPA, or Hibernate. Any suggestions on how to insert into tables with unique constraints (beyond just the primary key) across multiple JVMs?

Edit 2016-02-02

With Postgres (2.3) as the database, I tried using isolation level SERIALIZABLE. Sadly, by itself this still caused constraint violation exceptions.

9
  • Try adding the @Version annotation to the entity and a version field to the table. Commented Jan 28, 2016 at 21:17
  • Retry the entire transaction (the second time with an update instead of an insert). Commented Jan 28, 2016 at 21:36
  • @sky_light Unfortunately @Version doesn't help, since the error occurs when two transactions both think the row does not exist yet. Commented Jan 28, 2016 at 21:50
  • @DraganBozanovic In the full project a single transaction could be updating multiple tables, any or all of which could have unique constraints, so retrying the whole transaction sounds like it would require introducing an entirely new layer of framework over the existing @Services, which doesn't seem right. Commented Jan 28, 2016 at 21:56
  • If serviceA runs in a transaction and invokes serviceB, which runs in its own transaction (i.e. uses REQUIRES_NEW), then you can catch the exception thrown by serviceB in serviceA, and that won't roll back serviceA. Commented Jan 28, 2016 at 22:23
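The retry suggestion from the comments can be sketched in plain Java. This is only an illustration under assumptions: DuplicateRowException is a hypothetical stand-in for whatever exception the persistence layer raises on a unique-constraint violation, and unitOfWork is assumed to begin and finish its own fresh transaction on every call, so a failed attempt leaves nothing behind.

```java
import java.util.function.Supplier;

// Generic retry helper (hypothetical names; not part of Spring or Hibernate).
final class RetryingTransactions {

    // Stand-in for the constraint-violation exception of the real stack.
    static final class DuplicateRowException extends RuntimeException {
        DuplicateRowException(String message) {
            super(message);
        }
    }

    // Runs the unit of work, repeating it from scratch when it fails
    // because another node inserted the same row first.
    static <T> T runWithRetry(int maxAttempts, Supplier<T> unitOfWork) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        DuplicateRowException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return unitOfWork.get();
            } catch (DuplicateRowException e) {
                last = e; // the next attempt should find the row and update it
            }
        }
        throw last;
    }
}
```

On the second attempt the find step would see the row committed by the other node, so the unit of work takes the update path instead of inserting again.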

4 Answers

2

You are trying to ensure that only 1 node can perform this operation at a time. The best (or at least most DB-agnostic) way to do this is with a 'lock' table. This table will have a single row, and will act as a semaphore to ensure serial access.

Make sure that this method is wrapped in a transaction:

// this line will block if any other thread already has a lock
// until that thread's transaction commits
Lock lock = entityManager.find(Lock.class, Lock.ID, LockModeType.PESSIMISTIC_WRITE);

// just some change to the row, it doesn't matter what
lock.setDateUpdated(new Timestamp(System.currentTimeMillis()));  
entityManager.merge(lock);
entityManager.flush();

// find your entity by unique constraint
// if it exists, update it
// if it doesn't, insert it
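
The effect of the lock row can be illustrated with an in-memory analogy. This is a sketch under loud assumptions, not JPA code: a ReentrantLock stands in for the single row locked with PESSIMISTIC_WRITE, releasing it stands in for the transaction commit, and a HashMap stands in for the table. All names here (LockRowAnalogy, insertOrUpdate, the key format) are made up for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// In-memory analogy of serializing find-then-act behind one lock row.
final class LockRowAnalogy {
    private final ReentrantLock lockRow = new ReentrantLock();
    private final Map<String, Integer> table = new HashMap<>();
    private final AtomicInteger inserts = new AtomicInteger();
    private final AtomicInteger updates = new AtomicInteger();

    int insertCount() { return inserts.get(); }
    int updateCount() { return updates.get(); }

    void insertOrUpdate(String type, String source) {
        String key = type + "|" + source;
        lockRow.lock(); // blocks until the current holder "commits"
        try {
            Integer timesScheduled = table.get(key); // find by unique key
            if (timesScheduled == null) {
                table.put(key, 1);                   // insert
                inserts.incrementAndGet();
            } else {
                table.put(key, timesScheduled + 1);  // update
                updates.incrementAndGet();
            }
        } finally {
            lockRow.unlock(); // analogous to the transaction committing
        }
    }

    // Lets several "nodes" schedule the same (type, source) pair at once.
    static LockRowAnalogy runConcurrently(int nodes) {
        LockRowAnalogy db = new LockRowAnalogy();
        ExecutorService pool = Executors.newFixedThreadPool(nodes);
        for (int i = 0; i < nodes; i++) {
            pool.execute(() -> db.insertOrUpdate("TYPE_A", "source-1"));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return db;
    }
}
```

Because every caller queues on the same lock, exactly one of them takes the insert path and the rest take the update path. The cost is the same as in the database version: all writers to the table are serialized.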

1

It sounds like an upsert case, which can be handled as suggested here.
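
One possible reading of this suggestion is a database-level upsert. The sketch below assumes PostgreSQL 9.5 or later, which added INSERT ... ON CONFLICT, and uses plain JDBC rather than a Spring Data repository. The table and column names come from the question; the class name, the method name, supplying the id from the client, and the driver accepting java.util.UUID through setObject are assumptions.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.UUID;

// Hypothetical helper; relies on the UNIQUE(type, source) constraint from the question.
final class ScheduledUpdateUpsert {
    // With DO NOTHING, a conflicting row is skipped instead of raising an error.
    static final String INSERT_IF_ABSENT =
            "INSERT INTO scheduled_updates (id, type, source) VALUES (?, ?, ?) "
          + "ON CONFLICT (type, source) DO NOTHING";

    // Returns true if this call inserted the row, false if it already existed.
    static boolean insertIfAbsent(Connection connection, String type, UUID source)
            throws SQLException {
        try (PreparedStatement statement = connection.prepareStatement(INSERT_IF_ABSENT)) {
            statement.setObject(1, UUID.randomUUID()); // assumption: id is supplied by the caller
            statement.setString(2, type);
            statement.setObject(3, source);
            return statement.executeUpdate() == 1;
        }
    }
}
```

Since the database resolves the conflict inside a single statement, no exception is raised for the duplicate and the surrounding transaction is not forced to roll back.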

0

Hibernate's query language supports an INSERT statement, so you can actually write that query in HQL. For more information, see http://docs.jboss.org/hibernate/orm/5.0/userguide/html_single/Hibernate_User_Guide.html#_hql_syntax_for_insert

1 Comment

Whilst this may theoretically answer the question, it would be preferable to include the essential parts of the answer here, and provide the link for reference.

0

Find > Act - The service method would use the repository to see if an entry already exists, and then either update the found entry or save a new one as needed. This does not work.

Why does this not work?

Have you considered "optimistic locking"?

These two posts may help:

1 Comment

None of this will solve the update-if-exists-insert-otherwise issue. Locking only works on a single record. If the record doesn't yet exist at the time of the initial query, there is nothing to lock. It's possible to lock the entire table, but that will create severe performance bottlenecks. A better approach is to create a single-record "Lock" table in the DB, and use that as a semaphore. See my reply for details.
