
I’m working on a Spring Batch job with parallel chunk processing. The problem I’m facing is a LazyInitializationException due to nested lazy-loaded collections in my JPA entities.

I’m using JpaPagingItemReader for reading, JpaTransactionManager for managing transactions, and SimpleAsyncTaskExecutor for parallel processing.

Setup:

  • Spring Boot: 3.3.4
  • Spring Batch: 5.1.2
  • Transaction Manager: JpaTransactionManager
  • Task Executor: SimpleAsyncTaskExecutor

Simple example:

Customer entity:

@Entity
public class Customer {
    
    @Id
    private Long id;
    
    @ManyToOne
    @JoinColumn(name = "order_id")
    private Order order;
}

Order entity:

The @OneToMany relation is FetchType.LAZY by default.

@Entity
public class Order {

    @Id
    private Long id;

    @OneToMany(mappedBy = "order", cascade = CascadeType.ALL, orphanRemoval = true)
    private List<Type> types;

    // Other @OneToMany collection fields...

}

Spring Batch Step Configuration:

I’m using a JpaPagingItemReader and an ItemProcessor. The step is parallelized using a SimpleAsyncTaskExecutor.

@Bean
public JpaPagingItemReader<Customer> customerItemReader(EntityManagerFactory entityManagerFactory) {
    return new JpaPagingItemReaderBuilder<Customer>()
            .name("customerItemReader")
            .entityManagerFactory(entityManagerFactory)
            .queryString("SELECT c FROM Customer c")
            .pageSize(100)
            .build();
}

@Bean
public Step processCustomersStep(JobRepository jobRepository,
                         PlatformTransactionManager transactionManager,
                         JpaPagingItemReader<Customer> customerItemReader,
                         ItemProcessor<Customer, Customer> customerItemProcessor,
                         ItemWriter<Customer> customerItemWriter,
                         TaskExecutor taskExecutor) {
    return new StepBuilder("processCustomersStep", jobRepository)
            .<Customer, Customer>chunk(100, transactionManager)
            .reader(customerItemReader)
            .processor(customerItemProcessor)
            .writer(customerItemWriter)
            .taskExecutor(taskExecutor)
            .build();
}

When running the batch job, I get the following error when accessing the lazy-loaded order or its nested types collection in the ItemProcessor:

org.hibernate.LazyInitializationException: failed to lazily initialize a collection of role: com.myapp.Order.types: could not initialize proxy - no Session

What I’ve tried:

  1. Fetching with JOIN FETCH: I can’t JOIN FETCH everything in one query, because the Order entity contains multiple nested lazy-loaded collections; fetching more than one List-typed collection in a single query fails, and combining the joins produces large, inefficient result sets.
  2. Since I am using JpaTransactionManager, I assumed it would manage the transaction scope properly, but the LazyInitializationException still occurs when accessing the nested lazy-loaded collections in the ItemProcessor.
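For illustration (this query is not from the original post, and `o.items` stands in for a hypothetical second collection), combining more than one List-typed collection in a single JOIN FETCH is exactly what Hibernate rejects:

```java
// Hypothetical combined fetch: two List-typed ("bag") collections in one query.
// Hibernate fails fast here with org.hibernate.loader.MultipleBagFetchException
// ("cannot simultaneously fetch multiple bags").
List<Customer> customers = entityManager.createQuery("""
        SELECT DISTINCT c FROM Customer c
        LEFT JOIN FETCH c.order o
        LEFT JOIN FETCH o.types
        LEFT JOIN FETCH o.items
    """, Customer.class)
    .getResultList();
```

Even if the collections were Sets (which Hibernate would accept), the single query degrades into a cartesian product across all the joined collections.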

Question:

How can I prevent the LazyInitializationException in a Spring Batch step with parallel chunks when using JpaPagingItemReader and JpaTransactionManager?

Should I handle the EntityManager differently, or is there another pattern to process entities with nested lazy-loaded collections in parallel?

UPDATE (processor, writer and JPA transaction manager)

Item processor:

In the ItemProcessor, I’m creating a new Customer instance and copying the nested Order and its associated Types. This requires accessing the lazy-loaded collections in the ItemProcessor, which is where the LazyInitializationException occurs.

@Component
public class CustomerItemProcessor implements ItemProcessor<Customer, Customer> {

    @Override
    public Customer process(Customer customer) throws Exception {
        Customer newCustomer = new Customer();
        Order order = new Order();
        order.setTypes(customer.getOrder().getTypes()));
        newCustomer.setOrder(order);
        return newCustomer; 
    }

}

Item writer:

The ItemWriter is a standard JpaItemWriter built from the EntityManagerFactory; it persists the processed entities.

@Bean
public JpaItemWriter<Customer> customerItemWriter(EntityManagerFactory entityManagerFactory) {
    return new JpaItemWriterBuilder<Customer>()
        .entityManagerFactory(entityManagerFactory)
        .build();
}

JPA transaction manager:

The JpaTransactionManager is used to manage the transaction scope within the batch job.

@Bean
public PlatformTransactionManager transactionManager(EntityManagerFactory entityManagerFactory) {
    return new JpaTransactionManager(entityManagerFactory);
}
  • please also add the relevant code for the processor, writer and JpaTransactionManager. Commented Oct 4, 2024 at 23:54
  • @PanagiotisBougioukos I updated the question with the required information. Commented Oct 5, 2024 at 17:26
  • Why write your own writer instead of using the default one? Use FETCH JOIN to identify which collections need to be eagerly loaded, or write a better query. Commented Oct 9, 2024 at 11:24
  • @M.Deinum By “default writer,” do you mean that if I don’t register a custom writer, Spring Batch will use the default one provided? Regarding JOIN FETCH, I’ve attempted that approach, but the entity contains several nested lazy-loaded collections, which causes the JpaPagingItemReader to fail. Could you clarify what you mean by “write a better query”? I’m open to suggestions on how to approach this more effectively. Commented Oct 11, 2024 at 17:07
  • Default writer as in the default JpaItemWriter: no need to write one yourself. Better query: write a query that only selects what you need. Your comment also hints that the code you show here isn't representative of what you really have. I also wonder what the error is you got when using FETCH JOIN. Commented Oct 14, 2024 at 7:53

1 Answer


The LazyInitializationException occurs because, with parallel chunks, the ItemProcessor runs on a worker thread after the reader's persistence context has moved on: the entities are detached by the time they are processed, so their uninitialized lazy collections have no Session to load from.
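To make the failure mode concrete, a minimal sketch (names follow the question's setup):

```java
// Reader thread: the page is read while the reader's EntityManager is open,
// but lazy associations are left as uninitialized proxies.
Customer customer = customerItemReader.read();

// Worker thread (SimpleAsyncTaskExecutor): by the time process() runs, the
// reader has flushed/cleared or closed its persistence context, so the entity
// is detached and the proxy has no Session behind it.
customer.getOrder().getTypes().size(); // throws LazyInitializationException
```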

To solve this, I implemented a custom paging ItemReader that eagerly loads the required collections during the read phase, inside the transaction.

public class CustomerItemReader extends AbstractPagingItemReader<Customer> {

    private final EntityManagerFactory entityManagerFactory;

    private EntityManager entityManager;

    public CustomerItemReader(EntityManagerFactory entityManagerFactory, int pageSize) {
        super();
        this.entityManagerFactory = entityManagerFactory;
        setPageSize(pageSize);
        setName("customerItemReader");
    }
    
    @Override
    public void afterPropertiesSet() throws Exception {
        super.afterPropertiesSet();
        Assert.state(entityManagerFactory != null, "EntityManagerFactory is required");
    }

    @Override
    protected void doOpen() throws Exception {
        super.doOpen();
        entityManager = entityManagerFactory.createEntityManager();
        if (entityManager == null) {
            throw new DataAccessResourceFailureException("Unable to obtain an EntityManager");
        }
    }

    @Override
    protected void doReadPage() {
        // Run each page in its own transaction; clear the persistence context
        // so entities from previous pages are released
        EntityTransaction tx = entityManager.getTransaction();
        tx.begin();

        entityManager.flush();
        entityManager.clear();

        if (results == null) {
            results = new CopyOnWriteArrayList<>();
        } else {
            results.clear();
        }

        // Step 1: Fetch the page of Customers together with their Orders
        List<Customer> customers = entityManager.createQuery("""
                SELECT c FROM Customer c
                LEFT JOIN FETCH c.order
                ORDER BY c.id
            """, Customer.class)
            .setFirstResult(getPage() * getPageSize())
            .setMaxResults(getPageSize())
            .getResultList();

        if (customers.isEmpty()) {
            tx.commit();
            return;
        }

        // Step 2: Collect the Order IDs for this page (some customers may have no order)
        List<Long> orderIds = customers.stream()
            .map(Customer::getOrder)
            .filter(Objects::nonNull)
            .map(Order::getId)
            .distinct()
            .toList();

        // Step 3: Load Order.types in one batch query; the fetched collections
        // are initialized on the Order instances already in the persistence context
        if (!orderIds.isEmpty()) {
            entityManager.createQuery("""
                    SELECT DISTINCT o FROM Order o
                    LEFT JOIN FETCH o.types
                    WHERE o.id IN :orderIds
                """, Order.class)
                .setParameter("orderIds", orderIds)
                .getResultList();
        }

        // Repeat step 3 to load other @OneToMany collections one by one to avoid MultipleBagFetchException...

        tx.commit();

        results.addAll(customers);
    }

    @Override
    protected void doClose() throws Exception {
        entityManager.close();
        super.doClose();
    }
}

This approach:

  • Avoids LazyInitializationException — collections are loaded within the active transaction
  • Avoids MultipleBagFetchException — by fetching each collection in its own query
  • Safe for parallel execution — no reliance on thread-bound persistence context
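For completeness, wiring the custom reader into the step could look like the following; the bean definitions are illustrative, and the rest of the step is unchanged from the question:

```java
@Bean
public CustomerItemReader customerItemReader(EntityManagerFactory entityManagerFactory) {
    // Page size matches the chunk size used by the step
    return new CustomerItemReader(entityManagerFactory, 100);
}

@Bean
public Step processCustomersStep(JobRepository jobRepository,
                                 PlatformTransactionManager transactionManager,
                                 CustomerItemReader customerItemReader,
                                 ItemProcessor<Customer, Customer> customerItemProcessor,
                                 ItemWriter<Customer> customerItemWriter,
                                 TaskExecutor taskExecutor) {
    return new StepBuilder("processCustomersStep", jobRepository)
            .<Customer, Customer>chunk(100, transactionManager)
            .reader(customerItemReader)
            .processor(customerItemProcessor)
            .writer(customerItemWriter)
            .taskExecutor(taskExecutor)
            .build();
}
```

AbstractPagingItemReader synchronizes its page reads, which is what allows this reader to back a multi-threaded step.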

Related: If you’re using an aggregate ItemReader that returns a List<T> per read (such as AggregatePagingItemReader), you may run into similar LazyInitializationException issues when accessing nested lazy-loaded collections in parallel chunk processing.

I’ve written a separate answer that shows how to eagerly load nested @ElementCollection fields using separate batch queries to avoid LazyInitializationException, MultipleBagFetchException, and N+1 problems in that scenario.
