I’m working on a Spring Batch job with parallel chunk processing. The problem I’m facing is a LazyInitializationException due to nested lazy-loaded collections in my JPA entities.
I’m using JpaPagingItemReader for reading, JpaTransactionManager for managing transactions, and SimpleAsyncTaskExecutor for parallel processing.
Setup:
- Spring Boot: 3.3.4
- Spring Batch: 5.1.2
- Transaction Manager: JpaTransactionManager
- Task Executor: SimpleAsyncTaskExecutor
Simple example:
Customer entity:
@Entity
public class Customer {
    @Id
    private Long id;

    @ManyToOne
    @JoinColumn(name = "order_id")
    private Order order;
}
Order entity:
The @OneToMany relation is FetchType.LAZY by default.
@Entity
public class Order {
    @Id
    private Long id;

    @OneToMany(mappedBy = "order", cascade = CascadeType.ALL, orphanRemoval = true)
    private List<Type> types;

    // Other @OneToMany collection fields...
}
Spring Batch Step Configuration:
I’m using a JpaPagingItemReader and an ItemProcessor. The step is parallelized using a SimpleAsyncTaskExecutor.
@Bean
public JpaPagingItemReader<Customer> customerItemReader(EntityManagerFactory entityManagerFactory) {
    return new JpaPagingItemReaderBuilder<Customer>()
            .name("customerItemReader")
            .entityManagerFactory(entityManagerFactory)
            .queryString("SELECT c FROM Customer c")
            .pageSize(100)
            .build();
}
@Bean
public Step processCustomersStep(JobRepository jobRepository,
                                 PlatformTransactionManager transactionManager,
                                 JpaPagingItemReader<Customer> customerItemReader,
                                 ItemProcessor<Customer, Customer> customerItemProcessor,
                                 ItemWriter<Customer> customerItemWriter,
                                 TaskExecutor taskExecutor) {
    return new StepBuilder("processCustomersStep", jobRepository)
            .<Customer, Customer>chunk(100, transactionManager)
            .reader(customerItemReader)
            .processor(customerItemProcessor)
            .writer(customerItemWriter)
            .taskExecutor(taskExecutor)
            .build();
}
When running the batch job, I get the following error when accessing the lazy-loaded order or its nested types collection in the ItemProcessor:
org.hibernate.LazyInitializationException: failed to lazily initialize a collection of role: com.myapp.Order.types: could not initialize proxy - no Session
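For context, the "no Session" part of the message means that a lazy collection was touched after the persistence context that loaded its owner was no longer available. The following is a minimal plain-Java sketch of that mechanism only. It does not use Hibernate: `FakeSession` and `LazyList` are hypothetical stand-ins invented for illustration, and an `IllegalStateException` takes the place of the real `LazyInitializationException`.

```java
import java.util.List;
import java.util.function.Supplier;

class LazyLoadingSketch {

    /** Hypothetical stand-in for a persistence context; NOT a Hibernate class. */
    static final class FakeSession implements AutoCloseable {
        private boolean open = true;

        boolean isOpen() {
            return open;
        }

        @Override
        public void close() {
            open = false;
        }
    }

    /** Hypothetical stand-in for a lazy collection proxy bound to one session. */
    static final class LazyList<T> {
        private final FakeSession session;
        private final Supplier<List<T>> loader;
        private List<T> loaded; // stays null until the first successful access

        LazyList(FakeSession session, Supplier<List<T>> loader) {
            this.session = session;
            this.loader = loader;
        }

        List<T> get() {
            if (loaded == null) {
                // The first access needs the owning session to still be open.
                if (!session.isOpen()) {
                    throw new IllegalStateException("could not initialize proxy - no Session");
                }
                loaded = loader.get();
            }
            return loaded;
        }
    }

    /** First access happens while the session is open, so later reads succeed. */
    static String accessBeforeClose() {
        FakeSession session = new FakeSession();
        LazyList<String> types = new LazyList<>(session, () -> List.of("A", "B"));
        types.get();     // initializes the collection
        session.close();
        return "loaded " + types.get().size();
    }

    /** First access happens after the session was closed, so it fails. */
    static String accessAfterClose() {
        FakeSession session = new FakeSession();
        LazyList<String> types = new LazyList<>(session, () -> List.of("A", "B"));
        session.close();
        try {
            return "loaded " + types.get().size();
        } catch (IllegalStateException e) {
            return "failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(accessBeforeClose());
        System.out.println(accessAfterClose());
    }
}
```

In the job above, the suspicion would be that the processor thread ends up in the second situation: it touches `types` at a point where the entity is no longer attached to an open persistence context.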
What I’ve tried:
- Fetching with JOIN FETCH: I can’t use JOIN FETCH because the Order entity contains multiple nested lazy-loaded collections, leading to large, inefficient queries.
- Since I am using JpaTransactionManager, I assumed it would manage the transaction scope properly, but the LazyInitializationException still occurs when accessing the nested lazy-loaded collections in the ItemProcessor.
Question:
How can I prevent the LazyInitializationException in a Spring Batch step with parallel chunks when using JpaPagingItemReader and JpaTransactionManager?
Should I handle the EntityManager differently, or is there another pattern to process entities with nested lazy-loaded collections in parallel?
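One pattern that is sometimes suggested for this kind of setup, and that is only an assumption here rather than something verified against this job, is to hand each worker an id and let it load what it needs inside its own unit of work, so that no lazy collection is ever touched outside the context that loaded it. The sketch below models that idea in plain Java with an `ExecutorService`. `FakeSession`, `TYPES_BY_ORDER_ID` and `countTypes` are hypothetical names; nothing in it is Spring Batch or JPA API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class IdBasedProcessingSketch {

    // Hypothetical in-memory "database": order id -> type names.
    static final Map<Long, List<String>> TYPES_BY_ORDER_ID = Map.of(
            1L, List.of("A", "B"),
            2L, List.of("C"),
            3L, List.of());

    /** Hypothetical unit of work, one per worker task; NOT a JPA class. */
    static final class FakeSession implements AutoCloseable {
        private boolean open = true;

        List<String> loadTypes(long orderId) {
            if (!open) {
                throw new IllegalStateException("no Session");
            }
            // Copy the data so the caller holds plain, fully loaded values.
            return new ArrayList<>(TYPES_BY_ORDER_ID.getOrDefault(orderId, List.of()));
        }

        @Override
        public void close() {
            open = false;
        }
    }

    /** Processes one id: opens its own session, reads what it needs, closes it. */
    static int countTypes(long orderId) {
        try (FakeSession session = new FakeSession()) {
            return session.loadTypes(orderId).size();
        }
    }

    /** Runs countTypes for every id on a small thread pool and sums the results. */
    static int processInParallel(List<Long> orderIds) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (Long id : orderIds) {
                futures.add(pool.submit(() -> countTypes(id)));
            }
            int total = 0;
            for (Future<Integer> future : futures) {
                total += future.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("worker failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(processInParallel(List.of(1L, 2L, 3L)));
    }
}
```

In the real job, the rough equivalent might be a reader that selects only ids and a processor that calls `EntityManager.find(...)` inside the chunk transaction. That mapping is untested here and would need checking against the actual entities and queries.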
UPDATE (processor, writer and JPA transaction manager)
Item processor:
In the ItemProcessor, I’m creating a new Customer instance and copying the nested Order and its associated Types. This requires accessing the lazy-loaded collections in the ItemProcessor, which is where the LazyInitializationException occurs.
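import org.springframework.batch.item.ItemProcessor;
import org.springframework.stereotype.Component;

@Component
public class CustomerItemProcessor implements ItemProcessor<Customer, Customer> {
    @Override
    public Customer process(Customer customer) throws Exception {
        Customer newCustomer = new Customer();
        Order order = new Order();
        // Accessing getTypes() touches the lazy collection; this is where the exception is thrown.
        order.setTypes(customer.getOrder().getTypes());
        newCustomer.setOrder(order);
        return newCustomer;
    }
}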
Item writer:
The ItemWriter is a JpaItemWriter built from the injected EntityManagerFactory; it persists the processed entities.
@Bean
public JpaItemWriter<Customer> customerItemWriter(EntityManagerFactory entityManagerFactory) {
    return new JpaItemWriterBuilder<Customer>()
            .entityManagerFactory(entityManagerFactory)
            .build();
}
JPA transaction manager:
The JpaTransactionManager is used to manage the transaction scope within the batch job.
@Bean
public PlatformTransactionManager transactionManager(EntityManagerFactory entityManagerFactory) {
    return new JpaTransactionManager(entityManagerFactory);
}
[…] FETCH JOIN to identify which collections need to be eagerly loaded or write a better query.

[…] JOIN FETCH, I’ve attempted that approach, but the entity contains several nested lazy-loaded collections, which causes the JpaPagingItemReader to fail. Could you clarify what you mean by “write a better query”? I’m open to suggestions on how to approach this more effectively.

[…] JpaItemWriter no need to write one yourself. Better query, write a query that only selects what you need. Your comment also hints that the code you show here isn't representative of what you really have. I also wonder what the error is you got when using FETCH JOIN.