I use JDBC to insert rows into Postgres, and performance varies between runs for no apparent reason: it switches between two distinct levels, a "fast" one and one about 10x slower.
I've tested the issue with Postgres 14.2, 14.3 and 13.6, and it appears in all of these versions.
What could possibly be the reason for this behavior?
MRE (Java 17):
package test;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.sql.Connection;
import java.sql.DriverManager;

public class App {
    private static final Logger log = LoggerFactory.getLogger(App.class);

    public static void main(String[] args) throws Exception {
        Class.forName("org.postgresql.Driver");
        try (Connection connection = DriverManager.getConnection("jdbc:postgresql://172.17.0.3:5432/postgres", "postgres", "postgres")) {
            connection.setAutoCommit(false);
            var uploadedCount = 0;
            final var insertA = connection.prepareStatement("insert into A(AA) values(?)");
            final var insertB = connection.prepareStatement("insert into B(BA, BB) values(?, ?)");
            final var startTime = System.currentTimeMillis();
            var timeA = 0;
            var timeB = 0;
            var batchSize = 0;
            var batchesCount = 0;
            for (long i = 0; i < 16_384; i++) {
                insertA.setObject(1, i);
                insertA.addBatch();
                insertB.setObject(1, i);
                insertB.setObject(2, i);
                insertB.addBatch();
                batchSize++;
                if (batchSize == 32) {
                    // time the two batches separately
                    var start = System.currentTimeMillis();
                    insertA.executeBatch();
                    timeA += System.currentTimeMillis() - start;
                    start = System.currentTimeMillis();
                    insertB.executeBatch();
                    timeB += System.currentTimeMillis() - start;
                    uploadedCount += batchSize * 2;
                    batchSize = 0;
                    batchesCount++;
                    if (batchesCount % 256 == 0) {
                        log.info("INSERT A CUMULATIVE TIME: {}", timeA);
                        log.info("INSERT B CUMULATIVE TIME: {}", timeB);
                        log.info("ROWS PER SEC: {}", uploadedCount * 1000L / (System.currentTimeMillis() - startTime));
                    }
                }
            }
        }
    }
}
MRE schema:
CREATE TABLE A (AA BIGINT NOT NULL, CONSTRAINT PK_AA PRIMARY KEY (AA));
CREATE TABLE B (BA BIGINT NOT NULL, BB BIGINT NOT NULL, CONSTRAINT PK_BA_BB PRIMARY KEY (BA, BB));
ALTER TABLE B ADD CONSTRAINT FK_BA_A FOREIGN KEY (BA) REFERENCES A (AA);
ALTER TABLE B ADD CONSTRAINT FK_BB_A FOREIGN KEY (BB) REFERENCES A (AA);
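Worth keeping in mind when reading the timings for B below: each row inserted into B fires two foreign-key checks against A's primary key, one per referencing column. Internally, Postgres's referential-integrity trigger runs a lookup of roughly this shape (a sketch of the check, not the exact internal statement):

```
-- Approximate shape of the FK-check query executed per inserted row of B,
-- once for BA and once for BB:
SELECT 1 FROM ONLY A x WHERE x.AA = $1 FOR KEY SHARE OF x;
```

So B's inserts carry extra per-row query work that A's inserts don't, which is where any change in how those check queries get planned would show up.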
MRE slow run example logs:
2022-05-31 14:50:45 I main App INSERT A CUMULATIVE TIME: 108
2022-05-31 14:50:45 I main App INSERT B CUMULATIVE TIME: 4395
2022-05-31 14:50:45 I main App ROWS PER SEC: 3623
2022-05-31 14:50:58 I main App INSERT A CUMULATIVE TIME: 197
2022-05-31 14:50:58 I main App INSERT B CUMULATIVE TIME: 17046
2022-05-31 14:50:58 I main App ROWS PER SEC: 1896
MRE fast run example logs:
2022-05-31 14:51:02 I main App INSERT A CUMULATIVE TIME: 100
2022-05-31 14:51:02 I main App INSERT B CUMULATIVE TIME: 269
2022-05-31 14:51:02 I main App ROWS PER SEC: 43115
2022-05-31 14:51:03 I main App INSERT A CUMULATIVE TIME: 177
2022-05-31 14:51:03 I main App INSERT B CUMULATIVE TIME: 509
2022-05-31 14:51:03 I main App ROWS PER SEC: 46811
I have not observed the phenomenon with this schema:
CREATE TABLE A (AA BIGINT NOT NULL, CONSTRAINT PK_AA PRIMARY KEY (AA));
CREATE TABLE B (BA BIGINT NOT NULL, CONSTRAINT PK_BA PRIMARY KEY (BA));
ALTER TABLE B ADD CONSTRAINT FK_BA_A FOREIGN KEY (BA) REFERENCES A (AA);
So B has a compound PK where both columns are FKs to A? Maybe the planner has a bug, or your test setup isn't sound. Is there a real situation where you're seeing degraded performance? And if you rewrite the batch statements, does it affect anything?
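On the last point about rewriting batch statements: pgjdbc has a `reWriteBatchedInserts` connection parameter that makes the driver rewrite consecutive batched single-row INSERTs into multi-row INSERT statements. Trying it in the MRE is a one-line URL change (same host and credentials as above):

```
jdbc:postgresql://172.17.0.3:5432/postgres?reWriteBatchedInserts=true
```

Whether the bimodal behavior persists with this enabled would help narrow down whether the extra time comes from per-statement overhead or from somewhere else.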