Hibernate SQLServer batch update transaction lock resulting in extremely poor performance?
I have a table filemapping with over 140 million rows, and commit batch updates (on, say, a million rows at a time) with Spring's JdbcTemplate as follows:

jdbcTemplate.batchUpdate("UPDATE filemapping SET checksum=? WHERE filePath=?", new BatchPreparedStatementSetter() {
    // batchObjects holds the rows to update for the current chunk
    public void setValues(PreparedStatement stmt, int issueIndex) throws SQLException {
        stmt.setString(1, batchObjects[issueIndex].getChecksum());
        stmt.setString(2, batchObjects[issueIndex].getFilePath());
    }

    // 1000 statements are sent per executeBatch() round trip
    public int getBatchSize() {
        return 1000;
    }
});

The table and indexes look like this:

CREATE TABLE [dbo].[filemapping] (
    [id]                INT            IDENTITY (1, 1) NOT NULL,
    [filePath]          VARCHAR (3000) NULL,
    [project_id]        INT            NOT NULL,
    [checksum]          VARCHAR (255)  NULL,
    CONSTRAINT [PK_FM] PRIMARY KEY NONCLUSTERED ([id] ASC),
    CONSTRAINT [ReFileMap] FOREIGN KEY ([project_id]) REFERENCES [dbo].[project] ([id]) ON DELETE CASCADE
);

CREATE NONCLUSTERED INDEX [MapIndexOne]
    ON [dbo].[filemapping]([project_id] ASC, [filePath] ASC);

CREATE NONCLUSTERED INDEX [MapIndexChecksum]
    ON [dbo].[filemapping]([checksum] ASC);
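One thing I've started doing while investigating (a diagnostic sketch; the literal values are placeholders): checking the estimated plan for a single representative update, since MapIndexOne leads on project_id and so may not help a lookup by filePath alone.

```sql
-- Placeholder values; shows the estimated plan without executing the update
SET SHOWPLAN_TEXT ON;
GO
UPDATE dbo.filemapping SET checksum = 'dummy' WHERE filePath = '/some/path';
GO
SET SHOWPLAN_TEXT OFF;
GO
```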

As this table has grown, the execution time has gone up by orders of magnitude -- a series of updates which used to take a minute now takes hours. sp_WhoIsActive shows that we are getting locks on the filemapping table, which to my understanding may explain why server resources are under-utilized and the update operations are so slow.
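For reference, the raw lock picture can also be read from the standard DMV (a sketch, filtered to object-level locks on this table):

```sql
-- Object-level locks currently held or pending on dbo.filemapping
SELECT request_session_id, resource_type, request_mode, request_status
FROM sys.dm_tran_locks
WHERE resource_database_id = DB_ID()
  AND resource_type = 'OBJECT'
  AND resource_associated_entity_id = OBJECT_ID('dbo.filemapping');
```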

My questions are:

  • Foremost, what can be done to speed this up?
  • Are lower or higher batch sizes per transaction worth exploring?
  • Do the indexes matter, presuming the lock is the limiting factor? How can I tell? (My wait statistics show CPU waits being the highest, and normally only one CPU is in use.)
  • Why would a lock slow the updates down in the first place? Could it be something else?
  • Would skipping locks matter at all for a series of updates with no selects?
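On the batch-size bullet, the variant I'm considering is committing each chunk in its own transaction so locks are released between chunks, rather than holding them for the whole run. A minimal, database-free sketch of just the chunking part (ChunkedCommit and chunks are illustrative names, not existing code):

```java
import java.util.ArrayList;
import java.util.List;

public class ChunkedCommit {
    // Split the full set of rows into fixed-size chunks; the caller would
    // run each chunk as its own transaction (begin -> batchUpdate -> commit),
    // releasing locks between chunks.
    static <T> List<List<T>> chunks(List<T> rows, int chunkSize) {
        List<List<T>> out = new ArrayList<>();
        for (int i = 0; i < rows.size(); i += chunkSize) {
            out.add(rows.subList(i, Math.min(i + chunkSize, rows.size())));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 2500; i++) rows.add(i);
        List<List<Integer>> parts = chunks(rows, 1000);
        System.out.println(parts.size());        // 3 transactions
        System.out.println(parts.get(2).size()); // last one covers 500 rows
    }
}
```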