
I have the code below where I call a service to fetch data, then save the data in a list (transfers). Then I loop through each transfer to update a specific row in the DB.

QUESTIONS -

  1. Do I need to use a cancellation token on the transaction?
  2. If yes, what do I do if a DB exception is thrown?
  3. Is this the best way to update a large number of rows (e.g. 500)?

var service = new BalanceTransactionService();
var transfers = new List<(string, decimal?, long, string, string)>();

await foreach (BalanceTransaction balTransaction in service.ListAutoPagingAsync(listOptions, requestOptions))
{
    var charge = (Stripe.Charge)balTransaction.Source;
    transfers.Add((charge.SourceTransferId, balTransaction.ExchangeRate, balTransaction.Amount, balTransaction.Currency, stripePayoutId));
}

CancellationToken cancellationToken = new CancellationToken();
using var transaction = await _dbContext.Database.BeginTransactionAsync(cancellationToken);

try
{
    foreach (var transfer in transfers)
    {
        var stripeTransfersResult = await _dbContext.StripeTransfers
          .Where(p => p.TransferId == transfer.Item1)
          .ExecuteUpdateAsync(setters => setters
            .SetProperty(p => p.ExchangeRate, transfer.Item2)
            .SetProperty(p => p.ExchangeAmount, transfer.Item3)
            .SetProperty(p => p.ExchangeCurrency, transfer.Item4)
            .SetProperty(p => p.StripePayoutId, transfer.Item5)
            .SetProperty(p => p.EditedDate, DateTime.UtcNow)
          );
    }
    await transaction.CommitAsync(cancellationToken);
}
catch (DbException ex)
{
    // rollback if error
    await transaction.RollbackAsync(cancellationToken);
    _logger.LogInformation("Results not saved when associating payout with transfers. Stripe Payout Id: " + stripePayoutId);
    _emailService.SendEmailMessage(EmailType.Exceptions, "ReconcileTransferQueue Exception", "Exception thrown when associating payout with transfers. <br> Stripe Payout Id: " + stripePayoutId + "<br> Exception: <br>" + ex.ToString(), true);
}
  • If you do not pass a token, a default token is used -> learn.microsoft.com/en-us/dotnet/api/… - this takes care of 1. and 2. Commented Jul 2, 2024 at 3:36
  • What about RollbackAsync? If an exception gets thrown, do I need to roll back? I thought the whole reason for using a transaction is that if there is an exception and you don't get to the Commit part, nothing persists to the DB. Commented Jul 2, 2024 at 3:47
  • It depends. The using block will dispose the transaction, and disposing it rolls back uncommitted changes just as the rollback method does. If you want the rollback to happen immediately, call the method; if you do not care, let dispose handle it. Commented Jul 2, 2024 at 3:58
  • Well, this code is sitting in an Azure Function and the snippet I posted is the end of the function, so I guess, according to what you're saying, I really don't need to roll back. Commented Jul 2, 2024 at 4:03
  • This isn't a bulk update operation at all. This is a slow, Row-By-Agonizing-Row process with individual single-row UPDATEs. Had you loaded all the matching StripeTransfers objects, a single SaveChanges at the end would actually persist all changes in a single transaction. Unlike the rest of EF, ExecuteUpdateAsync and ExecuteDeleteAsync act as little more than SQL generators. Commented Jul 2, 2024 at 7:19

1 Answer

  1. Do I need to use a cancellation token on the transaction?

You need a cancellation token when you have a lot of records to update and the operation takes a long time (say, 100,000 records), or when you want to be able to stop the process immediately (for example, on shutdown).

Updating 500 records should take only a short time, so you probably don't need one here.
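
If you do decide to pass a token, here is a minimal sketch of the pattern. It assumes a 30-second timeout, and UpdateRowAsync is a hypothetical stand-in for the real database call, so the snippet runs without EF Core. In an Azure Function the token would more likely come from the host than from a locally created CancellationTokenSource.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for one database update. In the real code the token
// would be passed on to ExecuteUpdateAsync and CommitAsync instead.
static async Task UpdateRowAsync(string transferId, CancellationToken ct)
{
    await Task.Delay(10, ct); // simulates I/O; throws if the token is cancelled
}

static async Task<int> UpdateAllAsync(IEnumerable<string> transferIds, CancellationToken ct)
{
    var updated = 0;
    foreach (var id in transferIds)
    {
        ct.ThrowIfCancellationRequested(); // stop between rows once cancelled
        await UpdateRowAsync(id, ct);
        updated++;
    }
    return updated;
}

// Assumption: cancel automatically if the whole update takes longer than 30 seconds.
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(30));
try
{
    var count = await UpdateAllAsync(new[] { "tr_1", "tr_2", "tr_3" }, cts.Token);
    Console.WriteLine($"Updated {count} rows");
}
catch (OperationCanceledException)
{
    // Nothing was committed, so there is nothing to undo here; log and exit.
    Console.WriteLine("Update was cancelled before it finished");
}
```

Cancellation surfaces as an OperationCanceledException, not a DbException, so it needs its own catch block if you want to handle it separately.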

  2. If yes, what do I do if a DB exception is thrown?

Your options when a DB exception is thrown:

  • Retry the failed records
  • Log the error
  • Take any other action you think is necessary

You are not required to:

  • Manually roll back the transaction, though it is nice to have (the framework will do that for you when the transaction is disposed without being committed)

  3. Is this the best way to update a large number of rows (e.g. 500)?

It depends on the total number of records you are going to update. In my experience, if you have no more than 1M records to update, a batch size of 500 is fine. With 100M records, a batch size of 100,000 to 200,000 works well. For large numbers of records, the batch size also depends on your server's capacity: in my experience, on a server with 40 cores and 256 GB of memory handling billions of records, 200,000 records per batch is fine.

You might want to add a 100 ms pause between batch updates. It gives the database server time to recycle resources and gives other operations a chance to update the database; a long sequence of large updates can consume all system resources and leave other operations pending for a long time.

For a 500-record update you probably don't really need SqlBulkCopy, but it is still worth trying.
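
The batch-and-pause idea could be sketched roughly like this. ProcessInBatchesAsync is a hypothetical helper, not an EF Core API, and the applyBatch delegate stands in for whatever performs the actual update. The batch size of 500 and the 100 ms pause are the figures mentioned above. Enumerable.Chunk requires .NET 6 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: splits the items into batches and pauses between them
// so other database work gets a chance to run.
static async Task<int> ProcessInBatchesAsync<T>(
    IEnumerable<T> items,
    int batchSize,
    TimeSpan pause,
    Func<IReadOnlyList<T>, CancellationToken, Task> applyBatch,
    CancellationToken ct = default)
{
    var batches = 0;
    foreach (T[] batch in items.Chunk(batchSize))
    {
        if (batches > 0)
        {
            await Task.Delay(pause, ct); // breathing room before the next batch
        }
        await applyBatch(batch, ct);
        batches++;
    }
    return batches;
}

// Usage: 1,200 made-up transfer ids, 500 per batch, 100 ms between batches.
var ids = Enumerable.Range(1, 1200).Select(i => $"tr_{i}");
var total = await ProcessInBatchesAsync(
    ids,
    500,
    TimeSpan.FromMilliseconds(100),
    (batch, token) =>
    {
        // A real implementation would update the database here.
        Console.WriteLine($"Updating {batch.Count} rows");
        return Task.CompletedTask;
    });
Console.WriteLine($"{total} batches"); // prints "3 batches"
```

With only 10 to 500 rows per run this collapses to a single batch, so the pause never triggers; it matters only for much larger jobs.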


4 Comments

I might have somewhere between 10-500 updates every time the Azure Function executes. Ex. Uber driver gets a payout and he wants to see all his rides over a pay period. He might have 5 rides, if he didn't work that much or a couple hundred. My Azure Function runs when I receive a payout webhook from Stripe. I associate the payout to all the "rides" and do a few other calculations. I don't think there'll be that many rows to update per payout, but I might be receiving payouts very frequently. Question - do you think this would help github.com/borisdj/EFCore.BulkExtensions ?
Bringing in a library beyond the framework deserves serious consideration; you should be absolutely sure you need it. Before you put open-source or third-party code into production, please make sure that: the library is well tested; the library has been used successfully in production; the library author responds quickly to issues when needed; the library is extensible when new requirements call for changes, and you can change the code yourself to handle emergencies; the library is LTS (long-term supported); the community is active and the documentation is adequate; etc...
So the question is, do you really need it, or do you just want to try it?
Here's the thing: I'm not that concerned with the time it takes to run through the loop and update, because it's running on the server in the cloud. My concern is how it would affect overall DB performance once in production, when thousands and hopefully millions are using the DB at once. If I just loop through a few hundred rows and update them, do I really need a library for bulk CRUD operations? EFCore.BulkExtensions seems pretty straightforward. The author is responding to me on another SO post I created. stackoverflow.com/a/78699557/1186050
