I am trying to bulk insert data from SQL to ElasticSearch index. Below is the code I am using and total number of records is around 1.5 million. I think it something to do with connection setting but I am not able to figure it out. Can someone please help with this code or suggest better way to do it?
public void InsertReceipts
{
IEnumerable<Receipts> receipts = GetFromDB() // get receipts from SQL DB
const string index = "receipts";
var config = ConfigurationManager.AppSettings["ElasticSearchUri"];
var node = new Uri(config);
var settings = new ConnectionSettings(node).RequestTimeout(TimeSpan.FromMinutes(30));
var client = new ElasticClient(settings);
var bulkIndexer = new BulkDescriptor();
foreach (var receiptBatch in receipts.Batch(20000)) //using MoreLinq for Batch
{
Parallel.ForEach(receiptBatch, (receipt) =>
{
bulkIndexer.Index<OfficeReceipt>(i => i
.Document(receipt)
.Id(receipt.TransactionGuid)
.Index(index));
});
var response = client.Bulk(bulkIndexer);
if (!response.IsValid)
{
_logger.LogError(response.ServerError.ToString());
}
bulkIndexer = new BulkDescriptor();
}
}
Code works fine but takes around 10 mins to complete. When I try to increase batch size, it fails with below error:
Invalid NEST response built from a unsuccessful low level call on POST: /_bulk
Invalid Bulk items: OriginalException: System.Net.WebException: The underlying connection was closed: An unexpected error occurred on a send. ---> System.IO.IOException: Unable to write data to the transport connection: An existing connection was forcibly closed by the remote host. ---> System.Net.Sockets.SocketException: An existing connection was forcibly closed by the remote host