
I have to get data from 2 tables and then merge them into a CSV file.

Table 1 has 50 columns, table 2 has 20 columns. I have 30k records in table 1 and 10k in table 2.

The code works fine and takes 1 to 2 minutes for complete processing. The database query is quick; it is the foreach loop that takes the time, due to the number of rows.

Can this code be improved? It will be deployed in an Azure Function App, where it will take a lot more time than running locally on my computer.

Basically, I need to write all fields from Table 1 and then a "Comment" column at the end. The Comment value comes from a different table, so I perform an in-memory lookup to find the matching row. If found, I write the value of its Comment column; otherwise I write an empty field.

My code is:

var records = await ExecuteQuery(all_variations_query, tableName);

using var writer = new StreamWriter($"C:\\Work\\{fileName}");
using var csvWriter = new CsvWriter(writer, CultureInfo.InvariantCulture);
var options = new TypeConverterOptions { Formats = ["dd/MM/yyyy HH:mm:ss"] };
csvWriter.Context.TypeConverterOptionsCache.AddOptions<DateTime>(options);
DataTable dt = records.Tables[0];
foreach (DataColumn dc in dt.Columns)
{
    csvWriter.WriteField(dc.ColumnName);
}
csvWriter.WriteField("Comment");
csvWriter.NextRecord();
var commentriesDataset = await ExecuteQuery(all_commentries_query, tableName);
var commentryRows = commentriesDataset.Tables[0].Rows;
int htmlCommentaryLength = 5000;
foreach (DataRow dr in dt.Rows)
{
    foreach (DataColumn dc in dt.Columns)
    {
        csvWriter.WriteField(dr[dc]);
    }
    string? commentryValue = dr["VariationReasonsCommentary"]?.ToString();
    if (string.IsNullOrEmpty(commentryValue) == false)
    {
        bool notNumber = commentryValue.Any(x => char.IsDigit(x) == false);
        if (notNumber)
        {
            csvWriter.WriteField(commentryValue.Length >= htmlCommentaryLength ? commentryValue[..htmlCommentaryLength] : commentryValue);
        }
        else
        {
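            // Linear scan over all commentary rows for every variation row,
            // i.e. O(variations * commentaries); this lookup is the hot spot.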
            var commentryRow = commentryRows.Cast<DataRow>().Where(x => x["CommentaryID"].ToString() == commentryValue).FirstOrDefault();
            if (commentryRow != null)
            {
                commentryValue = commentryRow?["Comment"].ToString();
                csvWriter.WriteField(commentryValue?.Length >= htmlCommentaryLength ? commentryValue[..htmlCommentaryLength] : commentryValue);
            }
            else
            {
                csvWriter.WriteField("");
            }
        }
    }
    else
    {
        csvWriter.WriteField("");
    }
    csvWriter.NextRecord();
}
1 Comment

Use a Dictionary<string, string> for fast commentary lookups and replace the loop with csvWriter.WriteRecords() for batch writing to improve performance. (Commented Feb 11 at 7:34)
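
A rough sketch of that suggestion, for reference; it reuses the names from the question, assumes CsvHelper's support for writing dynamic records, replaces the manual header and row loops, and skips the question's inline-text and 5000-character truncation branches for brevity:

using System.Data;
using System.Dynamic;
using System.Linq;

// Build the lookup once (assumes CommentaryID values are unique).
var commentaryById = commentryRows.Cast<DataRow>()
    .ToDictionary(r => r["CommentaryID"].ToString(), r => r["Comment"].ToString());

// Project each variation row into a dynamic record so CsvHelper can write
// all columns plus the looked-up Comment in a single batch call.
var csvRecords = dt.AsEnumerable().Select(dr =>
{
    IDictionary<string, object> record = new ExpandoObject();
    foreach (DataColumn dc in dt.Columns)
        record[dc.ColumnName] = dr[dc];
    commentaryById.TryGetValue(dr["VariationReasonsCommentary"]?.ToString() ?? "", out var comment);
    record["Comment"] = comment ?? "";
    return (dynamic)record;
});
csvWriter.WriteRecords(csvRecords);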

1 Answer


I used a Dictionary for faster lookups, successfully merged the tables, and saved the data into a CSV file both locally and in production.

Here is the complete code in my GitHub repository.

CsvHelperService:

using System.Data;
using System.Globalization;
using System.Text;
using CsvHelper;
using CsvHelper.TypeConversion;

namespace FunctionApp27.Helpers
{
    public class CsvHelperService
    {
        public void WriteToCsv(DataTable variations, DataTable commentaries, string filePath)
        {
            using var writer = new StreamWriter(filePath, false, Encoding.UTF8, bufferSize: 65536);
            using var csvWriter = new CsvWriter(writer, CultureInfo.InvariantCulture);
            var options = new TypeConverterOptions { Formats = ["dd/MM/yyyy HH:mm:ss"] };
            csvWriter.Context.TypeConverterOptionsCache.AddOptions<DateTime>(options);

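            // Build the ID -> Comment lookup once; ToDictionary assumes the
            // CommentaryID values are unique and will throw on duplicates.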
            var commentaryDict = commentaries.AsEnumerable()
                .ToDictionary(row => row["CommentaryID"].ToString(), row => row["Comment"].ToString());

            foreach (DataColumn col in variations.Columns)
            {
                csvWriter.WriteField(col.ColumnName);
            }
            csvWriter.WriteField("Comment");
            csvWriter.NextRecord();
            foreach (DataRow row in variations.Rows)
            {
                foreach (DataColumn col in variations.Columns)
                {
                    csvWriter.WriteField(row[col]);
                }

                string commentId = row["VariationReasonsCommentary"]?.ToString();
                if (!string.IsNullOrEmpty(commentId) && commentaryDict.TryGetValue(commentId, out string comment))
                {
                    csvWriter.WriteField(comment.Length > 5000 ? comment[..5000] : comment);
                }
                else
                {
                    csvWriter.WriteField("");
                }
                csvWriter.NextRecord();
            }
        }
    }
}
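
Building the dictionary once replaces the per-row linear scan from the question with an O(1) hash lookup, which removes the hot spot from the write loop.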

Function1.cs:

using FunctionApp27.Helpers;
using FunctionApp27.Services;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.Functions.Worker;
using Microsoft.Extensions.Logging;

namespace FunctionApp27
{
    public class Function1
    {
        private readonly ILogger<Function1> _logger;
        private readonly DatabaseService _databaseService;
        private readonly CsvHelperService _csvHelperService;

        public Function1(ILogger<Function1> logger, DatabaseService databaseService, CsvHelperService csvHelperService)
        {
            _logger = logger;
            _databaseService = databaseService;
            _csvHelperService = csvHelperService;
        }

        [Function("ExportToCsv")]
        public async Task<IActionResult> Run([HttpTrigger(AuthorizationLevel.Function, "get", "post")] HttpRequest req)
        {
            try
            {
                _logger.LogInformation("Starting CSV Export process...");

                var variations = await _databaseService.GetVariationsAsync();
                var commentaries = await _databaseService.GetCommentariesAsync();
                string filePath = GetFilePath();
                _csvHelperService.WriteToCsv(variations, commentaries, filePath);
                return new OkObjectResult($"CSV file generated at: {filePath}");
            }
            catch (Exception ex)
            {
                _logger.LogError(ex, "Error processing request.");
                return new BadRequestObjectResult("Failed to process CSV export.");
            }
        }

        private string GetFilePath()
        {
            if (Environment.GetEnvironmentVariable("AZURE_FUNCTIONS_ENVIRONMENT") == "Development")
            {
                return Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.LocalApplicationData), "Temp", "exported_data.csv");
            }
            else
            {
                return Path.Combine("C:\\home\\site", "exported_data.csv");
            }
        }
    }
}
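
For completeness, a minimal Program.cs sketch for the dotnet-isolated worker model, so the constructor injection in Function1 resolves; the DatabaseService registration is an assumption, since its implementation is only in the linked repository:

using FunctionApp27.Helpers;
using FunctionApp27.Services;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;

var host = new HostBuilder()
    .ConfigureFunctionsWorkerDefaults()
    .ConfigureServices(services =>
    {
        // Register the dependencies injected into Function1's constructor.
        services.AddSingleton<DatabaseService>(); // assumed; implementation not shown here
        services.AddSingleton<CsvHelperService>();
    })
    .Build();

host.Run();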

local.settings.json:

{
  "IsEncrypted": false,
  "Values": {
    "AzureWebJobsStorage": "<StorageConnString>",
    "FUNCTIONS_WORKER_RUNTIME": "dotnet-isolated",
    "ConnectionStrings:SqlDatabase": "<SQLConnString>"
  }
}

Local Output:


The tables were successfully merged and saved as a CSV file at the path returned by GetFilePath().


After deployment, I added the SQL connection string in the Azure Function App under Environment variables > App settings:

"ConnectionStrings:SqlDatabase": "<SQLConnString>"


I successfully ran the function; the tables were merged and saved into a CSV file under C:\home\site.



2 Comments

Thank you for that. Is it possible for you to show how much time it would save, especially in Azure, with and without the dictionary? With my Basic B3 App Service plan and 18k records, it takes approximately 3 to 4 minutes.
@Ali I used the Consumption plan in an Azure Function App, and within 10 seconds the table data was merged and the CSV file was created.
