I used this guide to build an ASP.NET Web Form in VS 2019 with a button that executes a stored procedure on my SQL server, and writes the results to a CSV file for download through the browser.

It all works great, except that it's incredibly slow.

For example, it took over an hour for it to generate a 3.4 MB CSV with 20K rows.

Is there any way to speed this up at all?

Here's my code. I modified it slightly based on what I found while searching for this issue, but unfortunately none of it helped:

    Protected Sub ExportCSV(ByVal sender As Object, ByVal e As EventArgs)

    Dim sqlcmd As String = "EXEC my_sp_name;"

    Dim constr As String = ConfigurationManager.ConnectionStrings("CONNSTR").ConnectionString

    Using con As New SqlConnection(constr)

        Using cmd As New SqlCommand(sqlcmd)

            Using sda As New SqlDataAdapter()

                cmd.Connection = con
                sda.SelectCommand = cmd

                Using dt As New DataTable()

                    sda.Fill(dt)

                    Dim csv As String = String.Empty

                    For Each column As DataColumn In dt.Columns
                        csv += column.ColumnName + ","
                    Next

                    csv += vbCrLf

                    For Each row As DataRow In dt.Rows

                        For Each column As DataColumn In dt.Columns
                            csv += row(column.ColumnName).ToString().Replace(",", ";") + ","
                        Next

                        csv += vbCrLf
                    Next

                    Response.Clear()
                    Response.ClearHeaders()
                    Response.ClearContent()
                    Response.Buffer = True
                    Response.AddHeader("content-disposition", "attachment;filename=Data.csv")
                    Response.Charset = Encoding.UTF8.WebName
                    Response.ContentType = "text/csv"
                    Response.Output.Write(csv)
                    Response.Flush()
                    Response.End()

                End Using
            End Using
        End Using
    End Using

End Sub

I see no activity on my SQL server while it's running. I can even manually execute the same stored procedure with the same parameters in seconds on my SQL server.

I have no idea what to do. So, any help is greatly appreciated.

  • Start by checking your database indexes. Commented Jan 4, 2022 at 8:56

1 Answer

OK, there is a simple explanation for why this runs so slowly. And in fact, you can make it run THOUSANDS of times faster!

First up: database and indexing? Not a problem with 20,000 rows. In fact, even some old computer you found in a dumpster will easily pull 100,000 rows per second, and do so against a database WITHOUT ANY indexing, even if you apply sorting and criteria.

So, no, this is not a database issue. You have HUGE speed there, and it would in fact take some effort to make the database run slow.

So, then what is the problem?

Why, of course it is the string concatenation!

Think of it this way:

Say you have to walk 150 miles

But, say when you reach 100 miles? To go to the 101st mile?

Well, you go back to mile 1, and then walk 100 miles + 1 mile.

Then for mile 102?

You go back to mile 1, and then walk 102 miles

Then for mile 103?

You go back to mile 1, and then walk 103 miles

Note how for just miles 101 to 103, you have now walked OVER 300 miles (101 + 102 + 103 = 306)!

The SAME thing is occurring with your string concatenation. In fact, once your string gets long (a few thousand characters or so), you are going to see HUGE slowdowns.

So, assume your string has now 500,000 characters?

And you go

 MyString = MyString & "Hello"

What does it do?

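Strings in .NET are immutable, so it can't just tack "Hello" onto the end. It allocates a brand-new string, copies all 500,000 existing characters into it starting at character position 1, and only THEN adds the 2nd part.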

So, this is just like the walking problem!!!
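One common way to avoid the walk-back in .NET is `System.Text.StringBuilder`, which appends into a growable buffer instead of copying the whole string on every `&`. This is a different technique from the list-based fix used further down in this answer, and it is only a minimal sketch: `BuildWithBuilder` and `pieceCount` are made-up names for illustration.

```vbnet
Imports System
Imports System.Text

Module StringBuilderSketch

    ' Hypothetical helper: appends the same piece pieceCount times
    ' without re-copying everything that was appended before.
    Function BuildWithBuilder(piece As String, pieceCount As Integer) As String
        Dim sb As New StringBuilder()
        For i As Integer = 1 To pieceCount
            sb.Append(piece)
        Next
        ' The one full copy happens here, when the final string is created.
        Return sb.ToString()
    End Function

    Sub Main()
        ' Same shape as a 6000-pass test: 6000 pieces of 500 spaces each.
        Dim result As String = BuildWithBuilder(New String(" "c, 500), 6000)
        Console.WriteLine(result.Length)   ' prints 3000000
    End Sub

End Module
```

In the export code, the same idea would mean calling `sb.Append(...)` wherever the original does `csv += ...`, and writing `sb.ToString()` to the response once at the end.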

By the way, it's most interesting that someone suggested checking the database. In fact, if you go for a job interview at places like Microsoft or Google, they will ask a whole bunch of questions that make you, as a developer, estimate how big, how far, how much.

In other words, they are looking for developers who have a sense of scale.

I mean, you can walk to the store 1 block away. But what about 10 miles away? Nope, that would take practically the whole day (10 miles to the store, and 10 back). So you need some form of transportation, and you need to realize that you can't walk to the store anymore.

The cute little problem posted here is a PERFECT example of the simple type of thinking required to see why this runs so slow!

In fact, remove all database and all file operations.

Try this simple loop:

    Dim str1 As String = ""
    Dim str2 As String = Space(500)

    Dim t As Long = Date.Now.Ticks

    Dim i As Integer
    For i = 1 To 3000
        str1 = str1 + str2
    Next

    ' One tick is 100 nanoseconds, so ticks / 10,000,000 = seconds
    Dim td As Double = (Date.Now.Ticks - t) / 10000 / 1000

    TextBox1.Text = td.ToString()

So, a simple loop to 3000? (Your computer can run a For/Next loop to 1 billion in WELL UNDER 1 second!)

It takes about 10 seconds (on a slow computer).

But now let's try it for 6000. Remember, you are walking back to the start each time, and each time you have to walk even further to get to the end. The growth is not linear but quadratic: double the number of appends and you get roughly four times the work!

So, with 6000, will it take 20 seconds? Nope! Each pass goes back to the start, traverses everything to the end, and then adds the 2nd string, and each pass is longer than the one before.

We get: 48 seconds!
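A back-of-the-envelope count of copied characters is consistent with that jump. This is only a rough model: it assumes every concatenation copies the whole existing string plus the new piece, and it ignores allocation and garbage-collection cost. With $n$ appends of $m$ characters each, append number $k$ copies about $k\,m$ characters, so:

$$
C(n) = \sum_{k=1}^{n} k\,m = m\,\frac{n(n+1)}{2}
$$

For the test loop above ($m = 500$):

$$
C(3000) = 500 \cdot \frac{3000 \cdot 3001}{2} = 2{,}250{,}750{,}000
\qquad
C(6000) = 500 \cdot \frac{6000 \cdot 6001}{2} = 9{,}001{,}500{,}000
$$

So doubling the loop count means roughly 4 times the copying, which is in the same ballpark as the 10 seconds versus 48 seconds measured.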

Now you know why those hiring firms ask the above kind of job interview questions: how high, how far, how big! You have to be able to think in terms of these types of problems.

So, what happens if we create each new row of data, but "add" it to, say, an array, or even a list (which would then be easy to write out as rows to a file)?

Let's now try this:

    Dim str2 As String = Space(500)

    Dim myList As List(Of String) = New List(Of String)

    Dim t As Long = Date.Now.Ticks

    Dim i As Integer

    For i = 1 To 6000
        myList.Add(str2)
    Next

    ' One tick is 100 nanoseconds, so ticks / 10,000,000 = seconds
    Dim td As Double = (Date.Now.Ticks - t) / 10000 / 1000

    TextBox1.Text = td.ToString()

Time: 0

It runs so fast that it doesn't even register.

Let's go to 100,000 rows.

E.g.:

    For i = 1 To 100000
        myList.Add(str2)
    Next

Time: 0.005033

That's 5 thousandths of a second!

Note how STUNNINGLY fast this is! The 6000-pass concatenation loop took 48 seconds; this handles over 16 times as many items in roughly 1/10,000th of the time.

So, for your code? Since the above is SO VERY fast, we can get a bit lazy here: build the whole list first, and not bother to write out each row as we create it. We are dealing with 20,000 rows, which is small, so that saves time and keeps the code simple.

However, if this were 50k or 100k rows? Then yes, I would suggest that the main loop creates each row and THEN sends it straight out to the file.

For now, we can do this (air code follows!):

    ' One list entry per CSV line, instead of one giant string
    Dim MyOutList As List(Of String) = New List(Of String)
    Dim csv As String = ""

    ' Header line
    For Each column As DataColumn In dt.Columns
        csv += column.ColumnName + ","
    Next

    MyOutList.Add(csv)

    For Each row As DataRow In dt.Rows
        ' Start each row with an empty string so csv stays short
        csv = ""
        For Each column As DataColumn In dt.Columns
            csv += row(column.ColumnName).ToString().Replace(",", ";") + ","
        Next

        MyOutList.Add(csv)

    Next


    Response.Clear()
    Response.ClearHeaders()
    Response.ClearContent()
    Response.Buffer = True
    Response.AddHeader("content-disposition", "attachment;filename=Data.csv")
    Response.Charset = Encoding.UTF8.WebName
    Response.ContentType = "text/csv"
    For Each OneLine As String In MyOutList
        Response.Write(OneLine & vbCrLf)
    Next
    Response.Flush()
    Response.End()

Try the above - it will run oh so much faster!!!

And do post back here how long it takes!!!

So, we need to process the database row by row, but NOT use some huge large string that we concatenate over and over.
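For larger exports, the "create the row and THEN send it out" idea can be sketched roughly as below. This is an illustration under assumptions, not the code that was tested above: `WriteCsv` is a hypothetical helper, it joins fields with `String.Join` (so, unlike the air code, there is no trailing comma on each line), and it keeps the same crude comma-to-semicolon replacement instead of real CSV quoting.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Data
Imports System.IO

Module CsvStreamingSketch

    ' Hypothetical helper: writes the table to any TextWriter one line at a time,
    ' so no single large string is ever built up.
    Sub WriteCsv(table As DataTable, writer As TextWriter)
        ' Header line
        Dim names As New List(Of String)
        For Each column As DataColumn In table.Columns
            names.Add(column.ColumnName)
        Next
        writer.WriteLine(String.Join(",", names))

        ' One short string per row, written out immediately
        For Each row As DataRow In table.Rows
            Dim fields As New List(Of String)
            For Each item As Object In row.ItemArray
                ' Same crude escaping as above: commas become semicolons
                fields.Add(Convert.ToString(item).Replace(",", ";"))
            Next
            writer.WriteLine(String.Join(",", fields))
        Next
    End Sub

    Sub Main()
        ' Small made-up table, just to show the call
        Dim table As New DataTable()
        table.Columns.Add("Name")
        table.Columns.Add("City")
        table.Rows.Add("Smith, John", "Edmonton")

        Using writer As New StringWriter()
            WriteCsv(table, writer)
            Console.Write(writer.ToString())
        End Using
    End Sub

End Module
```

In the web form, the call would presumably be `WriteCsv(dt, Response.Output)` after the headers are set, since `Response.Output` is a `TextWriter`. With `Response.Buffer = True`, ASP.NET still buffers the response itself, but your own code no longer holds the whole file in one string.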

2 Comments

Your answer not only solved the issue, but gave me a good laugh. Thank you so much. I did have to modify the code slightly. Under each MyOutList.Add(csv) statement, I had to add csv = String.Empty so that it would clear the csv value that was already stored. I also had to change the "=" to a "+=" for it to write the entire row for each line. I can generate a 90 MB CSV with 600K rows in seconds now.
Yes - I edited my code to empty the CSV string. Regardless, you will see that larger strings run slowly if you have to modify such large strings a lot.
