
I want to do a potentially very large insert on my database, based on other data from my database. Performance matters on this application, and even using a SqlBulkCopy slows this process down a bit too much for my liking.

So the question: can one use Linq to EF to automatically generate and run SQL code on the database, never returning anything back to the application? It seems possible to me; I just have no idea how to go about it.

Here's the premise: could something like the following ever work?

myContext.OutputTable.AddRange(
    myContext.TableA.Join(
        myContext.TableB,
        a => a.myAKey,
        b => b.myBKey,
        (a, b) => new { a.fieldA, a.fieldB, b.fieldC })
    .Select(output => new OutputTable
    {
        myOutputColumnA = output.fieldA,
        myOutputColumnB = output.fieldB,
        myOutputColumnC = output.fieldC
    })
);

In case what I'm trying to do here isn't obvious: I'm basically trying to insert data into "OutputTable" using data from both TableA and TableB, without Entity Framework ever returning the data to the application. Now, I know that it's possible to use ExecuteNonQuery() to run this kind of insert statement, but ultimately I don't want to do that, for a couple of reasons: I want to maintain the entirety of the code in C#, and I also want to keep the debugging experience that Linq offers (since I use Visual Studio, it helps to have a query fail in the same place the code fails).

I know that Linq to EF generates SQL that executes on the database, so it would seem possible to accomplish this (I might have to force the query to execute, since LINQ queries are deferred). The code above does not work, I believe because a new entity cannot be instantiated inside a Select() in that manner (Linq to EF has no idea how to translate it). So ultimately, is this a feasible path, and if not, are there any viable alternatives?
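(For comparison, the ExecuteNonQuery-style approach I'm trying to avoid would look roughly like the following. This is only a sketch using EF6's `Database.ExecuteSqlCommand`, with the same placeholder table and column names as the query above:)

```csharp
// Sketch of the raw-SQL alternative (EF6). The INSERT...SELECT runs entirely
// on the server; nothing is materialized in the application. Table and column
// names are the placeholders from the LINQ query above.
int rowsInserted = myContext.Database.ExecuteSqlCommand(@"
    INSERT INTO OutputTable (myOutputColumnA, myOutputColumnB, myOutputColumnC)
    SELECT a.fieldA, a.fieldB, b.fieldC
    FROM TableA AS a
    INNER JOIN TableB AS b ON a.myAKey = b.myBKey;");
```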

  • how much data are you talking about? hundreds? thousands? millions? Commented Jan 26, 2017 at 14:54
  • @Fran Realistically, in the range tens of thousands to millions. The first time that it's run may be tens of millions, although I don't expect it to be that big on a regular basis. Commented Jan 26, 2017 at 15:01
  • You can only achieve this by writing a stored procedure. Don't expect any safe .Net application to be able to manipulate data (i.e. memory content) it doesn't own itself. Anyway, "Entity Framework" and "fast" is a contradictio in terminis. Commented Jan 26, 2017 at 15:04
  • @GertArnold I had considered using a stored procedure, but I'm not a fan of separating the code base, just for maintainability purposes. If it's unavoidable, then I'll do it (from what I've seen, it might be the best option), because I ultimately care more about performance than maintainability for this particular piece. Commented Jan 26, 2017 at 15:15

1 Answer


You are not going to get faster than SqlBulkCopy by using EF. Entity Framework is an ORM, so even if you turn off every option possible, it's still going to be slower than SqlBulkCopy because it wants to take that raw SQL data and materialize objects.

SqlBulkCopy is like opening a firehose to the database. It can turn off constraints and uses a different write mechanism than regular writes to the database.
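A minimal sketch of that firehose, assuming the source rows come from a server-side join like the one in the question (connection strings and table/column names are placeholders; `WriteToServer` streams from any `IDataReader`, so nothing is materialized in memory):

```csharp
using System.Data.SqlClient;

// Stream rows from a source query straight into the target table.
// Connection strings and all table/column names are placeholders.
using (var source = new SqlConnection(sourceConnectionString))
using (var target = new SqlConnection(targetConnectionString))
{
    source.Open();
    target.Open();

    var cmd = new SqlCommand(
        @"SELECT a.fieldA, a.fieldB, b.fieldC
          FROM TableA AS a
          INNER JOIN TableB AS b ON a.myAKey = b.myBKey;", source);

    using (SqlDataReader reader = cmd.ExecuteReader())
    using (var bulk = new SqlBulkCopy(target))
    {
        bulk.DestinationTableName = "OutputTable";
        bulk.BatchSize = 10000;        // commit in chunks rather than one giant batch
        bulk.BulkCopyTimeout = 0;      // no timeout for very large loads
        bulk.ColumnMappings.Add("fieldA", "myOutputColumnA");
        bulk.ColumnMappings.Add("fieldB", "myOutputColumnB");
        bulk.ColumnMappings.Add("fieldC", "myOutputColumnC");
        bulk.WriteToServer(reader);    // streams rows; never builds entity objects
    }
}
```

If source and target are the same database, a plain `INSERT ... SELECT` on the server will beat even this, since the rows never leave the database at all.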

Anything faster than this would have to be contained in the database like Gert's comment mentioned.

Here's an old but good explanation of SqlBulkCopy performance:

http://www.sqlbi.com/wp-content/uploads/SqlBulkCopy-Performance-1.0.pdf

