0

Currently I am working on a project in which we read, process and store products. We are using Entity Framework 6 to read and write to a MySql database.

After building the prototype and fetching some statistics, we found that storing new products in the database takes (relatively) much time. I have been asked to improve this, but I can't really figure out what the best option is.

Currently, every read and write happens in a using block. Since this was my first time using Entity Framework 6, I did my research and the vast majority of StackOverflow said that you should always use a using block. So I did.

Code snippet of how it looks now;

public int GetSomeId(string SomeStringToMatchWith)
{
    using (var db = new MyDbContext())
    {
        return db.SomeTable.Where(t => t.SomeString == SomeStringToMatchWith).FirstOrDefault().id;
    }
}

public void SaveSomeData(int SomeId)
{
    using(var db = new MyDbContext())
    {
        db.SomeTable.Add(new SomeTable{ Id = SomeId });
        db.SaveChanges();
    }
}

I have been told that MySql would work faster if it is exposed to bulk data rather than to single data inserts. Also, after reading this question, I figured it might be better for the program to not immediately write the data to the database (and therefore, to not use using), but to create a simple Repository that saves data and write it to the database after a certain amount of time. Since I am willing to acccess the Repository through multiple threads, I figured a Singleton design would satisfy.

There is one nasty requirement though; the products have to be matched by certain values, and it could happen that product #2 has a match with product #1. In other words, I always need to be able to access the most recent data.

Something like this came to my mind;

public class Repository
{
    private static readonly object Lock = new object();

    private MyDbContext context { get; set; }

    private Repository()
    {        
        context = new MyDbContext();
    }

    private static Repository _Instance;
    public static Repository Instance
    {
        get
        {
            if (_Instance == null)
            {
                lock(Lock)
                {
                    if(_Instance == null)
                    {
                        _Instance = new Repository();
                    }
                }                    
            }
            return _Instance;
        }
    }

    //This method is called once in a while
    public void Commit()
    {
        context.SaveChanges();
        context.Dispose();                //Get rid of entities
        context = new MyDbContext();      //Create a fresh DbContext
    }   

    //Other Read/Write methods  
}

There are a few questions, actually;

  • Is it a hard task to make this whole class thread safe? Do I need to add a lock to every table to do so, or is there a smarter way?
  • Would it actually increase the performance? The linked question above does make me think it does.
  • Since this is not the recommended way and most of you will (probably) disagree; would there be a better way to implement this?

Please note that the current version works as-is. The program needs to process about 2.500.000 products and the only bottleneck seems to be the writing to the database. As a final note, I have also read Implementing the Repository and Unit of Work Patterns... which clearly tells me how to do it, but does not give me information why I should or should not use it.

6
  • Bunch of good questions! First, you're on right track of creating repository as it can at least alleviate constant opening and closing of DB. I'm not too familiar with Locking at DB layer but, would recommend that as it's a common thing for a DB to lock tables or rows. Locking at application layer is ok too and you are on the right track. The only problem locking there is that it could be too broad. But I do have some very good news for you which I'll post a an answer as it can eliminate issues and give you best of EF And SQL worlds at same time. Commented Nov 4, 2014 at 15:09
  • 1
    Since I am willing to acccess the Repository through multiple threads, I figured a Singleton design would satisfy. There is nothing about a singleton that guarantees it to be thread-safe. You will have to explicitly make it so throughout your code just as you would if you'd be using an instance. Commented Nov 4, 2014 at 15:10
  • Do also see this question for reference. As I understand it EF is not built for big bulk operations. Commented Nov 4, 2014 at 15:12
  • you can also play around with the AutoDetectChanges to improve performance. It has other bottlenecks in that you would need to keep track of changes.. but if it's only new entities you are adding I think it should be ok. Commented Nov 4, 2014 at 15:20
  • 2
    There is also a project on CodePlex for Bulk Inserts efbulkinsert.codeplex.com Commented Nov 4, 2014 at 15:33

1 Answer 1

0

EF is very powerful and allows the designer of the application to determine how deep into it they want to go! What does this mean? It means you can either do everything through EF or when needed you can interface directly with SQL... Below are a few examples of this:

//The normal EF pattern is:
using(var db = new Myentities()){//do something here}

The method above favors strong typing all the way, it allows for perfect LINQ integration and produces very nicely (perfectly?) formed queries.

//However you can use EF to hit SQL directly.
using(var db= new MyEntiteis()){ var stuff = db.DataBase<passInStrongType>(query,parms)

The query parameter is a string, with parameter placeholders like this.

"Select * from Table where Field== @Parm1"  //works for inserts, updates and deletes too.

The second value is often an SQLParameter Array like this:

 var parms = new List<SqlParameter>()
 parms.Add(new SqlParameter { Name="Parm1", value = "myvalue" }).ToArray();

Now the beautify of this Hybrid solution is that you can get the absolute fastest SQL response because you are hitting SQL directly, but you pick up the ability to have results returned using Strongly-typed models.

Any type of query is allowed, Updates, Inserts, Deletes...

This hybrid approach get's you closest to SQL layer whild still staying within EF. Any collection returned still allows for LINQ. This to me is the best of two worlds.

Sign up to request clarification or add additional context in comments.

8 Comments

Is it a hard task to make this whole class thread safe? Do I need to add a lock to every table to do so, or is there a smarter way? - I cannot find the answer to that question here..
Would it actually increase the performance? I cannot find the answer to that question here.
The question is about adding to a database through EntityFramework. What does an sql select query have to do with that? I fail to see how this answer would help the OP.
Some will say yes, others will say "it depends" First off query performance is a function of how the query is written. There's a lot of folks who know the EF side on writing well performing queries, however, there's a lot more DBAs who understand query performance using tools like SSMS. Bottom line is that the closer you get to the "metal" the faster things go. If speed is primary importance then interfacing directly to SQL is good.
I'm sorry, but I also fail to see this as an answer... Perhaps you misinterpreted my question?
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.