I'm using Entity Framework to build a database. There are two models: Workers and Skills. Each Worker has zero or more Skills. I initially read this data into memory from a CSV file and store it in a dictionary called allWorkers. Next, I write the data to the database like this:

    // Populate database
    using (var db = new SolverDbContext())
    {
        // Add all distinct skills to database
        db.Skills.AddRange(allSkills
            .Distinct(StringComparer.InvariantCultureIgnoreCase)
            .Select(s => new Skill
            {
                Reference = s
            }));

        db.SaveChanges(); // Very quick
        var dbSkills = db.Skills.ToDictionary(k => k.Reference, v => v);

        // Add all workers to database
        var workforce = allWorkers.Values
            .Select(i => new Worker
            {
                Reference = i.EMPLOYEE_REF,
                Skills = i.GetSkills().Select(s => dbSkills[s]).ToArray(),
                DefaultRegion = "wa",
                DefaultEfficiency = i.TECH_EFFICIENCY
            });

        db.Workers.AddRange(workforce);
        db.SaveChanges(); // This call takes 00:05:00.0482197
    }

The last db.SaveChanges(); takes over five minutes to execute, which I feel is far too long. I ran SQL Server Profiler while the call was executing, and what I found was thousands of calls to:

INSERT [dbo].[SkillWorkers]([Skill_SkillId], [Worker_WorkerId]) VALUES (@0, @1) 

There are 16,027 rows being added to SkillWorkers, which is a fair amount of data but not huge by any means. Is there any way to optimize this code so it doesn't take five minutes to run?

Update: I've looked at other possible duplicates, such as this one, but I don't think they apply. First, I'm not bulk adding anything in a loop. I'm making a single call to db.SaveChanges(); after every row has been added to db.Workers. This should be the fastest way to bulk insert. Second, I've set db.Configuration.AutoDetectChangesEnabled to false. The SaveChanges() call now takes 00:05:11.2273888 (in other words, about the same). I don't think this really matters, since every row is new and there are no changes to detect.

I think what I'm looking for is a way to issue a single INSERT statement containing all 16,000 skill rows.
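To make that idea concrete, here is a minimal sketch of what batching the join-table rows might look like. It is not something Entity Framework does for you. The table and column names are taken from the Profiler output above; the class SkillWorkerSql and the method BuildBatchedInserts are hypothetical names made up for illustration. It assumes the Skill and Worker keys are integers that have already been generated, and it relies on SQL Server accepting at most 1,000 rows in one VALUES list, so 16,027 rows would need 17 statements rather than one.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical helper, not part of Entity Framework.
public static class SkillWorkerSql
{
    // SQL Server allows at most 1000 rows in a single VALUES list.
    public const int MaxRowsPerInsert = 1000;

    // Turns (skillId, workerId) pairs into multi-row INSERT statements.
    // The ids are integers, so writing them as literals is safe here.
    // Because this is an iterator, the argument check runs when enumeration starts.
    public static IEnumerable<string> BuildBatchedInserts(
        IEnumerable<Tuple<int, int>> pairs,
        int batchSize = MaxRowsPerInsert)
    {
        if (batchSize < 1 || batchSize > MaxRowsPerInsert)
            throw new ArgumentOutOfRangeException(nameof(batchSize));

        var batch = new List<Tuple<int, int>>(batchSize);
        foreach (var pair in pairs)
        {
            batch.Add(pair);
            if (batch.Count == batchSize)
            {
                yield return BuildInsert(batch);
                batch.Clear();
            }
        }

        // Emit whatever is left over as a final, smaller statement.
        if (batch.Count > 0)
            yield return BuildInsert(batch);
    }

    private static string BuildInsert(IEnumerable<Tuple<int, int>> rows)
    {
        var values = rows.Select(r =>
            "(" + r.Item1.ToString(CultureInfo.InvariantCulture) +
            ", " + r.Item2.ToString(CultureInfo.InvariantCulture) + ")");

        return "INSERT [dbo].[SkillWorkers]([Skill_SkillId], [Worker_WorkerId]) VALUES "
            + string.Join(", ", values) + ";";
    }
}
```

Under these assumptions, the workers would first be saved without their Skills collections so that their keys exist, and each generated statement could then be run with db.Database.ExecuteSqlCommand(sql). I have not measured whether this is actually faster for this data set.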

1 Answer

One easy method is to use the EntityFramework.BulkInsert extension.

You can then do:

    // Add all workers to database
    var workforce = allWorkers.Values
        .Select(i => new Worker
        {
            Reference = i.EMPLOYEE_REF,
            Skills = i.GetSkills().Select(s => dbSkills[s]).ToArray(),
            DefaultRegion = "wa",
            DefaultEfficiency = i.TECH_EFFICIENCY
        });

    db.BulkInsert(workforce);