I'm using Entity Framework to build a database. There are two models: Worker and Skill. Each Worker has zero or more Skills. I initially read this data into memory from a CSV file and store it in a dictionary called allWorkers. Next, I write the data to the database as follows:
    // Populate database
    using (var db = new SolverDbContext())
    {
        // Add all distinct skills to database
        db.Skills.AddRange(allSkills
            .Distinct(StringComparer.InvariantCultureIgnoreCase)
            .Select(s => new Skill
            {
                Reference = s
            }));

        db.SaveChanges(); // Very quick
        var dbSkills = db.Skills.ToDictionary(k => k.Reference, v => v);

        // Add all workers to database
        var workforce = allWorkers.Values
            .Select(i => new Worker
            {
                Reference = i.EMPLOYEE_REF,
                Skills = i.GetSkills().Select(s => dbSkills[s]).ToArray(),
                DefaultRegion = "wa",
                DefaultEfficiency = i.TECH_EFFICIENCY
            });

        db.Workers.AddRange(workforce);
        db.SaveChanges(); // This call takes 00:05:00.0482197
    }

The last db.SaveChanges(); takes over five minutes to execute, which I feel is far too long. I ran SQL Server Profiler as the call is executing, and basically what I found was thousands of calls to:
    INSERT [dbo].[SkillWorkers]([Skill_SkillId], [Worker_WorkerId])
    VALUES (@0, @1)

There are 16,027 rows being added to SkillWorkers, which is a fair amount of data but not huge by any means. Is there any way to optimize this code so it doesn't take 5min to run?
Update: I've looked at other possible duplicates, such as this one, but I don't think they apply. First, I'm not bulk adding anything in a loop. I'm doing a single call to db.SaveChanges(); after every row has been added to db.Workers. This should be the fastest way to bulk insert. Second, I've set db.Configuration.AutoDetectChangesEnabled to false. The SaveChanges() call now takes 00:05:11.2273888 (In other words, about the same). I don't think this really matters since every row is new, thus there are no changes to detect.
I think what I'm looking for is a way to issue a single INSERT statement containing all 16,000 skill assignments, rather than one statement per row.
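To illustrate the kind of batched write I have in mind, here is a minimal sketch that bypasses Entity Framework for the join table and uses SqlBulkCopy from System.Data.SqlClient. It is only a sketch under assumptions: the class and method names (SkillWorkerBulkLoader, BuildLinkTable, BulkInsertLinks) and the connection string are hypothetical, I'm assuming both keys are int, and I'm assuming the workers and skills have already been saved without their Skills collections so that their generated IDs are known. The table and column names are the ones from the profiler output above.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

// Hypothetical helper, not part of the code in the question.
public static class SkillWorkerBulkLoader
{
    // Builds an in-memory table whose columns mirror dbo.SkillWorkers.
    // Each pair is (skill id, worker id); both are assumed to be int keys.
    public static DataTable BuildLinkTable(IEnumerable<KeyValuePair<int, int>> skillWorkerIds)
    {
        var table = new DataTable("SkillWorkers");
        table.Columns.Add("Skill_SkillId", typeof(int));
        table.Columns.Add("Worker_WorkerId", typeof(int));

        foreach (var pair in skillWorkerIds)
        {
            table.Rows.Add(pair.Key, pair.Value);
        }

        return table;
    }

    // Sends all rows to SQL Server in one bulk operation
    // instead of one INSERT statement per row.
    public static void BulkInsertLinks(string connectionString, DataTable links)
    {
        using (var bulk = new SqlBulkCopy(connectionString))
        {
            bulk.DestinationTableName = "dbo.SkillWorkers";
            bulk.ColumnMappings.Add("Skill_SkillId", "Skill_SkillId");
            bulk.ColumnMappings.Add("Worker_WorkerId", "Worker_WorkerId");
            bulk.WriteToServer(links);
        }
    }
}
```

The idea would be to call BuildLinkTable with every (skill ID, worker ID) pair and then BulkInsertLinks once. Whether this fits cleanly alongside an EF-managed many-to-many relationship is exactly what I'm unsure about.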