And yet another question about yield return
I need to execute different SQL scripts remotely. The scripts are stored in TFS, so I fetch them from TFS automatically; the process then iterates through all the files, reads their content into memory, and sends that content to the remote SQL servers.
So far the process works flawlessly, but some of the scripts will now contain bulk inserts, increasing the size of a script to 500,000 MB or more.
So I built the code "thinking" that I was reading the content of the file once in memory but now I have second thoughts.
This is what I have (oversimplified):
```csharp
public IEnumerable<SqlScriptSummary> Find(string scriptsPath)
{
    if (!Directory.Exists(scriptsPath))
    {
        throw new DirectoryNotFoundException(scriptsPath);
    }

    var path = new DirectoryInfo(scriptsPath);

    return path.EnumerateFiles("*.sql", SearchOption.TopDirectoryOnly)
        .Select(x =>
        {
            var script = new SqlScriptSummary
            {
                Name = x.Name,
                FullName = x.FullName,
                Content = File.ReadAllText(x.FullName, Encoding.Default)
            };

            return script;
        });
}

....

public void ExecuteScripts(string scriptsPath)
{
    foreach (var script in Find(scriptsPath))
    {
        _scriptRunner.Run(script.Content);
    }
}
```

My understanding is that `EnumerateFiles` will `yield return` one file at a time, which is what made me "think" that I was loading one file at a time into memory.
But...
Once I'm iterating over them in the `ExecuteScripts` method, what happens to the `script` variable used in the `foreach` loop after it goes out of scope? Is it disposed, or does it remain in memory?
If it remains in memory, that means that even though I'm using iterators (and `yield return` internally), once I have iterated through all of them they are all still in memory, right? So in the end it would be like using `ToList`, just with lazy execution. Is that right?

If the `script` variable is disposed when it goes out of scope, then I think I would be fine.
How could I redesign the code to optimize memory consumption, for example by forcing it to load the content of only one script into memory at a time?
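To make the redesign question concrete, here is a minimal sketch of one possible direction, not a confirmed solution. It assumes the summary object can carry only metadata and read the file on demand. The names `LazySqlScript`, `ReadContent`, and `ScriptFinder` are hypothetical and do not exist in the code above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;

// Hypothetical variant of SqlScriptSummary: it stores only metadata and
// reads the file content on demand instead of holding it in a property.
public sealed class LazySqlScript
{
    public LazySqlScript(string name, string fullName)
    {
        Name = name;
        FullName = fullName;
    }

    public string Name { get; }
    public string FullName { get; }

    // Each call reads the file again; nothing is cached on the instance.
    public string ReadContent()
    {
        return File.ReadAllText(FullName, Encoding.Default);
    }
}

public static class ScriptFinder
{
    // Lazily enumerates the *.sql files in the directory without reading them.
    public static IEnumerable<LazySqlScript> Find(string scriptsPath)
    {
        if (!Directory.Exists(scriptsPath))
        {
            throw new DirectoryNotFoundException(scriptsPath);
        }

        return new DirectoryInfo(scriptsPath)
            .EnumerateFiles("*.sql", SearchOption.TopDirectoryOnly)
            .Select(x => new LazySqlScript(x.Name, x.FullName));
    }
}
```

With this shape, the loop in `ExecuteScripts` would call `_scriptRunner.Run(script.ReadContent())`, so the content string is only referenced for the duration of that call, unless the runner itself keeps a reference to it. When the memory is actually reclaimed is still up to the garbage collector.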
Additional questions:
How can I test (unit/integration test) that I'm loading just one script at a time in memory?
How can I test (unit/integration test) that each script is released/not released from memory?
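For the question about release from memory, one technique that might apply is to track an object through a `WeakReference` and force a garbage collection. The following is only a sketch under assumptions: `MemoryReleaseCheck` and its methods are hypothetical names, and the result can vary with the runtime, the build configuration, and anything else that still references the object.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class MemoryReleaseCheck
{
    // Creates the object in a separate, non-inlined method so that no local
    // variable in the caller keeps it alive after this method returns.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static WeakReference Track(Func<object> factory)
    {
        return new WeakReference(factory());
    }

    // Returns true if the object produced by the factory was collected
    // after a forced, blocking garbage collection.
    public static bool IsReleasedAfterUse(Func<object> factory)
    {
        WeakReference weak = Track(factory);

        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        return !weak.IsAlive;
    }
}
```

A test could pass a factory that produces a script's content and assert on the returned value. A check like this only shows that nothing was still referencing the object at the moment of the forced collection; it does not prove that only one script is ever in memory at a time.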