1

I have a C# program that currently downloads data from several sites synchronously and then does some work on the downloaded data. I am trying to change it so that the downloads run asynchronously and the processing happens once they are done, but I am having trouble with the sequencing. Below is a snapshot of the code I am using:

    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("Started URL downloader");
            UrlDownloader d = new UrlDownloader();
            d.Process();
            Console.WriteLine("Finished URL downloader");
            Console.ReadLine();
        }
    }

    class UrlDownloader
    {
        public void Process()
        {
            List<string> urls = new List<string>() {
                "http://www.stackoverflow.com",
                "http://www.microsoft.com",
                "http://www.apple.com",
                "http://www.google.com"
            };

            foreach (var url in urls)
            {
                WebClient Wc = new WebClient();
                Wc.OpenReadCompleted += new OpenReadCompletedEventHandler(DownloadDataAsync);
                Uri varUri = new Uri(url);
                Wc.OpenReadAsync(varUri, url);
            }
        }

        void DownloadDataAsync(object sender, OpenReadCompletedEventArgs e)
        {
            StreamReader k = new StreamReader(e.Result);
            string temp = k.ReadToEnd();
            PrintWebsiteTitle(temp, e.UserState as string);
        }

        void PrintWebsiteTitle(string temp, string source)
        {
            Regex reg = new Regex(@"<title[^>]*>(.*)</title[^>]*>");
            string title = reg.Match(temp).Groups[1].Value;

            Console.WriteLine(new string('*', 10));
            Console.WriteLine("Source: {0}, Title: {1}", source, title);
            Console.WriteLine(new string('*', 10));
        }
    }

Essentially, my problem is this. My output from above is:

    Started URL downloader
    Finished URL downloader
    "Results of d.Process()"

What I want to do is complete the d.Process() method and then return to the "Main" method in my Program class. So, the output I am looking for is:

    Started URL downloader
    "Results of d.Process()"
    Finished URL downloader

My d.Process() method runs asynchronously, but I can't figure out how to wait for all of my processing to complete before returning to my Main method. Any ideas on how to do this in C# 4.0? I am not sure how I'd go about 'telling' my Process() method to wait until all its asynchronous activity is complete before returning to the Main method.

3
  • Multiple questions already cover asynchronous operations; one example: stackoverflow.com/questions/6906778/… Commented Jul 13, 2012 at 17:06
  • What version of C# are you using? .NET 4.0 provides the TPL, which is built around the Task object. Commented Jul 13, 2012 at 17:07
  • You could just use OpenRead, which runs synchronously and blocks the current thread. Commented Jul 13, 2012 at 17:13

2 Answers

8

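If you are on .NET 4.0 or later, you can use the TPL (Task Parallel Library). Parallel.ForEach does not return until every iteration has finished, so Process() only returns to Main once all the downloads are done: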

    Parallel.ForEach(urls, url =>
    {
        // WebClient is IDisposable, so release it once the download is done.
        using (WebClient Wc = new WebClient())
        {
            string page = Wc.DownloadString(url);
            PrintWebsiteTitle(page);
        }
    });

I would also use HtmlAgilityPack to parse the page instead of a regex.

    // Requires "using System.Linq;" for First().
    void PrintWebsiteTitle(string page)
    {
        HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
        doc.LoadHtml(page);
        Console.WriteLine(doc.DocumentNode.Descendants("title").First().InnerText);
    }
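
To see why this gives the ordering asked for in the question, here is a minimal sketch that can be run without network access. It assumes Thread.Sleep as a stand-in for the blocking Wc.DownloadString call, and BlockingDemo and ProcessAll are hypothetical names used only for this illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

static class BlockingDemo
{
    // Returns how many items had been processed at the moment
    // Parallel.ForEach handed control back to the caller.
    public static int ProcessAll(string[] urls)
    {
        var processed = new ConcurrentBag<string>();

        Parallel.ForEach(urls, url =>
        {
            // Assumption: Thread.Sleep stands in for a blocking download.
            Thread.Sleep(200);
            processed.Add(url);
        });

        // This line runs only after every iteration above has completed.
        return processed.Count;
    }
}
```

Calling BlockingDemo.ProcessAll with the four URLs from the question returns 4, which is only possible if every iteration ran before the method returned. Printing "Finished URL downloader" after such a call therefore always comes last.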

1 Comment

Thanks @L.B this works great! I'm new to asynchronous programming so wasn't familiar with TPL.
0

I would recommend using WebClient.DownloadDataAsync instead of writing your own. You could then use the Task Parallel Library to wrap the call to DownloadDataAsync in a TaskCompletionSource to get multiple Task objects you can wait on or continue with:

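    var webClient = new WebClient();
    var tcs = new TaskCompletionSource<byte[]>();

    // Subscribe before starting the download so the completion event cannot be missed.
    webClient.DownloadDataCompleted += (s, e) => { tcs.TrySetResult(e.Result); };
    webClient.DownloadDataAsync(myUri);

    if (wait)
    {
        // Block the calling thread until the download has finished.
        tcs.Task.Wait();
        Console.WriteLine("got {0} bytes", tcs.Task.Result.Length);
    }
    else
    {
        // Do not block; run the continuation when the download finishes.
        tcs.Task.ContinueWith(t => Console.WriteLine("got {0} bytes", t.Result.Length));
    }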

To handle error conditions, you can expand the use of the TaskCompletionSource:

    webClient.DownloadDataCompleted += (s, e) =>
    {
        if (e.Error != null)
            tcs.SetException(e.Error);
        else if (e.Cancelled)
            tcs.SetCanceled();
        else
            tcs.TrySetResult(e.Result);
    };
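
Put together, the pieces above could be packaged as one reusable method. This is only a sketch: WebClientTasks and DownloadDataTask are hypothetical names, not part of WebClient, and it uses the TrySet* methods throughout so that a second completion attempt cannot throw.

```csharp
using System;
using System.Net;
using System.Threading.Tasks;

static class WebClientTasks
{
    // Hypothetical helper: wraps the event-based DownloadDataAsync
    // call in a Task<byte[]> that callers can wait on or continue with.
    public static Task<byte[]> DownloadDataTask(Uri uri)
    {
        var tcs = new TaskCompletionSource<byte[]>();
        var webClient = new WebClient();

        webClient.DownloadDataCompleted += (s, e) =>
        {
            // Check Error and Cancelled before touching e.Result,
            // because e.Result throws if the download did not succeed.
            if (e.Error != null)
                tcs.TrySetException(e.Error);
            else if (e.Cancelled)
                tcs.TrySetCanceled();
            else
                tcs.TrySetResult(e.Result);

            webClient.Dispose();
        };

        // The handler is attached first, then the download is started.
        webClient.DownloadDataAsync(uri);
        return tcs.Task;
    }
}
```

A caller can then write WebClientTasks.DownloadDataTask(myUri).Wait() to block, or attach ContinueWith to the returned task, exactly as in the snippet with the wait flag.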

To do something similar with multiple tasks:

    Task.WaitAll(tcs.Task, tcs2.Task);

or

    Task.Factory.ContinueWhenAll(new Task[] { tcs.Task, tcs2.Task }, ts =>
    {
        /* do something with all the results */
    });
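
Applied to the question, the same idea might look like the sketch below: start one task per URL, report each result in a continuation, and block on Task.WaitAll so that Process() returns only when everything is finished. The DownloadStringTask and Report helpers and the urls parameter are assumptions made for this example; the reporting just prints the page length instead of parsing the title.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Threading.Tasks;

class UrlDownloader
{
    // Starts every download, then blocks until all of them have completed.
    public void Process(IEnumerable<string> urls)
    {
        Task[] tasks = urls
            .Select(url => DownloadStringTask(url).ContinueWith(t => Report(url, t)))
            .ToArray();

        // Returns only after every continuation above has run.
        Task.WaitAll(tasks);
    }

    // Hypothetical helper: wraps DownloadStringAsync in a Task<string>.
    static Task<string> DownloadStringTask(string url)
    {
        var tcs = new TaskCompletionSource<string>();
        var webClient = new WebClient();

        webClient.DownloadStringCompleted += (s, e) =>
        {
            if (e.Error != null) tcs.TrySetException(e.Error);
            else if (e.Cancelled) tcs.TrySetCanceled();
            else tcs.TrySetResult(e.Result);
        };

        webClient.DownloadStringAsync(new Uri(url));
        return tcs.Task;
    }

    // Hypothetical reporting step; a failed download is printed, not rethrown.
    static void Report(string url, Task<string> t)
    {
        if (t.IsFaulted)
            Console.WriteLine("Source: {0}, failed: {1}", url, t.Exception.InnerException.Message);
        else if (t.IsCanceled)
            Console.WriteLine("Source: {0}, cancelled", url);
        else
            Console.WriteLine("Source: {0}, got {1} characters", url, t.Result.Length);
    }
}
```

With this shape, Main can print "Started URL downloader", call d.Process(urls), and then print "Finished URL downloader", and the per-site lines appear in between. The order of the per-site lines themselves is not fixed, since the downloads complete independently.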
