2

I am trying to block RequestHandler.ParseAll() with await ConsumerTask;, but when i set a breakpoint there, i ALWAYS get the "Done..." output first... and then Parse2() fails with a NullReferenceException. (thats my guess: "the GC starts cleaning up because _handler got out of scope")

Anyway, I can't figure out why that happens.

class MainClass { public async void DoWork() { RequestHandler _handler = new RequestHandler(); string[] mUrls; /* fill mUrls here with values */ await Task.Run(() => _handler.ParseSpecific(mUrls)); Console.WriteLine("Done..."); } } static class Parser { public static async Task<IEnumerable<string>> QueryWebPage(string url) { /*Query the url*/ } public static async Task Parse1(Query query) { Parallel.ForEach(/*Process data here*/); } public static async Task Parse2(Query query) { foreach(string line in query.WebPage) /* Here i get a NullReference exception because query.WebPage == null */ } } sealed class RequestHandler { private BlockingCollection<Query> Queue; private Task ConsumerTask = Task.Run(() => /* call consume() for each elem in the queue*/); private async void Consume(Query obj) { await (obj.BoolField ? Parser.Parse1(obj) : Parser.Parse2(obj)); } public async void ParseSpecific(string[] urls) { foreach(string v in urls) Queue.Add(new Query(await QueryWebPage(v), BoolField: false)); Queue.CompleteAdding(); await ConsumerTask; await ParseAll(true); } private async Task ParseAll(bool onlySome) { ReInit(); Parallel.ForEach(mCollection, v => Queue.Add(new Query(url, BoolField:false))); Queue.CompleteAdding(); await ConsumerTask; /* Process stuff further */ } } struct Query { public readonly string[] WebPage; public readonly bool BoolField; public Query(uint e, IEnumerable<string> page, bool b) : this() { Webpage = page.ToArray(); BoolField = b; } } 
7
  • Your example is pretty complicated (more so for having lots of async methods which don't have an Async suffix) - are you able to cut it down at all? Commented Mar 15, 2013 at 20:00
  • 2
    ParseSpecific returns on its first await statement. You don't await ParseSpecific Commented Mar 15, 2013 at 20:00
  • @JonSkeet: I cut it down as far as possible, without trying to leave anything important away... but i will try. Commented Mar 15, 2013 at 20:01
  • @Nefarion: CodesInChaos has figured it out in his comment - and I'm elaborating. Commented Mar 15, 2013 at 20:02
  • @CodesInChaos: I hope you don't mind me basically jumping on the fact that you spotted what was wrong :) Well caught, sir! Commented Mar 15, 2013 at 20:09

1 Answer 1

6

CodesInChaos has spotted the problem in comments. It stems from having async methods returning void, which you should almost never do - it means you've got no way to track them.

Instead, if your async methods don't have any actual value to return, you should just make them return Task.

What's happening is that ParseSpecific is only running synchronously until the first await QueryWebPage(v) that doesn't complete immediately. It's then returning... so the task started here:

await Task.Run(() => _handler.ParseSpecific(mUrls)); 

... completes immediately, and "Done" gets printed.

Once you've made all your async methods return Task, you can await them. You also won't need Task.Run at all. So you'd have:

public async void DoWork() { RequestHandler _handler = new RequestHandler(); string[] mUrls; await _handler.ParseSpecific(mUrls); Console.WriteLine("Done..."); } 

...

public async TaskParseSpecific(string[] urls) { foreach(string v in urls) { // Refactored for readability, although I'm not sure it really // makes sense now that it's clearer! Are you sure this is what // you want? var page = await QueryWebPage(v); Queue.Add(new Query(page, false); } Queue.CompleteAdding(); await ConsumerTask; await ParseAll(true); } 

Your Reinit method also needs changing, as currently the ConsumerTask will basically complete almost immediately, as Consume will return immediately as it's another async method returning void.

To be honest, what you've got looks very complex, without a proper understanding of async/await. I would read up more on async/await and then probably start from scratch. I strongly suspect you can make this much, much simpler. You might also want to read up on TPL Dataflow which is designed to make producer/consumer scenarios simpler.

Sign up to request clarification or add additional context in comments.

4 Comments

@Nefarion Just returning Task doesn't help. You still need to await it. And you only need TaskEx.Run if you want it to run on a separate thread(you probably don't).
Thank you two for the huge help, i did not know about the "async void" methods not beeing awaitable :) I fixed that all, and now i get a funny little error here: Parallel.ForEach(mCollection, v => Queue.Add(new Query(url, BoolField:false))); about the BlockingCollection not beeing open for writing, so i assume it continued without waiting for the parallel loop to complete... Any suggestions?
@Nefarion: Well as I said, there's a lot more to fix in your code - but rather than just try to fix it, I'd take a step back, learn more about async/await, and also look at Dataflow. Then take another look at your problem with a fresh perspective.
Thanks all for your help! @JonSkeet: My programm is made up from a few dll's and many other classes, i simplified and reduced the example as far as possible, and left out methods where i could, so i could give an overview. The real programm looks a lot different, but the example explained the situation i had very well, so i chose it. I just replaced the Parallel.ForEach with a simple foreach loop, that works fine, and i will have a look at the problem tonight. Thank you for the great tip with the async void's :)

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.