
Is there any chance that multiple BackgroundWorkers perform better than Tasks on 5-second running processes? I remember reading in a book that a Task is designed for short-running processes.

The reason I ask is this:

I have a process that takes 5 seconds to complete, and there are 4000 processes to complete. At first I did:

for (int i = 0; i < 4000; i++) { Task.Factory.StartNew(action); }

and this performed poorly (after the first minute, only 3-4 tasks were completed, and the console application had 35 threads). Maybe this was stupid, but I thought the thread pool would handle this kind of situation (put all the actions in a queue, and when a thread is free, have it take an action and execute it).

The second step was to manually create Environment.ProcessorCount background workers and place all the actions in a ConcurrentQueue. So the code looked something like this:

 var workers = new List&lt;BackgroundWorker&gt;();
 // initialize workers
 workers.ForEach((bk) =>
 {
     bk.DoWork += (s, e) =>
     {
         while (toDoActions.Count > 0)
         {
             Action a;
             if (toDoActions.TryDequeue(out a))
             {
                 a();
             }
         }
     };
     bk.RunWorkerAsync();
 });

This performed way better. It performed much better than the tasks even when I had 30 background workers (about as many threads as in the first case).

Later edit:

I start the Tasks like this:

 public static Task IndexFile(string file)
 {
     Action&lt;object&gt; indexAction = new Action&lt;object&gt;((f) =>
     {
         Index((string)f);
     });
     return Task.Factory.StartNew(indexAction, file);
 }

And the Index method is this one:

 private static void Index(string file)
 {
     AudioDetectionServiceReference.AudioDetectionServiceClient client =
         new AudioDetectionServiceReference.AudioDetectionServiceClient();
     client.IndexCompleted += (s, e) =>
     {
         if (e.Error != null)
         {
             if (FileError != null)
             {
                 FileError(client, new FileIndexErrorEventArgs((string)e.UserState, e.Error));
             }
         }
         else
         {
             if (FileIndexed != null)
             {
                 FileIndexed(client, new FileIndexedEventArgs((string)e.UserState));
             }
         }
     };

     using (IAudio proxy = new BassProxy())
     {
         List&lt;int&gt; max = new List&lt;int&gt;();
         if (proxy.ReadFFTData(file, out max))
         {
             // trim leading and trailing zeros
             while (max.Count > 0 && max.First() == 0) { max.RemoveAt(0); }
             while (max.Count > 0 && max.Last() == 0) { max.RemoveAt(max.Count - 1); }
             client.IndexAsync(max.ToArray(), file, file);
         }
         else
         {
             throw new CouldNotIndexException(file, "The audio proxy did not return any data for this file.");
         }
     }
 }

This method reads some data from an mp3 file, using the Bass.net library. That data is then sent to a WCF service, using the async method. The IndexFile(string file) method, which creates the tasks, is called 4000 times in a for loop. Those two events, FileIndexed and FileError, are not handled anywhere, so they are never raised.

  • You might want to use BlockingCollection rather than ConcurrentQueue (it will use a ConcurrentQueue internally). It will make the code a bit cleaner and easier to use. Commented May 24, 2012 at 19:37
  • Thanks for the tip...I will change :) Commented May 24, 2012 at 19:39
  • Have you tried Parallel.Invoke with an array of actions? Commented May 24, 2012 at 19:41
  • Hmm...only 3-4 operations were completed in 1 minute? If they really do average 5 seconds then something is off here...way off. I'd be interested in seeing more about how you start the Tasks. Commented May 24, 2012 at 19:41
  • @BrianGideon agreed. I'm wondering if there's something about what the tasks are doing that's causing them to step on each other's toes or create bottlenecks from concurrency. (database deadlocks and the sort) Commented May 24, 2012 at 19:46
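For what it's worth, the BlockingCollection approach from the first comment could be sketched like this. The item count, the no-op actions, and the worker count below are placeholders, not the question's real workload:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// BlockingCollection wraps a ConcurrentQueue by default and adds
// blocking and completion semantics on top of it.
var toDoActions = new BlockingCollection<Action>();
int completed = 0;

for (int i = 0; i < 100; i++)
{
    toDoActions.Add(() => Interlocked.Increment(ref completed));
}
toDoActions.CompleteAdding(); // signal that no more items will arrive

var workers = new Task[Environment.ProcessorCount];
for (int w = 0; w < workers.Length; w++)
{
    workers[w] = Task.Factory.StartNew(() =>
    {
        // GetConsumingEnumerable blocks while the collection is empty and
        // exits cleanly once it is empty AND marked complete.
        foreach (Action a in toDoActions.GetConsumingEnumerable())
        {
            a();
        }
    }, TaskCreationOptions.LongRunning);
}
Task.WaitAll(workers);
Console.WriteLine(completed); // all 100 actions ran
```

Compared with the hand-rolled TryDequeue loop, the workers here do not spin when the queue is momentarily empty, and the CompleteAdding call gives them a clean shutdown signal.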

3 Answers


The reason the performance of the Tasks was so poor is that you queued too many small tasks (4000). Remember that the CPU needs to schedule the tasks as well, so spawning lots of short-lived tasks adds extra scheduling load for the CPU. More information can be found in the second paragraph of the TPL documentation:

Starting with the .NET Framework 4, the TPL is the preferred way to write multithreaded and parallel code. However, not all code is suitable for parallelization; for example, if a loop performs only a small amount of work on each iteration, or it doesn't run for many iterations, then the overhead of parallelization can cause the code to run more slowly.

When you used the background workers, you limited the number of concurrently alive threads to Environment.ProcessorCount, which greatly reduced the scheduling overhead.
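To illustrate the point, the same total amount of work can be split across a handful of long-lived tasks instead of one task per item. This is only a sketch; the Interlocked.Increment call stands in for the real 5-second job:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

const int totalItems = 4000;
int workerCount = Environment.ProcessorCount;
int processed = 0;

// One long-lived task per core, each handling a stride of the work,
// instead of 4000 separate short-lived tasks.
var workers = new Task[workerCount];
for (int w = 0; w < workerCount; w++)
{
    int start = w; // capture a fresh copy per iteration
    workers[w] = Task.Factory.StartNew(() =>
    {
        for (int i = start; i < totalItems; i += workerCount)
        {
            Interlocked.Increment(ref processed); // stand-in for the 5-second job
        }
    }, TaskCreationOptions.LongRunning);
}
Task.WaitAll(workers);
Console.WriteLine(processed); // 4000
```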




Given that you have a strictly defined list of things to do, I'd use the Parallel class (either For or ForEach depending on what suits you better). Furthermore you can pass a configuration parameter to any of these methods to control how many tasks are actually performed at the same time:

 System.Threading.Tasks.Parallel.For(0, 20000,
     new ParallelOptions() { MaxDegreeOfParallelism = 5 },
     i =>
     {
         // do something
     });

The above code will perform 20000 operations, but will NOT perform more than 5 operations at the same time.

I SUSPECT the reason the background workers did better for you was because you had them created and instantiated at the start, while in your sample Task code it seems you're creating a new Task object for every operation.

Alternatively, did you think about using a fixed number of Task objects instantiated at the start and then performing a similar action with a ConcurrentQueue like you did with the background workers? That should also prove to be quite efficient.
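That alternative might look roughly like the following sketch (the queue contents and the counts are made up for illustration):

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

var toDoActions = new ConcurrentQueue<Action>();
int completed = 0;

for (int i = 0; i < 200; i++)
{
    toDoActions.Enqueue(() => Interlocked.Increment(ref completed));
}

// A fixed number of Tasks created up front, each draining the shared
// queue -- the same pattern as the BackgroundWorker version above.
var workers = new Task[Environment.ProcessorCount];
for (int w = 0; w < workers.Length; w++)
{
    workers[w] = Task.Factory.StartNew(() =>
    {
        Action a;
        while (toDoActions.TryDequeue(out a))
        {
            a();
        }
    });
}
Task.WaitAll(workers);
Console.WriteLine(completed); // 200
```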



Have you considered using the ThreadPool directly?

http://msdn.microsoft.com/en-us/library/system.threading.threadpool.aspx

If your performance is slower when using threads, it can only be due to threading overhead (allocating and destroying individual threads).
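A minimal ThreadPool sketch along those lines, using a CountdownEvent (my addition, not from the question) to wait for all the callbacks:

```csharp
using System;
using System.Threading;

int completed = 0;
const int items = 50;

using (var done = new CountdownEvent(items))
{
    for (int i = 0; i < items; i++)
    {
        // QueueUserWorkItem reuses pool threads, so no thread is
        // allocated and destroyed per work item.
        ThreadPool.QueueUserWorkItem(_ =>
        {
            Interlocked.Increment(ref completed); // stand-in for real work
            done.Signal();
        });
    }
    done.Wait(); // block until all 50 callbacks have run
    Console.WriteLine(completed); // 50
}
```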

1 Comment

Well, aren't tasks retrieved from the ThreadPool? I am creating Task objects, not Threads.
