1

I have this code periodically calls the load function which does very load work taking 10sec. Problem is when load function is being executed, it's blocking the main flow. If I send a simple GET request (like a health check) while load is being executed, the GET call is blocked until the load call is finished.

function setLoadInterval() { var self = this; this.interval = setInterval(function doHeavyWork() { // this takes 10 sec self.load(); self.emit('reloaded'); }, 20000); 

I tried async.parallel but still the GET call was blocked. I tried setTimeout but got the same result. How do I make load to running on background so that it doesn't block the main flow?

this.interval = setInterval(function doHeavyWork() { async.parallel([function(cb) { self.load(); cb(null); }], function(err) { if (err) { // log error } self.emit('reloaded'); }) }, 20000); 
4
  • 1
    What exacltly does the self.load(); do that takes 10 seconds? I mean if it can be done in separate process then just use one of fork|spawn|exec this way it won't block the event loop. Commented Jan 20, 2017 at 20:51
  • it reads 1M key values pairs in map and does some filtering. Commented Jan 20, 2017 at 21:08
  • So the question is isn't there a better way to do what you need? If the data are from DB why don't you let your db to the job for you? If you can be more specific there might be other way to do it maybe. Commented Jan 20, 2017 at 21:13
  • Thanks for your comment. I have a few alternatives but wanted to see if it's simple to run the load function completely on background .. looks like it's not easy in node world .. Commented Jan 20, 2017 at 21:18

1 Answer 1

1

Node.js is an event driven non-blocking IO model

Anything that is IO is offloaded as a separate thread in the underlying engine and hence parallelism is achieved. If the task is CPU intensive there is no way you can achieve parallelism as by default Javascript is a blocking sync language

However there are some ways to achieve this by offloading the CPU intensive task to a different process.

Option1:
exec or spawn a child process and execute the load() function in that spawned node app. This is okay if the interval fired is for every 20000 ms as by the time another one fired, the 10sec process will be completed.
Otherwise it is dangerous as it can spawn too many node applications eating up your Systems resources

Option2: I dont know how much data self.load() accepts and returns. If it is trivial and network overhead is acceptable, make that task a load-balanced web service (may be 4 webservers running in parallel) which accepts (rather point to) 1M records and returns back filtered records.

NOTE
It looks like you are using node async parallel function. But keep a note of this description from the documentation.

Note: parallel is about kicking-off I/O tasks in parallel, not about parallel execution of code. If your tasks do not use any timers or perform any I/O, they will actually be executed in series. Any synchronous setup sections for each task will happen one after the other. JavaScript remains single-threaded.

Sign up to request clarification or add additional context in comments.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.