Skip to content

Conversation

@Swatinem
Copy link
Contributor

Currently, Cursor.each uses a really inefficient process.nextTick -> nextObject() -> recurse to self
loop.

This patch avoids this behavior and rather loops over a batch directly, providing a nice speedup up to 2x depending on workload.
I’ve run make test and its passing without issues.

The benchmarks I have done are available at Swatinem/mongobench. I can observe the following speedups:
line
(jitter graph is in log scale)
jitter

The patched version however uses more memory:
heap-total

I might need some help fixing that.

@christkv
Copy link
Contributor

Higher memory usage would be normal in this case as you're keeping around more items in each stack frame. process.nextTick with nextObject means the stack frame is tiny for each. But if you have a couple of thousands of objects in each process.nextTick GC will have to do more work collecting them as the last graph shows pretty well. More memory usage pr say is not a problem but it might be worth benchmarking it with higher concurrency (like 50-100 clients running cursors on the same size of data) and see what the memory impact is. I have been thinking about looking into using trampolining instead of process.nextTick but have had no luck getting it to work.

By the way process.nextTick is not necessarily a bad thing. Think of it as yielding to the event loop so it can schedule other concurrent in flight work.

@Swatinem
Copy link
Contributor Author

I have updated my benchmark to run with concurrency. You can also grab the code here to run the tests yourself.

A few more pretty graphs with concurrency=100

line
jitter
heap-used

The memory numbers are now actually better with my patch. I also noticed that I’m actually using Cursor.toArray() instead of .each() which surely adds some memory overhead.

Regarding process.nextTick:
nodejs/node-v0.x-archive@8546d18 got me a little worried. Seems like nextTick will be completely overhauled for 0.10.
However, I haven’t actually tried it so I can’t really say if it really is an issue for this workload.

@Swatinem
Copy link
Contributor Author

Since my use case involves Cursor.toArray, why not use batches there too instead of delegating to .each…
With some minor array optimizations, I get a small speedup as well
jitter
And with concurrency = 100
jitter

@christkv
Copy link
Contributor

I can't seem to get the code you provided to run, I assume it's using some stuff I cannot see in your environment. Can your provide some simple instructions on how to run it and generate the graphs as I'm interested in just playing with it a bit before taking a decision on the pull request. So far it looks good though but I need to test with varying sizes of documents being returned as well as concurrency.

@Swatinem
Copy link
Contributor Author

I’ve added a README: Swatinem/mongobench@cd43bb0

Should be straight forward, just copy/symlink mongodb into $directory and run the nodejs script and save the output to a .tsv file.
Then you need R to create the graphs.

@christkv christkv merged commit 3dee6b7 into mongodb:master Dec 19, 2012
@christkv
Copy link
Contributor

Had to back out this checkin due to problems with replicasets when it was in, will investigate it again after christmas

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

2 participants