16 events
when what by license comment
May 27, 2020 at 21:00 history tweeted twitter.com/StackSoftEng/status/1265749654647472130
Apr 12, 2020 at 17:51 comment added candied_orange On a physical hard drive, seek time is a function not only of where the file is physically located but also of where the head was before you asked for it. Keep in mind you're likely not the only thing moving the head. To see how much this is affecting you, set up a RAM disk and test against that. It should stabilize your variance (and be orders of magnitude faster). It doesn't solve the problem, but it makes clear where it lies.
Apr 12, 2020 at 9:52 comment added Doc Brown @tera_789: the CPU usage you observe is a typical characteristic of an I/O bound problem, not very astonishing.
Apr 12, 2020 at 9:36 comment added tera_789 @DocBrown The thing is that CPU usage never goes really high when I run these scans (it is mostly 20-30%), sometimes even less.
Apr 12, 2020 at 9:17 answer added gnasher729 timeline score: 0
Apr 12, 2020 at 9:16 comment added tera_789 @DocBrown yeah, I see that... it fluctuates a lot, which makes it hard to make a decision. At this point, I am starting to think that either the network or the storage device's OS is playing a big role here... sometimes ThreadPoolExecutor is faster (as in most tests), and sometimes ProcessPoolExecutor is...
Apr 12, 2020 at 9:11 comment added Doc Brown @tera_789: the test results don't seem to support what you wrote in your question - for 2 of them, ProcessPoolExecutor is faster, but for 3 of them, ThreadPoolExecutor.
Apr 12, 2020 at 8:21 history edited tera_789 CC BY-SA 4.0
added test results
Apr 12, 2020 at 8:17 comment added tera_789 @DocBrown I added test results
Apr 12, 2020 at 8:16 history edited tera_789 CC BY-SA 4.0
added test results
Apr 12, 2020 at 7:43 comment added Euphoric My gut instinct in this situation is that you cannot do any optimization unless you work at the OS or even the HW layer. Maybe try to find OS APIs that return the size of the whole directory, instead of computing it yourself? This would give the OS a way to use its own optimizations.
Apr 12, 2020 at 7:21 history edited Doc Brown CC BY-SA 4.0
Fixed wrong usage of the term parallelism
Apr 12, 2020 at 7:01 answer added Tfry timeline score: 4
Apr 12, 2020 at 6:57 answer added Netch timeline score: 2
Apr 11, 2020 at 22:30 review First posts
Apr 12, 2020 at 11:25
Apr 11, 2020 at 22:29 history asked tera_789 CC BY-SA 4.0