Skip to main content

Timeline for Filter a large file quickly

Current License: CC BY-SA 3.0

15 events
when toggle format what by license comment
Sep 5, 2014 at 16:25 comment added Keith Randall @Lembik: the code as is only sorts B*R=27 bit numbers. You now have 29 bit numbers - you need one more pass (R++) or one more bit per pass (B++). B++ is probably easier, R is hardcoded in a few unrolled loops.
Sep 5, 2014 at 12:03 comment added user9206 I was just going back to this code and changed the data creation to be nolines = 5000000 and then print random.randint(0,100*nolines-1), random.randint(0,100*nolines-1), random.randint(0,100*nolines-1) . That is I multiplied the range by 100. Your code now runs very slowly. Do you know why by any chance?
May 15, 2014 at 4:34 comment added primo Absurdly fast. I couldn't even get parse time under 30s!
May 14, 2014 at 21:14 comment added wolfhammer For some reason a for loop was even faster. for(temp=0;*p != ' ';temp = 10 * temp + (*(p++) - '0')) {} lines[i].a = temp; p++;
May 14, 2014 at 21:02 comment added wolfhammer I noticed faster parsing times when doing the integer calculation in one line. Try:10 * temp + (*(p++) - '0');
May 14, 2014 at 20:19 comment added user9206 Your entry was just as fast as James_pic's but I couldn't award the bounty to both so I gave it to the first one. I hope that's OK. I am about to make a follow up question which I hope you will find fun too.
May 14, 2014 at 19:46 vote accept CommunityBot moved from User.Id=9206 by developer User.Id=3572
May 14, 2014 at 9:53 comment added user9206 @Geobits The top two submissions basically take the same amount of time. You could submit yours and we would have a three way race :) (Also, I plan to post a follow up challenge soon on the same topic.)
May 13, 2014 at 21:51 comment added Geobits Well damn. I literally just created filter.c to do the same thing, came to the question and found this. +1
May 13, 2014 at 17:18 comment added Keith Randall Radix sort works well in a cache because there is one stream of reads and N streams of writes (in my code, N=512). As long as your cache has N+1 cache lines, everything can remain in cache.
May 13, 2014 at 9:42 comment added James_pic YES! I love it. I had a feeling that cache locality would make joining on T faster, but I always figured the sort stage would offset any gains. Using Radix sort pretty much eliminates that.
May 13, 2014 at 8:19 comment added user9206 This is equal first. I am surprised radix sort is so fast as normally you think of it as having terrible cache performance. I think I will need to test in single user mode to tell them apart as the timings are not exactly the same on each run even with the same test file.
May 13, 2014 at 4:51 history edited Keith Randall CC BY-SA 3.0
fix spacing
May 13, 2014 at 4:45 history edited Keith Randall CC BY-SA 3.0
use a bcount array to quickly eliminate most window searches
May 13, 2014 at 4:08 history answered Keith Randall CC BY-SA 3.0