Timeline for Filter a large file quickly

Current License: CC BY-SA 3.0

15 events

when toggle format	what		by	license	comment
Sep 5, 2014 at 16:25	comment	added	Keith Randall		@Lembik: the code as is only sorts B*R=27 bit numbers. You now have 29 bit numbers - you need one more pass (R++) or one more bit per pass (B++). B++ is probably easier, R is hardcoded in a few unrolled loops.
Sep 5, 2014 at 12:03	comment	added	user9206		I was just going back to this code and changed the data creation to be `nolines = 5000000` and then `print random.randint(0,100nolines-1), random.randint(0,100nolines-1), random.randint(0,100*nolines-1)` . That is I multiplied the range by 100. Your code now runs very slowly. Do you know why by any chance?
May 15, 2014 at 4:34	comment	added	primo		Absurdly fast. I couldn't even get parse time under 30s!
May 14, 2014 at 21:14	comment	added	wolfhammer		For some reason a for loop was even faster. for(temp=0;p != ' ';temp = 10 temp + (*(p++) - '0')) {} lines[i].a = temp; p++;
May 14, 2014 at 21:02	comment	added	wolfhammer		I noticed faster parsing times when doing the integer calculation in one line. Try:10 * temp + (*(p++) - '0');
May 14, 2014 at 20:19	comment	added	user9206		Your entry was just as fast as James_pic's but I couldn't award the bounty to both so I gave it to the first one. I hope that's OK. I am about to make a follow up question which I hope you will find fun too.
May 14, 2014 at 19:46	vote	accept	CommunityBot		moved from User.Id=9206 by developer User.Id=3572
May 14, 2014 at 9:53	comment	added	user9206		@Geobits The top two submissions basically take the same amount of time. You could submit yours and we would have a three way race :) (Also, I plan to post a follow up challenge soon on the same topic.)
May 13, 2014 at 21:51	comment	added	Geobits		Well damn. I literally just created `filter.c` to do the same thing, came to the question and found this. +1
May 13, 2014 at 17:18	comment	added	Keith Randall		Radix sort works well in a cache because there is one stream of reads and N streams of writes (in my code, N=512). As long as your cache has N+1 cache lines, everything can remain in cache.
May 13, 2014 at 9:42	comment	added	James_pic		YES! I love it. I had a feeling that cache locality would make joining on T faster, but I always figured the sort stage would offset any gains. Using Radix sort pretty much eliminates that.
May 13, 2014 at 8:19	comment	added	user9206		This is equal first. I am surprised radix sort is so fast as normally you think of it as having terrible cache performance. I think I will need to test in single user mode to tell them apart as the timings are not exactly the same on each run even with the same test file.
May 13, 2014 at 4:51	history	edited	Keith Randall	CC BY-SA 3.0	fix spacing
May 13, 2014 at 4:45	history	edited	Keith Randall	CC BY-SA 3.0	use a bcount array to quickly eliminate most window searches
May 13, 2014 at 4:08	history	answered	Keith Randall	CC BY-SA 3.0