Timeline for Filter a large file quickly

Current License: CC BY-SA 3.0

20 events
when what by license comment
Jun 17, 2020 at 9:04 history edited CommunityBot
Commonmark migration
May 14, 2014 at 19:45 vote accept CommunityBot moved from User.Id=9206 by developer User.Id=3572
May 14, 2014 at 19:46
May 14, 2014 at 19:45 history bounty awarded CommunityBot
May 13, 2014 at 9:36 history edited James_pic CC BY-SA 3.0
added 1 character in body
May 13, 2014 at 9:34 comment added James_pic You're right. That'll teach me for not reading the instructions properly. It's a one-letter change in both cases, so I'll go make it.
May 12, 2014 at 20:15 comment added user9206 I think neither this code nor your C code allows T - U = 0.
May 12, 2014 at 18:45 comment added user9206 I am using your latest code. It is much slower using -J-server -J-XX:+AggressiveOpts -J-Xms6g for me.
May 11, 2014 at 8:48 comment added James_pic @Lembik, was that using the latest version of the code? It's had a few upgrades since my original submission.
May 10, 2014 at 18:50 comment added user9206 I managed to speed it up by just using time scala -J-Xmx6g Filterer largefile.file output.txt
May 9, 2014 at 16:13 history edited James_pic CC BY-SA 3.0
edited body
May 9, 2014 at 15:31 history edited James_pic CC BY-SA 3.0
added 566 characters in body
May 9, 2014 at 14:50 comment added James_pic @Geobits Yeah, String.split is currently a bottleneck, but StringTokenizer isn't much better right now - allocating in a tight inner loop is hurting my already strained GC. I'm working on an FSM which seems to have promise (whilst being complete overkill)
May 9, 2014 at 14:22 comment added Geobits Re input: I don't always use StringTokenizer, but when I do, I parse millions of strings.
May 9, 2014 at 13:54 history edited James_pic CC BY-SA 3.0
added 140 characters in body
May 9, 2014 at 11:56 history edited James_pic CC BY-SA 3.0
added 210 characters in body
May 9, 2014 at 9:38 history edited James_pic CC BY-SA 3.0
added 541 characters in body
May 9, 2014 at 8:03 comment added James_pic Yes, sorry. I figured you might have that issue. Ubuntu still ships with Scala 2.9, and string interpolation needs 2.10 or higher. I suspect it'll be faster still under Java 8, but Ubuntu only ships with 7, and that's a world of pain that you don't need!
May 8, 2014 at 18:36 comment added user9206 I got it to work with 2.10.3. It's a very nice solution although my poor computer is more or less unusable for a minute or so afterwards as it tries to deallocate 6GB of RAM.
May 8, 2014 at 16:39 comment added user9206 I get James_pic.scala:42: error: ')' expected but string literal found. output.write(s"$a $b $t\n$c $d $u\n") ^ one error found. This is on Scala compiler version 2.9.2
May 8, 2014 at 16:04 history answered James_pic CC BY-SA 3.0