12 events
when what by license comment
Jul 20, 2022 at 16:17 review Suggested edits
Jul 23, 2022 at 2:04
May 3, 2022 at 10:02 comment added algae Yes! This is what I had in mind. Just couldn't see past the awk. It is also the fastest of all 4 solutions.
May 3, 2022 at 8:12 comment added muru @algae I added an alternative that should use much less memory
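The edit itself is not reproduced in this timeline, so the following is only one possible low-memory approach, not necessarily the one muru added. It is a minimal sketch that joins the files line by line with `paste` and sums each row as it streams past, so awk never holds more than one row. The file names `a.dat` and `b.dat`, their contents, and the variable `n` (the number of files) are assumptions made for illustration.

```shell
# Hypothetical sample data: two files with the same shape.
dir=$(mktemp -d)
printf '1 2\n3 4\n' > "$dir/a.dat"
printf '10 20\n30 40\n' > "$dir/b.dat"

# paste puts line k of every file side by side; awk then sees n*cols fields
# per record and can print the summed row immediately.
result=$(paste "$dir/a.dat" "$dir/b.dat" | awk -v n=2 '{
  cols = NF / n                      # columns per input file
  for (i = 1; i <= cols; i++) {
    s = 0
    for (j = 0; j < n; j++)          # same column in each of the n files
      s += $(i + j * cols)
    printf "%s%s", s, (i < cols ? OFS : ORS)
  }
}')
printf '%s\n' "$result"
rm -r "$dir"
```

This sketch assumes every file has the same number of rows and columns.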
May 3, 2022 at 8:10 history edited muru CC BY-SA 4.0
added 1007 characters in body
May 3, 2022 at 6:23 vote accept algae
May 3, 2022 at 6:21 comment added Kusalananda @algae FNR is the line number in the current file. The fields of that line have to be added to the corresponding entries in each of the other files. When FNR increments, that just means you have started working on the next line from the same file.
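The difference between FNR and NR can be seen with a small experiment. This is a minimal sketch with two hypothetical two-line files (`a.dat` and `b.dat` are made-up names): FNR restarts at 1 for each file, while NR keeps counting across all input.

```shell
# Hypothetical sample data.
dir=$(mktemp -d)
printf '1 2\n3 4\n' > "$dir/a.dat"
printf '10 20\n30 40\n' > "$dir/b.dat"

# Print the per-file line number (FNR) next to the overall record number (NR).
result=$(awk '{ print FNR, NR }' "$dir/a.dat" "$dir/b.dat")
printf '%s\n' "$result"
rm -r "$dir"
```

Because FNR restarts for every file, an index such as `sum[FNR,i]` refers to the same row position in each file, which is what lets the per-row sums accumulate.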
May 3, 2022 at 6:20 comment added algae @Kusalananda Then I don't understand how the first for loop works. arr[FNR,i] points to the FNRth row and the ith column of all the file*.dat? After each FNR increment occurs you have a line ready to print. Though this would require another loop and not be worth it I guess
May 3, 2022 at 6:14 comment added Kusalananda @algae Since all the fields on each line have to be summed with the corresponding elements in the other files, you can't really print each line as you go, as the sum would not be "done" yet. Also, the amount of memory taken should be no more than the amount needed to store one of the files in RAM.
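The accumulate-then-print pattern described in this comment might look roughly like the sketch below. It is an illustration under assumed inputs (`a.dat` and `b.dat` are hypothetical files of equal shape), not a copy of the answer's script. The array holds one entry per row and column, which is about the size of a single file, and nothing is printed until END because a row's sum is only complete once the last file has been read.

```shell
# Hypothetical sample data.
dir=$(mktemp -d)
printf '1 2\n3 4\n' > "$dir/a.dat"
printf '10 20\n30 40\n' > "$dir/b.dat"

result=$(awk '
  {
    for (i = 1; i <= NF; i++)
      sum[FNR, i] += $i              # same cell in every file adds up here
    if (FNR > rows) rows = FNR       # remember the table dimensions
    if (NF > cols) cols = NF
  }
  END {
    for (r = 1; r <= rows; r++)
      for (i = 1; i <= cols; i++)
        printf "%s%s", sum[r, i], (i < cols ? OFS : ORS)
  }' "$dir/a.dat" "$dir/b.dat")
printf '%s\n' "$result"
rm -r "$dir"
```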
May 3, 2022 at 6:08 comment added algae @muru Thanks! Definitely faster than my script, at the cost of quite a bit of memory, which is fine. I hadn't realised that every line of the files supplied is looped over implicitly. Would it be faster / less memory expensive to print each line as you go? e.g. instead of sum[FNR,i] it is just sum[i], print the line and repeat.
May 3, 2022 at 5:53 comment added algae @Kusalananda Yes I've removed it. The script should have set ROWS=$(wc -l) or something.
May 3, 2022 at 5:52 comment added Kusalananda You could probably use rows == "" || FNR <= rows as the first condition, or remove it completely. I'm assuming that the ROWS variable in their code is a byproduct of their way of thinking about the problem rather than a necessary part of the solution.
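The suggested condition can be tried out as follows. This is a minimal sketch: the helper `sum_rows`, the files `a.dat` and `b.dat`, and the bookkeeping variables `seen` and `cols` are all hypothetical. When `rows` is empty the pattern matches every line; otherwise only the first `rows` lines of each file are summed.

```shell
# Hypothetical sample data.
dir=$(mktemp -d)
printf '1 2\n3 4\n' > "$dir/a.dat"
printf '10 20\n30 40\n' > "$dir/b.dat"

# sum_rows LIMIT: sum corresponding fields; an empty LIMIT means all rows.
sum_rows() {
  awk -v rows="$1" '
    rows == "" || FNR <= rows {
      for (i = 1; i <= NF; i++) sum[FNR, i] += $i
      if (FNR > seen) seen = FNR     # highest row actually summed
      if (NF > cols) cols = NF
    }
    END {
      for (r = 1; r <= seen; r++)
        for (i = 1; i <= cols; i++)
          printf "%s%s", sum[r, i], (i < cols ? OFS : ORS)
    }' "$dir/a.dat" "$dir/b.dat"
}

limited=$(sum_rows 1)      # only the first row of each file
unlimited=$(sum_rows "")   # no limit given
printf '%s\n' "$limited" "$unlimited"
rm -r "$dir"
```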
May 3, 2022 at 5:20 history answered muru CC BY-SA 4.0