muru

Another, more memory-efficient option could be to use paste to get all the relevant lines from each file together:

% paste -d '\n' file*.dat
1 1
3 0
1 3 4
8 9 0
5 9 10 11
3 9 2 4

And then use awk on them:

# cat rowsum-paste.awk
NR > 1 && NF != prevNF {
    for (i = 1; i <= prevNF; i++) { printf "%s ", sum[i]; sum[i] = 0 }
    printf "\n"
}
{ for (i = 1; i <= NF; i++) sum[i] += $i; prevNF = NF }
% (paste -d '\n' file*.dat; echo) | awk -f rowsum-paste.awk
4 1
9 12 4
8 18 12 15

This awk code sums lines until the number of fields changes, then prints and resets the current sums. The extra echo supplies an empty line at the end, which changes the number of fields and triggers the final print; alternatively, the printing code can be duplicated in an END block.
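The END-block variant could look like the following sketch. The script name rowsum-paste-end.awk is my own, and file1.dat/file2.dat recreate sample data consistent with the output shown above:

```shell
# Sample data matching the example output (assumed contents).
printf '1 1\n1 3 4\n5 9 10 11\n' > file1.dat
printf '3 0\n8 9 0\n3 9 2 4\n' > file2.dat

# Same summing logic as rowsum-paste.awk, but the last group of sums is
# flushed in an END block, so no trailing echo is needed.
cat > rowsum-paste-end.awk <<'EOF'
NR > 1 && NF != prevNF {
    for (i = 1; i <= prevNF; i++) { printf "%s ", sum[i]; sum[i] = 0 }
    printf "\n"
}
{ for (i = 1; i <= NF; i++) sum[i] += $i; prevNF = NF }
END {
    # Guard against empty input, where prevNF was never set.
    if (prevNF) {
        for (i = 1; i <= prevNF; i++) printf "%s ", sum[i]
        printf "\n"
    }
}
EOF

paste -d '\n' file*.dat | awk -f rowsum-paste-end.awk
```

This prints the same three rows of sums as the pipeline above, without the extra echo.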




The following awk script can pretty much replace the whole shell script:

# cat rowsum.awk
FNR <= rows { for (i = 1; i <= NF; i++) sum[FNR, i] += $i }
END {
    for (i = 1; i <= rows; i++) {
        # Row i has i + 1 columns (see the note below).
        for (j = 1; j <= i + 1; j++) printf "%s ", sum[i, j]
        printf "\n"
    }
}

Example:

% awk -f rowsum.awk -v rows=2 file*.dat
4 1
9 12 4
% awk -f rowsum.awk -v rows=3 file*.dat
4 1
9 12 4
8 18 12 15

This should be faster than going through all files again and again for each row.

Note: I'm assuming the nth row has n+1 columns. If not, save the number of columns for each row (e.g., cols[FNR]=NF) and use that in the final loop.
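A sketch of that generalization, recording each row's width in cols[FNR] instead of assuming a fixed shape. The script name rowsum-cols.awk is hypothetical, and file1.dat/file2.dat recreate sample data consistent with the example output:

```shell
# Sample data matching the example output (assumed contents).
printf '1 1\n1 3 4\n5 9 10 11\n' > file1.dat
printf '3 0\n8 9 0\n3 9 2 4\n' > file2.dat

cat > rowsum-cols.awk <<'EOF'
FNR <= rows {
    for (i = 1; i <= NF; i++) sum[FNR, i] += $i
    cols[FNR] = NF   # remember this row's column count
}
END {
    for (i = 1; i <= rows; i++) {
        # Use the recorded width rather than assuming i + 1 columns.
        for (j = 1; j <= cols[i]; j++) printf "%s ", sum[i, j]
        printf "\n"
    }
}
EOF

awk -f rowsum-cols.awk -v rows=3 file*.dat
```

This relies on every file having the same shape, so cols[FNR] ends up the same whichever file set it last.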