I want a summary of the differences between two files. The expected output is a count of new, deleted, and changed lines. Does diff readily provide such output? If not, is there a script or utility that produces this summary?
4 Answers
I think you are looking for diffstat. Simply pipe the output of diff -u to diffstat and you should get something like this.
    include/net/bluetooth/l2cap.h |  6 ++++++
    net/bluetooth/l2cap.c         | 18 +++++++++---------
    2 files changed, 15 insertions(+), 9 deletions(-)
2 Comments
Jens Kohl
For those of us on a Mac with Homebrew installed, just install it via brew install diffstat.
Vser
To get the file name printed, the diff needs to be unified: diff -u foo1/bar.cpp foo2/bar.cpp | diffstat will yield bar.cpp | 6 ++++++. If the diff is not unified, the file name will be unknown.
If you use diff -u it will generate a unified diff in which added lines are prefixed with + and removed lines with -. If you pipe that output through grep (to keep only the + or only the - lines) and then to wc -l, you get the counts of the + lines and the - lines respectively. Note that the two file header lines (--- and +++) also match these patterns, so each count is one too high unless you exclude them.
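The grep-and-count idea above can be sketched as follows. This is a minimal illustration, not a polished tool: the sample files old.txt and new.txt and the variable names are made up for the demo, and the sketch assumes the first two lines of the diff output are the file headers, which it drops before counting.

```shell
#!/bin/bash
# Sketch: count added and deleted lines in a unified diff.
# The sample files are hypothetical and exist only for this demo.
tmpdir=$(mktemp -d)
printf 'a\nb\nc\nd\ne\n' > "$tmpdir/old.txt"
printf 'a\nb\nxyz\nd\ne\nf\n' > "$tmpdir/new.txt"

# diff exits with status 1 when the files differ; that is not an error here.
diff_out=$(diff -u "$tmpdir/old.txt" "$tmpdir/new.txt" || true)

# Drop the two file header lines ("--- old" and "+++ new") before counting.
body=$(printf '%s\n' "$diff_out" | tail -n +3)

# grep -c prints the number of matching lines (and exits 1 if there are none).
added=$(printf '%s\n' "$body" | grep -c '^+' || true)
deleted=$(printf '%s\n' "$body" | grep -c '^-' || true)

echo "added: $added"
echo "deleted: $deleted"
rm -rf "$tmpdir"
```

Dropping the headers by position rather than by pattern avoids miscounting a removed line whose own text happens to start with "--".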
2 Comments
suyasha
Thanks, here is a bash shell scriptlet for the same:
    diff -u -s "$file1" "$file2" > "$diff_file"
    add_lines=`cat "$diff_file" | grep ^+ | wc -l`
    del_lines=`cat "$diff_file" | grep ^- | wc -l`
    # ignore diff header (those starting with @@)
    at_lines=`cat "$diff_file" | grep ^@ | wc -l`
    chg_lines=`cat "$diff_file" | wc -l`
    chg_lines=`expr $chg_lines - $add_lines - $del_lines - $at_lines`
    # subtract header lines from count (those starting with +++ & ---)
    add_lines=`expr $add_lines - 1`
    del_lines=`expr $del_lines - 1`
    total_change=`expr $chg_lines + $add_lines + $del_lines`
Lightness Races in Orbit
@suyasha: Could you post that properly, as an answer with line breaks? I'd be interested to run it.
Here is the script by suyasha all formatted correctly with line breaks, with some added message output. Good job, suyasha, should have posted your reply as an answer. I would have voted for that.
    #!/bin/bash
    # USAGE: diffstat.sh [file1] [file2]

    # Both file arguments are required.
    if [ -z "$2" ]
    then
        printf "\n USAGE: diffstat.sh [file1] [file2]\n\n"
        exit
    fi

    diff -u -s "$1" "$2" > "/tmp/diff_tmp"

    add_lines=$(cat "/tmp/diff_tmp" | grep '^+' | wc -l)
    del_lines=$(cat "/tmp/diff_tmp" | grep '^-' | wc -l)
    # ignore diff header (those starting with @@)
    at_lines=$(cat "/tmp/diff_tmp" | grep '^@' | wc -l)

    chg_lines=$(cat "/tmp/diff_tmp" | wc -l)
    chg_lines=$(expr $chg_lines - $add_lines - $del_lines - $at_lines)
    # subtract header lines from count (those starting with +++ & ---)
    add_lines=$(expr $add_lines - 1)
    del_lines=$(expr $del_lines - 1)
    total_change=$(expr $chg_lines + $add_lines + $del_lines)
    rm /tmp/diff_tmp

    printf "Total added lines: "
    printf "%10s\n" "$add_lines"
    printf "Total deleted lines:"
    printf "%10s\n" "$del_lines"
    printf "Modified lines: "
    printf "%10s\n" "$chg_lines"
    printf "Total changes: "
    printf "%10s\n" "$total_change"
1 Comment
Marcus Gröber
It should be noted that the definition of "modified lines" printed by this script may be a bit counter-intuitive. If you have the files test1.txt:
    a
    b
    c
    d
    e
and test2.txt:
    a
    b
    xyz
    d
    e
the output will be:
    Total added lines: 1
    Total deleted lines: 1
    Modified lines: 5
So, "modified lines" actually counts unmodified lines in the diff context, not adjacent add/remove pairs.
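One way to count adjacent add/remove pairs instead of context lines is sketched below. The pairing rule is an assumption made for illustration, not something diff defines: within each uninterrupted run of - and + lines, every - line matched with a + line counts as "changed", and any surplus counts as deleted or added. The function name summarize_diff and the awk helper close_run are hypothetical.

```shell
#!/bin/bash
# Sketch: classify unified-diff lines as added, deleted, or changed.
# Assumption: a '-' line paired with a '+' line in the same run is one change.
summarize_diff() {
    # tail -n +3 drops the "---" and "+++" file header lines.
    diff -u "$1" "$2" | tail -n +3 | awk '
        # End the current run of -/+ lines and pair them up.
        function close_run() {
            pairs = (del < add) ? del : add
            changed += pairs
            deleted += del - pairs
            added   += add - pairs
            del = 0; add = 0
        }
        /^-/  { if (add > 0) close_run(); del++; next }
        /^\+/ { add++; next }
        /^\\/ { next }          # "\ No newline at end of file" marker
        { close_run() }         # a context or @@ line ends the run
        END {
            close_run()
            printf "added: %d\ndeleted: %d\nchanged: %d\n", added, deleted, changed
        }'
}

# Demo with the two files from the comment above.
tmpdir=$(mktemp -d)
printf 'a\nb\nc\nd\ne\n' > "$tmpdir/test1.txt"
printf 'a\nb\nxyz\nd\ne\n' > "$tmpdir/test2.txt"
result=$(summarize_diff "$tmpdir/test1.txt" "$tmpdir/test2.txt")
echo "$result"
rm -rf "$tmpdir"
```

For these two files the sketch reports one changed line and no added or deleted lines, which is closer to what the question asks for. Other pairing rules are equally defensible, so treat the numbers as one possible interpretation.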