24

What would be a way to find the largest commits (i.e. the commits introducing the most changes, counted for instance as the number of added/removed lines) in a git repo?

Note that I really want largest commits, not largest files, so git find fat commit is not helpful here.

3 Answers

15

You can use git log --format=format:"%H" --shortstat. It will output something like:

    b90c0895b90eb3a6d1528465f3b5d96a575dbda2
     2 files changed, 32 insertions(+), 7 deletions(-)

    642b5e1910e1c2134c278b97752dd73b601e8ddb
     11 files changed, 835 insertions(+), 504 deletions(-)

    // other commits skipped

This text looks easy to parse.
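As a rough illustration of that parsing, here is a minimal sketch using awk. The function name rank_commits is made up for this example (it is not a git command), and the sketch assumes each commit appears as a full hex hash on its own line followed by an optional summary line, as in the output above.

```shell
# rank_commits: hypothetical helper name, not part of git.
# Reads `git log --format=format:"%H" --shortstat` output on stdin and prints
# "<insertions + deletions> <hash>" for every commit, largest first.
rank_commits() {
    awk '
        # A line made only of hex digits (40 or more) starts a new commit.
        /^[0-9a-f]+$/ && length($0) >= 40 {
            if (hash != "") print total, hash
            hash = $0; total = 0
            next
        }
        # Summary line: add the number that precedes each
        # "insertion(s)(+)" or "deletion(s)(-)" word.
        /files? changed/ {
            for (i = 2; i <= NF; i++)
                if ($i ~ /^(insertion|deletion)/) total += $(i - 1)
        }
        # Flush the last commit.
        END { if (hash != "") print total, hash }
    ' | sort -k1,1nr
}

# Usage, from inside a repository:
#   git log --format=format:"%H" --shortstat | rank_commits | head -10
```

Commits with no summary line (for example merges) are listed with a total of 0.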


4 Comments

I get an error when running the git log --format=format:"%H" --shortstat command. fatal: ambiguous argument '%H': unknown revision or path not in the working tree.
How do I sort by number of insertions/deletions?
@Boris and anyone else who wants to parse this: a quick and dirty way for me was to use a regex to find anything with over 9,999 insertions and over 9,999 deletions. While in the git log view, type / to begin searching (just like vim). Then type this regex: \d{5,} insertions\(\+\) \d{5,}. This means: search for 5 or more digits, then the string insertions(+) , followed by another 5 or more digits. Raise or lower the 5s to match fewer or more commits. If you find no matches, start with the smallest regex, \d, to make sure it's working, then start adding the curly-brace quantifiers, etc.
@Jason Thanks for that regex. I had to add a comma in there since the git output text must have changed a little.
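For a non-interactive version of the same search, something like the following sketch might work. It assumes the summary line has the form shown in the answer above (with a comma after insertions(+)) and a grep that supports -B, as GNU and BSD grep do. The name find_huge is made up for this example.

```shell
# find_huge: hypothetical helper name, not part of git.
# Reads `git log --format=format:"%H" --shortstat` output on stdin and prints
# every summary line with 5+ digit insertion AND deletion counts, each
# preceded by the line before it (the commit hash).
find_huge() {
    grep -E -B1 '[0-9]{5,} insertions?\(\+\), [0-9]{5,} deletions?\(-\)'
}

# Usage, from inside a repository:
#   git log --format=format:"%H" --shortstat | find_huge
```

As with the interactive regex, change the 5s to move the threshold.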
11

For anyone wanting to get a simple list of largest to smallest commits (by the amount of changes made in a commit) I took @max's answer and parsed and ordered the result.

    git log --format=format:"%H" --shortstat |
      perl -00 -ne 'my ($hash, $filesChanged, $insertions, $deletions) = $_ =~ /(?:[0-9a-f]+\n)*([0-9a-f]+)\n(?: (\d+) files? changed,)?(?: (\d+) insertions?...,?)?(?: (\d+) deletions?...)?/sg; print $hash, "\t", $insertions + $deletions, "\n"' |
      sort -k 2 -nr

That takes all the commits, adds together the number of insertions and deletions for each, and then orders that list from highest to lowest. To get just the ten largest commits, add | head -10 to the end.


0

My solution:

shell script:

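    git log --after="2024-10-01" --pretty=format:"%H - %ad :" --date=format:"%a %b %d %T %Y %z" |
    # "|| [ -n ... ]" keeps the last line even if it has no trailing newline
    while read -r line || [ -n "$line" ]
    do
        commit_hash=$(echo "$line" | cut -d' ' -f1)
        commit_date=$(echo "$line" | cut -d' ' -f3-)
        # 4th field of the summary line: the insertion count
        # (or the deletion count when the commit has no insertions)
        commit_size=$(git show --format=format: --stat "$commit_hash" | tail -1 | awk '{print $4}')
        echo "$commit_date $commit_size $commit_hash"
    done | sort -nr --field-separator=' ' --key=8 | head -20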

output:

    四 10 10 09:13:18 2024 +0800 : 28530 dd70787ad5121c0884e1884fa4fdfe4c1b044267
    四 11 14 07:14:42 2024 +0800 : 5422 d6adca7d57a04e23e3076181b5a56d370b8fb92b
    四 11 14 07:57:02 2024 +0800 : 5327 5cdf72dcb933b5da4e379e31f3a103352446d771
    三 10 02 10:36:50 2024 +0800 : 370 af1ca597b18eb1d80d75401819afefaf74ecb9ef
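Note that the fourth field of the summary line is only one of the two counts. If you would rather rank by insertions plus deletions, a small helper along these lines could replace the awk '{print $4}' step. This is only a sketch, and the name summary_total is made up for this example.

```shell
# summary_total: hypothetical helper name, not part of git.
# Reads a summary line such as
#   " 2 files changed, 32 insertions(+), 7 deletions(-)"
# on stdin and prints insertions + deletions (0 if neither is present).
summary_total() {
    awk '
        {
            # Add the number that precedes each insertion/deletion word.
            for (i = 2; i <= NF; i++)
                if ($i ~ /^(insertion|deletion)/) total += $(i - 1)
        }
        END { print total + 0 }
    '
}

# Usage inside the loop above:
#   commit_size=$(git show --format=format: --shortstat "$commit_hash" | summary_total)
```

Because the helper scans every input line, it also works when fed the full --stat output rather than only the last line.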

