
I need to extract multiple words from each commit in git log: the commit id and the ticket number. I then need to loop over the results and do some further processing for each commit.

commit 87b56bbd6496802bbc97c8abd0346597d7a15810 (HEAD -> main, tag: app_v4.151.6, origin/main, origin/HEAD)
Author: daniel-johns-99 <[email protected]>
Date:   Thu Mar 27 21:11:18 2025 +1000

    CI-451-template-change (#4497)

commit 45bec837a19ac33822da2033bf1fb192b8cf2945 (tag: uk_v46, tag: app_v4.151.5)
Author: Leon Duffus <[email protected]>
Date:   Thu Mar 27 09:16:51 2025 +0000

    DTE-602 dsc breadcrumb component design changes (#4457)
:

Given the above, I would need the following array to process:

CI-451:87b56bbd6496802bbc97c8abd0346597d7a15810
DTE-602:45bec837a19ac33822da2033bf1fb192b8cf2945

The following is my current attempt:

commits=$(git log "$previous_tag".."$tag")
for i in $commits; do
    commit=$(echo $i | sed 's/^.*commit //')
    ticket=$(echo $i | grep --extended-regexp --only-matching --ignore-case "(${proj_regex})[ -][0-9]{2,5}")
    echo $commit
done
  • Get used to copy/pasting your shell scripts into shellcheck.net and fixing the issues it tells you about. Commented Mar 27 at 17:25
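
One issue such a check would likely point at is the unquoted `$commits` in the `for` loop: it iterates once per whitespace-separated word, not once per commit, so the word `commit` and its hash never arrive in the same `$i`. A minimal sketch of the difference, using two lines shortened from the question's log (the function names are made up for this illustration):

```shell
# Two lines of default `git log` output (shortened from the question).
log='commit 87b56bbd6496802bbc97c8abd0346597d7a15810
    CI-451-template-change (#4497)'

# Count iterations of an unquoted `for` loop: one per whitespace-separated word.
count_words() {
  n=0
  for _word in $1; do n=$((n + 1)); done
  printf '%s\n' "$n"
}

# Count iterations of a `while read` loop: one per line.
count_lines() {
  printf '%s\n' "$1" | {
    n=0
    while IFS= read -r _line; do n=$((n + 1)); done
    printf '%s\n' "$n"
  }
}

count_words "$log"   # prints 4
count_lines "$log"   # prints 2
```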

2 Answers

3

I wouldn't process the "default" log, ever.

You can get halfway there just with a different log format. In this case, print only the hash and the commit subject.

git log --pretty=format:'%H %s' 

Then use awk to finish up (assuming that ticket refs appear at the start of the commit message):

git log --pretty=format:'%H %s' | \
    awk -F"[^a-zA-Z0-9]" '$2"-"$3 ~ /^[A-Z]+-[0-9]+$/ {print $2"-"$3":"$1}'

Explaining that awk:

-F"[^a-zA-Z0-9]" 

Every character that isn't a letter or a digit is a field delimiter.

$2"-"$3 ~ /^[A-Z]+-[0-9]+$/ 

When fields 2 and 3 (the first two "words" of the commit message), joined with a hyphen, look like a ticket ref, the action runs.

{print $2"-"$3":"$1} 

Print the first 3 fields in the desired order.
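
To see what that field separator does, here is a small sketch that prints every field awk produces for one line in the '%H %s' format. The sample line is taken from the question; `show_fields` is a name made up for this illustration.

```shell
# One line in the '%H %s' format, built from the question's first commit.
line='87b56bbd6496802bbc97c8abd0346597d7a15810 CI-451-template-change (#4497)'

# Print each field as "index=value", one per line.
show_fields() {
  awk -F'[^a-zA-Z0-9]' '{ for (i = 1; i <= NF; i++) printf "%d=%s\n", i, $i }'
}

# The first three lines are 1=<hash>, 2=CI and 3=451.
printf '%s\n' "$line" | show_fields
```

Because each separator character splits on its own, adjacent separators such as " (#" produce empty fields further along the line; only fields 1 to 3 matter here.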


More robust alternative, based heavily on Ed Morton's answer (adapted lightly for the simpler custom log format):

git log --pretty=format:'%H %s' \
    | awk -v OFS=: \
    '
        match($2, /^[[:upper:]]+-[[:digit:]]+/) {
            print(substr($2, 1, RLENGTH), $1)
        }
    '
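
The question also asks for a loop over the results. A minimal sketch of one way to do that is to pipe the pairs into `while read` and split each at the colon. The function names and the body of `process_commit` are placeholders, and the awk uses plain `[A-Z]` and `[0-9]` ranges instead of the character classes above, only to stay friendly to older awk builds.

```shell
# Turn "hash subject" lines into TICKET:HASH pairs.
ticket_pairs() {
  awk -v OFS=: 'match($2, /^[A-Z]+-[0-9]+/) { print substr($2, 1, RLENGTH), $1 }'
}

# Hypothetical per-commit step; replace the body with the real processing.
process_commit() {
  printf 'ticket=%s commit=%s\n' "$1" "$2"
}

# Read "hash subject" lines on stdin and call process_commit once per pair.
process_all() {
  ticket_pairs | while IFS=: read -r ticket commit; do
    process_commit "$ticket" "$commit"
  done
}

# Sample input standing in for:
#   git log --pretty=format:'%H %s' "$previous_tag".."$tag"
printf '%s\n' \
  '87b56bbd6496802bbc97c8abd0346597d7a15810 CI-451-template-change (#4497)' \
  '45bec837a19ac33822da2033bf1fb192b8cf2945 DTE-602 dsc breadcrumb component design changes (#4457)' \
  | process_all
# prints:
#   ticket=CI-451 commit=87b56bbd6496802bbc97c8abd0346597d7a15810
#   ticket=DTE-602 commit=45bec837a19ac33822da2033bf1fb192b8cf2945
```

Note that the loop body runs in a pipeline, so in most shells variables set inside it are not visible after `done`.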
  • Can you change it so that the whole message is included and separated by a pipe, instead of just the number? For example: ci-488: fetch blueprint (#4741)|f0fd5dc0f5e051e75b841b3830e3a54cdf779200 Commented May 7 at 16:48
  • Just use the basic git command and rearrange the tokens: git log --pretty=format:'%s|%H'. However, putting the tokens in this order could be problematic; i.e. if you have a | in a commit message, it will make processing the output harder. Commented May 13 at 13:29
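
One way around the problem raised in the last comment is to put the hash first. A hash never contains a `|`, so a reader that splits at the first `|` leaves the rest of the subject intact. A minimal sketch, with a made-up function name and a second sample line that is purely hypothetical:

```shell
# Split "hash|subject" lines. `read` assigns everything after the first "|"
# to the last variable, so the subject keeps any "|" it contains.
split_hash_subject() {
  while IFS='|' read -r hash subject; do
    printf 'hash=%s subject=%s\n' "$hash" "$subject"
  done
}

# Sample input standing in for: git log --pretty=tformat:'%H|%s'
# (tformat: ends every entry with a newline, so `read` also sees the last one).
printf '%s\n' \
  'f0fd5dc0f5e051e75b841b3830e3a54cdf779200|ci-488: fetch blueprint (#4741)' \
  '0000000000000000000000000000000000000000|hypothetical subject with a | in it' \
  | split_hash_subject
# prints:
#   hash=f0fd5dc0f5e051e75b841b3830e3a54cdf779200 subject=ci-488: fetch blueprint (#4741)
#   hash=0000000000000000000000000000000000000000 subject=hypothetical subject with a | in it
```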
1

This might be what you want, using any POSIX awk:

$ cat file | awk -v OFS=':' '
    /^commit/ { c=$2 }
    match($1, /^[[:upper:]]+-[[:digit:]]+/) {
        print substr($1,1,RLENGTH), c
    }
'
CI-451:87b56bbd6496802bbc97c8abd0346597d7a15810
DTE-602:45bec837a19ac33822da2033bf1fb192b8cf2945

Replace cat file with whatever command produces the sample input you show at the top of the question:

$ cat file
commit 87b56bbd6496802bbc97c8abd0346597d7a15810 (HEAD -> main, tag: app_v4.151.6, origin/main, origin/HEAD)
Author: daniel-johns-99 <[email protected]>
Date:   Thu Mar 27 21:11:18 2025 +1000

    CI-451-template-change (#4497)

commit 45bec837a19ac33822da2033bf1fb192b8cf2945 (tag: uk_v46, tag: app_v4.151.5)
Author: Leon Duffus <[email protected]>
Date:   Thu Mar 27 09:16:51 2025 +0000

    DTE-602 dsc breadcrumb component design changes (#4457)
:
