Return to Answer

format answer

edited Jan 31, 2021 at 10:39

1.4m
569
4.8k
5.7k

Since commit 8cbd431 ("git-add--interactive: replace hunk recounting with apply --recount", 2008-7-2, Git v1.6.0-rc0) if a hunk is skipped then we rely on the context lines to apply subsequent hunks in the right place.

While this works most of the time it is possible for hunks to end up being applied in the wrong place.

To fix this adjust the offset of subsequent hunks to correct for any change in the number of insertions or deletions due to the skipped hunk. The change in offset due to edited hunks that have the number of insertions or deletions changed is ignored here, it will be fixed in the next commit.

recount_edited_hunk() introduced in commit 2b8ea7f ("add -p: calculate offset delta for edited patches", 2018-03-05, Git v2.17.0) required all context lines to start with a space, empty lines are not counted.
This was intended to avoid any recounting problems if the user had introduced empty lines at the end when editing the patch.

However this introduced a regression into 'git add -p' as it seems it is common for editors to strip the trailing whitespace from empty context lines when patches are edited thereby introducing empty lines that should be counted.
'git apply' knows how to deal with such empty lines and POSIX states that whether or not there is an space on an empty context line is implementation defined (see diff command).

Fix the regression by counting lines that consist solely of a newline as well as lines starting with a space as context lines and add a test to prevent future regressions.

Commit fecc6f3 ("add -p: adjust offsets of subsequent hunks when one is skipped", 2018-03-01, Git v2.17.0-rc0) fixed adding hunks in the correct place when a previous hunk has been skipped.

However it did not address patches that are applied in reverse.

In that case we need to adjust the pre-image offset so that when apply reverses the patch the post-image offset is adjusted correctly.
We subtract rather than add the delta as the patch is reversed (the easiest way to think about it is to consider a hunk of deletions that is skipped - in that case we want to reduce offset so we need to subtract).

When skipping a hunk that adds a different number of lines than it removes, we need to adjust the subsequent hunk headers of non-skipped hunks: in pathological cases, the context is not enough to determine precisely where the patch should be applied.

This problem was identified in 23fea4c240 ("t3701: add failing test for pathological context lines", 2018-03-01, Git v2.17.0-rc0 -- merge ) and fixed in the Perl version in fecc6f3a68 ("add -p: adjust offsets of subsequent hunks when one is skipped", 2018-03-01, Git v2.17.0-rc0 -- merge).

And this patch fixes it in the C version of git add -p.

In contrast to the Perl version, we try to keep the extra text on the hunk header (which typically contains the signature of the function whose code is changed in the hunk) intact.

Note: while the C version does not support staging mode changes at this stage, we already prepare for this by simply skipping the hunk header if both old and new offset is 0 (this cannot happen for regular hunks, and we will use this as an indicator that we are looking at a special hunk).

Likewise, we already prepare for hunk splitting by handling the absence of extra text in the hunk header gracefully: only the first split hunk will have that text, the others will not (indicated by an empty extra text start/end range). Preparing for hunk splitting already at this stage avoids an indentation change of the entire hunk header-printing block later, and is almost as easy to review as without that handling.

When trying to stash part of the worktree changes by splitting a hunk and then only partially accepting the split bits and pieces, the user is presented with a rather cryptic error:

error: patch failed: <file>:<line> error: test: patch does not apply Cannot remove worktree changes

error: test: patch does not apply Cannot remove worktree changes

and the command would fail to stash the desired parts of the worktree changes (even if the stash ref was actually updated correctly).

We even have a test case demonstrating that failure, carrying it for four years already.

The explanation: when splitting a hunk, the changed lines are no longer separated by more than 3 lines (which is the amount of context lines Git's diffs use by default), but less than that.

So when staging only part of the diff hunk for stashing, the resulting diff that we want to apply to the worktree in reverse will contain those changes to be dropped surrounded by three context lines, but since the diff is relative to HEAD rather than to the worktree, these context lines will not match.

Example time. Let's assume that the file README contains these lines:

We the people

the people

and the worktree added some lines so that it contains these lines instead:

We are the kind people

are the kind people

and the user tries to stash the line containing "are", then the command will internally stage this line to a temporary index file and try to revert the diff between HEAD and that index file.
The diff hunk that git stash tries to revert will look somewhat like this:

@@ -1776,3 +1776,4  We +are the people

 We +are the people

It is obvious, now, that the trailing context lines overlap with the part of the original diff hunk that the user did not want to stash.

Keeping in mind that context lines in diffs serve the primary purpose of finding the exact location when the diff does not apply precisely (but when the exact line number in the file to be patched differs from the line number indicated in the diff), we work around this by reducing the amount of context lines: the diff was just generated.

Note: this is not a full fix for the issue.
Just as demonstrated in t3701's 'add -p works with pathological context lines' test case, there are ambiguities in the diff format. It is very rare in practice, of course, to encounter such repeated lines.

The full solution for such cases would be to replace the approach of generating a diff from the stash and then applying it in reverse by emulating git revert (i.e. doing a 3-way merge). However, in git stash -p it would not apply to HEAD but instead to the worktree, which makes this non-trivial to implement as long as we also maintain a scripted version of add -i.

Since commit 8cbd431 ("git-add--interactive: replace hunk recounting with apply --recount", 2008-7-2, Git v1.6.0-rc0) if a hunk is skipped then we rely on the context lines to apply subsequent hunks in the right place.

While this works most of the time it is possible for hunks to end up being applied in the wrong place.

To fix this adjust the offset of subsequent hunks to correct for any change in the number of insertions or deletions due to the skipped hunk. The change in offset due to edited hunks that have the number of insertions or deletions changed is ignored here, it will be fixed in the next commit.

recount_edited_hunk() introduced in commit 2b8ea7f ("add -p: calculate offset delta for edited patches", 2018-03-05, Git v2.17.0) required all context lines to start with a space, empty lines are not counted.
This was intended to avoid any recounting problems if the user had introduced empty lines at the end when editing the patch.

However this introduced a regression into 'git add -p' as it seems it is common for editors to strip the trailing whitespace from empty context lines when patches are edited thereby introducing empty lines that should be counted.
'git apply' knows how to deal with such empty lines and POSIX states that whether or not there is an space on an empty context line is implementation defined (see diff command).

Fix the regression by counting lines that consist solely of a newline as well as lines starting with a space as context lines and add a test to prevent future regressions.

Commit fecc6f3 ("add -p: adjust offsets of subsequent hunks when one is skipped", 2018-03-01, Git v2.17.0-rc0) fixed adding hunks in the correct place when a previous hunk has been skipped.

However it did not address patches that are applied in reverse.

In that case we need to adjust the pre-image offset so that when apply reverses the patch the post-image offset is adjusted correctly.
We subtract rather than add the delta as the patch is reversed (the easiest way to think about it is to consider a hunk of deletions that is skipped - in that case we want to reduce offset so we need to subtract).

When skipping a hunk that adds a different number of lines than it removes, we need to adjust the subsequent hunk headers of non-skipped hunks: in pathological cases, the context is not enough to determine precisely where the patch should be applied.

This problem was identified in 23fea4c240 ("t3701: add failing test for pathological context lines", 2018-03-01, Git v2.17.0-rc0 -- merge ) and fixed in the Perl version in fecc6f3a68 ("add -p: adjust offsets of subsequent hunks when one is skipped", 2018-03-01, Git v2.17.0-rc0 -- merge).

And this patch fixes it in the C version of git add -p.

In contrast to the Perl version, we try to keep the extra text on the hunk header (which typically contains the signature of the function whose code is changed in the hunk) intact.

Note: while the C version does not support staging mode changes at this stage, we already prepare for this by simply skipping the hunk header if both old and new offset is 0 (this cannot happen for regular hunks, and we will use this as an indicator that we are looking at a special hunk).

Likewise, we already prepare for hunk splitting by handling the absence of extra text in the hunk header gracefully: only the first split hunk will have that text, the others will not (indicated by an empty extra text start/end range). Preparing for hunk splitting already at this stage avoids an indentation change of the entire hunk header-printing block later, and is almost as easy to review as without that handling.

When trying to stash part of the worktree changes by splitting a hunk and then only partially accepting the split bits and pieces, the user is presented with a rather cryptic error:

error: patch failed: <file>:<line>

error: test: patch does not apply Cannot remove worktree changes

and the command would fail to stash the desired parts of the worktree changes (even if the stash ref was actually updated correctly).

We even have a test case demonstrating that failure, carrying it for four years already.

The explanation: when splitting a hunk, the changed lines are no longer separated by more than 3 lines (which is the amount of context lines Git's diffs use by default), but less than that.

So when staging only part of the diff hunk for stashing, the resulting diff that we want to apply to the worktree in reverse will contain those changes to be dropped surrounded by three context lines, but since the diff is relative to HEAD rather than to the worktree, these context lines will not match.

Example time. Let's assume that the file README contains these lines:

We

the people

and the worktree added some lines so that it contains these lines instead:

We

are the kind people

and the user tries to stash the line containing "are", then the command will internally stage this line to a temporary index file and try to revert the diff between HEAD and that index file.
The diff hunk that git stash tries to revert will look somewhat like this:

@@ -1776,3 +1776,4

 We +are the people

It is obvious, now, that the trailing context lines overlap with the part of the original diff hunk that the user did not want to stash.

Keeping in mind that context lines in diffs serve the primary purpose of finding the exact location when the diff does not apply precisely (but when the exact line number in the file to be patched differs from the line number indicated in the diff), we work around this by reducing the amount of context lines: the diff was just generated.

Note: this is not a full fix for the issue.
Just as demonstrated in t3701's 'add -p works with pathological context lines' test case, there are ambiguities in the diff format. It is very rare in practice, of course, to encounter such repeated lines.

The full solution for such cases would be to replace the approach of generating a diff from the stash and then applying it in reverse by emulating git revert (i.e. doing a 3-way merge). However, in git stash -p it would not apply to HEAD but instead to the worktree, which makes this non-trivial to implement as long as we also maintain a scripted version of add -i.

Since commit 8cbd431 ("git-add--interactive: replace hunk recounting with apply --recount", 2008-7-2, Git v1.6.0-rc0) if a hunk is skipped then we rely on the context lines to apply subsequent hunks in the right place.

While this works most of the time it is possible for hunks to end up being applied in the wrong place.

To fix this adjust the offset of subsequent hunks to correct for any change in the number of insertions or deletions due to the skipped hunk. The change in offset due to edited hunks that have the number of insertions or deletions changed is ignored here, it will be fixed in the next commit.

recount_edited_hunk() introduced in commit 2b8ea7f ("add -p: calculate offset delta for edited patches", 2018-03-05, Git v2.17.0) required all context lines to start with a space, empty lines are not counted.
This was intended to avoid any recounting problems if the user had introduced empty lines at the end when editing the patch.

However this introduced a regression into 'git add -p' as it seems it is common for editors to strip the trailing whitespace from empty context lines when patches are edited thereby introducing empty lines that should be counted.
'git apply' knows how to deal with such empty lines and POSIX states that whether or not there is an space on an empty context line is implementation defined (see diff command).

Fix the regression by counting lines that consist solely of a newline as well as lines starting with a space as context lines and add a test to prevent future regressions.

Commit fecc6f3 ("add -p: adjust offsets of subsequent hunks when one is skipped", 2018-03-01, Git v2.17.0-rc0) fixed adding hunks in the correct place when a previous hunk has been skipped.

However it did not address patches that are applied in reverse.

In that case we need to adjust the pre-image offset so that when apply reverses the patch the post-image offset is adjusted correctly.
We subtract rather than add the delta as the patch is reversed (the easiest way to think about it is to consider a hunk of deletions that is skipped - in that case we want to reduce offset so we need to subtract).

When skipping a hunk that adds a different number of lines than it removes, we need to adjust the subsequent hunk headers of non-skipped hunks: in pathological cases, the context is not enough to determine precisely where the patch should be applied.

This problem was identified in 23fea4c240 ("t3701: add failing test for pathological context lines", 2018-03-01, Git v2.17.0-rc0 -- merge ) and fixed in the Perl version in fecc6f3a68 ("add -p: adjust offsets of subsequent hunks when one is skipped", 2018-03-01, Git v2.17.0-rc0 -- merge).

And this patch fixes it in the C version of git add -p.

In contrast to the Perl version, we try to keep the extra text on the hunk header (which typically contains the signature of the function whose code is changed in the hunk) intact.

Note: while the C version does not support staging mode changes at this stage, we already prepare for this by simply skipping the hunk header if both old and new offset is 0 (this cannot happen for regular hunks, and we will use this as an indicator that we are looking at a special hunk).

Likewise, we already prepare for hunk splitting by handling the absence of extra text in the hunk header gracefully: only the first split hunk will have that text, the others will not (indicated by an empty extra text start/end range). Preparing for hunk splitting already at this stage avoids an indentation change of the entire hunk header-printing block later, and is almost as easy to review as without that handling.

When trying to stash part of the worktree changes by splitting a hunk and then only partially accepting the split bits and pieces, the user is presented with a rather cryptic error:
error: patch failed: <file>:<line> error: test: patch does not apply Cannot remove worktree changes 
and the command would fail to stash the desired parts of the worktree changes (even if the stash ref was actually updated correctly).

We even have a test case demonstrating that failure, carrying it for four years already.

The explanation: when splitting a hunk, the changed lines are no longer separated by more than 3 lines (which is the amount of context lines Git's diffs use by default), but less than that.

So when staging only part of the diff hunk for stashing, the resulting diff that we want to apply to the worktree in reverse will contain those changes to be dropped surrounded by three context lines, but since the diff is relative to HEAD rather than to the worktree, these context lines will not match.

Example time. Let's assume that the file README contains these lines:
We the people 
and the worktree added some lines so that it contains these lines instead:
We are the kind people 
and the user tries to stash the line containing "are", then the command will internally stage this line to a temporary index file and try to revert the diff between HEAD and that index file.
The diff hunk that git stash tries to revert will look somewhat like this:

@@ -1776,3 +1776,4  We +are the people 
It is obvious, now, that the trailing context lines overlap with the part of the original diff hunk that the user did not want to stash.

Keeping in mind that context lines in diffs serve the primary purpose of finding the exact location when the diff does not apply precisely (but when the exact line number in the file to be patched differs from the line number indicated in the diff), we work around this by reducing the amount of context lines: the diff was just generated.

Note: this is not a full fix for the issue.
Just as demonstrated in t3701's 'add -p works with pathological context lines' test case, there are ambiguities in the diff format. It is very rare in practice, of course, to encounter such repeated lines.

The full solution for such cases would be to replace the approach of generating a diff from the stash and then applying it in reverse by emulating git revert (i.e. doing a 3-way merge). However, in git stash -p it would not apply to HEAD but instead to the worktree, which makes this non-trivial to implement as long as we also maintain a scripted version of add -i.

add Git 2.29

Source Link

edited Sep 19, 2020 at 13:23

VonC

1.4m
569
4.8k
5.7k

Git 2.29 (Q4 2020) brings a leakfix to git add -p (used by stash -p)

See commit 324efcf (07 Sep 2020) by Phillip Wood (phillipwood).
^{(Merged by Junio C Hamano -- gitster -- in commit 3ad8d3e, 18 Sep 2020)}

add -p: fix memory leak

^{Signed-off-by: Phillip Wood}
^{Acked-by: Johannes Schindelin}

asan reports that the C version of add -p is not freeing all the memory it allocates.

Fix this by introducing a function to clear struct add_p_state`` and use it instead of freeing individual members.

Git 2.29 (Q4 2020) brings a leakfix to git add -p (used by stash -p)

See commit 324efcf (07 Sep 2020) by Phillip Wood (phillipwood).
^{(Merged by Junio C Hamano -- gitster -- in commit 3ad8d3e, 18 Sep 2020)}

add -p: fix memory leak

^{Signed-off-by: Phillip Wood}
^{Acked-by: Johannes Schindelin}

asan reports that the C version of add -p is not freeing all the memory it allocates.

Fix this by introducing a function to clear struct add_p_state`` and use it instead of freeing individual members.

add Git 2.27

Source Link

edited May 1, 2020 at 21:48

VonC

1.4m
569
4.8k
5.7k

Before Git 2.27 (Q2 2020), allowing the user to split a patch hunk while "git stash -p" does not work well; a band-aid has been added to make this (partially) work better.

See commit 7723436, commit 121c0d4 (08 Apr 2020) by Johannes Schindelin (dscho).
^{(Merged by Junio C Hamano -- gitster -- in commit e81ecff, 28 Apr 2020)}

stash -p: (partially) fix bug concerning split hunks

^{Signed-off-by: Johannes Schindelin}

When trying to stash part of the worktree changes by splitting a hunk and then only partially accepting the split bits and pieces, the user is presented with a rather cryptic error:

error: patch failed: <file>:<line>

error: test: patch does not apply Cannot remove worktree changes

and the command would fail to stash the desired parts of the worktree changes (even if the stash ref was actually updated correctly).

We even have a test case demonstrating that failure, carrying it for four years already.

The explanation: when splitting a hunk, the changed lines are no longer separated by more than 3 lines (which is the amount of context lines Git's diffs use by default), but less than that.

So when staging only part of the diff hunk for stashing, the resulting diff that we want to apply to the worktree in reverse will contain those changes to be dropped surrounded by three context lines, but since the diff is relative to HEAD rather than to the worktree, these context lines will not match.

Example time. Let's assume that the file README contains these lines:

We

the people

and the worktree added some lines so that it contains these lines instead:

We

are the kind people

and the user tries to stash the line containing "are", then the command will internally stage this line to a temporary index file and try to revert the diff between HEAD and that index file.
The diff hunk that git stash tries to revert will look somewhat like this:

@@ -1776,3 +1776,4

 We +are the people

It is obvious, now, that the trailing context lines overlap with the part of the original diff hunk that the user did not want to stash.

Keeping in mind that context lines in diffs serve the primary purpose of finding the exact location when the diff does not apply precisely (but when the exact line number in the file to be patched differs from the line number indicated in the diff), we work around this by reducing the amount of context lines: the diff was just generated.

Note: this is not a full fix for the issue.
Just as demonstrated in t3701's 'add -p works with pathological context lines' test case, there are ambiguities in the diff format. It is very rare in practice, of course, to encounter such repeated lines.

The full solution for such cases would be to replace the approach of generating a diff from the stash and then applying it in reverse by emulating git revert (i.e. doing a 3-way merge). However, in git stash -p it would not apply to HEAD but instead to the worktree, which makes this non-trivial to implement as long as we also maintain a scripted version of add -i.

Before Git 2.27 (Q2 2020), allowing the user to split a patch hunk while "git stash -p" does not work well; a band-aid has been added to make this (partially) work better.

See commit 7723436, commit 121c0d4 (08 Apr 2020) by Johannes Schindelin (dscho).
^{(Merged by Junio C Hamano -- gitster -- in commit e81ecff, 28 Apr 2020)}

stash -p: (partially) fix bug concerning split hunks

^{Signed-off-by: Johannes Schindelin}

When trying to stash part of the worktree changes by splitting a hunk and then only partially accepting the split bits and pieces, the user is presented with a rather cryptic error:

error: patch failed: <file>:<line>

error: test: patch does not apply Cannot remove worktree changes

and the command would fail to stash the desired parts of the worktree changes (even if the stash ref was actually updated correctly).

We even have a test case demonstrating that failure, carrying it for four years already.

The explanation: when splitting a hunk, the changed lines are no longer separated by more than 3 lines (which is the amount of context lines Git's diffs use by default), but less than that.

So when staging only part of the diff hunk for stashing, the resulting diff that we want to apply to the worktree in reverse will contain those changes to be dropped surrounded by three context lines, but since the diff is relative to HEAD rather than to the worktree, these context lines will not match.

Example time. Let's assume that the file README contains these lines:

We

the people

and the worktree added some lines so that it contains these lines instead:

We

are the kind people

and the user tries to stash the line containing "are", then the command will internally stage this line to a temporary index file and try to revert the diff between HEAD and that index file.
The diff hunk that git stash tries to revert will look somewhat like this:

@@ -1776,3 +1776,4

 We +are the people

It is obvious, now, that the trailing context lines overlap with the part of the original diff hunk that the user did not want to stash.

Keeping in mind that context lines in diffs serve the primary purpose of finding the exact location when the diff does not apply precisely (but when the exact line number in the file to be patched differs from the line number indicated in the diff), we work around this by reducing the amount of context lines: the diff was just generated.

Note: this is not a full fix for the issue.
Just as demonstrated in t3701's 'add -p works with pathological context lines' test case, there are ambiguities in the diff format. It is very rare in practice, of course, to encounter such repeated lines.

The full solution for such cases would be to replace the approach of generating a diff from the stash and then applying it in reverse by emulating git revert (i.e. doing a 3-way merge). However, in git stash -p it would not apply to HEAD but instead to the worktree, which makes this non-trivial to implement as long as we also maintain a scripted version of add -i.