In this way you do not risk pushing a dangling submodule reference, i.e., a reference in the parent repository to a commit in the submodule that has not been pushed.
To be extra sure, you can use git fsck with Git 2.31 (Q1 2021): the approach to "fsck" the incoming objects in "index-pack" is attractive for performance reasons (we have them already in core, inflated and ready to be inspected), but fundamentally cannot be applied fully when we receive more than one pack stream, as a tree object in one pack may refer to a blob object in another pack as ".gitmodules", when we want to inspect blobs that are used as ".gitmodules" file, for example.
Teach "index-pack" to emit objects that must be inspected later and check them in the calling "fetch-pack" process.
See commit 5476e1e, commit b664e9f, commit 27e35ba, commit 726b25a (22 Feb 2021) by Jonathan Tan (jhowtan).
(Merged by Junio C Hamano -- gitster -- in commit 6ee353d, 01 Mar 2021)
fetch-pack: print and use dangling .gitmodules
Signed-off-by: Jonathan Tan
Teach index-pack to print dangling .gitmodules links after its "keep" or "pack" line instead of declaring an error, and teach fetch-pack to check such lines printed.
This allows the tree side of the .gitmodules link to be in one packfile and the blob side to be in another without failing the fsck check, because it is now fetch-pack which checks such objects after all packfiles have been downloaded and indexed (and not index-pack on an individual packfile, as it is before this commit).
git index-pack now includes in its man page:
For internal use only.
Die if the pack contains broken objects. If the pack contains a tree pointing to a .gitmodules blob that does not exist, prints the hash of that blob (for the caller to check) after the hash that goes into the name of the pack/idx file (see "Notes").
And this is made more robust with Git 2.32 (Q2 2021):
See commit 3745e26, commit c96e184, commit 462f5ca, commit c15087d, commit 53692df, commit 394d5d3, commit 44e07da, commit 901f2f6, commit b549502, commit c72da1a, commit 30cf618, commit 1b32b59, commit e35d65a, commit 35af754, commit 034a7b7, commit f1abc2d, commit a1aad71, commit d385784 (28 Mar 2021), and commit fb79f5b (17 Mar 2021) by Ævar Arnfjörð Bjarmason (avar).
(Merged by Junio C Hamano -- gitster -- in commit 5644419, 07 Apr 2021)
fetch-pack: use new fsck API to printing dangling submodules
Signed-off-by: Ævar Arnfjörð Bjarmason
Refactor the check added in 5476e1e ("fetch-pack: print and use dangling .gitmodules", 2021-02-22, Git v2.31.0-rc1 -- merge) to make use of us now passing the "msg_id" to the user defined "error_func".
We can now compare against the FSCK_MSG_GITMODULES_MISSING instead of parsing the generated message.
"git fetch-pack -k -k"(man) without passing --lock-pack (which we never do ourselves) did not work at all, which has been corrected with Git 2.46 (Q3 2024), batch 17.
See commit 96a6621 (19 Jun 2024) by Jeff King (peff).
(Merged by Junio C Hamano -- gitster -- in commit b781a3e, 27 Jun 2024)
fetch-pack: fix segfault when fscking without --lock-pack
Reported-by: Kirill Smelkov
Signed-off-by: Jeff King
The fetch-pack internals have multiple options related to creating ".keep" lock-files for the received pack:
- if
args.lock_pack is set, then we tell index-pack to create a .keep file.
In the fetch-pack plumbing command, this is triggered by passing "-k" twice. - if the caller passes in a
pack_lockfiles string list, then we use it to record the path of the keep-file created by index-pack.
We get that name by reading the stdout of index-pack.
In the fetch-pack command, this is triggered by passing the (undocumented) --lock-pack option; without it, we pass in a NULL string list.
So it's possible to ask index-pack to create the lock-file (using "-k -k") but not ask to record it (by avoiding "--lock-pack").
This worked fine until 5476e1e ("fetch-pack: print and use dangling .gitmodules", 2021-02-22, Git v2.31.0-rc1 -- merge), but now it causes a segfault.
Before that commit, if pack_lockfiles was NULL, we wouldn't bother reading the output from index-pack at all.
But since that commit, index-pack may produce extra output if we asked it to fsck.
So even if nobody cares about the lockfile path, we still need to read it to skip to the output we do care about.
We correctly check that we didn't get a NULL lockfile path (which can happen if we did not ask it to create a .keep file at all), but we missed the case where the lockfile path is not NULL (due to "-k -k") but the pack_lockfiles string_list is NULL (because nobody passed "--lock-pack"), and segfault trying to add to the NULL string-list.
We can fix this by skipping the append to the string list when either the value or the list is NULL.
In that case we must also free the lockfile path to avoid leaking it when it's non-NULL.
Nobody noticed the bug for so long because the transport code used by "git fetch"(man) always passes in a pack_lockfiles pointer, and remote-curl (the main user of the fetch-pack plumbing command) always passes --lock-pack.