Flatten git subdirectory by converting dirs to new branches

Question

I have a git branch which contains a number of directories that I would like to flatten out into all new branches. It looks kind of like this:

git/dev/us -> us-one, us-two, us-three

I would like it to end up like this:

git/dev/us-one git/dev/us-two etc

It looks like this is possible using git-filter-branch, but I'm not sure how to use it with all the directories. Could I do something like:

for branch in `find . -type d -d 1`; do
    git filter-branch --subdirectory-filter $branch --prune-empty -- --all
done

git filter-branch is for copying existing commits while applying some transformation(s) on each one (before making the new copy). The arguments to filter-branch provide the transformations and tell it what branch names to update to point them to the newly-copied commits instead of their original (now-copied) commits. It will not create any new branch names.
– torek, Commented Jun 5, 2017 at 15:43
So it sounds like git-filter-branch could be used to almost cherry pick commits from one place to another? Does that mean if I'm transitioning a repo from SVN to git, and there are changes in the svn repo before the git change is complete, git-filter-branch could be used to move over ONLY those specific commits?
– user3270760, Commented Jun 5, 2017 at 15:51
You certainly could abuse git filter-branch to do that sort of thing (cherry-picking), but that's not what it's designed for. (See also Mark Adelsberger's answer, of course.)
– torek, Commented Jun 5, 2017 at 17:12

Mark Adelsberger · Accepted Answer · 2017-06-05 16:30:46Z

Well, I guess there's a way to do it. Pointing out a few things for posterity first...

Caveats

Since you're talking about using filter-branch, I guess you've thought through the implications of creating separate histories for each of these branches. Just in case, I'll point out that it may be difficult for someone who's used your old repo to locate the corresponding code version in the new repo. Also, if changes across directories were coordinated, those coordinations will be lost (unless you do something to recreate them).

Possibly related to that last point, if the structures in each of these directories are similar enough for this to make sense, then it would be surprising if they didn't sometimes need to "share" a change. Neither the "current state" structure you describe, nor the "goal state" you've asked for, really will support that very well, if it is indeed an issue.

But how would you do it?

In the simplest case you've just got one branch you want to "break out" for each directory. That's implied in your question, but just in case I'll provide a slightly more general procedure that should work for multiple branches.

So first of all, you need to actually create all the branches you'll want in the end. If you have tags, you might want to replicate them as well, but I'll come back to that.

For each branch that you have:

git checkout branchA
git branch branchA-one
git branch branchA-two
...
git checkout branchB
git branch branchB-one
git branch branchB-two
...
...
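If there are many directories, the branch creation can be looped. This is only a sketch: it builds a throwaway repository first, and the names branchA, dir-one, and dir-two are placeholders I've assumed for illustration, not anything from your repo.

```shell
#!/bin/sh
set -e

# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/branchA   # name the initial branch branchA
git config user.email demo@example.com
git config user.name Demo

mkdir dir-one dir-two
echo 1 > dir-one/a.txt
echo 2 > dir-two/b.txt
git add .
git commit -q -m "initial layout"

# One new branch per top-level directory, all starting at branchA's tip.
for dir in */; do
    dir=${dir%/}                            # strip the trailing slash
    git branch "branchA-${dir#dir-}" branchA
done

git branch --list 'branchA-*'
```

At this point the new branches are just extra names for the same commit; nothing has been rewritten yet.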

Then you can run filter-branch on each set of refs. You'll definitely use a subdirectory-filter, and if you have tags to preserve you'll want a tag-name-filter as well. If typical changes would affect some (but not all) subdirectories, then you'll need to decide if you want (a) to preserve parallel histories, or (b) to have filter-branch eliminate empty commits.

So in the simplest case:

git filter-branch --subdirectory-filter dir-one -- branchA-one branchB-one ...

If you want to preserve tags, the easiest thing is to copy them into each new history (using the same suffix as the branches), so

git filter-branch --subdirectory-filter dir-one --tag-name-filter 'sed s/$/-one/' -- branchA-one branchB-one

And you can throw in --prune-empty if you want to remove commits that don't affect the subtree for the branch.
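Putting those options together, here is a minimal end-to-end sketch on a throwaway repository. The layout (dir-one, dir-two), the branch branchA, and the tag v1 are all assumed for the demo. FILTER_BRANCH_SQUELCH_WARNING only silences the deprecation notice that recent Git versions print for filter-branch.

```shell
#!/bin/sh
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1

# Throwaway repository; directory, branch, and tag names are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/branchA
git config user.email demo@example.com
git config user.name Demo

mkdir dir-one dir-two
echo 1 > dir-one/a.txt
echo 2 > dir-two/b.txt
git add .
git commit -q -m "add both directories"
git tag v1
echo 3 >> dir-two/b.txt
git commit -q -a -m "change dir-two only"

for suffix in one two; do
    git branch "branchA-$suffix" branchA
    # -f lets a later run replace the backup refs that the previous run
    # left under refs/original/; without it the second run refuses to start.
    git filter-branch -f --prune-empty \
        --subdirectory-filter "dir-$suffix" \
        --tag-name-filter "sed s/\$/-$suffix/" \
        -- "branchA-$suffix"
done

git log --oneline branchA-one   # history limited to what touched dir-one
git tag -l                      # original tag plus the suffixed copies
```

Because the tag-name-filter changes the name, the original tag is left pointing at the old history and each rewritten history gets its own suffixed copy.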

Each run of filter-branch will create some "backup refs" (refs/original/...) that you'll want to clean up.
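One way to do that cleanup, sketched on a throwaway repository. The backup ref here is created by hand to stand in for what filter-branch would leave behind, and its name is made up for the demo.

```shell
#!/bin/sh
set -e

# Throwaway repository with a hand-made backup ref (hypothetical name).
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email demo@example.com
git config user.name Demo
git commit -q --allow-empty -m "placeholder commit"
git update-ref refs/original/refs/heads/branchA-one HEAD

# Delete every ref under refs/original/.
git for-each-ref --format='%(refname)' refs/original/ |
while read -r ref; do
    git update-ref -d "$ref"
done
```

Only do this once you're happy with the rewritten branches, since those refs are the easy way back to the old history.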

Note that what this doesn't provide, most likely, is a common root commit. You will end up with literally independent histories in the repo. Often that doesn't matter. It's convenient if you decide later to split them into different repos. It's less convenient if you ever foresee merging across these histories.

In that last case, you can work around the independent histories most of the time (e.g. --allow-unrelated-histories in a merge), but if you don't want to have to, then you'd modify the above procedure. Before doing the filter-branches you would create an empty "shared root":

git checkout --orphan newRoot
git rm -rf .
git commit --allow-empty -m "Empty shared root"

And then you would include a parent-filter to graft the root of each rewrite onto newRoot.
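A sketch of what that could look like, again on a throwaway repository with assumed names (branchA, dir-one). The parent-filter receives each commit's parent list on stdin; for a root commit that list is empty, so the sed expression replaces the empty line with "-p" plus the shared root's hash.

```shell
#!/bin/sh
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1

# Throwaway repository; all names are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/branchA
git config user.email demo@example.com
git config user.name Demo

mkdir dir-one
echo 1 > dir-one/a.txt
git add .
git commit -q -m "add dir-one"
git branch branchA-one

# Empty shared root on its own orphan branch.
git checkout -q --orphan newRoot
git rm -rf -q .
git commit -q --allow-empty -m "Empty shared root"
root=$(git rev-parse newRoot)
git checkout -q branchA

# Graft the rewritten root commit onto newRoot.
git filter-branch --subdirectory-filter dir-one \
    --parent-filter "sed 's/^\$/-p $root/'" \
    -- branchA-one

git log --oneline branchA-one   # rewritten commit on top of the shared root
```

Running the same command for each of the other branches (with its own directory) gives every rewritten history the same root commit, so later merges across them have a common ancestor.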
