I've been following this excellent answer to extract a subdirectory of my git repository into its own repository, while retaining the complete history.

My repository looks like:

src/
    http/
    math/
tests/
    http/
    math/

I want to create a new branch that only contains the src/math and tests/math directories.

If I run the following command:

git subtree split -P src/math -b math 

It creates a branch that contains the contents of the src/math directory, but discards the src/math/ prefix.

If I try the same command with two directories:

git subtree split -P src/math -P tests/math -b math 

It only extracts the contents of tests/math, ignoring src/math, and also discarding the tests/math prefix.

To summarize, I would like my final repository to look like:

src/
    math/
tests/
    math/

That is, keeping the original directory structure but discarding everything that's not explicitly mentioned on the command line.

How can I do that?

  • I guess Downvoter did not understand the question. Commented Oct 8, 2014 at 10:05
  • this is not exactly a dup of stackoverflow.com/questions/2982055/… but it does ask for the same result. It's just that here the question is specific to git subtree split. I followed the procedure in the first answer and it works like a charm Commented Mar 25, 2015 at 21:06
  • It's a bit more work, but could you preserve history by moving each of the subdirectories to a new directory and then splitting that new common parent directory? You would have the extra commit from moving the files, but is that such a bad thing? No changes to prior commit hashes... Commented Aug 10, 2015 at 15:01
  • @cowboydan: FYI: I just confirmed that git subtree split does not maintain the rename history from outside of that directory, so this approach would be as effective as just copying the files to a new repository. It’s also worth noting that git subtree split itself rewrites your history, generating new hashes. Commented Dec 20, 2019 at 22:49
  • Does this answer your question? Detach many subdirectories into a new, separate Git repository Commented Oct 5, 2020 at 21:33

4 Answers

Use git-subtree add to split-in

# First create two *split-out* branches
cd /repos/repo-to-split
git subtree split --prefix=src/math --branch=math-src
git subtree split --prefix=test/math --branch=math-test

# Now create the new repo
mkdir /repos/math
cd /repos/math
git init

# This approach has a gotcha:
# you must commit something so that "revision history begins",
# or `git subtree add` will complain.
# In this example, an empty `.gitignore` is committed.
touch .gitignore
git add .gitignore
git commit -m "add empty .gitignore to allow using git-subtree"

# Finally, *split-in* the two branches
git subtree add --prefix=src/math ../repo-to-split math-src
git subtree add --prefix=test/math ../repo-to-split math-test

It worked for me with git --version 2.23.0. Also note that you can set up different prefixes at split-in time, e.g. add src/math/ to src/ and test/math/ to test/.
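As a rough illustration of choosing a different prefix at split-in time, the sketch below runs the whole round trip in a throwaway directory. The file names and the lib/math prefix are made up for the example. It also assumes git subtree is installed: it is a contrib script that some git packages leave out, so the sketch checks for it first and skips otherwise.

```shell
#!/bin/sh
# Sketch only: every path, file name, and the lib/math prefix are hypothetical.
set -eu

if [ ! -x "$(git --exec-path)/git-subtree" ]; then
    # git-subtree ships in git's contrib area and may be packaged separately.
    layout="skipped"
    echo "git subtree is not installed; skipping"
else
    work=$(mktemp -d)

    # Source repository with the directory we want to extract.
    git init -q "$work/repo-to-split"
    cd "$work/repo-to-split"
    git config user.name "Example"
    git config user.email "example@example.com"
    mkdir -p src/math src/http
    echo "add" > src/math/add.txt
    echo "get" > src/http/get.txt
    git add .
    git commit -q -m "initial layout"

    # Split out: the new branch holds the contents of src/math, without the prefix.
    git subtree split --prefix=src/math --branch=math-src >/dev/null 2>&1

    # Target repository; it needs one commit before `git subtree add` will run.
    git init -q "$work/math"
    cd "$work/math"
    git config user.name "Example"
    git config user.email "example@example.com"
    touch .gitignore
    git add .gitignore
    git commit -q -m "add empty .gitignore"

    # Split in under a prefix chosen here rather than the original one.
    git subtree add --prefix=lib/math ../repo-to-split math-src >/dev/null 2>&1

    # List every file in the resulting tree.
    layout=$(git ls-tree -r --name-only HEAD)
    echo "$layout"
fi
```

Listing the tree with git ls-tree is a quick way to confirm that the files landed under the prefix you passed to git subtree add.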

Side note: use git log in the new repo before pushing to a remote, to see if the resulting history is good enough for you. In my case I have some commits with duplicated messages, because my repo history was so dirty, but that's OK for me.
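One possible way to spot those duplicated messages is to list every commit subject that occurs more than once. The sketch below builds a tiny scratch repository with made-up commit messages purely to show the pipeline; in a real repository only the git log line is needed.

```shell
#!/bin/sh
# Sketch only: the scratch repository and commit messages are hypothetical.
set -eu

work=$(mktemp -d)
git init -q "$work/demo"
cd "$work/demo"
git config user.name "Example"
git config user.email "example@example.com"

# Three commits, two of which share a subject, as can happen when one
# original commit is split once per directory.
git commit -q --allow-empty -m "Fix rounding in sqrt"
git commit -q --allow-empty -m "Add tests"
git commit -q --allow-empty -m "Fix rounding in sqrt"

# Print each commit subject that appears more than once in the history.
duplicates=$(git log --format=%s | sort | uniq -d)
echo "$duplicates"
```

An empty result means no two commits share a subject line; it says nothing about whether their contents overlap.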

Source

3 Comments

You will get duplicates no matter how clean your commit history is because each time you call git subtree split it is rewriting your commit history for that branch. In this example, if you had ten commits that touched both src/math and test/math, those will now become twenty commits. Worse, each of those commits will, of course, be limited to modifications in one folder. That’s why it’s desirable instead to use something like git filter-branch (or, better yet, the third-party git filter-repo) so that you can include multiple folders in a single rewrite operation.
If you don't care about duplicate commit history, as @JeremyCaney pointed out above, this is by far the simplest correct answer.
This is the best answer for me. Other approaches are less intuitive, or inefficient.

Use git-filter-repo

This is not part of git as of version 2.25. It requires Python 3 (>=3.5) and git 2.22.0 or later.

git filter-repo --path src/math --path tests/math 

For my repo, which contained ~12000 commits, git-filter-branch took more than 24 hours and git-filter-repo took less than a minute.
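A minimal sketch of the workflow around that command, assuming git-filter-repo is installed and using a made-up scratch repository. filter-repo expects to run in a fresh clone, and for a clone of a local path that means passing --no-local to git clone.

```shell
#!/bin/sh
# Sketch only: the scratch repository and its file names are hypothetical.
set -eu

if ! git filter-repo --version >/dev/null 2>&1; then
    remaining="skipped"
    echo "git filter-repo is not installed; skipping"
else
    work=$(mktemp -d)

    # Original repository with four directories.
    git init -q "$work/origin"
    cd "$work/origin"
    git config user.name "Example"
    git config user.email "example@example.com"
    mkdir -p src/math src/http tests/math tests/http
    for d in src/math src/http tests/math tests/http; do
        echo "$d" > "$d/file.txt"
    done
    git add .
    git commit -q -m "initial layout"

    # Work on a fresh clone so the original repository stays untouched.
    git clone -q --no-local "$work/origin" "$work/math"
    cd "$work/math"

    # Keep only the listed paths, in every commit.
    git filter-repo --path src/math --path tests/math >/dev/null 2>&1

    # List the files that are left at HEAD.
    remaining=$(git ls-tree -r --name-only HEAD)
    echo "$remaining"
fi
```

Because the rewrite happens in the clone, the source repository keeps its full history if the result is not what you wanted.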

Comments

Depending on your needs you might get away with git filter-branch.

I'm not entirely sure what you are trying to achieve, but if you merely want to have a repository with two directories removed (in the history?) this is probably your best shot.

See also Rewriting Git History.

$ git filter-branch --tree-filter 'rm -rf tests/http src/http' --prune-empty HEAD 

This will look into each commit and remove the two directories from it. Be aware that this rewrites history (i.e., it changes your commit SHAs) and will cause headaches if you share history with another repository.
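If it is easier to list the directories to keep than the ones to delete, one alternative is sketched below. Note that it swaps --tree-filter for --index-filter, which is a different filter from the one used above: for each commit it empties the index and then restores only the wanted paths from the original commit. The scratch repository, file names, and commit messages are made up for the example.

```shell
#!/bin/sh
# Sketch only: the scratch repository and its file names are hypothetical.
set -eu

work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git config user.name "Example"
git config user.email "example@example.com"

mkdir -p src/math src/http tests/math tests/http
for d in src/math src/http tests/math tests/http; do
    echo "v1" > "$d/file.txt"
done
git add .
git commit -q -m "initial layout"
echo "v2" > src/http/file.txt
git commit -q -am "change only src/http"
echo "v2" > src/math/file.txt
git commit -q -am "change src/math"

# Newer git versions print a warning and pause before filter-branch runs;
# this variable skips that.
export FILTER_BRANCH_SQUELCH_WARNING=1

# For every commit: empty the index, then restore only the paths to keep
# from the original commit ($GIT_COMMIT is set by filter-branch).
git filter-branch --prune-empty --index-filter '
    git rm -r -q --cached --ignore-unmatch . &&
    git reset -q "$GIT_COMMIT" -- src/math tests/math
' HEAD >/dev/null 2>&1

# Files left at HEAD and number of commits left in the history.
kept_paths=$(git ls-tree -r --name-only HEAD)
commit_count=$(git rev-list --count HEAD)
echo "$kept_paths"
echo "commits: $commit_count"
```

In this scratch history the commit that touched only src/http becomes empty after filtering, so --prune-empty drops it and two commits remain.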

10 Comments

Actually, my example was an oversimplification, and I have many more folders than just http and math. Is there a way I can just specify which ones to keep, and not which ones to delete?
In essence, you can put a bash script in there. Can rm handle a list of what you'd like to keep? No. Can bash? Yes. Try using find to match your requirements: gnu.org/software/findutils/manual/html_mono/find.html
Furthermore, you can use regular expressions to match your directories. It's simpler and faster to tell rm what to delete.
linuxjournal.com/content/bash-extended-globbing might also be helpful for more powerful file globbing in bash.
Have you tried reading the manual of git filter-branch? There is a --prune-empty option to get rid of those commits.

To expand on @laconbass's answer:

Initialize an example git repository:

mkdir repos
mkdir repos/repo-to-split || mkdir repos\repo-to-split
cd repos/repo-to-split || cd repos\repo-to-split
git init
mkdir src
mkdir src/math1 || mkdir src\math1
mkdir test
mkdir test/math2 || mkdir test\math2
echo "base readme" >> readme.md
echo "src/math1 readme" >> src/math1/readme.md || echo "src/math1 readme" >> src\math1\readme.md
echo "test/math2 readme" >> test/math2/readme.md || echo "test/math2 readme" >> test\math2\readme.md
git add .
git commit -m "readme's added"

You need to:

  1. Make sure there are no uncommitted changes:
git add . && git stash
  2. Use git subtree split to split each directory from master/main into its own branch:
# Prefer directories in branch names like features/[feature name]
git subtree split --prefix=src/math1 --branch=math/src
git subtree split --prefix=test/math2 --branch=math/test
  3. Delete the directories you just created branches for:
# On Windows, when you install git, select the option to add Unix utilities to PATH for rm
rm -rf src/math1
rm -rf test/math2
git add .
git commit -m "temporarily remove git subtree split folder branches (adding back in next step)"
  4. Use git subtree add to add the branches back into master/main:
# Finally, *split-in* the two branches back into master
git subtree add --prefix=src/math1 ../repo-to-split math/src
git subtree add --prefix=test/math2 ../repo-to-split math/test
git add .
git commit -m "added git subtree split features back in"

Git subtree will keep a full copy of the branch in master/main.

If you need to git subtree add to other remote/external repositories, that works too.

I also highly recommend creating hooks so that git subtree push tries to push to the remote. It would be nice if permission failures during git subtree push were displayed as warnings and skipped.

Bonus:

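cd ../../repos || cd ..\..\repos
git clone --bare repo-to-split repo-to-split-worktree
cd repo-to-split-worktree
git worktree add master
# Name the branch explicitly; with only a path, git picks a branch named after the last path component
git worktree add math/src math/src
git worktree add math/test math/test
ls master || dir master
ls math || dir math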

You now have all branches in a single repository. You can navigate into each branch/trunk you added and work as normal, while having access to the other branches without needing to re-clone.
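To check which branch each worktree directory is actually on, something like the sketch below may help. It uses a made-up scratch repository with an empty commit and passes the branch name as the second argument to git worktree add; when only a path is given, git associates the worktree with a branch named after the last path component instead.

```shell
#!/bin/sh
# Sketch only: the scratch repository and branch contents are hypothetical.
set -eu

work=$(mktemp -d)
git init -q "$work/repo-to-split"
cd "$work/repo-to-split"
git config user.name "Example"
git config user.email "example@example.com"
git commit -q --allow-empty -m "initial commit"
git branch math/src

# Bare clone that will hold one worktree per branch.
git clone -q --bare "$work/repo-to-split" "$work/repo-to-split-worktree"
cd "$work/repo-to-split-worktree"

# First argument is the directory, second is the branch to check out there.
git worktree add math/src math/src >/dev/null 2>&1

# Show which branch the new worktree is on, then list all worktrees.
checked_out=$(git -C math/src rev-parse --abbrev-ref HEAD)
echo "$checked_out"
git worktree list
```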

2 Comments

The above example should run on Windows, Mac, and Linux.
I currently have no idea how to be inside a "git subtree split feature" branch and push changes to master/main - stackoverflow.com/questions/77187985/… :-(
