I'm currently trying to reduce the size of my Git repository but faced many issues.

Introduction

I have a huge and complex Git repository containing thousands of commits and more than ten branches. Its current size is over 2 GB.

What I want to do

I would like to clean the repository history in order to reduce its size as much as possible. I chose a special commit that I want to be my new root commit (call it <NEW_ROOT>); I want to remove every commit before <NEW_ROOT> and keep all the commits after.

I want to keep only the master and, possibly, develop branches; any other branch should be removed from history to reduce size.

At the end of the procedure I want to push everything to remote, so that it only keeps updated master and origin (basically it must reflect my local situation).

What I tried so far

I browsed the web a lot and found many solutions, but none of them worked for me. In particular, I thought that such a solution would be perfect in my case; unfortunately, I got a lot of conflicts when rebasing.

I also struggled a lot because many of the solutions I found refer to obsolete and deprecated tools/options (e.g. git filter-branch).

Could you please help me find a way out?

Thanks a lot!

1 Answer

This sounds like something you can achieve by doing a shallow clone of your local large repository:

A shallow repository has an incomplete history some of whose commits have parents cauterized away. [...] This is sometimes useful when you are interested only in the recent history of a project even though the real history recorded in the upstream is much larger.

The idea is to shallow clone your local repository into a new directory starting from the commit you deemed to be the new root. Note that this solution assumes that you're only interested in keeping a single branch in the new repository (e.g. master).

The first thing you need to do is create a branch reference that points to the parent of <NEW_ROOT> in the existing repository:

cd your-large-repo
git branch new-root <NEW_ROOT>^

We'll use new-root as the cut-off point for the shallow clone. Since we do want to include <NEW_ROOT> in the new repository, we set the cut-off point to its parent. Of course, <NEW_ROOT> must be reachable from master.
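Before creating the branch, it may be worth checking that <NEW_ROOT> really is reachable from master. The following is a minimal sketch that runs against a throwaway repository; the five empty commits c1..c5, the choice of c3 as <NEW_ROOT>, and the variable names are assumptions made purely for illustration.

```shell
#!/bin/sh
set -e

# Throwaway repository standing in for your-large-repo (illustrative only).
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$tmp/big"
cd "$tmp/big"
git symbolic-ref HEAD refs/heads/master   # make sure the branch is called master
for i in 1 2 3 4 5; do
    git commit -q --allow-empty -m "c$i"
done

# Pretend the third commit (c3) is <NEW_ROOT>.
new_root=$(git rev-parse master~2)

# merge-base --is-ancestor exits with status 0 when the first commit
# is an ancestor of (i.e. reachable from) the second.
if git merge-base --is-ancestor "$new_root" master; then
    reachable=yes
else
    reachable=no
fi
echo "NEW_ROOT reachable from master: $reachable"

# Create the cut-off branch at the parent of <NEW_ROOT>.
git branch new-root "$new_root^"
cutoff_subject=$(git log -1 --format=%s new-root)
echo "new-root points at: $cutoff_subject"
```

In a real repository you would skip the setup and run only the `git merge-base --is-ancestor` check and the `git branch` command, substituting your own commit for `$new_root`.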

At this point, you can go ahead and clone your local repository into a new directory specifying that:

  1. You're only interested in the master branch
  2. You want to exclude all the commits reachable from new-root

Here's the complete command:

git clone --branch master --shallow-exclude=new-root file://C:\path\to\your-large-repo C:\path\to\your-new-repo 

The --shallow-exclude option is what tells Git to exclude all commits leading up to and including new-root from the clone.

Now, if you cd into your-new-repo, you'll find that it only contains the master branch and that the root commit is <NEW_ROOT>.
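To see the whole procedure on a small scale before touching the real repository, here is a hedged end-to-end sketch. It builds a toy repository in a temporary directory (the commits c1..c5, the directory names big and small, and the variable names are all made up for the example), cuts it at c3, and then inspects the clone.

```shell
#!/bin/sh
set -e

tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Toy stand-in for your-large-repo: five empty commits on master.
git init -q "$tmp/big"
cd "$tmp/big"
git symbolic-ref HEAD refs/heads/master
for i in 1 2 3 4 5; do
    git commit -q --allow-empty -m "c$i"
done

# c3 plays the role of <NEW_ROOT>; the cut-off branch sits on its parent, c2.
git branch new-root master~2^

# Clone through file:// so that the shallow options are honored for a local source.
git clone -q --branch master --shallow-exclude=new-root \
    "file://$tmp/big" "$tmp/small"

cd "$tmp/small"
commit_count=$(git rev-list --count master)                 # commits visible in the clone
root_subject=$(git log --max-parents=0 --format=%s master)  # subject of the apparent root
is_shallow=$(git rev-parse --is-shallow-repository)

echo "commits in clone: $commit_count"
echo "root commit:      $root_subject"
echo "shallow:          $is_shallow"
```

In this toy history the clone should contain only the commits from c3 onwards, with c3 showing up as the root.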

The new repository will have its origin set to file://C:\path\to\your-large-repo. So, before you go any further, you'll have to replace it with the actual URL of the remote repository:

git remote set-url origin https://example.com/your-large-repo.git 
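If you want to double-check that the URL was actually replaced, `git remote get-url` prints the current value. A small sketch on a scratch repository follows; the temporary paths and variable names are assumptions for the example, and the URL is the same placeholder used above.

```shell
#!/bin/sh
set -e

tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

git init -q "$tmp/repo"
cd "$tmp/repo"

# Simulate the origin that a clone from a local path leaves behind.
git remote add origin "file://$tmp/your-large-repo"
before=$(git remote get-url origin)

# Point origin at the real remote (placeholder URL).
git remote set-url origin https://example.com/your-large-repo.git
after=$(git remote get-url origin)

echo "before: $before"
echo "after:  $after"
```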

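At this point, you can force push the new history to the remote repository (with the usual caveat on the consequences of force pushing). Note that a shallow clone does not rewrite commits, so their IDs stay the same as in the original repository; whether the remote itself ends up smaller depends on how the server handles the push and its own cleanup.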

1 Comment

Finally a working solution, thanks a lot!!! I tried to use a shallow copy but it didn't work because the "server doesn't support shallow copies". I didn't realize I could achieve the same by using the local repository, now I almost halved the size. Thanks! :-)
