14

Is it possible to download a repository's commits, branches, and tags, excluding blobs and trees? I would like to be able to view the history and whatnot without downloading the files (this is for the Chromium repo, which is multiple gigs). Obviously I will not be able to see which files were affected by a commit, but that's fine.

2
  • Please check this: stackoverflow.com/a/3489576/6435375 Commented Jun 7, 2016 at 16:04
  • 1
    Thanks @RicoHerlt, but that is referring to downloading only a limited history, including the blobs/trees. I am hoping to get the entire history without the blobs/trees. Commented Jun 7, 2016 at 16:33

4 Answers

9

No, or at least, not using any ordinary access. Some sites offer web access, through which you can obtain the contents of every commit object without also obtaining tree and blob objects, but the normal process of receiving objects or thin packs is either truncated at the commit level (via --depth) or is complete.

You can of course see all visible tags with git ls-remote as well as through any sensible web interface (it would be weird to provide something like GitHub's fancy API if you didn't provide the tags that way :-) ).
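
As a rough sketch of the git ls-remote approach, the script below lists branches and tags without fetching any objects. The throwaway local repository, its path, and the v1.0 tag are assumptions made only so the example runs offline; against a real host you would pass the remote URL instead.

```shell
#!/bin/sh
set -eu

# Hypothetical stand-in for a remote: a throwaway local repository.
work=$(mktemp -d)
git init -q "$work/origin"
git -C "$work/origin" config user.email "demo@example.com"
git -C "$work/origin" config user.name "Demo"
git -C "$work/origin" commit -q --allow-empty -m "initial"
git -C "$work/origin" tag v1.0

# List branch and tag refs; no commits, trees, or blobs are downloaded.
refs=$(git ls-remote --heads --tags "file://$work/origin")
printf '%s\n' "$refs"
```

Each output line is an object ID followed by a ref name such as refs/tags/v1.0. This gives you the names and tips of branches and tags only, not the commit history behind them.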

Note that traversing all commits via a web API may be tremendously slow, either because you have to stop and wait for each response (if you program it synchronously rather than as a streaming process) or because of rate-limiting software on the host (GitHub and Bitbucket both seem to do rate limiting).


5
+100

We are building ghuser.io (enhanced GitHub profile pages) and any way to get the commit history without files would help us tremendously to scale.

Then you would need to set up a mirror server with GVFS (Git Virtual File System) / VFS For Git support.

Between June 2016 (when the OP asked the question) and now (Q4 2018), Microsoft proposed VFS For Git (Feb. 2017), which allows you to develop with terabyte-scale repos(!) without having the files downloaded. Issue 72 is soon to be resolved.

GitHub itself should support it soon.

See more at gvfs.io, a domain name which has since been renamed to reflect the new "VFS For Git" name: https://vfsforgit.org.
(Microsoft/VFSForGit.WWW issue 9 is closed, Nov. 28th 2018)

Note (Feb. 2021): the certificate issue regarding https://vfsforgit.org has finally been resolved; see microsoft/VFSForGit issue 1705.

5 Comments

Very interesting... Any clue when GitHub will support VFS for Git? And would it mean that all current repos on GitHub would be accessible with VFS for Git?
No idea yet (GitHub support might have more information). In the meantime, keep an eye on blog.github.com/changelog and blog.github.com/category/announcements. For your second question: yes. All repos could then be cloned through that virtual filesystem.
Still, you can set up your own server with a recent Git + VFS4G and use that server as a mirror from which you can clone/pull and push back to.
We want to get the entire commit history of several hundred thousand repos. Setting up such a server for so many repos would cost a lot of money. (ghuser.io is currently serving 2654 users that contributed to 73,359 repos)
@brillout I agree. Somehow the scope of your question escaped me. I would still maintain VFS4G is a good fit for your original question.
0

The "Partial Clone" feature was added in git 2.19.

Documentation here: https://www.git-scm.com/docs/partial-clone

In order to use it:

  • You need git >= 2.19 on both the server and the client.
  • On the server, enable the feature: git config --global uploadpack.allowFilter true
  • On the client, clone with a tree filter: git clone --filter=tree:0 REMOTE_URL
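
The steps above can be tried end to end with something like the following sketch. It uses a throwaway local repository as the "server" (an assumption so that it runs offline; the paths, commit messages, and file names are hypothetical) and assumes a git recent enough to accept --filter=tree:0. The file:// URL matters here, because a plain local path makes git bypass the normal transport and ignore the filter.

```shell
#!/bin/sh
set -eu

# Throwaway working directory; every path below is hypothetical.
work=$(mktemp -d)
cd "$work"

# 1. A small "server" repository with two commits.
git init -q server
cd server
git config user.email "demo@example.com"
git config user.name "Demo"
echo one > file.txt && git add file.txt && git commit -q -m "first"
echo two > file.txt && git commit -q -am "second"

# 2. Server side: allow clients to request filtered packs.
#    (Set per repository here; the answer above sets it with --global.)
git config uploadpack.allowFilter true
cd ..

# 3. Client side: clone commits only. --no-checkout keeps git from
#    fetching trees and blobs to populate a working tree.
git clone -q --no-checkout --filter=tree:0 "file://$work/server" client

# 4. Plain history only needs commit objects, so this works locally.
git -C client log --format=%s
```

Commands that need file contents or per-commit file lists (for example git log -p or git checkout) would make git fetch the missing trees and blobs on demand, so stick to commit-level queries if the goal is to avoid that download.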


-1

You can achieve this with the GitHub API.

https://developer.github.com/v3/repos/commits/#list-commits-on-a-repository

