Pulling certain git branches in a script

Question

I have a script that starts like this:

    #!/bin/sh
    for b in `git branch -r | grep -v -- '->'`; do
        git branch --track ${b##origin/} $b
    done
    git fetch --all

This will fetch all of the remote branches. I only want to fetch the branches that start with the word "hotfix".

How can I do this?

Edit: At the beginning of the script, I'd also like to delete all of the local branches except master.

torek · Accepted Answer · 2017-01-03 23:58:56Z

This doesn't do what you think it does. I'm going to address this a bit backwards, because....

Item #2: git fetch --all means to fetch from all remotes. This has little to do with branches.

Item #1: let's define the term remote, since Git doesn't do a very good job of it. (The gitglossary documentation describes a remote repository and a remote-tracking branch without ever defining the word "remote"!) A remote is mainly just a short, one-word name for a URL. The classic remote is the word origin. Most repositories are created by cloning, and cloning sets up a remote to save the original URL. The default name for this remote is origin.

Hence, git fetch --all fetches from all remotes. Unless you have more than one remote, this does nothing special. If you only have origin, you're still just fetching from origin as usual.
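As a minimal sketch of "a remote is a short name for a URL", the following creates a throwaway repository and registers two remotes. The directory, the second remote name `upstream`, and both URLs are made up for illustration; nothing is contacted.

```sh
#!/bin/sh
# Sketch: remotes are just names for URLs; "git fetch --all" would loop over them.
set -e
tmp=$(mktemp -d)                 # throwaway directory for the demo
git init -q "$tmp/demo"
cd "$tmp/demo"

# Hypothetical URLs, used only as labels here.
git remote add origin   https://example.com/project.git
git remote add upstream https://example.com/upstream.git

git remote                       # lists the remote names, one per line
git remote get-url origin        # prints the URL that "origin" stands for
```

With only `origin` configured, `git fetch --all` and `git fetch origin` contact the same single repository.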

Item #3 is the question of what, precisely, Git actually fetches. Here, things get complicated. Let's start by noting, though, that each commit has its own unique ID, those big ugly SHA-1 hash IDs that Git prints (often abbreviated, as face0ff or cafedad or whatever).

One of the keys to understanding Git is to recognize that branch names, in Git, have rather little importance. They mainly matter in two ways, one of which ties in to git fetch, so we will get back to this soon—but first, we need to view the commit graph (DAG, in gitglossary). The DAG, or Directed Acyclic Graph, is made from all the commits in your repository. It's this graph that git fetch fetches, with names merely being ways to get started within the graph. A name, like master or branch1, translates to a commit ID. Git looks up the commit by its ID, and that gets Git the contents of the commit.

Each commit stores, in its contents, the IDs of its parent commits. Most commits just have one parent. Merge commits are those that have more than one parent, and there is at least one root commit, which has no parents. The very first commit in a repository is necessarily a root commit. This means we can always start from one of the most recent commits, and work backwards, following that commit to its parent. Using that parent, we find its parents, and using those parents, we find yet more parents, until we eventually work our way back to the root. If we draw the process, we get something like this, if there are no branches and merges:

o <- o <- o ... <- o <- o <-- most-recent

where each o represents a commit, and the backwards-pointing arrows go from commit to parent. Because they only point backwards, not forwards, we can only go from child to parent, never from parent to child.¹
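The backwards walk can be tried out in a scratch repository. This is only a sketch: the `commit` helper, the identity `demo`, and the three empty commits are assumptions made for the demonstration.

```sh
#!/bin/sh
# Sketch: follow parent links from the newest commit back to the root.
set -e
tmp=$(mktemp -d)
git init -q "$tmp/demo"
cd "$tmp/demo"
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"

# Hypothetical helper: create an empty commit with a fixed demo identity.
commit() {
    git -c user.name=demo -c user.email=demo@example.com -c commit.gpgsign=false \
        commit -q --allow-empty -m "$1"
}
commit first; commit second; commit third

tip=$(git rev-parse master)          # the name gets us the most recent commit
parent=$(git rev-parse "$tip^")      # each commit records its parent's ID
root=$(git rev-parse "$parent^")     # two steps back is the root here

# --parents prints a commit followed by its parents; a root has none.
git rev-list --parents -n 1 "$root"
count=$(git rev-list --count master) # commits reachable from master
echo "$count commits reachable from master"
```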

The backwards direction of the internal links is usually not that important, except that they show why we have to have a name—such as a branch name—to get us started: the most recent commit has no later commit pointing to it. This is where branch names come in, in Git: branch names are how we find the most recent commits.

Thus, we might draw the DAG like this, if there are three names, master and two branches, and no visible merges:

    o--o--o--...--o--o   <-- master
        \
         o--...--o   <-- branch1
              \
               o   <-- branch2

Newer commits are towards the right and older ones towards the left in this drawing. It's also worth pointing out that the root commit is in fact on all three branches, and in this particular graph, all but one of the commits that are on branch2 are also on branch1. This is another key to understanding Git: commits are often on many branches at the same time. A branch name just gets us started, so that we don't miss any commits. It doesn't have to be the only way to get to a commit, but we need some name by which we can find every commit.²

In short, these names—branch names, tag names, or any other names—make some set of commits reachable.
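To see that one commit can be on several branches at once, here is a rough sketch that builds a graph shaped like the drawing above in a scratch repository. The `commit` helper and the commit messages are assumptions for the demo.

```sh
#!/bin/sh
# Sketch: the root commit is reachable from master, branch1 and branch2.
set -e
tmp=$(mktemp -d)
git init -q "$tmp/demo"
cd "$tmp/demo"
git symbolic-ref HEAD refs/heads/master

# Hypothetical helper: create an empty commit with a fixed demo identity.
commit() {
    git -c user.name=demo -c user.email=demo@example.com -c commit.gpgsign=false \
        commit -q --allow-empty -m "$1"
}

commit root; commit m1
git branch branch1                 # branch1 forks from master here
commit m2                          # only on master
git checkout -q branch1
commit b1
git checkout -q -b branch2         # branch2 forks from branch1 here
commit c1                          # the one commit only on branch2
git checkout -q branch1
commit b2

root=$(git rev-list --max-parents=0 master)
# Every local branch name from which the root commit is reachable:
git for-each-ref --contains "$root" --format='%(refname:short)' refs/heads
# Commits on branch2 that are not on branch1:
git rev-list --count branch1..branch2
```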

This rather long aside finally brings us to what git fetch actually fetches.

¹It is possible to "go backwards", but only by making an exhaustive search through the entire repository, which takes a long time. The maintenance command git fsck does this, for instance. It will find commits that have no names pointing to them. These are called "unreferenced" and "dangling" commits, and they're actually normal, since Git spins off a lot of commits that are deliberately abandoned, in the normal course of working in a repository. Git's "garbage collector", or git gc, eventually cleans these up.

²That name need not be a branch name. For instance, any tag name, or refs/stash, can also name commits, and the commits these locate need not be on any branch at all.

`git fetch` brings into our repository some commit(s) located by some name(s)

Remember that when we run git fetch, we're having our Git contact another Git. That other Git has its own, separate, independent Git repository, with its own commits and its own branches.

When we have our Git call up their Git, we usually don't want our Git's commits to be forgotten and replaced with their Git's commits. We usually don't want our branches to be discarded in favor of their branches.³ Instead, what we usually want is to have our Git's commits get added-to. We want their commits added to ours, and we want our Git to remember their Git's branches, but under some other name.

This is where the remote name re-enters the picture. Their branches have names, like master and branch1 and hotfix. Our Git will take their commits, which their Git finds (has reachable) by their names, and combine them with our existing commits. But our Git must give their commits names in our repository, and here our Git uses our remote-tracking branch names.

When we run git fetch, our Git calls up their Git and asks them what branches (and tags and other names) they have, and what commits go with those names. Our Git then checks to see if we have those commits. If not, our Git asks for those commits, and their parents, and those parents' parents, and so on, until our Git finds some commits we already have. At this point our Git doesn't need any more commits from them, because we have just found out where their graph joins up with our graph.
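The "ask them what names they have" step can be observed with `git ls-remote`, which prints each name the other Git advertises along with its commit ID. A small sketch, assuming two scratch repositories named `theirs` and `ours` and a branch called `hotfix-1`:

```sh
#!/bin/sh
# Sketch: list the branch names (and commit IDs) another repository offers.
set -e
tmp=$(mktemp -d)
git init -q "$tmp/theirs"
cd "$tmp/theirs"
git symbolic-ref HEAD refs/heads/master
git -c user.name=demo -c user.email=demo@example.com -c commit.gpgsign=false \
    commit -q --allow-empty -m initial
git branch hotfix-1                       # hypothetical branch name

git clone -q "$tmp/theirs" "$tmp/ours"    # the clone's "origin" is theirs
cd "$tmp/ours"

heads=$(git ls-remote --heads origin)     # "<commit-id> TAB refs/heads/<name>"
echo "$heads"
```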

Next, our Git stores those fetched commits away in our repository, and now comes the final key step: our Git stores the IDs under our remote-tracking branch names. That is, their master may have been deadcab while ours is badbeef. We don't want to replace ours, but we do want to remember theirs—so we have our Git remember origin/master = deadcab. Now our graph looks like this:

    ...--o--o--o   <-- master (badbeef)
                \
                 o--o   <-- origin/master (deadcab)

Commit deadcab, their master, points back to commit cafeb0b, which points back to badbeef, which is our master. We call their master our origin/master to keep it separate from our master.

If we decide we like their two new commits, we can advance our master to point to deadcab directly:

    ...--o--o--o
                \
                 o--o   <-- master, origin/master (deadcab)

Now we have two names pointing to the same commit, deadcab; but that's just fine. The two names are our master and our origin/master (with our origin/master being our Git's memory of their master based on the last time we fetched from them).
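The two pictures above can be reproduced with a pair of scratch repositories. This is a sketch under assumptions: the repository names `theirs` and `ours`, the `commit` helper, and the commit messages are invented, and `git merge --ff-only` is used here as one way to advance master.

```sh
#!/bin/sh
# Sketch: fetch updates origin/master only; a fast-forward then moves master.
set -e
tmp=$(mktemp -d)
git init -q "$tmp/theirs"
cd "$tmp/theirs"
git symbolic-ref HEAD refs/heads/master

# Hypothetical helper: create an empty commit with a fixed demo identity.
commit() {
    git -c user.name=demo -c user.email=demo@example.com -c commit.gpgsign=false \
        commit -q --allow-empty -m "$1"
}
commit shared
git clone -q "$tmp/theirs" "$tmp/ours"
commit new-1; commit new-2            # two commits that only they have

cd "$tmp/ours"
before=$(git rev-parse master)
git fetch -q origin                   # brings in their two new commits
ours=$(git rev-parse master)          # unchanged by the fetch
theirs=$(git rev-parse origin/master) # our memory of their master

git merge -q --ff-only origin/master  # we decide we like their commits
after=$(git rev-parse master)         # now the same commit as origin/master
```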

³If we do want that, this is called a "fetch mirror", and git fetch can implement this directly. That's almost, but not quite, what you want.

What you want is almost, but not quite, a fetch mirror

You have suggested that what you want is to:

1. Delete all your local branch names, except for master. This is a valid thing to do, but be careful, because it makes your own commits unreachable. Any commits you have, that no one else has, that were name-able only through your own local branch names, are no longer name-able. That will make them eligible for garbage collection.
2. Obtain (as remote-tracking branches) the branches that they are calling hotfix*, and make local branches that point to the same commit.

The Git command that does this sort of work in scripts is git for-each-ref. To use it, you need to know that your own local branches are a specific kind of Git reference (hence for-each-ref). A reference is just a name that starts with refs/, and a branch name is just a reference starting with refs/heads/. A remote-tracking branch is just a reference starting with refs/remotes/ and then having the name of the remote, so all the origin ones are refs/remotes/origin/.
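To make the reference namespaces concrete, here is a short sketch that prints full reference names in a scratch clone. The repository names and the branch `hotfix-1` are assumptions for the demo.

```sh
#!/bin/sh
# Sketch: local branches live under refs/heads/, origin's under refs/remotes/origin/.
set -e
tmp=$(mktemp -d)
git init -q "$tmp/theirs"
cd "$tmp/theirs"
git symbolic-ref HEAD refs/heads/master
git -c user.name=demo -c user.email=demo@example.com -c commit.gpgsign=false \
    commit -q --allow-empty -m initial
git branch hotfix-1                        # hypothetical branch name

git clone -q "$tmp/theirs" "$tmp/ours"
cd "$tmp/ours"

git for-each-ref --format='%(refname)'     # every reference, full names
git rev-parse --symbolic-full-name master            # refs/heads/master
git rev-parse --symbolic-full-name origin/hotfix-1   # refs/remotes/origin/hotfix-1
```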

Hence, we want to do this in three steps:

git fetch origin: call up the Git at the URL stored under origin, get any new commits from it, and update our own origin/* remote-tracking branches (i.e., everything in refs/remotes/origin/). We should probably use --prune as well, which tells our Git to delete, from our remote-tracking branches, any origin/* branches that no longer exist on origin. Hence:
```
git fetch --prune origin 
```
git for-each-ref refs/heads: this lets us do something with every local branch. We want to delete each one unless its name is master. This requires a bit of care, since we can't delete the branch we currently have checked out, so it is probably a good idea to git checkout master first:
```
git checkout master
git for-each-ref --format='%(refname:short)' refs/heads |
while read -r b; do
    # keep master; force-delete every other local branch
    [ "$b" = master ] || git branch -D "$b"
done
```

Create a new local branch for each remote-tracking branch whose name matches the form hotfix*, using the same name minus the origin/ prefix:

    git for-each-ref --format='%(refname:short)' 'refs/remotes/origin/hotfix*' |
    while read -r rb; do
        b=${rb#origin/}    # strip the "origin/" prefix to get the local name
        git branch --track "$b" "$rb"
    done
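Putting the three steps together, one possible shape for the whole script is sketched below. The function name `sync_hotfix_branches`, the scratch repositories, and the branch names `hotfix-1`, `hotfix-2`, `feature-x` and `scratch` are all assumptions made so the sketch can be run safely. Remember that step 2 force-deletes local branches, so try it on a throwaway clone first.

```sh
#!/bin/sh
# Sketch: keep master, drop other local branches, recreate local hotfix* branches.
set -e

sync_hotfix_branches() {
    git fetch -q --prune origin || return 1      # step 1: update origin/*
    git checkout -q master || return 1           # can't delete the current branch
    git for-each-ref --format='%(refname:short)' refs/heads |
    while read -r b; do                          # step 2: delete all but master
        [ "$b" = master ] || git branch -q -D "$b"
    done
    git for-each-ref --format='%(refname:short)' 'refs/remotes/origin/hotfix*' |
    while read -r rb; do                         # step 3: local hotfix* branches
        git branch -q --track "${rb#origin/}" "$rb"
    done
}

# Demo setup in a throwaway directory (all names here are hypothetical).
tmp=$(mktemp -d)
git init -q "$tmp/theirs"
cd "$tmp/theirs"
git symbolic-ref HEAD refs/heads/master
git -c user.name=demo -c user.email=demo@example.com -c commit.gpgsign=false \
    commit -q --allow-empty -m initial
git branch hotfix-1
git branch hotfix-2
git branch feature-x

git clone -q "$tmp/theirs" "$tmp/ours"
cd "$tmp/ours"
git branch scratch                 # a local branch that should be removed

sync_hotfix_branches
git for-each-ref --format='%(refname:short)' refs/heads
```

In the demo, the remaining local branches are master plus one branch per origin/hotfix* name; feature-x stays available only as origin/feature-x.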
