7

I have one parent directory called 'projects', and within it nearly 200 sub-directories, each of which is one of my projects.

For now I am executing git pull with the following script.

#!/bin/bash
# Visit every .git directory, step up to its work tree, and pull there.
find . -type d -name .git -exec sh -c 'cd "$1/.." && pwd && git pull && printf "%s\n\n" "--------------------"' _ {} \;

Is there a more efficient way to do this, using multithreading to make it faster?

1
  • Use xargs or parallel with the list of directories. Commented Apr 7, 2015 at 12:33

2 Answers

3

The sub-directories are not all the same git repository, and they are not submodules either. So for now I am solving this problem with xargs, as shown below.

#!/bin/bash
# Run up to 40 pulls at once; each .git path is passed to sh as $1.
find . -type d -name '.git' -print0 | xargs -0 -P 40 -I '{}' sh -c 'cd "$1/.." && git pull && pwd && printf "%s\n\n" "--------------------"' _ '{}'
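  • find . - Starts the search from the current working directory (recursive by default).
  • -type d -name '.git' - Matches every directory named .git, which marks the root of each repository.
  • -print0 - Prints the matches separated by NUL characters, which xargs -0 reads safely even when paths contain spaces.
  • -P 40 - Lets xargs run up to 40 git pull processes at the same time.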

I also found some good help at http://coldattic.info/shvedsky/pro/blogs/a-foo-walks-into-a-bar/posts/7
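
As a hedged sketch of the same idea, the function below wraps the find/xargs pipeline so that each repository's output is captured and printed in one piece, which reduces interleaving when many pulls run at once. The name for_each_repo and its arguments are assumptions made for illustration, not part of the original script; it relies on the -print0, -0, -I and -P options available in GNU and BSD find/xargs.

```shell
#!/bin/bash
# for_each_repo ROOT JOBS CMD...
# Hypothetical helper: runs CMD in the work tree of every repository
# found under ROOT, with at most JOBS commands running in parallel.
for_each_repo() {
    local root=$1 jobs=$2
    shift 2
    # -prune stops find from descending into the .git directories it matches.
    find "$root" -type d -name .git -prune -print0 |
        xargs -0 -P "$jobs" -I '{}' sh -c '
            repo=${1%/.git}   # strip the trailing /.git to get the work tree
            shift
            # Capture stdout and stderr so each repo is reported in one block.
            out=$(cd "$repo" && "$@" 2>&1)
            printf "%s\n%s\n--------------------\n" "$repo" "$out"
        ' _ '{}' "$@"
}

# Example (not run here): for_each_repo . 40 git pull
```

Passing the path as a positional argument instead of pasting it into the sh -c string keeps directory names with spaces or quotes from being interpreted as shell code.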

1

Note that if your nested repos were declared as submodules, then a simple git submodule update --remote would be enough.

That is, provided you had your submodules configured to follow a branch.
See also "Git submodule to track remote branch".

Those updates (involving a pull) would not be multithreaded though, neither for the checkout part nor for the fetch part.

The multi-threading is only for one operation, as mentioned in this thread:

A few selected operations are multi-threaded if you compile with thread support (i.e., do not set NO_PTHREADS when you build).

But object packing (used during fetch/push, and during git-gc) is multi-threaded (at least the delta compression portion of it is).

git may fork to perform certain asynchronous operations.
E.g., during a fetch, one process runs pack-objects to create the output, and the other speaks the git protocol, mostly just passing through the output to the client.
On systems with threads, some of these operations are performed using a thread rather than fork.
This is not about CPU performance, but about keeping the code simple (and cannot be controlled with config).


All that means, as Etan Reisner comments, that you would need to script those git pull updates yourself in order to multithread those commands.

See "Multithreading in Bash" for a scripting solution.
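
A minimal sketch of such a script, using only shell background jobs and wait, might look like the following. The function name in_batches and its calling convention (directories on standard input, command as arguments) are assumptions introduced here for illustration. It runs the directories in fixed-size batches, which is simpler than a true job pool but leaves slots idle while the slowest pull of a batch finishes.

```shell
#!/bin/bash
# in_batches MAX_JOBS CMD...
# Hypothetical helper: reads directory names from stdin, one per line,
# and runs CMD inside each of them, MAX_JOBS at a time.
in_batches() {
    local max_jobs=$1 count=0 dir
    shift
    while IFS= read -r dir; do
        # The subshell keeps the cd local; & sends it to the background.
        # stdin comes from /dev/null so the job cannot eat the directory list.
        ( cd "$dir" && "$@" ) </dev/null &
        count=$((count + 1))
        # After every max_jobs launches, wait for the whole batch to finish.
        if [ $((count % max_jobs)) -eq 0 ]; then
            wait
        fi
    done
    wait  # wait for the final, possibly partial, batch
}

# Example (not run here): printf '%s\n' */ | in_batches 8 git pull
```

Here the parallelism comes from the shell starting several independent git processes, not from git itself, which matches the point made above.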

3 Comments

While this contains a fair bit of generally useful information it doesn't actually deal with the situation in the OP's question at all or offer anything by way of a solution to the OP's problem.
@EtanReisner because as far as I know there is no git-native solution: multiple git pull are not multithreaded by default.
His repositories are unrelated (it would seem). You don't need a git-native solution. You postulated a situation that doesn't exist and which painted yourself into a problem that needn't exist.
