
I have a web project that I deploy to an EC2 instance simply by pushing new commits. A post-receive Git hook on the remote runs a shell script that 'deploys' the project by checking it out into a production directory. The steps are: run npm install on the Express app, run npm install on the frontend (a create-react-app app), then run npm run build (which basically uses webpack to build an optimized distribution folder from my Node source code).

These steps are expensive and in many cases unnecessary. For example, if all I did was update a Node component in srcs/components/, then npm run build should run, but npm install on the server and frontend shouldn't. If all I did was add a comment to my Express app, no scripts should run.

My current server-side deploy script looks like this:

    #!/usr/bin/env bash
    GIT_WORK_TREE=/home/ec2-user/absiteProd git checkout -f
    ### TODO: conditional NPM work
    pm2 restart index

My question, then, is: how can I use git (or grep, sed, awk, etc.) to reliably tell me when either /home/ec2-user/absiteProd/frontend/package.json, /home/ec2-user/absiteProd/server/package.json, or anything in /home/ec2-user/absiteProd/frontend/sources has changed?

Currently I'm having some success with:

    # Rebuild the frontend if the latest commit's stat output mentions frontend/src/
    if git log --stat -n 1 | grep --quiet 'frontend/src/'; then
        cd /home/ec2-user/absiteProd/frontend
        npm run build
    fi

But since this seems like such a common requirement in app deployment, I feel like there must be a simpler way?

  • I have edited the answer with a possible script Commented Jul 4, 2017 at 19:40

2 Answers


You can find a similar need in this thread:

How do I find a last commit for the given directory inside the repository?

I want to avoid rebuilding the specific part of the project if there were no changes in it since the last build, so I need to find the sha of the last time the directory was changed.

You can compare the last commit where an element is modified, using git rev-list:

    git rev-list -1 HEAD -- frontend/package.json
    git rev-list -1 HEAD -- server/package.json
    git rev-list -1 HEAD -- frontend/src

with the current HEAD SHA1 (git rev-parse, the --verify is optional):

git rev-parse --verify HEAD 

That is:

    h=$(git rev-parse --verify HEAD)
    b=false
    # b becomes true if HEAD is the last commit that touched any watched path
    if [[ "$(git rev-list -1 HEAD -- frontend/package.json)" == "${h}" ]]; then b=true; fi
    if [[ "$(git rev-list -1 HEAD -- server/package.json)" == "${h}" ]]; then b=true; fi
    if [[ "$(git rev-list -1 HEAD -- frontend/src)" == "${h}" ]]; then b=true; fi
    # Nothing relevant changed in HEAD: skip the build
    if [[ "${b}" != true ]]; then exit 0; fi
    cd /home/ec2-user/absiteProd/frontend
    npm run build

8 Comments

@AlexBollbach Yes, that is the idea. I would actually exit if there is a difference. If all three tests don't show any difference, then npm install /frontend
Sorry, I deleted my comment. I wanted to do some research on your answer first. I'm trying to understand the output of rev-list.
@AlexBollbach No problem. I have updated/fixed the script.
This will take me some time to grok. Looking at it, though, it doesn't seem to allow three conditional commands: npm install for frontend/backend and npm run build. It would need a, b, c booleans, no?
@AlexBollbach The idea is that if any of the files has changed in the last commit (HEAD), b is true, meaning you must build.

Git does not store directories in any useful way, so you must define what you mean by "anything in" yourself (which has its advantages since you can define what you mean rather than getting stuck with someone else's useless-to-you definition, but means you must do more work).

That said, Git stores each file as a path name within each commit. Your deployment script takes some work-tree—in this case, /home/ec2-user/absiteProd—from one state to another. Since it uses git checkout to do so, and git checkout does nothing special with time stamps, you now have many options with many different low-level details and subsequent consequences. Here are two obvious-ish and reasonably simple starting points:

  • Was /home/ec2-user/absiteProd exactly the same as some previous commit? If so, which commit? (Commits have unique hash IDs and these are generally the things to use in scripts.) You can then have Git compare the previous commit with the new commit, using git diff --name-status for instance. This is similar to what you are doing now, but better.

    If your deployment script is a post-receive script, you already have both the old and new hash IDs of the reference, which you have read from standard input. Hence the set of files changed, with their statuses, between those two commits, is:

     git diff-tree -r --name-status $oldhash $newhash 
  • If git checkout wrote on any file(s), those files will have "now" as their modify-time time-stamps, since git checkout just lets the system's time apply to updated files. Can you use this? As long as you never deploy more than once in a single second, you could combine this with the make build-system, which builds files based on time-stamps.
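For the first option, here is a minimal sketch of how a post-receive hook could turn the old/new hash pair into per-step decisions. The function names `plan_steps` and `changed_paths` are hypothetical, the path prefixes are taken from the question's layout, and treating a change to frontend/package.json as also requiring a rebuild is an assumption, not something the question states.

```shell
#!/usr/bin/env bash
# Sketch only: decide which deploy steps a push requires.
# Function names are hypothetical; paths follow the question's layout.

# Reads changed paths (one per line) on stdin and prints one flag per step.
plan_steps() {
    local server_install=false frontend_install=false frontend_build=false path
    while IFS= read -r path; do
        case "$path" in
            server/package.json)   server_install=true ;;
            # Assumption: new frontend dependencies also require a rebuild.
            frontend/package.json) frontend_install=true; frontend_build=true ;;
            frontend/src/*)        frontend_build=true ;;
        esac
    done
    echo "server_install=$server_install"
    echo "frontend_install=$frontend_install"
    echo "frontend_build=$frontend_build"
}

# Lists the paths that differ between two commits, using plumbing commands.
changed_paths() {
    local old=$1 new=$2
    if [[ $old =~ ^0+$ ]]; then
        # Newly created ref: no previous commit, so treat every file as changed.
        git ls-tree -r --name-only "$new"
    else
        git diff-tree -r --name-only "$old" "$new"
    fi
}

# Intended use inside the hook, which receives "<old> <new> <ref>" lines
# on standard input:
#   while read -r old new ref; do
#       changed_paths "$old" "$new" | plan_steps
#   done
```

Because the comparison runs from the old hash to the new hash, a push containing several commits is handled as one change set, rather than looking only at the tip commit. A real hook would probably also filter on the ref name and skip ref deletions (where the new hash is all zeros).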

If make is suitable here, it is probably the best choice, except for its maximum of one-per-second deployment (or whatever your underlying OS has for time stamp resolution on files). You can just declare that whatever the output file(s) is/are, they depend on the corresponding input file(s), and give the recipe to build the output(s) from the input(s) and run make.
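To illustrate the time-stamp idea without writing a Makefile, the following sketch does in plain bash roughly what a make rule would check: it reruns a step only when an input is newer than a stamp file left by the last successful run. The helper name `needs_rerun` and the stamp-file name in the usage comment are hypothetical, and the same time-stamp resolution caveat applies.

```shell
#!/usr/bin/env bash
# Sketch only: time-stamp based rebuild check, similar in spirit to make.

# needs_rerun STAMP INPUT...
# Succeeds if STAMP does not exist yet, or if any INPUT (a file, or anything
# under a directory) has a modification time newer than STAMP.
needs_rerun() {
    local stamp=$1
    shift
    [[ -e $stamp ]] || return 0
    find "$@" -newer "$stamp" 2>/dev/null | grep -q .
}

# Intended use after git checkout has updated the work-tree
# (.build-stamp is a hypothetical marker file):
#   cd /home/ec2-user/absiteProd/frontend
#   if needs_rerun .build-stamp src; then
#       npm run build && touch .build-stamp
#   fi
```

The stamp is only touched after the step succeeds, so a failed build is retried on the next deployment.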

4 Comments

in your first bullet-point, are you talking about running git diff ... in the work-tree?
No, in the post-receive script itself, before or after running git checkout. You can do this at any time since the commits now exist in the repository, and all Git is doing is comparing the commits themselves directly (regardless of what is in any work-tree anywhere).
I did something similar after asking the question:

    if git log --stat -n 1 | grep --quiet src/*; then
        # run script for src/ change
    else
        echo do nothing
    fi
The defects here are: (1) git log compares the tip commit (in this case HEAD) to its immediate parent, which is not necessarily the commit that you had in this work-tree before (what if someone pushed three commits just now, and the files in question were changed in the first two but not in the third?). (2) git log --stat abbreviates path names to fit into 80 columns. You can defeat this (stackoverflow.com/q/10459374/1256452) but it's better to just go straight to the plumbing commands. (3) git log and other porcelain commands obey user config and are not meant for scripting.
