After breaking our code into reusable bits, how do we test and deploy?

Question

We started out with one developer, and one svn repo containing all our code:

^/foo/trunk/module-a
^/foo/trunk/module-b
^/foo/trunk/module-b/submodule-b1
^/foo/trunk/website1

(at the time this was a big improvement). After this got a chance to grow for a bit we started having problems with circular dependencies, slow testsuites, and general difficulties re-using code (since e.g. website1's feature set had crept into otherwise generic module-a).

Wanting to modularize the code base, and expecting us to move to git shortly (and having read somewhere that git doesn't like svn mega-repos), we've transitioned to a much more granular structure:

^/module-a/trunk/
^/module-b/trunk/
^/module-b/trunk/submodule-b1
^/earlier-sub-sub-sub-module-c/trunk
etc. (about 120 such modules)

This was conceptually great. More modular code, much faster test-suites, easier to document, etc. We open-sourced some of our more generic components, and made all modules pip installable (using pip install -e . to install them in the development virtualenv).

We created a ^/srv/trunk repository containing the folder structure of the runtime environment, i.e. ^/srv/trunk/lib for the modules, ^/srv/trunk/src for the remains of ^/foo/trunk, ^/srv/trunk/www for websites, etc.

And finally (taking an idea from perforce, which I worked with a very long time ago [https://www.perforce.com/perforce/r12.1/manuals/cmdref/client.html]) we created a "vcs-fetch" text file that listed all relevant repos and where they should be checked out into the dev environment, and a corresponding command to do so. E.g. a vcs-fetch line:

svn srv/lib/module-a ^/module-a/trunk

would cause either (first time)

cd /srv/lib && svn co ^/module-a/trunk module-a

or (afterwards)

cd /srv/lib/module-a && svn up

and similarly for github repos (both our own and altered/unaltered vendor packages).
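The vcs-fetch behaviour described above can be sketched roughly as follows. This is only an illustration for the svn case: the names `FetchEntry`, `parse_line` and `fetch_command` are hypothetical, and the real vcs-fetch command is not shown in the question.

```python
import posixpath
from typing import List, NamedTuple, Tuple


class FetchEntry(NamedTuple):
    vcs: str     # e.g. "svn"
    target: str  # checkout path relative to the root, e.g. "srv/lib/module-a"
    url: str     # repository location, e.g. "^/module-a/trunk"


def parse_line(line: str) -> FetchEntry:
    """Parse one 'vcs target url' line of the vcs-fetch file."""
    vcs, target, url = line.split()
    return FetchEntry(vcs, target, url)


def fetch_command(entry: FetchEntry, root: str,
                  already_checked_out: bool) -> Tuple[str, List[str]]:
    """Return (working directory, argv) for a first checkout or a later update.

    Only svn is sketched here; git and hg would need their own branches.
    """
    if entry.vcs != "svn":
        raise ValueError("only svn is sketched here")
    path = posixpath.join(root, entry.target)
    if already_checked_out:
        # Afterwards: cd /srv/lib/module-a && svn up
        return path, ["svn", "up"]
    # First time: cd /srv/lib && svn co ^/module-a/trunk module-a
    parent, name = posixpath.split(path)
    return parent, ["svn", "co", entry.url, name]
```

A caller would decide `already_checked_out` with something like `os.path.isdir(path)` and pass the result to `subprocess.run(argv, cwd=cwd)`.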

We've used the same vcs-fetch process for creating the production environment, but we're quickly finding out that we have no way of knowing which version used to run in prod after doing a vcs-fetch.

With the mega-repo, we could just note the revision number before updating prod from trunk, and going back was a simple svn -r nnn up . away. With code in both svn and git (and one module in hg), spread over ~120 repos, it isn't obvious how to do this.

I read http://12factor.net/ today, and the first factor is "One codebase" so I'm also wondering if I'm way off the right path here?

One idea I had was to create a deploy script that would create pip-installable "deployment"-wheels and "bundle" them together in a requirements.txt file. A deployment would then involve creating a new virtualenv, pip-installing the requirements.txt file listing the deployment wheels, and switching the active virtualenv. Reverting to previous would just involve switching the virtualenv back (but unless we wanted to keep the virtualenvs around forever it wouldn't allow us to go back to any point in time -- in my experience that has never been needed though).
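The "switching the active virtualenv" step could look something like the sketch below. It assumes (this is not stated above) that the running code finds its virtualenv through a `current` symlink and that the host is POSIX; the function name `activate` is hypothetical.

```python
import os


def activate(release_dir: str, current_link: str) -> None:
    """Point current_link at release_dir by renaming a fresh symlink over it.

    On POSIX the rename replaces the old link in one step, so there is no
    moment where current_link is missing.
    """
    tmp = current_link + ".tmp"
    if os.path.lexists(tmp):
        os.remove(tmp)  # clean up a leftover from an interrupted switch
    os.symlink(release_dir, tmp)
    os.replace(tmp, current_link)
```

Reverting is then the same call with the previous virtualenv directory, which works for as long as that directory is kept around.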

At this point I'm wondering if I'm walking in the wrong direction, or if I just haven't walked far enough on the right path..? (everything I'm reading keeps talking about "your app", and I don't know how that translates to running 14 websites off of the same code base...)

May I assume that the individual components are now developed by different teams with divergent development cycles? If so, breaking the repository apart is unavoidable either way. With git, though, you would then place synchronized release tags for major, stable configurations. Have a look at Google's repo tool. Attempting to match development versions by integrated meta data is pretty much futile. Linking the application together via pip is perfectly legit as well.
– Ext3h, Commented Apr 15, 2016 at 12:45
If you include estimates of the size of the code in KLOC (1000 lines of code) and bytes, we can easily get an idea of the scale, for example "2000 lines of code. 50 kilobytes source code." or "40 KLOC, 2 GB XML". It seems that what you need is just to migrate to git, and git has import functions. You can start by reading the git book.
– Niklas Rosencrantz, Commented Apr 26, 2016 at 6:39
@Programmer400 the codebase is: .py 670 kloc, .js: 135kloc, .less: 25kloc, .html: 130kloc. So big, but not huge. From what I've read git doesn't really like repos of this size, so I imagine we'll have to split into smaller repos before switching to git..?
– thebjorn, Commented Apr 26, 2016 at 8:23

gbjbaanb · Accepted Answer · 2016-06-23 09:57:41Z

It sounds like you're missing branches (or rather 'tags' or 'release' branches).

Instead of using your SVN revnum as a reference to determine which version you're installing, you should create a branch at that released revision. You then deploy that branch name.

It is easier to branch every module, even those with no changes, so that every module keeps the same release number. However, your OSS packages might not like being branched with no changes, so the next best thing is to keep a script of dependencies, so that version 5 of your product requires OSS module X v2 and so on.

You'd change your script to stop referring to versions and instead work with the branch names (although they can be anything, it's best to decide on a fixed naming convention, e.g. Release_1_2_3).
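As a rough sketch of that convention, the release name and the matching `svn copy` command could be generated per module. The `^/<module>/branches/<name>` layout is an assumption based on the usual svn trunk/branches convention, and both function names are hypothetical.

```python
from typing import List


def release_name(major: int, minor: int, patch: int) -> str:
    """Build a branch name following the Release_1_2_3 convention."""
    return f"Release_{major}_{minor}_{patch}"


def branch_command(module: str, name: str) -> List[str]:
    """Return the argv that copies a module's trunk to a release branch.

    Assumes each module repo has trunk/ and branches/ at its top level.
    """
    return [
        "svn", "copy",
        f"^/{module}/trunk",
        f"^/{module}/branches/{name}",
        "-m", f"Create {name}",
    ]
```

Running the same command for every module gives each of them the same release name, which is what the deploy script would then refer to.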

Another hint is to maintain a file with each module describing the current version. You can auto-generate these if necessary, and maybe include a full changelog too, but it means anyone can see what version is deployed by just looking.
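One possible way to auto-generate such a file across svn, git and hg checkouts is sketched below. The manifest layout and the function names are assumptions for illustration, and `svn info --show-item` needs Subversion 1.9 or later.

```python
import json
import subprocess
from typing import Callable, Dict

# Commands that print the revision of the checkout in the working directory.
REVISION_COMMANDS = {
    "svn": ["svn", "info", "--show-item", "revision"],  # svn >= 1.9
    "git": ["git", "rev-parse", "HEAD"],
    "hg": ["hg", "id", "-i"],
}


def current_revision(vcs: str, path: str,
                     run: Callable = subprocess.check_output) -> str:
    """Ask the given VCS which revision is checked out at path."""
    return run(REVISION_COMMANDS[vcs], cwd=path).decode().strip()


def build_manifest(checkouts: Dict[str, str],
                   run: Callable = subprocess.check_output) -> Dict[str, dict]:
    """Map each checkout path to its VCS and current revision.

    checkouts maps a checkout path to "svn", "git" or "hg".
    """
    return {
        path: {"vcs": vcs, "revision": current_revision(vcs, path, run)}
        for path, vcs in sorted(checkouts.items())
    }


def dump_manifest(manifest: Dict[str, dict]) -> str:
    """Serialize the manifest so it can be stored next to the deployment."""
    return json.dumps(manifest, indent=2, sort_keys=True)
```

Writing the output of `dump_manifest` to a file during each deploy leaves a record of what was running, whatever mix of repositories it came from.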

axl · Accepted Answer · 2016-06-23 09:42:11Z

I think you have a lot of good ideas already; I've used most of them on various projects throughout the years. Your primary concern seems to be the inability to tell what version of each module was included in a given package if you split them up.

I'm all for splitting them up, at some level of granularity, especially if you have multiple teams and varying release cycles, as @Ext3h mentions.

Since I'm not sure how isolated your modules are, or how detailed you want your versioning to be, I'll suggest some options.

Use git submodules. With submodules, you can store each module in a separate git repo, similar to your svn setup, and also to what you're thinking of. You then link those modules to the root project which will contain a reference to the relevant commit of each submodule, for each of its own commits.

IMO this is a theoretically nice setup, and reasonably simple. The main drawback is that the workflow for submodules is a little bit awkward; however, you seem to have solved such things nicely with scripts before, so it may not be a real problem.

The other caveat is that submodule commit references will just be a SHA1. There are never any human-readable details on what branch you are on, and you may end up having to manually check out the right branch when you want to do work directly in the submodule.

However, I've not used this pattern extensively, so I don't know how much of a problem it might be for a large project such as yours.

Another alternative is to use some sort of dependency manager. This requires that each module or set of modules can be versioned, packaged, and published individually, and that you have a system which can pull those packages together in the way you want when you want them.

You're suggesting pip already, and what seems to be missing from your suggestion is storing the resulting requirements.txt along with the build, or in the root project repo, so that you can re-create the virtualenv later rather than having to save it on disk.
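A minimal sketch of producing such a file with exact pins is shown here. The function name is hypothetical, and the versions would come from your own build; `pip freeze` prints the same `name==version` format for an existing virtualenv.

```python
from typing import Dict


def pinned_requirements(versions: Dict[str, str]) -> str:
    """Render one 'name==version' line per module, sorted for stable diffs."""
    return "".join(
        f"{name}=={version}\n" for name, version in sorted(versions.items())
    )
```

Committing the resulting text as requirements.txt for each deployment means an old environment can be rebuilt with `pip install -r requirements.txt`, provided the pinned packages are still available from your package index.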

There are other systems as well; I set up a rather large project using a slightly customized version of Apache Ivy as both the tool to package and publish each module, as well as pulling them together for the final project. Ivy also stores a manifest listing all the versions of all modules you're referencing, if you need to recreate the setup later.
