We started out with one developer, and one svn repo containing all our code:
    ^/foo/trunk/module-a
    ^/foo/trunk/module-b
    ^/foo/trunk/module-b/submodule-b1
    ^/foo/trunk/website1
(At the time this was a big improvement.) After the code base had a chance to grow for a while, we started having problems with circular dependencies, slow test suites, and general difficulty re-using code (since e.g. website1's feature set had crept into the otherwise generic module-a).
Wanting to modularize the code base, and expecting us to move to git shortly (and having read somewhere that git doesn't like svn mega-repos), we've transitioned to a much more granular structure:
    ^/module-a/trunk/
    ^/module-b/trunk/
    ^/module-b/trunk/submodule-b1
    ^/earlier-sub-sub-sub-module-c/trunk
etc. (about 120 such modules). This was conceptually great: more modular code, much faster test suites, easier to document, etc. We open-sourced some of our more generic components, and made all modules pip-installable (using pip install -e . to install them in the development virtualenv).
We created a ^/srv/trunk repository containing the folder structure of the runtime environment, i.e. ^/srv/trunk/lib for the modules, ^/srv/trunk/src for the remains of ^/foo/trunk, ^/srv/trunk/www for websites, etc.
And finally (taking an idea from Perforce, which I worked with a very long time ago [https://www.perforce.com/perforce/r12.1/manuals/cmdref/client.html]) we created a "vcs-fetch" text file that lists all relevant repos and where they should be checked out in the dev environment, plus a corresponding command that does the checkout. E.g. a vcs-fetch line:
    svn srv/lib/module-a ^/module-a/trunk
would cause either (first time)
    cd /srv/lib && svn co ^/module-a/trunk module-a
or (afterwards)
    cd /srv/lib/module-a && svn up
and similarly for github repos (both our own and altered/unaltered vendor packages).
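To make the vcs-fetch behaviour concrete, here is a rough Python sketch of how one line could be turned into the command to run. The names (FetchEntry, parse_vcs_fetch, plan_command) are made up for illustration, and the "#" comment handling and the git commands are my assumptions; the post only spells out the svn case.

```python
import posixpath
from typing import Callable, List, NamedTuple, Tuple


class FetchEntry(NamedTuple):
    vcs: str   # "svn" or "git" in this sketch
    path: str  # checkout location relative to the environment root
    url: str   # repository URL, e.g. ^/module-a/trunk


def parse_vcs_fetch(text: str) -> List[FetchEntry]:
    """Parse lines of the form '<vcs> <path> <url>'.

    Skipping blank lines and '#' comments is an assumption, not part of
    the format described above.
    """
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        vcs, path, url = line.split()
        entries.append(FetchEntry(vcs, path, url))
    return entries


def plan_command(
    entry: FetchEntry,
    root: str = "/",
    exists: Callable[[str], bool] = posixpath.isdir,
) -> Tuple[str, List[str]]:
    """Return (working directory, argv) for a first checkout or an update."""
    target = posixpath.join(root, entry.path)
    parent, name = posixpath.split(target)
    if entry.vcs == "svn":
        first, later = ["svn", "co", entry.url, name], ["svn", "up"]
    elif entry.vcs == "git":
        # Assumed git equivalent of the svn behaviour.
        first, later = ["git", "clone", entry.url, name], ["git", "pull", "--ff-only"]
    else:
        raise ValueError("unsupported vcs: %s" % entry.vcs)
    if exists(target):
        return target, later   # afterwards: update in place
    return parent, first       # first time: check out into the parent dir
```

The returned pair could be passed to subprocess.check_call(argv, cwd=cwd); keeping the planning separate from the execution makes it easy to dry-run.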
We've used the same vcs-fetch process for creating the production environment, but we're quickly finding out that after doing a vcs-fetch we have no way of knowing which versions were running in prod before it.
With the mega-repo, we could just note the revision number before updating prod from trunk, and going back was a simple svn -r nnn up . away. With code in both svn and git (and one module in hg), spread over ~120 repos, it isn't obvious how to do this.
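One way to get the "note the revision number" step back across many repos would be to record the current revision of every checkout before each update. Below is a minimal sketch of that idea, not something we actually run; the function names and the manifest layout are hypothetical, and it assumes svn 1.9+ (for --show-item), git and hg on the PATH.

```python
import json
import subprocess

# Command that prints the current revision of a checkout, per VCS.
REVISION_CMD = {
    "svn": ["svn", "info", "--show-item", "revision"],
    "git": ["git", "rev-parse", "HEAD"],
    "hg": ["hg", "id", "-i"],  # note: a trailing "+" means uncommitted changes
}


def restore_cmd(vcs, rev):
    """Command that moves a checkout back to a recorded revision."""
    if vcs == "svn":
        return ["svn", "up", "-r", rev]
    if vcs == "git":
        return ["git", "checkout", rev]
    if vcs == "hg":
        return ["hg", "update", "-r", rev]
    raise ValueError("unsupported vcs: %s" % vcs)


def _run(argv, cwd):
    return subprocess.check_output(argv, cwd=cwd, text=True)


def snapshot(checkouts, run=_run):
    """checkouts: iterable of (vcs, path). Returns {path: {"vcs", "rev"}}."""
    return {
        path: {"vcs": vcs, "rev": run(REVISION_CMD[vcs], path).strip()}
        for vcs, path in checkouts
    }


def rollback_plan(manifest):
    """List of (path, argv) that would restore every checkout."""
    return [
        (path, restore_cmd(info["vcs"], info["rev"]))
        for path, info in sorted(manifest.items())
    ]


def save(manifest, filename):
    with open(filename, "w") as fh:
        json.dump(manifest, fh, indent=2, sort_keys=True)
```

Saving one such manifest per production update would play the role the single svn revision number used to play.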
I read http://12factor.net/ today, and the first factor is "One codebase" so I'm also wondering if I'm way off the right path here?
One idea I had was to create a deploy script that would build pip-installable "deployment" wheels and "bundle" them together in a requirements.txt file. A deployment would then involve creating a new virtualenv, pip-installing the requirements.txt file listing the deployment wheels, and switching the active virtualenv. Reverting to the previous deployment would just involve switching the virtualenv back. (Unless we kept the virtualenvs around forever, this wouldn't let us go back to an arbitrary point in time, but in my experience that has never been needed.)
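The "switch the active virtualenv" step could look roughly like the sketch below. It assumes a POSIX host, one directory per deployment, and a "current" symlink that the servers are pointed at; those conventions and the function names are hypothetical. It also uses the standard-library venv module rather than virtualenv.

```python
import os
import subprocess
import sys


def build_release(release_dir, requirements):
    """Create a fresh environment and install the pinned requirements into it."""
    subprocess.check_call([sys.executable, "-m", "venv", release_dir])
    pip = os.path.join(release_dir, "bin", "pip")
    subprocess.check_call([pip, "install", "-r", requirements])


def activate(release_dir, current_link):
    """Point current_link at release_dir and return the previous target.

    The new symlink is created under a temporary name and then renamed
    over the old one, so readers never see a missing link.
    """
    previous = os.readlink(current_link) if os.path.islink(current_link) else None
    tmp = current_link + ".tmp"
    if os.path.lexists(tmp):
        os.remove(tmp)
    os.symlink(release_dir, tmp)
    os.replace(tmp, current_link)
    return previous
```

Rolling back is then just calling activate() again with the value it returned last time, as long as that release directory has not been deleted.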
At this point I'm wondering if I'm walking in the wrong direction, or if I just haven't walked far enough on the right path? (Everything I'm reading keeps talking about "your app", and I don't know how that translates to running 14 websites off the same code base.)