I'm unclear how TDD, the methodology, handles the following case. Suppose I want to implement the mergesort algorithm, in Python. I begin by writing

 assert mergesort([]) === []

and the test fails with

> NameError: name 'mergesort' is not defined

I then add

 def mergesort(a):
 return []

and my test passes. Next I add

 assert mergesort[5] == 5

and my test fails with

> AssertionError

which I make pass with

 def mergesort(a):
 if not a:
 return []
 else:
 return a

Next, I add

 assert mergesort([10, 30, 20]) == [10, 20, 30]

and I now have to try to make this pass. I "know" the mergesort algorithm so I write:

 def mergesort(a):
 if not a:
 return []
 else:
 left, right = a[:len(a)//2], a[len(a)//2]
 return merge(mergesort(left)), mergesort(right))

And this fails with

> NameError: name 'merge' is not defined

Now here's the question. How can I run off and start implementing `merge` using TDD? It seems like I can't because I have this "hanging" unfulfilled, failing test for `mergesort`, _which won't pass until `merge` is finished!_ If this test hangs around, I can never really do TDD because I won't be "green" during my TDD iterations constructing `merge`.

It seems like I am stuck with the following three ugly scenarios, and would like to know (1) which one of these does the TDD community prefer, or (2) is there another approach I am missing? I've watched several Uncle Bob TDD walkthroughs and don't recall seeing a case like this before! 

Here are the 3 cases:

 1. Implement merge in a different directory with a different test suite.
 2. Don't worry about being green when developing the helper function, just manually keep track of which tests you _really_ want to pass.
 3. Comment out (GASP!) or delete the lines in `mergesort` that call `merge`; then after getting `merge` to work, put them back in.

These all look silly to me (or am I looking at this wrong?). Does anyone know the preferred approach?