This is a somewhat handwaving answer (you can tell by how long it is): I've held back from giving it in the hope that someone would arrive with a more formal one as I have forgotten a lot of this stuff, but no-one has, so.
Start off in Euclidean space, because it's easy to visualize. Geodesics are just straight lines. Straight lines have two properties:
1. They are extrema (minima) of distance between any two points on the line: of all the curves connecting $A$ and $B$, the straight line is the shortest.
2. They parallel-transport their own tangent vectors (and in fact any vector in the tangent space at a point on the curve). What this means is that if you take the tangent vector at any point on the line (in fact: any vector in the tangent space at that point) and drag it along the line so that it remains parallel to itself as it is dragged, then you get another tangent vector (...) at the point where it is dragged to.
(1) is pretty straightforward. (2) is disturbingly vague: I haven't really said what it means to 'parallel transport' a vector or even what a tangent vector / tangent space really is. Well, these can be made precise of course, but I'm going to cop out on that.
An important property of Euclidean space is that these two properties pick out the same curves. If you take, say, an arc of a circle, it is neither the shortest curve between any two points on the arc, nor does it parallel transport its own tangent vectors: if you drag a tangent vector half-way around a circle you'll get something which is $-1$ times the tangent vector at the new point! And in fact straight lines are the only curves which do both things in Euclidean space.
But these two properties are not obviously the same. In particular, on a general manifold there may be no metric, so if we want to define parallel transport we either can't, or we can't use the metric to do it.
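The circle claim is easy to check numerically. This is only a minimal sketch, assuming the unit circle in the plane with Cartesian components; the helper name `tangent` is mine, and the 'transport' step relies on the fact that in flat space parallel transport leaves Cartesian components unchanged.

```python
import math

def tangent(theta):
    """Unit tangent to the unit circle at angle theta (counter-clockwise)."""
    return (-math.sin(theta), math.cos(theta))

# Tangent vector at the starting point, theta = 0.
start = tangent(0.0)

# Assumption: flat Euclidean plane, Cartesian components. Parallel transport
# then just copies the components, whatever curve we drag the vector along.
transported = start

# The circle's own tangent vector half-way around, at theta = pi.
end_tangent = tangent(math.pi)

# The transported vector should be -1 times the tangent at the new point.
is_reversed = all(
    math.isclose(t, -e, abs_tol=1e-12)
    for t, e in zip(transported, end_tangent)
)
print(is_reversed)
```

Doing the same thing along a straight line is trivial: the tangent has the same components everywhere, so the transported vector is the tangent at the new point.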
Well, we can define parallel transport, and we do so using a connection, which is just a mechanism for defining what it means to parallel transport vectors: it defines how you connect the tangent spaces at different points on the manifold. With a connection (or perhaps with a suitably well-behaved one) you can define parallel transport, and geodesics which are now not extrema of length, because there is no length, but are families of curves which parallel transport their own tangent vectors. And you get things like covariant derivatives and so on (in fact I think covariant derivatives and connections are equivalent).
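For what it's worth, this can be written down concretely in a coordinate chart. A sketch only: the coordinates $x^\mu$, the connection coefficients $\Gamma^\lambda{}_{\mu\nu}$ and the curve parameter $\tau$ are notation I'm introducing here, and index-ordering conventions vary between books. A vector $V$ is parallel transported along a curve $x^\mu(\tau)$ if

$$\frac{dV^\lambda}{d\tau} + \Gamma^\lambda{}_{\mu\nu}\,\frac{dx^\mu}{d\tau}\,V^\nu = 0,$$

and the curve is a geodesic in the parallel-transport sense if, for a suitable (affine) parameter, its own tangent $V^\lambda = dx^\lambda/d\tau$ satisfies this:

$$\frac{d^2x^\lambda}{d\tau^2} + \Gamma^\lambda{}_{\mu\nu}\,\frac{dx^\mu}{d\tau}\,\frac{dx^\nu}{d\tau} = 0.$$

No metric appears anywhere in either equation: only the connection coefficients do.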
It's important to note that a connection is additional structure on a manifold: it's not something that you can derive from the basic notion of what it means to be a manifold.
So, if we add a metric to the manifold (which is another bit of additional structure) we now have two different notions of what it means to be a geodesic: geodesics can be curves which parallel transport their own tangent vectors, and curves which are extrema of length. These two notions correspond to the two additional bits of structure that we've added, and they do not need to be the same and will not be in general.
But we can choose them so they are. In particular if the connection is such that the inner product of any two vectors that are parallel transported along a curve remains constant, then the connection & the metric are compatible. There are some other definitions which are equivalent to this one: if the covariant derivative (which comes from the connection) of the metric vanishes then this is the same thing, for instance.
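In components the two statements look like this (again a sketch in my own notation: $g_{\mu\nu}$ is the metric, $\nabla$ the covariant derivative, $D/d\tau = (dx^\lambda/d\tau)\nabla_\lambda$ the covariant derivative along a curve $x^\mu(\tau)$, and $V$, $W$ are two vectors carried along that curve):

$$\frac{d}{d\tau}\bigl(g_{\mu\nu}V^\mu W^\nu\bigr) = 0 \qquad\text{and}\qquad \nabla_\lambda g_{\mu\nu} = 0.$$

The link between them is just the product rule,

$$\frac{d}{d\tau}\bigl(g_{\mu\nu}V^\mu W^\nu\bigr) = \frac{dx^\lambda}{d\tau}\bigl(\nabla_\lambda g_{\mu\nu}\bigr)V^\mu W^\nu + g_{\mu\nu}\,\frac{DV^\mu}{d\tau}\,W^\nu + g_{\mu\nu}\,V^\mu\,\frac{DW^\nu}{d\tau},$$

where the last two terms vanish when $V$ and $W$ are parallel transported, so the inner product is constant along every curve exactly when the first term vanishes for every curve and every pair of vectors.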
In fact there is slightly more that you need to do: a connection and a metric can be compatible but the connection can have torsion. I have never understood torsion very well, but I think it means that, although the inner products of vectors are preserved as you drag them along a curve, they somehow rotate together.
If you insist that the connection is both compatible with the metric and has no torsion then there is (I am almost sure, but possibly with some other sanity conditions added) exactly one such connection, and this is the Levi-Civita connection: it's the unique torsion-free Riemannian connection on a manifold (and Riemannian, to me anyway, means 'metric compatible' although I think there is a little more to it than that).
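The reason there is only one such connection can be seen in a coordinate basis. As a sketch in my own notation ($g^{\lambda\sigma}$ is the inverse metric, $\partial_\mu$ is the partial derivative with respect to $x^\mu$, and $\Gamma^\lambda{}_{\mu\nu}$ are the connection coefficients), 'no torsion' says the coefficients are symmetric in their lower indices,

$$\Gamma^\lambda{}_{\mu\nu} = \Gamma^\lambda{}_{\nu\mu},$$

and together with $\nabla_\lambda g_{\mu\nu} = 0$ this leaves no freedom at all: the coefficients are fixed by the metric and its first derivatives,

$$\Gamma^\lambda{}_{\mu\nu} = \tfrac{1}{2}\,g^{\lambda\sigma}\bigl(\partial_\mu g_{\sigma\nu} + \partial_\nu g_{\sigma\mu} - \partial_\sigma g_{\mu\nu}\bigr).$$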
And for such a connection the two notions of geodesic are the same: geodesics both parallel transport their own tangent vectors and are extrema of length.
But none of this answers the question of why we should choose them to be compatible: we don't need to, after all. I think that the answers to that are essentially that choosing them to be compatible gets us a spacetime which is 'closer' to Euclidean space, and also one for which there are far fewer parameters which need to be chosen somehow (if we picked them not to be compatible then we'd have to have some set of differential equations which specified how the connection behaved as well as ones which specified how the metric behaved, for instance). And, most compellingly, if we choose them to be compatible then we get a theory of gravitation which works very well.
People have investigated theories of gravity with torsion, for instance.