All the answers so far seem to miss a very basic point: the functional form you choose should be flexible enough to capture the features that are scientifically relevant. Models 2-5 impose zero coefficients on some terms without scientific justification. And even when a restriction is scientifically justified, Model 1 remains appealing because you can test whether those coefficients are zero rather than impose it.
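To make "test rather than impose" concrete, here is a minimal sketch. It assumes Model 1 is the full specification Y = b0 + b1*X1 + b2*X2 + b3*X1*X2 (the model definitions are not restated in this answer, so that form is my assumption), and it uses simulated data with made-up coefficients purely for illustration. The helper `fit_ols` is hypothetical, written with plain NumPy.

```python
import numpy as np

def fit_ols(X, y):
    """Return OLS coefficient estimates and their t-statistics."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = X.shape[0] - X.shape[1]           # residual degrees of freedom
    sigma2 = resid @ resid / dof            # estimated error variance
    cov = sigma2 * np.linalg.inv(X.T @ X)   # classical OLS covariance
    return beta, beta / np.sqrt(np.diag(cov))

# Simulated data (illustrative values, not from the question).
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 1.5 * x2 + 0.5 * x1 * x2 + rng.normal(size=n)

# Fit the full model (assumed Model 1) and inspect every coefficient.
X_full = np.column_stack([np.ones(n), x1, x2, x1 * x2])
beta, t_stats = fit_ols(X_full, y)
for name, b, t in zip(["const", "x1", "x2", "x1:x2"], beta, t_stats):
    print(f"{name:>6}: coef={b: .3f}  t={t: .2f}")
```

If a main effect really were zero, its t-statistic in the full fit would tend to be small, and you would learn that from the data instead of assuming it up front.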
The key is understanding what the restrictions mean. The typical admonition to avoid Models 3-5 exists because, in most applications, the assumptions they impose are scientifically implausible. Model 3 assumes X2 influences only the slope dY/dX1 but not the level. Model 4 assumes X1 influences only the slope dY/dX2 but not the level. And Model 5 assumes neither X1 nor X2 affects the level, only the slopes dY/dX1 and dY/dX2. Model 2 also imposes a zero coefficient but still has some merit: it gives the best linear approximation to the data, which in many cases satisfies the scientific goal.
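Writing out the derivatives may make these restrictions easier to see. The model forms below are my reading of Models 1-5, inferred from the descriptions above rather than copied from the question, so treat them as an assumption:

```latex
\begin{align*}
\text{Model 1: } Y &= \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1 X_2,
  & \frac{\partial Y}{\partial X_1} &= \beta_1 + \beta_3 X_2,
  & \frac{\partial Y}{\partial X_2} &= \beta_2 + \beta_3 X_1 \\
\text{Model 2: } Y &= \beta_0 + \beta_1 X_1 + \beta_2 X_2,
  & \frac{\partial Y}{\partial X_1} &= \beta_1,
  & \frac{\partial Y}{\partial X_2} &= \beta_2 \\
\text{Model 3: } Y &= \beta_0 + \beta_1 X_1 + \beta_3 X_1 X_2,
  & \frac{\partial Y}{\partial X_1} &= \beta_1 + \beta_3 X_2,
  & \frac{\partial Y}{\partial X_2} &= \beta_3 X_1 \\
\text{Model 4: } Y &= \beta_0 + \beta_2 X_2 + \beta_3 X_1 X_2,
  & \frac{\partial Y}{\partial X_1} &= \beta_3 X_2,
  & \frac{\partial Y}{\partial X_2} &= \beta_2 + \beta_3 X_1 \\
\text{Model 5: } Y &= \beta_0 + \beta_3 X_1 X_2,
  & \frac{\partial Y}{\partial X_1} &= \beta_3 X_2,
  & \frac{\partial Y}{\partial X_2} &= \beta_3 X_1
\end{align*}
```

Under this reading, Model 3 forces the effect of X2 to vanish whenever X1 = 0, Model 4 does the same for X1 when X2 = 0, and Model 5 forces both. Model 2 instead forces each slope to be constant.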