Consider a nearest-neighbor Ising Hamiltonian $$H = \sum_{i=1}^n J_i \sigma_i^z \sigma_{i+1}^z.$$ Let $X = \sum_{i=1}^n \sigma_i^x$. The QAOA ansatz is $$|\mathbf{\beta}, \mathbf{\gamma}\rangle = \exp\left(-i\beta_1 X \right)\exp\left(-i\gamma_1 H \right) \dots \exp\left(-i\beta_p X \right)\exp\left(-i\gamma_p H \right) |+\rangle^{\otimes n}$$ The expectation value of an operator $A$ under this ansatz is $\langle A \rangle = \langle\mathbf{\beta}, \mathbf{\gamma}|A|\mathbf{\beta}, \mathbf{\gamma}\rangle$.
Let's say that $A$ is supported only on sites $2$ and $3$. And consider $p=1$. Then we see that \begin{align*} \langle A \rangle &= \langle +|^{\otimes n} \exp(i\gamma_1 H)\exp(i\beta_1 X)A\exp(-i\beta_1 X)\exp(-i\gamma_1 H) |+\rangle^{\otimes n} \\ &= \langle +|^{\otimes n} \exp\left(i\gamma_1 H\right)\exp\left(i\beta_1 (\sigma_2^x + \sigma_3^x)\right)A\exp\left(-i\beta_1 (\sigma_2^x + \sigma_3^x)\right)\exp\left(-i\gamma_1 H\right) |+\rangle^{\otimes n} \\ &= \langle +|^{\otimes n} \exp\left(i\gamma_1 \left(J_1 \sigma_1^z \sigma_2^z + J_2 \sigma_2^z \sigma_3^z + J_3 \sigma_3^z \sigma_4^z\right)\right)\exp\left(i\beta_1 (\sigma_2^x + \sigma_3^x)\right)A\\&\qquad\exp\left(-i\beta_1 (\sigma_2^x + \sigma_3^x)\right)\exp\left(-i\gamma_1 \left(J_1 \sigma_1^z \sigma_2^z + J_2 \sigma_2^z \sigma_3^z + J_3 \sigma_3^z \sigma_4^z\right)\right) |+\rangle^{\otimes n} \\ \end{align*} where:
- in the second line we used that all the other $\sigma_i^x$ terms commuted through $A$ and canceled, since $A$ is only supported on 2 and 3.
- in the third line we used that all the other $\sigma_i^z \sigma_{i+1}^z$ terms in $H$ commute through $A$ and canceled.
Now look at the last line. Imagine if $p=2$. Then we would have another $\exp(-i \beta_2 X)$ to worry about. This time though, the $\sigma_1^z$ and $\sigma_4^z$ terms would get in the way of $\sigma_1^x$ and $\sigma_4^x$ commuting all the way through $A$ and canceling out. And then similarly, when we add the $\exp(-i \gamma_2 H)$ term, the $\sigma_4^z \sigma_5^z$ term would not commute all the way through and cancel, because now we have a $\sigma_4^x$ term that gets in the way. You can imagine doing this again with $p=3$, and so on.
So in summary:
- we want to determine the expectation value of some local observable $A$.
- when we look at $\langle A \rangle$ at $p=1$, almost all of the terms commute through $A$ and cancel. So only the qubits very close to the support of $A$ actually contribute to $\langle A \rangle$.
- for each successive $p$, you get more terms (e.g. $\sigma_i^x$ terms) "in the way", so the full expression for $\langle A \rangle$ includes qubit terms that are less local to $A$.
- eventually, all the qubits will contibute to $\langle A \rangle$, even though $A$ itself is only supported locally on a small number of qubits.
Potentially this is everything you already knew. For the TSP specifically, the exact meaning of $p$ depends on the particular Ising/QUBO formulation of the TSP. But certainly $p$ is related to the number of variables that are being taken into account at a time when calculating some observable quantity (this is what we just showed above). With TSP, variables are presumably related to nodes that must be visited, perhaps at a certain time interval. So then trying to optimize $\langle A \rangle$ for some $A$ would only take into account nodes to are somehow $f(p)$-local to $A$, where $f$ is just some monotonically increasing function. But recall with QAOA that you are trying to optimize $\langle H \rangle$. $H$ is composed of a sum of local observables. So I don't think it is right to say that level-$p$ gives you something related to a $p$-locally optimized solution. I think it is much more complicated, because you are optimizing a global sum of a bunch of $f(p)$-local quantities, which is itself still somehow global.
So in summary, how exactly $p$ is related to properties of the resulting solution to TSP is very nontrivial. Truthfully, it may be that this is a good thing. If we could say something as definite as "QAOA${}_p$ gives a $p$-locally optimized solution", then we would almost definitely find that classical "$p$"-local greedy methods would compete with and/or outperform QAOA${}_p$. The complicated behavior of how a solution is related to $p$ is potentially a reason that QAOA could give us heuristically better performance than classical approximation methods.
This is all just my impression, which could of course be wrong.