I see two main reasons:
Pragmatic reasons: It is a reasonable dividing line for most problems that arise in practice. Most problems people encounter in practice have time complexity something like $O(n)$, $O(n \log n)$, $O(n^2)$, $O(n^3)$, $O(2^n)$, etc. Notice the big gap there: relatively few natural algorithms fall in the intermediate zone (i.e., super-polynomial but sub-exponential running time). Consequently, "polynomial or not?" does a pretty good job of separating the feasible algorithms from the infeasible ones, for most problems we encounter in practice. It's definitely not perfect, and there absolutely are counter-examples, but it's pretty good.
For example, if you run across a real-world problem in practice and I tell you that there is a polynomial-time algorithm for it, then empirically it is likely that there is a practical algorithm for it. It's not guaranteed, but this rule of thumb holds reasonably well in practice. In particular, it means that the theory (which is framed in terms of worst-case polynomial running time) makes predictions that are reasonably accurate in practice.
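The gap is easy to see with a rough calculation. The sketch below (plain Python, assuming a hypothetical machine that executes about $10^9$ elementary steps per second) estimates the wall-clock time each of these running times would need at $n = 100$:

```python
import math

# Back-of-the-envelope illustration of the polynomial/exponential gap:
# assume (hypothetically) a machine doing about 10**9 elementary steps
# per second, and estimate the running time of each growth rate at n = 100.
n = 100
steps_per_second = 1e9  # assumed machine speed, not a measured figure

for name, steps in [
    ("n log n", n * math.log2(n)),
    ("n^2", n**2),
    ("n^3", n**3),
    ("2^n", 2**n),
]:
    seconds = steps / steps_per_second
    print(f"{name:>7}: ~{seconds:.2e} seconds")
```

At $n = 100$ the polynomial running times all finish in well under a second, while $2^n$ needs on the order of $10^{21}$ seconds, vastly longer than the age of the universe. That is the sense in which the polynomial/exponential divide tracks feasibility.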
Theoretical reasons: This dividing line has some nice properties that make for a clean mathematical theory. For instance, polynomials are closed under composition: if an algorithm performs a polynomial number of steps, each of which may invoke a subroutine that itself runs in polynomial time, the result still runs in polynomial time. This makes many theoretical results easier to prove and lets theoreticians make progress faster. It abstracts away aspects of the problem (e.g., $O(n^2)$ vs. $O(n^3)$) that matter less, and lets theoreticians focus narrowly on a particular, cleanly specifiable mathematical problem.
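To see why closure under composition gives this, here is the one-line bound (the names $p$, $q$, $T$ are just for this sketch). Suppose the outer algorithm runs for at most $p(n)$ steps on inputs of size $n$, and the subroutine runs in time at most $q(m)$ on inputs of size $m$. Since the outer algorithm can only construct subroutine inputs of size at most $p(n)$, the total running time satisfies

$$T(n) \;\le\; p(n) \cdot q\bigl(p(n)\bigr),$$

which is again a polynomial in $n$, because both the composition $q \circ p$ and the product of two polynomials are polynomials.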
Also, for almost every realistic model of computation we know of, an algorithm that runs in polynomial time on one model runs in polynomial time on essentially every other model that can be instantiated in practice. This means the theory is robust to minor details of how computers are formalized, which is also convenient for building a theoretical foundation.
Combine these two -- a clean formalization that lets theoreticians make progress, and reasonably accurate predictions in practice -- and you get a mathematical theory that is both deep and useful.
There is a lot written on this subject. To learn more, please read "Why polynomial time is called 'efficient'?", "Why are Polynomial Time Problems Considered Tractable, and Larger Times are Not?", [Why is "P vs. NP" necessarily relevant?](https://mathoverflow.net/q/54239/37212), [Polynomial-time algorithms with huge exponent/constant](https://cstheory.stackexchange.com/q/6660/5038), and [these lecture notes](https://web.archive.org/web/20191113115650/http://people.eecs.berkeley.edu/~luca/cs170/notes/lecture21.pdf). (Thanks to Mark Dominus for providing one of these links.)