The Dirac differential operator is the "square root" of the Klein-Gordon differential operator, in the same sense as $-(a^2 +b^2)=(ia-b)(ia+b)$. That is the main message:
\begin{align}(i\gamma^\mu\partial_\mu-m)(i\gamma^\nu\partial_\nu +m) &= -\gamma^\mu\gamma^\nu\partial_\mu\partial_\nu -m^2 \\ &= -(\gamma^{\left[\mu\right.}\gamma^{\left.\nu\right]} + \gamma^{\left(\mu\right.}\gamma^{\left.\nu\right)})\partial_\mu\partial_\nu -m^2 \\ &= -\gamma^{\left(\mu\right.}\gamma^{\left.\nu\right)} \partial_\mu\partial_\nu -m^2 \\ &= -\eta^{\mu\nu}\partial_\mu\partial_\nu -m^2 = -(\Box +m^2) \end{align}
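In this calculation we dropped $\gamma^{\left[\mu\right.}\gamma^{\left.\nu\right]}\partial_\mu\partial_\nu$ because $\gamma^{\left[\mu\right.}\gamma^{\left.\nu\right]}$ is antisymmetric in $\mu\nu$, whereas the succession of two partial derivatives $\partial_\mu\partial_\nu$ is symmetric, so their contraction vanishes. The symmetric part is fixed by the Clifford algebra, $\gamma^{\left(\mu\right.}\gamma^{\left.\nu\right)} = \frac{1}{2}\{\gamma^\mu,\gamma^\nu\} = \eta^{\mu\nu}$. It is also essential for this factorization that the "mass operator" commutes with $i\gamma^\mu\partial_\mu$, so that the cross terms cancel.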
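The factorization can be checked numerically in momentum space. The following is only a sketch under stated assumptions: it uses the Dirac representation of the $\gamma$ matrices, the metric signature $(+,-,-,-)$, and plane waves $e^{-ip\cdot x}$, so that $i\gamma^\mu\partial_\mu \rightarrow \gamma^\mu p_\mu$ and $-(\Box + m^2) \rightarrow p^2 - m^2$. The helper names (`clifford_holds`, `slash`, `factorization_residual`) are made up for this illustration.

```python
import numpy as np

# Pauli matrices and 2x2 building blocks
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)

# Gamma matrices in the Dirac representation (an assumed choice;
# any representation of the Clifford algebra would do).
gammas = [np.block([[I2, Z2], [Z2, -I2]])]
gammas += [np.block([[Z2, s], [-s, Z2]]) for s in (sx, sy, sz)]

eta = np.diag([1.0, -1.0, -1.0, -1.0])  # assumed signature (+,-,-,-)
I4 = np.eye(4, dtype=complex)


def clifford_holds():
    """Check {gamma^mu, gamma^nu} = 2 eta^{mu nu} for all index pairs."""
    return all(
        np.allclose(gammas[m] @ gammas[n] + gammas[n] @ gammas[m],
                    2 * eta[m, n] * I4)
        for m in range(4) for n in range(4)
    )


def slash(p):
    """Return gamma^mu p_mu for contravariant components p^mu."""
    p_lower = eta @ np.asarray(p, dtype=float)
    return sum(g * pl for g, pl in zip(gammas, p_lower))


def factorization_residual(p, m):
    """Largest entry of (pslash - m)(pslash + m) - (p^2 - m^2) * identity."""
    p = np.asarray(p, dtype=float)
    lhs = (slash(p) - m * I4) @ (slash(p) + m * I4)
    rhs = (p @ eta @ p - m**2) * I4
    return np.max(np.abs(lhs - rhs))


if __name__ == "__main__":
    print(clifford_holds())
    print(factorization_residual([1.3, 0.2, -0.7, 0.5], 0.9))
```

The residual should vanish up to floating-point rounding for any four-momentum, on-shell or not, since the identity is purely algebraic.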
The square-root analogy can be extended a little further. Instead of writing the kinetic term of the Klein-Gordon (KG) Lagrangian as $\partial^\mu \phi \partial_\mu\phi$, we can integrate by parts and write:
$$\partial^\mu \phi \partial_\mu\phi = -\phi \partial^\mu\partial_\mu \phi + \partial^\mu (\phi \partial_\mu \phi)$$
and simply drop the total derivative $\partial^\mu (\phi \partial_\mu \phi)$, which contributes only a boundary term to the action. The resulting Lagrangian also leads to the KG equation. In that case the KG Lagrangian looks like this (up to overall normalization):
$${\cal{L}} = -\phi (\Box + m^2)\phi$$
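The integration-by-parts identity used above can be verified symbolically. The sketch below assumes, purely for brevity, 1+1 dimensions with signature $(+,-)$ and a generic real field `phi(t, x)`; the function name `ibp_residual` is made up for this illustration, and SymPy is assumed to be available.

```python
import sympy as sp

t, x = sp.symbols("t x", real=True)
phi = sp.Function("phi")(t, x)


def ibp_residual():
    """Return d^mu(phi) d_mu(phi) - [ -phi Box phi + d^mu(phi d_mu phi) ]."""
    # Kinetic term with signature (+,-): (d_t phi)^2 - (d_x phi)^2
    kinetic = sp.diff(phi, t) ** 2 - sp.diff(phi, x) ** 2
    # d'Alembertian: Box phi = d_t^2 phi - d_x^2 phi
    box_phi = sp.diff(phi, t, 2) - sp.diff(phi, x, 2)
    # Total derivative: d^mu (phi d_mu phi)
    total = sp.diff(phi * sp.diff(phi, t), t) - sp.diff(phi * sp.diff(phi, x), x)
    return sp.expand(kinetic - (-phi * box_phi + total))


if __name__ == "__main__":
    print(ibp_residual())
```

A zero residual confirms that the two forms of the kinetic term differ only by the total derivative, which is why both lead to the same equation of motion.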
To get as close as possible to the Dirac Lagrangian, it is better to start from the Lagrangian of the complex KG field:
$${\cal{L}} = -\phi^\dagger (\Box + m^2)\phi$$
As a first step we replace $\phi \rightarrow \psi$ and $\phi^\dagger \rightarrow \bar{\psi}$, and then, of the two factors $i\gamma^\mu\partial_\mu\pm m$, we choose the one with the minus sign:
$${\cal{L}} = \bar{\psi}(i\gamma^\mu\partial_\mu-m)\psi$$
Therefore, it is not the Lagrangian that is square-rooted, but the differential operator inside.
Needless to say, we cannot simply keep the complex scalar field sandwiching the differential operator. The $\gamma^\mu$ matrices not only have vector character but are also "bispinor tensors" (the spinor indices are usually suppressed in order to avoid clutter), so the operator must be sandwiched between spinors in order to make the Lagrangian Lorentz-covariant.
It was the genius of P. Dirac to recognize this, thereby giving up the scalar character of the fields involved.