$\DeclareMathOperator*{\argmin}{argmin}$
Before you try to solve the inverse problem, you have to address an issue of identifiability. That is, from observations along $v(u)$ alone, it is impossible to uniquely specify all four of your parameters simultaneously. To see this, notice that the derivative of $v(u)$ can be written as $$ \frac{dv}{du} = \frac{v'}{u'} = \frac{au+bv}{cu+dv} = \left(\frac{a}{c}\right)\frac{u + \left(\frac{b}{a}\right)v}{u + \left(\frac{d}{c}\right)v} = \alpha\frac{u+\beta v}{u+\gamma v}. $$ Therefore, only the ratios $\alpha = a/c$, $\beta = b/a$, and $\gamma = d/c$ are identifiable from the curve $v(u)$. Once this has been established, you can apply fairly standard methods to determine $\alpha$, $\beta$, and $\gamma$ from $v(u)$.
Assume you have a set of $n$ observations $V = (v_1,\dots,v_n)^T$ corresponding to the points $U = (u_1,\dots,u_n)^T$, and collect the parameters to be inferred in a vector $q = (\alpha,\beta,\gamma)^T$. Since $u$ is a scalar, assume the $u_i$ are sorted in increasing order; any ordering would work as long as the initial condition for the ODE above is selected consistently. A natural choice is the smallest value, $u_1$: supply the initial condition $v(u_1) = v_1$ and integrate toward increasing $u$.
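As a concrete sketch of the forward map, here is how $f(q)$ might be evaluated with `scipy.integrate.solve_ivp`. The parameter values and data points below are made up for illustration; the tolerances are just reasonable defaults, and the code assumes the denominator $u + \gamma v$ stays away from zero along the trajectory.

```python
import numpy as np
from scipy.integrate import solve_ivp

def f(q, U, v0):
    """Solve dv/du = alpha*(u + beta*v)/(u + gamma*v) starting from
    v(U[0]) = v0 and return the solution at every observation point."""
    alpha, beta, gamma = q
    # Right-hand side of the ODE; assumes u + gamma*v != 0 along the path.
    rhs = lambda u, v: alpha * (u + beta * v) / (u + gamma * v)
    sol = solve_ivp(rhs, (U[0], U[-1]), [v0], t_eval=U, rtol=1e-8, atol=1e-10)
    return sol.y[0]

# Hypothetical setup: U must be sorted in increasing order.
U = np.linspace(1.0, 3.0, 20)
model = f(np.array([0.5, 1.2, 0.8]), U, v0=1.0)  # v(u_i; q) at each u_i
```

Passing `t_eval=U` makes the integrator report the solution exactly at the observation points, so the output lines up entry-by-entry with $V$.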
With these choices in place, we consider the function $f:\mathbb{R}^3\to\mathbb{R}^n$ given by $f(q) = (v(u_1;q),\dots,v(u_n;q))^T$, where $v(u_i;q)$ is the numerically computed solution of the $dv/du$ ODE with parameters $q = (\alpha,\beta,\gamma)^T$ and the initial condition discussed above. With this, we can write the nonlinear least squares problem $$ \begin{aligned} J(q) &= \|V - f(q)\|_2^2 \\ q^* &= \argmin_{q\in\mathbb{R}^3} J(q). \end{aligned} $$ Once we have a routine to compute $J(q)$ (or just the residual $V-f(q)$, depending on the solver), most unconstrained optimization solvers can be used to find an answer. Note that we don't have an analytic gradient of $f$ (and don't want to derive one), so we rely on finite-difference derivatives, which is the default for most general-purpose routines. This is the case for fminunc and lsqnonlin in MATLAB, and for optimize.minimize and optimize.least_squares in SciPy. Alternatively, some ecosystems can compute derivatives using adjoints or automatic differentiation, such as Julia's SciML ecosystem mentioned in another answer, although those tools allow finite differences as well.