It has been a long time since I've thought about this problem. It is probably hard to solve with NDSolve, but it works with finite differences. Start with conditions on the surface and work your way down the rod string to the bottom.
General wave equation with viscous damping and gravity.
pde = D[u[x, t], t, t] == a^2*D[u[x, t], x, x] - c*D[u[x, t], t] - g
u = displacement of rod x = position on the rod string a = speed of sound in the rod string (wave velocity) c = viscous damping coefficient g = acceleration due to gravity F = Load (Tension at a point on the rod)
Finite differences as a rule.
fd = {dttu[i, n] -> ( u[i, n + 1] - 2 u[i, n] + u[i, n - 1])/Δt^2, dxxu[i, n] -> ( u[i + 1, n] - 2 u[i, n] + u[i - 1, n])/Δx^2, dtu[i, n] -> (u[i, n + 1] - u[i, n - 1])/(2 Δt), dxu[i, n] -> (u[i + 1, n] - u[i - 1, n])/(2 Δx), F[i, n] -> -EA (u[i + 1, n] - u[i - 1, n])/(2 Δx), a -> Δx/Δt, c -> γ/Δt}
Get the pde in finite difference form.
pdefd = dttu[i, n] == a^2 dxxu[i, n] - c dtu[i, n] - g /. fd (*(u[i, n - 1] - 2*u[i, n] + u[i, n + 1])/Δt^2 == -((γ*(u[i, n + 1] - u[i, n-1]))/(2*Δt^2)) + (u[i - 1, n] - 2*u[i, n] + u[i + 1, n])/Δt^2 - g*) Solve[pdefd, u[i + 1, n]] // Flatten // Simplify // Collect[#, {u[i, n + 1], u[i, n - 1]}] & (*{u[i + 1, n] -> (1/2)*(2 - γ)*u[i, n - 1] + (γ/2 + 1)*u[i, n + 1] - u[i - 1, n] + Δt^2*g}*) {u[i + 1, n] -> (1/2 (2 - γ) // Expand) u[i, n - 1] + (γ/2 + 1) u[i, n + 1] - u[i - 1, n] + Δt^2 g} (*{u[i + 1, n] -> (1 - γ/2) u[i, n - 1] + (γ/2 + 1) u[i, n + 1] - u[i - 1, n] + Δt^2 g}*)
Get u[2,n]
u2nRule = % /. i -> 1 (*{u[2, n] -> (1 - γ/2)*u[1, n - 1] + (γ/2 + 1)*u[1, n + 1] - u[0, n] + Δt^2*g}*)
From the F finite difference
(u2nRule /. u[0, n] -> u[2, n] - (2 Δx)/EA F[n])[[1]] /. Rule -> Equal (*u[2, n] == (1 - γ/2) u[1, n - 1] + (γ/2 + 1) u[1, n + 1] - u[2, n] + (2 Δx F[n])/ EA + Δt^2 g*) Solve[%, F[n]] // Flatten // Expand // Collect[#, {u[1, n + 1], u[1, n - 1]}] & (*{F[n] -> u[1, n - 1]*((γ*EA)/(4*Δx) - EA/(2*Δx)) + u[1, n + 1]*(-((γ*EA)/(4*Δx)) - EA/(2*Δx)) + (EA*u[2, n])/Δx - (Δt^2*EA*g)/(2*Δx)}*)
Simplify a little
F[n] == EA/(2 Δx) ((F[n] /. % ) (2 Δx)/EA // Expand) // Collect[#, {EA/(2 Δx), u[1, n + 1], u[1, n - 1]}] & (*F[n] == (EA*((γ/2 - 1)*u[1, n - 1] + (-(γ/2) - 1)*u[1, n + 1] + 2*u[2, n] + Δt^2*(-g)))/(2*Δx))
Manually change back to i's and n's. 2->i+1, 1->i
F[n] == (EA (-((1 - γ/2) u[i, n - 1]) - (γ/2 + 1) u[i, n + 1] + 2 u[i + 1, n] + Δt^2 (-g)))/(2 Δx);
Now that we have established the finite difference equations we can implement the finite difference procedure. We will use actual well data I have saved from a well measured in the early 80's.
data = {{0., 0., 12.689}, {0.335, 0.161, 13.933}, {0.446, 0.31, 14.928}, {0.538, 0.459, 15.177}, {0.611, 0.621, 15.177}, {0.865, 1.242, 15.799}, {1.049, 1.863, 18.66}, {1.217, 2.483, 20.65}, {1.368, 3.104, 21.77}, {1.503, 3.725, 22.765}, {1.641, 4.346, 24.009}, {1.78, 4.967, 24.507}, {1.899, 5.588, 22.89}, {2.016, 6.208, 20.65}, {2.132, 6.829, 19.531}, {2.266, 7.45, 18.411}, {2.382, 8.071, 18.162}, {2.498, 8.692, 17.789}, {2.615, 9.313, 17.789}, {2.755, 9.933, 19.655}, {2.916, 10.554, 20.277}, {3.077, 11.175, 20.402}, {3.275, 11.796, 20.028}, {3.334, 11.957, 19.531}, {3.413, 12.106, 19.033}, {3.509, 12.268, 17.54}, {3.717, 12.417, 16.172}, {3.921, 12.268, 15.301}, {3.995, 12.106, 15.052}, {4.068, 11.957, 15.052}, {4.122, 11.796, 14.928}, {4.286, 11.175, 12.938}, {4.415, 10.554, 10.574}, {4.528, 9.933, 8.832}, {4.625, 9.313, 7.34}, {4.723, 8.692, 6.22}, {4.822, 8.071, 5.598}, {4.921, 7.45, 6.469}, {5.017, 6.829, 9.206}, {5.11, 6.208, 11.694}, {5.202, 5.588, 13.062}, {5.293, 4.967, 14.182}, {5.404, 4.346, 15.052}, {5.496, 3.725, 15.177}, {5.61, 3.104, 14.182}, {5.749, 2.483, 11.569}, {5.909, 1.863, 10.325}, {6.091, 1.242, 10.45}, {6.327, 0.621, 13.186}, {6.406, 0.459, 13.808}, {6.504, 0.31, 14.928}, {6.621, 0.161, 13.933}, {6.889, 0., 12.689}};
Actual well data of the polish rod position and load with time. The polish rod is the top rod of the rod string and is above the surface.
Column 1 is the time in seconds
Column 2 is the position in feet.
Column 3 is the load in units of 1000 lbs.
These are practical oilfield units. Not metric
Assign the values to variables
timez = Table[data[[n, 1]], {n, Length[data]}]; posz = Table[data[[n, 2]], {n, Length[data]}]; loadz = Table[data[[n, 3]], {n, Length[data]}]; tmax = timez[[Length[data]]];
Interpolation functions for position and load
posT = Interpolation[Table[{timez[[n]], posz[[n]]}, {n, Length[data]}]]; loadT = Interpolation[Table[{timez[[n]], loadz[[n]]}, {n, Length[data]}]];
Plot polish rod position vs polish rod load
ParametricPlot[{posT[t], loadT[t]}, {t, 0, 6.889}, AspectRatio -> 1/GoldenRatio, PlotRange -> {{0, 13}, {0, 25}}, AxesLabel -> {"Position", "Load"}]

The polish rod ( the top rod that is above ground) is moving clockwise in the above plot. Higher loads occur when the rod string is rising. It is jerky because it is real data.
More well data.
sg = 0.993;(* produced fluid specific gravity*) a = 1.95538 10^4;(*speed of sound in steel, ft/sec*) Ey = 3 10^4;(*Young's modulus KSI*) g = 32.2 ;(*gravitation constant, ft/sec^2*) n1 = 5; (*initial rod string segments*) n2 = 5;(*alternate rod string setments if necessary*) roddia = {1.0, 0.875, 0.75};(*the rod string consists of 3 segements with these \ diameters, inches*) rodlen = {1950, 2025, 1893};(*rod string segment lengths, feet*) c = 0.2;(*damping coefficient, 1/second*) area = π roddia^2/4;(*square inches)
Now move down the rod string.
ll = 0; nrod = Length[rodlen]; timez = Table[data[[n, 1]], {n, Length[data]}]; posz = Table[data[[n, 2]], {n, Length[data]}]; loadz = Table[data[[n, 3]], {n, Length[data]}]; posT = Interpolation[Table[{timez[[n]], posz[[n]]}, {n, Length[data]}]]; loadT = Interpolation[ Table[{timez[[n]], loadz[[n]]}, {n, Length[data]}]]; Do[ ll = ll + rodlen[[m]]; If[m < nrod, buoy = 0.433 sg ll (area[[m]] - area[[m + 1]])/1000]; EA = Ey area[[m]]; Δx = -rodlen[[m]]/n1; n3 = n2; Δt = -Δx/a; γ = c Δt; j = 40; k = Floor[tmax/Δt + j]; u = Table[0, {ii, 7}, {jj, k}]; t = Table[0, {ii, k}]; F = Table[0, {ii, k}]; Do[ t[[n]] = (n - j/2 - 1) Δt; time = If[t[[n]] >= 0, If[t[[n]] <= tmax, t[[n]], t[[n]] - tmax], t[[n]] + tmax]; u[[1, n]] = posT[time]; F[[n]] = loadT[time]; , {n, 1, k} ]; Do[ u[[2, n]] = 0.5 ((1 - γ/2) u[[1, n - 1]] + (1 + γ/2) u[[1, n + 1]] + 2 Δx/EA F[[n]] + g Δt^2); , {n, 2, k - 1} ]; Do[ Do[ u[[i + 1, n]] = (1 + γ/2) u[[i, n + 1]] - u[[i - 1, n]] + (1 - γ/2) u[[i, n - 1]] + g Δt^2; , {n, i + 1, k - i} ]; , {i, 2, n2 + 1} ]; If[m < nrod, loadz = Table[0, {ii, k}]; timez = Table[0, {ii, k}]; posz = Table[0, {ii, k}]; ]; Do[ F[[n]] = EA/Δx/ 2 (2 u[[n2 + 2, n]] - (1 - γ/2) u[[n2 + 1, n - 1]] - (1 + γ/2) u[[n2 + 1, n + 1]] - g Δt^2); t[[n]] = (n - j/2 - 1) Δt; If[m < nrod && t[[n]] >= 0 && t[[n]] <= tmax + 0.1, loadz[[n - n2 - 1]] = F[[n]] + buoy; timez[[n - n2 - 1]] = t[[n]]; posz[[n - n2 - 1]] = u[[n2 + 1, n]]; ]; If[t[[n]] < 0, t[[n]] = t[[n]] + tmax]; If[t[[n]] > tmax, t[[n]] = t[[n]] - tmax]; , {n, n2 + 2, k - n2 - 1} ]; timez = DeleteCases[timez, 0]; posz = DeleteCases[posz, 0]; loadz = DeleteCases[loadz, 0]; posT = Interpolation[ Table[{timez[[n]], posz[[n]]}, {n, Length[timez]}]]; loadT = Interpolation[Table[{timez[[n]], loadz[[n]]}, {n, Length[timez]}]]; nn = k - 2 (n2 + 1); n2 = n1; , {m, 1, 3} ];
Now we have positions and load conditions at the bottom of the rod string where the pump is. Make new interpolation functions so we can plot the bottom conditions.
tf = Table[t[[n]], {n, j/2 + 1, k - j/2, 2}]; loadf = Table[(F[[n - 3]] + F[[n - 2]] + F[[n - 1]] + F[[n]] + F[[n + 1]] + F[[n + 2]] + F[[n + 3]])/7, {n, j/2 + 1, k - j/2, 2}]; posf = Table[u[[n3 + 1, n]], {n, j/2 + 1, k - j/2, 2}]; posT = Interpolation[Table[{tf[[n]], posf[[n]]}, {n, Length[tf]}]]; loadT = Interpolation[Table[{tf[[n]], loadf[[n]]}, {n, Length[tf]}]]; ParametricPlot[{posT[t], loadT[t]}, {t, 0, tmax}, AspectRatio -> 1/GoldenRatio, AxesLabel -> {"Position", "Load"}]

The ideal bottom hole plot is a perfect rectangle, and this well is in pretty good shape. Again, the jerky plot is typical of real data. I smoothed the loads by taking a 7 point average. The damping coefficient in general is not known with great accuracy. Fortunately, the shape of the bottom hole curve is relatively insensitive to the damping coefficient, and it is the shape, that determines well problems. It is much more difficult to diagnose well problems from the surface data than it is with bottom hole data.
Again, in time the pump moves clockwise, the higher loads occur as the pump rises. This routine also takes into account the change in buoyancy force moving to a smaller diameter rod string.
This procedure is adapted from a FORTRAN program I wrote in about 1982. I have adapted it into Mathematica code, but some purists may think it is not very good Mathematica code and they will be right. This program is very fast on modern computers and I am not about to spend a bunch of hours messing with it. Be glad you don't have to run it on a 2 Mhz 286 computer without a math coprocessor.