My problem is as follows. I have 2D points representing longitude and latitude. I do not have any other information (no time-stamps). The number of points is limited: a few hundred. The curve is not of the form Latitude = f(Longitude).
The goal is to obtain a smooth version of the curve, i.e. a parametrization from t=0 to t=1 that follows the curve. As shown in the following example (desired output in red), I would prefer to limit the curvature of the fitted curve.
Are there general methods to solve this kind of problem? I have already tried some R packages and methods, with mixed results.
Here is the example set of points.
set.seed(1222)
N = 250
r = 7*runif(N)
x = r*cos(r) + rnorm(N, 0, 0.01)
y = x*r*sin(r) + rnorm(N, 0, 1)
M = data.frame(cbind(x,y))
M = M[-which(M[,1] > 4 & M[,1] < 5),]
plot(M)
So far, I have tried several things:
1/ A simple non-parametric smoothing does not give the desired output.
# Take the mean over closest points
distance_M = as.matrix(dist(M))
M_new = M
eps = 5
for(i in 1:nrow(M)) {
  M_new[i,] = apply(M[which(distance_M[i,] < eps),], 2, mean)
}
plot(M)
lines(M_new, type = "p", col = "red")

2/ Manifold identification to transform points to a time-series. This works a little better, although it is very sensitive to the chosen parameter. I tested kPCA, LLE and isomap. Only isomap gives correct outputs.
## Order according to the 1st component of linear PCA
# (will not work for this example, even with a nonlinear kernel)
library(kernlab)
library(colorRamps)
kpc <- kpca(~.,data=data.frame(M), kernel="vanilladot", kpar = list(), features = 2)
out_kpc = predict(kpc,data.frame(M))[,1]
plot(M, col = blue2red(300)[cut(out_kpc,300,labels=FALSE)])

## Isomap:
# * Global manifold learning
# * Need compute of the distance, so can be very long with > 2000 points
library(vegan)
library(colorRamps)
out_isomap = isomap(dist(M), k=5, ndim = 1)
plot(M, col=blue2red(300)[cut(out_isomap$points,300,labels=FALSE)])
# k < 5 : Data are fragmented
# 5 <= k <= 11 : OK
# k > 11 : Looks like PCA

## LLE:
# * Local manifold learning (may not be suitable for some applications)
# * Quicker than isomap
library(lle)
library(colorRamps)
out_lle = lle(M, m = 1, k=9)
plot(M, col=blue2red(300)[cut(out_lle$Y,300,labels=FALSE)])

Here is the result with isomap. From this result, I can try to smooth the obtained time series ('the obtained time series' = the curve shown in gray)...
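To make that smoothing step concrete, here is a minimal sketch, assuming the 1-D isomap coordinate is a usable ordering of the points. It rescales the coordinate to t in [0, 1] and fits one smoothing spline per coordinate with `smooth.spline` from base R. The names `t_par`, `fit_x`, `fit_y`, `curve_xy` and the value `df = 10` are illustrative assumptions, not tuned choices.

```r
library(vegan)

# Same simulated points as in the example above
set.seed(1222)
N = 250
r = 7*runif(N)
x = r*cos(r) + rnorm(N, 0, 0.01)
y = x*r*sin(r) + rnorm(N, 0, 1)
M = data.frame(cbind(x,y))
M = M[-which(M[,1] > 4 & M[,1] < 5),]

# 1-D isomap coordinate, rescaled to a parameter in [0, 1]
out_isomap = isomap(dist(M), k = 5, ndim = 1)
stopifnot(nrow(out_isomap$points) == nrow(M))  # guard: no point was dropped
s = as.numeric(out_isomap$points[, 1])
t_par = (s - min(s)) / (max(s) - min(s))

# One smoothing spline per coordinate; df = 10 is an arbitrary choice
fit_x = smooth.spline(t_par, M$x, df = 10)
fit_y = smooth.spline(t_par, M$y, df = 10)

# Evaluate the parametrized curve (x(t), y(t)) on a regular grid
t_grid = seq(0, 1, length.out = 200)
curve_xy = data.frame(t = t_grid,
                      x = predict(fit_x, t_grid)$y,
                      y = predict(fit_y, t_grid)$y)

plot(M)
lines(curve_xy$x, curve_xy$y, col = "red", lwd = 2)
```

A lower `df` gives a stiffer curve (less curvature), at the cost of following the points less closely. The result still depends on how good the isomap ordering is.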
3/ Other ideas...
I found an article about "Geodesic Regression on Riemannian Manifolds", but I cannot find a related R package for it.
For linear identification, orthogonal linear regression can do this task, because x and y are taken into account symmetrically. For nonlinear regression, I tried nonlinear PCA (in kernlab) and "Orthogonal Nonlinear Least-Squares Regression" (onls package), without success.
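For the linear case only, the symmetric treatment of x and y can be sketched with the first principal axis from `prcomp`, which is the orthogonal (total least squares) regression line. The data below are hypothetical straight-line points with noise in both coordinates, and the names `P`, `slope`, `intercept` and `t_par` are illustrative.

```r
# Hypothetical straight-line data with noise in both coordinates
set.seed(1)
n = 200
u = runif(n, 0, 10)
P = cbind(x = u + rnorm(n, 0, 0.3),
          y = 2*u + 1 + rnorm(n, 0, 0.3))

# First principal axis = orthogonal regression line through the centroid
pc = prcomp(P, center = TRUE, scale. = FALSE)
v = pc$rotation[, 1]                       # unit direction of the line
slope = unname(v[2] / v[1])                # invariant to the sign of v
intercept = unname(pc$center[2] - slope * pc$center[1])

# Position of each point along the line, rescaled to [0, 1]
s = pc$x[, 1]
t_par = (s - min(s)) / diff(range(s))

plot(P, asp = 1)
abline(intercept, slope, col = "red", lwd = 2)
```

The scores along the first axis give the parametrization directly, which is the part that does not carry over to a curve that folds back on itself.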
Once the correct time series is identified, I have seen that a Kalman filter may be used to smooth the curve. I have seen many R examples with one output (one-dimensional time series forecasting), but I cannot find a simple code example for this task (two-dimensional smoothing).
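As a starting point for the two-dimensional case, here is a minimal hand-written sketch in base R of a Kalman filter followed by a Rauch-Tung-Striebel backward pass. It assumes the points are already ordered along the curve, a constant-velocity state model (x, y, vx, vy) with unit time step, and arbitrary noise variances `q` and `r`. The function name `kalman_smooth_2d` and the simulated spiral are hypothetical, for illustration only.

```r
# Kalman filter + RTS smoother for ordered 2-D points (rows of Z).
# State = (x, y, vx, vy); constant-velocity model, unit time step (assumed).
kalman_smooth_2d = function(Z, q = 0.01, r = 1) {
  n = nrow(Z)
  stopifnot(n >= 2, ncol(Z) == 2)
  Fm = rbind(c(1, 0, 1, 0),
             c(0, 1, 0, 1),
             c(0, 0, 1, 0),
             c(0, 0, 0, 1))              # state transition
  H = cbind(diag(2), matrix(0, 2, 2))    # only positions are observed
  Q = q * diag(4)                        # process noise (assumed)
  R = r * diag(2)                        # measurement noise (assumed)

  m_p = matrix(0, n, 4); P_p = array(0, c(4, 4, n))  # predicted
  m_f = matrix(0, n, 4); P_f = array(0, c(4, 4, n))  # filtered

  m = c(Z[1, ], 0, 0)                    # prior mean: first point, zero velocity
  P = diag(c(r, r, 10, 10))              # vague prior on the velocities
  for (k in 1:n) {
    if (k > 1) {                         # prediction step
      m = as.numeric(Fm %*% m)
      P = Fm %*% P %*% t(Fm) + Q
    }
    m_p[k, ] = m; P_p[, , k] = P
    S = H %*% P %*% t(H) + R             # update step
    K = P %*% t(H) %*% solve(S)
    m = as.numeric(m + K %*% (Z[k, ] - H %*% m))
    P = (diag(4) - K %*% H) %*% P
    m_f[k, ] = m; P_f[, , k] = P
  }

  m_s = m_f                              # backward (RTS) pass
  for (k in (n - 1):1) {
    G = P_f[, , k] %*% t(Fm) %*% solve(P_p[, , k + 1])
    m_s[k, ] = m_f[k, ] + as.numeric(G %*% (m_s[k + 1, ] - m_p[k + 1, ]))
  }
  m_s[, 1:2]                             # smoothed positions
}

# Hypothetical ordered, noisy spiral
set.seed(1)
s = seq(0, 7, length.out = 300)
truth = cbind(x = s*cos(s), y = s*sin(s))
Z = truth + matrix(rnorm(2*300, 0, 0.3), ncol = 2)
Z_s = kalman_smooth_2d(Z, q = 0.001, r = 0.09)

plot(Z, asp = 1)
lines(Z_s, col = "red", lwd = 2)
```

The ratio `q / r` controls the stiffness: a smaller `q` gives a smoother curve with less curvature. This sketch does not solve the ordering problem itself; it only applies once an ordering is available.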
