UPDATE: after additional clarification it turned out that OP meant "set of points" instead of "sequence of points", and so the problem is reduced to TSP. Ignore the following solution which assumes "sequence of points".
The first approach is that you could solve using Floyd-Warshall algorithm in $O(|V|^3)$ time.
1) Convert the grid into a weighted graph where $w(v_i,v_j) = 1$ for each neighbouring pair of points $(p_i,p_j)$ on the grid. I assume that every move in any of the eight directions from a point takes 1 step (but you may use a different metric).
2) Compute shortest paths for each pairs of vertices (points on the grid) $(v_i,v_j)$, by storing them in the matrix $M$, where $M[i][j]$ means the shortest path from point $p_i$ to the point $p_j$, using Floyd-Warshall algorithm.
3) Then for the given sequence of points $[p_1(x_1,y_1),\dots, p_k(x_k, y_k)]$, compute the sum $M[p_1][p_2] + M[p_2][p_3] + \dots + M[p_{n-1}][p_n]$ (The shortest path from $p_1$ to $p_2$ plus the shortest to $p_3$ plus $\dots$).
You could take a different approach (a greedy algorithm). Given a sequence of points $[p_1(x_1,y_1),\dots, p_k(x_k, y_k)]$, you first compute the shortest path from $p_1$ to $p_2$ (using Fredman-Tarjan algorithm, using Fibonacci heaps) in $O(E + V\log{V})$ time, then from $p_2$ to $p_3$, and so until the last point. So, its complexity is $O(K(E + V\log{V}))$, where $K$ is the number of points $p_i$. To improve further observe that each vertex has at most eight neighbours and so the graph has at most $8V/2 = 4V$ edges, and hence the time complexity is $O(K(4V + V\log{V}))$ which is $O(KV\log{V}))$.
Yet another approach to improve the greedy algorithm is using BFS for finding the shortest path from one point to another one since your graph has edges only of weight 1 (may be considered as unweighted). BFS works in $O(V+E)$ time or $O(V)$ since your vertices have constant number of neighbours (eight at most). So the running time may be reduced to $O(KV)$.
And finally, you may compute the shortest path from one point $p_i$ to another $p_j$ in $O(1)$ by $\max(|x_i-x_j|, |y_i-y_j|)$ given coordinates $(x_i, y_i)$ and $(x_j,y_j)$ and reduce the algorithm to $O(K)$ time.