$\begingroup$

I need to group people who are driving together, based on GPS data collected by their mobile phones. Each user sends a batch every 10 seconds, and each batch contains a list of GPS samples (location, speed, direction) taken every 2 seconds.

The ideal solution would process this data in real time and identify and update groups of people driving together. However, data from a user may arrive out of order (e.g., due to connectivity loss). Eventually we should receive all entries, but this makes real-time processing much more complicated.

Instead, I want to start with post-processing. I plan to normalize each user's data over a given period using linear regression, so that every user has locations at the same timestamps, and then group users with a clustering algorithm. Would this be a good approach? If so, which clustering algorithm would you recommend? Or are there better ways to solve this?
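
To make the normalization step concrete, here is a minimal sketch that puts one user's samples onto a shared list of timestamps. It uses piecewise linear interpolation between neighbouring samples rather than a regression fit, which is an assumption about what "normalize" should mean here. The function name `resample_track`, the `(t, lat, lon)` tuple layout, and the choice to return `None` outside the observed time range are all hypothetical.

```python
from bisect import bisect_left


def resample_track(samples, grid):
    """Interpolate (t, lat, lon) samples onto the timestamps in `grid`.

    `samples` may arrive out of order, so they are sorted by timestamp first.
    Grid timestamps outside the observed range yield None (no extrapolation).
    """
    pts = sorted(samples)
    times = [p[0] for p in pts]
    out = []
    for t in grid:
        if not pts or t < times[0] or t > times[-1]:
            out.append(None)
            continue
        i = bisect_left(times, t)
        if times[i] == t:
            # Exact hit: use the recorded position as-is.
            out.append((pts[i][1], pts[i][2]))
            continue
        # t lies strictly between samples i-1 and i.
        t0, lat0, lon0 = pts[i - 1]
        t1, lat1, lon1 = pts[i]
        w = (t - t0) / (t1 - t0)
        out.append((lat0 + w * (lat1 - lat0), lon0 + w * (lon1 - lon0)))
    return out
```

Interpolating latitude and longitude directly is only reasonable over short gaps such as the 2-second spacing described above; longer gaps (for example after a connectivity loss) would probably be better left as missing.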

$\endgroup$
  • $\begingroup$ It is not clear what is meant by driving together. Does it mean people travelling in roughly the same direction, at a similar speed, at the same location? Or just a similar direction irrespective of speed? $\endgroup$ Commented Oct 29, 2022 at 13:37
  • $\begingroup$ @amolgoel By driving together I mean people who are in the same vehicle. If I had continuous and accurate positions for all of those people, I would say that people driving together are those who stay within a 1 or 2 m radius of each other at any given point in time. Thinking about this, I realised that normalisation/interpolation of the GPS data from each user is crucial. $\endgroup$ Commented Oct 30, 2022 at 20:20

1 Answer

$\begingroup$

The features are latitude, longitude, speed, and direction; together they determine the clusters. Since you do not know the number of clusters in advance, K-means cannot be used. If the data contain outliers, use DBSCAN. Otherwise, use hierarchical clustering and choose the number of clusters from the dendrogram. The distance between clusters (the linkage) can be 'single'; single linkage is better suited to chained or elongated clusters.
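
As a rough illustration of the single-linkage idea, the sketch below groups users whose tracks stay close to each other. It assumes each user's track has already been aligned to a shared list of timestamps, with `None` where a user has no fix. Cutting a single-linkage dendrogram at a distance threshold is equivalent to taking connected components of the "closer than the threshold" graph, which is what the union-find does here. The names `haversine_m`, `track_distance`, `group_users` and the `threshold_m` parameter are hypothetical, and using the mean distance over shared timestamps is just one possible choice. In practice, a library implementation such as scikit-learn's `DBSCAN` or SciPy's hierarchical clustering would be the more usual route; this version uses only the standard library.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371000.0 * asin(sqrt(h))


def track_distance(track_a, track_b):
    """Mean distance over the timestamps where both tracks have a position."""
    d = [haversine_m(p, q) for p, q in zip(track_a, track_b)
         if p is not None and q is not None]
    return sum(d) / len(d) if d else float("inf")


def group_users(tracks, threshold_m):
    """Single-linkage grouping cut at `threshold_m`, via union-find."""
    parent = {u: u for u in tracks}

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    users = list(tracks)
    for i, u in enumerate(users):
        for v in users[i + 1:]:
            if track_distance(tracks[u], tracks[v]) <= threshold_m:
                parent[find(u)] = find(v)

    groups = {}
    for u in users:
        groups.setdefault(find(u), []).append(u)
    return sorted(sorted(g) for g in groups.values())
```

The threshold is the main tuning knob. Phone GPS error is often several metres, so a value noticeably larger than the 1 or 2 m mentioned in the comments is likely needed. Because single linkage chains neighbours together, two vehicles driving bumper to bumper for the whole period could end up in one group.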

$\endgroup$
