$\begingroup$

I'm working with a dataset consisting of multiple CSV files, each representing time series data of accelerations (x, y, z) captured during vibration events. For each event, a sensor records data for the entire duration and then stops—so each file contains a full vibration event.

I've standardized the data (z-scoring) and applied PCA to reduce dimensionality. Then I ran k-means on the principal components and obtained a meaningful partition into 6 clusters. Since I don't have labeled anomaly data, I analyzed the distribution of samples across clusters. Cluster 0 contains the majority of samples, Cluster 1 has significantly fewer, and the remaining clusters contain only a handful each.
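A minimal sketch of that preprocessing and clustering step, to make the pipeline concrete. It assumes each event file has already been reduced to one fixed-length feature vector (the question does not say how), and it uses synthetic data in place of the real CSVs. The helper names `standardize`, `pca_project` and `kmeans` are hypothetical, written in plain NumPy to keep the example self-contained; scikit-learn's `StandardScaler`, `PCA` and `KMeans` would be the usual choice in practice.

```python
import numpy as np

def standardize(X):
    """Z-score each column (zero mean, unit variance)."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    sigma[sigma == 0] = 1.0  # avoid division by zero for constant columns
    return (X - mu) / sigma

def pca_project(X, n_components):
    """Project centered data onto its top principal components via SVD."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per row."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

# Synthetic stand-in for per-event features (assumption: 8 features per event).
rng = np.random.default_rng(42)
features = np.vstack([
    rng.normal(0.0, 1.0, size=(200, 8)),   # one large group
    rng.normal(6.0, 1.0, size=(20, 8)),    # a smaller group
    rng.normal(-6.0, 1.0, size=(5, 8)),    # a handful of unusual events
])

Z = standardize(features)
P = pca_project(Z, n_components=3)
labels = kmeans(P, k=6)

# Cluster sizes: the quantity the "Cluster 0 is normal" assumption rests on.
sizes = np.bincount(labels, minlength=6)
print(sizes)
```

The cluster sizes printed at the end depend on the k-means initialization, so it is worth checking that the dominant cluster stays dominant across several seeds before treating it as "normal".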

Based on this, I assumed that Cluster 0 likely represents normal behavior. I trained an autoencoder only on the data from Cluster 0 and then used it to test/validate data from the other clusters, aiming to detect anomalies based on reconstruction error.
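A minimal sketch of the reconstruction-error idea, under loudly stated assumptions. PCA reconstruction is swapped in for the autoencoder here (a linear autoencoder trained with squared error learns the same subspace as PCA), the data are synthetic, and the 99th-percentile threshold calibrated on held-out Cluster 0 events is an illustrative choice, not something from the setup described above. The function names are hypothetical.

```python
import numpy as np

def fit_linear_reconstructor(X_train, n_components):
    """Learn a low-dimensional subspace from 'normal' data (PCA stand-in for an autoencoder)."""
    mean = X_train.mean(axis=0)
    _, _, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
    return mean, Vt[:n_components]

def reconstruction_error(X, mean, components):
    """Mean squared error per row between the input and its reconstruction."""
    Xc = X - mean
    X_hat = Xc @ components.T @ components  # encode, then decode
    return np.mean((Xc - X_hat) ** 2, axis=1)

rng = np.random.default_rng(0)

# Synthetic "Cluster 0": events lying close to a 2-D subspace of an 8-D feature space.
mixing = rng.normal(size=(2, 8))
normal = rng.normal(size=(300, 2)) @ mixing + 0.05 * rng.normal(size=(300, 8))

# Synthetic events from the other clusters: no shared low-dimensional structure.
other = rng.normal(0.0, 2.0, size=(30, 8))

# Hold out part of Cluster 0 so the threshold is not set on the training data.
train, calib = normal[:200], normal[200:]
mean, components = fit_linear_reconstructor(train, n_components=2)

err_calib = reconstruction_error(calib, mean, components)
err_other = reconstruction_error(other, mean, components)

threshold = np.percentile(err_calib, 99)  # assumed cutoff, tune to taste
flags = err_other > threshold             # True marks a suspected anomaly
print(f"threshold={threshold:.4f}, flagged {flags.sum()} of {len(flags)}")
```

Holding out part of Cluster 0 for calibration gives an estimate of the false-alarm rate on events the model has not seen, which the training error alone would understate.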

Do you think this is a valid approach in my case? Would you suggest any improvements or alternative methods for anomaly detection in this kind of dataset?

Thanks in advance!

$\endgroup$

1 Answer
$\begingroup$

Could a much simpler method work? This video (https://www.youtube.com/watch?v=FUdpwYBQlrU&ab_channel=EamonnKeogh) shows, at the 45-second mark, a simple method for anomaly detection in bearing vibration data. If you want to send me your data, I can test it for you.

$\endgroup$
