Assume that 1, 2 and 3 are user ids, "active" means that the user visited Stack Overflow in the last month (0 = passive, 1 = active), and each vote is either positive (+1) or negative (-1).

id  question  votes                    active
1   1         -1, +1, -1, -1, -1       0
1   2         -1, +1, -1, -1, +1       0
2   1         +1, +1, -1, -1           0
3   1         +1, +1, +1, -1, +1       1
3   2         +1, +1, -1, +1, +1, +1   1
3   3         -1, +1                   1

I want to know what makes users stop using Stack Overflow. Assume that I have already calculated how many negative votes they received, the total number of votes, the average vote for each question, and so on.
I wonder what kind of information I could get from these sequences. I want to find something like this: the users who are passive have two consecutive negative votes. For example, in the second question of user 1, one positive vote after two negative votes doesn't prevent the user from churning. User 3 doesn't have two consecutive negative votes in any of his questions, hence he is still active.
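To make the kind of rule I have in mind concrete, here is a minimal Python sketch over the toy data above. The names `votes`, `active`, `longest_negative_run` and `has_negative_run` are made up for illustration, and the threshold of two consecutive negative votes is only a guess based on this small sample, not a validated churn signal.

```python
# Toy data from the example: (user id, question) -> votes in order
votes = {
    (1, 1): [-1, +1, -1, -1, -1],
    (1, 2): [-1, +1, -1, -1, +1],
    (2, 1): [+1, +1, -1, -1],
    (3, 1): [+1, +1, +1, -1, +1],
    (3, 2): [+1, +1, -1, +1, +1, +1],
    (3, 3): [-1, +1],
}
active = {1: 0, 2: 0, 3: 1}  # 0 = passive, 1 = active


def longest_negative_run(seq):
    """Length of the longest run of consecutive negative votes in one question."""
    best = run = 0
    for v in seq:
        run = run + 1 if v < 0 else 0
        best = max(best, run)
    return best


def has_negative_run(user, k=2):
    """True if any single question of `user` has at least k negatives in a row."""
    return any(
        longest_negative_run(seq) >= k
        for (uid, _question), seq in votes.items()
        if uid == user
    )


# Runs are checked per question, so votes from different questions never mix.
flags = {user: has_negative_run(user) for user in active}
for user, flag in flags.items():
    print(user, "active" if active[user] else "passive", "two negatives in a row:", flag)
```

On this data the flag is set for users 1 and 2 (both passive) and not for user 3 (active), which is the pattern described above.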
I'm looking for something like the PrefixSpan algorithm, but order is important for me. I mean, I can't write the sequences like
<(-1 +1 -1 -1 -1) (-1 +1 -1 -1 +1)> or
<(-1) (+1) (-1) (-1) (-1) (-1) (+1) (-1) (-1) (+1)>
because the first one loses the order of the votes within a question, and the second one jumbles the questions together. Is there any algorithm to find the sequences that are common among churners?
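To show the output I am hoping for, here is a naive brute-force sketch in Python. It is not PrefixSpan: it simply enumerates contiguous runs of votes inside each question (so order is kept and questions are never concatenated) and counts how many users have each run in at least one question. The names `sequences_by_user`, `contiguous_patterns` and `user_support`, and the pattern lengths 2 to 3, are assumptions chosen for illustration.

```python
from collections import defaultdict

# Each user is a list of questions; each question is an ordered list of votes.
sequences_by_user = {
    1: [[-1, +1, -1, -1, -1], [-1, +1, -1, -1, +1]],
    2: [[+1, +1, -1, -1]],
    3: [[+1, +1, +1, -1, +1], [+1, +1, -1, +1, +1, +1], [-1, +1]],
}
churners = [1, 2]   # passive users
active_users = [3]


def contiguous_patterns(seq, min_len=2, max_len=3):
    """All contiguous runs of length min_len..max_len within one question."""
    return {
        tuple(seq[i:i + n])
        for n in range(min_len, max_len + 1)
        for i in range(len(seq) - n + 1)
    }


def user_support(sequences, users, min_len=2, max_len=3):
    """For each pattern, count the users who have it in at least one question."""
    support = defaultdict(int)
    for user in users:
        seen = set()
        for seq in sequences[user]:
            seen |= contiguous_patterns(seq, min_len, max_len)
        for pattern in seen:
            support[pattern] += 1
    return support


churner_support = user_support(sequences_by_user, churners)
active_support = user_support(sequences_by_user, active_users)

# Patterns shared by every churner and seen in no active user.
churner_only = sorted(
    p for p, count in churner_support.items()
    if count == len(churners) and active_support.get(p, 0) == 0
)
print(churner_only)
```

On the toy data this leaves (-1, -1) and (+1, -1, -1) as the patterns shared by both passive users and absent for the active one. With real data the thresholds would need to be looser than "all churners, no active users", and I don't know whether this scales, which is why I am asking for a proper algorithm.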