I am trying to build a model that predicts future churn events. The business wants to identify which customers are likely to terminate their service within a month. "Within a month" can mean the next day or the 30th day. The problem is that some of the features are time-based, for example how many months the customer is into the current term, the number of tickets created in the last two weeks, etc. If the event date is floating, how do I calculate the values of these features? Should I make 30 copies of the same churned service and calculate these time-based features for each of them? Is there a better way to approach this?
1 Answer
The problem is similar to modeling defaults among a subset of companies in a specific basket, big or small. That problem is well researched, drawing on methods developed in asset pricing, actuarial science and survival analysis. For a good reference, you may look into:
Duffie & Singleton (2003). Credit Risk: Pricing, Measurement, and Management.
The most popular and flexible solution is to model the churn events as correlated counting processes whose intensities depend on the predictors you mentioned above, plus some other predictors. A counting process is a generalization of the Poisson process.
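To make the intensity idea concrete, here is a minimal sketch rather than a definitive implementation. It assumes a log-linear intensity, lambda(x) = baseline_rate * exp(beta . x), that stays constant over the 30-day horizon, and it ignores the correlation between customers. Under those assumptions the time-based features are measured once, at the scoring date, and the probability of at least one churn event within the horizon is 1 - exp(-lambda * horizon). The function names, the coefficients and the baseline rate are all hypothetical, chosen only for illustration.

```python
import math


def churn_intensity(features, beta, baseline_rate):
    """Daily churn intensity: baseline_rate * exp(beta . features).

    The log-linear form is an assumption of this sketch.
    """
    linear_term = sum(b * x for b, x in zip(beta, features))
    return baseline_rate * math.exp(linear_term)


def prob_churn_within(features, beta, baseline_rate, horizon_days=30):
    """P(at least one event within the horizon) = 1 - exp(-lambda * horizon).

    Assumes the intensity stays constant over the horizon.
    """
    lam = churn_intensity(features, beta, baseline_rate)
    return 1.0 - math.exp(-lam * horizon_days)


# Hypothetical features, measured once at the scoring date:
# [months into current term, tickets created in the last 14 days]
customers = [[2.0, 0.0], [11.0, 3.0]]
beta = [0.05, 0.40]      # made-up coefficients, not fitted to any data
baseline_rate = 0.001    # made-up baseline daily intensity

probabilities = [prob_churn_within(x, beta, baseline_rate) for x in customers]
```

Because the horizon enters only through the integrated intensity, there is no need to create 30 copies of each churned service under these assumptions. In practice the coefficients would be estimated from historical event data, and a time-varying intensity would replace `lam * horizon_days` with an integral over the horizon.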
- The link for counting processes seems to be broken. – ddd, Feb 4, 2021 at 14:26
- Worked for me just now. Please try again and let me know if there are any problems. The page takes 1-2 seconds to load. – stans, Feb 4, 2021 at 14:55