$\begingroup$

I have 3 months of categorized bank transaction data and need to identify recurring cash inflows and outflows for lending risk modeling.

Complications:

1. Income dates shift earlier when payday falls on a weekend (paid Friday).
2. Some individuals have multiple income sources with different periodicities.
3. Amounts may vary around the mean (bonuses, allowances, side gigs).
4. There are many one-off outliers in both inflows and outflows.
5. Recurrence should be defined as at least one instance per month.

I’m considering two approaches:

Rule-based temporal recurrence detection:

- Detect events that occur “near the same calendar day” ± k days
- Include adjustments for weekend/holiday pay behavior
- Model amount variance as small perturbations
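To make the rule-based idea concrete, here is a minimal sketch of what I have in mind. The function names, the window `k=2`, and the 15% amount tolerance are placeholders I made up for illustration, and holidays are not handled; it only covers the weekend-to-Friday shift and the "at least one instance per month" requirement.

```python
import calendar
from datetime import date, timedelta
from statistics import mean

def adjusted_payday(year, month, nominal_day):
    """Expected posting date: the nominal day, moved back to Friday on weekends."""
    last_day = calendar.monthrange(year, month)[1]
    d = date(year, month, min(nominal_day, last_day))
    # weekday(): Monday=0 ... Saturday=5, Sunday=6
    if d.weekday() == 5:
        d -= timedelta(days=1)
    elif d.weekday() == 6:
        d -= timedelta(days=2)
    return d

def find_recurring(transactions, nominal_day, months, k=2, amount_tol=0.15):
    """Return one matched (date, amount) per month, or None if the rule fails.

    transactions: list of (date, amount) tuples for one account and category.
    months: list of (year, month) tuples covering the observation window.
    """
    matched = []
    for year, month in months:
        expected = adjusted_payday(year, month, nominal_day)
        near = [(d, a) for d, a in transactions
                if abs((d - expected).days) <= k]
        if not near:
            return None  # a month with no match breaks the recurrence
        # keep the candidate closest in time to the expected date
        matched.append(min(near, key=lambda t: abs((t[0] - expected).days)))
    avg = mean(a for _, a in matched)
    # amounts are treated as small perturbations around their mean
    if all(abs(a - avg) <= amount_tol * abs(avg) for _, a in matched):
        return matched
    return None

# Hypothetical data: salary nominally on the 25th, plus a one-off inflow.
txns = [
    (date(2024, 1, 25), 20000.0),
    (date(2024, 2, 10), 5000.0),   # one-off, should be ignored
    (date(2024, 2, 23), 20500.0),  # 25 Feb 2024 is a Sunday, paid Friday
    (date(2024, 3, 25), 19800.0),
]
window = [(2024, 1), (2024, 2), (2024, 3)]
salary = find_recurring(txns, nominal_day=25, months=window)
```

With multiple income streams this would have to be run once per candidate `nominal_day`, which is part of why I worry about how well the rule-based route scales.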

DBSCAN or density-based clustering, using a feature space combining:

- day-of-month modulo 30
- amount
- transaction category
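For the clustering route, this is roughly how I would turn those three features into a pairwise distance. It is only a sketch: the wrap-around day gap, the scales `day_scale` and `amount_scale`, and the large constant used for mismatched categories are all assumptions of mine, not settled choices. A matrix of these distances could presumably be handed to a density-based clusterer that accepts precomputed distances.

```python
import math

FAR = 1e6  # arbitrary large distance for transactions in different categories

def circular_day_gap(day_a, day_b, period=30):
    """Gap between two days-of-month on a circle, so that day 29 and day 1 are close.

    Note: with period=30, day 31 maps to the same position as day 1.
    """
    diff = abs(day_a % period - day_b % period)
    return min(diff, period - diff)

def txn_distance(a, b, day_scale=3.0, amount_scale=0.15):
    """Distance between two transactions given as dicts with
    'day' (day of month), 'amount', and 'category' keys."""
    if a["category"] != b["category"]:
        return FAR
    # day gap expressed in units of day_scale days
    day_term = circular_day_gap(a["day"], b["day"]) / day_scale
    # relative amount difference expressed in units of amount_scale
    ref = max(abs(a["amount"]), abs(b["amount"]))
    amount_term = abs(a["amount"] - b["amount"]) / (amount_scale * ref) if ref else 0.0
    return math.hypot(day_term, amount_term)

# Hypothetical salary payments: nominal 25th, and the weekend-shifted 23rd.
jan = {"day": 25, "amount": 20000.0, "category": "salary"}
feb = {"day": 23, "amount": 20500.0, "category": "salary"}
rent = {"day": 25, "amount": 20000.0, "category": "rent"}

d_shifted = txn_distance(jan, feb)
d_other = txn_distance(jan, rent)
```

Whether the shifted payment stays in the same cluster then depends entirely on how the neighborhood radius compares with `day_scale`, which is the sensitivity I am unsure about.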

My concern is that DBSCAN may not perform well with shifted periodicity (e.g., 25th → 23rd if weekend), whereas rule-based models might overfit or fail with multiple income streams.

Question: What statistical approach is most appropriate for identifying recurring financial transactions in this setting?

New contributor
Awande Ntombela is a new contributor to this site. Take care in asking for clarification, commenting, and answering. Check out our Code of Conduct.
$\endgroup$
