
I’m working through the runtime analysis of scikit-learn’s OneVsRestClassifier for two cases:

  1. LogisticRegression (solver=lbfgs, C=2.0, max_iter=1000)
  2. RidgeClassifier (alpha=1.0)
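
For concreteness, here is roughly how I'm setting the two cases up (a minimal sketch on made-up data; `n`, `c`, and `K` are just placeholder sizes, not from any real dataset):

```python
import numpy as np
from sklearn.multiclass import OneVsRestClassifier
from sklearn.linear_model import LogisticRegression, RidgeClassifier

# Arbitrary problem sizes used only to exercise the two fit paths
n, c, K = 10_000, 50, 8
rng = np.random.default_rng(0)
X = rng.standard_normal((n, c))
y = rng.integers(0, K, size=n)

# Case 1: one-vs-rest around L-BFGS logistic regression
ovr_logit = OneVsRestClassifier(
    LogisticRegression(solver="lbfgs", C=2.0, max_iter=1000)
)

# Case 2: one-vs-rest around a ridge classifier
ovr_ridge = OneVsRestClassifier(RidgeClassifier(alpha=1.0))

ovr_logit.fit(X, y)
ovr_ridge.fit(X, y)
```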

So far I’ve derived:

OVR Logistic (LBFGS)
--------------------
For each of K classes and T inner iterations:

– Forward pass (X·w): O(n·c)
– Batch gradient (Xᵀ·…): O(n·c)
– LBFGS update: O(c² + n·c)

⇒ fit cost = O(K · T · n · c) (assuming n ≫ c)
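
The per-iteration cost I'm counting for a single binary subproblem is just the two dense matrix products below (a sketch of my accounting, not scikit-learn's actual L-BFGS code path; `X`, `w`, `y_bin` are assumed to be dense NumPy arrays):

```python
import numpy as np

def binary_logistic_grad(X, w, y_bin):
    """One loss-gradient evaluation for a single one-vs-rest binary problem.

    X: (n, c) features, w: (c,) weights, y_bin: (n,) labels in {0, 1}.
    """
    z = X @ w                       # forward pass: O(n*c)
    p = 1.0 / (1.0 + np.exp(-z))    # elementwise sigmoid: O(n)
    grad = X.T @ (p - y_bin)        # batch gradient: O(n*c)
    return grad                     # (plus an O(c) L2 term I'm ignoring here)
```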
OVR Ridge (Cholesky)
--------------------
– Build Gram matrix XᵀX once: O(n·c²)
– For each of K classes:
  – Solve (G + λI)w = b via Cholesky: O(c³)

⇒ fit cost = O(n·c² + K·c³)
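
For the ridge case, the algebra I have in mind looks like this NumPy/SciPy sketch (again just my cost accounting, not RidgeClassifier's actual solver; I deliberately re-factorize per class because that's what my O(K·c³) term assumes, and whether scikit-learn caches anything here is part of question 1 below):

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def ovr_ridge_cholesky(X, Y, alpha=1.0):
    """Cholesky-based ridge weights for K one-vs-rest indicator targets.

    X: (n, c) dense features, Y: (n, K) matrix of {0, 1} indicator columns.
    """
    n, c = X.shape
    G = X.T @ X                      # Gram matrix, built once: O(n*c^2)
    B = X.T @ Y                      # right-hand sides X^T y_k for all K classes
    W = np.empty((c, Y.shape[1]))
    for k in range(Y.shape[1]):      # one c x c SPD solve per class
        L, lower = cho_factor(G + alpha * np.eye(c))  # O(c^3) as I'm counting it
        W[:, k] = cho_solve((L, lower), B[:, k])      # O(c^2) back-substitution
    return W
```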
  1. Are there any scikit-learn implementation details (e.g. caching, sparse optimizations) I’ve overlooked?
  2. Is it valid to simply multiply the per-class cost by K for One-vs-Rest, or have I misapplied the additive-then-multiplicative rule? (I spell out the split I mean just below.)
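
To make question 2 concrete, the split I'm assuming for the ridge case is: shared work once plus per-class work times K, i.e. O(n·c²) + K·O(c³) = O(n·c² + K·c³), rather than naively multiplying everything by K, which would give O(K·n·c² + K·c³). For the logistic case nothing is shared between classes, so the whole per-class cost O(T·n·c) just picks up the factor K.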

I’d really appreciate any feedback, or pointers to gotchas in the actual code, since I’m still quite inexperienced with runtime complexity analysis.
