Timeline for How to know that your machine learning problem is hopeless?
Current License: CC BY-SA 3.0
18 events
| when | what | action | by | license | comment |
|---|---|---|---|---|---|
| May 17, 2023 at 21:18 | comment | added | Kuba hasn't forgotten Monica | | I got about 150 cheap 1ft cube cast plastic crates at a "dollar store" well over a decade ago. They were a good deal and I wiped the shelves. I don't think they got many such outliers. But they stopped selling the crates and I wished they got more. They were great for organizing my lab. Case in point for needing the domain knowledge, and for how hopeless it may be sometimes. I apologize hereby to data analysts at said large dollar store chain. |
| Nov 24, 2022 at 10:54 | comment | added | Stephan Kolassa | | @Stef: that is precisely my point. If we know what is happening, we can include it. The key question is to find this out. I see many questions here at CV and elsewhere where people might ask how to forecast a given time series, and my first reaction is that they need to figure out what happened at time point X first. Sure this is trivial, but it seems to be unintuitive enough for people not to necessarily think of it. Also, see my "outlier" series - if something like the peak happens around Easter, we need to understand whether this will recur next Easter, or not. |
| Nov 24, 2022 at 10:50 | comment | added | Stef | | And then it's easy to make a prediction for next year: predict a behaviour for the normal times, and a behaviour for the time of the exceptional event, and in the absence of more information, just say "we don't know much about this exceptional event and whether it will happen again, but it might happen around the month of April". |
| Nov 24, 2022 at 10:49 | comment | added | Stef | | The Easter peak doesn't look like an "unknown unknown" to me, where "we don't have a way forward because we don't know what questions to ask". It's pretty clear that some exceptional event occurred in April in that time series. We know exactly what questions to ask: 1) What exactly is this exceptional event in April? 2) Will this exceptional event happen again next year? 3) If it does happen again, will it happen on the same date in April or on a different date? |
| Nov 21, 2020 at 9:25 | comment | added | markowitz | | @StephanKolassa You are a master of forecasting; maybe you could help me with these two questions: stats.stackexchange.com/questions/497271/… ; stats.stackexchange.com/questions/491065/… |
| Apr 13, 2017 at 12:50 | history | edited | CommunityBot | | replaced http://datascience.stackexchange.com/ with https://datascience.stackexchange.com/ |
| Aug 27, 2016 at 19:17 | vote | accept | Tim | ||
| Jul 13, 2016 at 7:12 | comment | added | Stephan Kolassa | | Dice are not all that random, either. |
| Jul 13, 2016 at 7:12 | comment | added | Stephan Kolassa | | @Walfrat: the question of course is whether we have already arrived at the level of "residual" variance, or whether there is still any unmodeled but modelable structure left - and how we can know the answer to this question. A "coin toss" may look random but be perfectly forecastable once you know the DGP. |
| Jul 5, 2016 at 12:13 | comment | added | Stephan Kolassa | | @KarolisJuodelė: that is exactly my point. We can't even know when our situation is hopeless, unless we talk to an expert... and then, sometimes the expert can't help us either, and there are "unknown unknowns" to the experts, which conceivably somebody else might know. |
| Jul 5, 2016 at 12:11 | comment | added | Karolis Juodelė | | Using domain knowledge you can add new features to the first two cases (e.g., time till Easter and TV viewing numbers, though the latter needs forecasting of its own) to get much better results. In neither case is the situation hopeless. The really interesting part is how to tell missing domain knowledge from a data set of fair coin flips. |
| Jul 5, 2016 at 11:57 | comment | added | Walfrat | | If you are forecasting a fair coin toss, then there is no way to get above 50% accuracy. You said everything there. |
| Jul 5, 2016 at 9:47 | history | edited | Stephan Kolassa | CC BY-SA 3.0 | added 3 characters in body |
| Jul 5, 2016 at 9:46 | comment | added | Stephan Kolassa | | Sure. I'd love to see someone else's perspective on this, too! |
| Jul 5, 2016 at 9:46 | comment | added | Tim | | +1 for a marvelous answer that I totally agree with. I'm not accepting it (yet) because I am still hoping for other answers, as the problem is broad. |
| Jul 5, 2016 at 9:22 | history | edited | Stephan Kolassa | CC BY-SA 3.0 | added 78 characters in body |
| Jul 5, 2016 at 9:15 | history | edited | Stephan Kolassa | CC BY-SA 3.0 | added 640 characters in body |
| Jul 5, 2016 at 9:01 | history | answered | Stephan Kolassa | CC BY-SA 3.0 | |