
  • 2
    Actually, your answer contradicts @StephanKolassa's answer, where he refers to literature suggesting that forecast benchmarks are rather misleading. Commented Jul 6, 2016 at 12:42
  • 5
    @Tim: full disclosure - that link went to an article on benchmarks that I wrote myself. Nevertheless, I stand by my conclusions: all demand forecasting accuracy benchmarks I have seen very probably compare apples to oranges. I'm thus a bit sceptical about looking to external benchmarks. In addition, I think this answer somewhat begs the question. Once your ML algorithm improves on "the best known", how do you know whether you can improve it further, or whether we have achieved The Plateau of Hopelessness? Commented Jul 6, 2016 at 15:57
  • 2
    My most recent use case was rather different. I was trying to predict who was at risk of killing themselves from their postings on the internet. There are various psychometric tests that one can use to gauge the severity of depression, such as the PHQ-9. As it is a widely used medical test, there is considerable work on its validity and reliability, such as "The PHQ-9: Validity of a Brief Depression Severity Measure". I found the reliability and other measures in that paper to be a good starting point for gauging the likely results one could achieve from machine learning. Commented Jul 7, 2016 at 8:48
  • 2
    You are right, of course: once you have improved on the "best known", there is no way of telling whether to continue searching for a better model. But in my experience, it's fairly rare that this occurs in a real-world situation. Most of the work I do seems to be about trying to apply expert judgements at scale through the use of machine learning, not trying to improve on the best expert in the field. Commented Jul 7, 2016 at 8:51
  • 3
    And when you have asked the domain experts, and they cannot think of any more features to be created/extracted. Have you ever concluded "the target is random"? Commented Aug 6, 2020 at 5:15
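
The idea in the PHQ-9 comment above, treating a validated instrument's published accuracy as a rough ceiling on what a machine-learning model can hope to achieve, can be sketched as follows. The benchmark figures and the toy predictions are illustrative assumptions, not values from this thread; PHQ-9 sensitivity/specificity around 0.88 at the standard cutoff is often cited, but check the paper for your own use case.

```python
# Sketch: compare a classifier's sensitivity/specificity against a
# published instrument benchmark to judge how close we are to the
# plausible ceiling. All numbers below are illustrative.

def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels in {0, 1}."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical held-out labels and model predictions.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]

sens, spec = sensitivity_specificity(y_true, y_pred)

# Assumed benchmark (often-cited PHQ-9 figures at the standard cutoff).
BENCH_SENS, BENCH_SPEC = 0.88, 0.88

print(f"model: sens={sens:.2f}, spec={spec:.2f}")
print(f"gap to benchmark: {BENCH_SENS - sens:+.2f} / {BENCH_SPEC - spec:+.2f}")
```

If the model sits well below the instrument's figures, there is likely headroom; if it matches or exceeds them, further gains may be running into the measurement noise of the labels themselves.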