Timeline for "Counting the population of integers"
Current License: CC BY-SA 3.0
10 events
| when | what | by | license | comment |
|---|---|---|---|---|
| Feb 11, 2013 at 12:15 | comment added | Leonid Shifrin | | @jVincent To summarize - and this is my last comment on this matter: the beef I have with your approach to benchmarking is that you present benchmarks which are accurate only for a very narrow subset of all possible cases where the functions in question can be used (namely, functions used on smallish toy lists, but in a large loop). To my mind, such benchmarks are of limited utility. Also, if you present them or argue about efficiency, you always have to state under which conditions your conclusions are correct - which, as far as I can see, you did not do. |
| Feb 11, 2013 at 12:09 | comment added | Leonid Shifrin | | @jVincent "If you have a loop that constantly performs the operation f[input,100], it doesn't matter if f[x,n] is O(n) and g[x,n] is O(n^2); you might still have f be the faster solution." - right, this is the core of why I find your approach problematic. The statement is correct, but why do you think that this is the main use case for this function? Had the OP faced a problem with this in a large loop, he'd have said so in the question. This is not what most people mean when asking to optimize a function. If they need to optimize code in a loop, they would say so. |
| Feb 11, 2013 at 12:04 | comment added | Leonid Shifrin | | @jVincent For a fixed and small number of unique elements, your benchmarks may make some sense (although this is hardly the general case for this problem). But even then, instead of simply repeating the code 10000 times, I would increase the lengths of the individual sublists over which you are counting. This is important because Count is a pattern-based function; for long sublists, Tally shows its advantages. By keeping the sublists as small as you do in your tests, you measure anything but the interesting parts. So even in your narrowed formulation, your benchmarking is seriously flawed. |
| Feb 11, 2013 at 10:26 | comment added | Leonid Shifrin | | @jVincent ... code run on the toy list in question. This is not what Mr.Wizard and I meant by discussing efficiency, and I dare say this is not what most people mean by efficiency either. It's probably the third time for this post that I am trying to explain this to you. I am really starting to think that you have never yet faced real-world code optimization problems, otherwise you would know what I mean. You don't have to go far: browse the questions in the performance-tuning tag here and see how benchmarking is typically done and what people mean by efficiency. |
| Feb 11, 2013 at 10:22 | comment added | Leonid Shifrin | | @jVincent You seem not to be getting it, even after both Mr.Wizard and I tried our best to explain. Your tests, which run the code 10000 times on small toy lists, are meaningless, because at that scale what you measure is determined not by the true complexity of an algorithm but by things that become irrelevant for real list samples. Have a look at Mr.Wizard's benchmarks for more or less real benchmarking. So, to reiterate once again: your statement is correct, but it is only meaningful if the OP will only ever be interested in speeding up code consisting of 10000 iterations of a ... |
| Feb 11, 2013 at 5:37 | comment added | Leonid Shifrin | | @Mr.Wizard Have a look at my new version. |
| Feb 11, 2013 at 5:37 | history edited | Leonid Shifrin | CC BY-SA 3.0 | Added a (much) faster solution |
| Feb 10, 2013 at 22:05 | comment added | Mr.Wizard | | Leonid, I know that you did not write this for ultimate speed or you would be compiling to C (and blowing the doors off mine, to be sure), but I think you might be interested in the timings I added to my answer. |
| Feb 9, 2013 at 22:28 | history edited | Leonid Shifrin | CC BY-SA 3.0 | Refactored the code to minimally change the OP's original approach |
| Feb 9, 2013 at 22:14 | history answered | Leonid Shifrin | CC BY-SA 3.0 | |
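
The methodological point argued in the Feb 11 comments above - that looping a call over a fixed small input measures per-call overhead rather than asymptotic behavior - can be illustrated with a minimal sketch. The list sizes and the `countViaTally` helper below are illustrative assumptions, not code from the answer under discussion:

```mathematica
(* Two benchmarking styles for counting occurrences of an integer.
   Sizes and data are illustrative assumptions. *)
countViaTally[list_, x_] := Total[Cases[Tally[list], {x, c_} :> c]];

small = RandomInteger[{1, 10}, 100];   (* toy list, as in the disputed tests *)
big   = RandomInteger[{1, 10}, 10^6];  (* a more realistic single input *)

(* Loop-style benchmark: dominated by per-call overhead at length 100 *)
First@AbsoluteTiming[Do[Count[small, 5], {10^4}]]
First@AbsoluteTiming[Do[countViaTally[small, 5], {10^4}]]

(* Single-call benchmark on long input: closer to the asymptotic regime *)
First@AbsoluteTiming[Count[big, 5]]
First@AbsoluteTiming[countViaTally[big, 5]]
```

The two styles can rank the approaches differently, which is the crux of the disagreement in the comment thread.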
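
Along the same lines, the Feb 11, 2013 at 12:04 comment suggests growing the sublist length rather than repeating a call. A sketch of such a scaling run (reusing the hypothetical `countViaTally` helper from the previous block, with sizes again assumed) might look like:

```mathematica
(* Time Count and a Tally-based count as the list grows; only at larger
   lengths do the asymptotics, rather than fixed overheads, dominate. *)
scalingTable = Table[
   With[{list = RandomInteger[{1, 10}, n]},
     {n,
      First@AbsoluteTiming[Count[list, 5]],
      First@AbsoluteTiming[countViaTally[list, 5]]}],
   {n, {10^3, 10^4, 10^5, 10^6}}];

TableForm[scalingTable,
  TableHeadings -> {None, {"n", "Count (s)", "Tally-based (s)"}}]
```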