Does the time complexity depend on what is being matched on or does it get compiled down to some form of lookup table which can do O(1) lookups?
2 Answers
Some of Scala's match statements can be compiled to the same bytecode as Java's switch statements. There is an annotation, `@switch`, to ensure that.
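For example, a minimal sketch using `scala.annotation.switch`; the compiler reports an error if it cannot emit a JVM `tableswitch`/`lookupswitch` for the annotated match:

```scala
import scala.annotation.switch

// With @switch, compilation fails unless this becomes an actual JVM switch.
def label(i: Int): String = (i: @switch) match {
  case 1 => "one"
  case 2 => "two"
  case 3 => "three"
  case _ => "other"
}
```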
But in most cases, especially complex ones involving destructuring, a match will be compiled to the same bytecode as a series of if / else statements.
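As a rough sketch of what that means (the `Shape` types here are hypothetical, and the real compiler output differs in its details):

```scala
sealed trait Shape
case class Circle(r: Double) extends Shape
case class Rect(w: Double, h: Double) extends Shape

def area(s: Shape): Double = s match {
  case Circle(r)  => math.Pi * r * r
  case Rect(w, h) => w * h
}

// Roughly equivalent hand-written version: a chain of type tests
// and field extractions, tried in order until one succeeds.
def areaDesugared(s: Shape): Double =
  if (s.isInstanceOf[Circle]) {
    val c = s.asInstanceOf[Circle]
    math.Pi * c.r * c.r
  } else if (s.isInstanceOf[Rect]) {
    val r = s.asInstanceOf[Rect]
    r.w * r.h
  } else throw new MatchError(s)
```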
In general, I would not expect a match to be a "constant" operation, but rather a "linear" one in the number of cases. That said, the maximum number of checks does not change with the input, and usually there are no more than ten of them, so formally one would say it has O(1) complexity.
See yǝsʞǝlA's answer for a more detailed explanation of that.
If you are worried about it, you can put the most common cases first and the others after. However, I would not really worry about its performance; your application will not notice it. I would favor readability and correctness of the code instead.
4 Comments
O(1), because you are not usually matching against a large number n of patterns/conditions. In fact, it's simply calling unapply and using if/else logic. You could come up with pathological cases where a collection would have to be iterated over completely to perform a pattern match, but that's very rare. Another way to get O(n) is to write your own unapply and make it slow, but generally it's O(1). If you want to be sure, look at the generated bytecode.

The big-O asymptotic bound relies on the fact that we only care about cases where n is large. There is no such thing as O(50). There is O(1) and O(n), etc.; 50 is just a constant that we can ignore. The same applies to equality comparison, which might involve multiple constant-time operations that are likewise simplified to a constant-time bound. Just wanted to be more clear/strict on this. Cheers.

Pattern matching in most cases will be O(1) because you are usually matching against a small number of possible cases, and each match is comprised of a few constant-time operations on average.
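To illustrate the point about a slow unapply, here is a sketch of a custom extractor (the `Sum` extractor is hypothetical, purely for illustration) whose unapply is linear in the input size:

```scala
// Hypothetical extractor: its unapply is O(n) because xs.sum
// walks the entire list before the pattern can match.
object Sum {
  def unapply(xs: List[Int]): Option[Int] =
    if (xs.isEmpty) None else Some(xs.sum)
}

List(1, 2, 3) match {
  case Sum(total) => println(s"sum = $total") // cost dominated by unapply: O(n)
  case _          => println("empty list")
}
```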
Since pattern matching is achieved by calling the unapply method on the matched object (docs) and optionally comparing the extracted values, the time complexity will depend on the unapply method's implementation and can be of any complexity. There is no compiler optimization possible for the general case, because some pattern matches depend on the data being passed to them.
Compare these scenarios:
```scala
List(1, 2, 3) match {
  case _ :+ last    => ... // O(n) with respect to list length
  case head :: tail => ... // O(1) w.r.t. list length
  case _            => ... // O(1) - default case, no operation needs to be done
}
```

Most of the time we would pattern match something like a list to get a head/tail split with `::` - O(1), because unapply simply returns the head if it exists.
We usually don't use `:+` because it's uncommon and expensive (library code):
```scala
/** An extractor used to init/last deconstruct sequences. */
object :+ {
  /** Splits a sequence into init :+ last.
   *  @return Some((init, last)) if sequence is non-empty. None otherwise.
   */
  def unapply[T, Coll <: SeqLike[T, Coll]](
      t: Coll with SeqLike[T, Coll]): Option[(Coll, T)] =
    if (t.isEmpty) None
    else Some(t.init -> t.last)
}
```

To get the last element of a sequence (`t.last`) we need to loop, which is O(n).
So it will really depend on how you pattern match, but usually you pattern match case classes, tuples, options, or collections to get the first element rather than the last, etc. In the overwhelming majority of cases you'll get O(1) time complexity and a ton of type safety.
Additionally:
In the worst case there will be m patterns, each doing on average c operations to perform a match (this assumes unapply takes constant time, though there are exceptions). Additionally, there will be an object with n properties to match against these patterns, which gives a total of m * c * n operations. However, since m is really small (patterns never grow dynamically and are usually written by a human), we can safely fold it into a constant b, giving T(n) = b * c * n. In terms of big-O: T(n) = O(n).

So we established a theoretical bound of O(n), for the cases where we need to check all n properties of an object. As pointed out above, in most cases we don't need to check all properties/elements, as when we use head :: tail, where n is replaced with a constant and we get O(1). Only if we always do something like init :+ last would we get O(n). The amortized cost, I believe, is still O(1) for all cases in your program.
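As a concrete contrast (a sketch; `headOf` and `lastOf` are just illustrative names):

```scala
// O(1): only the head cell of the list is inspected.
def headOf(xs: List[Int]): Option[Int] = xs match {
  case h :: _ => Some(h)
  case Nil    => None
}

// O(n): the :+ extractor must traverse the list to reach the last element.
def lastOf(xs: List[Int]): Option[Int] = xs match {
  case _ :+ last => Some(last)
  case _         => None
}
```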
9 Comments
Pattern matching is like calling exists on a collection of predicates. Given that, the complexity of the operation is O(n) because you have, in the worst case, to check all the elements in the collection. It does not matter if this hypothetical collection contains just a few elements (even only one); the generalization of the complexity is still O(n). Now, in practical terms it does not really matter, as n will never tend to a large enough number to noticeably impact the application's performance (unless the unapply is costly enough).

That's not what n and the constant factor stand for. The number of patterns doesn't grow dynamically; it's a constant factor that can be neglected. Big-O notation and time complexity analysis are all about the order of growth (see: en.wikipedia.org/wiki/Big_O_notation#Infinite_asymptotics).

O(c) is still O(1) even if c equals a million. When we say O(n) we really mean linearly proportional to the size of n, not some fixed size c.