
Does the time complexity depend on what is being matched on, or does it get compiled down to some form of lookup table that can do O(1) lookups?


2 Answers


Some of Scala's match expressions can be compiled to the same bytecode as Java's switch statements, and there is an annotation (@switch, in scala.annotation) to ensure that.
But in most cases, especially complex ones like deconstructions, a match compiles to the same bytecode as a series of if / else statements.
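For illustration, a minimal sketch of that annotation; the compiler warns if the match cannot be compiled to a JVM tableswitch/lookupswitch:

```scala
import scala.annotation.switch

// With plain literal cases on an Int (or Char), the compiler can emit
// a tableswitch/lookupswitch; @switch asks it to warn when it cannot.
def label(n: Int): String = (n: @switch) match {
  case 0 => "zero"
  case 1 => "one"
  case 2 => "two"
  case _ => "other"
}
```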

In general, I would not expect a match to be a "constant" operation, but rather a "linear" one: the cases are tried one by one until something matches. That said, since the maximum number of checks is fixed at compile time and does not grow with the input (usually there are no more than about ten cases), formally one would say it has O(1) complexity.
See yǝsʞǝlA's answer for a more detailed explanation of that.
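As a rough sketch (not the exact bytecode), a match over unordered cases behaves like an if / else chain tried top to bottom:

```scala
// Conceptually equivalent to:
//   s match { case "a" => 1; case "b" => 2; case _ => 0 }
def matchLike(s: String): Int =
  if (s == "a") 1
  else if (s == "b") 2
  else 0
```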

If you are worried about it, you can put the most common cases first and the others after. However, I would not really worry about its performance; your application will not notice it. I would favor readability and correctness of the code instead.
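A quick sketch of that ordering advice; the method and cases here are made up, but since cases are tried top to bottom, the most frequent one matches earliest:

```scala
// Hypothetical dispatcher: if "GET" dominates traffic, list it first.
def dispatch(method: String): String = method match {
  case "GET"  => "read"                 // hottest case, checked first
  case "POST" => "write"
  case other  => s"unsupported: $other" // everything else falls through
}
```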


4 Comments

To offer my 5 cents... Pattern matching most of the time will be O(1) because you are not usually matching against a large number n of patterns/conditions. In fact, it simply calls unapply and uses if/else logic. You could come up with pathological cases where a collection has to be iterated over completely to perform the match, but that's very rare. Another way to get O(n) is to write your own unapply and make it slow, but generally it's O(1). If you want to be sure, look at the generated bytecode.
@yǝsʞǝlA what I meant by it not being O(1) is in the sense of a lookup table. Pattern matching is not magical; it does not know which case to go to for a given input. It just goes one by one until one matches. Thus if you have fifty cases (and I doubt anyone would have that many), technically speaking that would be O(n=50). But again, since each match is simple and the complexity of if/else is trivial, I would not really worry about that.
The whole definition of an asymptotic bound O relies on the fact that we only care about cases where n is large. There is no such thing as O(50); there is O(1) and O(n), etc. 50 is just a constant that we can ignore. The same applies to equality comparison, which might involve multiple constant-time operations that are also simplified to a constant-time bound. Just wanted to be more clear/strict on this. Cheers.
Most patterns are tests for equals and instanceof, for case classes and objects. But definitely don't underestimate the power of switch. Also, the power of warnings for missed cases when using sealed hierarchies; that's where one hears the most complaints: it didn't warn me! until runtime!

Pattern matching in most cases will be O(1) because you are usually matching against a small number of possible cases, and each match comprises a few constant-time operations on average.

Since pattern matching is achieved by calling the unapply method on the matched object (see the docs on extractor objects) and optionally comparing the extracted values, the time complexity depends on the unapply method's implementation and can be of any complexity. No compiler optimization is possible for the general case, because some pattern matches depend on the data being passed to them.
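To sketch that point, here is a hypothetical extractor (Sorted is not a standard library object) whose unapply does linear work, so the corresponding case is O(n) even though the pattern itself looks cheap:

```scala
// Hypothetical extractor: its unapply scans the whole list,
// so any `case Sorted(xs)` pattern costs O(n).
object Sorted {
  def unapply(xs: List[Int]): Option[List[Int]] =
    if (xs.zip(xs.drop(1)).forall { case (a, b) => a <= b }) Some(xs)
    else None
}

List(1, 2, 3) match {
  case Sorted(xs) => println(s"sorted: $xs") // cost hidden inside unapply
  case _          => println("unsorted")
}
```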

Compare these scenarios:

```scala
List(1, 2, 3) match {
  case _ :+ last    => ... // O(n) with respect to list length
  case head :: tail => ... // O(1) w.r.t. list length
  case _            => ... // O(1) - default case, no operation needs to be done
}
```

Most of the time we pattern match something like a list to get a head/tail split with :: - O(1), because its unapply simply returns the head and tail if the list is non-empty.

We usually don't use :+ because it's uncommon and expensive (library code below):

```scala
/** An extractor used to init/last deconstruct sequences. */
object :+ {
  /** Splits a sequence into init :+ last.
    * @return Some((init, last)) if sequence is non-empty. None otherwise.
    */
  def unapply[T, Coll <: SeqLike[T, Coll]](
      t: Coll with SeqLike[T, Coll]): Option[(Coll, T)] =
    if (t.isEmpty) None
    else Some(t.init -> t.last)
}
```

To get the last element of a sequence (t.last) we need to loop over it, which is O(n) for a list.

So it really depends on how you pattern match, but usually you match on case classes, tuples, options, or collections to get the first element rather than the last, etc. In the overwhelming majority of cases you'll get O(1) time complexity and a ton of type safety, as sketched below.
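A brief sketch of those common shapes (the Point case class is made up for illustration); each case below touches only a constant number of fields, regardless of input size:

```scala
// Hypothetical case class, purely for illustration.
case class Point(x: Int, y: Int)

def show(value: Any): String = value match {
  case Point(x, y) => s"point at ($x, $y)" // equals/field reads: O(1)
  case Some(v)     => s"some $v"           // Option extraction: O(1)
  case (a, b)      => s"pair ($a, $b)"     // tuple extraction: O(1)
  case head :: _   => s"starts with $head" // head of a list: O(1)
  case _           => "something else"
}
```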

Additionally:

In the worst case there will be m patterns, each doing on average c operations to perform a match (this assumes unapply takes constant time, but there are exceptions), against an object with n properties, which gives a total of m * c * n operations. However, since m is really small (patterns never grow dynamically and are usually written by a human) we can safely call it a constant b, giving T(n) = b * c * n. In terms of big-O: T(n) = O(n).

So we established a theoretical bound of O(n) for the cases where we need to check all n properties/elements of an object. As pointed out above, in most cases we don't need to check all of them: with head :: tail, for example, n is replaced with a constant and we get O(1). Only if we always did something like init :+ last would we get O(n). The amortized cost, I believe, is still O(1) across all cases in your program.

9 Comments

We can make an analogy between pattern matching and exists over a collection of predicates. Given that, the complexity of the operation is O(n) because, in the worst case, you have to check all the elements in the collection. It does not matter if this hypothetical collection contains just a few elements (even only one); the generalization of the complexity is O(n). Now, in practical terms it does not really matter, as n will never be large enough to have a noticeable impact on the performance of the application (unless the unapply is costly enough).
I think you have the wrong idea of what n and the constant factor stand for. The number of patterns doesn't grow dynamically; it's a constant factor that can be neglected. Big-O notation and time complexity analysis are all about the order of growth (see: en.wikipedia.org/wiki/Big_O_notation#Infinite_asymptotics).
We don't care about constant factors or small-n cases that never grow with the size of the input, because they are negligible; those get removed from the analysis even if the constant c equals a million. When we say O(n) we really mean linearly proportional to the size of n, not some fixed size c.
I know what O notation is, and I AGREE with what you say. What you are not understanding is the sense in which I am using my words. Yes, pattern matching, like a series of if / else checks, is O(1) in the formal sense of complexity analysis. What I was trying to explain was that it is not a lookup table; it won't go from input to output in one step. It has to go match by match until the first one that applies, which is a common confusion for people learning. I was not trying to be formal, I was trying to be practical. Definitely, using big-O there was a bad idea.
Edited the answer to state that you are right, as I cannot delete my own answer since it was already accepted.
