
I was asked an interview question in which we're looking for duplicate social security numbers, i.e. SSNs that appear in both array A and array B. My immediate thought was to iterate over each array, storing the count of each SSN in a HashMap. Then, of course, all keys with a count greater than 1 are duplicates. My understanding was that this would be an O(n + m) operation, where n and m are the number of elements in the two arrays. I was told that this is "very inefficient" with regard to time complexity, and that the time complexity was actually O(n * m). How is this correct?
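The approach described above can be sketched like this (a minimal sketch; the class and method names are mine, and a HashSet of the first array's values is used instead of raw counts so that an SSN repeated within a single array isn't reported as a false duplicate):

```java
import java.util.*;

public class SsnIntersection {
    // Find SSNs that appear in both arrays.
    // Building the set is O(n); scanning b is O(m) with expected O(1)
    // lookups per element, so the whole routine is O(n + m) on average.
    static Set<String> inBoth(String[] a, String[] b) {
        Set<String> inA = new HashSet<>(Arrays.asList(a));
        Set<String> result = new HashSet<>();
        for (String ssn : b) {
            if (inA.contains(ssn)) {
                result.add(ssn);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String[] a = {"111-11-1111", "222-22-2222", "333-33-3333"};
        String[] b = {"222-22-2222", "444-44-4444", "333-33-3333"};
        System.out.println(inBoth(a, b));
    }
}
```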

  • Yeah, that's definitely O(n+m). Commented Feb 5, 2017 at 22:14
  • "I was told..." Told by whom? Why does it matter? It sounds pretty strange. It's linear on average; it's quadratic in the worst case. Commented Feb 5, 2017 at 22:15
  • @JoeC that's what I was thinking. How do you correct an interviewer without sounding like a complete tool? Commented Feb 5, 2017 at 22:15
  • The right tack to take is to ask the interviewer why he believes he is right. Either he will realize his mistake and correct himself, or you will have an opportunity to clarify your answer. Commented Feb 5, 2017 at 22:17
  • The quadratic worst case would be if the hash code were absolutely useless, in that it put all of the strings in the same hash bucket. Commented Feb 5, 2017 at 22:19
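The collision worst case mentioned in the last comment can be demonstrated with a hypothetical key type whose hashCode is constant (names are mine, not from the original discussion):

```java
import java.util.*;

public class CollidingKeys {
    // Hypothetical key whose hashCode is constant, so every key lands in the
    // same bucket. In a classic bucket-list HashMap, lookups then degrade
    // from expected O(1) to O(n) each; note that Java 8+ mitigates this by
    // treeifying large buckets, bounding the damage at roughly O(log n).
    static final class BadKey {
        final String ssn;
        BadKey(String ssn) { this.ssn = ssn; }
        @Override public int hashCode() { return 42; }  // all keys collide
        @Override public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).ssn.equals(ssn);
        }
    }

    public static void main(String[] args) {
        Map<BadKey, Integer> counts = new HashMap<>();
        counts.merge(new BadKey("111-11-1111"), 1, Integer::sum);
        counts.merge(new BadKey("111-11-1111"), 1, Integer::sum);
        counts.merge(new BadKey("222-22-2222"), 1, Integer::sum);
        // Still functionally correct, just slower under heavy collisions.
        System.out.println(counts.get(new BadKey("111-11-1111")));  // 2
    }
}
```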

1 Answer


First of all, O(n + m) is faster than O(n * m). A time complexity of O(n * m) arises when you have two nested loops, like this:

for x in [1..n]
    for y in [1..m]

So the term "very inefficient" doesn't apply at all, unless the extra O(n) space the hash map requires is a concern. In that case, you can ask for clarification on which factor matters more: time complexity or space complexity.
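For contrast, here is a sketch of the nested-loop approach the pseudocode above describes (names are mine); every element of one array is compared against every element of the other, which is what makes it O(n * m):

```java
import java.util.*;

public class NestedLoopDuplicates {
    // Naive quadratic approach: the inner loop runs m times for each of the
    // n outer iterations, giving O(n * m) comparisons in total.
    static List<String> duplicates(String[] a, String[] b) {
        List<String> result = new ArrayList<>();
        for (String x : a) {                 // n iterations
            for (String y : b) {             // m iterations each
                if (x.equals(y) && !result.contains(x)) {
                    result.add(x);           // record each match once
                }
            }
        }
        return result;
    }
}
```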
