0

Code

total_exc_time = 0 for _ in range(10): start = time.time() set_a = set() dict_a = dict() add = set_a.add for index in range(1000000): dict_a[index] = index add(index) total_exc_time += ((time.time() - start) * 1000) total_exc_time = 0 for _ in range(10): start = time.time() dict_a = dict() for index in range(1000000): dict_a[index] = index set_a = set(dict_a.keys()) total_exc_time += ((time.time() - start) * 1000) 

Result

332.3066234588623 ms 283.2982540130615 ms 

isn't the time complexity for the first code is O(n) and later code is 2 times O(n)?

6
  • 4
    Function calls have overhead. Any overhead multiplied by ten million will becomes more noticeable. Idk if that's it, but it's a possibility. Commented Nov 19, 2020 at 0:04
  • 1
    "the time complexity for the first code is O(n) and later code is 2 times O(n)" - that's not how asymptotic complexity works. Commented Nov 19, 2020 at 0:12
  • sorry, I know the have the same O(n) but, the first code looks that it has less computation than the later code. Commented Nov 19, 2020 at 0:18
  • @bakumpai why does it look like that to you? Commented Nov 19, 2020 at 0:20
  • cause of dict.keys() ? Commented Nov 19, 2020 at 0:21

1 Answer 1

1

They are equivalent. The heavy lifting here to calculate the key hash. In both, it happens twice: add to a dict and then add to set. Other procedures should be light. You can use debugger to show the C calls to verify.

In addition to function call overhead mentioned by Carcigenicate, memory management might also play a role. If the set knows the length, it could possibly avoid copying data over when the predefined space isn't enough.

Sign up to request clarification or add additional context in comments.

10 Comments

doesn't dict.keys() populate a list? and it costs O(n)?
@bakumpai no it doesnt. Even if it did it could still be faster to pass it to set than using .add in a loop. For starters, even if it did the same thing (not pre-allocating the entire set) it would still be faster because it is all done in compiled code, whereas your loop is at the interpreter, bytecode level, which will always be significantly slower
I didn't even think of the potential for reallocations/expansion. I don't think it can take that into consideration because the signature for set specifies an "iterable", which doesn't have a length, but it could do an attribute check or something regardless.
@Carcigenicate have to admit I didn't verify myself and it's hypothetical.
|

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.