Almost same code running much slower

Question

I am trying to solve this problem:

Given a string array words, find the maximum value of length(word[i]) * length(word[j]) where the two words do not share common letters. You may assume that each word will contain only lower case letters. If no such two words exist, return 0.

https://leetcode.com/problems/maximum-product-of-word-lengths/

You can create a bitmap of char for each word to check if they share chars in common and then calc the max product.

I have two method almost equal but the first pass checks, while the second is too slow, can you understand why?

class Solution { public: int maxProduct2(vector<string>& words) { int len = words.size(); int *num = new int[len]; // compute the bit O(n) for (int i = 0; i < len; i ++) { int k = 0; for (int j = 0; j < words[i].length(); j ++) { k = k | (1 <<(char)(words[i].at(j))); } num[i] = k; } int c = 0; // O(n^2) for (int i = 0; i < len - 1; i ++) { for (int j = i + 1; j < len; j ++) { if ((num[i] & num[j]) == 0) { // if no common letters int x = words[i].length() * words[j].length(); if (x > c) { c = x; } } } } delete []num; return c; } int maxProduct(vector<string>& words) { vector<int> bitmap(words.size()); for(int i=0;i<words.size();++i) { int k = 0; for(int j=0;j<words[i].length();++j) { k |= 1 << (char)(words[i][j]); } bitmap[i] = k; } int maxProd = 0; for(int i=0;i<words.size()-1;++i) { for(int j=i+1;j<words.size();++j) { if ( !(bitmap[i] & bitmap[j])) { int x = words[i].length() * words[j].length(); if ( x > maxProd ) maxProd = x; } } } return maxProd; } };

Why the second function (maxProduct) is too slow for leetcode?

Solution

The second method does repetitive call to words.size(). If you save that in a var than it working fine

well, you've got a loop that does (number of words)² checks; that'd be a start... How many words are there? — Marcus Müller
– Marcus Müller, Commented Apr 28, 2016 at 13:09
Also you call it bitmap, but in fact, it's a vector<int>, not a thing composed of individually adressable bits. — Marcus Müller
– Marcus Müller, Commented Apr 28, 2016 at 13:09
Consider the case where all words have common letters and count the number of multiplications each function is making. — molbdnilo
– molbdnilo, Commented Apr 28, 2016 at 13:11
@molbdnilo: now the multiplication is done only after the if, like the first code. but still it's too slow — giò
– giò, Commented Apr 28, 2016 at 13:16
@MarcusMüller: it's almost the same as the first one (the first use an array) — giò
– giò, Commented Apr 28, 2016 at 13:17

kchinger · Accepted Answer · 2016-04-28 14:38:23Z

Since my comment turned out to be correct I'll turn my comment into an answer and try to explain what I think is happening.

I wrote some simple code to benchmark on my own machine with two solutions of two loops each. The only difference is the call to words.size() is inside the loop versus outside the loop. The first solution is approximately 13.87 seconds versus 16.65 seconds for the second solution. This isn't huge, but it's about 20% slower.

Even though vector.size() is a constant time operation that doesn't mean it's as fast as just checking against a variable that's already in a register. Constant time can still have large variances. When inside nested loops that adds up.

The other thing that could be happening (someone much smarter than me will probably chime in and let us know) is that you're hurting your CPU optimizations like branching and pipelining. Every time it gets to the end of the the loop it has to stop, wait for the call to size() to return, and then check the loop variable against that return value. If the cpu can look ahead and guess that j is still going to be less than len because it hasn't seen len change (len isn't even inside the loop!) it can make a good branch prediction each time and not have to wait.

Not sure how you test it, but the vector version zero initialize bitmap, whereas the other the buffer is left uninitialized before assignment.
@jarod42 I tested with a vector that I initialized before timing, then timed two identical sets of two loops working on that vector, and the only difference was the .size() was in the compare or done once first. The second loop was the slower, despite having the advantage of going second on the same vector and doing the same thing.

Collectives™ on Stack Overflow

Almost same code running much slower

Solution

1 Answer 1

2 Comments

Hot Network Questions

Collectives™ on Stack Overflow

Solution

1 Answer 1

2 Comments

Related