"""DocStrings"""
+1 for adding a doc string to the wordBreak() method, but -1 for not having any content in it.
PEP-008
Your code diverges from the PEP-008 guidelines in several areas:
- Use
snake_case for functions and variables, not mixedCase. - Use a single space around operators (
elif i > len(s):), not 2 spaces, and then no spaces. - Add a blank line between the inner
prune() function and the outer function, as well as between the class Solution and its first method.
Use pylint or pyflakes or similar to ensure your code follows proper conventions.
Variable Names
Variable names like s are very terse, but since it is given that name in the question, is forgivable.
Variable names like wordDict are misleading, and should be avoided! wordDict is actually a list, not a dict. Although that name was given in the question, you should correct it to word_list, or simply words to avoid confusion. (+1 for the type hint; it helps avoid some confusion. But rename the parameter anyway.)
i is acceptable as a loop index, but there is no obvious loop. idx or pos may improve clarity.
m is unacceptable. It in no way suggests it is the length of a word. Use a more descriptive variable name.
Dead Code
elif i > len(s): return False
At what point will this branch every be taken? The only way is if prune(i+m) is called when i+m is greater than len(s). But if m is the length of word, and s[i:i+m] == word is true, it is impossible for i+m to exceed len(s). This branch is dead code, and can be removed.
Algorithmic Improvements
You are sorting your "dictionary" by length, so that you check the longest words first. However, this doesn't actually guarantee any speed improvements. At any step, if the next possible word is the shortest one, you've guaranteed it will be the last one checked. Ditch the sort.
What you want to do is reduce the number of words from the "dictionary" you are testing at each step. One obvious way would be to turn the "dictionary" into an actual dict, say, storing the list of words which start with a given letter:
word_dict = { "a": ["apple"], "p": ["pen"], }
Or more programmatically:
word_dict = defaultdict(list, ((word[:1], word) for word in wordDict))
Then at each step, instead of comparing s[i:i+len(word)] to each and every word in wordDict, you would compare to the list of words in word_dict[s[i:i+1]], which should be significantly smaller.
This method will degenerate if all the dictionary words start with the same letter (["aaaa", "aaab", "aaac", "aaad"]), but you could handle this by storing the dictionary as a tree structure instead.
Another alternative would be to separate wordDict into sets of words of different lengths:
words_by_length = { 3: { "pen" }, 5: { "apple" } }
Then, at each stage, you could do a single set containment test for each word length:
for word_length, word_list in words_by_length.items(): if s[i:i+word_length] in words_list: # ...