\$\begingroup\$

Not sure what I should write here. The code should be self-explanatory.

    """This is a data definition class--Searchable_list.

    Searchable_list takes a list of strings and makes it searchable.
    Searchable meaning you can find which elements in the list have a pattern.
    """
    class Searchable_list(object):
        """this will make your word list searchable.
        Note, It will also loose the original order of the list."""
        def __init__(self, lis):
            assert hasattr(lis,"__iter__")
            self.search_dict=dict()
            for word in set(lis):self.add_word(word)
        def add_word(self,word):
            """this will add a word to the search_dict
            search dict is of the form: {letter:{nextletter:{(index,word)}}}
            """
            assert type(word) is str#or isinstance(word,str)
            for index,val in enumerate(word[:-1]):
                next_letter=self.search_dict.setdefault(val,dict())
                words_list=next_letter.setdefault(word[index+1],set())#object modification
                words_list.add((index,word))#object modifification
        def find_matches(self,seq):
            """finds all the words in the list with this sequence. Uses '.' as wildcard.
            """
            s_d=self.search_dict
            assert len(seq)>1
            #could put a try catch to catch key errors
            for index,letter in enumerate(seq[:-1]):
                if not(letter=="."and seq[index+1]=="."):
                    #no point if they all match...
                    if letter==".":
                        L_m=set.union(*(i.get(seq[index+1],set()) for i in s_d.values()))
                        #.get is important here. not all is have i[seq[index+1]]
                    elif seq[index+1]==".":
                        L_m=set.union(*(i for i in s_d[letter].values()))
                    else:
                        L_m=s_d[letter].get(seq[index+1],{})#this is a set.
                    #L_m==letter_matches
                    if index>0:
                        m_m=((i-index,word) for i,word in L_m)
                        #m_m=matches_matches. These words still have the pattern.
                        #your matching all indexes to the original m_s
                        m_s.intersection_update(m_m)
                        #m_s=matches_set
                    else:
                        m_s=L_m.copy()
                        #http://stackoverflow.com/questions/23200969/how-to-clone-or-copy-a-set-in-python
            return m_s

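To make the shape described in the `add_word` docstring concrete, here is a minimal standalone sketch. The helper name `build_index` is hypothetical and not part of the posted class; it only mirrors the `{letter: {nextletter: {(index, word)}}}` layout for two sample words.

```python
def build_index(words):
    """Build {letter: {next_letter: {(index, word)}}} from an iterable of words."""
    index = {}
    for word in set(words):
        # Pair each letter with the letter that follows it.
        for pos, (cur, nxt) in enumerate(zip(word, word[1:])):
            index.setdefault(cur, {}).setdefault(nxt, set()).add((pos, word))
    return index


# "of" contributes one pair (o, f); "off" contributes (o, f) and (f, f).
index = build_index(["of", "off"])
print(index)
```

Note that a one-letter word produces no letter pairs, so it never appears in this index.
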
EDIT: Because this post was bumped and I've since made some fairly major improvements, here's a link to the final version: https://github.com/user-name-is-taken/words-with-friends/blob/master/WWF_DDC.py (note that the Scrabble parts just adapt this code for Scrabble). It's not very clean, but it has optimizations this version doesn't have. If there's interest, I can add comments explaining the optimizations on GitHub or post something about them here (probably as another answer). For now, the basic idea behind the optimizations is that Python's set.intersection is faster than set.union.
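The claim about set.intersection versus set.union is easy to check locally. Below is a rough timing sketch, assuming two overlapping sets of 10,000 integers; the sizes and repeat count are arbitrary choices for illustration, and results will vary with set sizes, overlap, and Python version.

```python
import timeit

# Two sets that share half of their elements (arbitrary illustrative sizes).
a = set(range(10_000))
b = set(range(5_000, 15_000))

# Time each operation the same number of times.
union_time = timeit.timeit(lambda: a | b, number=200)
intersection_time = timeit.timeit(lambda: a & b, number=200)

print(f"union: {union_time:.4f}s  intersection: {intersection_time:.4f}s")
```
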

\$\endgroup\$
  • \$\begingroup\$ "Not sure what I should write here." Does it work as intended? Do you want a review of any and all aspects of your code? If you can answer both questions with 'yes', you probably posted in the right place. Of course, you could always check the help center. \$\endgroup\$ Commented Mar 10, 2017 at 22:26
  • \$\begingroup\$ I can answer yes to both. \$\endgroup\$ Commented Mar 10, 2017 at 22:30

2 Answers

\$\begingroup\$

Just an updated version of find_matches. The changes are:

  • I added a while loop that strips trailing "." wildcards; see the comments in the code for an explanation.
  • I moved the intersection out of the loop, both for speed and to make the code cleaner. This required adding the setsList list.

    def find_matches(self,seq):
        """finds all the words in the list with this sequence. Uses '.' as wildcard.
        """
        assert len(seq)>1
        s_d = self.search_dict
        setsList =[]
        while seq[-1]=='.':
            #not solved by if index+1=='.' because there's no [letter][''] for word endings in self.search_dict.
            #without this, .f. wouldn't find (0,"of"), because the L_m in the seq[index+1]=="." if wouldn't include it.
            seq = seq[:-1]
        for index,letter in enumerate(seq[:-1]):
            if not(letter=="." and seq[index+1]=="."):#no point if they all match...
                if letter==".":
                    L_m = set.union(*(i.get(seq[index+1],set()) for i in s_d.values()))
                    #.get is important here. not all is have i[seq[index+1]]
                elif seq[index+1]==".":
                    L_m = set.union(*s_d[letter].values())
                else:
                    L_m = s_d[letter].get(seq[index+1],{})#this is a set.
                    #not using s_d.get could cause errors here...
                #L_m==letter_matches
                setsList.append({(i-index,word) for i,word in L_m})
        return set.intersection(*setsList)
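The "collect the sets, then intersect once" step can be looked at in isolation. The following is only a sketch: the function name `match_offsets` and its input format are hypothetical, and the empty-input guard is an added assumption about the desired behaviour (calling `set.intersection()` with no arguments raises a `TypeError`).

```python
def match_offsets(pair_hits):
    """Intersect per-pair hits after aligning them to the pattern start.

    pair_hits is a list of (offset_in_pattern, {(index_in_word, word), ...}).
    """
    # Shift every hit back by its offset so each set holds
    # (start_of_pattern_in_word, word), then intersect in a single call.
    normalised = [
        {(i - offset, word) for i, word in hits}
        for offset, hits in pair_hits
    ]
    if not normalised:
        # Assumption: no constraints means no matches, instead of an error.
        return set()
    return set.intersection(*normalised)


# Hits for the pattern "off" against the words "of", "off" and "offer".
hits_for_off = [
    (0, {(0, "of"), (0, "off"), (0, "offer")}),  # pair ("o", "f")
    (1, {(1, "off"), (1, "offer")}),             # pair ("f", "f")
]
print(match_offsets(hits_for_off))
```

Here "of" drops out because it has no ("f", "f") pair at index 1.
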
\$\endgroup\$
\$\begingroup\$

naming

Based on reading your comment, I propose this name:

    class SearchableCollection(object):

Rely on duck typing rather than trying to write Java code in Python:

    def __init__(self, words):
        # (delete this line) assert hasattr(lis,"__iter__")
        self.words = {}
        for word in words:
            self._add_word(word)

It appears that add_word is not part of your public API. Mark it so with a leading underscore, or make it a nested def. Do not assert that type is str.

    for index,val in enumerate(word[:-1]):

Please name it letter rather than the very vague val. Or cur_letter, parallel with next_letter.

You are using setdefault() in a sensible way. But you might be happier using defaultdict.

You named it words_list, but apparently you meant words_set.
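As a rough illustration of the defaultdict suggestion combined with the renames above, the indexing code might look like the sketch below. It assumes the same `{letter: {next_letter: {(index, word)}}}` layout as the original and is not a drop-in replacement for the full class.

```python
from collections import defaultdict


class SearchableCollection:
    def __init__(self, words):
        # letter -> next_letter -> set of (index, word)
        self.words = defaultdict(lambda: defaultdict(set))
        for word in words:
            self._add_word(word)

    def _add_word(self, word):
        for index, cur_letter in enumerate(word[:-1]):
            next_letter = word[index + 1]
            # Missing keys are created on first access, so no setdefault().
            self.words[cur_letter][next_letter].add((index, word))


collection = SearchableCollection(["of", "off"])
print(collection.words["o"]["f"])
```

One trade-off: indexing a defaultdict with `[]` creates the missing entry, so read-only lookups are better done with `.get()` to avoid growing the dictionary during searches.
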

style

Run $ flake8 WWF_DDC.py and follow its advice, please.

\$\endgroup\$
