What is the best way to generate a unique key for the contents of a dictionary? My intention is to store each dictionary in a document store along with a unique id or hash, so that I don't have to load the whole dictionary from the store to check whether it already exists. Dictionaries with the same keys and values should generate the same id or hash.
I have the following code:
    import hashlib

    a = {'name': 'Danish', 'age': 107}
    b = {'age': 107, 'name': 'Danish'}

    print str(a)
    print hashlib.sha1(str(a)).hexdigest()
    print hashlib.sha1(str(b)).hexdigest()

The last two print statements generate the same string. Is this a good implementation, or are there any pitfalls with this approach? Is there a better way to do this?
Update
Combining suggestions from the answers below, the following might be a good implementation:
    import hashlib

    a = {'name': 'Danish', 'age': 107}
    b = {'age': 107, 'name': 'Danish'}

    def get_id_for_dict(dict):
        # Sort the items so that key order does not affect the hash
        unique_str = ''.join(["'%s':'%s';" % (key, val) for (key, val) in sorted(dict.items())])
        return hashlib.sha1(unique_str).hexdigest()

    print get_id_for_dict(a)
    print get_id_for_dict(b)
get_id_for_dict({'foo':'bar'}) or get_id_for_dict({'fo':'obar'}) or get_id_for_dict({'f':'o','o':'bar'}). They all return 8843d7f92416211de9ebb963ff4ce28125932878. Better use unique_str = join(['%s%s'%(hashlib.sha1(key), hashlib.sha1(val)) for (key, val) in sorted(dict.items())])

', : and ; in key or value. Use unique_str = join(['%s%s'%(hashlib.sha1(str(key)), hashlib.sha1(str(val))) for (key, val) in sorted(dict.items())])

{'b':"ar';'f':'oo"} and {'b':'ar','f':'oo'} would both get the same unique_str: 'b':'ar';'f':'oo'; and the same hash.
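To make the delimiter collision concrete, here is a rough sketch, written for Python 3 (an assumption; the code in the question is Python 2). naive_id is a port of get_id_for_dict from the update, and json_id is a hypothetical alternative that is not taken from the thread: it serializes with json.dumps(sort_keys=True), relying on JSON string escaping to keep delimiters inside keys or values unambiguous. It only works if every key and value is JSON-serializable, and non-string keys are coerced to strings, so 1 and '1' would still collide.

```python
import hashlib
import json


def naive_id(d):
    # Python 3 port of the delimiter-based approach from the update.
    unique_str = ''.join("'%s':'%s';" % (k, v) for k, v in sorted(d.items()))
    return hashlib.sha1(unique_str.encode('utf-8')).hexdigest()


def json_id(d):
    # Hypothetical helper: sort_keys gives a stable key order, and JSON
    # escaping quotes each key and value so embedded delimiters stay distinct.
    canonical = json.dumps(d, sort_keys=True, separators=(',', ':'))
    return hashlib.sha1(canonical.encode('utf-8')).hexdigest()


x = {'b': "ar';'f':'oo"}
y = {'b': 'ar', 'f': 'oo'}

# Different dictionaries, same id with the delimiter-based approach.
print(naive_id(x) == naive_id(y))

# The JSON form differs, so the ids differ.
print(json_id(x) == json_id(y))

# Same keys and values in a different order still give the same id.
print(json_id({'name': 'Danish', 'age': 107}) == json_id({'age': 107, 'name': 'Danish'}))
```

Whether JSON is an acceptable canonical form depends on what the dictionaries contain; nested dictionaries are handled by sort_keys, but custom objects would need their own serialization.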