If I understand you correctly, this should do what you want:
    >>> stats = {'a': {'email1':4, 'email2':3},
    ...          'the': {'email1':2, 'email3':4},
    ...          'or': {'email1':2, 'email3':1}}
    >>> chi = {'a': 7, 'the':6, 'or':3}
    >>> sorted(stats, key=chi.get)
    ['or', 'the', 'a']
Let me know if this works for you. Also, as Boud mentioned above, you should consider numpy/scipy, which would probably provide better performance -- and would definitely provide lots of built-in functionality.
Since you say this doesn't work -- for reasons you haven't yet explained -- here's a more general example of how to use the key argument. It shows that get works with Counter objects as well as standard dicts, and also how to pass your own function (here a lambda) as the key:
    >>> import collections
    >>> stats = {'a': {'email1':4, 'email2':3},
    ...          'the': {'email1':2, 'email3':4},
    ...          'or': {'email1':2, 'email3':1}}
    >>> wordlists = ([k] * sum(d.itervalues()) for k, d in stats.iteritems())
    >>> chi = collections.Counter(word for seq in wordlists for word in seq)
    >>> sorted(stats, key=chi.get)
    ['or', 'the', 'a']
    >>> sorted(stats, key=lambda x: chi[x] + 3)
    ['or', 'the', 'a']
    >>> sorted(stats, key=chi.get, reverse=True)
    ['a', 'the', 'or']
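The session above is Python 2 (itervalues, iteritems). For anyone on Python 3, here is a rough equivalent. It is only a sketch: it builds the Counter directly from per-word totals instead of going through the intermediate word lists, and the names ascending, descending and shifted are mine, not from the question.

```python
from collections import Counter

stats = {
    'a': {'email1': 4, 'email2': 3},
    'the': {'email1': 2, 'email3': 4},
    'or': {'email1': 2, 'email3': 1},
}

# Total count of each word across all emails: a -> 7, the -> 6, or -> 3
chi = Counter({word: sum(counts.values()) for word, counts in stats.items()})

# key= accepts any callable that maps an item to its sort value
ascending = sorted(stats, key=chi.get)
descending = sorted(stats, key=chi.get, reverse=True)
shifted = sorted(stats, key=lambda word: chi[word] + 3)

print(ascending)   # ['or', 'the', 'a']
print(descending)  # ['a', 'the', 'or']
print(shifted)     # ['or', 'the', 'a']
```

Adding a constant inside the lambda does not change the order; it is only there to show that the key can be an arbitrary function of the item.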
I still don't completely understand what you're looking for, but perhaps you mean to get a sorted list of key, value tuples?
    >>> sorted(stats.iteritems(), key=lambda x: chi[x[0]])
    [('or', {'email3': 1, 'email1': 2}), ('the', {'email3': 4, 'email1': 2}), ('a', {'email2': 3, 'email1': 4})]
I would actually recommend splitting this up though:
    >>> sorted_keys = sorted(stats, key=chi.get)
    >>> [(k, stats[k]) for k in sorted_keys]
    [('or', {'email3': 1, 'email1': 2}), ('the', {'email3': 4, 'email1': 2}), ('a', {'email2': 3, 'email1': 4})]
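You said you want something sorted by the values in chi, but "with the same structure as stats." That's not possible with a plain dict, because dictionaries don't have an order (in the Python 2 used here; plain dicts are only guaranteed to preserve insertion order from Python 3.7 on). The closest you can come is a sorted list of tuples, or an OrderedDict (in 2.7+).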
    >>> collections.OrderedDict((k, stats[k]) for k in sorted_keys)
    OrderedDict([('or', {'email3': 1, 'email1': 2}), ('the', {'email3': 4, 'email1': 2}), ('a', {'email2': 3, 'email1': 4})])
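Keep in mind that an OrderedDict only remembers insertion order; it does not keep itself sorted. If you have to frequently reorder the dictionary, you would have to rebuild it every time, so this method is kind of pointless in that case.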
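If you do go the OrderedDict route and the scores change now and then, one option is to wrap the rebuild in a small helper. This is just a sketch in Python 3: reorder is a hypothetical name, and treating words missing from the scores as 0 is my assumption, not something from the question.

```python
from collections import OrderedDict

def reorder(stats, scores):
    """Return a new OrderedDict of stats, ordered by ascending score."""
    # Assumption: a key with no score sorts as if its score were 0
    ordered_keys = sorted(stats, key=lambda k: scores.get(k, 0))
    return OrderedDict((k, stats[k]) for k in ordered_keys)

stats = {
    'a': {'email1': 4, 'email2': 3},
    'the': {'email1': 2, 'email3': 4},
    'or': {'email1': 2, 'email3': 1},
}
chi = {'a': 7, 'the': 6, 'or': 3}

by_chi = reorder(stats, chi)
print(list(by_chi))  # ['or', 'the', 'a']

# After the scores change, the old OrderedDict is stale; build a new one
chi['or'] = 10
by_chi = reorder(stats, chi)
print(list(by_chi))  # ['the', 'a', 'or']
```

Each call sorts all the keys again, so if the scores change constantly it is probably simpler to keep stats as a plain dict and call sorted at the point where you need the order.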
scipy package, module scipy.stats, where the chisquare function is