3

I am trying to track seen elements from a big array using a dict. Is there a way to force a dictionary object to hold integer values, set to zero by default upon initialization?

I have done this with very clunky code and two loops.

Here is what I do now:

    fl = [0, 1, 1, 2, 1, 3, 4]
    seenit = {}
    for val in fl:
        seenit[val] = 0
    for val in fl:
        seenit[val] = seenit[val] + 1
  • "force a dictionary ... set to zero by default upon initialization" -- defaultdict Commented Mar 24, 2019 at 14:51
  • There's seenit = dict.fromkeys(fl, 0) to replace the first loop, but the current answers provide better solutions for replacing both loops at the same time. Commented Mar 24, 2019 at 15:05
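The dict.fromkeys suggestion from the comment above could look roughly like this. It is only a sketch: it replaces the first (initialization) loop and keeps the second loop from the question.

```python
fl = [0, 1, 1, 2, 1, 3, 4]

# dict.fromkeys creates one key per distinct value in fl,
# each mapped to the same starting value (0 here).
seenit = dict.fromkeys(fl, 0)

# The counting loop from the question is unchanged.
for val in fl:
    seenit[val] += 1

print(seenit)  # {0: 1, 1: 3, 2: 1, 3: 1, 4: 1}
```

Sharing one default value across all keys is safe here because integers are immutable. With a mutable default such as a list, every key would point at the same object.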

3 Answers

5

Of course, just use collections.defaultdict([default_factory[, ...]]):

    from collections import defaultdict

    fl = [0, 1, 1, 2, 1, 3, 4]
    seenit = defaultdict(int)  # missing keys start at int() == 0
    for val in fl:
        seenit[val] += 1

    print(seenit)
    # Output: defaultdict(<class 'int'>, {0: 1, 1: 3, 2: 1, 3: 1, 4: 1})
    print(dict(seenit))
    # Output: {0: 1, 1: 3, 2: 1, 3: 1, 4: 1}

Alternatively, if you would rather not import collections, you can use dict.get(key[, default]):

    fl = [0, 1, 1, 2, 1, 3, 4]
    seenit = {}
    for val in fl:
        seenit[val] = seenit.get(val, 0) + 1

    print(seenit)
    # Output: {0: 1, 1: 3, 2: 1, 3: 1, 4: 1}

Also, if you only want to solve the problem and don't need a plain dictionary, you can use collections.Counter([iterable-or-mapping]):

    from collections import Counter

    fl = [0, 1, 1, 2, 1, 3, 4]
    seenit = Counter(fl)  # counts every element in a single call

    print(seenit)
    # Output: Counter({1: 3, 0: 1, 2: 1, 3: 1, 4: 1})
    print(dict(seenit))
    # Output: {0: 1, 1: 3, 2: 1, 3: 1, 4: 1}

Both collections.defaultdict and collections.Counter can be indexed as dictionary[key] and support .keys(), .values(), .items(), etc. Both are subclasses of the built-in dict.
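A quick way to check the subclass relationship yourself, as a small sketch (the names dd and cnt are just for illustration):

```python
from collections import Counter, defaultdict

fl = [0, 1, 1, 2, 1, 3, 4]

dd = defaultdict(int)
for val in fl:
    dd[val] += 1
cnt = Counter(fl)

# Both classes inherit from the built-in dict.
print(isinstance(dd, dict), isinstance(cnt, dict))  # True True

# Indexing and the usual dict methods work on both.
print(dd[1], cnt[1])                # 3 3
print(sorted(dd.keys()))            # [0, 1, 2, 3, 4]
print(dict(dd) == dict(cnt))        # True
```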

If you want to talk about performance, I used timeit.timeit() to time the creation of the dictionary plus the loop, over a million executions:

  • collections.defaultdict: 2.160868141 seconds
  • dict.get: 1.3540439499999999 seconds
  • collections.Counter: 4.700308418999999 seconds

collections.Counter may be easier, but it is much slower on this short list.
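The benchmark code itself is not shown above, so the following is only a guess at how such a comparison might be set up. The function names are made up for illustration, and number is reduced to 100,000 so it finishes quickly. Absolute timings will differ by machine and Python version.

```python
import timeit
from collections import Counter, defaultdict

fl = [0, 1, 1, 2, 1, 3, 4]

def with_defaultdict():
    seenit = defaultdict(int)
    for val in fl:
        seenit[val] += 1
    return seenit

def with_get():
    seenit = {}
    for val in fl:
        seenit[val] = seenit.get(val, 0) + 1
    return seenit

def with_counter():
    return Counter(fl)

# Assumption: each run includes creating the container and the loop.
for func in (with_defaultdict, with_get, with_counter):
    seconds = timeit.timeit(func, number=100_000)
    print(f"{func.__name__}: {seconds:.3f} s")
```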


2 Comments

Thanks. This newbie appreciates all the help. I like the defaultdict solution and will go with this.
@giggleusa if you find this or another answer useful you should consider accepting the answer as the correct one (pressing the gray mark under the downvote button). You would receive 2 reputation for accepting one. Otherwise, your question will stay unanswered.
4

You can use collections.Counter:

    from collections import Counter
    Counter([0, 1, 1, 2, 1, 3, 4])

Output:

    Counter({1: 3, 0: 1, 2: 1, 3: 1, 4: 1})

You can then address it like a dictionary:

    >>> Counter({1: 3, 0: 1, 2: 1, 3: 1, 4: 1})[1]
    3
    >>> Counter({1: 3, 0: 1, 2: 1, 3: 1, 4: 1})[0]
    1
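One detail that bears on the "zero by default" part of the question: a Counter returns 0 for a key it has never seen, and looking it up does not add the key. A defaultdict(int) also returns 0, but it stores the key as a side effect. A small sketch (99 is just an arbitrary value that is not in the list):

```python
from collections import Counter, defaultdict

seen = Counter([0, 1, 1, 2, 1, 3, 4])

# Counter: missing keys read as 0 and are not inserted.
print(seen[99])    # 0
print(99 in seen)  # False

# defaultdict(int): missing keys read as 0 and ARE inserted on lookup.
dd = defaultdict(int, seen)
print(dd[99])      # 0
print(99 in dd)    # True
```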


1

Using val in seenit is a bit faster than .get():

    seenit = dict()
    for val in fl:
        if val in seenit:
            seenit[val] += 1
        else:
            seenit[val] = 1

For larger lists, Counter will eventually outperform all other approaches, and defaultdict is going to be faster than using .get() or val in seenit.
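If you want to check how this scales on your own machine, something along these lines could work. It is a sketch only: the list size, value range, and function names are arbitrary choices of mine, and the results depend on the Python version and the data.

```python
import random
import timeit
from collections import Counter, defaultdict

def count_in(values):
    seenit = {}
    for val in values:
        if val in seenit:
            seenit[val] += 1
        else:
            seenit[val] = 1
    return seenit

def count_get(values):
    seenit = {}
    for val in values:
        seenit[val] = seenit.get(val, 0) + 1
    return seenit

def count_defaultdict(values):
    seenit = defaultdict(int)
    for val in values:
        seenit[val] += 1
    return seenit

def count_counter(values):
    return Counter(values)

# Assumption: 100,000 random integers below 1,000 stand in for "a larger list".
random.seed(0)
big = [random.randrange(1000) for _ in range(100_000)]

for func in (count_in, count_get, count_defaultdict, count_counter):
    seconds = timeit.timeit(lambda: func(big), number=10)
    print(f"{func.__name__}: {seconds:.3f} s")
```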
