Python - Prefix frequency in string List

Python - Prefix frequency in string List

In this tutorial, we will learn how to calculate the frequency of prefixes in a list of strings.

Objective:

For the list of strings:

words = ["apple", "appetite", "banana", "apparel", "ball", "bat"] 

We want to compute the frequency of prefixes for a given size:

Let's say the size is 3; then, the prefixes are:

"app", "app", "ban", "app", "bal", "bat" 

The frequency would be:

app: 3 ban: 1 bal: 1 bat: 1 

Steps and Python Code:

1. Using collections.defaultdict:

A defaultdict from the collections module can be used to conveniently compute the frequency of each prefix.

from collections import defaultdict words = ["apple", "appetite", "banana", "apparel", "ball", "bat"] size = 3 prefix_freq = defaultdict(int) for word in words: prefix = word[:size] prefix_freq[prefix] += 1 for prefix, freq in prefix_freq.items(): print(prefix, ":", freq) 

2. Using collections.Counter:

The Counter class from the collections module provides a more concise way to calculate frequencies.

from collections import Counter words = ["apple", "appetite", "banana", "apparel", "ball", "bat"] size = 3 prefixes = [word[:size] for word in words] prefix_freq = Counter(prefixes) for prefix, freq in prefix_freq.items(): print(prefix, ":", freq) 

Explanation:

  • Both methods start by extracting prefixes of the desired length from each word.

  • The first method employs a defaultdict to accumulate the frequency for each prefix. Each time a prefix is encountered, its count is incremented.

  • The second method uses list comprehension to generate a list of all prefixes. Then, the Counter class computes the frequency of each prefix.

Considerations:

  • If a word's length is shorter than the desired size, the entire word is taken as a prefix.

  • If the list contains duplicate words, their prefixes will be counted multiple times. If you want unique prefix counts, ensure to convert the word list to a set before processing.

By following this tutorial, you've learned how to compute the frequency of prefixes in a list of strings using Python.


More Tags

hide nebular flexbox macos-catalina kendo-datasource pytest collocation ngrx sequential radio-group

More Programming Guides

Other Guides

More Programming Examples