feat(data structures): stream checker and trie node #107
This adds a stream checker data structure for suffix matches. The stream checker uses a Trie to find suffixes of the stream that match the words it was initialized with. Since a trie is a prefix tree that matches on prefixes, the words are inserted in reverse so that the Trie matches on suffixes instead. Note that the Trie node is unchanged apart from its constructor, which is now initialized with a char. BREAKING CHANGE: The Trie data structure no longer handles search correctly and will need to be refactored to cater for the changes that have been introduced.
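The suffix-matching approach described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `children` and `is_end` attribute names, the dict-based node layout, and the example words are assumptions made for the sketch.

```python
from __future__ import annotations

from collections import deque


class TrieNode:
    """One trie node. `children` and `is_end` are assumed names for this sketch."""

    def __init__(self, char: str = "") -> None:
        self.char = char
        self.children: dict[str, TrieNode] = {}
        self.is_end = False


class StreamChecker:
    def __init__(self, words: list[str]) -> None:
        self.root = TrieNode()
        max_len = 0
        for word in words:
            max_len = max(max_len, len(word))
            node = self.root
            # Insert each word back to front so that suffixes become prefixes.
            for ch in reversed(word):
                if ch not in node.children:
                    node.children[ch] = TrieNode(ch)
                node = node.children[ch]
            node.is_end = True
        # Only the last max_len characters can ever complete a word.
        self.stream: deque[str] = deque(maxlen=max_len)

    def query(self, letter: str) -> bool:
        self.stream.append(letter)
        node = self.root
        # Walk the buffer newest-first, following the reversed trie.
        for ch in reversed(self.stream):
            node = node.children.get(ch)
            if node is None:
                return False
            if node.is_end:
                return True
        return False


checker = StreamChecker(["go", "hi"])
results = [checker.query(ch) for ch in "goi"]
print(results)  # [False, True, False]
```

Bounding the buffer with `maxlen` keeps the per-query work proportional to the longest word rather than to the length of the whole stream.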
Walkthrough
The pull request introduces a Stream Checker data structure and a complete Trie implementation. It adds new files for StreamChecker (which processes character streams and checks suffix matches against a word list using a reversed Trie), implements TrieNode and Trie classes, and updates documentation to reflect these new additions in the codebase.
Sequence Diagram(s)
sequenceDiagram
    participant Client
    participant StreamChecker
    participant Trie
    participant Deque
    Client->>StreamChecker: __init__(["go", "hi"])
    StreamChecker->>StreamChecker: reverse each word
    StreamChecker->>Trie: insert("og"), insert("ih")
    Trie->>Trie: build reversed-word trie
    StreamChecker->>Deque: create deque(maxlen=2)
    Client->>StreamChecker: query('g')
    StreamChecker->>Deque: append('g')
    StreamChecker->>Trie: traverse from root using stream buffer
    Trie-->>StreamChecker: no match (not end node)
    StreamChecker-->>Client: false
    Client->>StreamChecker: query('o')
    StreamChecker->>Deque: append('o'), deque=['g','o']
    StreamChecker->>Trie: traverse ['o','g'] in reverse
    Trie-->>StreamChecker: match found (end node)
    StreamChecker-->>Client: true
    Client->>StreamChecker: query('i')
    StreamChecker->>Deque: append('i'), deque=['o','i']
    StreamChecker->>Trie: traverse ['i','o'] in reverse
    Trie-->>StreamChecker: no match
    StreamChecker-->>Client: false
Estimated code review effort
🎯 3 (Moderate) | ⏱️ ~25 minutes
Pre-merge checks and finishing touches
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 3
🧹 Nitpick comments (4)
DIRECTORY.md (1)
268-270: Fix list indentation to satisfy markdownlint (MD007).
markdownlint-cli2 is flagging the new Streams and Trie bullets for incorrect indentation. Adjusting leading spaces will keep this index consistent and clear.
Suggested diff:
-  * Streams
-    * Stream Checker
-      * [Test Stream Checker](https://github.com/BrianLusina/PythonSnips/blob/master/datastructures/streams/stream_checker/test_stream_checker.py)
+* Streams
+  * Stream Checker
+    * [Test Stream Checker](https://github.com/BrianLusina/PythonSnips/blob/master/datastructures/streams/stream_checker/test_stream_checker.py)
@@
-    * Trie
-      * [Trie](https://github.com/BrianLusina/PythonSnips/blob/master/datastructures/trees/trie/trie.py)
-      * [Trie Node](https://github.com/BrianLusina/PythonSnips/blob/master/datastructures/trees/trie/trie_node.py)
+  * Trie
+    * [Trie](https://github.com/BrianLusina/PythonSnips/blob/master/datastructures/trees/trie/trie.py)
+    * [Trie Node](https://github.com/BrianLusina/PythonSnips/blob/master/datastructures/trees/trie/trie_node.py)
Also applies to: 309-311
datastructures/streams/stream_checker/test_stream_checker.py (1)
1-35: StreamChecker tests cover core behavior; consider one overlapping-words case.
The three tests validate basic true/false behavior and multi-query sequences correctly. As a small enhancement, you might add a case with overlapping longer words (e.g., `words = ["cd", "bcd", "abcd"]` with stream `"a", "b", "c", "d"`) to guard against regressions in the reversed-trie traversal logic.
datastructures/streams/stream_checker/__init__.py (1)
1-73: StreamChecker logic looks correct; clarify docs and consider tiny cleanups.
The reversed-trie + bounded `deque` implementation is sound and matches the intended suffix-checker semantics, assuming the `TrieNode` import/annotation issue is fixed as noted in `datastructures/trees/trie/trie_node.py`.
A few small, non-blocking improvements:
Docstring accuracy (lines 42–55):
- The comment “True if the letter is the end of a word” understates the behavior; the method returns `True` if any suffix of the current stream forms a word.
- The complexity text repeats itself as `O(L)` and `O(Lmax)`; you could simplify to `O(min(L, Lmax))` and then note that `Lmax` is bounded (≤ longest word length), so per-query cost is effectively O(1) in that parameter.
Micro-optimizations (optional):
- `for word in self.words[::-1]:` could be `for word in reversed(self.words):` to avoid copying the list.
- `self.words` is only used in `__build_trie`; if you don’t need it later, you could drop the instance attribute and just pass `words` into `__build_trie` to reduce stored state.
These are minor; the core implementation is good as-is.
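One way to derive the expected values for the overlapping-words test suggested earlier is a brute-force reference that checks every word against the stream seen so far. The helper below is a hypothetical sketch (`suffix_matches` is not part of the PR) intended only as an oracle for comparing against `StreamChecker.query` results.

```python
def suffix_matches(words: list[str], stream: str) -> list[bool]:
    """Brute-force reference: for each character consumed, report whether
    the stream seen so far ends with any of the words."""
    results = []
    for end in range(1, len(stream) + 1):
        seen = stream[:end]
        results.append(any(seen.endswith(word) for word in words))
    return results


# Overlapping words: "cd", "bcd" and "abcd" all complete on the final "d".
expected = suffix_matches(["cd", "bcd", "abcd"], "abcd")
print(expected)  # [False, False, False, True]
```

Because this reference rescans every word on each character, it is only suitable for tests; it makes no attempt to match the bounded per-query cost discussed in the review.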
datastructures/trees/trie/trie.py (1)
9-20: Consider adding a docstring.
The insertion logic is correct, but adding a docstring would improve API documentation consistency (especially since `starts_with` has one).
 def insert(self, word: str) -> None:
+    """
+    Inserts a word into the trie.
+
+    Args:
+        word: The word to insert into the trie.
+    """
     curr = self.root
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (3)
datastructures/streams/stream_checker/images/examples/stream_checker_example_1.png is excluded by !**/*.png
datastructures/streams/stream_checker/images/examples/stream_checker_example_2.png is excluded by !**/*.png
datastructures/streams/stream_checker/images/examples/stream_checker_example_3.png is excluded by !**/*.png
📒 Files selected for processing (7)
DIRECTORY.md (2 hunks)
datastructures/streams/stream_checker/README.md (1 hunks)
datastructures/streams/stream_checker/__init__.py (1 hunks)
datastructures/streams/stream_checker/test_stream_checker.py (1 hunks)
datastructures/trees/trie/__init__.py (1 hunks)
datastructures/trees/trie/trie.py (1 hunks)
datastructures/trees/trie/trie_node.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (4)
datastructures/trees/trie/__init__.py (2)
datastructures/trees/trie/trie_node.py (1)
TrieNode (5-23)
datastructures/trees/trie/trie.py (1)
Trie (5-57)
datastructures/streams/stream_checker/__init__.py (1)
datastructures/trees/trie/trie_node.py (1)
TrieNode (5-23)
datastructures/streams/stream_checker/test_stream_checker.py (1)
datastructures/streams/stream_checker/__init__.py (2)
StreamChecker (6-73)
query (42-73)
datastructures/trees/trie/trie.py (1)
datastructures/trees/trie/trie_node.py (1)
TrieNode (5-23)
🪛 markdownlint-cli2 (0.18.1)
DIRECTORY.md
268-268: Unordered list indentation
Expected: 0; Actual: 2
(MD007, ul-indent)
269-269: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
270-270: Unordered list indentation
Expected: 4; Actual: 6
(MD007, ul-indent)
309-309: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
310-310: Unordered list indentation
Expected: 4; Actual: 6
(MD007, ul-indent)
311-311: Unordered list indentation
Expected: 4; Actual: 6
(MD007, ul-indent)
🔇 Additional comments (3)
datastructures/trees/trie/__init__.py (1)
1-8: Re-export of `Trie` and `TrieNode` looks good.
The explicit imports plus `__all__ = ["Trie", "TrieNode"]` cleanly define the package surface and support `from datastructures.trees.trie import Trie, TrieNode` as intended.
datastructures/trees/trie/trie.py (2)
5-7: LGTM!
The Trie initialization is straightforward and correct.
46-57: LGTM!
The `starts_with` method correctly checks for prefix existence in the Trie, with appropriate documentation.
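For reference, a trie exposing `insert`, `search` and `starts_with` might look roughly like the sketch below. It is an illustration under assumptions, not the reviewed `trie.py`: the dict-based `children`, the `is_end` flag, the `_walk` helper and the boolean return types are all hypothetical choices.

```python
from __future__ import annotations


class TrieNode:
    """Node initialized with a char; `children` and `is_end` are assumed names."""

    def __init__(self, char: str = "") -> None:
        self.char = char
        self.children: dict[str, TrieNode] = {}
        self.is_end = False


class Trie:
    def __init__(self) -> None:
        self.root = TrieNode()

    def insert(self, word: str) -> None:
        """Inserts a word into the trie."""
        curr = self.root
        for ch in word:
            if ch not in curr.children:
                curr.children[ch] = TrieNode(ch)
            curr = curr.children[ch]
        curr.is_end = True

    def _walk(self, text: str) -> TrieNode | None:
        """Hypothetical helper: follow `text` from the root, or return None."""
        curr = self.root
        for ch in text:
            curr = curr.children.get(ch)
            if curr is None:
                return None
        return curr

    def search(self, word: str) -> bool:
        """True only if `word` was inserted as a complete word."""
        node = self._walk(word)
        return node is not None and node.is_end

    def starts_with(self, prefix: str) -> bool:
        """True if any inserted word begins with `prefix`."""
        return self._walk(prefix) is not None


trie = Trie()
trie.insert("go")
print(trie.search("go"), trie.search("g"), trie.starts_with("g"))  # True False True
```

Sharing one traversal helper between `search` and `starts_with` keeps the two methods from drifting apart, which may be relevant given the search regression noted in the PR description.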
Describe your change:
This adds a stream checker data structure for suffix matches.
The stream checker uses a Trie to find suffixes of the stream that match the words it was initialized with. Since a trie is a prefix tree that matches on prefixes, the words are inserted in reverse so that the Trie matches on suffixes instead.
Note that the Trie node is unchanged apart from its constructor, which is now initialized with a char.
BREAKING CHANGE
The Trie data structure no longer handles search correctly and will need to be refactored to cater for the changes that have been introduced.
Checklist:
Fixes: #{$ISSUE_NO}.