fix: use errors='replace' in Frame.__str__() for partial UTF-8 frames (fixes #1695) by naarob · Pull Request #1704 · python-websockets/websockets

naarob · 2026-03-26T06:34:16Z

Fixes UnicodeDecodeError when DEBUG logging is enabled and a large text message is fragmented at byte boundaries. See issue #1695 for full details.

data = repr(bytes(self.data).decode(errors="replace"))
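As a minimal sketch of what the one-line change does, the snippet below compares strict decoding with errors="replace" on a payload cut in the middle of a character. The sample string and variable names are illustrative assumptions, not taken from the library or its tests.

```python
# A text payload cut in the middle of a multi-byte UTF-8 sequence.
payload = "日本語".encode()   # 9 bytes, 3 bytes per character
partial = payload[:4]         # the first character plus 1 byte of the second

# Strict decoding (the default) rejects the incomplete trailing sequence.
try:
    partial.decode()
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# errors="replace" substitutes U+FFFD for the incomplete tail instead of raising.
shown = repr(bytes(partial).decode(errors="replace"))
print(strict_failed, shown)
```

Only the string built for display changes; the underlying bytes are not modified.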

9 new tests. 79 upstream tests pass. 0 regressions.

…python-websockets#1695)

Frame.__str__() decoded OP_TEXT frame data with a bare .decode(), which raises UnicodeDecodeError when the frame ends in the middle of a multi-byte UTF-8 sequence. This happens when the websockets library itself fragments a large text message at byte boundaries (not at character boundaries) into continuation frames (fin=False), e.g. for Japanese, Chinese, or emoji text.

When DEBUG logging is enabled, the UnicodeDecodeError propagated and caused the connection to close with code 1007 (INVALID_DATA), even though the message was valid. The data itself was fine; only the logging was broken.

Fix: add errors='replace' to the .decode() call in Frame.__str__(). This replaces incomplete sequences with U+FFFD (replacement character), keeping the log entry human-readable without ever crashing the connection.

Tests: 9 new tests covering partial Japanese, partial emoji, complete frames, ASCII, binary, and ping frames. 79 upstream tests unchanged.
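The failure mode described above can be reproduced without the library. The sketch below is a simulation under stated assumptions: fragment() and describe() are hypothetical helpers, not websockets APIs, and the 4-byte chunk size is chosen only to force splits inside characters.

```python
def fragment(data: bytes, size: int) -> list[bytes]:
    """Split data into chunks of at most `size` bytes, ignoring character boundaries."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def describe(chunk: bytes) -> str:
    """Hypothetical logging helper: decode for display only, never raise."""
    return repr(chunk.decode(errors="replace"))


message = "こんにちは 🎉"
chunks = fragment(message.encode(), 4)

# Each chunk can be logged safely, even when it ends mid-character.
log_lines = [describe(chunk) for chunk in chunks]

# The message itself is intact: the reassembled bytes decode strictly.
reassembled = b"".join(chunks).decode()
```

This mirrors the point made in the commit message: individual fragments may not be valid UTF-8 on their own, yet the complete message is, so a display-only decode should tolerate incomplete sequences.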


naarob wants to merge 1 commit into python-websockets:main from naarob:main
