Heap-use-after-free issue in pyexpat related to .ExternalEntityParserCreate #139400

@hartwork

Bug report

Bug description:

When CPython is configured with --with-address-sanitizer, the test below crashes on every version I tested (3.9 through 3.15):

    # Copyright (c) 2025 Sebastian Pipping <sebastian@pipping.org>
    # Licensed under Zero-Clause BSD ("0BSD")

    import pyexpat as expat
    import unittest


    class ParentParserLifetimeTest(unittest.TestCase):
        """
        Subparsers make use of their parent XML_Parser inside of Expat.
        As a result, parent parsers need to outlive subparsers.

        Regression test for issue 139400
        """

        def test_parent_parser_outlives_its_subparsers(self):
            parser = expat.ParserCreate()
            subparser = parser.ExternalEntityParserCreate(None)

            # Now try to cause garbage collection of the parent parser
            # while it's still being referenced by a related subparser
            del parser


    if __name__ == '__main__':
        unittest.main()

The finding was first documented at #139367 (comment).

For 3.13, the AddressSanitizer crash details are:
    # ./python ..test_file_with_test_class_above_added_dot_py.. -v
    test_use_after_free__crash (__main__.UseAfterFreeCrashDemoTest.test_use_after_free__crash) ... =================================================================
    ==16187==ERROR: AddressSanitizer: heap-use-after-free on address 0x7cda1ad7e038 at pc 0x7f4a1af8a94f bp 0x7ffc61f63720 sp 0x7ffc61f63710
    READ of size 8 at 0x7cda1ad7e038 thread T0
        #0 0x7f4a1af8a94e in getRootParserOf Modules/expat/xmlparse.c:8660
        #1 0x7f4a1af8a94e in expat_free Modules/expat/xmlparse.c:913
        #2 0x7f4a1af8a94e in expat_free Modules/expat/xmlparse.c:906
        #3 0x7f4a1af8a94e in PyExpat_XML_ParserFree Modules/expat/xmlparse.c:1997
        #4 0x7f4a1af70286 in xmlparse_dealloc Modules/pyexpat.c:1266
        #5 0x558a55f60555 in Py_DECREF Include/object.h:949
        #6 0x558a55f60555 in Py_XDECREF Include/object.h:1042
        #7 0x558a55f60555 in _PyFrame_ClearLocals Python/frame.c:104
        #8 0x558a55f60555 in _PyFrame_ClearExceptCode Python/frame.c:129
        #9 0x558a55e9df91 in clear_thread_frame Python/ceval.c:1682
        #10 0x558a55e9df91 in _PyEval_FrameClearAndPop Python/ceval.c:1709
        #11 0x558a55ec6b44 in _PyEval_EvalFrameDefault Python/generated_cases.c.h:5222
        #12 0x558a55b523d3 in _PyObject_VectorcallTstate Include/internal/pycore_call.h:168
        #13 0x558a55b523d3 in method_vectorcall Objects/classobject.c:93
        #14 0x558a55b4d5f6 in _PyVectorcall_Call Objects/call.c:273
        #15 0x558a55b4d5f6 in _PyObject_Call Objects/call.c:348
        #16 0x558a55b4d5f6 in PyObject_Call Objects/call.c:373
        #17 0x558a55eb1e8a in _PyEval_EvalFrameDefault Python/generated_cases.c.h:1362
        #18 0x558a55b4e383 in _PyObject_VectorcallDictTstate Objects/call.c:135
        #19 0x558a55b4e383 in _PyObject_Call_Prepend Objects/call.c:504
        #20 0x558a55ce88e6 in slot_tp_call Objects/typeobject.c:9570
        #21 0x558a55b470d1 in _PyObject_MakeTpCall Objects/call.c:242
        #22 0x558a55ebf7de in _PyEval_EvalFrameDefault Python/generated_cases.c.h:1850
        #23 0x558a55b523d3 in _PyObject_VectorcallTstate Include/internal/pycore_call.h:168
        #24 0x558a55b523d3 in method_vectorcall Objects/classobject.c:93
        #25 0x558a55b4d5f6 in _PyVectorcall_Call Objects/call.c:273
        #26 0x558a55b4d5f6 in _PyObject_Call Objects/call.c:348
        #27 0x558a55b4d5f6 in PyObject_Call Objects/call.c:373
        #28 0x558a55eb1e8a in _PyEval_EvalFrameDefault Python/generated_cases.c.h:1362
        #29 0x558a55b4e383 in _PyObject_VectorcallDictTstate Objects/call.c:135
        #30 0x558a55b4e383 in _PyObject_Call_Prepend Objects/call.c:504
        #31 0x558a55ce88e6 in slot_tp_call Objects/typeobject.c:9570
        #32 0x558a55b470d1 in _PyObject_MakeTpCall Objects/call.c:242
        #33 0x558a55ebf7de in _PyEval_EvalFrameDefault Python/generated_cases.c.h:1850
        #34 0x558a55b523d3 in _PyObject_VectorcallTstate Include/internal/pycore_call.h:168
        #35 0x558a55b523d3 in method_vectorcall Objects/classobject.c:93
        #36 0x558a55b4d5f6 in _PyVectorcall_Call Objects/call.c:273
        #37 0x558a55b4d5f6 in _PyObject_Call Objects/call.c:348
        #38 0x558a55b4d5f6 in PyObject_Call Objects/call.c:373
        #39 0x558a55eb1e8a in _PyEval_EvalFrameDefault Python/generated_cases.c.h:1362
        #40 0x558a55b4e383 in _PyObject_VectorcallDictTstate Objects/call.c:135
        #41 0x558a55b4e383 in _PyObject_Call_Prepend Objects/call.c:504
        #42 0x558a55ce88e6 in slot_tp_call Objects/typeobject.c:9570
        #43 0x558a55b470d1 in _PyObject_MakeTpCall Objects/call.c:242
        #44 0x558a55eaa121 in _PyEval_EvalFrameDefault Python/generated_cases.c.h:813
        #45 0x558a55b4e383 in _PyObject_VectorcallDictTstate Objects/call.c:135
        #46 0x558a55b4e383 in _PyObject_Call_Prepend Objects/call.c:504
        #47 0x558a55cfb652 in slot_tp_init Objects/typeobject.c:9816
        #48 0x558a55cd8107 in type_call Objects/typeobject.c:1997
        #49 0x558a55b470d1 in _PyObject_MakeTpCall Objects/call.c:242
        #50 0x558a55eaa121 in _PyEval_EvalFrameDefault Python/generated_cases.c.h:813
        #51 0x558a55ed8b3e in _PyEval_EvalFrame Include/internal/pycore_ceval.h:119
        #52 0x558a55ed8b3e in _PyEval_Vector Python/ceval.c:1820
        #53 0x558a55ed8b3e in PyEval_EvalCode Python/ceval.c:604
        #54 0x558a5600eb0e in run_eval_code_obj Python/pythonrun.c:1381
        #55 0x558a5600eb0e in run_eval_code_obj Python/pythonrun.c:1348
        #56 0x558a5600f127 in run_mod Python/pythonrun.c:1489
        #57 0x558a560138d0 in pyrun_file Python/pythonrun.c:1295
        #58 0x558a560138d0 in _PyRun_SimpleFileObject Python/pythonrun.c:517
        #59 0x558a5601421c in _PyRun_AnyFileObject Python/pythonrun.c:77
        #60 0x558a560833ec in pymain_run_file_obj Modules/main.c:410
        #61 0x558a560833ec in pymain_run_file Modules/main.c:429
        #62 0x558a560833ec in pymain_run_python Modules/main.c:696
        #63 0x558a56085156 in Py_RunMain Modules/main.c:775
        #64 0x558a56085156 in pymain_main Modules/main.c:805
        #65 0x558a56085156 in Py_BytesMain Modules/main.c:829
        #66 0x7f4a1b9a733f (/lib64/libc.so.6+0x2733f)
        #67 0x7f4a1b9a73f8 in __libc_start_main (/lib64/libc.so.6+0x273f8)
        #68 0x558a55a23754 in _start ([..]/cpython/python+0x19c754)

    0x7cda1ad7e038 is located 952 bytes inside of 1096-byte region [0x7cda1ad7dc80,0x7cda1ad7e0c8)
    freed by thread T0 here:
        #0 0x7f4a1bd6b9eb (/usr/lib/gcc/x86_64-pc-linux-gnu/15/libasan.so.8+0x11f9eb)
        #1 0x7f4a1af87e12 in expat_free Modules/expat/xmlparse.c:934
        #2 0x7f4a1af87e12 in expat_free Modules/expat/xmlparse.c:906
        #3 0x7f4a1af87e12 in PyExpat_XML_ParserFree Modules/expat/xmlparse.c:2011
        #4 0x7f4a1af70286 in xmlparse_dealloc Modules/pyexpat.c:1266
        #5 0x558a55f60555 in Py_DECREF Include/object.h:949
        #6 0x558a55f60555 in Py_XDECREF Include/object.h:1042
        #7 0x558a55f60555 in _PyFrame_ClearLocals Python/frame.c:104
        #8 0x558a55f60555 in _PyFrame_ClearExceptCode Python/frame.c:129
        #9 0x558a55e9df91 in clear_thread_frame Python/ceval.c:1682
        #10 0x558a55e9df91 in _PyEval_FrameClearAndPop Python/ceval.c:1709
        #11 0x558a55ec6b44 in _PyEval_EvalFrameDefault Python/generated_cases.c.h:5222
        #12 0x558a55b523d3 in _PyObject_VectorcallTstate Include/internal/pycore_call.h:168
        #13 0x558a55b523d3 in method_vectorcall Objects/classobject.c:93
        #14 0x558a55b4d5f6 in _PyVectorcall_Call Objects/call.c:273
        #15 0x558a55b4d5f6 in _PyObject_Call Objects/call.c:348
        #16 0x558a55b4d5f6 in PyObject_Call Objects/call.c:373
        #17 0x558a55eb1e8a in _PyEval_EvalFrameDefault Python/generated_cases.c.h:1362
        #18 0x558a55b4e383 in _PyObject_VectorcallDictTstate Objects/call.c:135
        #19 0x558a55b4e383 in _PyObject_Call_Prepend Objects/call.c:504
        #20 0x558a55ce88e6 in slot_tp_call Objects/typeobject.c:9570
        #21 0x558a55b470d1 in _PyObject_MakeTpCall Objects/call.c:242
        #22 0x558a55ebf7de in _PyEval_EvalFrameDefault Python/generated_cases.c.h:1850
        #23 0x558a55b523d3 in _PyObject_VectorcallTstate Include/internal/pycore_call.h:168
        #24 0x558a55b523d3 in method_vectorcall Objects/classobject.c:93
        #25 0x558a55b4d5f6 in _PyVectorcall_Call Objects/call.c:273
        #26 0x558a55b4d5f6 in _PyObject_Call Objects/call.c:348
        #27 0x558a55b4d5f6 in PyObject_Call Objects/call.c:373
        #28 0x558a55eb1e8a in _PyEval_EvalFrameDefault Python/generated_cases.c.h:1362
        #29 0x558a55b4e383 in _PyObject_VectorcallDictTstate Objects/call.c:135
        #30 0x558a55b4e383 in _PyObject_Call_Prepend Objects/call.c:504
        #31 0x558a55ce88e6 in slot_tp_call Objects/typeobject.c:9570
        #32 0x558a55b470d1 in _PyObject_MakeTpCall Objects/call.c:242
        #33 0x558a55ebf7de in _PyEval_EvalFrameDefault Python/generated_cases.c.h:1850
        #34 0x558a55b523d3 in _PyObject_VectorcallTstate Include/internal/pycore_call.h:168
        #35 0x558a55b523d3 in method_vectorcall Objects/classobject.c:93
        #36 0x558a55b4d5f6 in _PyVectorcall_Call Objects/call.c:273
        #37 0x558a55b4d5f6 in _PyObject_Call Objects/call.c:348
        #38 0x558a55b4d5f6 in PyObject_Call Objects/call.c:373
        #39 0x558a55eb1e8a in _PyEval_EvalFrameDefault Python/generated_cases.c.h:1362
        #40 0x558a55b4e383 in _PyObject_VectorcallDictTstate Objects/call.c:135
        #41 0x558a55b4e383 in _PyObject_Call_Prepend Objects/call.c:504
        #42 0x558a55ce88e6 in slot_tp_call Objects/typeobject.c:9570
        #43 0x558a55b470d1 in _PyObject_MakeTpCall Objects/call.c:242
        #44 0x558a55eaa121 in _PyEval_EvalFrameDefault Python/generated_cases.c.h:813
        #45 0x558a55b4e383 in _PyObject_VectorcallDictTstate Objects/call.c:135
        #46 0x558a55b4e383 in _PyObject_Call_Prepend Objects/call.c:504
        #47 0x558a55cfb652 in slot_tp_init Objects/typeobject.c:9816
        #48 0x558a55cd8107 in type_call Objects/typeobject.c:1997

    previously allocated by thread T0 here:
        #0 0x7f4a1bd6ceab in malloc (/usr/lib/gcc/x86_64-pc-linux-gnu/15/libasan.so.8+0x120eab)
        #1 0x7f4a1af8b227 in parserCreate Modules/expat/xmlparse.c:1364
        #2 0x7f4a1af6f0d1 in newxmlparseobject Modules/pyexpat.c:1211
        #3 0x7f4a1af6f0d1 in pyexpat_ParserCreate_impl Modules/pyexpat.c:1609
        #4 0x7f4a1af6f0d1 in pyexpat_ParserCreate Modules/clinic/pyexpat.c.h:511
        #5 0x558a55c4ed39 in cfunction_vectorcall_FASTCALL_KEYWORDS Objects/methodobject.c:440
        #6 0x558a55b48813 in _PyObject_VectorcallTstate Include/internal/pycore_call.h:168
        #7 0x558a55b48813 in PyObject_Vectorcall Objects/call.c:327
        #8 0x558a55eaa121 in _PyEval_EvalFrameDefault Python/generated_cases.c.h:813
        #9 0x558a55b523d3 in _PyObject_VectorcallTstate Include/internal/pycore_call.h:168
        #10 0x558a55b523d3 in method_vectorcall Objects/classobject.c:93
        #11 0x558a55b4d5f6 in _PyVectorcall_Call Objects/call.c:273
        #12 0x558a55b4d5f6 in _PyObject_Call Objects/call.c:348
        #13 0x558a55b4d5f6 in PyObject_Call Objects/call.c:373
        #14 0x558a55eb1e8a in _PyEval_EvalFrameDefault Python/generated_cases.c.h:1362
        #15 0x558a55b4e383 in _PyObject_VectorcallDictTstate Objects/call.c:135
        #16 0x558a55b4e383 in _PyObject_Call_Prepend Objects/call.c:504
        #17 0x558a55ce88e6 in slot_tp_call Objects/typeobject.c:9570
        #18 0x558a55b470d1 in _PyObject_MakeTpCall Objects/call.c:242
        #19 0x558a55ebf7de in _PyEval_EvalFrameDefault Python/generated_cases.c.h:1850
        #20 0x558a55b523d3 in _PyObject_VectorcallTstate Include/internal/pycore_call.h:168
        #21 0x558a55b523d3 in method_vectorcall Objects/classobject.c:93
        #22 0x558a55b4d5f6 in _PyVectorcall_Call Objects/call.c:273
        #23 0x558a55b4d5f6 in _PyObject_Call Objects/call.c:348
        #24 0x558a55b4d5f6 in PyObject_Call Objects/call.c:373
        #25 0x558a55eb1e8a in _PyEval_EvalFrameDefault Python/generated_cases.c.h:1362
        #26 0x558a55b4e383 in _PyObject_VectorcallDictTstate Objects/call.c:135
        #27 0x558a55b4e383 in _PyObject_Call_Prepend Objects/call.c:504
        #28 0x558a55ce88e6 in slot_tp_call Objects/typeobject.c:9570
        #29 0x558a55b470d1 in _PyObject_MakeTpCall Objects/call.c:242
        #30 0x558a55ebf7de in _PyEval_EvalFrameDefault Python/generated_cases.c.h:1850
        #31 0x558a55b523d3 in _PyObject_VectorcallTstate Include/internal/pycore_call.h:168
        #32 0x558a55b523d3 in method_vectorcall Objects/classobject.c:93
        #33 0x558a55b4d5f6 in _PyVectorcall_Call Objects/call.c:273
        #34 0x558a55b4d5f6 in _PyObject_Call Objects/call.c:348
        #35 0x558a55b4d5f6 in PyObject_Call Objects/call.c:373
        #36 0x558a55eb1e8a in _PyEval_EvalFrameDefault Python/generated_cases.c.h:1362
        #37 0x558a55b4e383 in _PyObject_VectorcallDictTstate Objects/call.c:135
        #38 0x558a55b4e383 in _PyObject_Call_Prepend Objects/call.c:504
        #39 0x558a55ce88e6 in slot_tp_call Objects/typeobject.c:9570
        #40 0x558a55b470d1 in _PyObject_MakeTpCall Objects/call.c:242
        #41 0x558a55eaa121 in _PyEval_EvalFrameDefault Python/generated_cases.c.h:813
        #42 0x558a55b4e383 in _PyObject_VectorcallDictTstate Objects/call.c:135
        #43 0x558a55b4e383 in _PyObject_Call_Prepend Objects/call.c:504
        #44 0x558a55cfb652 in slot_tp_init Objects/typeobject.c:9816
        #45 0x558a55cd8107 in type_call Objects/typeobject.c:1997

    SUMMARY: AddressSanitizer: heap-use-after-free Modules/expat/xmlparse.c:8660 in getRootParserOf
    Shadow bytes around the buggy address:
      0x7cda1ad7dd80: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
      0x7cda1ad7de00: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
      0x7cda1ad7de80: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
      0x7cda1ad7df00: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
      0x7cda1ad7df80: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
    =>0x7cda1ad7e000: fd fd fd fd fd fd fd[fd]fd fd fd fd fd fd fd fd
      0x7cda1ad7e080: fd fd fd fd fd fd fd fd fd fa fa fa fa fa fa fa
      0x7cda1ad7e100: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x7cda1ad7e180: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
      0x7cda1ad7e200: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
      0x7cda1ad7e280: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
    Shadow byte legend (one shadow byte represents 8 application bytes):
      Addressable: 00
      Partially addressable: 01 02 03 04 05 06 07
      Heap left redzone: fa
      Freed heap region: fd
      Stack left redzone: f1
      Stack mid redzone: f2
      Stack right redzone: f3
      Stack after return: f5
      Stack use after scope: f8
      Global redzone: f9
      Global init order: f6
      Poisoned by user: f7
      Container overflow: fc
      Array cookie: ac
      Intra object redzone: bb
      ASan internal: fe
      Left alloca redzone: ca
      Right alloca redzone: cb
    ==16187==ABORTING
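The numbers in the report can be cross-checked with a little arithmetic. The sketch below is only a reading aid: it assumes the row labels in the shadow dump are application addresses and uses the legend's statement that one shadow byte covers 8 application bytes. The variable names are made up for illustration.

```python
# Values copied from the AddressSanitizer report above.
bad_address = 0x7CDA1AD7E038
region_start = 0x7CDA1AD7DC80
region_end = 0x7CDA1AD7E0C8       # exclusive upper bound of the freed region
shadow_row_start = 0x7CDA1AD7E000  # label of the row marked with "=>"

# Offset of the faulting read inside the freed allocation.
offset_in_region = bad_address - region_start

# Size of the freed allocation (the parser struct from parserCreate).
region_size = region_end - region_start

# Zero-based position of the bracketed shadow byte within its row,
# assuming each shadow byte stands for 8 application bytes.
shadow_index = (bad_address - shadow_row_start) // 8

print(offset_in_region, region_size, shadow_index)
```

The first two values match the "952 bytes inside of 1096-byte region" line, and the third matches the position of the bracketed fd in the marked row, so the read lands in memory that the report attributes to the parser allocated in parserCreate.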

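My understanding is that there is a bug in the graph of object relations: the subparser does not keep its parent parser alive, so the parent is freed first, and freeing the subparser afterwards reads the parent's already-freed memory (in getRootParserOf, called from XML_ParserFree).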
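The test's docstring states the lifetime requirement: parent parsers need to outlive their subparsers. The following is a minimal pure-Python model of that requirement, not pyexpat's actual code. The classes Parent, ChildWithoutRef and ChildWithRef are hypothetical stand-ins, and having the child hold a strong reference to its parent is only one possible remedy, shown here as an assumption rather than as the fix chosen for CPython.

```python
import gc
import weakref


class Parent:
    """Hypothetical stand-in for the parent parser object."""


class ChildWithoutRef:
    """Child that only observes its parent weakly, so it cannot keep it alive."""

    def __init__(self, parent):
        self._parent = weakref.ref(parent)

    def parent_alive(self):
        return self._parent() is not None


class ChildWithRef(ChildWithoutRef):
    """Child that also owns a strong reference, so the parent outlives it."""

    def __init__(self, parent):
        super().__init__(parent)
        self._keepalive = parent


def parent_survives(child_type):
    """Mirror the test: create parent and child, then drop the parent name."""
    parent = Parent()
    child = child_type(parent)
    del parent
    gc.collect()  # not needed on CPython, but helps on other implementations
    return child.parent_alive()


results = {
    "without_ref": parent_survives(ChildWithoutRef),
    "with_ref": parent_survives(ChildWithRef),
}
print(results)
```

In the first case the parent disappears as soon as the caller's name is deleted, which corresponds to the situation the test provokes. In the second case the parent stays alive for as long as the child exists.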

CC @picnixz

CPython versions tested on:

3.15, 3.14, 3.13, 3.12, 3.11, 3.10, 3.9

Operating systems tested on:

Other, Windows, macOS, Linux
