Fix OpenMP memory leak and segfault in fragment Hessian diagonalization #1366
+37 −22
This commit resolves a segmentation fault and excessive memory consumption in the GFN-FF fragment Hessian diagonalization (`frag_hess.f90`).

The original implementation used a large automatic array, `hess_mask` (size 3N x 3N), inside an OpenMP parallel region, which led to stack overflows on systems with moderate atom counts (~2000+). Additionally, an OpenMP `reduction(+:ev_calc)` on the eigenvector matrix caused each thread to allocate a private copy of the full matrix, resulting in O(N_threads * N_atoms^2) memory usage and OOM crashes.

Changes:

- Removed the `hess_mask` automatic array.
- Replaced the `reduction(+:ev_calc)` with an `!$omp critical` section.
- Replaced the array intrinsics (`pack`/`unpack`) with explicit loops to handle fragment data, improving efficiency and reducing memory overhead.

Outcomes:
Less technical story:
I ran into problems with frequency calculations when using `--bhess` on systems with 3k+ atoms. Setting `OMP_STACKSIZE` and `unlimit` did not help (the program simply ran until it ran out of memory).
So I debugged the cause by compiling with `-g -traceback -check bounds` and running the program under gdb to get:

Things to review:

What about the `!$omp do schedule(dynamic)`: is it better to use static or dynamic here?
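On the scheduling question, a minimal sketch of the pattern this PR describes may help frame the trade-off. It is not the code in `frag_hess.f90`: the module `frag_sketch`, the routine `accumulate_fragment_blocks`, and the names `frag_of`, `idx`, `blk` and `out` are hypothetical, and the diagonalization step is replaced by a placeholder comment so the example needs no LAPACK. Each thread gathers one fragment block into a small heap-allocated work array using explicit loops, then adds it to the shared matrix inside a named `!$omp critical` section, so no thread ever holds a private copy of the full matrix.

```fortran
module frag_sketch
   implicit none
   private
   public :: dp, accumulate_fragment_blocks
   integer, parameter :: dp = kind(1.0d0)
contains
   ! Hypothetical illustration only: copy every intra-fragment block of h
   ! into out, one fragment per loop iteration.
   subroutine accumulate_fragment_blocks(n, nfrag, frag_of, h, out)
      integer, intent(in) :: n, nfrag
      integer, intent(in) :: frag_of(n)   ! fragment id of each coordinate
      real(dp), intent(in) :: h(n, n)
      real(dp), intent(out) :: out(n, n)
      integer, allocatable :: idx(:)      ! per-thread, heap allocated
      real(dp), allocatable :: blk(:, :)  ! per-thread, fragment-sized only
      integer :: f, i, j, m

      out = 0.0_dp
      !$omp parallel default(none) shared(n, nfrag, frag_of, h, out) &
      !$omp private(f, i, j, m, idx, blk)
      !$omp do schedule(dynamic)
      do f = 1, nfrag
         ! Count the members of fragment f, then collect their indices
         ! (explicit loops instead of pack with an n x n mask).
         m = 0
         do i = 1, n
            if (frag_of(i) == f) m = m + 1
         end do
         allocate (idx(m), blk(m, m))
         m = 0
         do i = 1, n
            if (frag_of(i) == f) then
               m = m + 1
               idx(m) = i
            end if
         end do
         ! Gather the fragment block.
         do j = 1, m
            do i = 1, m
               blk(i, j) = h(idx(i), idx(j))
            end do
         end do
         ! Placeholder: the real code would diagonalize blk here.
         ! Scatter into the shared matrix, one thread at a time.
         !$omp critical (frag_scatter)
         do j = 1, m
            do i = 1, m
               out(idx(i), idx(j)) = out(idx(i), idx(j)) + blk(i, j)
            end do
         end do
         !$omp end critical (frag_scatter)
         deallocate (idx, blk)
      end do
      !$omp end do
      !$omp end parallel
   end subroutine accumulate_fragment_blocks
end module frag_sketch
```

In general, `schedule(static)` splits the iterations among threads up front and has the lowest overhead, while `schedule(dynamic)` hands out iterations as threads become free. If the fragments differ a lot in size, the per-iteration cost varies strongly (dense diagonalization scales roughly cubically with block size), so `dynamic` is likely the safer default; with many fragments of similar size, `static` may be marginally faster. Timing both on a representative system would settle it. Note also that if every fragment writes only to its own disjoint block, as in this sketch, the critical section is not strictly required for correctness.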