I’m using ParallelMap on a very large dataset (millions of elements), but it quickly consumes all available RAM.
For example:
    result = ParallelMap[func, data]

This causes RAM usage to fill up completely.
To reduce memory usage, I tried dividing the data into 20 or more parts, but it doesn't seem to help. I'm not sure whether this is the right approach and I simply need more chunks, or whether there is something fundamentally wrong with this method.
    blockSize = Ceiling[Length[data]/20];
    parts = Partition[data, blockSize, blockSize, {1, 1}, {}];
    result = Reap[
      Do[
        partialRes = ParallelMap[func, part, Method -> "FinestGrained"];
        Sow[partialRes];
        Clear[partialRes],
        {part, parts}
      ]
    ]

Even when splitting the data into 20 or more chunks, RAM usage still fills up completely.
Is there a better way to manage memory when using ParallelMap on large datasets?
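To illustrate the kind of thing I have in mind, here is a rough sketch in which each chunk's result is appended to a file rather than accumulated in RAM. The file name partialResults.wl and the chunk count of 20 are just placeholders, and I haven't verified that this actually keeps memory bounded:

    blockSize = Ceiling[Length[data]/20];
    parts = Partition[data, blockSize, blockSize, {1, 1}, {}];
    Do[
      partialRes = ParallelMap[func, part];
      (* append this chunk's result to a file instead of keeping it in memory *)
      PutAppend[partialRes, "partialResults.wl"];
      Clear[partialRes],
      {part, parts}
    ]
    (* later, the partial results can be read back with ReadList["partialResults.wl"] *)

Is something along these lines a reasonable direction, or is there a more idiomatic way to keep memory under control with the parallel tools themselves?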