I have an extremely large CSV file that contains only entries like the following:
    Nil,+1
    int,+1
    int,-1
    int,-1
    Nil,+1
    Nil,-1
    Dictionary,+1
    Dictionary,-1
    Array,+1
    Nil,+1
    String,+1

I have parsed the file in Wolfram via

    ds = Import["/path/to/large/file.txt", {"CSV", "Dataset"}, HeaderLines -> 0];
    listOfAssoc = (Association[Rule @@ #1]) & /@ (ds // Normal);
    Merge[listOfAssoc, Total]

which yields
<|"int" -> -6159, "Nil" -> 72282, "Array" -> -9, "Dictionary" -> -15, "String" -> -371, "bool" -> -266, "float" -> 15857, "RID" -> 0, "Rect2" -> 0, "Color" -> -23, "PoolVector2Array" -> 0, "PoolRealArray" -> 0, "PoolIntArray" -> 0, "Vector2" -> -10, "PoolStringArray" -> 0, "Transform" -> -2, "Transform2D" -> 0, "Object" -> 1042, "Vector3" -> 612, "PoolVector3Array" -> 0, "PoolColorArray" -> 0, "Plane" -> -4, "Quat" -> 0, "AABB" -> 0, "Basis" -> 0, "NodePath" -> 0, "PoolByteArray" -> 0|>
This looks correct as far as the actual computation goes: I'm adding up a bunch of +1's and -1's from log data to see whether certain types in another program are leaking memory, and this result was very helpful for that.
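
For illustration, here is the same construction applied to the eleven sample rows shown at the top; the commented result is what Merge[..., Total] returns for them:

    (* the sample rows from the top of the post, as {type, ±1} pairs *)
    sample = {{"Nil", 1}, {"int", 1}, {"int", -1}, {"int", -1}, {"Nil", 1},
       {"Nil", -1}, {"Dictionary", 1}, {"Dictionary", -1}, {"Array", 1},
       {"Nil", 1}, {"String", 1}};

    (* one single-entry association per row, then merge with Total *)
    Merge[Association[Rule @@ #] & /@ sample, Total]
    (* <|"Nil" -> 2, "int" -> -1, "Dictionary" -> 0, "Array" -> 1, "String" -> 1|> *)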
Problem.
The last line of this computation (Merge[listOfAssoc, Total]) takes several minutes (roughly 10 minutes). Am I doing something wrong?

Merge[list, Total] seems to have roughly $O(n^2)$ time complexity for a large number of non-empty associations in list. Maybe this is the source of the trouble? In that case you could perform the operation more efficiently with some sort of divide-and-conquer scheme, thanks to the properties of Total: replace Merge[listOfAssoc, Total] with

    First@NestWhile[Merge[Total] /@ Partition[#, UpTo@64] &, listOfAssoc, Length[#] > 1 &]

This changes the order of summation, which may or may not be relevant in your application, but at least yields much closer to linear time complexity.
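
As an additional sketch (not from the question or the comments above): since each CSV row contributes a single key/value pair, the per-type totals can also be computed directly from the imported rows with GroupBy, without building one association per row. This assumes ds is the two-column Dataset from the Import above.

    (* sketch: Normal[ds] is assumed to be a list of {type, ±1} rows *)
    rows = Normal[ds];

    (* group rows by the type in the first column, take the ±1 from the
       second column as the group values, and Total each group *)
    totals = GroupBy[rows, First -> Last, Total]

On the sample rows above this yields the same totals as the Merge pipeline; whether it is actually faster on the full file is something to check with AbsoluteTiming.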