Ben Izd

Split[ls, fn] calls fn on each item of Partition[ls, 2, 1], that is, on each pair of adjacent elements. So we can keep a running total in a temporary variable (here i): in every comparison the first value is added to i, and if i plus the second element is under 20, the function returns True, which keeps the two elements together. Otherwise i is reset to 0 and the function returns False, which adds a breakpoint.

    Block[{i = 0},
     Split[alist, (i += #1; If[i + #2 < 20, True, i = 0; False]) &]
    ]
    (* Out: {{2, 5, 1, 8, 1, 1}, {9, 7, 1}, {5, 2, 9}, {6, 2, 2, 2, 4, 3}, {2, 7}} *)
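To see which arguments the test function receives, here is a minimal sketch that records every pair Split hands to its test and compares the record with Partition. The short list and the name pairs are only for illustration and are not part of the original code.

```wolfram
(* Record every pair that Split passes to its test function. *)
(* The test always returns True, so nothing is actually split. *)
pairs = Reap[Split[{2, 5, 1, 8}, (Sow[{#1, #2}]; True) &]][[2, 1]];

(* Compare the recorded pairs with the overlapping pairs from Partition. *)
pairs === Partition[{2, 5, 1, 8}, 2, 1]
```

If the description above holds, the comparison evaluates to True, which is what lets a running total in i be carried from one comparison to the next.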

Using @rhermans' benchmark code on a Ryzen 1700 (AbsoluteTiming):

[Image: benchmark timings]

Update 1

I never thought it would be this competitive. If you're looking to squeeze out as much speed as possible, we could use:

Block[{i = 0}, Split[#, If[(i += #1) + #2 < 20, True, i = 0] &]] & 

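Here the function never returns False explicitly: on a break it returns the value of i = 0, which is 0. This still works, probably because Split applies something like TrueQ to the result internally.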
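As a quick check, the sketch below runs both versions on the same input and compares the results. The list alist is reconstructed here by flattening the output shown earlier, which is an assumption, since its original definition is in the question. The names v1 and v2 are only for illustration.

```wolfram
(* Assumed input: the earlier output, flattened back into one list. *)
alist = {2, 5, 1, 8, 1, 1, 9, 7, 1, 5, 2, 9, 6, 2, 2, 2, 4, 3, 2, 7};

(* First version: returns an explicit False on a break. *)
v1 = Block[{i = 0},
   Split[alist, (i += #1; If[i + #2 < 20, True, i = 0; False]) &]];

(* Shorter version: returns 0 (the value of i = 0) on a break. *)
v2 = (Block[{i = 0}, Split[#, If[(i += #1) + #2 < 20, True, i = 0] &]] &)[alist];

(* Both versions should produce the same groups. *)
v1 === v2
```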

Update 2

Reading @Syed's answer inspired me to add a general function, just like his cSplit3 but faster ;)

It applies fn to the span data[[i ;; j]], starting with i = 1 and increasing j, until fn returns False. At that point the current position j is Sown and i is set to j, so the next group starts there. After that, Differences turns the sown positions into group lengths, and TakeList does the rest.

    ClearAll[customSplit];
    customSplit[data_, fn_] := Block[{i = 1},
      TakeList[data,
       Differences@
        Reap[
          Sow[1];
          Do[If[fn[data[[i ;; j]]], Null, Sow[i = j];], {j, Length@data}];
          Sow[Length@data + 1];
        ][[2, 1]]]
     ]

It doesn't score well on readability, but it's fast.
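To make the intermediate steps visible, here is a sketch that runs the same loop as customSplit but stops to show the sown positions and the group lengths. The names starts and lengths are hypothetical, the test Total[#] < 20 & mirrors the first example, and alist is again assumed to be the earlier output flattened.

```wolfram
(* Assumed input: the earlier output, flattened back into one list. *)
alist = {2, 5, 1, 8, 1, 1, 9, 7, 1, 5, 2, 9, 6, 2, 2, 2, 4, 3, 2, 7};
fn = (Total[#] < 20 &);

(* Same loop as customSplit, but keeping the sown start positions. *)
starts = Block[{i = 1},
   Reap[
     Sow[1];
     Do[If[fn[alist[[i ;; j]]], Null, Sow[i = j]], {j, Length@alist}];
     Sow[Length@alist + 1]
   ][[2, 1]]];
(* starts: {1, 7, 10, 13, 19, 21} *)

(* Gaps between consecutive start positions are the group lengths. *)
lengths = Differences[starts];
(* lengths: {6, 3, 3, 6, 2} *)

TakeList[alist, lengths]
(* {{2, 5, 1, 8, 1, 1}, {9, 7, 1}, {5, 2, 9}, {6, 2, 2, 2, 4, 3}, {2, 7}} *)
```

With this test the groups match the Split version above, because each group is closed as soon as adding the next element would bring its total to 20 or more.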

Test

Assuming:

    SeedRandom[1];
    alist = RandomInteger[{1, 10}, 10000];
    blist = RandomReal[{0, 3}, 10000];
    wtdata = Transpose[{alist, blist}];

    tex1 = cSplit3[wtdata, Total@(Times @@@ #) < 30 &]; // MaxMemoryUsed // RepeatedTiming
    (* Out: {0.163039, 282136} *)

    tex2 = customSplit[wtdata, Total@(Times @@@ #) < 30 &]; // MaxMemoryUsed // RepeatedTiming
    (* Out: {0.062228, 278920} *)

    tex2 == tex1
    (* Out: True *)

And finally, the most delicious part (courtesy of @AlexeyPopkov and @ChrisDegnen):

[Image]
