Most efficient way of making pair list from a list

Question

I have a bunch of datasets which look like:

3 7 1 6 5 8 2 4 1 2 8 1 5 2 5 ...

and I need to make a list of pairs such that the first element makes a pair with other elements in the same line.

3 7 3 1 3 6 ...

I usually use Table because it is fast enough for $10^3$ elements, but as my dataset gets bigger, it's no longer efficient. For example:

data = RandomInteger[{1, 10}, {10^7, 5}]; Flatten[Table[{data[[i, 1]], data[[i, j]]}, {i, 1, Length@data}, {j, 2, 5}], 1]//AbsoluteTiming//First (* 39.828744 *)

What is the most efficient way to find this pair list?

Timing

Finally I had a chance to do a timing on my real machine:

(* Kuba *) 0.726869 (* ciao *) 1.297926 (* nested table*) 2.146111 (* Shutao Tang *) 11.290944 (* Mr.Wizard♦ *) 57.232387

Although it seems results can be (completely) different if I do a timing on my old laptop!

Are the elements always a machine size Integer? Or always a machine-precision Real? Or mixture of different types of data? — Michael E2
– Michael E2, Commented Apr 11, 2015 at 21:38

Mr.Wizard · Accepted Answer · 2015-04-11 22:55:52Z

16

About 4x faster:

Partition[Flatten @ data[[All, {1, 2, 1, 3, 1, 4, 1, 5}]], 2]

edited Apr 11, 2015 at 22:55

Mr.Wizard

275k34 gold badges606 silver badges1.5k bronze badges

answered Apr 11, 2015 at 22:16

Kuba

139k13 gold badges297 silver badges803 bronze badges

1

$\begingroup$ Shouldn't you make this more general? $\endgroup$

Mr.Wizard
– Mr.Wizard

2015-04-11 22:32:30 +00:00
Commented Apr 11, 2015 at 22:32
$\begingroup$ It depends. If it is slow, probably doesn't matter. If it is fast, ok. But atm I'm confused because I'm testing your and rasher's method with result that your is 20x slower than OP's. So as soon as I figure it out I will tell you :P $\endgroup$

Kuba
– Kuba

2015-04-11 22:35:48 +00:00
Commented Apr 11, 2015 at 22:35
$\begingroup$ LOL -- looks like I wrote a function for the wrong shape of data. Okay, +1, and back to the drawing board. $\endgroup$

Mr.Wizard
– Mr.Wizard

2015-04-11 22:43:17 +00:00
Commented Apr 11, 2015 at 22:43
$\begingroup$ @Mr.Wizard Ok :) I have to go but I will make it more general tomorrow, if it is worth it :). Feel free to delete my non relevant comments when you change something. Good night. :) $\endgroup$

Kuba
– Kuba

2015-04-11 22:46:52 +00:00
Commented Apr 11, 2015 at 22:46
$\begingroup$ Ah, clean. I feel herp-a-derp for not thinking of that... +1 $\endgroup$

ciao
– ciao

2015-04-12 01:54:05 +00:00
Commented Apr 12, 2015 at 1:54

Add a comment |

ciao · Accepted Answer · 2015-04-11 21:58:45Z

I think you'll find this faster:

pairem[data_] := Module[{c = ConstantArray[0, 8*(Length@data)], p = Flatten@data[[All, 2 ;;]], p2 = data[[All, 1]]}, c[[2 ;; ;; 2]] = p; c[[1 ;; ;; 2]] = Flatten@Transpose[ConstantArray[p2, 4]]; Partition[c, 2]]

@Pickett I don't know, I still get only 2.5 speed up here, maybe it is hardware related. — Kuba
– Kuba, Commented Apr 11, 2015 at 22:27

Community · Accepted Answer · 2017-04-13 12:55:59Z

Edit: this method is optimized for long sublists which is exactly the opposite of your example. Sorry. I'll post if I find anything applicable.

This is fairly clean and on larger data at least as fast as rasher's code:

fn = ArrayFlatten[{{First@#, Rest@# ~Partition~ 1 }}] &;

Test:

fn /@ {{3, 7, 1, 6, 5}, {8, 2, 4, 1, 2}, {8, 1, 5, 2, 5}}

{{{3, 7}, {3, 1}, {3, 6}, {3, 5}}, {{8, 2}, {8, 4}, {8, 1}, {8, 2}}, {{8, 1}, {8, 5}, {8, 2}, {8, 5}}}

x = RandomInteger[99, {5000, 5000}]; fn /@ x // AbsoluteTiming // First pairem[x] // AbsoluteTiming // First

0.334019 0.383022

A related question: Prepend 0 to sublists

Try with x = RandomInteger[99, {10^6, 5}] which is really the case. — Kuba
– Kuba, Commented Apr 11, 2015 at 22:39
@Kuba, x = RandomInteger[99, {10^6, 5}];pairem[x]; // AbsoluteTiming fn /@ x; // AbsoluteTiming ===> {0.2158204, Null},{13.1357421, Null} — xyz
– xyz, Commented Apr 12, 2015 at 5:12

Carl Woll · Accepted Answer · 2018-05-03 03:14:22Z

Here is a refinement of @Kuba's clever Part syntax:

data = RandomInteger[{1,10},{10^7,5}]; r1 = Partition[ Flatten @ data[[All,{1,2,1,3,1,4,1,5}]], 2 ]; //AbsoluteTiming r2 = ArrayReshape[ data[[All, {1,2,1,3,1,4,1,5}]], {4 Length[data], 2} ]; //AbsoluteTiming r1===r2

{1.55516, Null}

{0.568824, Null}

True

xyz · Accepted Answer · 2015-04-12 05:09:33Z

My trail:

data = RandomInteger[{1, 10}, {8, 5}]

{{1, 1, 9, 10, 6}, {7, 2, 8, 5, 4}, {2, 1, 10, 1, 7}, {9, 1, 4, 5, 2}, {6, 10, 6, 5, 10}, {2, 10, 1, 7, 4}, {3, 5, 3, 2, 2}, {1, 2, 9, 6, 2}}

Thread@{#1, {##2}} & @@@ data

{ {{1, 1}, {1, 9}, {1, 10}, {1, 6}}, {{7, 2}, {7, 8}, {7, 5}, {7, 4}}, {{2, 1}, {2, 10}, {2, 1}, {2, 7}}, {{9, 1}, {9, 4}, {9, 5}, {9, 2}}, {{6, 10}, {6, 6}, {6, 5}, {6, 10}}, {{2, 10}, {2, 1}, {2, 7}, {2, 4}}, {{3, 5}, {3, 3}, {3, 2}, {3, 2}}, {{1, 2}, {1,9}, {1, 6}, {1, 2}} }

Performance test

x = RandomInteger[99, {10^6, 5}]; Thread@{#1, {##2}} & @@@ x; // AbsoluteTiming

{2.3535156, Null}

Pretty, so +1, but much slower even than OP code, who is looking for efficiency. — ciao
– ciao, Commented Apr 12, 2015 at 5:01
@rasher, The high-efficiency methods are pairem and @Kuba method currently — xyz
– xyz, Commented Apr 12, 2015 at 5:22

user4405 · Accepted Answer · 2015-04-16 18:58:31Z

Just for fun a solution with Riffle:

riffle[d_] := Module[{data = Transpose[d]}, Partition[Flatten[Transpose[Rest[Riffle[data, {First[data]}]]]], 2] ]

Stack Exchange Network

Most efficient way of making pair list from a list

Timing

6 Answers 6

Linked

Hot Network Questions

Most efficient way of making pair list from a list

Timing

6 Answers 6

Linked

Related

Hot Network Questions