added 78 characters in body
Peter.O

I assume your large files are already sorted. The following method requires no further sorting.

You can simply add leading zeros to the keys with sed, so that the numeric order of the keys is also the lexical order that join expects. Because the process is pipelined, there are no temporary files to deal with, and the sed overhead is trivial.


    # Make each key 9 digits wide:
    #   1st substitution: add 9 leading 0's
    #   2nd substitution: remove excess 0's
    join -a 1 -1 1 <(sed -r 's/^([0-9]+)/000000000\1/; s/^0+([0-9]{9})/\1/' file1) \
                   <(sed -r 's/^([0-9]+)/000000000\1/; s/^0+([0-9]{9})/\1/' file2)

Output is:

    000000001 lkj klj lkj
    000000002 lkj lkj lkj
    000000003
    000000004
    000000005
    000000006
    000000007 lkj lkj lkj
    000000008
    000000009
    000000010
    000000011 lkk kll lkk
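To see what the two substitutions do on their own, here is a minimal sketch that runs them over a few made-up lines. The helper name `pad_keys` and the sample input are assumptions for illustration, not part of the original command.

```bash
# pad_keys: hypothetical helper wrapping the two sed substitutions above.
pad_keys() {
  # 1st s///: prepend nine zeros to the leading run of digits.
  # 2nd s///: drop surplus zeros so that nine digits remain.
  sed -r 's/^([0-9]+)/000000000\1/; s/^0+([0-9]{9})/\1/'
}

# Made-up sample lines: a short key with fields, a bare key, a 9-digit key.
padded=$(printf '%s\n' '7 lkj lkj lkj' '10' '123456789 x' | pad_keys)
printf '%s\n' "$padded"
# 000000007 lkj lkj lkj
# 000000010
# 123456789 x
```

Note that a key longer than nine digits passes through unpadded, so the width (the run of zeros and the `{9}`) should be at least as large as the longest key in either file.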

If you don't want the leading zeros in the output, use this command instead.
The extra sed -r 's/^0+//' removes leading zeros.

    join -a 1 -1 1 <(sed -r 's/^([0-9]+)/000000000\1/; s/^0+([0-9]{9})/\1/' file1) \
                   <(sed -r 's/^([0-9]+)/000000000\1/; s/^0+([0-9]{9})/\1/' file2) |
      sed -r 's/^0+//'

Output:

    1 lkj klj lkj
    2 lkj lkj lkj
    3
    4
    5
    6
    7 lkj lkj lkj
    8
    9
    10
    11 lkk kll lkk
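For anyone who wants to try the whole pipeline end to end, the following is a rough, self-contained sketch. The sample file contents are my own guess at input that would produce the output shown; the `pad` helper and the temporary directory are likewise assumptions. It needs bash, because of the `<( ... )` process substitution.

```bash
#!/usr/bin/env bash
# Hypothetical sample data. Keys are in numeric order, which is NOT the
# lexical order join expects (10 and 11 would sort before 2).
tmp=$(mktemp -d)
seq 1 11 > "$tmp/file1"
printf '%s\n' '1 lkj klj lkj' '2 lkj lkj lkj' \
              '7 lkj lkj lkj' '11 lkk kll lkk' > "$tmp/file2"

# pad: hypothetical helper; zero-pads the leading key of each line to 9 digits.
pad() { sed -r 's/^([0-9]+)/000000000\1/; s/^0+([0-9]{9})/\1/' "$1"; }

# Join on the padded keys, keep unpaired lines from file1 (-a 1),
# then strip the padding from the result.
result=$(join -a 1 -1 1 <(pad "$tmp/file1") <(pad "$tmp/file2") | sed -r 's/^0+//')
printf '%s\n' "$result"

rm -r "$tmp"
```

One edge case to be aware of: if a key can be `0`, the final `s/^0+//` removes it entirely. Using `s/^0+([0-9])/\1/` for the last step instead should leave a single `0` in place.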
