I have a large text file (666000 columns) in the format

A B C D E F 

Desired output

AB CD EF 

How can I do this in sed or awk? I have tried a couple of things, but nothing seems to work. Please suggest something.

  • Do you mean 666000 columns or 666000 rows? Commented Sep 27, 2013 at 0:29

3 Answers

3

In sed:

sed 's! \([^ ]\+\)\( \|$\)!\1 !g' your_file 

This will make the substitutions and print the result to standard output. To modify the file in place, add the -i switch:

sed -i 's! \([^ ]\+\)\( \|$\)!\1 !g' your_file 

Explanation

This sed command looks for a space, followed by at least one non-space character, followed by a space or the end of the line, and replaces that sequence with the non-space characters it found followed by a single space. The ! characters are just the delimiters of the s command. Because the g modifier is supplied at the end, the substitution is applied as many times as possible across the line (a global substitution). So, with a sequence like A B C, sed finds the pattern " B " and replaces it with "B ", leaving you with AB C as the final result. Note that \+ and \| are GNU sed extensions.
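To see the substitution at work, the sample line from the question can be piped through the same expression. This is only a minimal sketch: it assumes GNU sed, and merge_pairs is a hypothetical helper name introduced for this illustration.

```shell
# merge_pairs: hypothetical wrapper around the sed expression from this answer.
# Assumes GNU sed, because \+ and \| are GNU extensions to basic regexes.
merge_pairs() {
    sed 's! \([^ ]\+\)\( \|$\)!\1 !g'
}

# Feed the sample line from the question through the filter.
printf 'A B C D E F\n' | merge_pairs
```

Every merged pair is followed by a single space, so the output line ends with a trailing space, just like the desired output in the question.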

Assumptions made by this code

This code assumes that your columns are separated by exactly one space character each and not, for example, by TABs or runs of spaces. This is easily fixed at the expense of readability:

sed 's![[:blank:]]\+\([^[:blank:]]\+\)\([[:blank:]]\+\|$\)!\1 !g' your_file 
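As a quick check of the [[:blank:]] variant, the sketch below feeds it a line that mixes TABs and runs of spaces. It again assumes GNU sed; merge_pairs_blank is a hypothetical name used only here.

```shell
# merge_pairs_blank: hypothetical wrapper around the [[:blank:]] variant.
# Assumes GNU sed (\+ and \| are GNU extensions).
merge_pairs_blank() {
    sed 's![[:blank:]]\+\([^[:blank:]]\+\)\([[:blank:]]\+\|$\)!\1 !g'
}

# Columns separated by a TAB, two spaces, and a TAB plus a space.
printf 'A\tB  C\t D\n' | merge_pairs_blank
```

Whatever mix of blanks separated the columns, each merged pair in the output is followed by exactly one space.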
2
  • awk:

    awk '{printf "%s", $1 $2; for (i = 3; i <= NF; i += 2) printf " %s", $i $(i+1); print ""}' file 

    This will probably be the faster of the two for large files.

  • Perl:

    perl -pe 's/([^\s]+)\s+([^\s]+)/$1$2/g' file 
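The awk approach can be tried on a small sample before running it on the real file. The sketch below is one possible formulation, not necessarily the fastest: merge_pairs_awk is a hypothetical helper name, and the second input line has an odd number of columns to show that the last column is passed through unpaired.

```shell
# merge_pairs_awk: hypothetical helper that joins columns 1+2, 3+4, and so on.
merge_pairs_awk() {
    awk '{
        for (i = 1; i <= NF; i += 2) {
            # Separate pairs with one space; $(i + 1) is empty when NF is odd,
            # so a trailing unpaired column is printed on its own.
            printf "%s%s%s", (i > 1 ? " " : ""), $i, $(i + 1)
        }
        print ""   # end the output line
    }'
}

printf 'A B C D E F\nG H I\n' | merge_pairs_awk
```

Passing the fields as arguments to a fixed format string, rather than using them as the format itself, keeps the output intact even if the data contains % characters.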
0

If your file indeed has that many columns, one option is to use gawk to treat each column as a record by setting RS to "one or more whitespace characters". This helps avoid having to set up a loop through the columns. Note that this solution is fragile in the face of an odd number of columns in a line.

gawk --re-interval -v RS='[[:space:]]{1,}' '{x = $0; getline; printf "%s%s%s", x, $0, RT}' file 
