2

I have a file like so

Hello,Hi,Hullo,Hammers,Based,Random 

For n=2, output must be like so

Hello,Hi
Hullo,Hammers
Based,Random

For n=3, output must be like so

Hello,Hi,Hullo
Hammers,Based,Random

How could I accomplish this using awk/sed?

Edit: n is a factor of the number of fields.

2
  • 4
    For n=4, should the last line be Based,Random or Based,Random,,? The former does not fulfill "specific number of fields", but the latter does not entirely come from "splitting a single line of fields". Commented Jul 1, 2022 at 23:48
  • Ed Morton is correct, I forgot to mention that it can be assumed that number of fields divides evenly by 'n' Commented Jul 19, 2022 at 14:44

9 Answers

4
$ awk -v n=2 -F',' '{for (i=1;i<=NF;i++) printf "%s%s", $i, (i%n ? FS : ORS)}' file
Hello,Hi
Hullo,Hammers
Based,Random

$ awk -v n=3 -F',' '{for (i=1;i<=NF;i++) printf "%s%s", $i, (i%n ? FS : ORS)}' file
Hello,Hi,Hullo
Hammers,Based,Random

In your question you didn't address how to handle cases where the number of fields doesn't divide evenly by n, so I haven't addressed it here either.
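For comparison, the same "separator after most fields, newline after every n-th field" rule can be sketched in Python. The function name `regroup` is hypothetical, and ending a short final group with a newline is an assumption that goes beyond the awk one-liner, which leaves that case unhandled.

```python
def regroup(line, n, sep=","):
    """Split `line` on `sep` and return text with at most `n` fields per line."""
    fields = line.rstrip("\n").split(sep)
    out = []
    for i, field in enumerate(fields, start=1):
        out.append(field)
        # Same rule as the awk loop: a newline after every n-th field,
        # the field separator otherwise.
        # Assumption: the very last field also ends its line, so a short
        # final group is printed as-is instead of ending in a separator.
        if i % n == 0 or i == len(fields):
            out.append("\n")
        else:
            out.append(sep)
    return "".join(out)


if __name__ == "__main__":
    print(regroup("Hello,Hi,Hullo,Hammers,Based,Random", 2), end="")
```

When n divides the field count evenly, as the question guarantees, this should produce the same lines as the awk command.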

3

Another approach with tr and paste:

For n=2,

$ <input tr ',' '\n' | paste -d ',' - -
Hello,Hi
Hullo,Hammers
Based,Random

For n=3,

$ <input tr ',' '\n' | paste -d ',' - - -
Hello,Hi,Hullo
Hammers,Based,Random

2

Using perl:

$ echo 'Hello,Hi,Hullo,Hammers,Based,Random' |
    perl -F, -le '
      BEGIN { $n = shift };
      for ($i=0; $i < @F; $i += $n) {
        print join(",", @F[$i .. ($i + $n - 1)]);
      }' 2
Hello,Hi
Hullo,Hammers
Based,Random

This uses the first argument as the number of entries printed per output line (using variable $n). STDIN and any filename arguments are used as the input.

Due to the -F, option (which implicitly enables the -a and -n options), it automatically reads each input line and splits it on commas into array @F, then iterates over the indices of the array, $n items at a time. $n elements are printed on each output line.

NOTE: use the Text::CSV module if you need to parse actual CSV with quoted fields and commas embedded in quotes rather than simple comma-delimited input.
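To illustrate why a real CSV parser matters, here is a minimal sketch using Python's standard `csv` module rather than Perl's Text::CSV. The function name `regroup_csv` and the sample record are hypothetical; the point is only that a quoted field containing a comma stays in one piece.

```python
import csv
import io
import sys


def regroup_csv(text, n):
    """Re-emit every CSV record in `text` as rows of at most `n` fields."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for record in csv.reader(io.StringIO(text)):
        # Slice the parsed record n fields at a time; the writer re-quotes
        # any field that contains the delimiter.
        for i in range(0, len(record), n):
            writer.writerow(record[i:i + n])
    return out.getvalue()


if __name__ == "__main__":
    sys.stdout.write(regroup_csv('Hello,"Hi, there",Hullo,Hammers\n', 2))
```

A plain split on commas would see five fields in that sample record, whereas the parser sees four.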

Output with an argument of 3 instead of 2:

$ echo 'Hello,Hi,Hullo,Hammers,Based,Random' |
    perl -F, -le 'BEGIN{$n = shift};for($i=0;$i<@F;$i+=$n){print join(",",@F[$i..($i+$n-1)])}' 3
Hello,Hi,Hullo
Hammers,Based,Random

And again with 4:

$ echo 'Hello,Hi,Hullo,Hammers,Based,Random' |
    perl -F, -le 'BEGIN{$n = shift};for($i=0;$i<@F;$i+=$n){print join(",",@F[$i..($i+$n-1)])}' 4
Hello,Hi,Hullo,Hammers
Based,Random,,

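The trailing commas in the n=4 output come from the array slice reaching past the end of @F, so the missing elements are joined as empty strings. A rough Python sketch of the same padding behaviour, assuming empty strings as filler (the name `padded_groups` is hypothetical):

```python
from itertools import zip_longest


def padded_groups(line, n, sep=","):
    """Return lines of exactly `n` fields, padding the last with empty fields."""
    fields = line.split(sep)
    # n references to ONE iterator: each output tuple takes n consecutive
    # fields, and zip_longest fills the final short tuple with "".
    chunks = zip_longest(*[iter(fields)] * n, fillvalue="")
    return [sep.join(chunk) for chunk in chunks]


if __name__ == "__main__":
    for row in padded_groups("Hello,Hi,Hullo,Hammers,Based,Random", 4):
        print(row)
```

Whether padding or a short last line is preferable depends on what consumes the output; the question sidesteps this by guaranteeing that n divides the field count.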
2
sed 's/,/\n/2;P;D' 
m=3 sed "s/,/\\n/$m;P;D" 
7
  • 1
    I can't explain to myself how someone could have downvoted this answer. It's really elegant. Commented Jul 3, 2022 at 16:11
  • 1
    @DanieleGrassini Assumes a particular non-standard implementation of sed, and contains a code injection vulnerability unless one assumes full control over the value in $m. Commented Jul 5, 2022 at 11:48
  • @Kusalananda What isn't standard in this sed? Commented Jul 10, 2022 at 23:10
  • 1
    @DanieleGrassini Inserting a newline using \n in the replacement string of the s command. Commented Jul 10, 2022 at 23:15
  • @Kusalananda How can we speak about security implications without knowing the full environment? Any of the other code could be considered potentially dangerous in some circumstances. To me, with regard to this question, which doesn't state anything other than "How to split...", this seems to be a good solution. Commented Jul 10, 2022 at 23:17
2

awk again:
the input is any sequence of values separated by commas and newlines;
the output is CSV with a fixed number of fields per line:

awk '{printf((FNR>1?(FNR-1)%n?",":ORS:"")$0)}END{print ""}' RS='[,\n]' n=4 <<END
Hello
Hi,Hullo,Hammers,Based
Random
END
Hello,Hi,Hullo,Hammers
Based,Random

1

With perl :

echo 'Hello,Hi,Hullo,Hammers,Based,Random' | perl -ne ' @L = (/,?([^,]*,[^,]*)/g); $"="\n" ; print "@L" ' 

This question makes me think of Python's zip/iter built-in functions:

python3 -c 'from sys import argv as F
J = "\n".join
_, sep, data, sz = F
L = [*map(sep.join, zip(*[iter(data.split(sep))]*int(sz)))]
print(J(L))
' , "Hello,Hi,Hullo,Hammers,Based,Random" 2

1

Using Raku (formerly known as Perl_6)

~$ raku -ne '.put for .split(",").rotor(3);' file 

Sample Input:

Hello,Hi,Hullo,Hammers,Based,Random 

Sample Output with .rotor(3) (from above):

Hello Hi Hullo
Hammers Based Random

Sample Output changing above to .rotor(2):

Hello Hi
Hullo Hammers
Based Random

The code above is a bare-bones implementation in Raku (returning single whitespace between columns). The rotor() call determines the number of columns [ see discussion below regarding the difference between rotor() and batch() ]. Just add a call to .join() if you want to join columns using commas, tabs, pipes, etc.:

~$ raku -ne '.join(",").put for .split(",").rotor(2);' file
Hello,Hi
Hullo,Hammers
Based,Random

Note that by default rotor() only returns full groups and drops a partial group at the very end. So performing a rotor(4) call on the above six-element sample results in a single line of output, 4 elements long. To ensure no loss of data, use rotor(4, :partial) or batch(4).

~$ raku -ne '.join(",").put for .split(",").rotor(4);' file
Hello,Hi,Hullo,Hammers

#COMPARE TO:

~$ raku -ne '.join(",").put for .split(",").batch(4);' file
Hello,Hi,Hullo,Hammers
Based,Random
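
The difference between dropping and keeping a partial final group can also be sketched outside Raku. The Python below is only an illustration of the behaviour described above, not Raku's implementation; `full_groups` and `all_groups` are hypothetical names.

```python
def full_groups(fields, n):
    """Keep only complete groups of n, as described for rotor(n)."""
    # Stop early enough that every slice has exactly n elements.
    return [fields[i:i + n] for i in range(0, len(fields) - n + 1, n)]


def all_groups(fields, n):
    """Keep a shorter final group too, as described for batch(n)."""
    return [fields[i:i + n] for i in range(0, len(fields), n)]


if __name__ == "__main__":
    fields = "Hello,Hi,Hullo,Hammers,Based,Random".split(",")
    for group in all_groups(fields, 4):
        print(",".join(group))
```

With six fields and n=4, `full_groups` returns one group and silently loses `Based` and `Random`, while `all_groups` returns two.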

Processing by an authentic CSV-parser (e.g. Raku's Text::CSV module) will validate the resulting CSV file. See the URL below for examples.

https://unix.stackexchange.com/a/701805/227738
https://raku.org

0

Using sed

$ sed -E 's/([^,]*,[^,]*),/\1\
/g' input_file
Hello,Hi
Hullo,Hammers
Based,Random

$ sed -E 's/(([^,]*,){2}[^,]*),/\1\
/g' input_file
Hello,Hi,Hullo
Hammers,Based,Random

-1

Apart from the obvious awk/Perl/sed approaches, I can recommend Miller.

It can extract and modify text-based data very well and is more intuitive to use than its counterparts.

1
  • 4
    ...is more intuitive to use... Please add, at least, an example to show how. Commented Jul 3, 2022 at 16:13
