How can I split my file into multiple files?

Question

I want to split a file into multiple files. My input is

Report : 1
ABC DEF GHI JKL
End of Report
$
Report : 2
ABC DEF GHI JKL
$
Report : 2
ABC DEF GHI JKL
End of Report
$
Report : 3
ABC DEF GHI JKL
End of Report
$

The output should be:

File 1

Report : 1
ABC DEF GHI JKL
End of Report
$

File 2

Report : 2
ABC DEF GHI JKL
$
Report : 2
ABC DEF GHI JKL
End of Report
$

File 3

Report : 3
ABC DEF GHI JKL
End of Report
$

I have tried

awk '{print $0 "Report :"> "/tmp/File" NR}' RS="END OF" test.txt

but I'm not getting appropriate output.

Any guidance would be appreciated.

I'd add the technology you want to use to the question title, e.g. "split file based on content using bash".
– fjuan, Commented Jan 5, 2015 at 10:01

nu11p01n73R · Accepted Answer · 2015-01-05 13:41:25Z

You can try something like

$ awk '/^Report/{filename++} {print > "FILE"filename}' input

Test

$ awk '/^Report/{filename++} {print > "FILE"filename}' input
$ cat FILE1
Report : 1
ABC DEF GHI JKL
End of Report
$
$ cat FILE2
Report : 2
ABC DEF GHI JKL
$
Report : 2
ABC DEF GHI JKL
End of Report
$
$ cat FILE3
Report : 3
ABC DEF GHI JKL
End of Report
$

What it does

/^Report/ is a pattern that is true for lines that start with Report. Such a line also carries the report number in its third field, which could serve as the file name for the lines that follow.
{filename++} increments filename by one each time such a line is seen, so every Report line starts a new file.
{print > "FILE"filename} prints each line into the current file.

E.g. if filename is 1, then this line is the same as
```
print > "FILE1"
```
This is output redirection, which is the same as the one used in bash etc.

Note that print is given no argument here. When the argument is omitted, awk prints the entire record, so this is the same as writing print $0 > "FILE"filename
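To see the pieces of that one-liner at work, here is a minimal sketch on a shortened sample. It assumes each Report header, End of Report marker and $ sits on its own line, and it writes into the current directory, so run it somewhere disposable. The parentheses around the file name are my addition, since not every awk parses an unparenthesized concatenation after > the same way.

```shell
# Shortened sample input; the one-item-per-line layout is an assumption.
cat > input <<'EOF'
Report : 1
ABC
End of Report
$
Report : 2
ABC
$
Report : 2
ABC
End of Report
$
Report : 3
ABC
End of Report
$
EOF

awk '
  /^Report/ { filename++ }          # a Report header moves on to the next file
  { print $0 > ("FILE" filename) }  # every line goes to the current file
' input

# List what was created.
ls FILE*
```

With this sample the counter reaches 4, because the second Report : 2 header starts a new file as well. If blocks with the same number should stay together, the name has to come from the third field rather than from a counter.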

Can you explain the command to me? I am running the same thing, but it's not working for me.
I don't want filename = $3; it can be anything. I tried this, but it copies every single line into its own file: awk '/^Report/{print > "tmp/File" NR}' test.txt
@Viru That's because NR is incremented for every line, so each line goes into its own file.
@Viru Or, if you want the file names to increment for each record, you can try awk '/^Report/{filename++} {print > "FILE"filename}' input
Can you please give me the solution? I want the file names to be incremental.
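The difference between NR and a hand-kept counter is easy to check directly. A small sketch, with a made-up two-block input and a variable name (count) chosen for illustration:

```shell
# Hypothetical two-block input, one item per line.
printf '%s\n' 'Report : 1' 'ABC' '$' 'Report : 2' 'ABC' '$' > input

# NR is the number of the current input line; count only moves on Report lines.
awk '/^Report/ { count++; print "NR=" NR, "count=" count }' input
```

NR jumps from 1 to 4 here while count goes 1, 2, which is why file names built from NR follow line numbers rather than block numbers.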

Arnab Nandy · Accepted Answer · 2015-01-19 04:55:25Z

Try this,

csplit input.txt '/End of Report$/' '{*}'

Explanation

csplit is a UNIX utility that is used to split a file into two or more smaller files determined by context lines.
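input.txt is the file that will be split.
'/End of Report$/' is the pattern to split at: a line ending in "End of Report". csplit starts a new piece at each matching line.
'{*}' repeats the previous pattern as many times as possible, so the whole file is processed.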

This seems to be close but not exactly right. It puts the End of Report lines into the wrong files. Matching on the Report lines instead helps somewhat (but gets you a blank first split file) and doesn't keep records with the same number together.
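The behaviour described in this comment can be reproduced on a shortened sample. This sketch assumes GNU csplit (the '{*}' repeat count is a GNU extension) and a one-item-per-line layout; the part prefix is arbitrary.

```shell
# Shortened sample; the one-item-per-line layout is an assumption.
printf '%s\n' 'Report : 1' 'ABC' 'End of Report' '$' \
              'Report : 2' 'ABC' 'End of Report' '$' > input.txt

# -s silences the byte counts, -f sets the output prefix (part00, part01, ...).
csplit -s -f part input.txt '/End of Report$/' '{*}'

tail -n 1 part00   # last line of the first piece
head -n 1 part01   # first line of the second piece
```

The first piece ends at ABC and the second one begins with End of Report, because csplit starts each new piece at the matching line.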

n0741337 · Accepted Answer · 2015-01-07 01:34:51Z

Here's another awk answer:

awk '/^Report/{n=$3} {print > "File"n}' input

This is similar to nu11p01n73R's answer but uses the third field of each Report line to determine the file number.

When /^Report/ matches the line, set n to $3.
Use n when creating the name of the file to print each line to.
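On a shortened sample, naming by the third field gives the three files the question asks for. A minimal sketch, assuming each Report header, End of Report marker and $ sits on its own line; the parentheses around the file name are added for portability across awk implementations.

```shell
# Shortened sample input; the one-item-per-line layout is an assumption.
cat > input <<'EOF'
Report : 1
ABC
End of Report
$
Report : 2
ABC
$
Report : 2
ABC
End of Report
$
Report : 3
ABC
End of Report
$
EOF

awk '
  /^Report/ { n = $3 }        # remember the report number from the third field
  { print $0 > ("File" n) }   # write the line to the file for that number
' input

# Count the Report headers that landed in each file.
grep -c '^Report' File1 File2 File3
```

grep reports two Report headers in File2, so both Report : 2 blocks ended up together.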

If you have a large number of these blocks, you might end up needing to close files and could use this command instead:

awk '/^Report/{f="File"$3; if(lf != f) {close(lf); lf=f}} {print > f}' input

When /^Report/ matches the line, create a filename f.
If lf (last filename) doesn't match f, first try to close lf, then reset lf. Calling close() when lf hasn't been set is safe.
Print every line to f.
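One thing to watch with close(): if a report number comes back after its file was closed, print > f reopens the file and truncates it. A hedged variant that appends with >> instead, on a made-up input where report 1 appears twice (the Out prefix is hypothetical):

```shell
# Hypothetical input in which report 1 shows up again after report 2.
printf '%s\n' 'Report : 1' 'ABC' '$' \
              'Report : 2' 'DEF' '$' \
              'Report : 1' 'GHI' '$' > input

# ">>" appends, so clear leftovers from an earlier run first.
rm -f Out1 Out2

awk '
  /^Report/ {
    f = "Out" $3                        # file for this report number
    if (lf != f) { close(lf); lf = f }  # close the previous file on a switch
  }
  { print $0 >> f }                     # append so a reopened file keeps its lines
' input

cat Out1
```

Out1 ends up with both report 1 blocks. The trade-off is that >> also appends to files left over from an earlier run, hence the rm -f.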
