
I can successfully use the split command to split a large file into multiple smaller files, using the following command:

split -b 1G "$temp_path" "$final_filepath"

The only caveat is that the last line of an output file is often split across two files. Is there any way to avoid that, using split or any other command?

1 Answer


Yes, don't use the -b parameter. From the split(1) man page:

-b, --bytes=SIZE put SIZE bytes per output file

-C, --line-bytes=SIZE put at most SIZE bytes of lines per output file

-l, --lines=NUMBER put NUMBER lines per output file

By using -b you are telling split to delineate files at a specific size in bytes (or KB or MB). If that boundary falls in the middle of a line, too bad.

split also supports a fixed number of lines per file (-l) and a maximum output file size made up of whole lines (-C).

Instead, try this:

split -C 1G "$temp_path" "$final_filepath"
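
To see the difference in practice, here is a small sketch that splits a throwaway file both ways and counts how many chunks end in the middle of a line. It assumes GNU split; the file name sample.txt, the prefixes bytes_ and lines_, and the 1000-byte limit are made up for illustration and are not from the question. Note that with -C, a single line longer than SIZE still has to be broken across files.

```shell
#!/bin/sh
# Illustrative demo only: file names, prefixes and sizes are hypothetical.
set -eu
workdir=$(mktemp -d)
cd "$workdir"

# Build a sample file of 1000 short lines.
seq 1 1000 | sed 's/^/record number /' > sample.txt

# -b cuts at exact byte offsets, so a line may straddle two chunks.
split -b 1000 sample.txt bytes_

# -C packs whole lines, at most 1000 bytes per chunk.
split -C 1000 sample.txt lines_

# Count the files whose last byte is not a newline.
count_partial() {
    n=0
    for f in "$@"; do
        # Command substitution strips a trailing newline, so a non-empty
        # result means the file ends mid-line.
        if [ -n "$(tail -c 1 "$f")" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

bytes_partial=$(count_partial bytes_*)
lines_partial=$(count_partial lines_*)
echo "chunks ending mid-line with -b: $bytes_partial"
echo "chunks ending mid-line with -C: $lines_partial"

# Concatenating the chunks in name order should reproduce the original.
cat lines_* > rejoined.txt
cmp sample.txt rejoined.txt && echo "round trip OK"
```

The -C run should report zero chunks ending mid-line, while the -b run will generally report some.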

The -C flag is not available in all versions of split (notably on OS X / Darwin). In that case you can use gsplit, which is available in the GNU coreutils package on Homebrew and MacPorts.
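
If a script has to run on both kinds of system, one option is to probe for a split that accepts -C before using it. The following is a rough sketch, not a definitive approach: the function name find_line_split and the candidate order (split first, then gsplit) are assumptions made for illustration.

```shell
#!/bin/sh
# Print the name of the first split command that accepts -C, or fail.
# find_line_split is a hypothetical helper name, not part of any tool.
find_line_split() {
    probe_dir=$(mktemp -d) || return 1
    for candidate in split gsplit; do
        # Try -C on two tiny lines read from stdin; a split without
        # -C support is expected to exit with an error here.
        if command -v "$candidate" >/dev/null 2>&1 &&
           printf 'a\nb\n' | "$candidate" -C 2 - "$probe_dir/probe_" 2>/dev/null; then
            rm -rf "$probe_dir"
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    rm -rf "$probe_dir"
    return 1
}

if split_cmd=$(find_line_split); then
    echo "using: $split_cmd"
else
    echo "no split with -C support found; consider installing GNU coreutils" >&2
fi
```

The result can then be used in place of a hard-coded command name, for example "$split_cmd" -C 1G "$temp_path" "$final_filepath".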

  • did not read the man page carefully enough. Commented Apr 3, 2017 at 21:48
  • No worries, happens to all of us. I'm just glad I was working with split earlier today and happened to know the answer off the cuff. :) Commented Apr 3, 2017 at 21:54
  • But if I want to split at the last line break before I reach a certain size in bytes? I want each file to split at the line break that is closest to, e.g., 1000 bytes. Commented Feb 25 at 21:39
