3

I have a bunch of text files with hard line breaks at 80 characters. I'd like to "unfill" (to use the Emacs term) those paragraphs so that each paragraph is a single line, which makes copying and pasting text from those files into other applications easier. Is there a way to do that with a shell script?

For example, I have input text that looks like:

Call me Ishmael. Some years ago- never mind how long precisely- having little
or no money in my purse, and nothing particular to interest me on shore, I
thought I would sail about a little and see the watery part of the world. It is
a way I have of driving off the spleen and regulating the circulation. Whenever
I find myself growing grim about the mouth; whenever it is a damp, drizzly
November in my soul; whenever I find myself involuntarily pausing before coffin
warehouses, and bringing up the rear of every funeral I meet; and especially
whenever my hypos get such an upper hand of me, that it requires a strong moral
principle to prevent me from deliberately stepping into the street, and
methodically knocking people's hats off- then, I account it high time to get to
sea as soon as I can. This is my substitute for pistol and ball. With a
philosophical flourish Cato throws himself upon his sword; I quietly take to
the ship. There is nothing surprising in this. If they but knew it, almost all
men in their degree, some time or other, cherish very nearly the same feelings
towards the ocean with me. There now is your insular city of the Manhattoes,
belted round by wharves as Indian isles by coral reefs- commerce surrounds it
with her surf. Right and left, the streets take you waterward. Its extreme
downtown is the battery, where that noble mole is washed by waves, and cooled
by breezes, which a few hours previous were out of sight of land. Look at the
crowds of water-gazers there.

I would like the output text to look like:

Call me Ishmael. Some years ago- never mind how long precisely- having little or no money in my purse, and nothing particular to interest me on shore, I thought I would sail about a little and see the watery part of the world. It is a way I have of driving off the spleen and regulating the circulation. Whenever I find myself growing grim about the mouth; whenever it is a damp, drizzly November in my soul; whenever I find myself involuntarily pausing before coffin warehouses, and bringing up the rear of every funeral I meet; and especially whenever my hypos get such an upper hand of me, that it requires a strong moral principle to prevent me from deliberately stepping into the street, and methodically knocking people's hats off- then, I account it high time to get to sea as soon as I can. This is my substitute for pistol and ball. With a philosophical flourish Cato throws himself upon his sword; I quietly take to the ship. There is nothing surprising in this. If they but knew it, almost all men in their degree, some time or other, cherish very nearly the same feelings towards the ocean with me. There now is your insular city of the Manhattoes, belted round by wharves as Indian isles by coral reefs- commerce surrounds it with her surf. Right and left, the streets take you waterward. Its extreme downtown is the battery, where that noble mole is washed by waves, and cooled by breezes, which a few hours previous were out of sight of land. Look at the crowds of water-gazers there. 

Is there a way to do that using a shell script? I feel like sed ought to be able to do this, but I'm not sure what the specific commands are to get it to join paragraphs, rather than split them.

2
  • Your sample text has trailing blanks on each line. Is that guaranteed? Many of the answers here depend on those blanks. Commented Jul 20, 2016 at 8:04
  • The trailing spaces are not guaranteed, unfortunately. I didn't notice them at first, but they seem to have been added by StackOverflow. They're certainly not present in the original input text. Commented Jul 20, 2016 at 15:24

3 Answers

4

Using (g)awk

awk -vRS= -vORS= '{gsub("\n","")}{print $0 RT}' file 

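Setting RS to the empty string makes awk read one paragraph per record, and the gsub removes every newline inside that record. RT is a gawk extension that holds the text that terminated the record (here, the blank lines between paragraphs), so printing it restores the paragraph breaks. Because the newlines are deleted rather than replaced, words at line ends stay separated only if the lines carry trailing blanks.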
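The comments on the question point out that trailing blanks are not guaranteed, so a variant that joins lines with a single space may be safer. The following is only a sketch, assuming a POSIX awk (it avoids the gawk-only RT); the function name unfill_awk is made up for illustration.

```shell
#!/bin/sh
# unfill_awk is a hypothetical helper name, not part of the answer above.
# It reads the named files (or stdin) and writes the result to stdout.
unfill_awk() {
    # RS= turns on paragraph mode: each blank-line-separated block is one record.
    awk -v RS= '
        {
            gsub(/[ \t]*\n[ \t]*/, " ")   # join the lines with exactly one space
            sub(/[ \t]+$/, "")            # drop trailing blanks at the record end
            if (NR > 1) print ""          # one blank line between paragraphs
            print
        }
    ' "$@"
}
```

Something like `unfill_awk file > file.unfilled` would then write the joined paragraphs to a new file. Note that this sketch collapses runs of several blank lines into one.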

With perl

perl -pe '/^$/?print:chomp' file 

1 Comment

perl one is simple and clever :)
4

Here is a pure bash solution:

#!/bin/bash
while read -r
do
    if [[ -n $REPLY ]]
    then
        echo -n "$REPLY"
    else
        echo -e "\n$REPLY"
    fi
done < "gash.txt"

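The newline is removed by the read. Because no variable name is given, bash stores the rest of the line unmodified in REPLY, so any trailing spaces are kept; they are what separates the words once the lines are joined. If the line is non-empty we echo it without a newline, otherwise we echo an extra newline to end the paragraph and keep the blank line. REPLY is the default variable used by read.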
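If the input lines have no trailing blanks, the joined words would run together. One possible variation, sketched here under that assumption, strips trailing whitespace from each line and inserts a single space itself. It sticks to POSIX shell syntax so it also runs under bash; the name unfill_sh is hypothetical.

```shell
#!/bin/sh
# unfill_sh is a hypothetical helper; it reads stdin and writes stdout.
# The body runs in a subshell (parentheses) so its variables stay local.
unfill_sh() (
    para=""
    # IFS= and -r keep each line verbatim; the || clause picks up a final
    # line that has no terminating newline.
    while IFS= read -r line || [ -n "$line" ]; do
        # Strip trailing whitespace so the join never doubles a space.
        line="${line%"${line##*[![:space:]]}"}"
        if [ -n "$line" ]; then
            para="${para:+$para }$line"   # append with a single separator
        elif [ -n "$para" ]; then
            printf '%s\n\n' "$para"       # a blank line ends the paragraph
            para=""
        fi
    done
    # Flush the last paragraph if the input did not end with a blank line.
    if [ -n "$para" ]; then
        printf '%s\n' "$para"
    fi
)
```

It could be used as `unfill_sh < gash.txt`. Unlike the loop above, this sketch buffers a whole paragraph before printing it and collapses consecutive blank lines.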


1

Solution using perl

perl -i.bak -pe 's/^$/\n/ ; s/(.+)\n/$1/' *.txt 
  • -i.bak edits the files in place and keeps a backup of each original with a .bak suffix, in case it is needed later or something goes wrong with the command. Use plain -i if no backup is needed
  • s/^$/\n/ doubles the newline on empty lines, so the paragraph break survives the join
  • s/(.+)\n/$1/ removes the newline character from non-empty lines
