BASH: Using cut on space delimited file: Treating two spaces as one

Question

I need to convert a file full of lines like this:

# 2007  4 29 10  1 17.98 blah other stuff

into lines formatted like this:

2007.04.29.10.01.17

The original line is space delimited, and a single-digit number (such as 4) is listed as ' 4'. When I convert it, I need to change it to '04'. So some spaces delimit the fields, AND other spaces are placeholders for leading zeros.

I need to write a shell script to make that conversion. I tried using the cut command because each character always stays in exactly the same place, so the 7th char is always a delimiting space and the 8th char is always the tens digit, or a space that should become a leading zero. However, I soon discovered that it treats two spaces as one, which totally throws off the count (since sometimes I have ' 4' and sometimes I have '14').

So: I need a way to read and convert this file, either using cut, or some other method (awk?) that will allow me to do this. Either a way to modify my current code (below) or another approach that would work a lot better would be much appreciated.

Just for reference, my present code is below:

while read LINE
do
    #IF line starts with '#', then
    if [[ $LINE == "#"* ]]; then
        #123456789012345678901
        # 2008 12 26 11 26 20.36
        # 2007  5 10  1  8 10.52
        #GET 4 digit year
        LINEyear=$(echo $LINE | cut -c3-6)
        #GET 2 digit month
        if [ $(echo $LINE | cut -c8-8) == " " ]; then
            LINEmonth=0$(echo $LINE | cut -c8-9)
        else
            LINEmonth=$(echo $LINE | cut -c8-9)
        fi
        #GET 2 digit day
        if [ $(echo $LINE | cut -c11-11) == " " ]; then
            LINEday=0$(echo $LINE | cut -c11-12)
        else
            LINEday=$(echo $LINE | cut -c11-12)
        fi
        #GET hour, min, sec, (Removed to save space)
        LINEnew=$LINEyear.$LINEmonth.$LINEday.$LINEhour.$LINEmin.$LINEsec
        echo $LINEnew
    fi
done

Most times you find yourself writing a loop in shell, you are using the wrong tool. There are exceptions of course, but they involve process or file manipulation (create/destroy), not text manipulation.
– Ed Morton, Commented May 2, 2013 at 12:24
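A note on the symptom described in the question: cut itself does not merge spaces. The likely cause is the unquoted $LINE in `echo $LINE`, which the shell splits into words that echo then rejoins with single spaces before cut ever sees them. The sketch below illustrates the difference. It assumes a sample line in which single-digit fields are padded with a space, as the question describes; the variable names are only for illustration.

```shell
# Sample line, assumed to be space-padded as described in the question.
line='# 2007  4 29 10  1 17.98 blah other stuff'

# Unquoted: the shell word-splits $line, echo rejoins the words with
# single spaces, and the fixed column positions shift.
unquoted=$(echo $line | cut -c8-9)

# Quoted: the original spacing reaches cut untouched.
quoted=$(printf '%s\n' "$line" | cut -c8-9)

# With the spacing preserved, the padding space can be turned into a zero.
month=$(printf '%s\n' "$quoted" | tr ' ' '0')

printf 'unquoted: [%s]\n' "$unquoted"
printf 'quoted:   [%s]\n' "$quoted"
printf 'month:    [%s]\n' "$month"
```

Quoting every expansion of the variable keeps the column-based approach workable, although the awk answer is shorter.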

johnsyweb · Accepted Answer · 2013-05-01 23:56:19Z

2

You can solve this in just one line of awk:

% awk '/^#/ {printf "%04d.%02d.%02d.%02d.%02d.%02d\n", $2, $3, $4, $5, $6, $7}' ~/stuff

Yields:

2007.04.29.10.01.17
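Why this works, as far as I can tell: with its default field separator, awk treats any run of blanks as a single separator, so ' 4' and '14' both land in $3, and %02d supplies the leading zero. %d also drops the fractional part of the seconds. Here is a self-contained sketch that feeds the command inline input instead of a file; the sample lines and variable names are assumptions for illustration.

```shell
# Sample header line, assumed to be space-padded as in the question.
sample='# 2007  4 29 10  1 17.98 blah other stuff'

# The second input line lacks the leading '#', so /^#/ skips it.
result=$(printf '%s\n' "$sample" 'a line without the leading hash' |
  awk '/^#/ {printf "%04d.%02d.%02d.%02d.%02d.%02d\n", $2, $3, $4, $5, $6, $7}')

echo "$result"
```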


3 Comments

Brian C Over a year ago

Okay. Well, now I am getting the correct formatting (thanks), but I am no longer storing the string in a variable. I need to do that because I have to compare it against another file of strings later on. I tried LineNew=awk '/^#/ {printf.... etc, but that did not work.

johnsyweb Over a year ago

I see. In your question, you're just echoing the matched lines, and that is what I was emulating / fixing. awk can do the comparisons, too, I am sure, but that seems to be a different question and should be asked as such.

Brian C Over a year ago

I just printed the output to a file, and I will write a separate program to compare it with the other file I have. Not as neat, but it should work just fine.
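On the variable question raised in these comments: an assignment like LineNew=awk '...' does not run awk. The shell reads it as a temporary assignment of the word awk to LineNew, then tries to execute the quoted program text as a command. Command substitution, $( ... ), is the usual way to capture output. This is a minimal sketch, not the answerer's code; format_stamp and the comparison value are hypothetical names introduced for illustration.

```shell
# Hypothetical helper: formats one header line with the awk program
# from the answer above.
format_stamp() {
  printf '%s\n' "$1" |
    awk '/^#/ {printf "%04d.%02d.%02d.%02d.%02d.%02d\n", $2, $3, $4, $5, $6, $7}'
}

# Sample input taken from the comments in the question's code.
line='# 2008 12 26 11 26 20.36'

# Command substitution stores awk's output in the variable.
LINEnew=$(format_stamp "$line")

# Hypothetical value standing in for a string from the other file.
other='2008.12.26.11.26.20'

if [ "$LINEnew" = "$other" ]; then
  verdict=match
else
  verdict=differ
fi
echo "$LINEnew $verdict"
```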

Kevin Lee · Accepted Answer · 2013-05-02 02:39:16Z

echo "# 2007  4 29 10  1 17.98 blah other stuff" | tr -s " "

I use tr in conjunction with cut because the number of spaces between fields varies; tr -s ' ' squeezes each run of spaces down to a single space.

Then use cut once to drop the # (unless you want that as a field), and a second time to pick a field, say the fourth:

echo "# 2007  4 29 10  1 17.98 blah other stuff" | tr -s " " | cut -d'#' -f2 | cut -d' ' -f4
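To get from individual fields to the full dotted format, the same tr and cut pipeline can be combined with read and printf. This is one possible sketch, not part of the original answer. It assumes the space-padded sample line from the question, and that no field carries a leading zero (printf would read a value such as 08 as octal).

```shell
# Sample line, assumed to be space-padded as in the question.
line='# 2007  4 29 10  1 17.98 blah other stuff'

# Squeeze runs of spaces, then keep fields 2-7 (field 1 is the '#').
fields=$(printf '%s\n' "$line" | tr -s ' ' | cut -d' ' -f2-7)

# Split the six values into named variables.
read -r year month day hour min sec <<EOF
$fields
EOF

# Drop the fractional seconds; the shell's printf %d rejects 17.98.
sec=${sec%%.*}

# Zero-pad each field. Assumes no leading zeros in the input (see above).
stamp=$(printf '%04d.%02d.%02d.%02d.%02d.%02d' \
  "$year" "$month" "$day" "$hour" "$min" "$sec")

echo "$stamp"
```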
