I have a csv file like this:

# 2022 5 2 8 1 24.8-17.1800 -66.3260 3.6 0.2 0.0 0.0 0.0 2
SOD6 2.20 1.00 P
SOD6 3.98 1.00 S
SOD5 3.21 1.00 P
SOD5 5.79 1.00 S
SOD0 4.07 1.00 P
SOD0 7.10 1.00 S
SOD3 6.47 1.00 P
SOD3 11.20 1.00 S
# 2022 5 3 0 10 16.8-17.3820 -65.6330 28.0 0.7 0.0 0.0 0.3 3
SOD2 6.24 1.00 P
SOD2 10.49 1.00 S
SOD9 7.66 1.00 P
SOD9 12.75 1.00 S
SOD1 10.34 1.00 P
SOD3 11.42 1.00 P
SOD3 21.11 1.00 S
# 2022 5 3 11 28 10.8-17.7600 -65.9840 6.6 0.7 0.0 0.0 0.1 4
SOD3 6.55 1.00 P
SOD2 6.89 1.00 P
SOD2 11.70 1.00 S
SOD9 8.82 1.00 P
SOD1 10.04 1.00 P
SOD1 17.60 1.00 S

I am trying to add a blank space at the 24th position of each header. This is the header:

# 2022 5 2 8 1 24.8-17.1800 -66.3260 3.6 0.2 0.0 0.0 0.0 2 

so the header will look like:

# 2022 5 2 8 1 24.8 -17.1800 -66.3260 3.6 0.2 0.0 0.0 0.0 2 

I tried the following code:

# To read the headers and to add a space on 24th place
# of each header, where 'phase.dat' is the csv file
grep '# 2022' phase.dat | sed 's/ ./&\s /24'

But it did not add the space at the desired position. Does anyone have an idea what I did wrong?

Stay safe and best regards, Tonino

  • If it's always the first -, sed -e 's/-/ -/' should work. Commented Dec 13, 2022 at 20:07
  • What's your field separator in your CSV? Commented Dec 13, 2022 at 20:07
  • @Cyrus, only spaces, not tabs. Commented Dec 13, 2022 at 20:18

2 Answers

1

Something like this.

sed 's/^\(# 2022[^-]*\)\(.*\)$/\1 \2/' phase.dat 

If the headers are all you actually want to extract and edit, print only the lines where the substitution happened:

sed -n 's/^\(# 2022[^-]*\)\(.*\)$/\1 \2/p' phase.dat 

A quick breakdown of the sed code:

  • ^ is an anchor: it matches at the start of the line.

  • \( \) delimit a capture group. Because sed uses B.R.E. (Basic Regular Expressions) by default, the parentheses must be escaped with a preceding \

  • [ ] is a bracket expression.

    • A ^ as its first character negates it, so [^-] matches any single character EXCEPT -

    • The * right after [ ] is a quantifier: it means zero or more of the preceding item.

  • So the first capture group matches # 2022 at the start of the line and everything after it up to, but not including, the first -

  • \(.*\) is the second capture group.

    • .* means zero or more of any character, so it captures the rest of the line.
  • $ is also an anchor: it matches at the end of the line.

  • \1 and \2 refer to capture groups one and two, that is, whatever matched inside each \( \)
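
To see the two capture groups in action, here is a small self-contained sketch. The sample lines are copied from the question with single-space separators (the real file may be spaced differently), and the variable names sample, fixed and headers are only for illustration.

```shell
# Sample input: one header line and one pick line from the question.
sample='# 2022 5 2 8 1 24.8-17.1800 -66.3260 3.6 0.2 0.0 0.0 0.0 2
SOD6 2.20 1.00 P'

# \1 = "# 2022" and everything up to (not including) the first "-";
# \2 = the rest of the line. Lines not starting with "# 2022" pass through unchanged.
fixed=$(printf '%s\n' "$sample" | sed 's/^\(# 2022[^-]*\)\(.*\)$/\1 \2/')
printf '%s\n' "$fixed"

# With -n and the p flag, only lines where the substitution happened are printed.
headers=$(printf '%s\n' "$sample" | sed -n 's/^\(# 2022[^-]*\)\(.*\)$/\1 \2/p')
printf '%s\n' "$headers"
```

Note that this approach keys on the first - in the header, not on a column number. If a header had a positive first coordinate, the space would end up before the next minus sign instead, and a header with no - at all would only gain a trailing space.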



2 Comments

It works. What does "(.*)$/\1 \2/" mean? Thanks in advance.
@tonino, I've updated the answer with some short info.
1

Replace \s with a literal space and drop the leading space from the pattern, so that . counts single characters, as shown below:

grep '# 2022' phase.dat | sed 's/./& /24'
# 2022 5 2 8 1 24.8- 17.1800 -66.3260 3.6 0.2 0.0 0.0 0.0 2
# 2022 5 3 0 10 16.8- 17.3820 -65.6330 28.0 0.7 0.0 0.0 0.3 3
# 2022 5 3 11 28 10.8- 17.7600 -65.9840 6.6 0.7 0.0 0.0 0.1 4
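
In the output above the space lands after the minus sign, because & followed by a space re-emits the 24th character first. The sketch below compares that with the reversed replacement, which puts the space before the 24th character. It assumes the minus sign really is the 24th character in the real file; the sample line is padded by hand to make that true, so the padding is a guess at the file's layout, not a copy of it.

```shell
# Hypothetical header, padded so that the "-" is the 24th character.
line='# 2022  5  2  8  1 24.8-17.1800 -66.3260'

# "& " re-emits the 24th character and then a space: the space lands after the "-".
after=$(printf '%s\n' "$line" | sed 's/./& /24')

# " &" emits the space first: the space lands before the "-".
before=$(printf '%s\n' "$line" | sed 's/./ &/24')

printf '%s\n%s\n' "$after" "$before"
```

The trailing number is the standard occurrence flag of the s command: only the Nth match is replaced, and since . matches one character at a time, the Nth match is the Nth character of the line.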
