2

I am trying to strip a pattern from some text so that only the second value of each bracketed pair remains. For example:

[1426467605000,19.44] should become 19.44

Here is my input text file:

[1426467605000,19.44],[1426467965000,19.44],[1426468325000,19.38],[1426468685000,19.38],[1426469045000,19.38],[1426469405000,19.38],[1426469764000,19.38],[1426470124000,19.38],[1426470484000,19.38],[1426470845000,19.31],[1426471205000,19.31],[1426471565000,19.31],[1426471925000,19.31],[1426472285000,19.31],[1426472645000,19.31],[1426473005000,19.31],[1426473365000,19.31],[1426473725000,19.31],[1426474085000,19.31],[1426474445000,19.25],[1426474805000,19.25],[1426475164000,19.25],[1426475524000,19.25],[1426475884000,19.55],[1426476245000,19.25],[1426476605000,19.25],[1426476965000,19.25],[1426477325000,19.25],[1426477685000,19.19],[1426478045000,19.19],[1426478405000,19.19],[1426478764000,19.19],[1426479124000,19.19],[1426479484000,19.19],[1426479844000,19.19],[1426480204000,19.13],[1426480564000,19.13],[1426480924000,19.19],[1426481284000,19.19],[1426481644000,19.19],[1426482005000,19.19],[1426482365000,19.19],[1426482725000,19.19], 

Here is my desired output:

19.44 19.44 19.38 19.38 19.38 etc. 

7 Answers

4

This grep line should do it:

grep -oP '[^,]*(?=])' 

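In short, this extracts the text between , and ], which is the value you want. -o prints only the matched text, and -P (GNU grep) enables Perl-compatible regular expressions, which the lookahead (?=]) needs.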

3

Use grep for that:

grep -oE '[0-9]+\.[0-9]+' file 

The pattern matches one or more digits, followed by a dot, followed by one or more digits.

-o makes grep print only the matched text rather than the whole line in which the match appears. -E enables POSIX extended regular expressions, which saves us from escaping the +.
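As a quick check of the grep approach, here is a minimal sketch that runs the same pattern over a shortened sample in the question's format. The function name `extract_values` and the three-pair sample are illustrative assumptions, not part of the original answer. Note that the pattern relies on the timestamps having no decimal point and would skip a value written without one (for example a plain 19).

```shell
#!/bin/sh
# Hypothetical helper: print every decimal number (digits, dot, digits)
# found on standard input, one per line.
extract_values() {
    grep -oE '[0-9]+\.[0-9]+'
}

# Shortened sample in the same format as the question's input.
sample='[1426467605000,19.44],[1426467965000,19.44],[1426468325000,19.38],'

printf '%s\n' "$sample" | extract_values
```

grep -o prints one match per line. If the values are wanted on a single space-separated line, as the desired output in the question may suggest, the result can be piped through `paste -s -d ' ' -`.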


An alternative would be to use awk like this:

awk -F, '{print $2}' RS='\\[|\\],|\\],\\[' file 

This command is closer to actually parsing the data: it splits the input into records and prints the second field of each one. Records are separated by [ (at the start of the line), by ], or by ],[ and the fields within a record are separated by ,.

If the last record on a line may be closed without a trailing comma, you can simply modify the pattern to:

awk -F, '{print $2}' RS='\\[|\\],?|\\],\\[' file 

which makes the comma at the end of the record separator optional.

7 Comments

Will leave the last ]
No, the output of both commands is the same and both are outputting just the numbers as expected - using the example data from the question.
Apologies, I hadn't seen the final comma, my bad.
No problem. It isn't uncommon for a trailing , to be appended to a list, since that makes the writer a little easier to implement.
@JID I have changed the awk command. It now addresses your concern.
2

An awk alternative:

awk '$0~FS{print $1}' RS=',' FS=']' inputfile 

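RS=',' : sets the record separator to a comma.

FS=']' : sets the field separator to ].

$0~FS : if the current record contains FS, print its first field, i.e. the text before the ]. Records without a ] (the timestamps) are skipped, and the ] itself never reaches the output.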
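To see how the records fall out, the sketch below wraps the same awk program in a function and feeds it a shortened sample. The function name `values_before_bracket` and the sample are assumptions made for illustration. The trailing `-` tells awk to read standard input after applying the two assignments.

```shell
#!/bin/sh
# Hypothetical helper: split stdin into records at commas, then print the
# text before "]" for every record that contains a closing bracket.
values_before_bracket() {
    awk '$0~FS{print $1}' RS=',' FS=']' -
}

# Shortened sample in the same format as the question's input.
# Split at commas, the records are:
#   [1426467605000   19.44]   [1426467965000   19.44]   ...
# Only the records containing "]" are printed, cut at the bracket.
sample='[1426467605000,19.44],[1426467965000,19.44],[1426468325000,19.38],'

printf '%s\n' "$sample" | values_before_bracket
```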

1

sed

sed 's/\[[^,]\+,\([^]]\+\)\]/\1/g; s/,/\n/g' 

The first regex looks for: a literal open bracket, some non-comma chars, a comma, capturing parenthesis, some non-close-bracket chars, end capture, and a literal close bracket. It replaces all of that with the captured text. Then, remaining commas are replaced by newlines.
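The command above uses `\+` in a basic regular expression and `\n` in the replacement, both of which are GNU sed extensions. As a hedged alternative for other sed implementations, the sketch below expresses the same idea with the POSIX interval `\{1,\}` and hands the comma-to-newline step to tr. The function name `strip_timestamps` and the sample are illustrative assumptions.

```shell
#!/bin/sh
# Hypothetical helper: replace each "[timestamp,value]" with just "value",
# then turn the remaining commas into newlines.
strip_timestamps() {
    sed 's/\[[^,]\{1,\},\([^]]\{1,\}\)\]/\1/g' | tr ',' '\n'
}

# Shortened sample in the same format as the question's input.
sample='[1426467605000,19.44],[1426467965000,19.44],[1426468325000,19.38],'

printf '%s\n' "$sample" | strip_timestamps
```

Because the sample ends with a comma, the output finishes with one empty line, just as it would with the single-command GNU version.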

1

You could use grep.

$ grep -oP ',\K[^\]\[]*(?=\])' file
19.44
19.44
19.38
19.38
19.38
19.38
19.38
19.38

This regex fetches the last value inside each pair of square brackets.

  • , matches a comma.
  • \K discards the comma matched so far, so it is not part of the output.
  • [^\]\[]* is a negated character class that matches any character except ] or [, zero or more times.
  • (?=\]) is a positive lookahead asserting that the match must be followed by a ] character.
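To make the effect of \K visible, here is a small sketch that runs the pattern with and without it on a shortened sample. It assumes GNU grep built with PCRE support for -P. The function names `with_reset` and `without_reset` and the sample are made up for illustration.

```shell
#!/bin/sh
# Assumes GNU grep with PCRE support (-P).

# Hypothetical helper: with \K, the comma is matched but dropped from the output.
with_reset() {
    grep -oP ',\K[^\]\[]*(?=\])'
}

# Hypothetical helper: the same pattern without \K, so the comma stays in the output.
without_reset() {
    grep -oP ',[^\]\[]*(?=\])'
}

# Shortened sample in the same format as the question's input.
sample='[1426467605000,19.44],[1426467965000,19.44],[1426468325000,19.38],'

printf '%s\n' "$sample" | with_reset
printf '%s\n' "$sample" | without_reset
```

The commas that separate the bracketed pairs never produce a match, because the character after them is [ and the lookahead then cannot find a ].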

1

You could also go with sed and coreutils:

<infile tr -d '][' | tr , '\n' | sed '1d; n; d' 

Output:

19.44
19.44
19.38
19.38
19.38
.
.
.

Explanation

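The first tr deletes the brackets and the second replaces each comma with a newline, so timestamps and values alternate line by line. sed then deletes the first line and every second line after it, which leaves only the values.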
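The intermediate stage is easier to follow on a small input, so the sketch below wraps the pipeline in a function and annotates what each step produces. The function name `second_of_each_pair` and the shortened sample are illustrative assumptions. The approach depends on timestamps and values strictly alternating.

```shell
#!/bin/sh
# Hypothetical helper: keep only the second element of every "[a,b]" pair
# read from standard input.
second_of_each_pair() {
    # 1. tr -d '][' removes the brackets:   1426467605000,19.44,1426467965000,...
    # 2. tr ',' '\n' puts each item on its own line, timestamp first.
    # 3. sed deletes line 1, then repeatedly prints a line (n) and deletes
    #    the one after it (d), so only the values survive.
    tr -d '][' | tr ',' '\n' | sed '1d; n; d'
}

# Shortened sample in the same format as the question's input.
sample='[1426467605000,19.44],[1426467965000,19.44],[1426468325000,19.38],'

printf '%s\n' "$sample" | second_of_each_pair
```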

0

With GNU awk for multi-char RS:

$ awk -v RS='[]],[[\n]' -F, '{print $2}' file
19.44
19.44
19.38
19.38
