35

I have a string in Ruby, s (say) which might have any of the standard line endings (\n, \r\n, \r). I want to convert all of those to \ns. What's the best way?

This seems like a super-common problem, but there's not much documentation about it. Obviously there are easy crude solutions, but is there anything built in to handle this?

Elegant, idiomatic-Ruby solutions are best.

EDIT: realized that ^M and \r are the same. But there are still three cases. (See Wikipedia.)

4 Answers

44

Since Ruby 1.9 you can use String#encode with universal_newline: true to convert all of your newlines to \n while keeping your encoding unchanged:

s.encode(s.encoding, universal_newline: true) 
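As a quick check of what that call does, here is a small sketch run on a made-up string that mixes all three conventions. The variable names and the sample text are only for illustration and are not part of the original answer.

```ruby
# Hypothetical sample input mixing LF, CRLF and bare CR line endings.
mixed = "unix\nwindows\r\nclassic mac\rend"

# Passing the string's own encoding as the target keeps the encoding as-is;
# the universal_newline option rewrites CRLF and CR to LF.
normalized = mixed.encode(mixed.encoding, universal_newline: true)

p normalized                             # inspect the result
p normalized.encoding == mixed.encoding  # encoding is preserved
```

The original string is left untouched; encode returns a new string.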

Once in a known newline state, you can freely convert back to CRLF using :crlf_newline. For example, to convert a file of unknown (possibly mixed) line endings to CRLF, read it in binary mode, then:

s.encode(s.encoding, universal_newline: true).encode(s.encoding, crlf_newline: true) 
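The two-step conversion can be wrapped in a small helper. This is only a sketch: the method name to_crlf and the sample string are assumptions for illustration, and it works on an in-memory string rather than a file read in binary mode.

```ruby
# Hypothetical helper: normalize to LF first, then expand every LF to CRLF.
# Normalizing first means existing CRLF pairs are not expanded a second time.
def to_crlf(text)
  text.encode(text.encoding, universal_newline: true)
      .encode(text.encoding, crlf_newline: true)
end

# Made-up input with mixed endings.
mixed = "a\nb\r\nc\rd"
p to_crlf(mixed)
```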

4 Comments

You don't need to include the first s.encoding, a simple s.encode(universal_newline: true) or s.encode(crlf_newline: true) does the trick. This helped me with a project today.
@Donovan - You're probably right, however the docs say that the version without an explicit encoding will transcode to Encoding.default_internal, which may or may not be what you want. My version will conservatively preserve your current encoding.
true and you make a good point, but in most cases the default is fine, after all, that's what String.new uses. So, in my case (and I could argue most cases), it would be redundant.
This is apparently much faster than the other methods (it takes about 40% less time than the gsub method, whereas split-join takes about 40% more time). I compared this to: s.gsub(/\r\n?/, "\n"), s.gsub("\r\n", "\n").gsub("\r", "\n") (about the same speed), and s.split(/\r\n?/).join("\n")
41

Best is just to handle the two cases that you want to change specifically and not try to get too clever:

s.gsub(/\r\n?/, "\n")
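To see how the regex behaves on each case, here is a minimal sketch. The method name normalize_newlines and the test string are made up for illustration. The pattern matches a \r optionally followed by \n, so CRLF is replaced as a single unit and existing \n characters are never touched.

```ruby
# Hypothetical wrapper around the one-liner above.
# /\r\n?/ matches CRLF as one unit, or a lone CR; plain LF is left alone.
def normalize_newlines(text)
  text.gsub(/\r\n?/, "\n")
end

# Made-up input: CRLF, bare CR, and a blank line written as LF LF.
sample = "a\r\nb\rc\n\nd"
p normalize_newlines(sample)
```

Note the double-quoted "\n" in the replacement: in single quotes, '\n' is a backslash followed by the letter n, not a newline.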

6 Comments

Two things: You have to put \r\n first in the regex or else it will never match (because anything that could otherwise be matched by \r\n will be matched by \r first). And '\n' == "\\n", while what you want is "\n".
Change the single quotes to double quotes. Otherwise it doesn't work as intended.
It seems we're all on the same page :)
Nicely done that you don't bother changing the default case (\n -> \n is unnecessary). Didn't quite realise this at first :)
Interesting answer; I wonder why Ruby doesn't have something like python's os.linesep?
4

I think the cleanest solution would be to use a regular expression:

s.gsub!(/\r\n?/, "\n")
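One thing to keep in mind with the in-place version: gsub! modifies the receiver and returns nil when nothing was substituted, so its return value should not be chained or assigned blindly. A small sketch with a made-up string:

```ruby
# dup gives a mutable copy even if string literals are frozen.
s = "one\r\ntwo\rthree".dup

first = s.gsub!(/\r\n?/, "\n")   # substitutions happen; returns s itself
second = s.gsub!(/\r\n?/, "\n")  # nothing left to replace; returns nil

p s
p second
```

If you need the result as an expression, the non-bang gsub always returns a string.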

2 Comments

oops, this has a trap: double line breaks like \n\n will become \n.
Oops, thanks for pointing that out, seems jleedev was a bit faster though.
-9

Try opening them in the NetBeans IDE. It has asked me before, on one of the projects I opened from elsewhere, whether I wanted to fix the line endings. I think there might be a menu option to do it too, but that would be the first thing I would try.

1 Comment

thanks, but this isn't a one-off; this is for processing data in Ruby, not processing Ruby files.
