16

I'm on Rails 4 and Ruby 1.9.3

I use "strange" characters very often, so I have to declare UTF-8 encoding at the top of all .rb files.

Is there any way to set UTF-8 as the default encoding for Ruby 1.9.3?


I tried all the answers, but when running rake db:seed and creating an object whose attributes contain characters that are not valid US-ASCII, I still receive this error:

`block in trace_on': invalid byte sequence in US-ASCII (ArgumentError) 
2
  • Declaring the default code page as UTF-8 at the beginning of each file is needed when you use Unicode characters directly in that .rb file. What problem led to your question? 'UTF-8' cp is set in ruby 1.9.x by default. Do you have a string with a non-UTF code page? Commented Dec 11, 2013 at 14:21
  • 1
    "'UTF-8' cp is set in ruby 1.9.x by default." This is not true. Commented Dec 11, 2013 at 14:34

4 Answers 4

21

To change the source encoding (i.e. the encoding your source code is actually written in), you currently have to use the magic comment:

# encoding: utf-8 

It is not enough to either set the internal encoding (the encoding of the internal string representation after conversion) or the external encoding (the assumed encoding of read files). You actually have to set the magic encoding comment on top of files to set the source encoding.
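As a minimal sketch of what the magic comment affects: `__ENCODING__` reports the source encoding of the current file, and string literals in that file take on the same encoding. The variable names and the sample string are illustrative only.

```ruby
# encoding: utf-8
# The magic comment above must be the first line of the file
# (or the second line, if the first is a shebang).

# __ENCODING__ is the source encoding of this file.
source_encoding = __ENCODING__

# A literal containing non-ASCII characters is tagged with the source encoding.
greeting = "Grüße, café"

puts source_encoding
puts greeting.encoding
puts greeting.valid_encoding?
```

Without the comment, Ruby 1.9 would be expected to reject the non-ASCII literal at parse time, because it reads the file as US-ASCII.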

In ChiliProject we have a rake task which sets the correct encoding header in all files automatically before a release.
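The following is not the ChiliProject task itself, only a rough sketch of how such a step could work. The helper name `with_magic_comment` and the constant `MAGIC_COMMENT` are hypothetical; the helper takes the text of a source file and returns it with the comment added when none is present, keeping a shebang line first.

```ruby
# Hypothetical helper, not taken from ChiliProject.
MAGIC_COMMENT = "# encoding: utf-8"

# Returns the source text with a magic encoding comment added if it has none.
def with_magic_comment(source)
  lines = source.lines.to_a

  # A shebang has to stay on the first line, so set it aside.
  shebang = lines.first && lines.first.start_with?("#!") ? lines.shift : nil
  shebang += "\n" if shebang && !shebang.end_with?("\n")

  # Leave the file alone if the next line already declares an encoding,
  # e.g. "# encoding: utf-8" or "# -*- coding: utf-8 -*-".
  return source if lines.first.to_s =~ /\A#.*coding[:=]\s*[\w.-]+/

  [shebang, "#{MAGIC_COMMENT}\n", *lines].compact.join
end

puts with_magic_comment("puts 'hello'\n")

# Possible use inside a rake task (paths are an assumption):
#   Dir.glob("{app,lib}/**/*.rb").each do |path|
#     File.write(path, with_magic_comment(File.read(path)))
#   end
```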

As for encoding defaults:

  • Ruby 1.8 and below didn't know the concept of string encodings at all. Strings were more or less byte arrays.
  • Ruby 1.9: default string encoding is US-ASCII everywhere.
  • Ruby 2.0 and above: default string encoding is UTF-8.

Thus, if you use Ruby 2.0, you could skip the encoding comment and correctly assume UTF-8 encoding everywhere by default.
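To see which defaults a given interpreter actually uses, a small script along these lines can help. It is only a sketch: `encoding_report` is a made-up helper name, and the file deliberately has no magic comment so that `__ENCODING__` reflects the interpreter's default source encoding.

```ruby
# Collect the source, external and internal encodings of the running interpreter.
def encoding_report
  {
    :source   => __ENCODING__,
    :external => Encoding.default_external,
    :internal => Encoding.default_internal
  }
end

report = encoding_report
report.each { |name, value| puts "#{name}: #{value.inspect}" }

# The same bytes can be valid or invalid depending on the encoding they are tagged with.
# \xC3\xA9 is the UTF-8 byte sequence for "é".
bytes    = "caf\xC3\xA9"
as_ascii = bytes.dup.force_encoding(Encoding::US_ASCII)
as_utf8  = bytes.dup.force_encoding(Encoding::UTF_8)

puts as_ascii.valid_encoding?
puts as_utf8.valid_encoding?
```

A string tagged US-ASCII that contains such bytes is the kind of value that can trigger "invalid byte sequence in US-ASCII" once an operation needs to interpret its characters.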


14

I think you would want one of the following, depending on the context.

Encoding.default_internal = Encoding::UTF_8
Encoding.default_external = Encoding::UTF_8

This setting is made in the environment.rb file.
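As a rough illustration of what the external default changes, the sketch below reads the same file under two different values of `Encoding.default_external`. The helper `read_with_external` and the temporary file are assumptions made for the demo, and it assumes `Encoding.default_internal` is left at its usual value of `nil`.

```ruby
require "tempfile"

# Hypothetical helper: read a file while a given external default is in effect,
# then restore the previous default.
def read_with_external(path, encoding)
  previous = Encoding.default_external
  Encoding.default_external = encoding
  File.read(path)
ensure
  Encoding.default_external = previous
end

# Write the UTF-8 bytes for "café" to a temporary file without any transcoding.
file = Tempfile.new("encoding_demo")
file.binmode
file.write("caf\xC3\xA9")
file.close

as_ascii = read_with_external(file.path, Encoding::US_ASCII)
as_utf8  = read_with_external(file.path, Encoding::UTF_8)

puts as_ascii.encoding
puts as_ascii.valid_encoding?
puts as_utf8.encoding
puts as_utf8.valid_encoding?

file.unlink
```

The bytes on disk are identical in both reads; only the encoding tag on the resulting string differs, which matches the US-ASCII default reported for the dockerized environment in the comments.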

2 Comments

This only defines the internal encoding (the internal string representation after conversion) and the external encoding (the default encoding of read files), not the encoding of Ruby source files. That can only be changed with a magic comment at the top of each source file.
I had to use this in a dockerized environment, where it defaulted to US-ASCII. Thank you.
7

In Ruby 1.9 the default is US-ASCII.

In Ruby 2.0 the default is UTF-8.


change Ruby version

or

config.encoding = "utf-8" # application.rb 

and in your database.yml

development:
  adapter: your_db
  host: localhost
  encoding: utf8

1 Comment

The encoding setting in database.yml is just a recommendation and not strictly necessary.
2

In your application.rb

# Configure the default encoding used in templates for Ruby
config.encoding = "utf-8"

This is not the whole story, as pointed out by Holger; check out this question for further explanation.

3 Comments

This only defines the internal encoding (the internal string representation after conversion) and the external encoding (the default encoding of read files), not the encoding of Ruby source files. That can only be changed with a magic comment at the top of each source file.
This answer is the same as one that someone else already gave.
How so? I feel my answer is not the best, but it adds value that the other answers do not mention (the link and the discussion there).
