
I have a pretty simple script in Perl:

    use JSON;
    use open qw/ :std :encoding(utf8) /;

    #my $ref = JSON::decode_json($json_contents);

    my $path = "/home/chambres/web/x.org/public_html/cgi-bin/links/admin/booking_import/import/file.json";
    my $json_contents = slurp_utf8_file($path);
    my $ref = JSON->new->utf8->decode($json_contents);

    sub slurp_utf8_file {
        my @back;
        #open my $in, '<:encoding(UTF-8)', $_[0] or die $!;
        open my $in, "<$_[0]" or die $!;
        while (<$in>) {
            push @back, $_
        }
        close ($in);
        return join("", @back);
    }

The file is encoded in UTF-8 according to Notepad++ (screenshot omitted).

...yet when I run my script I get:

    perl test.cgi
    Wide character in subroutine entry at test.cgi line 11.

Line 11 is:

my $ref = JSON->new->utf8->decode($json_contents); 

I'm baffled as to what I'm doing wrong. Maybe I just need a break! Any advice would be much appreciated!

Aren't you attempting a double UTF-8 decode? You already decode UTF-8 in the slurp, why would you need ->utf8 on the JSON object? Commented Feb 5, 2019 at 13:55

1 Answer


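You are decoding UTF-8 twice. The `use open qw/ :std :encoding(utf8) /` pragma already makes the file handle return decoded characters, while `->utf8` tells the JSON object to expect undecoded UTF-8 bytes. The same error can be reproduced with an explicit encoding layer: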

    #!/usr/bin/perl
    use strict;
    use warnings;

    use JSON;
    use Data::Dumper;

    # The :encoding(UTF-8) layer decodes the bytes into Perl characters.
    open(my $fh, '<:encoding(UTF-8)', $ARGV[0]) or die $!;
    my @lines = <$fh>;
    close($fh) or die $!;

    # With ->utf8 the already-decoded string is decoded again and dies:
    #   Wide character in subroutine entry at dummy.pl line 14.
    # my $ref = JSON->new->utf8->decode(join('', @lines));

    # OK, no warning.
    my $ref = JSON->new->decode(join('', @lines));

    print Dumper($ref);

    exit 0;

Test run

    $ cat dummy.json
    {
       "path": "ä⁈"
    }

    # with ->utf8
    $ perl dummy.pl dummy.json
    Wide character in subroutine entry at dummy.pl line 14.

    # without ->utf8
    $ perl dummy.pl dummy.json
    $VAR1 = {
              'path' => "\x{e4}\x{2048}"
            };
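As a rough illustration of the rule behind this, the sketch below shows the two consistent pairings (raw bytes with `->utf8`, decoded characters without it) and the failing mix. It makes a few assumptions that are not part of the original post: it uses `JSON::PP`, the pure-Perl implementation bundled with Perl, so that it runs without CPAN modules (the `JSON` module accepts the same `new`, `utf8` and `decode` calls), it writes its own temporary file, and the `slurp` helper is a hypothetical name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use JSON::PP;                    # assumption: core stand-in for the JSON module
use File::Temp qw(tempfile);

# Write { "path": "ä⁈" } as raw UTF-8 bytes into a temporary file.
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode $tmp;
print {$tmp} qq({ "path": "\xC3\xA4\xE2\x81\x88" }\n);
close $tmp or die $!;

# Hypothetical helper: read a whole file through the given PerlIO layer.
sub slurp {
    my ($file, $layer) = @_;
    open my $fh, "<$layer", $file or die "Cannot open $file: $!";
    local $/;                    # slurp mode: read everything at once
    my $content = <$fh>;
    close $fh or die $!;
    return $content;
}

# Pairing 1: read raw bytes and let ->utf8 do the decoding.
my $from_bytes = JSON::PP->new->utf8->decode(slurp($path, ':raw'));

# Pairing 2: the file handle decodes, so the JSON object must not.
my $from_chars = JSON::PP->new->decode(slurp($path, ':encoding(UTF-8)'));

# The mix from the question: decoded characters handed to ->utf8.
my $ok  = eval { JSON::PP->new->utf8->decode(slurp($path, ':encoding(UTF-8)')); 1 };
my $err = $ok ? '' : $@;

# Print code points instead of the characters to avoid output-encoding issues.
for my $result ($from_bytes, $from_chars) {
    print join(' ', map { sprintf 'U+%04X', ord } split //, $result->{path}), "\n";
}
print "mixed pairing failed: $err" unless $ok;
```

Both consistent pairings yield the same two-character string. Which one to choose is mostly a matter of taste, as long as the decoding happens exactly once.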

3 Comments

Ah man - sometimes you just need an extra pair of eyes to look over it! I think this came from trying to use File::Slurp::read_file, but that apparently isn't good with utf8 - so I moved over to reading the file myself, but guess I didn't take that part out as well. Thanks for saving my sanity :)
@AndrewNewby It is recommended to use File::Slurper instead, which has much more straightforward functions. From that you can use read_binary to read the bytes as-is, and then use a standard UTF-8 decoding JSON decoder.
@Grinnz thanks for the recommendation. I'll try and remember that for the next project :) (for now, the method I'm using is working as needed, so not much point including a whole extra module :))
