I'm retrieving a string from the contents of a URL, like this:

NSString *urlStr = [NSString stringWithFormat:@"http://www.example.com/string_to_retrieve"];
NSURL *url = [NSURL URLWithString:urlStr];
NSString *resp = [NSString stringWithContentsOfURL:url encoding:NSASCIIStringEncoding error:nil];

and writing it to a file, like this:

NSString *outFile = @"/path/to/myfile";
[resp writeToFile:outFile atomically:YES encoding:NSUTF8StringEncoding error:nil];

My string contains several special characters such as "ö", which should be represented by the hex value F6. But when I open the written file in a hex editor, I see that "ö" (F6) was converted to two bytes, which show up as two other characters: "Ã¶" (C3 B6).

I tried several other string encodings in both

NSString *resp = [NSString stringWithContentsOfURL:url encoding:NSASCIIStringEncoding error:nil]; 

and

[resp writeToFile:outFile atomically:YES encoding:NSUTF8StringEncoding error:nil]; 

but always with bad results.

NSASCIIStringEncoding seems to be the only encoding that lets me get the string from my URL: if I use other encodings such as NSUTF8StringEncoding, all I get is nil.

NSUTF8StringEncoding seems to be the only encoding that lets me write to my file: if I use other encodings such as NSASCIIStringEncoding, all I get is a 0-byte file.

So how can I properly retrieve that string and write it to my file?

1 Answer

F6 is the character code of "ö" in the NSISOLatin1StringEncoding, so

[resp writeToFile:outFile atomically:YES encoding:NSISOLatin1StringEncoding error:nil]; 

should give the desired result. (NSISOLatin2StringEncoding works as well. I am not sure about the differences. The documentation of the supported encodings is not very verbose.)
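As a minimal sketch of the whole round trip, assuming the server really sends ISO-8859-1 bytes: the helper name `SaveLatin1Text` is hypothetical and not part of the question's code. It uses the same encoding for reading and writing and passes an `NSError` instead of `nil`, so a failed conversion reports a reason rather than silently producing nil or an empty file.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: reads text from `url` as ISO-8859-1 and writes it to
// `outFile` with the same encoding, so a byte such as F6 ("ö") is preserved.
// Returns NO and fills `error` if either step fails.
static BOOL SaveLatin1Text(NSURL *url, NSString *outFile, NSError **error) {
    NSString *resp = [NSString stringWithContentsOfURL:url
                                              encoding:NSISOLatin1StringEncoding
                                                 error:error];
    if (resp == nil) {
        return NO; // the read or the decoding failed; see *error
    }
    return [resp writeToFile:outFile
                  atomically:YES
                    encoding:NSISOLatin1StringEncoding
                       error:error];
}
```

It could be called as `SaveLatin1Text(url, @"/path/to/myfile", &err)` with `NSError *err = nil;` declared beforehand; logging `err` on failure usually shows which step rejected the encoding.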


Update:

  • NSISOLatin1StringEncoding is the ISO-8859-1 encoding, which is intended for "Western European" languages.
  • NSISOLatin2StringEncoding is the ISO-8859-2 encoding, which is intended for "Eastern European" languages.
  • @Esailija states (see comments below) that Windows-1252 might be a better choice, the corresponding encoding is NSWindowsCP1252StringEncoding.
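To compare the encodings listed above without going through a file and a hex editor, the bytes can be inspected directly. The following is only a sketch: `EncodedBytes` and `LogEncodingsOfOUmlaut` are hypothetical names, while `dataUsingEncoding:` is the standard NSString method, which returns nil when the string cannot be represented in the requested encoding without loss.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the bytes `encoding` produces for `text`,
// or nil if `text` cannot be converted without losing information.
static NSData *EncodedBytes(NSString *text, NSStringEncoding encoding) {
    return [text dataUsingEncoding:encoding];
}

// Hypothetical helper: logs how "ö" (U+00F6) is encoded in several encodings.
static void LogEncodingsOfOUmlaut(void) {
    NSString *text = @"\u00F6"; // "ö"
    // Single byte F6, as the question expects
    NSLog(@"ISO Latin 1:  %@", EncodedBytes(text, NSISOLatin1StringEncoding));
    NSLog(@"ISO Latin 2:  %@", EncodedBytes(text, NSISOLatin2StringEncoding));
    NSLog(@"Windows-1252: %@", EncodedBytes(text, NSWindowsCP1252StringEncoding));
    // Two bytes C3 B6, which is what the question saw in the hex editor
    NSLog(@"UTF-8:        %@", EncodedBytes(text, NSUTF8StringEncoding));
    // "ö" is not an ASCII character, so no lossless conversion exists
    NSLog(@"ASCII:        %@", EncodedBytes(text, NSASCIIStringEncoding));
}
```

The three single-byte encodings agree on "ö" but differ on other characters, so the choice between them should follow what the server actually sends.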

5 Comments

Thanks, NSISOLatin1StringEncoding seems to work! I thought I had already tested my code with that encoding too, but maybe I used it in the NSString *resp definition and used NSUTF8StringEncoding in the writeToFile call, which probably gave me the bad result.
The difference is that they are different encodings, that just happen to encode ö in the same way. Kinda like virtually any encoding encodes ASCII characters the same way, which means that possible encoding screw ups go unnoticed if only ASCII characters are used. You can look up the differences in the code pages from here and here.
@Esailija: You are right. What I meant is that the documentation does not specify exactly which encoding corresponds to which standard. In the meantime I figured out that NSISOLatin1StringEncoding = ISO 8859-1 (Western European), and NSISOLatin2StringEncoding = ISO 8859-2 (Eastern European). But "ö" is not an ASCII character, and is for example encoded differently in the Mac Roman encoding.
@MartinR Yes, Latin-1 is pretty ambiguous in practice: browsers treat it as Windows-1252, and so does MySQL. But there is no harm in using Windows-1252 instead, because it's practically a superset of ISO-8859-1.
@Esailija: Thank you for the feedback. I will add that information to the answer if that is OK with you.
