Timeline for Should UTF-16 be considered harmful?

9 events

when	what		by	license	comment
Aug 13, 2015 at 16:25	history	unlocked	Thomas Owens♦
Aug 13, 2015 at 16:05	history	locked	CommunityBot
Aug 13, 2015 at 15:47	history	edited	user22815	CC BY-SA 3.0	Spelling, other minor improvements for readability.
Jun 19, 2014 at 5:05	comment	added	musiphil		Even though UTF-32 is fixed-width for code points, it is not fixed-width for characters. (Heard of something called "combining characters"?) So you can't go to the N'th character simply by indexing 4N into the byte array.
May 1, 2012 at 0:16	comment	added	Qwertie		Endianness issues are unavoidable as long as different processors continue to use different byte orders. However, it might have been nice if there were a "preferred" byte order for file storage of UTF-16.
Aug 18, 2011 at 21:32	history	made wiki			Post Made Community Wiki
Aug 11, 2011 at 14:30	comment	added	tchrist		@Tronic: Technically, this is not true. Although UCS-4 can store any 32-bit integer, UTF-32 is forbidden from storing the non-character code points that are illegal for interchange, such as 0xFFFF, 0xFFFE, and all the surrogates. UTF is a transport encoding, not an internal one.
Oct 20, 2010 at 23:34	comment	added	Tronic		Unspecified endianness is supposed to include a BOM as the first character, used for determining which way the string should be read. UCS-4 and UTF-32 indeed are the same nowadays, i.e. a numeric UCS value between 0 and 0x10FFFF stored in a 32-bit integer.
Oct 19, 2010 at 7:06	history	answered	Patrick Horgan	CC BY-SA 2.5
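
Two of the comments above (musiphil's on combining characters and Tronic's on the BOM) can be checked with a short Python sketch. This is only an illustration, not something taken from the answer: the sample string and the helper name `code_point_at` are made up here, and it assumes CPython's standard `utf-32-le` and `utf-16` codecs.

```python
import codecs
import unicodedata

# "é" written as a base letter plus a combining acute accent, then "a".
# Three code points, but a reader sees only two characters.
text = "e\u0301a"

# UTF-32 uses exactly 4 bytes per code point (little-endian here, no BOM).
encoded = text.encode("utf-32-le")
assert len(encoded) == 4 * len(text)


def code_point_at(data: bytes, n: int) -> str:
    """Hypothetical helper: return the n-th code point by indexing 4*n bytes."""
    return data[4 * n : 4 * n + 4].decode("utf-32-le")


# Indexing 4*N bytes lands on the combining accent, not on the second
# visible character ("a").
second = code_point_at(encoded, 1)
is_combining = unicodedata.combining(second) != 0

# With unspecified endianness, Python's "utf-16" codec prepends a BOM
# in the machine's native byte order.
with_bom = "a".encode("utf-16")
starts_with_bom = with_bom[:2] in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)
```

Here `second` is U+0301, so byte offset 4N addresses a code point, not a user-perceived character. Finding the N'th character would need grapheme cluster segmentation, which the standard library does not provide.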