20 events
when | toggle | format | what | by | license | comment
Dec 6, 2017 at 17:11 history tweeted twitter.com/StackSoftEng/status/938455952893898753
Dec 6, 2017 at 11:59 answer added user timeline score: 4
Mar 2, 2015 at 22:18 comment added Deduplicator @Berin: Better: UTF-16 was hacked to support more codepoints, as it allowed far too few, and UTF-8 (as well as UTF-32) was restricted to the range that hack had a chance of covering.
Feb 1, 2015 at 5:41 comment added phuclv utf8everywhere.org
Jan 31, 2015 at 12:34 comment added gnasher729 The theoretical Unicode range is from 0 to 0x10ffff. Nothing more. That's what the Unicode standard says. UTF-8 handles all of Unicode and always will. It doesn't cover the hypothetical range of an encoding that isn't Unicode, but it covers all of Unicode.
Jun 15, 2011 at 4:36 vote accept Joey Adams
Apr 15, 2011 at 13:51 answer added Peter Eisentraut timeline score: 16
Jan 28, 2011 at 1:45 answer added dan04 timeline score: 4
Jan 26, 2011 at 14:19 comment added Berin Loritsch UTF-16 was also expanded for compatibility with UTF-32, in much the same way that UTF-8 was expanded.
Jan 26, 2011 at 14:14 answer added Berin Loritsch timeline score: 0
Jan 26, 2011 at 9:38 comment added Donal Fellows @dan04: Working with Russian texts used to be a pain, as they used multiple encodings that were substantially different and would usually just hack things to work by using different fonts (which would often lie about the encoding in use in their metadata). All in all, a horrible mess. I suspect they've cleaned up though – probably by moving to UTF-8 – because the number of support requests from that direction has dropped right off.
Jan 26, 2011 at 8:57 history edited Zekta Chan (edited tags)
Jan 26, 2011 at 6:42 comment added dan04 At least PostgreSQL deliberately deals with multiple character encodings. It sucks to have to deal with a random mix of UTF-8 and windows-1252 because someone just didn't care.
Jan 26, 2011 at 5:48 comment added dan04 @mario: The original definition of UTF-8 allowed up to 6 bytes. It was later artificially restricted to only cover the characters UTF-16 could support.
Jan 26, 2011 at 4:41 answer added Jerry Coffin timeline score: 7
Jan 26, 2011 at 4:19 answer added Dean Harding timeline score: 5
Jan 26, 2011 at 3:45 answer added zneak timeline score: 3
Jan 26, 2011 at 3:44 answer added Mike Samuel timeline score: 0
Jan 26, 2011 at 3:43 comment added mario UTF-8 has at most 21 encoding bits, so it cannot encode the whole theoretical Unicode range, but it can encode all currently assigned Unicode characters (which is why it's believed sufficient).
Jan 26, 2011 at 3:32 history asked Joey Adams CC BY-SA 2.5
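
The 21-bit ceiling discussed in the comments above (mario, dan04, gnasher729) can be sketched as follows. This is a minimal illustration, not code from the thread; `utf8_length` is a hypothetical helper name, and the ranges reflect the modern RFC 3629 restriction of UTF-8 to U+10FFFF:

```python
def utf8_length(cp: int) -> int:
    """Return the number of bytes UTF-8 uses to encode code point cp
    under the modern (RFC 3629) definition, which stops at U+10FFFF."""
    if cp < 0:
        raise ValueError("negative code point")
    if cp <= 0x7F:
        return 1  # 7 payload bits: 0xxxxxxx (ASCII)
    if cp <= 0x7FF:
        return 2  # 11 payload bits
    if cp <= 0xFFFF:
        return 3  # 16 payload bits
    if cp <= 0x10FFFF:
        return 4  # 21 payload bits: the current ceiling
    # The original pre-restriction UTF-8 had 5- and 6-byte forms
    # reaching up to 31 bits; those sequences are now invalid.
    raise ValueError("beyond U+10FFFF: not encodable in restricted UTF-8")

# Cross-check against Python's own encoder:
assert utf8_length(0x41) == len("A".encode("utf-8"))
assert utf8_length(0x10FFFF) == len(chr(0x10FFFF).encode("utf-8"))
```

0x10FFFF is exactly the highest code point reachable by a UTF-16 surrogate pair, which is the restriction dan04 and Deduplicator describe.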