Timeline for Should character encodings besides UTF-8 (and maybe UTF-16/UTF-32) be deprecated?
Current License: CC BY-SA 2.5
20 events
| when | what | | by | license / comment | |
|---|---|---|---|---|---|
| Dec 6, 2017 at 17:11 | history | tweeted | twitter.com/StackSoftEng/status/938455952893898753 | ||
| Dec 6, 2017 at 11:59 | answer | added | user | timeline score: 4 | |
| Mar 2, 2015 at 22:18 | comment | added | Deduplicator | @Berin: Better: UTF-16 was hacked to support more codepoints, as it allowed far too few, and UTF-8 (as well as UTF-32) was restricted to the range that hack had a chance of covering. | |
| Feb 1, 2015 at 5:41 | comment | added | phuclv | utf8everywhere.org | |
| Jan 31, 2015 at 12:34 | comment | added | gnasher729 | The theoretical Unicode range is from 0 to 0x10ffff. Nothing more. That's what the Unicode standard says. UTF-8 handles all of Unicode and always will. It doesn't cover the hypothetical range of an encoding that isn't Unicode, but it covers all of Unicode. | |
| Jun 15, 2011 at 4:36 | vote | accept | Joey Adams | ||
| Apr 15, 2011 at 13:51 | answer | added | Peter Eisentraut | timeline score: 16 | |
| Jan 28, 2011 at 1:45 | answer | added | dan04 | timeline score: 4 | |
| Jan 26, 2011 at 14:19 | comment | added | Berin Loritsch | UTF-16 was also expanded to deal with compatibility with UTF-32 in much the same way that UTF-8 was expanded. | |
| Jan 26, 2011 at 14:14 | answer | added | Berin Loritsch | timeline score: 0 | |
| Jan 26, 2011 at 9:38 | comment | added | Donal Fellows | @dan04: Working with Russian texts used to be a pain, as they used multiple encodings that were substantially different and would usually just hack things to work by using different fonts (which would often lie about the encoding in use in their metadata). All in all, a horrible mess. I suspect they've cleaned up though – probably by moving to UTF-8 – because the number of support requests from that direction has dropped right off. | |
| Jan 26, 2011 at 8:57 | history | edited | Zekta Chan | edited tags | |
| Jan 26, 2011 at 6:42 | comment | added | dan04 | At least PostgreSQL deliberately deals with multiple character encodings. It sucks to have to deal with a random mix of UTF-8 and windows-1252 because someone just didn't care. | |
| Jan 26, 2011 at 5:48 | comment | added | dan04 | @mario: The original definition of UTF-8 allowed up to 6 bytes. It was later artificially restricted to only cover the characters UTF-16 could support. | |
| Jan 26, 2011 at 4:41 | answer | added | Jerry Coffin | timeline score: 7 | |
| Jan 26, 2011 at 4:19 | answer | added | Dean Harding | timeline score: 5 | |
| Jan 26, 2011 at 3:45 | answer | added | zneak | timeline score: 3 | |
| Jan 26, 2011 at 3:44 | answer | added | Mike Samuel | timeline score: 0 | |
| Jan 26, 2011 at 3:43 | comment | added | mario | UTF-8 has 21 encoding bits max, so it cannot encode all of the theoretical Unicode range, but it can encode all of the currently assigned Unicode characters (which is why it's believed sufficient). | |
| Jan 26, 2011 at 3:32 | history | asked | Joey Adams | CC BY-SA 2.5 |
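
The range discussion in the comments above (UTF-8 covering exactly U+0000..U+10FFFF in 1 to 4 bytes, after the original 6-byte form was restricted to what UTF-16 can represent) can be checked with a small sketch. The helper `utf8_len` is a hypothetical name, not from the original discussion; it mirrors the standard UTF-8 length thresholds and is compared against Python's own encoder.

```python
# Sketch of the UTF-8 byte-length rules mentioned in the comments:
# 1 byte up to U+007F, 2 up to U+07FF, 3 up to U+FFFF, 4 up to U+10FFFF.
# Anything beyond U+10FFFF is outside Unicode as standardized today.

def utf8_len(codepoint: int) -> int:
    """Number of bytes UTF-8 needs for a given code point."""
    if codepoint < 0x80:
        return 1
    if codepoint < 0x800:
        return 2
    if codepoint < 0x10000:
        return 3
    if codepoint <= 0x10FFFF:
        return 4
    raise ValueError("beyond the Unicode range U+10FFFF")

# Cross-check against Python's built-in UTF-8 encoder.
for cp in (0x41, 0x448, 0x20AC, 0x10FFFF):
    assert utf8_len(cp) == len(chr(cp).encode("utf-8"))
    print(f"U+{cp:06X} -> {utf8_len(cp)} byte(s)")

# Python itself enforces the U+10FFFF ceiling:
try:
    chr(0x110000)
except ValueError as e:
    print("chr(0x110000):", e)
```

Note that the 4-byte ceiling is exactly the restriction dan04's comment describes: the earlier 5- and 6-byte sequences of the original UTF-8 definition were dropped so UTF-8 and UTF-16 cover the same range.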