
So I have this character:

🀀

It is MAHJONG TILE EAST WIND, which has the Unicode code point U+1F000 (UTF-16 surrogate pair U+D83C U+DC00) and the UTF-8 encoding F0 9F 80 80.

My question is: how do I escape this in JavaScript?

I see escapes like \uff00 all the time, but four hex digits will only take you up to U+FFFF. Just writing '\u1F000' returns the (incorrect) 'ἀ0', and padding the extra digits with 0s, as in '\u0001F000', is read as \u0001 followed by the literal text F000. How do I escape values that are higher (such as my character above)?

And how do I escape not just the Unicode point but also the UTF-8 encoding?

Tacking on to this, I have noticed that the Node REPL is able to show many Unicode values but not some (such as emoji), even when my terminal window (Mac) normally can. Is there any rhyme or reason to this?

2 Answers


You can escape the character with two \uXXXX escapes, one for each half of its UTF-16 surrogate pair (needed for any code point above U+FFFF).
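If you would rather not work out the pair by hand, the arithmetic can be sketched roughly like this. `toEscape` is a hypothetical helper name, not a built-in, and the last two lines assume an ES2015-capable engine for the `\u{...}` escape and `String.fromCodePoint`:

```javascript
// Hypothetical helper: build the \uXXXX escape text for a Unicode code point.
function toEscape(codePoint) {
  const hex4 = (n) => n.toString(16).toUpperCase().padStart(4, '0');
  if (codePoint <= 0xFFFF) {
    // Fits in a single 16-bit code unit.
    return '\\u' + hex4(codePoint);
  }
  // Above U+FFFF: subtract 0x10000, then split the remaining 20 bits.
  const offset = codePoint - 0x10000;
  const high = 0xD800 + (offset >> 10);   // top 10 bits
  const low = 0xDC00 + (offset & 0x3FF);  // bottom 10 bits
  return '\\u' + hex4(high) + '\\u' + hex4(low);
}

console.log(toEscape(0x1F000)); // \uD83C\uDC00
console.log(toEscape(0x271F));  // \u271F

// ES2015 engines also accept the code point directly.
console.log('\u{1F000}' === '\uD83C\uDC00');                   // true
console.log(String.fromCodePoint(0x1F000) === '\uD83C\uDC00'); // true
```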

To work with UTF-8 bytes, look into typed arrays and TextEncoder / TextDecoder. They are fairly new, so you may need a polyfill in some browsers.
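A minimal sketch of the round trip, assuming an environment where `TextEncoder` and `TextDecoder` are available as globals (current browsers and recent Node versions):

```javascript
// Encode the string to UTF-8 bytes (a Uint8Array).
const bytes = new TextEncoder().encode('\uD83C\uDC00');
const hex = Array.from(bytes, (b) => b.toString(16).toUpperCase().padStart(2, '0')).join(' ');
console.log(hex); // F0 9F 80 80

// Decode raw UTF-8 bytes back into a JavaScript string.
const decoded = new TextDecoder('utf-8').decode(new Uint8Array([0xF0, 0x9F, 0x80, 0x80]));
console.log(decoded === '\uD83C\uDC00'); // true
```

So the UTF-8 form is not something you escape inside a string literal; you keep the bytes in a typed array and decode them when you need a string.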

Example

document.write('<h1>\uD83C\uDC00</h1>');


2 Comments

I see. So for code points where two values are not given (e.g. U+271F), do I just have to do the math? Also, is there a way to use the UTF-8 encoding?
@Startec Updated about UTF-8. You can use 16-bit values the same way, e.g. \u271f -> ✟

JavaScript strings are not UTF-8. They are sequences of 16-bit code units (often loosely described as UCS-2, but with support for UTF-16-style surrogate pairs). You can escape astral plane characters with two 16-bit escapes: "\ud83c\udc00".

"🀀".charCodeAt(0).toString(16) // => "d83c"
"🀀".charCodeAt(1).toString(16) // => "dc00"
console.log("\ud83c\udc00") // => 🀀

This also means that length counts 16-bit code units rather than characters, so strings containing astral characters report a larger length than you might expect, and any indexing or substringing can split a surrogate pair:

"🀀".length // => 2 
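If you need to count or index by code point instead, one possible approach (assuming an ES2015-capable engine) is to go through the string iterator or `codePointAt`, both of which treat a surrogate pair as one unit. The variable names here are only illustrative:

```javascript
const tile = '\uD83C\uDC00';

// length counts UTF-16 code units; spreading iterates by code point.
console.log(tile.length);      // 2
console.log([...tile].length); // 1

// codePointAt reads the whole surrogate pair starting at index 0.
console.log(tile.codePointAt(0).toString(16)); // 1f000

// Slicing by code unit can leave a lone surrogate behind.
const half = tile.slice(0, 1);
console.log(half === '\uD83C'); // true
```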
