I want to replace Unicode character pairs in XeTeX, similar to the question "Country flags unicode char". I don't want to specify every option (every country code), and I want to stay flexible for other emoji (skin color variations, other non-country-code flags, etc.). The problem with the code in the answer there is that it doesn't allow pairable characters to appear on their own (even if I modify the False statement accordingly).
After days of fiddling I found a solution (see below) which works for almost all cases I could imagine.
My questions:
- How can I check the last cases (9: following ~ and 21: following alignment character), and why does it not work if I put a letter between my character and the alignment character (case 22), although it works with a separating space (case 21)?
- Have I forgotten any cases?
- Is there a more elegant way? My solution seems very crude to me.
My solution, including an example list with the cases mentioned (the non-working cases, i.e. the last two items, are commented out):
    \documentclass[varwidth,border=10pt]{standalone}
    \usepackage{expl3}
    \usepackage{newunicodechar}
    \renewcommand{\familydefault}{\sfdefault} % math is more obvious
    \ExplSyntaxOn
    \cs_new_protected:Npn \single_uni:n #1 {
      \int_to_Hex:n{`#1}
    }
    % method is very specific, am I forgetting something? Not possible with ^^7e(~), ...
    \cs_new_protected:Nn \dual_uni:n {
      \peek_catcode:NTF \c_space_token {
        \int_to_Hex:n{`#1}
      }{
        \peek_catcode:NTF \c_other_token {
          \int_to_Hex:n{`#1}
        }{
          \peek_charcode:NTF ^^24 { % math, \c_math_toggle_token not working
            \int_to_Hex:n{`#1}
          }{
            \peek_charcode:NTF ^^5e { % math sub, \c_math_subscript_token not working
              \int_to_Hex:n{`#1}
            }{
              \peek_charcode:NTF ^^5f { % math super, \c_math_superscript_token not working
                \int_to_Hex:n{`#1}
              }{
                \peek_catcode:NTF \c_group_end_token { % end group
                  \int_to_Hex:n{`#1}
                }{
                  \peek_catcode:NTF \c_group_begin_token { % begin group
                    \int_to_Hex:n{`#1}
                  }{
                    \peek_catcode:NTF \c_alignment_token { % has no effect?!
                      \int_to_Hex:n{`#1}
                    }{
                      \dual_uni_cont:nn{#1}
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
    \cs_new_protected:Nn \dual_uni_cont:nn{
      \textbf{\int_to_Hex:n{`#1}}\textit{\int_to_Hex:n{`#2}}
    }
    \newunicodechar{➀}{\single_uni:n{➀}} % single_uni
    \newunicodechar{➁}{\dual_uni:n{➁}} % dual_uni
    % ➂ undeclared newunicodechar
    \ExplSyntaxOff
    \begin{document}
    \begin{enumerate}
    \item[] expected \quad -- \quad result
    \item ➂ \quad -- \quad ➂ % comparison undeclared uni (expected blank, because not in font)
    \item 2780 \quad -- \quad ➀ % comparison single_uni
    \item \textbf{2781}\textit{78} \quad -- \quad ➁x % following letter
    \item \textbf{2781}\textit{21} \quad -- \quad ➁! % following other
    \item \textbf{2781}\textit{2782} \quad -- \quad ➁➂ % following unicode (other)
    \item \textbf{2781}\textit{2780} \quad -- \quad ➁➀ % following "newunicode" as \single_uni (active?!)
    \item \textbf{2781}\textit{2781}x \quad -- \quad ➁➁x % following "newunicode" as \dual_uni (active?!), following letter
    \item 2781 x \quad -- \quad ➁ x % following space
    \item 2781~x \quad -- \quad ➁~x % following active
    \item 2781 \quad -- \quad ➁\\ % following newline
    \phantom{nothing} % nothing for a new line
    \item $\textbf{2781}\textit{78}$ \quad -- \quad $➁x$ % in math (following letter)
    \item $2781^x$ \quad -- \quad $➁^x$ % following math superscript
    \item $2781_x$ \quad -- \quad $➁_x$ % following math subscript
    \item $2781$ \quad -- \quad $➁$ % in math (following math toggle)
    \item 2781\$ \quad -- \quad ➁\$ % following \$
    \item 2781\textbullet \quad -- \quad ➁\textbullet % following command
    \item 2781\footnote{x} \quad -- \quad ➁\footnote{x} % following command
    \item {2781} \quad -- \quad {➁} % following group end
    \item 2781{x} \quad -- \quad ➁{x} % following group begin
    \item 2781 \quad -- \quad ➁% %x following comment
    \item \begin{tabular}{llllll} % in tabular, following space
    2781&x &\quad -- \quad& ➁ x& x\\
    \end{tabular}
    \item \begin{tabular}{llllll} % in tabular, following letter, alignment
    2781&x &\quad -- \quad& %➁x& x\\
    \end{tabular}
    \item \begin{tabular}{llllll} % in tabular, following alignment, space, letter
    2781&x &\quad -- \quad& %➁& x\\
    \end{tabular}
    \end{enumerate}
    \end{document}

In the current result, the last two items produce no output. If a character that is defined as a potential double character finds a following "partner", the result is the original character in bold and the following one in italics. If the character is not a double character or has not found a "partner", it is printed normally.
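For comparison, the nested lookahead could perhaps be flattened along the following lines. This is only a sketch and has not been checked against all the cases in the list above. All `pairdemo` names are made up for illustration. It assumes XeLaTeX and a reasonably recent expl3 providing `\peek_after:Nw`, `\bool_lazy_any:nTF` and the `\token_if_..._p:N` predicates. It looks at the next token once, rejects the "never pair" categories in one flat list, and only then grabs the token to check whether it is a control sequence. The alignment test is kept only for parity with the code above, where it has no effect. Active characters such as `~` are still treated as partners here, so case 9 remains open.

```latex
\documentclass{article}
\usepackage{expl3}
\usepackage{newunicodechar} % compile with XeLaTeX, as in the question
\ExplSyntaxOn
% All "pairdemo" names are hypothetical, chosen for this sketch only.
\tl_new:N \l__pairdemo_first_tl

% Print the code point of a single character in upper-case hex.
\cs_new:Npn \__pairdemo_hex:n #1 { \int_to_Hex:n { `#1 } }
\cs_generate_variant:Nn \__pairdemo_hex:n { V }

% Entry point: store the first character, then inspect the next token
% without absorbing it.
\cs_new_protected:Npn \pairdemo_dual:n #1
  {
    \tl_set:Nn \l__pairdemo_first_tl {#1}
    \peek_after:Nw \__pairdemo_decide:
  }

% One flat list of "never pair" categories instead of nested peeks.
\cs_new_protected:Npn \__pairdemo_decide:
  {
    \bool_lazy_any:nTF
      {
        { \token_if_space_p:N            \l_peek_token }
        { \token_if_group_begin_p:N      \l_peek_token }
        { \token_if_group_end_p:N        \l_peek_token }
        { \token_if_math_toggle_p:N      \l_peek_token }
        { \token_if_math_superscript_p:N \l_peek_token }
        { \token_if_math_subscript_p:N   \l_peek_token }
        { \token_if_alignment_p:N        \l_peek_token } % parity only
      }
      { \__pairdemo_hex:V \l__pairdemo_first_tl } % print alone
      { \__pairdemo_grab:N }                      % safe to grab one token
  }

% A control sequence is put back untouched; anything else
% (letter, other, active character) is taken as the partner.
\cs_new_protected:Npn \__pairdemo_grab:N #1
  {
    \token_if_cs:NTF #1
      { \__pairdemo_hex:V \l__pairdemo_first_tl #1 }
      {
        \textbf { \__pairdemo_hex:V \l__pairdemo_first_tl }
        \textit { \__pairdemo_hex:n {#1} }
      }
  }
\newunicodechar{➁}{\pairdemo_dual:n{➁}}
\ExplSyntaxOff
\begin{document}
➁x \quad ➁ x \quad {➁} \quad $➁^x$ \quad ➁\textbullet
\end{document}
```

The control-sequence check is done on the grabbed token rather than on `\l_peek_token`, because a peeked active character and a peeked macro both look like macros once they have been `\let` to `\l_peek_token`.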
Update: I tried another approach, which has different problems: "Compare macro names instead of meaning". This question still needs an answer, though.
\str_case:nnF in the cited question? I would be very happy if you could make a small example :-)