
If you only need filenames and do not need them to be "human readable", you could take advantage of \pdfstringdef:

    \documentclass{article}
    \usepackage[utf8]{inputenc}
    \usepackage[T1]{fontenc}
    \usepackage[unicode]{hyperref}
    \makeatletter
    \begingroup
    \catcode`| 0
    \catcode`\\ 12
    |gdef|makestring@i\#1#2#3#4%
      {#1#2#3|if|relax#4|expandafter|@gobbletwo|fi|makestring@i#4}
    |endgroup
    \newcommand*{\makestring}[2]{%
      \pdfstringdef\makestring@{#2}%
      \edef#1{\expandafter\makestring@i\makestring@\relax}%
    }
    \makeatother
    \begin{document}
    \makestring{\foo}{æüßéñ}
    \texttt{\meaning\foo}
    \end{document}


A variation on this theme, which is much more efficient, shows the UTF-8 bytes in decimal. One could produce hexadecimal output instead if desired (in fact, there may be macros in utf8.def that could be used here).

    \documentclass{article}
    \usepackage[utf8]{inputenc}
    \usepackage[T1]{fontenc}
    \makeatletter
    \newcommand*\MakeString[2]{%
      \begingroup
      \def\UTFviii@two@octets##1##2{\the\numexpr`##1\relax\the\numexpr`##2}%
      \def\UTFviii@three@octets##1##2##3{\the\numexpr`##1\relax\the\numexpr`##2\relax\the\numexpr`##3\relax}%
      \def\UTFviii@four@octets##1##2##3##4{\the\numexpr`##1\relax\the\numexpr`##2\relax\the\numexpr`##3\relax\the\numexpr`##4\relax}%
      \xdef#1{#2}%
      \endgroup
    }
    \makeatother
    \begin{document}
    \MakeString{\foo}{æüßéñ}
    \texttt{\meaning\foo}
    \show\foo
    \end{document}

Produces:

    > \foo=macro:
    ->195166195188195159195169195177.
    l.23 \show\foo

This should be improved so that each byte produces a three-digit decimal number; here leading zeros are stripped!
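The three-digit padding mentioned above could be sketched as follows. This is only a sketch under the same assumptions as the preceding code (pdfLaTeX with inputenc, and internal `\UTFviii@...@octets` macros that receive the raw bytes as arguments); `\Byte@todec` is a hypothetical helper name introduced for illustration, not something inputenc provides.

```latex
\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\makeatletter
% \Byte@todec is a hypothetical helper name (not part of inputenc).
% Adding 1000 forces a four-digit result; \@gobble then drops the
% leading 1, leaving exactly three digits (for example 7 -> 1007 -> 007).
\def\Byte@todec#1{\expandafter\@gobble\the\numexpr1000+`#1\relax}
\newcommand*\MakeString[2]{%
  \begingroup
  \def\UTFviii@two@octets##1##2{\Byte@todec{##1}\Byte@todec{##2}}%
  \def\UTFviii@three@octets##1##2##3{%
    \Byte@todec{##1}\Byte@todec{##2}\Byte@todec{##3}}%
  \def\UTFviii@four@octets##1##2##3##4{%
    \Byte@todec{##1}\Byte@todec{##2}\Byte@todec{##3}\Byte@todec{##4}}%
  \xdef#1{#2}%
  \endgroup
}
\makeatother
\begin{document}
\MakeString{\foo}{æüßéñ}
\texttt{\meaning\foo}
\end{document}
```

Note that every byte of a multi-byte UTF-8 sequence is 128 or larger, so for such bytes the padding changes nothing; it only matters if the helper is reused for smaller values.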


OK, here it is with no stripping and two hex digits per byte.

Edit: removed the use of an extra package. The \Byte@tohex macro defined here is possibly already provided internally by utf8-inputenc; I have not checked.

    \documentclass{article}
    \usepackage[utf8]{inputenc}
    \usepackage[T1]{fontenc}
    \makeatletter
    % I have not checked but maybe utf8-inputenc provides already
    % similar macro (not even using e-TeX)
    \def\Byte@tohex #1%
       {\expandafter
        \Byte@tohex@\the\numexpr(`#1+8)/16-1\expandafter
        .\the\numexpr`#1.}%
    \def\Byte@tohex@ #1.#2.%
       {\Byte@onehex #1.%
        \expandafter\Byte@onehex\the\numexpr #2-16*#1.%
       }
    \def\Byte@onehex #1.%
       {\ifcase #1 0\or1\or2\or3\or4\or5\or6\or7\or8\or9%
           \or A\or B\or C\or D\or E\or F%
        \fi
       }%
    \newcommand*\MakeString[2]{%
      \begingroup
      \def\UTFviii@two@octets##1##2{\Byte@tohex{##1}\Byte@tohex{##2}}%
      \def\UTFviii@three@octets##1##2##3{\Byte@tohex{##1}\Byte@tohex{##2}\Byte@tohex{##3}}%
      \def\UTFviii@four@octets##1##2##3##4{\Byte@tohex{##1}\Byte@tohex{##2}\Byte@tohex{##3}\Byte@tohex{##4}}%
      \xdef#1{#2}%
      \endgroup
    }
    \makeatother
    \begin{document}
    \MakeString{\foo}{æüßéñ}
    \texttt{\meaning\foo}
    \show\foo
    \end{document}

This produces in the log:

    > \foo=macro:
    ->C3A6C3BCC39FC3A9C3B1.
    l.27 \show\foo
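To see why \Byte@tohex yields the right digits, it may help to check the arithmetic for a single byte. e-TeX's \numexpr rounds the result of a division to the nearest integer, so `(n+8)/16-1` gives the integer part of `n/16`. The small test file below is only an illustration; it uses the byte value 195 (hex C3) taken from the output above.

```latex
\documentclass{article}
\begin{document}
% \numexpr rounds the result of / to the nearest integer, so
% (n+8)/16-1 gives the integer part of n/16 for a byte n.
% 203/16 = 12.6875 rounds to 13, and 13 - 1 = 12, i.e. hex digit C.
\typeout{High digit of 195: \the\numexpr(195+8)/16-1\relax}
% 195 - 16*12 = 3, i.e. hex digit 3, so the byte is written as C3.
\typeout{Low digit of 195: \the\numexpr195-16*12\relax}
Check the log for the two values.
\end{document}
```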

(Coding efficiency could be improved.)
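Since the stated goal is filenames, here is a minimal sketch of how the resulting macro might be used to name an output file. The stream name `\myout` and the `\jobname-` prefix are assumptions chosen for illustration, and `\foo` is set by hand to the value shown in the log above so that the example stands alone.

```latex
\documentclass{article}
% \myout is a hypothetical stream name chosen for this sketch.
\newwrite\myout
\begin{document}
% Value copied from the log output above; in practice it would be
% produced by \MakeString{\foo}{...}.
\def\foo{C3A6C3BCC39FC3A9C3B1}
% The macro contains only ASCII hex digits, so it is safe in a filename.
\immediate\openout\myout=\jobname-\foo.txt
\immediate\write\myout{Test line}
\immediate\closeout\myout
File written.
\end{document}
```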
