Revisions to Changing escape character in \tl_set_rescan:Nnn

deleted 2309 characters in body

edited Jun 21, 2019 at 13:11


However, if you want \tl_set_rescan:Nnn to grab the token list to retokenize with the catcode settings from #2 already in force, as you said in the comments, you can write a simple wrapper around \tl_set_rescan:Nnn that does so. As an example, I defined a wrapper which works somewhat like \verb: if the next token is a {, the argument is delimited by the matching (balanced) }, and if it is something else, the argument is delimited by the next occurrence of that same token:

    \documentclass{article}
    \usepackage{expl3}
    \ExplSyntaxOn
    \cs_new_protected:Npn \siracusa_dynamic_tl_set_rescan:Nnn #1#2
      {
        \group_begin:
        \peek_catcode:NTF \c_group_begin_token
          { \__dynamic_tl_rescan_brace:Nnn #1 {#2} }
          { \__dynamic_tl_rescan_delim:Nnw #1 {#2} }
      }
    \cs_new_protected:Npn \__dynamic_tl_rescan_brace:Nnn #1 #2
      {
        #2
        \__dynamic_tl_rescan_do:Nnn #1 {#2}
      }
    \cs_new_protected:Npn \__dynamic_tl_rescan_delim:Nnw #1 #2 #3
      {
        \cs_set:Npn \__siracusa_tmp:w ##1 ##2 ##3 #3
          { \__dynamic_tl_rescan_do:Nnn ##1 {##2} {##3} }
        #2
        \__siracusa_tmp:w #1 {#2}
      }
    \cs_new_protected:Npn \__dynamic_tl_rescan_do:Nnn #1 #2 #3
      {
        \group_end:
        \tl_set_rescan:Nnn #1 {#2} {#3}
      }
    \ExplSyntaxOff
    \begin{document}
    \ExplSyntaxOn
    \group_begin:
    \char_set_catcode_escape:N \~
    ~char_set_catcode_letter:N ~\
    ~char_set_catcode_space:n {32}
    ~tl_set:Nn ~l_tmpa_tl {<\verb|\LaTeX| \LaTeX>}
    ~tl_analysis_show:N ~l_tmpa_tl
    ~group_end:
    %%%%%%%%%%
    \siracusa_dynamic_tl_set_rescan:Nnn \l_tmpa_tl
      {
        \char_set_catcode_space:n {32}
        \char_set_catcode_letter:N \\
      }
      {<\verb|\LaTeX| \LaTeX>}
    \tl_analysis_show:N \l_tmpa_tl
    \ExplSyntaxOff
    \end{document}

With this code, both token lists are:

    > < (the character <)
    > \ (the letter \)
    > v (the letter v)
    > e (the letter e)
    > r (the letter r)
    > b (the letter b)
    > | (the character |)
    > \ (the letter \)
    > L (the letter L)
    > a (the letter a)
    > T (the letter T)
    > e (the letter e)
    > X (the letter X)
    > | (the character |)
    >   (blank space )
    > \ (the letter \)
    > L (the letter L)
    > a (the letter a)
    > T (the letter T)
    > e (the letter e)
    > X (the letter X)
    > > (the character >)
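
The \verb-like form with a non-brace delimiter is not exercised by the test above, so here is a small sketch of it. The choice of + as the delimiter is my assumption for illustration (any token that does not occur in the text, and whose catcode is not changed by #2, should do). The fragment is meant to go in the document body in place of the braced call, and it relies on the wrapper definitions from the listing above:

```latex
\ExplSyntaxOn
% Same test as the braced call, but with the text delimited by + ... +
% (the + delimiter is an illustrative choice, not part of the wrapper).
\siracusa_dynamic_tl_set_rescan:Nnn \l_tmpa_tl
  {
    \char_set_catcode_space:n {32}
    \char_set_catcode_letter:N \\
  }
  +<\verb|\LaTeX| \LaTeX>+
\tl_analysis_show:N \l_tmpa_tl
\ExplSyntaxOff
```

If the wrapper behaves as described, this should show the same token list as the braced call.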


added 2311 characters in body


edited Jun 20, 2019 at 11:00

Phelype Oleinik




answered Jun 19, 2019 at 11:13

Phelype Oleinik


By the time you do ~tl_set:Nn ~l_tmpa_tl {<\verb|\LaTeX| \LaTeX>}, you have already set the catcode of \ to 11 and the catcode of the space character to 10. So when ~tl_set:Nn expands (grabbing the text as its argument), the pseudo control sequences are not tokenized as control sequences, TeX doesn't add any space after them, and what is tokenized is:

<_₁₂\_₁₁v_₁₁e_₁₁r_₁₁b_₁₁|_₁₂\_₁₁L_₁₁a_₁₁T_₁₁e_₁₁X_₁₁|_₁₂ _₁₀\_₁₁L_₁₁a_₁₁T_₁₁e_₁₁X_₁₁>_₁₂

Note that, since you did not type any space after the pseudo control sequences (\verb and \LaTeX), no space shows up after them, as expected. The space after the second |_₁₂ is there because spaces were not being ignored when ~tl_set:Nn grabbed its argument.

Now you end the group and everything is back to normal. When TeX expands \tl_set_rescan:Nnn and grabs <\verb|\LaTeX| \LaTeX> as its argument, \ is the escape character and the space has catcode 9, i.e., it is ignored. Thus, as soon as TeX reads it, the text is tokenized as these 7 tokens:

<_₁₂\verb |_₁₂\LaTeX |_₁₂\LaTeX >_₁₂

Notice that the space after | never became a token in the first place. Notice also that, when \tl_set_rescan:Nnn turns these tokens back into characters for rescanning, TeX inserts the usual after-control-sequence space after the three control sequences \verb, \LaTeX, and \LaTeX. So in this catcode regime, what actually gets rescanned is <\verb |\LaTeX |\LaTeX >, and \tl_set_rescan:Nnn retokenizes the whole thing as:

<_₁₂\_₁₁v_₁₁e_₁₁r_₁₁b_₁₁ _₁₀|_₁₂\_₁₁L_₁₁a_₁₁T_₁₁e_₁₁X_₁₁ _₁₀|_₁₂\_₁₁L_₁₁a_₁₁T_₁₁e_₁₁X_₁₁ _₁₀>_₁₂
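
The token count and the inserted spaces can be checked without any rescanning. The following is a minimal sketch, assuming a reasonably recent expl3 in which \iow_term:x, \tl_count:N, and \tl_to_str:N are available. It stores the text under the normal \ExplSyntaxOn catcodes and writes the number of tokens and the string form to the terminal:

```latex
\documentclass{article}
\usepackage{expl3}
\ExplSyntaxOn
% Under \ExplSyntaxOn, \ is the escape character and spaces are ignored,
% so the text is stored as control sequences and other-characters only.
\tl_set:Nn \l_tmpa_tl { <\verb|\LaTeX| \LaTeX> }
% Number of tokens in the stored list.
\iow_term:x { Tokens:~ \tl_count:N \l_tmpa_tl }
% String form: a space is appended after each control word.
\iow_term:x { String:~ \tl_to_str:N \l_tmpa_tl }
\ExplSyntaxOff
\begin{document}
\end{document}
```

The log should report 7 tokens and the string <\verb |\LaTeX |\LaTeX >, which is the text that then gets retokenized under the catcodes given in #2.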

In short: the problem is that your two inputs are different to start with.
