I'm playing around a bit with the \tl_set_rescan:Nnn function (originally to make this answer more concise), but I'm struggling to make even pretty simple uses of this function work.

Say we want to make all \ characters letters and spaces have their usual catcode. As far as I understand, the following code should produce identical outputs:

    \documentclass{article}
    \usepackage{expl3}
    \begin{document}
    \ExplSyntaxOn
    \group_begin:
    \char_set_catcode_escape:N \~
    ~char_set_catcode_letter:N ~\
    ~char_set_catcode_space:n {32}
    ~tl_set:Nn ~l_tmpa_tl {<\verb|\LaTeX| \LaTeX>}
    ~tl_show:N ~l_tmpa_tl
    ~group_end:
    %%%%%%%%%%
    \tl_set_rescan:Nnn \l_tmpa_tl
      {
        \char_set_catcode_space:n {32}
        \char_set_catcode_letter:N \\
      }
      {<\verb|\LaTeX| \LaTeX>}
    \tl_show:N \l_tmpa_tl
    \ExplSyntaxOff
    \end{document}

outputs

    > \l_tmpa_tl=<\verb|\LaTeX| \LaTeX>.
    > \l_tmpa_tl=<\verb |\LaTeX |\LaTeX >.

The first token list is correct: \ was made a letter, so no extra spaces are output after the would-be control sequences. In the rescan attempt, however, the control sequences still exist. Also note the missing space after the second |.

How do you make the second version produce the expected result? Or, more broadly, since these functions don't seem to work well with verbatim input, what are their intended use cases?

1 Answer

By the time ~tl_set:Nn ~l_tmpa_tl {<\verb|\LaTeX| \LaTeX>} runs, you have already set the catcode of \ to 11 and the catcode of the space character to 10. So when ~tl_set:Nn grabs the text as its argument, the pseudo control sequences are tokenized as plain letters rather than as control sequences, TeX adds no space after them, and what is tokenized is:

<₁₂ \₁₁ v₁₁ e₁₁ r₁₁ b₁₁ |₁₂ \₁₁ L₁₁ a₁₁ T₁₁ e₁₁ X₁₁ |₁₂ ␣₁₀ \₁₁ L₁₁ a₁₁ T₁₁ e₁₁ X₁₁ >₁₂

Note that because the pseudo control sequences (\verb and \LaTeX) are just runs of letters here and you typed no space after them, no space shows up after them, as expected. The space after the second |₁₂ is there because spaces were not being ignored when ~tl_set:Nn grabbed its argument.

Now you end the group and everything is back to normal. When TeX expands \tl_set_rescan:Nnn and grabs <\verb|\LaTeX| \LaTeX> as an argument, \ is the escape character and the space has catcode 9, i.e., it is ignored. Thus, as soon as TeX reads the text, it is tokenized as these 7 tokens:

<₁₂ \verb |₁₂ \LaTeX |₁₂ \LaTeX >₁₂

Notice that the space after | never existed in the first place. Notice also that when these tokens are written back out as characters, which is what rescanning has to do, TeX inserts the usual space after each of the three control words \verb, \LaTeX, and \LaTeX. So in this catcode regime, the text that actually gets rescanned is <\verb |\LaTeX |\LaTeX >. Only now does \tl_set_rescan:Nnn do its thing and retokenize the whole thing as:

<₁₂ \₁₁ v₁₁ e₁₁ r₁₁ b₁₁ ␣₁₀ |₁₂ \₁₁ L₁₁ a₁₁ T₁₁ e₁₁ X₁₁ ␣₁₀ |₁₂ \₁₁ L₁₁ a₁₁ T₁₁ e₁₁ X₁₁ ␣₁₀ >₁₂
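The tokenization that happens before \tl_set_rescan:Nnn gets to act can be checked directly. The following is a minimal sketch, not part of the original answer, that counts and prints the argument as TeX sees it under \ExplSyntaxOn; it uses only \tl_count:n, \int_show:n and \tl_show:n.

```latex
\documentclass{article}
\usepackage{expl3}
\begin{document}
\ExplSyntaxOn
% Under \ExplSyntaxOn, \ is the escape character and spaces are ignored,
% so the braced text is tokenized as soon as it is read as an argument.
% The count should come out as 7: < \verb | \LaTeX | \LaTeX >
\int_show:n { \tl_count:n {<\verb|\LaTeX| \LaTeX>} }
% Showing the same tokens as characters should add a space after each
% control word and should not bring back the space typed after the second |.
\tl_show:n {<\verb|\LaTeX| \LaTeX>}
\ExplSyntaxOff
\end{document}
```

If the count is 7 and the shown text has a space after each control word, then the input has already lost its original shape by the time any rescanning function receives it.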

In short: the problem is your two inputs are different to start with.
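To make the two inputs identical, the catcodes have to be in force before the argument is grabbed. The following is a minimal sketch under that assumption, not the original implementation: \demo_tl_set_verb:Nn and \demo_tl_set_verb_aux:Nn are hypothetical names chosen for illustration, and the two catcode settings are the ones from the question.

```latex
\documentclass{article}
\usepackage{expl3}
\begin{document}
\ExplSyntaxOn
% Hypothetical wrapper: first stage. It grabs only the variable (#1),
% changes the catcodes inside a group, and leaves the braced text
% unread in the input so the auxiliary reads it under the new regime.
\cs_new_protected:Npn \demo_tl_set_verb:Nn #1
  {
    \group_begin:
      \char_set_catcode_letter:N \\
      \char_set_catcode_space:n { 32 }
      \demo_tl_set_verb_aux:Nn #1
  }
% Hypothetical auxiliary: second stage. #2 is tokenized with \ as a
% letter and with ordinary spaces. The result is stored globally so it
% survives \group_end:, then copied into the requested variable.
\cs_new_protected:Npn \demo_tl_set_verb_aux:Nn #1 #2
  {
      \tl_gset:Nn \g_tmpa_tl {#2}
    \group_end:
    \tl_set_eq:NN #1 \g_tmpa_tl
  }
\demo_tl_set_verb:Nn \l_tmpa_tl {<\verb|\LaTeX| \LaTeX>}
\tl_show:N \l_tmpa_tl
\ExplSyntaxOff
\end{document}
```

This should show the same token list as the first, manually grouped version in the question. Note that no rescanning is involved: once the catcodes are right at the moment the argument is read, a plain assignment is enough. The sketch only works when the text is read directly from the file, not when it has already been tokenized as part of another macro's argument.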

  • Could you take a look at tex.stackexchange.com/a/496258/7832? That is where the original query comes from. I think the problem is related to the use of \newlinechar before and after using \tl_set_rescan:Nnn (which internally assigns other values to \newlinechar). Commented Jun 19, 2019 at 11:28
  • @PabloGonzálezL Give me a minute to try to understand the problem in your question... Commented Jun 19, 2019 at 11:46
  • @PhelypeOleinik I should have read the "TeXhackers note" for \tl_set_rescan:Nnn more carefully, as it implies the second argument is read in before the new catcodes are applied. The documentation before reads as if it would be absorbed under the new catcode regime to me. So those functions seem to be useless when dealing with verbatim code then. Commented Jun 19, 2019 at 22:58
  • @siracusa If I'm not mistaken I read it somewhere that to support a consistent API, the expl3 functions, by design, don't do catcode changes while grabbing an argument precisely to avoid unexpected behaviour of a function. Which makes sense: you use a function without the need to worry that this one or that one will grab its contents verbatim or under some peculiar catcode regime. Of course, if you do need that, then you have to work yourself around this feature to get the catcodes right. Commented Jun 20, 2019 at 9:55
  • @PhelypeOleinik Thanks for the implementation. expl3 should definitely have some basic support for handling verbatim material like in your wrapper, IMHO. The thing is, if you do all the catcode setting yourself, you don't need \tl_set_rescan:Nnn anymore. Actually, replacing it by \tl_set:Nn #1 {#3} in your implementation gives the same result. Commented Jun 21, 2019 at 0:28
