
I would like to define a command which takes two arguments and generates a link to a webpage:

\newcommand{\urltocode}[2]{...} 

The problem, however, is that when I use \urltocode{foo}{bar} and foo contains some non-ASCII characters, pdflatex screams, although foo should just be pasted into the URL as is. In practice, foo contains special symbols like # (for http ids) as well as Unicode symbols. Is there a way to define \urltocode such that foo is absolutely not re-interpreted?

To provide an example, I would like to have

\newcommand{\urltocode}[2]{\href{http://something.com/#1}{#2}} 

where #1 is expected to contain all sorts of symbols. I would like to be able to write \urltocode{foo}{bar} with foo typed naturally, exactly as it should appear in the URL.

  • Why do you need bar? Without it, just use \url{<url>}, already provided by hyperref. Commented Oct 20, 2022 at 5:13
  • Remark: In TeX terminology, such commands are called "verbatim"; however, they will usually not work inside the argument of other commands. Nevertheless, you need some deep knowledge of TeX catcodes etc. to write them properly, so it's recommended that you include a minimal working example of what you actually want to do, so that people can just give you working code. Commented Oct 20, 2022 at 10:23
  • Although, for a not-very-technical TeX user, the macro \DefineVerbatimToScantokens from "How to define macro that only makes argument substitution?" (TeX - LaTeX Stack Exchange) or tex.stackexchange.com/a/624949/250119 might work right out of the box (copy the relevant code there, then do \DefineVerbatimToScantokens\urltocode{mm}{something}). As long as it isn't used inside the argument of another command, that is. Commented Oct 20, 2022 at 10:24
  • @Werner what I need in fact is \href{my-url/foo}{bar}. bar here is for display. Commented Oct 21, 2022 at 0:39
  • I have updated the question with an example. Commented Oct 21, 2022 at 0:42

1 Answer

The problems you encounter seem to come from the interplay of two things: the TeX program tokenizes/lexes bits of .tex-input at the time of reading them from the .tex-input-file, and it parses the resulting sequence of tokens later, at the time of carrying out macros and in subsequent stages.

A tokenizer splits a piece of input into units ("words").
A lexer performs lexical analysis: it works at the "word" level and adds to each unit information on what kind of thing that unit is.
A parser works at the grammatical level and analyses combinations of units.

The content of a .tex-input-file is always read piece by piece and split into units called tokens (tokenizing). In the process, information is added to each token (lexing): whether the token in question is a control sequence token or an explicit character token, and in the latter case which category the character belongs to.
In later stages of processing, when the tokens are already there, sequences of tokens are parsed.
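
To make the information attached to each token visible, here is a minimal sketch using \meaning; the particular characters are chosen only for illustration and are not taken from the question:

```latex
\documentclass{article}
\begin{document}
\ttfamily
% \meaning reports what kind of token follows it.
\meaning a\par       % an explicit character token of category "letter"
\meaning 1\par       % an explicit character token of category "other"
\meaning #\par       % a macro parameter character (category 6)
\meaning $\par       % a math shift character (category 3)
\meaning\relax\par   % a control sequence token
\end{document}
```

The same character can end up in a different category if the category codes are changed before it is read, which is what the rest of this answer relies on.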

You can't get around TeX tokenizing/lexing .tex-input coming from a .tex-input-file or typed at the console.
But you can (temporarily) adjust the tokenizing- and lexing-apparatus of TeX so that tokens get created which TeX won't complain about when parsing things while carrying out the \href-macro. This is what category codes in TeX are about: they determine what tokens the tokenizing- and lexing-apparatus of TeX will produce.
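
As a minimal sketch of such a temporary adjustment (the text inside the group is made up for illustration), the following switches # to category code 12 within a group, so that a bare # is typeset instead of raising an error:

```latex
\documentclass{article}
\begin{document}
% Default regime: # has category code 6 (parameter character), so a bare #
% in running text would raise an error.
\begingroup
  \catcode`\#=12 % from here on, # is tokenized as an ordinary "other" character
  URL fragment typed as is: page.html#section-2
\endgroup
% Outside the group, # has category code 6 again.
\end{document}
```

The change only affects characters that have not been tokenized yet. Material that has already been grabbed as the argument of another command keeps the category codes it was read with, which is why verbatim-like commands usually do not work inside other commands' arguments.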

A common trick is temporarily switching to verbatim-category-code-régime for reading/tokenizing/lexing from the .tex-input-file and later passing the resulting set of tokens to \scantokens.

The gist of verbatim-category-code-régime is: A .tex-input file forms a sequence of characters (not character tokens: character tokens exist during the TeX run, whereas the file exists in the file system). If TeX reads/tokenizes/lexes things under verbatim-category-code-régime, character tokens come into being whose unexpanded writing to a file or terminal yields (more or less) the same sequence of characters as contained in the file that was read/tokenized/lexed.

\scantokens in turn is like writing tokens unexpanded to file and then reading and tokenizing and lexing the content of that file under the category-code-régime (set of tokenizing- and lexing-rules) current at the time of carrying out \scantokens. Tokens produced and appended to the token stream this way are processed as usual.
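
A minimal sketch of this re-tokenizing step, assuming an e-TeX-enabled engine (any current pdflatex) and using \detokenize as a stand-in for reading under verbatim-category-code-régime; the macro name \stored is made up for the example:

```latex
\documentclass{article}
\begin{document}
% \detokenize yields plain character tokens (category 12, spaces category 10),
% comparable to what reading under a verbatim regime would produce.
\edef\stored{\detokenize{\textbf{bold}}}

% Typeset as is, the characters appear literally:
\texttt{\stored}

% Re-tokenized under the current regime, \textbf is a command again.
% \scantokens also processes an end of line, which may add a trailing space.
\expandafter\scantokens\expandafter{\stored}
\end{document}
```

The first paragraph shows the stored characters literally, while the second one shows the word in bold, because \scantokens has turned the characters back into a control sequence token and its argument.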

user202729 already gave some pointers to a routine \DefineVerbatimToScantokens:

\DefineVerbatimToScantokens{⟨control-word-token⟩}{⟨xparse-argument-specifiers⟩}{%
  ⟨verbatim-material to be passed to \scantokens⟩
}%

Using that you can, e.g., try something like this:

\documentclass{article}
\usepackage{hyperref}

%=== Code of \DefineVerbatimToScantokens ========================
% With older LaTeX-releases uncomment the following line:
%\usepackage{xparse}
\NewDocumentCommand\DefineVerbatimToScantokens{mm}{%
  \begingroup
  \catcode`\^^I=12\relax
  \InnerDefineVerbatimToScantokens{#1}{#2}%
}%
\begingroup
\makeatletter
\def\InnerDefineVerbatimToScantokens#1#2{%
  \endgroup
  \NewDocumentCommand\InnerDefineVerbatimToScantokens{mm+v}{%
    \endgroup\ReplaceHashloop{##3}{##1}{##2}%
  }%
  \newcommand\ReplaceHashloop[3]{%
    \ifcat$\detokenize\expandafter{\Hashcheck##1#1}$%
    \expandafter\@firstoftwo\else\expandafter\@secondoftwo\fi
    {%
      \NewDocumentCommand{##2}{##3}{%
        \begingroup\newlinechar=\endlinechar
        \scantokens{\endgroup##1#2}%
      }%
    }{%
      \expandafter\ReplaceHashloop\expandafter{\Hashreplace##1}{##2}{##3}%
    }%
  }%
  \@ifdefinable\Hashcheck{\long\def\Hashcheck##1#1{}}%
  \@ifdefinable\Hashreplace{\long\def\Hashreplace##1#1{##1####}}%
}%
\catcode`\%=12\relax
\catcode`\#=12\relax
\InnerDefineVerbatimToScantokens{#}{%}%
%=== End of code of \DefineVerbatimToScantokens =================

% Be aware that indenting does matter within \DefineVerbatimToScantokens's
% <verbatim-material to be passed to \scantokens>-argument:

\DefineVerbatimToScantokens\urltocode{vv}{\href{http://something.com/#1}{#2}}%

\begin{document}

\urltocode{foo}{bar}

\end{document}

With this, the phrase bar is a link to the URL http://something.com/foo.
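
As a hedged usage sketch, assuming the preamble of the example above is in place, a first argument containing characters such as _, %, & and # could be passed like this; the path and query string are made up for illustration:

```latex
% Goes into the document body of the example above; relies on its
% \urltocode definition. The path and query string are made up.
\urltocode{docs/page_1.html?q=a%20b&r=1#sec-2}{bar}
```

The first argument is read verbatim and handed on to \href unchanged. The second argument is re-tokenized by \scantokens and then typeset by \href as ordinary link text, so special characters there still need the usual LaTeX escaping (e.g., \# for a hash). As noted in the comments, \urltocode must not be used inside the argument of another command.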


Probably, instead of all this verbatim-\scantokens-trickery, you can make do with hyperref's \hyperbaseurl.
But in the past I ran into problems with that: back then, \href's routine for detecting the kind of URL/the kind of link seemed not to know when to take \hyperbaseurl into account and when not to do so. E.g.,

\hyperbaseurl{https://something.com/} \href{foo}{bar} 

might produce a link to a non-existing file foo rather than a link to the URL https://something.com/foo.
See, e.g., the question "\hyperbaseurl works for .html, but not for .pdf".
