
I would like to convert the file below to MathML, and I also need the original LaTeX code inside the MathML <semantics> element.

MWE:

\documentclass{article}
\usepackage[T1]{fontenc}
\begin{document}
\article{Article Title Here}
\author{Author Name Here}
\maketitle
\section{Introduction}
This is the sample paragraph.
\begin{equation}\label{eq1-11}
T\,^{\prime}_{\mu \nu} = \left( \frac{\partial \xi^\alpha} {\partial\xi^{\prime\mu}}\right) \left( \frac{\partial \xi^\beta}{\partial \xi^{\prime\nu}} \right) T_{\alpha \beta}
\end{equation}
Please refer the equations \ref{eq1-11} for the further testing.
\end{document}
  • Sorry, I am a bit busy right now, so I cannot test it. I did some tests in this area some time ago: github.com/michal-h21/mathdimen. I don't know if it still works; as I said, it is quite old. Commented Oct 9, 2020 at 13:38
  • Must it be tex4ht? tex.stackexchange.com/q/227195/107497 gives that as an option, plus pandoc, latexml, and a link to a website with dozens of other options. Commented Oct 9, 2020 at 14:47

2 Answers

1

There are several possible ways to achieve this:

  1. Configure TeX4ht to catch all math content and typeset it twice: once as MathML, and a second time as verbatim text.
  2. Parse the MathML content and convert it back to LaTeX code.
  3. Pre-process the input TeX file and modify it so that it is easier to work with.

The first method could reuse the code that we use for the MathJax option in TeX4ht; see the file mathjax-latex-4ht.4ht for details.

The second method won't reproduce the LaTeX code exactly as it was in the original input, which may be a problem for you. LuaXML can be used for the conversion.

I will present the third method in my answer. It consists of two components: an input filter that parses the input LaTeX file for math content and marks it with some additional macros, and a make4ht DOM filter that modifies the resulting HTML file to produce the correct MathML structure.

Here is the input filter. It reads the document from standard input and prints the modified version to standard output.

File altmath.lua:

-- insert environments that should be handled by the script here
local math_environments = {
  equation = true,
  displaymath = true,
  ["equation*"] = true,
}
-- macros that will be inserted into the updated document
local macros = [[
\NewDocumentCommand\inlinemath{mv}
{\HCode{<span class="inlinemath">}#1\HCode{<span class="alt">}\NoFonts #2\EndNoFonts\HCode{</span></span>}}
\NewDocumentEnvironment{altdisplaymath}{}
{\ifvmode\IgnorePar\fi\EndP\HCode{<div class="altmath">}}
{\ifvmode\IgnorePar\fi\EndP\HCode{</div>}}
]]
-- we will insert macros before the second control sequence
-- (we assume that the first one is \documentclass)
local cs_counter = 0
-- we will handle inline and display math differently
local inline = 1
local display = 2
local function handle_math(input, nexts, stop, buffer, mathtype)
  local content = input:sub(nexts, stop)
  -- format used to insert math content back into the document
  local format = "\\inlinemath{%s}{%s}"
  -- set format for display math
  if mathtype == display then
    format = [[
\begin{altdisplaymath}
%s
\begin{verbatim}
%s
\end{verbatim}
\end{altdisplaymath}
]]
  end
  buffer[#buffer + 1] = string.format(format, content, content)
end
local function find_next(input, start, buffer)
  -- find next cs or math start
  local nexts, stop = input:find("[$\\]", start)
  local mathtype
  if nexts then
    -- save current text chunk from the input buffer
    buffer[#buffer+1] = input:sub(start, nexts - 1)
    local kind, nextc = input:match("(.)(.)", nexts)
    if kind == "\\" then -- handle cs
      -- insert our custom TeX macros before the second control sequence
      cs_counter = cs_counter + 1
      if cs_counter == 2 then
        buffer[#buffer+1] = macros
      end
      if nextc == "(" then -- inline math
        _, stop = input:find("\\)", nexts, true) -- plain-text search
        mathtype = inline
      elseif nextc == "[" then -- display math
        _, stop = input:find("\\]", nexts, true) -- plain-text search
        mathtype = display
      else -- maybe environment?
        -- find environment name
        local env_name = input:match("^begin%s*{(.-)}", nexts+1)
        -- it must be enabled as math environment
        if env_name and math_environments[env_name] then
          -- escape pattern characters, otherwise the "*" in "equation*" acts as a quantifier
          local escaped = env_name:gsub("%p", "%%%0")
          _, stop = input:find("\\end%s*{" .. escaped .. "}", nexts)
          mathtype = display
        else -- not math environment
          buffer[#buffer+1] = "\\" -- save the backslash that was eaten by the processor
          return stop + 1 -- return back to the main loop
        end
      end
    else -- handle $
      if nextc == "$" then -- display math
        _, stop = input:find("%$%$", nexts + 1)
        mathtype = display
      else -- inline math
        _, stop = input:find("%$", nexts + 1)
        mathtype = inline
      end
    end
    if not stop then
      -- no closing delimiter found: keep the current character and move one char forward
      buffer[#buffer+1] = input:sub(nexts, nexts)
      return nexts + 1
    end
    -- save math content to the buffer
    handle_math(input, nexts, stop, buffer, mathtype)
  else
    -- if we cannot find any more cs or math, we need to insert the rest of the input
    -- to the output buffer
    buffer[#buffer+1] = input:sub(start)
    return nil
  end
  return stop + 1
end
-- process the input buffer, detect inline and display math and also math environments
local function process(input)
  local buffer = {} -- buffer where text chunks are stored
  local start = 1
  start = find_next(input, start, buffer)
  while start do
    start = find_next(input, start, buffer)
  end
  return table.concat(buffer) -- convert output buffer to string
end
local content = io.read("*all")
print(process(content))

You can test it using the following command:

texlua altmath.lua < sample.tex 

This is the modified version of your original TeX file:

\documentclass{article}
\NewDocumentCommand\inlinemath{mv}
{\HCode{<span class="inlinemath">}#1\HCode{<span class="alt">}\NoFonts #2\EndNoFonts\HCode{</span></span>}}
\NewDocumentEnvironment{altdisplaymath}{}
{\ifvmode\IgnorePar\fi\EndP\HCode{<div class="altmath">}}
{\ifvmode\IgnorePar\fi\EndP\HCode{</div>}}
\usepackage[T1]{fontenc}
\begin{document}
\title{Article Title Here}
\author{Author Name Here}
\maketitle
\section{Introduction}
This is the sample paragraph with \inlinemath{$a=b^2$}{$a=b^2$} inline math.
Different \inlinemath{\(a=c^2\)}{\(a=c^2\)} type of math.
\begin{altdisplaymath}
\begin{equation}\label{eq1-11}
T\,^{\prime}_{\mu \nu} = \left( \frac{\partial \xi^\alpha} {\partial\xi^{\prime\mu}}\right) \left( \frac{\partial \xi^\beta}{\partial \xi^{\prime\nu}} \right) T_{\alpha \beta}
\end{equation}
\begin{verbatim}
\begin{equation}\label{eq1-11}
T\,^{\prime}_{\mu \nu} = \left( \frac{\partial \xi^\alpha} {\partial\xi^{\prime\mu}}\right) \left( \frac{\partial \xi^\beta}{\partial \xi^{\prime\nu}} \right) T_{\alpha \beta}
\end{equation}
\end{verbatim}
\end{altdisplaymath}
Please refer the equations \ref{eq1-11} for the further testing.
\end{document}

You can see that it inserts the macro definitions after the \documentclass command. They define the \inlinemath command and the altdisplaymath environment. The definitions contain code that inserts HTML tags directly into the converted file, so they are designed to be used only with TeX4ht.
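To illustrate what the filter does to inline math, here is a reduced sketch that handles only the `$...$` case with a single `gsub`. The name `wrap_inline` is made up for this example and is not part of altmath.lua. The sketch deliberately ignores `\(...\)`, `$$...$$`, `\[...\]` and the math environments, which the full script handles.

```lua
-- Minimal sketch, assuming only $...$ inline math occurs in the text.
-- wrap_inline is a hypothetical helper, not a function from altmath.lua.
local function wrap_inline(text)
  -- "%$" is a literal dollar sign; "[^$]+" is one or more non-dollar characters
  return (text:gsub("%$[^$]+%$", function(m)
    -- first argument: math to be typeset, second argument: verbatim copy
    return "\\inlinemath{" .. m .. "}{" .. m .. "}"
  end))
end

print(wrap_inline("with $a=b^2$ inline math"))
```

A function is used as the replacement so that characters in the matched math are never interpreted as `gsub` replacement escapes.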

You can convert your file to HTML using

texlua altmath.lua < sample.tex | make4ht -j sample - "mathml" 

It produces the following code:

<span class='inlinemath'><!-- l. 14 --><math xmlns='http://www.w3.org/1998/Math/MathML' display='inline'><mi>a</mi> <mo class='MathClass-rel'>=</mo> <msup><mrow><mi>b</mi></mrow><mrow><mn>2</mn></mrow></msup></math><span class='alt'>$a=b^2$</span></span> 

or

<div class='altmath'>
<!-- tex4ht:inline --><table class='equation'><tr><td>
<!-- l. 16 --><math xmlns='http://www.w3.org/1998/Math/MathML' display='block' class='equation'>
<mstyle class='label' id='x1-1001r1'></mstyle><!-- endlabel --><mi>T</mi><msubsup><mrow><mspace width='0.17em' class='thinspace'></mspace></mrow><mrow><mi mathvariant='italic'>μν</mi></mrow><mrow><mi>′</mi></mrow></msubsup>
<mo class='MathClass-rel'>=</mo>
<mrow><mo form='prefix' fence='true'> (</mo><mrow>
<mfrac><mrow><mi>∂</mi><msup><mrow><mi>ξ</mi></mrow><mrow><mi>α</mi></mrow></msup></mrow>
<mrow><mi>∂</mi><msup><mrow><mi>ξ</mi></mrow><mrow><mi mathvariant='italic'>′μ</mi></mrow></msup></mrow></mfrac>
</mrow><mo form='postfix' fence='true'>)</mo></mrow>
<mrow><mo form='prefix' fence='true'> (</mo><mrow>
<mfrac><mrow><mi>∂</mi><msup><mrow><mi>ξ</mi></mrow><mrow><mi>β</mi></mrow></msup></mrow>
<mrow><mi>∂</mi><msup><mrow><mi>ξ</mi></mrow><mrow><mi mathvariant='italic'>′ν</mi></mrow></msup></mrow></mfrac>
</mrow><mo form='postfix' fence='true'>)</mo></mrow>
<msub><mrow><mi>T</mi></mrow><mrow><mi mathvariant='italic'>αβ</mi></mrow></msub>
</math></td><td class='eq-no'>(1)</td></tr></table>
<!-- l. 18 --><p class='nopar'> </p>
<pre id='verbatim-1' class='verbatim'>
\begin{equation}\label{eq1-11}
T\,^{\prime}_{\mu \nu} = \left( \frac{\partial \xi^\alpha} {\partial\xi^{\prime\mu}}\right) \left( \frac{\partial \xi^\beta}{\partial \xi^{\prime\nu}} \right) T_{\alpha \beta}
\end{equation}
</pre>
<!-- l. 23 --><p class='nopar'> </p></div>

We need to use a make4ht DOM filter to create the correct MathML structure. Save the following file as build.lua and pass it to make4ht with the -e build.lua option:

local domfilter = require "make4ht-domfilter"
-- find mathml and insert TeX as an alternative annotation
local function update_mathml(element, class)
  local alt_element_t = element:query_selector(class)
  -- nothing to do if the alternative text is missing
  if not alt_element_t or not alt_element_t[1] then return nil end
  local mathml = element:query_selector("math")[1]
  if not mathml then return nil end
  -- save alt element contents and remove it from the document
  local alt_contents = alt_element_t[1]:get_children()
  alt_element_t[1]:remove_node()
  -- create a new structure of the mathml element ->
  -- mathml
  --   semantics
  --     mrow -> math content
  --     annotation -> saved TeX
  local mathml_contents = mathml:get_children()
  local semantics = mathml:create_element("semantics")
  local mrow = semantics:create_element("mrow")
  mrow._children = mathml_contents -- this trick places saved original mathml content into a new <mrow>
  semantics:add_child_node(mrow)
  local annotation = semantics:create_element("annotation", {encoding="application/x-tex"})
  annotation._children = alt_contents
  semantics:add_child_node(annotation)
  mathml._children = {semantics}
end
local process = domfilter {
  function(dom)
    for _, inline in ipairs(dom:query_selector(".inlinemath")) do
      update_mathml(inline, ".alt")
    end
    for _, display in ipairs(dom:query_selector(".altmath")) do
      update_mathml(display, ".verbatim")
    end
    return dom
  end
}
-- run the filter on the generated HTML file
Make:match("html$", process)

It searches the HTML file for our custom <span> and <div> elements, takes the alternative text and inserts it as an <annotation> element of the MathML code.

This is the result:

<h3 class='sectionHead'><span class='titlemark'>1 </span> <a id='x1-10001'></a>Introduction</h3>
<!-- l. 14 --><p class='noindent'>This is the sample paragraph with <span class='inlinemath'><!-- l. 14 --><math display='inline' xmlns='http://www.w3.org/1998/Math/MathML'><semantics><mrow><mi>a</mi> <mo class='MathClass-rel'>=</mo> <msup><mrow><mi>b</mi></mrow><mrow><mn>2</mn></mrow></msup></mrow><annotation encoding='application/x-tex'>$a=b^2$</annotation></semantics></math></span> inline math.
Different <span class='inlinemath'><!-- l. 14 --><math display='inline' xmlns='http://www.w3.org/1998/Math/MathML'><semantics><mrow><mrow><mi>a</mi> <mo class='MathClass-rel'>=</mo> <msup><mrow><mi>c</mi></mrow><mrow><mn>2</mn></mrow></msup></mrow></mrow><annotation encoding='application/x-tex'>\(a=c^2\)</annotation></semantics></math></span> type of math.
</p><div class='altmath'>
<!-- tex4ht:inline --><table class='equation'><tr><td>
<!-- l. 16 --><math class='equation' xmlns='http://www.w3.org/1998/Math/MathML' display='block'><semantics><mrow>
<mstyle id='x1-1001r1' class='label'></mstyle><!-- endlabel --><mi>T</mi><msubsup><mrow><mspace width='0.17em' class='thinspace'></mspace></mrow><mrow><mi mathvariant='italic'>μν</mi></mrow><mrow><mi>′</mi></mrow></msubsup>
<mo class='MathClass-rel'>=</mo>
<mrow><mo fence='true' form='prefix'> (</mo><mrow>
<mfrac><mrow><mi>∂</mi><msup><mrow><mi>ξ</mi></mrow><mrow><mi>α</mi></mrow></msup></mrow>
<mrow><mi>∂</mi><msup><mrow><mi>ξ</mi></mrow><mrow><mi mathvariant='italic'>′μ</mi></mrow></msup></mrow></mfrac>
</mrow><mo fence='true' form='postfix'>)</mo></mrow>
<mrow><mo fence='true' form='prefix'> (</mo><mrow>
<mfrac><mrow><mi>∂</mi><msup><mrow><mi>ξ</mi></mrow><mrow><mi>β</mi></mrow></msup></mrow>
<mrow><mi>∂</mi><msup><mrow><mi>ξ</mi></mrow><mrow><mi mathvariant='italic'>′ν</mi></mrow></msup></mrow></mfrac>
</mrow><mo fence='true' form='postfix'>)</mo></mrow>
<msub><mrow><mi>T</mi></mrow><mrow><mi mathvariant='italic'>αβ</mi></mrow></msub>
</mrow><annotation encoding='application/x-tex'>
\begin{equation}\label{eq1-11}
T\,^{\prime}_{\mu \nu} = \left( \frac{\partial \xi^\alpha} {\partial\xi^{\prime\mu}}\right) \left( \frac{\partial \xi^\beta}{\partial \xi^{\prime\nu}} \right) T_{\alpha \beta}
\end{equation}
</annotation></semantics></math></td><td class='eq-no'>(1)</td></tr></table>
<!-- l. 18 --><p class='nopar'> </p>
<!-- l. 23 --><p class='nopar'> </p></div>
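The target structure can also be sketched with plain Lua strings, independently of the make4ht DOM library. This is only an illustration of the nesting (`semantics` containing an `mrow` with the math and an `annotation` with the TeX source). The helper names `escape_xml` and `wrap_semantics` are hypothetical and do not exist in make4ht or LuaXML.

```lua
-- Sketch of the <semantics> wrapper, assuming the MathML children are
-- already serialized as a string. Both helpers are made up for this example.
local function escape_xml(s)
  -- "&" is replaced first so that the entities added afterwards are not escaped again
  return (s:gsub("&", "&amp;"):gsub("<", "&lt;"):gsub(">", "&gt;"))
end

local function wrap_semantics(mathml_children, tex)
  return "<semantics><mrow>" .. mathml_children .. "</mrow>"
    .. "<annotation encoding='application/x-tex'>" .. escape_xml(tex) .. "</annotation>"
    .. "</semantics>"
end

print(wrap_semantics("<mi>a</mi><mo>=</mo><mi>b</mi>", "$a=b$"))
```

The escaping step matters for TeX sources such as `a<b` or alignment code with `&`, which would otherwise break the XML.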
  • Awesome... Is it possible to get both the MathML and the LaTeX tags in the same file by using htlatex? Commented Oct 12, 2020 at 4:31
  • @MadyYuvi In theory, this could work with htlatex as well. You would need to make a temporary TeX file, because htlatex doesn't support pipes. You would also need to post-process the HTML file with the DOM filter. But anyway, htlatex is officially obsolete, because make4ht applies a lot of fixes to the generated HTML file, including fixes to MathML. You most likely won't get correct MathML elements with htlatex alone. Commented Oct 12, 2020 at 6:52
0

The provided MWE has many LaTeX coding errors. I've fixed them, and the modified code is:

\documentclass{article}
\usepackage[T1]{fontenc}
\begin{document}
\title{Article Title Here}
\author{Author Name Here}
\maketitle
\section{Introduction}
This is the sample paragraph.
\begin{equation}\label{eq1-11}
T\,^{\prime}_{\mu \nu} = \left( \frac{\partial \xi^{\alpha}} {\partial\xi^{\prime\mu}}\right) \left( \frac{\partial \xi^{\beta}}{\partial \xi^{\prime\nu}} \right) T_{\alpha \beta}
\end{equation}
Please refer the equations \ref{eq1-11} for the further testing.
\end{document}

After correcting the errors, I ran the command

htlatex test "xhtml,mathml,mathml-" " -cunihft" "-cvalidate -p"

It converts nicely...

EDIT

If you need to display the LaTeX code in the converted HTML, then use the .cfg file below:

conversion.cfg

\RequirePackage{verbatim,etoolbox}
\Preamble{xhtml}
\def\AltMathOne#1${\HCode{\detokenize{\(#1\)}}$}
\Configure{$}{}{}{\expandafter\AltMathOne}
\def\AltlMath#1\){\HCode{\detokenize{\(#1\)}}\)}
\Configure{()}{\AltlMath}{}
\def\AltlDisplay#1\]{\HCode{\detokenize{\[#1\]}}\]}
\Configure{[]}{\AltlDisplay}{}
\def\AltDisplayOne#1#2$${#1\HCode{\detokenize{$$#2$$}}$$}
\Configure{$$}{}{}{\AltDisplayOne}{}{}
\newcommand\VerbMath[1]{%
\ifcsdef{#1}{%
  \renewenvironment{#1}{%
    \NoFonts%
    \Configure{verbatim}{}{} % suppress <br /> tags
    \texttt{\string\begin\{#1\}}\HCode{\Hnewline}% we need to use \texttt to get all characters right
    \verbatim}{\endverbatim\texttt{\string\end\{#1\}}\EndNoFonts}%
}{}%
}
\VerbMath{align}
\VerbMath{equation}
\VerbMath{equation*}
\begin{document}
\EndPreamble

Then run the command:

htlatex sample "conversion" " " "-cvalidate -p" 
  • What are the "many" coding errors? I see only \article instead of \title... Commented Oct 9, 2020 at 8:53
  • @MadyYuvi: Thanks for your reply, but I am expecting the MathML code as well as the input LaTeX code. In your example I found only MathML, not the LaTeX code. Commented Oct 9, 2020 at 8:58
  • @campa I've modified OP's question and removed those errors... Commented Oct 9, 2020 at 11:55
  • What I meant is that (1) when you correct errors it would be helpful if you would describe them in the answer, and (2) you keep talking about errors (plural) while I see only one. (A couple of questionable other things, yes, but only one real error.) Commented Oct 9, 2020 at 11:57
  • @MadyYuvi: How can I get both the MathML output and the LaTeX output in the HTML file? Please advise. Commented Oct 9, 2020 at 13:25
