From Sutton and Barto's book Reinforcement Learning (Adaptive Computation and Machine Learning series), the following definition is given for Q-learning:
I'm planning to ask a question about combining the above algorithm with policy gradient learning, but I'm struggling to format it correctly with MathJax. Here is what I have so far, which looks awful in comparison to the above algorithm:
$$
Algorithm \hspace{1mm} parameters: step size \hspace{1mm} \alpha \in (0 , 1] , \epsilon > 0 \\
Initialize \hspace{1mm} Q \hspace{1mm} ( s, a ), \ \forall s \in S^+ , a \in A ( s ), arbitrarily \hspace{1mm} except \hspace{1mm} that \hspace{1mm} Q ( terminal , . ) = 0 \\
Loop \hspace{1mm} for \hspace{1mm} each \hspace{1mm} step \hspace{1mm} of \hspace{1mm} episode: \\
Choose \hspace{1mm} A \hspace{1mm} from \hspace{1mm} S \hspace{1mm} using \hspace{1mm} some \hspace{1mm} policy \hspace{1mm} derived \hspace{1mm} from \hspace{1mm} Q (eg \hspace{1mm} \epsilon \hspace{1mm} greedy)
$$
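I suspect something using an `aligned` environment with `\text{...}` for the prose parts (rather than `\hspace`) would be closer to the book's layout, but I'm not sure it is the idiomatic way to do it. Here is a rough sketch of that idea, covering only the same four lines as my attempt above, and assuming the site's MathJax supports the amsmath-style `aligned` environment, `\text`, `\textbf` and `\textit`:

$$
\begin{aligned}
&\textbf{Algorithm parameters:} \text{ step size } \alpha \in (0, 1],\ \epsilon > 0 \\
&\text{Initialize } Q(s, a),\ \forall s \in \mathcal{S}^+,\ a \in \mathcal{A}(s), \text{ arbitrarily except that } Q(\textit{terminal}, \cdot) = 0 \\
&\text{Loop for each step of episode:} \\
&\quad \text{Choose } A \text{ from } S \text{ using some policy derived from } Q \text{ (e.g. } \epsilon\text{-greedy)}
\end{aligned}
$$

This keeps the words upright and uses `\quad` for indentation, but it still doesn't match the pseudocode in the image.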
Can some pointers on writing out RL algorithms with MathJax be shared? Ideally, can my MathJax code be amended so that it renders the same output as the Q-learning algorithm above (in the image)?
Is MathJax being used on this site, or some other math-notation rendering library?
