$\begingroup$

This is an algorithm that combines path tracing with reinforcement learning. I can't work out what $p_\omega$ is.

enter image description here

The algorithm itself is clear. The actions are directions and the states are hit points. To update the Q-values, hit points are binned into the cells of a discretization of the scene, such as a Voronoi diagram, and ray directions are drawn from a discretization of the hemisphere surface into patches of equal area. The Q-values are initialized to the same value for every state.

First, we generate a camera ray. When it hits a diffuse surface, it scatters according to a discrete PDF obtained by normalizing the Q-values of the hit point's state.
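The patch-selection step can be sketched roughly as follows. This is only a minimal illustration, assuming the Q-values of one state are stored in a plain list; the function name `sample_patch` and the use of an inverse-CDF walk are my own choices for the example, not something taken from the paper.

```python
def sample_patch(q_values, u):
    """Pick a hemisphere patch with probability q_k / sum(q_values).

    q_values: Q-values of the current state, one per patch (assumed non-negative).
    u: a uniform random number in [0, 1).
    Returns (k, p_s): the chosen patch index and its discrete probability.
    """
    total = sum(q_values)
    cumulative = 0.0
    for k, q in enumerate(q_values):
        cumulative += q / total
        if u < cumulative:
            return k, q / total
    # Floating-point round-off can leave cumulative slightly below 1;
    # in that case fall back to the last patch.
    k = len(q_values) - 1
    return k, q_values[k] / total
```

With Q-values `[1, 1, 2]`, the third patch is chosen for any `u` in `[0.5, 1)`, that is, with probability 0.5.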

Second, the Q-values are updated along the path: from the state $s$, the next state $s'$ is reached by following the direction sampled from the PDF of the Q-values in $s$.

The area of each patch on the hemisphere is $2\pi/n$, so the integral in the Q-value update equation is approximated like this:

$\dfrac{2\pi}{n}\sum_{k=0}^{n-1}Q_k(y)f_s(\omega_k, y, -\omega)\cos\theta_k$
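As a rough sketch, the sum above could be evaluated like this. It assumes the per-patch quantities $Q_k(y)$, $f_s(\omega_k, y, -\omega)$ and $\cos\theta_k$ have already been gathered into three lists of equal length; the function name `patch_integral` and its parameters are hypothetical.

```python
import math

def patch_integral(q_next, brdf, cos_theta):
    """Approximate the hemisphere integral by a sum over n equal-area patches.

    q_next:    Q_k(y) for each patch k at the next hit point y.
    brdf:      f_s(omega_k, y, -omega) for each patch direction omega_k.
    cos_theta: cos(theta_k) for each patch direction.
    """
    n = len(q_next)
    patch_area = 2.0 * math.pi / n  # solid angle of one patch
    return patch_area * sum(q * f * c for q, f, c in zip(q_next, brdf, cos_theta))
```

For example, with four patches, all $Q_k = 1$, a constant BRDF of $1/\pi$ and $\cos\theta_k = 0.5$ everywhere, the sum evaluates to $1$.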

I understand the rest of the algorithm, but I don't know what $p_\omega$ is in the second part. Is it $1/2\pi$ or $n/2\pi$? Could someone prove which one it really is? I need the proof.

My guess is $1/2\pi$, because the path tracer here only uses the PDF derived from the Q-values, not the Q-values themselves. Am I correct? If so, how can I prove it?

EDIT: To clarify, once a patch is chosen, the ray direction is sampled uniformly within that patch, and the Q-values are computed as I described. Only the choice of direction changes, so more samples are generated inside the patches that receive more light.

I used $n/2\pi$, but the scene came out very dark, so the PDF must be either $1/2\pi$ or a normalized combination of the patches' PDFs. If it is the latter, what exactly is it?

EDIT 2: Should I ask this question in the statistics community of stack exchange?

$\endgroup$
  • $\begingroup$ I haven't looked into this beyond what you posted, but this seems like the probability with which you sampled your new ray. Since it mentions proportional to Q, I am assuming you do some inverse transform sampling, so you need to take it into account. $\endgroup$ Commented Nov 6, 2019 at 15:58

1 Answer

$\begingroup$

I have to answer my own question. Suppose all patches have the same value. Then choosing a patch uniformly and selecting a point uniformly on that patch is the same as sampling the hemisphere uniformly. The probability of choosing patch $k$ is $p_s=\Pr(Q_k \in Q)=1/n$, where $n$ is the number of patches, and the density of a uniform point within a patch of area $2\pi/n$ is $n/2\pi$, so I can conclude that $p_\omega=p_s\cdot(n/2\pi)=1/2\pi$.

If the patches don't have the same value, then $p_s=\Pr(Q_k \in Q)=q_k/(q_1+\dots+q_n)$ and $p_\omega=p_s\cdot(n/2\pi)=(q_k/(q_1+\dots+q_n))\cdot(n/2\pi)$.
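A small sketch of this density, assuming the direction is sampled uniformly inside the chosen patch and all patches have equal area $2\pi/n$; the function name `direction_pdf` is made up for the example.

```python
import math

def direction_pdf(q_values, k):
    """Density p_omega of a direction sampled inside patch k.

    p_s is the discrete probability of choosing patch k, and n / (2*pi)
    is the density of a uniform direction within one equal-area patch.
    """
    n = len(q_values)
    p_s = q_values[k] / sum(q_values)
    return p_s * n / (2.0 * math.pi)
```

With equal Q-values this reduces to $1/2\pi$, and multiplying the density of each patch by the patch area $2\pi/n$ and summing over all patches gives $1$, as a PDF over the hemisphere should. It would also be consistent with the dark image reported in the question: with equal Q-values, dividing by $n/2\pi$ alone divides each sample by a value $n$ times too large.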

$\endgroup$
  • $\begingroup$ The whole point of this method is that patches wouldn't have the same value, that's where you gain something. $\endgroup$ Commented Nov 6, 2019 at 18:22
