I'm implementing Algorithm S.1 (Pixel Ground Truth Generation) from the ARU-Net paper by Grüning et al., but I'm having trouble understanding a few key steps in the context of non-linear (curved) baselines.
In particular, I'm confused about the following:
Local Text Orientation (θ): The paper mentions computing the local text orientation for each baseline P, but does not define how. If the baseline is curved, what is the proper way to estimate θ — at each point individually, or once for the entire line?
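For context, here is my current attempt at estimating θ per point: I fit a principal direction to a small sliding window of neighbouring baseline points. The window size and the PCA-based fit are my own assumptions, not anything the paper specifies.

```python
import numpy as np

def local_orientation(points, window=2):
    """Estimate the local text orientation theta (radians) at each
    point of a polygonal baseline by fitting a principal direction to
    a neighbourhood of up to `window` points on each side.

    `points` is an (n, 2) array of (x, y) baseline coordinates.
    NOTE: the window size of 2 is a guess on my part.
    """
    points = np.asarray(points, dtype=float)
    n = len(points)
    thetas = np.empty(n)
    for i in range(n):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        nbr = points[lo:hi]
        # Principal direction of the neighbourhood via SVD (PCA):
        centred = nbr - nbr.mean(axis=0)
        _, _, vt = np.linalg.svd(centred, full_matrices=False)
        dx, dy = vt[0]
        thetas[i] = np.arctan2(dy, dx)
    return thetas
```

For a straight line this reduces to a constant θ (up to the ±π sign ambiguity of the fitted direction), which matches the simple case; I just don't know if a windowed fit like this is what the authors intend for curved baselines.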
Interline Distance (d): The algorithm requires the interline distance of P, but how is this determined when adjacent baselines are non-linear? Is a single global average computed, or is d estimated locally per baseline?
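My current approximation, in case it helps clarify the question: for each point of P I take the distance to the closest point of the adjacent baseline and then use the median. Both the nearest-point measure and the median aggregation are assumptions on my part.

```python
import numpy as np

def interline_distance(baseline, neighbour):
    """Approximate the interline distance between `baseline` and an
    adjacent baseline `neighbour` (both (n, 2) arrays of (x, y)
    points): for each point of `baseline`, take the distance to the
    closest point of `neighbour`, then return the median so curved
    sections don't skew the estimate.

    NOTE: using a per-pair median rather than one global average is
    my guess, not something I found in the paper.
    """
    baseline = np.asarray(baseline, dtype=float)
    neighbour = np.asarray(neighbour, dtype=float)
    # Pairwise distances between every point of the two chains:
    diff = baseline[:, None, :] - neighbour[None, :, :]
    dists = np.linalg.norm(diff, axis=2)
    return float(np.median(dists.min(axis=1)))
```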
Polygonal Chain of Length d and Orientation θ + 90°: How exactly should this be constructed at the endpoints p₁ and pₙ of a baseline? If the line is curved, do you compute θ at those endpoints?
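What I'm currently doing at each endpoint, assuming θ there is whatever the local estimate gives (this is exactly the part I'm unsure about): build a single segment of length d rotated 90° from θ.

```python
import numpy as np

def endpoint_separator(p, theta, d):
    """Build the separator segment at a baseline endpoint p: a chain
    of length d oriented at theta + 90 degrees, i.e. a quarter turn
    from the local text orientation at that endpoint.

    NOTE: taking theta from a one-sided estimate at the endpoint, and
    extending the chain on one side of the baseline only, are both my
    assumptions.
    """
    p = np.asarray(p, dtype=float)
    normal = np.array([np.cos(theta + np.pi / 2),
                       np.sin(theta + np.pi / 2)])
    # Return the two endpoints of the perpendicular segment.
    return p, p + d * normal
```

For a horizontal baseline (θ = 0) this produces a vertical segment of length d at the endpoint, which matches my intuition for the simple case.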
I’ve read the relevant sections (especially Def. 3.3.2 and 3.3.3) but still don’t have an intuitive or practical understanding. For straight baselines it's pretty straightforward, since θ is constant along the line. I've provided the algorithm and an example image where you can see both the baselines and the separators.

