Skip to content

Conversation

@Myriad-Dreamin
Copy link

@Myriad-Dreamin Myriad-Dreamin commented Sep 12, 2025

A docstring is an item in the source code that holds the documentation of one or multiple language items.

This RFC (docstring syntax) describes:

  • the language items that can be documented.
  • the concept of docstring processors including the official implementation and the third-party processors.
  • docstring syntax of existing languages.
  • issues about syntax of "doc comments" and "content of a docstring".
  • A proposed design about the docstring syntax.

This RFC discusses:

  • backward compatibility of syntax.
  • customizing the docstring handlers (third-party processors).
  • error tolerance.
  • multiple-lingual docstrings and internationalization.
@Myriad-Dreamin Myriad-Dreamin marked this pull request as draft September 12, 2025 18:58
@SillyFreak SillyFreak mentioned this pull request Sep 12, 2025
2 tasks
@Myriad-Dreamin Myriad-Dreamin changed the title Add RFC for docstring syntax rfc: docstring syntax Sep 12, 2025
@Myriad-Dreamin Myriad-Dreamin marked this pull request as ready for review September 13, 2025 02:16
Copy link
Member

@tingerrr tingerrr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As of now, I'm a little confused about the overall structure, it seems the reference and guide-level explanations were lost during the conversion to Typst. I think this should be closer to the structure of the markdown template.

All in all, it seems you're in favor of adopting the tidy v4 design for this, at least from the Proposed Design section mentioning the possible places.

For the exact syntax we will require some strict write-up of when a comment starts and ends and where it can be, the examples illustrate this well already, but they are not explicit enough for implementors.

Regarding doc processors, as I've mentioned in one of the comments, I believe we should consider these in two parts, a doc string reader and a doc string consumer. The primary focus of this RFC should be what a doc string reader must be able to do and what a doc string consumer can expect.

Thanks for writing this all up!

///
/// - `a`: The first element of the tuple.
/// - `b`: The second element of the tuple.
#let (a, b) = (b, a)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's assume I'm going over a module using some kind of JSON reflection API that dumps bindings and their docs for further processing. I would assume to get back an object that mirrors how it is accessed in Typst. For example, if this was in a module named foo, then I would expect something like:

{ "type": "module", "name": "foo", "members": [ { "type": "var", "name": "a", }, { "type": "var", "name": "b", }, ], }

Because I can access foo.a and foo.b from Typst directly, and there is no object that contains a and b other than the module itself, which item is the doc string attached to then?

We could attach the sub items in the list a and b, but the actual doc string is sort of homeless.

I first wanted to mention that documenting top-level destructuring patterns seems like a weird thing to do, but this is not too uncommon for the workaround people use for private bindings, as noted in Section 1.2.


```typ
/// A show rule that does nothing.
#show: it => it
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should only document actually bound variables and modules, things that can be named by downstream consumers of a package or module.

a > b and
/// `a` must be less than `c`.
a < c
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, those are not nameable by any downstream consumers, so documenting them is sort of moot. These conditions can (and should be) be mentioned on the doc string of condition itself.


_docstring processors_ parses, checks, and renders docstrings:
- documentation tools, like `tidy`.
- analyzers and test frameworks, like `typst-ide`, `tinymist`, and `tytanic`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should split this into a doc string parser + module collector kind of thing and leave compiling/rendering/testing up to implementations with the addition of certain assumptions.

A test framework does not care much about the doc string being rendered, it just cares about examples and bindings required to compile those examples. A doc generator may only care about the content and referenced items without requiring to render examples (though it could). The language server probably doesn't want to render examples either and only requires the descriptions, but that's obviously your area of expertise.

At least to me, it seems that we should think mainly about the syntax and semantics such that all tools can agree on them and leave the rendering and processing up to them.


While a docstring has compile errors, they SHOULD be tolerated when rendering the docstring. The typicial errors SHOULD be ignored are undefined references. There are three main reasons can cause "sensible" undefined references:
- undefined processor-specific definitions: _Docstring processors_ can add unique definitions to the scope of a docstring. For example, `example` function is defined by tidy, however, it can be not defined by the official docstring processor.
- undefined definitions from the user modules: A common case is that developer would like to create examples with the current definitions provided by the package. However, a definition may be invalid when the developer is editing the code. It is not good if docstrings are not rendered during editing.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While this is a good point for the language server, I don't think we should put the assumption of a doc string not having to be valid on all consumers of doc strings, this is certainly a special case for language servers. A testing framework or doc generator should (and I can't think of anyone who would not) bail out when an error is encountered.


Two main factors can affect the backward compatibility:
- The typst version to render the docstring.
- The docstring processor used to process the docstring.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think not just the processor used, but its version too influences this, so perhaps we should add this here too.


```toml
[docs]
prelude = "@preview/tidy:0.4.0"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should note down how this exactly plays out. At first, I assumed this example means one can access a binding called tidy in their doc strings and nothing else without further configuration, but from the paragraph below it seems that it is more akin to star-importing this package.

This should probably also be an array of package specs to allow more than one.

Comment on lines +315 to +320
Despite adding complexity, not introducing new syntax has two disadvantages:
- P1: it is not friendly to add license information when we also want to add a module-level docstring.
- When commenting out code, the commented code may be occasionally recognized as a doc comment or cause failing compilation of the related docstring.
Despite adding complexity, it doesn't add much:
- regular users don't have to learn if they are not going to document code.
- the introduced syntax is relatively easy and it is easy to learn and discover. When one want to add documentation, they can soonly find the syntax in the official documentation by the keyword `docstring` or `documentation`. Tutorial can also mention the syntax in short, because it is simple.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This whole part is a little hard to read for me, is this advocating for using /// or against it?

Comment on lines +366 to +394
The Example 6 to 8 are all document nothing (incorrect places).

Example 6 (not at the start of the file):
```typ
#!/usr/bin/env typst

/// Hello
/// World.
```

Example 7 (not exactly before "Let bindings"):
```typ
#!/usr/bin/env typst

/// Hello
/// World.

#let id(x) = x
```

Example 8 (not exactly before "Let bindings"):
```typ
#!/usr/bin/env typst

/// Hello
/// World.
// Random comment.
#let id(x) = x
```
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these all be warnings or errors? I hardly see it as beneficial to make it a warning. For reference, Rust warns in places where docs are useless but not noticing it is non-fatal (on expressions) and fails in places where it's invalid (on parameters).

Comment on lines +303 to +309
Possible places and suggested by this RFC:

- Exactly before "Parameters" or "Destructed variables". This is because the parameter list in typst are usually very long.

Possible places and not suggested by this RFC:

- "Other Expressions"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these definitions for places for which you're not sure if they should be included? If yes, I think those definitions are fine, but we need to define how one exactly documents destructured variables and maybe an example consumer is supposed to do with them.

@tingerrr tingerrr mentioned this pull request Sep 14, 2025
@Myriad-Dreamin
Copy link
Author

@tingerrr tingerrr added rfc Is an RFC S-waiting-on-revision Status: Waiting on revision from author T-all RFC affects a wide portion of the ecosystem labels Oct 12, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

rfc Is an RFC S-waiting-on-revision Status: Waiting on revision from author T-all RFC affects a wide portion of the ecosystem

2 participants