add(grammargen): combined language/blob generation API#12
Conversation
- Add GenerateLanguageAndBlob and GenerateLanguageAndBlobWithContext for combined language/blob generation in a single pass - Provides direct access to both compiled Language struct and serialized blob without requiring callers to use diagnostic reports - Follows existing API pattern with both context-less and context-aware variants for cancellation support
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request enhances the Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
| | ||
| // GenerateLanguageAndBlob compiles a Grammar into both a Language struct and | ||
| // its serialized blob form in a single generation pass. | ||
| func GenerateLanguageAndBlob(g *Grammar) (*gotreesitter.Language, []byte, error) { |
There was a problem hiding this comment.
This zero-context wrapper keeps the public API parallel to GenerateLanguage/GenerateLanguageWithContext, so callers that only need the outputs do not have to switch over to the diagnostics-oriented GenerateWithReport surface.
| // GenerateLanguageAndBlobWithContext is like GenerateLanguageAndBlob but | ||
| // accepts a context for cancellation. | ||
| func GenerateLanguageAndBlobWithContext(ctx context.Context, g *Grammar) (*gotreesitter.Language, []byte, error) { | ||
| report, err := generateWithReportCtx(ctx, g, reportBuildOptions{ |
There was a problem hiding this comment.
This is the crux of the change: reuse the existing combined generation path so the language and blob come from one pass, rather than asking downstream callers to do GenerateLanguage plus a separate blob generation step.
There was a problem hiding this comment.
Code Review
This pull request introduces GenerateLanguageAndBlob and GenerateLanguageAndBlobWithContext to provide a convenient API for generating both a compiled Language and its serialized blob in a single pass. The implementation is a clean, thin wrapper around existing internal functionality. My main feedback is to add a dedicated test case for this new public API to ensure its correctness and long-term maintainability, a point you also raised in the PR description.
| func GenerateLanguageAndBlob(g *Grammar) (*gotreesitter.Language, []byte, error) { | ||
| return GenerateLanguageAndBlobWithContext(context.Background(), g) | ||
| } |
There was a problem hiding this comment.
While this is a thin wrapper, adding a dedicated test case would be beneficial to verify the public API contract. It would ensure that both the Language object and the []byte blob are generated correctly in one pass and that they are consistent with each other.
You've noted this in the PR description, and I agree that adding a test would improve long-term maintainability. A simple test could:
- Call
GenerateLanguageAndBlob. - Verify both returned values are non-nil.
- Use the returned
*gotreesitter.Languageto parse a sample input and check for errors. - Decode the returned
[]byteblob into a new language object. - Use the decoded language to parse the same input and verify the resulting S-expression matches the one from step 3.
What does this PR do?
Adds
grammargen.GenerateLanguageAndBlobandgrammargen.GenerateLanguageAndBlobWithContextso callers can get both the compiledLanguageand the serialized blob from one generation pass, without having to depend on the diagnostic report API.This is intended to unblock downstream callers like
danmujithat want to cache the blob but still need the liveLanguageobject immediately.Why this approach?
The combined generation path already exists internally via
generateWithReportCtx(..., includeLanguage: true, includeBlob: true)and is exposed indirectly throughGenerateWithReport. The gap is that callers who only need the operational outputs have to opt into a diagnostics-oriented API.A small public wrapper keeps the common case direct, preserves the existing
GenerateLanguage/GenerateLanguageWithContextshape, and avoids forcing downstream code to couple to report internals.Correctness
go test ./...)go test ./grammars/ -run TestSupportedLanguagesParseSmoke)go test ./grammars/ -run TestCorrectness)go test -tags 'cgo treesitter_c_parity' ./cgo_harness/)Notes:
Performance
go test -bench=. -benchmem -count=5)Notes:
Maintainability
Notes:
Self-review
Review your own diff before requesting review from others. Walk through it as if you're seeing it for the first time. Leave comments on your own PR pointing out the notable parts — the tricky bits, the non-obvious decisions, anything a reviewer's eye would naturally land on. Comment on what you think the crux of the solution is and why.
Notes:
Due diligence
Notes: