
Pull requests: pytorch/rl


Pull requests list

[Test] Add tests and benchmarks for collector throughput optimizations
Labels: Benchmarks, CLA Signed, Collectors, Tests
#3567 opened Mar 24, 2026 by vmoens
[Performance] Enable torch.compile(fullgraph=True) for step_and_maybe_reset
Labels: CLA Signed, Performance
#3566 opened Mar 23, 2026 by vmoens
[Performance] Add fast path for step() and TransformedEnv._step() when _trust_step_output is set
Labels: CLA Signed, Performance, Transforms
#3565 opened Mar 23, 2026 by vmoens
[Performance] Streamline collector inner loop carrier update
Labels: CLA Signed, Collectors, Performance
#3564 opened Mar 23, 2026 by vmoens
[Performance] Add update_traj_ids flag to Collector to skip trajectory tracking
Labels: CLA Signed, Collectors, Performance
#3563 opened Mar 23, 2026 by vmoens
[Performance] Add _trust_step_output flag to skip step validation overhead
Labels: CLA Signed, Performance, Transforms
#3562 opened Mar 23, 2026 by vmoens
[Performance] Add out= parameter to _StepMDP for output buffer reuse
Labels: CLA Signed, Performance
#3561 opened Mar 23, 2026 by vmoens
[Performance] Add _skip_maybe_reset flag to bypass auto-reset in step_and_maybe_reset
Labels: CLA Signed, Performance
#3560 opened Mar 23, 2026 by vmoens
Fix vLLM >= 0.17 compatibility: migrate to native WeightTransferConfig API
Labels: CLA Signed, llm/, Modules, sota-implementations/, WeightUpdate
#3556 opened Mar 21, 2026 by vmoens
Export torchrl_logger from torchrl.__init__ for GRPO scripts
Labels: CLA Signed
#3555 opened Mar 21, 2026 by vmoens
[CI] Add torch_geometric integration tests
Labels: CI, CLA Signed, Environments
#3552 opened Mar 14, 2026 by vmoens
1 of 2 tasks
[LLM] Wire MATH and Countdown into GRPO and Expert Iteration scripts
Labels: CLA Signed, sota-implementations/
#3546 opened Mar 5, 2026 by vmoens
[LLM] Add Countdown numbers-game environment
Labels: CLA Signed, Documentation, llm/
#3545 opened Mar 5, 2026 by vmoens
[LLM] Add MATH (competition mathematics) environment
Labels: CLA Signed, Documentation, llm/
#3544 opened Mar 5, 2026 by vmoens
[LLM] Simplify IFEval reward aggregator
Labels: CLA Signed, llm/
#3543 opened Mar 5, 2026 by vmoens
[LLM] Rewrite GSM8K reward function to follow standard GRPO conventions
Labels: CLA Signed, llm/, sota-implementations/
#3542 opened Mar 5, 2026 by vmoens
[CI] Install torchcodec from source
Labels: CI, CLA Signed, Record
#3541 opened Mar 4, 2026 by vmoens
[Feature] PILCO
Labels: CI, CLA Signed, Documentation, Feature, Modules, Objectives, sota-implementations/, Transforms
#3537 opened Feb 27, 2026 by PSXBRosa
3 of 6 tasks
[Feature] Added Genesis Environments to TorchRL
Labels: CI, CLA Signed, Environments, Feature
#3536 opened Feb 26, 2026 by ParamThakkar123
6 of 10 tasks
[Feature] Added Lazy implementation of priority updates for replaybuffer prototype
Labels: CLA Signed, Feature, ReplayBuffers
#3507 opened Feb 13, 2026 by ParamThakkar123
3 of 10 tasks
[Feature] Added support for TDMPC2 dataset
Labels: CI, CLA Signed, Data, Documentation, Environments, Feature
#3501 opened Feb 12, 2026 by ParamThakkar123
6 of 10 tasks
[Feature] Added OpenEnv environments
Labels: CI, CLA Signed, Documentation, Environments, Feature, llm/, Trainers
#3470 opened Feb 9, 2026 by ParamThakkar123
6 of 10 tasks
[Feature] Extended Support delayed spec initialization for exploration modules
Labels: CLA Signed, Feature, Modules
#3450 opened Feb 5, 2026 by ParamThakkar123
3 of 10 tasks
[Feature] Added MCTSPolicyBase, MCTSPolicy, AlphaGoPolicy, AlphaStarPolicy, and MuZeroPolicy
Labels: CLA Signed, Documentation, Feature, Modules
#3449 opened Feb 5, 2026 by ParamThakkar123
6 of 10 tasks
[Algorithm] DPO
Labels: CLA Signed, Documentation, llm/, Objectives
#3427 opened Jan 31, 2026 by vmoens