NVIDIA / TransformerEngine Public

Pull requests: NVIDIA/TransformerEngine

91 Open 1,744 Closed

Pull requests list

support cuda graph capture offloading module

#2435 opened Dec 1, 2025 by lhb8125 • Draft

13 tasks

[PyTorch] Add FA4 Support

#2432 opened Nov 28, 2025 by yaox12 • Draft

1 of 16 tasks

[Pytorch][Bug]MXFP8 Split tensor Bug fix

#2427 opened Nov 26, 2025 by vthumbe1503

2 of 13 tasks

[PyTorch] Convert sample tuple to list in cudagraph input reuse

#2426 opened Nov 26, 2025 by buptzyb

13 tasks

Fix FusedAdam DTensor compatibility issue

#2425 opened Nov 26, 2025 by shjwudp

13 tasks

[JAX] Add tutorial for integrating TE/JAX quantization into an existing framework

#2423 opened Nov 26, 2025 by jberchtold-nvidia

8 of 13 tasks

[JAX] Wrapper for Permutation Triton kernel MoE

#2419 opened Nov 25, 2025 by tdophung • Draft

9 of 16 tasks

[Common] Add kFloat64 partial support

#2417 opened Nov 24, 2025 by phu0ngng

7 of 13 tasks

[Common] Persistent NVFP4 cast + transpose kernel 2.11.0

#2412 opened Nov 21, 2025 by Oleg-Goncharov

6 of 13 tasks

[PyTorch][NVFP4][MOE] NVFP4 Grouped Quantize with Hadamard Transform MoE

#2411 opened Nov 21, 2025 by zhongbozhu

2 of 16 tasks

[Common] NVTEGroupedTensor class and helpers MoE

#2388 opened Nov 14, 2025 by phu0ngng

7 of 13 tasks

Enables specified cp rank slicing

#2387 opened Nov 14, 2025 by jomitchellnv

1 of 13 tasks

[JAX] Re-use RHT matrix constant

#2386 opened Nov 14, 2025 by jberchtold-nvidia • Draft

8 of 13 tasks

[Draft] TopK Fusion to JAX MoE

#2385 opened Nov 14, 2025 by mingxu1067

5 of 13 tasks

Set RPATH for cuda libraries from python package

#2381 opened Nov 14, 2025 by take-cheeze • Draft

4 of 13 tasks

[JAX] Add CP + THD + AG + Striped>1 + SWA support

#2379 opened Nov 13, 2025 by KshitijLakhani

8 of 13 tasks

[JAX] NVFP4 2D 1x1x for Weight

#2365 opened Nov 10, 2025 by phu0ngng • Draft

13 tasks

[JAX] cuBlasMp integration for CollectiveGemm custom op

#2361 opened Nov 7, 2025 by denera

5 of 13 tasks

Add device-Initiated Grouped GEMM supporting m_splits on device MoE

#2360 opened Nov 7, 2025 by QiZhangNV

1 of 13 tasks

Add num_splits support for FA3 backend

#2357 opened Nov 6, 2025 by wdykas

13 tasks
