Pull requests: openai/parameter-golf
Record: GDN-Hybrid (Gated DeltaNet + Sliding Window Attention) - quantized_bpb 1.02046
#1562 opened Apr 12, 2026 by joshkmartinez
Record: SP8192 + Triple Recurrence + Banking + Fused MLP + Muon 0.97 — val_bpb 1.0783 (3-seed mean)
#1561 opened Apr 12, 2026 by EthanYangTW
Record: VarLen Attention + Triton Fused MLP + Doc-TTT + Warmdown 0.75 + Chunk 48 — val_bpb 1.07406 (3-seed mean)
#1560 opened Apr 12, 2026 by dexhunter Contributor
11 tasks done
[Architectural Proof-of-Concept] Saliency-Boosted GPTQ & High-Entropy Routing
#1558 opened Apr 12, 2026 by Subramanyam6
Record: SP8192 + Improved Parallel Residuals + Muon 0.97 + TTT 5ep + N-gram Tilt + Hessian SDClip — val_bpb 1.07730
#1557 opened Apr 12, 2026 by ndokutovich
7 tasks done
Non-Record: JEPA-NTP Auxiliary Losses (Negative Result)
#1556 opened Apr 12, 2026 by sidhanth97
Record: TMA Megakernel + Improved Parallel Residuals + Tap-In min_match=1 — val_bpb 1.07636 (3-seed mean)
#1555 opened Apr 11, 2026 by andrewbaggio1
11 tasks done
Non-record: Universal Transformer + Legal Pre-Quant TTT (Training-Slice Variant)
#1554 opened Apr 11, 2026 by dentity007
Non-record: GDN-Hybrid (Gated DeltaNet + SWA) — val_bpb 1.209735
#1553 opened Apr 11, 2026 by Abhishek8108
RecurLoRA v2: Pass Index Embeddings + Low-Rank Adapters on SP8192 Depth Recurrence
#1552 opened Apr 11, 2026 by Tanush1912
4 tasks
[notable non record] Train-Time Overparameterization: Better Models Through Transient Expansion
#1551 opened Apr 11, 2026 by andrewmouldon
Non-record: Pre-Quant AdamW TTT (Compiled) + SP8192 + Depth Recurrence — val_bpb 1.0587 (3-seed mean)
#1550 opened Apr 11, 2026 by translatingthename
4 tasks done
Non-record: Frozen Random Backbone + Rank-304 LoRA Adapters (val_bpb 1.3220)
#1549 opened Apr 11, 2026 by dljr-github
[non record] Investigating the Tied-Embedding Bottleneck: Why Boundary Blocks Underperform and What It Means for 16MB Models
#1546 opened Apr 11, 2026 by SPThole
PTQ int6-attn + int5-mlp, 20L×256d, mlp=5 — val_bpb 1.3286
#1543 opened Apr 11, 2026 by PavelPaha
Submission: Recursive Layer Sharing | 13.9 MB | 1.53 BPB
#1542 opened Apr 11, 2026 by negrurv
Record: SP8192 + Improved Parallel Residuals + Muon 0.97 + LR 0.03 + Legal TTT — val_bpb 1.07785 (3-seed mean)
#1541 opened Apr 11, 2026 by bigbag
Record: SP8192 + VarLen Attention + LoRA TTT + Fused MLP — val_bpb 1.0777 (3-seed mean)
#1540 opened Apr 11, 2026 by aryanbhosale Contributor
records: David Ghazaryan — MoE + BigramHash4096 (val_bpb 1.11799)
#1538 opened Apr 11, 2026 by davie2009kh
Non-Record: CAT, Sparsity (Structured and Hessian-Guided), MoE, KAN Negative Results
#1537 opened Apr 11, 2026 by pireylow
Non-record: SP8192 + VarLen Attention + Doc-Independent LoRA TTT + Banking + Muon 0.97 — val_bpb 1.07747 (3-seed mean)
#1536 opened Apr 11, 2026 by dexhunter Contributor
Recursive Transformer - Non-Record Submission — 1.07424983 val_bpb (4h depth-recurrent hybrid transformer run)
#1535 opened Apr 11, 2026 by newjordan
SP4096 + Depth Recurrence + Parallel Residuals + Legal N-Gram
#1534 opened Apr 11, 2026 by someone114514
Record: SP8192 + Banking + Triple Recurrence + Parallel Residuals + Muon 0.97 + TTT — val_bpb 1.0790 (5-seed mean)
#1533 opened Apr 11, 2026 by aryanbhosale Contributor