
[Architectural Proof-of-Concept] Saliency-Boosted GPTQ & High-Entropy Routing#1558

Open
Subramanyam6 wants to merge 4 commits into openai:main from Subramanyam6:submission/adamw-vt-gptq

@Subramanyam6

This PR is submitted as an architectural proof-of-concept and an explicit pitch for the $500
RunPod Advanced Competitor Grant.

Due to a metric scaling bug (bytes_per_token=3.5), we initially logged a pre-quantization BPB of
0.9756. The corrected, true pre-quantization BPB is 1.402. The run was starved of convergence by a
10x throughput deficit from training without Flash Attention 3.
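
For reviewers checking the correction, here is a minimal sketch of the arithmetic. It assumes the bug was that 3.5 was used as a fixed bytes-per-token divisor when converting bits per token to bits per byte; the PR does not spell this out. The function names are hypothetical and do not come from train_gpt.py. The implied bytes-per-token value is back-solved from the two BPB figures above, not measured.

```python
def implied_bytes_per_token(logged_bpb: float, assumed_bpt: float, true_bpb: float) -> float:
    """Back-solve the bytes-per-token ratio that reconciles the two BPB figures.

    Assumption: BPB = bits_per_token / bytes_per_token, and the logged value
    used `assumed_bpt` as the divisor.
    """
    bits_per_token = logged_bpb * assumed_bpt
    return bits_per_token / true_bpb


def correct_bpb(logged_bpb: float, assumed_bpt: float, actual_bpt: float) -> float:
    """Rescale a BPB value logged with the wrong bytes-per-token divisor."""
    return logged_bpb * assumed_bpt / actual_bpt


# Figures quoted in this PR description.
LOGGED_BPB = 0.9756
ASSUMED_BPT = 3.5
TRUE_BPB = 1.402

actual_bpt = implied_bytes_per_token(LOGGED_BPB, ASSUMED_BPT, TRUE_BPB)
print(f"implied bytes per token: {actual_bpt:.4f}")
print(f"corrected BPB: {correct_bpb(LOGGED_BPB, ASSUMED_BPT, actual_bpt):.4f}")
```

Under that assumption, the two reported numbers are consistent with a validation set averaging roughly 2.44 bytes per token rather than 3.5.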

However, the architectural innovations introduced here, Optimizer Saliency-Boosted GPTQ (a
zero-byte architectural shield) and High-Entropy Routing (QK-Gain 4.0), are, to our knowledge, novel.
We are requesting the compute grant to debug the Int6 outlier clamping and build the Latent-PABU CUDA
kernel, which would clear our path to the top leaderboard spot.

Bala Subramanyam Duggirala added 4 commits April 11, 2026 18:53
Adds records/track_10min_16mb/2026-04-08_11L_XSA_BigramSVD_AdamWvtGPTQ with README.md, submission.json, train_gpt.py, and 3-seed training logs. Made-with: Cursor
…fields - requirements.txt for numpy, sentencepiece, zstandard (per challenge FAQ) - submission.json: val_loss, corrected multi-seed means/std for transparency Made-with: Cursor