Non-record: MTP Auxiliary Loss Exploration #955

Open
genji0306 wants to merge 2 commits into openai:main from genji0306:approach-mtp-auxiliary-loss

@genji0306

Summary

This PR explores the existing Multi-Token Prediction (MTP) code path in train_gpt.py. The path is fully implemented but disabled (MTP_NUM_HEADS=0) in all current submissions. Because MTP heads are excluded from the exported artifact, enabling them as an auxiliary loss during training has no impact on the 16MB constraint.
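
To make the idea concrete, here is a minimal, dependency-free sketch of an MTP auxiliary loss and of dropping the extra heads before export. It is not the code in train_gpt.py. The function names, the `mtp_heads.` key prefix, the `weight` parameter, and the convention that head `k` predicts the target `k + 1` positions further ahead are all assumptions made for illustration.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy of one logit vector against an integer target id."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def mtp_aux_loss(head_logits, targets, weight):
    """Weighted mean cross-entropy over the auxiliary heads.

    head_logits[k][t] is the logit vector of (hypothetical) head k at position t.
    targets[t] is the usual next-token target at position t.
    Assumption: head k is trained against targets[t + k + 1].
    """
    if not head_logits or weight <= 0.0:
        return 0.0  # mirrors the "disabled" case, e.g. zero heads
    head_losses = []
    for k, logits_k in enumerate(head_logits):
        offset = k + 1
        valid = len(targets) - offset  # positions that still have a target
        if valid <= 0:
            continue
        total = sum(
            cross_entropy(logits_k[t], targets[t + offset]) for t in range(valid)
        )
        head_losses.append(total / valid)
    if not head_losses:
        return 0.0
    return weight * sum(head_losses) / len(head_losses)


def strip_mtp_heads(state, prefix="mtp_heads."):
    """Return a copy of a name -> value mapping without the auxiliary heads."""
    return {name: value for name, value in state.items() if not name.startswith(prefix)}


# Toy usage: two heads with uniform logits over a 4-token vocabulary.
targets = [1, 2, 3, 0, 1]
uniform = [[0.0] * 4 for _ in targets]
aux = mtp_aux_loss([uniform, uniform], targets, weight=0.5)

state = {"lm_head.weight": "W", "mtp_heads.0.weight": "A", "mtp_heads.1.weight": "B"}
exported = strip_mtp_heads(state)
```

In this sketch the auxiliary term would be added to the main next-token loss during training only, and `strip_mtp_heads` shows why the heads would not count toward the artifact size: they are filtered out before anything is serialized.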

Status

Pending H100 validation. Will update with results after compute access.

Related tooling: a systematic orchestration framework that coordinates solution analysis, hypothesis generation, experiment design, and automated testing for Parameter Golf. It includes a control plane CLI, JSON schemas, metric helpers, and an experiment materializer. Full source: https://github.com/genji0306/parameter-golf-orchestrator