Fix champs-scalar-coupling to include test molecule structures #70

Closed
jvpoulos wants to merge 1 commit into openai:main from jvpoulos:fix-champs-scalar-coupling-structures

Conversation

@jvpoulos jvpoulos commented Sep 4, 2025

The prepare script explicitly filters structures.csv so that it contains only train molecules. This is incorrect for this competition: in the original Kaggle competition, structures.csv contains all molecules, both train and test. Models need the structures of the test molecules to make predictions, but those rows are currently filtered out.

Changes:

  • Modified prepare.py to include both train and test molecules in structures.csv
  • Updated checksums.yaml to reflect the new structures.csv checksum
  • Updated assertions to validate both train and test molecules
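The filtering change listed above can be sketched as follows. This is a minimal illustration, not the actual prepare.py code: the function name `select_structures` and the sample rows are hypothetical, the rows are plain dictionaries rather than whatever data structure prepare.py uses, and the `molecule_name` column is assumed to be the key shared by the train, test, and structures files.

```python
def select_structures(structures, train, test):
    """Keep structure rows for every molecule in either split (hypothetical helper)."""
    # Union of molecule names from both splits, instead of train only.
    wanted = {row["molecule_name"] for row in train} | {
        row["molecule_name"] for row in test
    }
    kept = [row for row in structures if row["molecule_name"] in wanted]

    # Validate both splits: every train and test molecule needs at least one atom row.
    missing = wanted - {row["molecule_name"] for row in kept}
    assert not missing, f"molecules without structures: {sorted(missing)}"
    return kept


# Made-up sample data for illustration only.
structures = [
    {"molecule_name": "mol_a", "atom_index": 0, "atom": "C"},
    {"molecule_name": "mol_b", "atom_index": 0, "atom": "H"},
    {"molecule_name": "mol_c", "atom_index": 0, "atom": "N"},
]
train = [{"molecule_name": "mol_a"}]
test = [{"molecule_name": "mol_b"}]

kept = select_structures(structures, train, test)
kept_names = [row["molecule_name"] for row in kept]
```

Here `mol_b` appears only in the test split and is still retained, while `mol_c`, which belongs to neither split, is dropped.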

After fixing the data preparation issue, I used a LightGBM model trained on 25% of the data to make predictions:

{
  "competition_id": "champs-scalar-coupling",
  "score": 0.5823,
  "gold_threshold": -2.87509,
  "silver_threshold": -2.03119,
  "bronze_threshold": -1.90122,
  "median_threshold": -0.9529,
  "any_medal": false,
  "gold_medal": false,
  "silver_medal": false,
  "bronze_medal": false,
  "above_median": false,
  "submission_exists": true,
  "valid_submission": true,
  "is_lower_better": true,
  "created_at": "2025-09-03T22:56:19.592098",
  "submission_path": "mlebench/competitions/champs-scalar-coupling/submission.csv"
}

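As a rough check on the report above, the sketch below shows how the medal flags could follow from the thresholds when lower scores are better. It is an assumption about the comparison rule (a score at or below a threshold earns that tier), not MLE-bench's actual grading code, and the function name `grade_flags` is hypothetical.

```python
def grade_flags(score, thresholds, is_lower_better=True):
    """Derive medal flags from a score and thresholds (assumed comparison rule)."""

    def beats(threshold):
        # Assumption: meeting the threshold exactly counts as beating it.
        return score <= threshold if is_lower_better else score >= threshold

    gold = beats(thresholds["gold"])
    silver = not gold and beats(thresholds["silver"])
    bronze = not gold and not silver and beats(thresholds["bronze"])
    return {
        "gold_medal": gold,
        "silver_medal": silver,
        "bronze_medal": bronze,
        "any_medal": gold or silver or bronze,
        "above_median": beats(thresholds["median"]),
    }


# Values copied from the grading report.
flags = grade_flags(
    0.5823,
    {"gold": -2.87509, "silver": -2.03119, "bronze": -1.90122, "median": -0.9529},
)
```

Because 0.5823 is larger than every threshold, including the median of -0.9529, all flags come out false under this rule, which matches the report.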
@thesofakillers
Contributor

Thank you for catching this. You are correct: there is a mistake in prepare.py, and your fix seems right.

As explained in the readme in #66, we won't be merging this fix yet; we will release it as part of a batch of fixes in an upcoming v2, to be released on openai/preparedness. I've added it as tracked in #71. I will try to put you as co-author when we release the fix.

For submissions to the v1 leaderboard, please proceed as if this issue was not present.

thesofakillers added a commit that referenced this pull request Sep 8, 2025
thesofakillers added a commit that referenced this pull request Oct 8, 2025
  • catalogue issue described in #70
  • catalogue #77
  • its called frontier-evals now
li-seeker pushed a commit to ycs-atc/mle-bench that referenced this pull request Dec 17, 2025
li-seeker pushed a commit to ycs-atc/mle-bench that referenced this pull request Dec 17, 2025
…ion (openai#78)
  • catalogue issue described in openai#70
  • catalogue openai#77
  • its called frontier-evals now