Arena-Hard-Auto: An automatic LLM benchmark.