Beta version: Information might not be fully accurate. Please report any discrepancies.
Extra-hard subset of BIG-bench focusing on challenging reasoning and knowledge tasks.
Score Distribution