lm-harmony

https://github.com/socialfoundations/lm-harmony

Which model would you rather have: the weaker student who crammed for the test, or the stronger student who walked in underprepared? Existing leaderboards mostly reward the former.

LM-Harmony is a multi-task leaderboard for model potential. Instead of judging deployment-ready performance out of the box, we use a train-before-test paradigm: every model is fine-tuned on the same benchmark-specific training set before evaluation.
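The train-before-test idea can be sketched in a few lines of Python. This is a toy illustration only, not LM-Harmony's actual code: the function name `train_before_test`, the `fine_tune` and `evaluate` callables, and the two stand-in "models" with made-up `ability` and `prep` numbers are all assumptions introduced here to mirror the student analogy.

```python
from typing import Callable, Dict, List, Sequence, TypeVar

M = TypeVar("M")  # a model, in whatever form the caller represents it


def train_before_test(
    models: Dict[str, M],
    train_set: Sequence,
    test_set: Sequence,
    fine_tune: Callable[[M, Sequence], M],
    evaluate: Callable[[M, Sequence], float],
) -> List[str]:
    """Rank model names by test score after fine-tuning each on the same training set."""
    scores = {}
    for name, model in models.items():
        tuned = fine_tune(model, train_set)      # same training data for every model
        scores[name] = evaluate(tuned, test_set)  # evaluation happens only after tuning
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical stand-ins: "ability" is underlying potential, "prep" is how well
# the model happens to be prepared for this particular benchmark (made-up values).
toy_models = {
    "crammed": {"ability": 0.6, "prep": 0.9},
    "underprepared": {"ability": 0.8, "prep": 0.5},
}


def toy_evaluate(model, data):
    # Observed score is potential scaled by preparation.
    return model["ability"] * model["prep"]


def toy_fine_tune(model, data):
    # Fine-tuning on the benchmark's training set brings every model to full preparation.
    return {**model, "prep": 1.0}


def no_tuning(model, data):
    # Direct evaluation: the model is tested as-is.
    return model


direct = train_before_test(toy_models, [], [], no_tuning, toy_evaluate)
after_tuning = train_before_test(toy_models, [], [], toy_fine_tune, toy_evaluate)

print(direct)        # ['crammed', 'underprepared']
print(after_tuning)  # ['underprepared', 'crammed']
```

In this toy setup, direct evaluation favors the better-prepared model, while fine-tuning every model on the same training set first lets the ranking reflect underlying ability.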

Across diverse tasks, LM-Harmony yields far more stable and consistent rankings than standard direct-evaluation leaderboards. If you care about which model will perform better after you fine-tune it on your own data, the ranking you see here is much more likely to generalize to your workload.
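One common way to quantify how consistent two rankings are is Kendall's tau, which compares the order of every pair of models. The sketch below is an assumption about how such agreement could be measured; the page does not say which statistic LM-Harmony uses, and the helper name `kendall_tau` and the example model names are hypothetical.

```python
from itertools import combinations
from typing import Sequence


def kendall_tau(rank_a: Sequence[str], rank_b: Sequence[str]) -> float:
    """Kendall's tau between two rankings of the same items (no ties, at least two items).

    Returns 1.0 for identical order, -1.0 for fully reversed order.
    """
    pos_a = {name: i for i, name in enumerate(rank_a)}
    pos_b = {name: i for i, name in enumerate(rank_b)}
    concordant = 0
    discordant = 0
    for x, y in combinations(rank_a, 2):
        # A pair is concordant when both rankings order x and y the same way.
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) > 0:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical rankings of three models on two different tasks.
task_1 = ["model_a", "model_b", "model_c"]
task_2 = ["model_a", "model_c", "model_b"]

print(kendall_tau(task_1, task_1))  # 1.0
print(kendall_tau(task_1, task_2))  # one of three pairs is swapped, so about 0.33
```

A leaderboard whose per-task rankings have tau close to 1.0 with one another orders models the same way regardless of task, which is the sense of "consistent" used above.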
