Published codeskills-bench: a small benchmark for the code changes engineers still hesitate to trust agents with.
23 small Python tasks covering migrations, merge conflicts, hidden side-effects, and subtle bugs.
Details here: https://huggingface.co/blog/namanvats/codeskill-bench