The Bestiary Collection: Decensored language models made using Heretic (https://github.com/p-e-w/heretic) • 6 items • Updated Nov 16 • 75
Online RLHF Collection: Datasets, code, and models for online RLHF (i.e., iterative DPO) • 19 items • Updated Jun 12, 2024 • 5