Curia: A Multi-Modal Foundation Model for Radiology Paper • 2509.06830 • Published Sep 8, 2025 • 20
cmarkea/distilcamembert-base-sentiment Text Classification • 68.1M • Updated Oct 26, 2024 • 5.9k • 37
Post: Wow, impressive 340B model by NVIDIA with a nice permissive license! The technical report is full of insights and seems to use a different learning rate schedule than cosine, probably a variant of WSD. Hope to get more info on that! nvidia/nemotron-4-340b-666b7ebaf1b3867caf2f1911