Models

611

Active filters: rlhf

TheBloke/CapybaraHermes-2.5-Mistral-7B-GGUF

7B • Updated Jan 31, 2024 • 2.51k • 128

MYTH-Lab/GoT-R1-8B

Text Generation • 8B • Updated May 28 • 30 • 1

MYTH-Lab/GoT-R1-14B

Text Generation • 15B • Updated May 28 • 90 • 2

mradermacher/GoT-R1-8B-GGUF

Text Generation • 8B • Updated Jun 26 • 137 • 1

mradermacher/GoT-R1-14B-GGUF

Text Generation • 15B • Updated Jun 26 • 136 • 1

mradermacher/GoT-R1-8B-i1-GGUF

Text Generation • 8B • Updated Jun 26 • 375 • 1

mradermacher/GoT-R1-14B-i1-GGUF

Text Generation • 15B • Updated Jun 26 • 341 • 1

Junjun2333/HPSv3-PlusPlus

Image-Text-to-Text • Updated Jun 23 • 33 • 2

idealab-cs2/reappraisal-4b-grpo-committee

Text Generation • 4B • Updated 6 days ago • 623 • 1

dataopsnick/diffusiongemma-26B-A4B-it-ByzantineSilk

Text Generation • Updated 4 days ago • 76 • 1

sileod/deberta-v3-base-tasksource-nli

Zero-Shot Classification • 0.2B • Updated Aug 13, 2024 • 10.8k • 133

stanfordnlp/SteamSHP-flan-t5-xl

Updated Oct 10, 2023 • 55 • 43

stanfordnlp/SteamSHP-flan-t5-large

Updated Oct 10, 2023 • 128 • 33

trl-lib/llama-7b-se-peft

Updated Apr 6, 2023 • 4

sileod/deberta-v3-large-tasksource-nli

Zero-Shot Classification • 0.4B • Updated Feb 17, 2024 • 1.15k • 40

sileod/deberta-v3-large-tasksource-rlhf-reward-model

Text Classification • Updated Mar 28, 2023 • 210 • 11

trl-lib/llama-7b-se-rl-peft

Updated Apr 14, 2023 • 103

trl-lib/llama-7b-se-rm-peft

Updated Apr 6, 2023 • 8

toloka/gpt2-large-rl-prompt-writing

Text Generation • 0.8B • Updated Apr 21, 2023 • 6 • 3

AdamG012/chat-opt-1.3b-rlhf-actor-deepspeed

Text Generation • Updated Apr 25, 2023 • 6 • 5

AdamG012/chat-opt-1.3b-rlhf-critic-deepspeed

Text Generation • Updated Apr 25, 2023 • 12 • 3

AdamG012/chat-opt-1.3b-rlhf-actor-ema-deepspeed

Text Generation • Updated Apr 25, 2023 • 7 • 8

sileod/mdeberta-v3-base-tasksource-nli

Zero-Shot Classification • 0.3B • Updated Oct 19, 2023 • 62 • 18

agi-css/socially-good-lm

Text Generation • Updated May 29, 2023 • 8 • 5

agi-css/hh-rlhf-sft

Text Generation • Updated Jun 1, 2023 • 6 • 3

agi-css/better-base

Text Generation • Updated Jun 1, 2023 • 8 • 6

argilla/roberta-base-reward-model-falcon-dolly

Text Classification • 82.1M • Updated Jun 16, 2023 • 14 • 4

merve/peft-copy-test

Text Generation • Updated Jun 14, 2023 • 4

PKU-Alignment/beaver-7b-v1.0

Reinforcement Learning • 7B • Updated May 9, 2024 • 160 • 13

lyogavin/Anima33B-DPO-Belle-1k

Text Generation • Updated Jul 2, 2023 • 1