Models: 70,148

Active filters: reinforcement-learning

ChengyuDu0123/HER-32B

Text Generation • 33B • Updated 5 days ago • 78 • 14

nvidia/NitroGen

Reinforcement Learning • Updated 4 days ago • 495

Adilbai/stock-trading-rl-agent

Reinforcement Learning • Updated Jan 8 • 101 • 102

hkust-nlp/drkernel-14b

Text Generation • 15B • Updated 3 days ago • 18 • 4

zai-org/GLM-TTS

Text-to-Speech • Updated 28 days ago • 448 • 313

gagansuie/oxidize-models

Other • Updated about 15 hours ago • 505 • 3

dayll/SEAD-14B

Text Generation • 15B • Updated about 8 hours ago • 19 • 3

biang889/ProAct

Question Answering • Updated 4 days ago • 3

ValueFX9507/Tifa-DeepsexV2-7b-MGRPO-GGUF-Q8

Reinforcement Learning • 8B • Updated Mar 28, 2025 • 4.45k • 192

JonusNattapong/AI-XAUUSD-Trading

Reinforcement Learning • Updated Oct 10, 2025 • 16

dariakryvosheieva/video-prompt-enhancer

Reinforcement Learning • Updated Dec 10, 2025 • 10 • 2

ParamTatva/sanskrit-ppo-hopper-v5

Reinforcement Learning • Updated 1 day ago • 2

sb3/tqc-Walker2DBulletEnv-v0

Reinforcement Learning • Updated Oct 11, 2022 • 2 • 1

danieladejumo/ppo-MountainCarContinuous-v0

Reinforcement Learning • Updated Jun 30, 2022 • 3 • 1

qgallouedec/ppo-LiftCube-v0

Robotics • Updated Jun 10, 2024 • 16 • 1

rstar2-reproduce/rStar2-Agent-14B

Text Generation • 15B • Updated Sep 1, 2025 • 18 • 26

shiviktech/Trident

Text Generation • 4B • Updated Jan 7 • 6 • 1

Intel/deepmath-v1

Text Generation • 4B • Updated Dec 8, 2025 • 37 • 11

PrimeIntellect/INTELLECT-3

Text Generation • 107B • Updated Nov 27, 2025 • 1.65k • 206

PrimeIntellect/INTELLECT-3-FP8

Text Generation • 107B • Updated Nov 27, 2025 • 187 • 22

AIPlans/Qwen3-0.6B-ReMax

Reinforcement Learning • 0.6B • Updated Dec 22, 2025 • 8 • 2

HumanPlane/LACUNA

Reinforcement Learning • Updated Jan 1 • 9 • 6

SEGAgentRL/LLDS-A-GSPO-Qwen2.5-3B-Ins

Reinforcement Learning • 3B • Updated 25 days ago • 35 • 1

Emperorizzis/ASTRA-14B-Thinking-v1

Text Generation • 15B • Updated 8 days ago • 33 • 8

Guyohm/ppo-LunarLander-v3

Reinforcement Learning • Updated 14 days ago • 23 • 1

AdityaaXD/Multi-Agent_Reinforcement_Learning_Trading_System_Models

Reinforcement Learning • Updated 9 days ago • 156 • 1

Jarrodbarnes/KernelBench-RLVR-120b

Text Generation • 117B • Updated 7 days ago • 18 • 1

hkust-nlp/drkernel-8b

Text Generation • 8B • Updated 3 days ago • 16 • 1

mradermacher/drkernel-14b-GGUF

Reinforcement Learning • 15B • Updated 2 days ago • 626 • 1

ValueFX9507/Tifa-Deepsex-14b-CoT-GGUF-Q4

Reinforcement Learning • 15B • Updated Feb 13, 2025 • 1.83k • 821