- agentlans/prompt-safety-classification
  Viewer • Updated • 72.1k • 34
- Jammies-io/safety-refusal
  Viewer • Updated • 100 • 4
- RefusalBench: Generative Evaluation of Selective Refusal in Grounded Language Models
  Paper • 2510.10390 • Published • 4
- nvidia/Aegis-AI-Content-Safety-Dataset-2.0
  Viewer • Updated • 33.4k • 3.08k • 72
Daniel Bis
danielbis
AI & ML interests
https://scholar.google.com/citations?user=ArMgXHYAAAAJ&hl=en
Recent Activity
updated a collection 7 days ago: safety
updated a collection 9 days ago: safety
updated a collection 9 days ago: safety
Organizations
None yet