Rom

wrom · wr0om

AI & ML interests

LLM Security

Recent Activity

upvoted a paper about 1 hour ago

Alignment Makes Language Models Normative, Not Descriptive

upvoted a paper 29 days ago

Extracting Recurring Vulnerabilities from Black-Box LLM-Generated Software

authored a paper about 1 month ago

Step-Wise Refusal Dynamics in Autoregressive and Diffusion Language Models

Organizations

Papers 1

arxiv:2602.02600

Spaces 2

silenced_biases

Sbb

Models 0

None public yet

Datasets 3

wrom/silenced_biases

Updated Jan 8 • 5 • 1

wrom/HebrewBible_HapaxLegomenon

Viewer • Updated Sep 4, 2025 • 249 • 7 • 1

wrom/Language-Vision-Hallucinations

Viewer • Updated Nov 1, 2024 • 350 • 11 • 2