Attention Is All You Need
Paper • 1706.03762 • Published • 105
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Paper • 1810.04805 • Published • 24
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
Paper • 1910.01108 • Published • 21
Language Models are Few-Shot Learners
Paper • 2005.14165 • Published • 18
Ooi Boon Pin
ooibp
AI & ML interests
None yet
Recent Activity
liked a Space about 1 month ago: nanotron/ultrascale-playbook
liked a Space about 1 month ago: HuggingFaceFW/blogpost-fineweb-v1
liked a Space about 1 month ago: HuggingFaceTB/smol-training-playbook
Organizations
None yet