Yikang Shen

YikangS
  • yikangshen

AI & ML interests

None yet

Organizations

IBM, JetMoE, IBM Research, Learning Machine

upvoted a collection almost 2 years ago

Power-LM

Collection
Dense & MoE LLMs trained with power learning rate scheduler. • 3 items • Updated Mar 2 • 16
upvoted 3 papers almost 2 years ago

Power Scheduler: A Batch Size and Token Number Agnostic Learning Rate Scheduler

Paper • 2408.13359 • Published Aug 23, 2024 • 23

Octo-planner: On-device Language Model for Planner-Action Agents

Paper • 2406.18082 • Published Jun 26, 2024 • 48

Efficient Continual Pre-training by Mitigating the Stability Gap

Paper • 2406.14833 • Published Jun 21, 2024 • 20
upvoted 2 papers about 2 years ago

Aligning Large Multimodal Models with Factually Augmented RLHF

Paper • 2309.14525 • Published Sep 25, 2023 • 32

The Consensus Game: Language Model Generation via Equilibrium Search

Paper • 2310.09139 • Published Oct 13, 2023 • 14