rlhf
- Secrets of RLHF in Large Language Models Part II: Reward Modeling
  Paper • 2401.06080 • Published • 28
- Scaling Laws for Reward Model Overoptimization in Direct Alignment Algorithms
  Paper • 2406.02900 • Published • 13
- AgentGym: Evolving Large Language Model-based Agents across Diverse Environments
  Paper • 2406.04151 • Published • 24
- Understanding and Diagnosing Deep Reinforcement Learning
  Paper • 2406.16979 • Published • 10
Yuquan Xie (xieyuquan)
AI & ML interests: LLM, multi-modal
arch
- TroL: Traversal of Layers for Large Language and Vision Models
  Paper • 2406.12246 • Published • 36
- A Closer Look into Mixture-of-Experts in Large Language Models
  Paper • 2406.18219 • Published • 17
- ThinK: Thinner Key Cache by Query-Driven Pruning
  Paper • 2407.21018 • Published • 32
- Meltemi: The first open Large Language Model for Greek
  Paper • 2407.20743 • Published • 68
learning
- Law of Vision Representation in MLLMs
  Paper • 2408.16357 • Published • 95
- CogVLM2: Visual Language Models for Image and Video Understanding
  Paper • 2408.16500 • Published • 57
- Learning to Move Like Professional Counter-Strike Players
  Paper • 2408.13934 • Published • 23
- Building and better understanding vision-language models: insights and future directions
  Paper • 2408.12637 • Published • 133
compression
dpo
- Bootstrapping Language Models with DPO Implicit Rewards
  Paper • 2406.09760 • Published • 41
- BPO: Supercharging Online Preference Learning by Adhering to the Proximity of Behavior LLM
  Paper • 2406.12168 • Published • 7
- WPO: Enhancing RLHF with Weighted Preference Optimization
  Paper • 2406.11827 • Published • 17
- Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs
  Paper • 2406.18629 • Published • 42
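All four entries above are variants of Direct Preference Optimization. As a point of reference not restated in any of the papers listed, the standard DPO loss over a dataset of prompts with preferred/dispreferred completion pairs (x, y_w, y_l) is:

```latex
% Standard DPO objective: \pi_\theta is the trained policy,
% \pi_{\mathrm{ref}} the frozen reference model, \beta a scaling hyperparameter.
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta; \pi_{\mathrm{ref}})
  = -\,\mathbb{E}_{(x, y_w, y_l) \sim \mathcal{D}}
    \left[ \log \sigma\!\left(
      \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
      - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
    \right) \right]
```

The β-scaled log-ratio inside the sigmoid acts as an implicit reward for each completion, which is the quantity that the "DPO Implicit Rewards" paper in this collection builds on.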