Papers - Text - Research
An Interdisciplinary Comparison of Sequence Modeling Methods for Next-Element Prediction
Paper • 1811.00062 • Published • 2

mT5: A massively multilingual pre-trained text-to-text transformer
Paper • 2010.11934 • Published • 4

Bootstrap Your Own Skills: Learning to Solve New Tasks with Large Language Model Guidance
Paper • 2310.10021 • Published • 2

Gemma: Open Models Based on Gemini Research and Technology
Paper • 2403.08295 • Published • 50

Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer
Paper • 2305.16380 • Published • 5

Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning
Paper • 2310.20587 • Published • 18

Structural Similarities Between Language Models and Neural Response Measurements
Paper • 2306.01930 • Published • 2

Contrastive Decoding Improves Reasoning in Large Language Models
Paper • 2309.09117 • Published • 40

A Thorough Examination of Decoding Methods in the Era of LLMs
Paper • 2402.06925 • Published • 1

In-context Vectors: Making In Context Learning More Effective and Controllable Through Latent Space Steering
Paper • 2311.06668 • Published • 5