arxiv:2605.30260

How LoRA Remembers? A Parametric Memory Law for LLM Finetuning

Published on May 28

Submitted by Ningyu Zhang on May 29

alibaba-inc



Abstract

AI-generated summary: This work uses LoRA as a probe to measure the quantitative limits of parametric memory in large language models, establishes a power-law relationship for memory capacity, and develops a threshold-guided optimization method that improves memory performance.

Large Language Models (LLMs) must continuously learn and update knowledge to remain effective in dynamic real-world environments. While Low-Rank Adaptation (LoRA) is widely used for such memory updates, existing studies mainly rely on qualitative downstream evaluations, leaving the quantitative capacity limits and underlying dynamics of exact parametric memory largely unexplored. To bridge this gap, we employ LoRA as a controlled memory capacity probe within the latent space to systematically quantify exact parametric memory. We introduce the Parametric Memory Law, a robust power law linking loss reduction ΔL to effective parameters and sequence length. At the token level, fine-grained analysis reveals a deterministic phase transition, demonstrating that a prediction probability of p > 0.5 constitutes a sufficient condition for verbatim recall under greedy decoding. Driven by these insights, we introduce MemFT, a threshold-guided optimization strategy that dynamically redistributes the training budget toward sub-threshold tokens. Empirical evaluations demonstrate that MemFT can enhance memory fidelity and efficiency. Code will be released at https://github.com/zjunlp/ParametricMemoryLaw.
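The p > 0.5 condition follows from normalization: next-token probabilities sum to 1, so once the target token holds more than half of the mass, no other token can exceed it and greedy decoding must select it. The condition is sufficient but not necessary, since a token with p = 0.4 can still be the argmax. The toy sketch below illustrates this, along with one possible way to focus the loss on sub-threshold tokens. It is not the MemFT implementation, which the abstract does not specify; the function names, the hard masking rule, and the example distributions are all assumptions made for illustration.

```python
import math

THRESHOLD = 0.5  # from the abstract: p > 0.5 guarantees greedy recall of that token


def greedy_token(dist):
    """Return the index of the most probable token in one next-token distribution."""
    return max(range(len(dist)), key=lambda i: dist[i])


def sub_threshold_mask(target_probs, threshold=THRESHOLD):
    """Mark tokens whose target probability has not yet exceeded the threshold."""
    return [p <= threshold for p in target_probs]


def masked_nll(target_probs, threshold=THRESHOLD):
    """Mean negative log-likelihood over sub-threshold tokens only.

    Hypothetical loss: a hard mask is an assumption, not the paper's method.
    """
    selected = [p for p in target_probs if p <= threshold]
    if not selected:
        return 0.0  # every token is already above the threshold
    return -sum(math.log(p) for p in selected) / len(selected)


# Toy data (made up): three positions over a four-token vocabulary.
dists = [
    [0.70, 0.10, 0.10, 0.10],  # target 0 has p > 0.5, so greedy must pick it
    [0.20, 0.55, 0.15, 0.10],  # target 1 has p > 0.5, so greedy must pick it
    [0.40, 0.45, 0.10, 0.05],  # target 0 has p = 0.40 and loses to token 1
]
targets = [0, 1, 0]

target_probs = [d[t] for d, t in zip(dists, targets)]
mask = sub_threshold_mask(target_probs)
recalled = [greedy_token(d) == t for d, t in zip(dists, targets)]

print("sub-threshold:", mask)
print("greedy recall:", recalled)
print("masked NLL:", masked_nll(target_probs))
```

In this toy case only the third position is sub-threshold, and it is also the only one that greedy decoding gets wrong, so a budget concentrated on that position targets exactly the token that blocks verbatim recall.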


Community

Ningyu

Paper submitter 1 day ago

We uncover a quantitative law of parametric memory in LLMs, showing that exact recall emerges through a sharp probability threshold and can be significantly improved with threshold-guided fine-tuning.