gist-sparse-attention's Collections

GSA

Models and datasets for the paper [Forget, Then Recall: Learnable Compression and Selective Unfolding via Gist Sparse Attention]