Model Card for `ConfRover-base-20M-v1.0`

ConfRover base model trained with forward simulation and iid sampling

Model Details

Model Description

ConfRover is a deep generative model for protein 3D conformations and motion dynamics. It uses a diffusion probabilistic model to learn the distribution of protein 3D conformations and captures the temporal dependencies between frames through temporal causal transformers. Models are trained on molecular dynamics (MD) trajectory data and can generate protein conformation ensembles and motion trajectories conditioned on the input amino acid sequence.

This version was trained on the forward simulation and iid sampling tasks at a 1:1 ratio.

Basic info

Model ID	ConfRover-base-20M-v1.0
Variant	base
Size	20M
Version	v1.0
Recommended for	Forward simulation and iid sampling tasks
License	Apache-2.0

Model Sources

Repository: https://github.com/ByteDance-Seed/ConfRover
Paper: https://arxiv.org/abs/2505.17478
Website: https://ByteDance-Seed.github.io/ConfRover

How to Get Started with the Model

Use the code below to get started with the model.

from confrover import ConfRover

model = ConfRover.from_pretrained(<model_name>)

model.to("cuda")

model.generate(
    case_id=<case_name>,
    seqres=<amino_acid_sequence>,
    output_dir=</path/to/save/output>,
    task_mode=<"forward"|"iid"|"interp">,
    n_replicates=<int>,  # number of replicate trajectories (forward and interp) or total number of conformation samples (iid)
    n_frames=<int>,  # number of frames per trajectory, including the conditioning frames
    stride_in_10ps=256,  # time interval between frames, in units of 10 ps (256 corresponds to 2.56 ns)
    conditions=...,  # conditioning-frame information for forward simulation and interp; see `ConfRover.generate` for details
)
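
As a quick sanity check on the `n_frames` and `stride_in_10ps` arguments above, the sketch below converts them into the physical time spanned by one trajectory. The helper `trajectory_time_ns` is hypothetical and is not part of the `confrover` package; it assumes frames are evenly spaced and that `n_frames` counts the conditioning frames, as the argument comments state.

```python
def trajectory_time_ns(n_frames: int, stride_in_10ps: int = 256) -> float:
    """Time from the first to the last frame, in nanoseconds.

    Hypothetical helper (not part of confrover); assumes evenly spaced frames.
    """
    if n_frames < 1:
        raise ValueError("n_frames must be at least 1")
    stride_ps = stride_in_10ps * 10        # stride is given in units of 10 ps
    span_ps = (n_frames - 1) * stride_ps   # n frames span n - 1 intervals
    return span_ps / 1000.0                # 1 ns = 1000 ps


if __name__ == "__main__":
    # 101 frames at the default stride of 256 (2.56 ns per step)
    print(trajectory_time_ns(n_frames=101, stride_in_10ps=256))
```

With the default stride, each step advances 2.56 ns, so the number of frames needed for a target simulation length can be estimated before calling `model.generate`.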

Technical Specifications

ConfRover consists of an encoder, a temporal module, and a diffusion decoder.

The encoder maps the input amino acid sequence (through a folding model) and coordinates of context frames to a latent representation.
The temporal module models the temporal dependencies between frames using an interleaving of causal transformers (across the temporal dimension) and pairformers (to update structures).
The diffusion decoder learns the probability distribution of protein conformations and generates samples conditioned on the input sequence and the conditioning representation.
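
To illustrate what "causal" means along the temporal dimension, the sketch below builds a lower-triangular attention mask in which frame t may attend only to frames 0 through t. This is a generic illustration in plain Python, not ConfRover's actual implementation; the function names `causal_frame_mask` and `visible_frames` are hypothetical.

```python
def causal_frame_mask(n_frames: int) -> list[list[bool]]:
    """Return a mask where mask[t][s] is True when frame t may attend to frame s.

    Generic causal mask (s <= t); an assumption for illustration only.
    """
    return [[s <= t for s in range(n_frames)] for t in range(n_frames)]


def visible_frames(mask: list[list[bool]], t: int) -> list[int]:
    """List the frame indices that frame t is allowed to attend to."""
    return [s for s, allowed in enumerate(mask[t]) if allowed]


if __name__ == "__main__":
    mask = causal_frame_mask(4)
    for t in range(4):
        print(t, visible_frames(mask, t))
```

Under such a mask, each generated frame depends only on earlier frames, which is what allows a trajectory to be rolled forward one frame at a time.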

Bias, Risks, and Limitations

ConfRover is trained on a limited set of MD trajectory data and may not generalize well to out-of-distribution proteins. The quality of generated conformations is also limited by the quality of the input data and by the available computational resources. Currently, ConfRover supports only protein conformation generation and models only the coordinates of heavy atoms.

Citation

@article{confrover2025,
  title={Simultaneous Modeling of Protein Conformation and Dynamics via Autoregression},
  author={Shen, Yuning and Wang, Lihao and Yuan, Huizhuo and Wang, Yan and Yang, Bangji and Gu, Quanquan},
  journal={arXiv preprint arXiv:2505.17478},
  year={2025}
}
