metadata
license: bsd-3-clause
tags:
  - meg
  - brain-signals
  - speech-detection
  - conformer
  - libribrain
datasets:
  - pnpl/LibriBrain
metrics:
  - f1
library_name: pytorch
model-index:
  - name: megconformer-speech-detection
    results:
      - task:
          type: audio-classification
          name: Speech classification
        dataset:
          name: LibriBrain 2025 PNPL (Standard track, speech task)
          type: pnpl/LibriBrain
          split: holdout
        metrics:
          - name: F1-macro
            type: f1
            value: 0.889
            args:
              average: macro

MEGConformer for Speech Detection

Conformer-based MEG decoder for binary speech detection (speech vs. silence). Checkpoints from 10 training runs with different random seeds are provided for reproducibility.

Model Performance

| Seed | Val F1-Macro | Checkpoint |
|------|--------------|------------|
| 0 (best) | 87.06% | seed-0/pytorch_model.ckpt |
| 6 | 86.80% | seed-6/pytorch_model.ckpt |
| 4 | 86.62% | seed-4/pytorch_model.ckpt |
| 1 | 86.54% | seed-1/pytorch_model.ckpt |
| 2 | 86.37% | seed-2/pytorch_model.ckpt |
| 5 | 86.29% | seed-5/pytorch_model.ckpt |
| 7 | 86.18% | seed-7/pytorch_model.ckpt |
| 3 | 86.13% | seed-3/pytorch_model.ckpt |
| 8 | 85.92% | seed-8/pytorch_model.ckpt |
| 9 | 85.18% | seed-9/pytorch_model.ckpt |

  • Holdout F1-macro of seed 0: 88.90%
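
For readers unfamiliar with the metric, the following is a minimal sketch of how a macro-averaged F1 score is computed for the two classes, assuming the scores above follow the standard definition (the unweighted mean of the per-class F1 scores). The function name `macro_f1` and the toy labels are illustrative only and are not taken from the training code.

```python
def macro_f1(y_true, y_pred, labels=(0, 1)):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # A class with no true or predicted samples contributes 0.0
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels, made up for illustration: 0 = silence, 1 = speech
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
print(f"F1-macro: {macro_f1(y_true, y_pred):.4f}")
```

Because both classes are weighted equally, a model that always predicts the majority class scores poorly on this metric even when its accuracy is high.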

Quick Start

Load Best Model

import torch
from huggingface_hub import hf_hub_download

from libribrain_experiments.models.configurable_modules.classification_module import (
    ClassificationModule,
)

# Download a checkpoint (seed-0)
checkpoint_path = hf_hub_download(
    repo_id="zuazo/megconformer-speech-detection", filename="seed-0/pytorch_model.ckpt"
)

# Choose device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load model and move to device
model = ClassificationModule.load_from_checkpoint(checkpoint_path, map_location=device)
model.eval()

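# Inference on one dummy window: (batch, channels, samples) = (1, 306, 625),
# i.e. 2.5 s at 250 Hz, matching the window size listed under Model Details
meg_signal = torch.randn(1, 306, 625, device=device)  # Create directly on device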

with torch.no_grad():
    logits = model(meg_signal)
    prediction = torch.argmax(logits, dim=1)  # 0=silence, 1=speech

print(f"Prediction: {'Speech' if prediction.item() == 1 else 'Silence'}")

Model Details

  • Architecture: Conformer Small
    • Hidden size: 144
    • FFN dim: 576
    • Layers: 16
    • Attention heads: 4
    • Depthwise conv kernel: 31
  • Input: 306-channel MEG signals
  • Window size: 2.5 seconds (625 samples at 250 Hz)
  • Output: Binary classification (silence/speech)
  • Training: LibriBrain 2025 Standard track
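
The window size above follows from 2.5 s × 250 Hz = 625 samples. As a rough sketch of how a continuous recording could be cut into windows of that length, the helper below computes window start indices. The function `window_starts` and its `hop` parameter are hypothetical; this card does not state what stride was used during training or evaluation.

```python
SAMPLING_RATE_HZ = 250   # from Model Details
WINDOW_SECONDS = 2.5     # from Model Details
WINDOW_SAMPLES = int(WINDOW_SECONDS * SAMPLING_RATE_HZ)  # 625


def window_starts(n_samples, window=WINDOW_SAMPLES, hop=WINDOW_SAMPLES):
    """Start indices of every full window that fits in a recording.

    `hop` is an assumption: the default yields non-overlapping windows.
    """
    if n_samples < window:
        return []
    return list(range(0, n_samples - window + 1, hop))


# One minute of 250 Hz data, cut into non-overlapping 2.5 s windows
starts = window_starts(60 * SAMPLING_RATE_HZ)
print(WINDOW_SAMPLES, len(starts), starts[:3])
```

Each start index `s` selects the slice `[s, s + 625)` along the time axis, giving one `(306, 625)` input per window.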

Reproducibility

Checkpoints for all 10 random seeds are provided so that the reported results can be reproduced and the seed-to-seed variation can be inspected.
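
As a small sketch of how the per-seed results might be summarized, the snippet below copies the validation scores from the Model Performance table, picks the best seed, and builds the matching checkpoint path. The helper `checkpoint_filename` is illustrative and not part of any library; it only reproduces the `seed-N/pytorch_model.ckpt` naming shown in the table.

```python
from statistics import mean, stdev

# Validation F1-macro (%) per seed, copied from the Model Performance table
VAL_F1 = {0: 87.06, 1: 86.54, 2: 86.37, 3: 86.13, 4: 86.62,
          5: 86.29, 6: 86.80, 7: 86.18, 8: 85.92, 9: 85.18}


def checkpoint_filename(seed):
    """Path of a seed's checkpoint inside the repository."""
    if seed not in VAL_F1:
        raise ValueError(f"unknown seed: {seed}")
    return f"seed-{seed}/pytorch_model.ckpt"


best_seed = max(VAL_F1, key=VAL_F1.get)
print(f"best seed: {best_seed} -> {checkpoint_filename(best_seed)}")
print(f"mean val F1-macro: {mean(VAL_F1.values()):.2f}%")
print(f"sample std dev:    {stdev(VAL_F1.values()):.2f} points")
```

The returned path can be passed as the `filename` argument of `hf_hub_download` in the Quick Start to load a seed other than 0.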

Citation

@misc{dezuazo2025megconformerconformerbasedmegdecoder,
      title={MEGConformer: Conformer-Based MEG Decoder for Robust Speech and Phoneme Classification}, 
      author={Xabier de Zuazo and Ibon Saratxaga and Eva Navas},
      year={2025},
      eprint={2512.01443},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2512.01443}, 
}

License

Released under the 3-Clause BSD License.

Links