megconformer-speech-detection / README.md

zuazo


license: bsd-3-clause
tags:
  - meg
  - brain-signals
  - speech-detection
  - conformer
  - libribrain
datasets:
  - pnpl/LibriBrain
metrics:
  - f1
library_name: pytorch
model-index:
  - name: megconformer-speech-detection
    results:
      - task:
          type: audio-classification
          name: Speech classification
        dataset:
          name: LibriBrain 2025 PNPL (Standard track, speech task)
          type: pnpl/LibriBrain
          split: holdout
        metrics:
          - name: F1-macro
            type: f1
            value: 0.889
            args:
              average: macro

MEGConformer for Speech Detection

Conformer-based MEG decoder for binary speech detection (silence vs. speech), trained with 10 different random seeds to support reproducibility.

Model Performance

| Seed | Val F1-Macro | Checkpoint |
|------|--------------|------------|
| 0 (best) | 87.06% | `seed-0/pytorch_model.ckpt` |
| 6 | 86.80% | `seed-6/pytorch_model.ckpt` |
| 4 | 86.62% | `seed-4/pytorch_model.ckpt` |
| 1 | 86.54% | `seed-1/pytorch_model.ckpt` |
| 2 | 86.37% | `seed-2/pytorch_model.ckpt` |
| 5 | 86.29% | `seed-5/pytorch_model.ckpt` |
| 7 | 86.18% | `seed-7/pytorch_model.ckpt` |
| 3 | 86.13% | `seed-3/pytorch_model.ckpt` |
| 8 | 85.92% | `seed-8/pytorch_model.ckpt` |
| 9 | 85.18% | `seed-9/pytorch_model.ckpt` |

Holdout F1-macro of seed 0 (LibriBrain 2025 Standard track, speech task): 88.90%
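To get a feel for how much the validation score varies between seeds, the table values can be summarized with the standard library. This is only a sketch over the numbers listed above; the variable names are illustrative and the summary statistics are not figures reported by the authors.

```python
from statistics import mean, stdev

# Validation F1-macro (%) per seed, copied from the table above.
val_f1 = {
    0: 87.06, 1: 86.54, 2: 86.37, 3: 86.13, 4: 86.62,
    5: 86.29, 6: 86.80, 7: 86.18, 8: 85.92, 9: 85.18,
}

best_seed = max(val_f1, key=val_f1.get)          # seed with the highest score
mean_f1 = mean(val_f1.values())                  # average over the 10 seeds
std_f1 = stdev(val_f1.values())                  # sample standard deviation (n - 1)
spread = max(val_f1.values()) - min(val_f1.values())  # best minus worst seed

print(f"best seed: {best_seed}")
print(f"mean: {mean_f1:.2f}%  std: {std_f1:.2f}  range: {spread:.2f} points")
```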

Quick Start

Load Best Model

import torch
from huggingface_hub import hf_hub_download

from libribrain_experiments.models.configurable_modules.classification_module import (
    ClassificationModule,
)

# Download a checkpoint (seed-0)
checkpoint_path = hf_hub_download(
    repo_id="zuazo/megconformer-speech-detection", filename="seed-0/pytorch_model.ckpt"
)

# Choose device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load model and move to device
model = ClassificationModule.load_from_checkpoint(checkpoint_path, map_location=device)
model.eval()

# Inference
meg_signal = torch.randn(1, 306, 625, device=device)  # (batch, channels, samples): one 2.5 s window at 250 Hz

with torch.no_grad():
    logits = model(meg_signal)
    prediction = torch.argmax(logits, dim=1)  # 0=silence, 1=speech

print(f"Prediction: {'Speech' if prediction.item() == 1 else 'Silence'}")
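The example above classifies a single window. For a longer recording, one possible approach is to slice it into fixed-length windows and classify each one. The sketch below only computes the window start indices; `window_starts` is a hypothetical helper, the 625-sample window follows the Model Details section, and the 125-sample hop is an arbitrary assumption, not a setting documented for this model.

```python
def window_starts(n_samples: int, window: int = 625, hop: int = 125) -> list[int]:
    """Return the start index of every full window that fits in n_samples."""
    if window <= 0 or hop <= 0:
        raise ValueError("window and hop must be positive")
    if n_samples < window:
        return []  # recording is shorter than one window
    return list(range(0, n_samples - window + 1, hop))


# Example: a 10-second recording at 250 Hz has 2500 samples.
starts = window_starts(2500)
print(len(starts), starts[:3], starts[-1])
```

Each slice `recording[:, s:s + 625]` could then be stacked into a batch of shape `(n_windows, 306, 625)` and passed to the model as in the quick start. Whether this matches the authors' evaluation protocol is not stated here.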

Model Details

Architecture: Conformer Small
- Hidden size: 144
- FFN dim: 576
- Layers: 16
- Attention heads: 4
- Depthwise conv kernel: 31
Input: 306-channel MEG signals
Window size: 2.5 seconds (625 samples at 250 Hz)
Output: Binary classification (silence/speech)
Training: LibriBrain 2025 Standard track
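The hyperparameters above can be collected in a small record to check that they are mutually consistent, for example that the hidden size divides evenly across attention heads and that the window length matches the sample count. This is a sketch only: `ConformerSmallSpec` and its field names are hypothetical and are not part of the `libribrain_experiments` API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConformerSmallSpec:
    # Values copied from the Model Details list; names are illustrative.
    hidden_size: int = 144
    ffn_dim: int = 576
    num_layers: int = 16
    num_heads: int = 4
    depthwise_conv_kernel: int = 31
    num_channels: int = 306
    window_seconds: float = 2.5
    sample_rate_hz: int = 250

    @property
    def head_dim(self) -> int:
        # Per-head dimension of the self-attention layers.
        return self.hidden_size // self.num_heads

    @property
    def window_samples(self) -> int:
        # Number of time samples in one input window.
        return round(self.window_seconds * self.sample_rate_hz)

    @property
    def input_shape(self) -> tuple[int, int]:
        # (channels, samples) for a single unbatched window.
        return (self.num_channels, self.window_samples)


spec = ConformerSmallSpec()
print(spec.head_dim, spec.ffn_dim // spec.hidden_size, spec.input_shape)
```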

Reproducibility

Checkpoints for all 10 random seeds (`seed-0` through `seed-9`) are provided so that the reported results can be reproduced.
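Since every seed follows the same `seed-N/pytorch_model.ckpt` layout shown in the performance table, the checkpoint paths can be generated rather than typed by hand. The sketch below assumes that layout holds for all ten seeds; `checkpoint_filename` and `download_checkpoints` are hypothetical helpers, and only `hf_hub_download` comes from `huggingface_hub`.

```python
REPO_ID = "zuazo/megconformer-speech-detection"
SEEDS = range(10)  # seeds 0 through 9


def checkpoint_filename(seed: int) -> str:
    """Path of one seed's checkpoint inside the repository."""
    return f"seed-{seed}/pytorch_model.ckpt"


def download_checkpoints(seeds=SEEDS) -> dict[int, str]:
    """Download the requested checkpoints and return their local paths."""
    # Imported lazily so the path helpers work without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return {
        seed: hf_hub_download(repo_id=REPO_ID, filename=checkpoint_filename(seed))
        for seed in seeds
    }


filenames = [checkpoint_filename(seed) for seed in SEEDS]
print(filenames[0], filenames[-1])
```

Calling `download_checkpoints()` fetches all ten files; each returned path can be passed to `ClassificationModule.load_from_checkpoint` as in the quick start.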

Citation

@misc{dezuazo2025megconformerconformerbasedmegdecoder,
      title={MEGConformer: Conformer-Based MEG Decoder for Robust Speech and Phoneme Classification}, 
      author={Xabier de Zuazo and Ibon Saratxaga and Eva Navas},
      year={2025},
      eprint={2512.01443},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2512.01443}, 
}

License

The 3-Clause BSD License
