minisv

Improving long-read somatic structural variant calling with pangenome and de novo personal genome assembly.

This project develops a long-read somatic structural variant (SV) calling framework that leverages pangenome graphs and de novo personal genome assembly to improve sensitivity and precision over reference-only pipelines (Qin et al., 2025).

Tumor genomes harbor SVs in repetitive, hypervariable, and low-mappability regions where reference-based callers systematically lose recall. By assembling each individual’s genome and comparing tumor and matched-normal reads against pangenome-aware coordinates, the method recovers complex somatic events including large indels, mobile element insertions, and tandem duplications.

Source code is available on GitHub. Related publications include colorSV for long-range somatic SV calling from co-assembly graphs (Le et al., 2025) and LongcallD for joint calling and phasing of small, structural, and mosaic variants from long reads (Gao et al., 2026).
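The tumor versus matched-normal comparison described above can be illustrated with a toy filter. This is a minimal sketch of the general idea only, not the minisv algorithm: the `SV` record, the `matches` and `somatic_candidates` helpers, and the `max_dist` and `min_len_ratio` thresholds are all hypothetical names and values chosen for illustration. It keeps a tumor call as a somatic candidate only when no call of the same type, at a nearby position and with a similar length, is present in the matched normal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SV:
    """Hypothetical minimal SV record (not a minisv data structure)."""
    chrom: str
    pos: int
    svtype: str  # e.g. "INS", "DEL", "DUP"
    svlen: int   # signed length; negative for deletions


def matches(a: SV, b: SV, max_dist: int = 500, min_len_ratio: float = 0.7) -> bool:
    """Return True if two calls plausibly describe the same event.

    The thresholds are illustrative assumptions, not tuned values.
    """
    if a.chrom != b.chrom or a.svtype != b.svtype:
        return False
    if abs(a.pos - b.pos) > max_dist:
        return False
    len_a, len_b = abs(a.svlen), abs(b.svlen)
    if max(len_a, len_b) == 0:
        return True
    return min(len_a, len_b) / max(len_a, len_b) >= min_len_ratio


def somatic_candidates(tumor: list[SV], normal: list[SV], **kwargs) -> list[SV]:
    """Keep tumor calls that have no matching call in the matched normal."""
    return [t for t in tumor if not any(matches(t, n, **kwargs) for n in normal)]


# Toy example: the deletion is shared with the normal sample (germline-like),
# while the insertion is seen only in the tumor.
tumor_calls = [SV("chr1", 1000, "DEL", -300), SV("chr2", 5000, "INS", 6000)]
normal_calls = [SV("chr1", 1020, "DEL", -310)]
candidates = somatic_candidates(tumor_calls, normal_calls)
```

In this toy input, only the `chr2` insertion survives the filter. A quadratic scan like this is adequate for a handful of calls; a real pipeline would index normal calls by position, and the framework described here additionally relies on pangenome graphs and personal assemblies, which this sketch does not model.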

References

2026

  1. bioRxiv
    LongcallD: joint calling and phasing of small, structural and mosaic variants from long reads
    Yan Gao, Wen-Wei Liao, Qian Qin, and 2 more authors
    bioRxiv, Mar 2026

2025

  1. bioRxiv
    Improving long-read somatic structural variant calling with pangenome and de novo personal genome assembly
    Qian Qin, Jakob Heinz, and Heng Li
    bioRxiv, Oct 2025
  2. GPB
    colorSV: Long-range Somatic Structural Variation Calling from Matched Tumor-normal Co-assembly Graphs
    M K Le, Qian Qin, and Heng Li
    Genomics, Proteomics & Bioinformatics, Sep 2025