SurgFormer: Scalable Learning of Organ Deformation with Resection Support and Real-Time Inference

Ashkan Shahbazi 1,4,★ · Elaheh Akbari 1,4,★ · Kyvia Pereira 2,4 · Jon S. Heiselman 2,4 · Annie Benson 2,4 · Garrison Lawrence Horswill Johnston 3,4 · Jie Ying Wu 1,4 · Nabil Simaan 3,4 · Michael Miga 2,4 · Soheil Kolouri 1,4
1 Department of Computer Science, College of Connected Computing, Vanderbilt University    2 Department of Biomedical Engineering, Vanderbilt University    3 Department of Mechanical Engineering, Vanderbilt University    4 Vanderbilt Institute for Surgery and Engineering (VISE)
★ Equal contribution
Paper (PDF) · arXiv · Code (coming soon) · Data (coming soon)
SurgFormer teaser
SurgFormer enables real-time, anatomically plausible soft-tissue simulation for minimally invasive surgery, jointly modeling tool-driven deformation and topology-changing resection. We demonstrate cholecystectomy (top) and appendectomy (bottom); each row shows the input, predicted deformation under tool interaction, and resulting resection state, highlighting geometric fidelity and visual realism.
Abstract

Overview

We introduce SurgFormer, a multiresolution gated transformer for data-driven soft tissue simulation on volumetric meshes. High-fidelity biomechanical solvers are often too costly for interactive use, so we train SurgFormer on solver-generated data to predict nodewise displacement fields at near-real-time rates. SurgFormer builds a fixed mesh hierarchy and applies repeated multibranch blocks that combine local message passing, coarse global self-attention, and pointwise feedforward updates, fused by learned per-node, per-channel gates to adaptively integrate local and long-range information while remaining scalable on large meshes. For cut-conditioned simulation, resection information is encoded as a learned cut embedding and provided as an additional input, enabling a unified model for both standard deformation prediction and topology-altering cases. We also introduce two surgical simulation datasets generated under a unified protocol with XFEM-based supervision: a cholecystectomy resection dataset and an appendectomy manipulation and resection dataset with cut and uncut cases. To our knowledge, this is the first learned volumetric surrogate setting to study XFEM-supervised cut-conditioned deformation within the same volumetric pipeline as standard deformation prediction.
Soft Tissue Deformation · Geometric Deep Learning · Surgical Simulation · XFEM · Real-Time Inference · Transformer
Contributions

What We Introduce

01

SurgFormer Architecture

A multiresolution gated transformer combining local message passing, coarse-level global attention, and pointwise updates in a scalable unified block.

02

Cut-Conditioned Deformation

To our knowledge, the first unified volumetric surrogate for both standard and topology-changing deformation prediction under the same XFEM-supervised pipeline.

03

Two New Datasets

Procedure-level surgical simulation datasets for cholecystectomy resection and appendectomy manipulation/resection under a unified protocol.

04

Comprehensive Benchmarks

Unified accuracy/efficiency benchmarks, architectural ablations, cross-task transfer studies, and adversarial smoothness stress tests.

Method

SurgFormer Architecture

SurgFormer processes volumetric organ meshes through a fixed multi-level hierarchy. At each level, a multibranch block adaptively combines three complementary information streams via learned per-node, per-channel gates — enabling global context without sacrificing scalability at fine resolutions.

Input Features
Node coords · Tool signal · BC indicator · Cut embedding (optional)
Adapter
Maps raw features to model width D
Encoder
SurgFormer blocks + max-pool downsampling (L levels)
Decoder
Broadcast upsampling + skip connections
Output Head
Linear → nodewise 3D displacement
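The encoder and decoder steps listed above can be illustrated with a minimal NumPy sketch of max-pool downsampling and broadcast upsampling across one level of a fixed hierarchy. The names `max_pool`, `broadcast_unpool`, and `cluster` are hypothetical, the fine-to-coarse assignment is assumed to be precomputed, and the additive skip connection is an assumption (the page does not say how skips are merged).

```python
import numpy as np

def max_pool(x, cluster, num_coarse):
    """Max-pool fine features x (N, D) to coarse features (M, D).

    cluster[i] is the coarse node that fine node i is assigned to
    (a hypothetical precomputed, fixed assignment). Every coarse node
    is assumed to own at least one fine node.
    """
    pooled = np.full((num_coarse, x.shape[1]), -np.inf)
    np.maximum.at(pooled, cluster, x)  # unbuffered scatter-max by row
    return pooled

def broadcast_unpool(coarse, cluster, skip):
    """Copy each coarse feature to its fine nodes, then add the skip features."""
    return coarse[cluster] + skip  # additive skip is an assumption

# Toy example: 4 fine nodes, 2 channels, 2 coarse nodes.
x = np.array([[1.0, 5.0], [3.0, 2.0], [0.0, 7.0], [4.0, 4.0]])
cluster = np.array([0, 0, 1, 1])

coarse = max_pool(x, cluster, num_coarse=2)
fine = broadcast_unpool(coarse, cluster, skip=x)
```

Because the hierarchy is fixed, the `cluster` array would only need to be built once per mesh and could be reused for every sample.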
Multibranch Block — three fused branches per level:
Local branch — message passing on fine-scale graph edges for local geometry
Global branch — self-attention restricted to coarse levels for long-range context
Feedforward branch — pointwise MLP for per-node feature refinement
Branches are fused by learned per-node, per-channel softmax gates Γ, enabling adaptive integration without fixed weighting.
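As a rough illustration of the gated fusion described above, the sketch below applies a softmax over three gate logits for every node and channel and uses the resulting weights to mix the branch outputs. How the gate logits are produced is not specified on this page, so they are treated as a given input here; `gated_fusion` and all shapes are illustrative assumptions, not the released implementation.

```python
import numpy as np

def gated_fusion(local, global_, ffn, gate_logits):
    """Fuse three branch outputs of shape (N, D) with gate logits of shape (N, D, 3).

    Returns the fused features (N, D) and the softmax gates (N, D, 3).
    """
    # Numerically stable softmax over the branch axis.
    z = gate_logits - gate_logits.max(axis=-1, keepdims=True)
    gates = np.exp(z)
    gates /= gates.sum(axis=-1, keepdims=True)
    branches = np.stack([local, global_, ffn], axis=-1)  # (N, D, 3)
    return (gates * branches).sum(axis=-1), gates

rng = np.random.default_rng(0)
N, D = 5, 4  # illustrative node count and model width
local, global_, ffn = (rng.normal(size=(N, D)) for _ in range(3))
gate_logits = rng.normal(size=(N, D, 3))  # stand-in for learned gate logits

fused, gates = gated_fusion(local, global_, ffn, gate_logits)
```

With all logits equal, the gates reduce to a uniform average of the three branches; learned logits let each node and channel shift weight toward whichever branch is most informative.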
SurgFormer architecture figure

For cut-conditioned deformation, a learned embedding encodes the binary per-node resection indicator and is concatenated to the base node features before the input Adapter — no architectural changes required, and the mesh graph remains fixed throughout.
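A minimal sketch of this input path, assuming a two-row embedding table indexed by the binary indicator and a single linear Adapter. The feature width, embedding width, and random weights below are placeholders for illustration only; none of them come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
# Illustrative sizes: nodes, base features, embedding width, model width.
N, F, E, D = 6, 5, 3, 8

base = rng.normal(size=(N, F))        # e.g. coords, tool signal, BC indicator
cut = np.array([0, 0, 1, 1, 0, 1])    # binary per-node resection indicator
cut_table = rng.normal(size=(2, E))   # stand-in for the learned cut embedding
W = rng.normal(size=(F + E, D))       # stand-in for the Adapter weights
b = np.zeros(D)

def adapter_input(base, cut, cut_table):
    """Look up each node's cut embedding and append it to the base features."""
    return np.concatenate([base, cut_table[cut]], axis=1)  # (N, F + E)

features = adapter_input(base, cut, cut_table)
h = features @ W + b  # (N, D) features passed on to the encoder
```

Only the input width changes; the mesh connectivity and everything after the Adapter are untouched, which matches the statement that no architectural changes are needed.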

Results

Quantitative Performance

ℹ️ All results are averaged over 3 random seeds. Metrics: normalized RMSE ↓, normalized Max Error ↓, Deformation Capture Metric (DCM) ↑, inference time per sample ↓.
DCM on deformation task: 97.2%
Inference per sample: 0.6 ms
Parameters: 6.5M
DCM on cut-conditioned task: 83.61%
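The exact metric definitions are not given on this page. As a hedged sketch, the normalized RMSE and normalized max error could be computed as below, assuming the normalizer is the largest ground-truth displacement magnitude in the sample. That normalizer and the function name `normalized_errors` are assumptions, and DCM is omitted because its formula is not stated here.

```python
import numpy as np

def normalized_errors(pred, true):
    """Return (normalized RMSE, normalized max error) for (N, 3) displacements.

    Assumption: errors are divided by the largest ground-truth
    displacement magnitude; the paper may normalize differently.
    """
    err = np.linalg.norm(pred - true, axis=1)    # per-node Euclidean error
    scale = np.linalg.norm(true, axis=1).max()   # assumed normalizer
    rmse = np.sqrt(np.mean(err ** 2))
    return rmse / scale, err.max() / scale

# Toy example: two nodes, one of which is off by one unit along z.
true = np.array([[0.0, 0.0, 0.0], [3.0, 4.0, 0.0]])
pred = np.array([[0.0, 0.0, 1.0], [3.0, 4.0, 0.0]])

nrmse, nmax = normalized_errors(pred, true)
```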

Soft-tissue deformation modeling — comparison with baselines under the linear FEM evaluation protocol.

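| Method | RMSE ↓ | Max Err ↓ | DCM ↑ | Time (ms) ↓ | Params (M) |
|---|---|---|---|---|---|
| GAOT | 0.028 | 0.034 | 96.36 ±0.88 | 0.53 ±0.04 | 7.2 |
| NIN | 0.033 | 0.045 | 93.61 ±0.61 | 1.57 ±0.15 | 6.8 |
| MGN-T | 0.083 | 0.122 | 86.76 ±1.27 | 1.31 ±0.09 | 6.2 |
| PointNet | 0.030 | 0.038 | 96.37 ±0.33 | 1.28 ±0.05 | 6.0 |
| PVCNN | 0.039 | 0.053 | 92.58 ±1.34 | 1.69 ±0.23 | 6.0 |
| SurgFormer | 0.018 | 0.022 | 97.21 ±1.06 | 0.64 ±0.07 | 6.5 |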

Cholecystectomy cut-conditioned deformation — each entry reports performance without → with cut conditioning.

| Method | RMSE ↓ | Max Err ↓ | DCM ↑ | Time (ms) ↓ | Params (M) |
|---|---|---|---|---|---|
| GAOT | 0.158 → 0.133 | 0.289 → 0.221 | 63.35 → 72.26 ±1.11 | 0.51 ±0.04 | 7.2 |
| NIN | 0.216 → 0.185 | 0.364 → 0.284 | 61.11 → 69.78 ±0.93 | 1.56 ±0.17 | 6.8 |
| PointNet | 0.199 → 0.157 | 0.334 → 0.181 | 62.03 → 70.12 ±0.26 | 1.33 ±0.06 | 6.0 |
| PVCNN | 0.227 → 0.144 | 0.382 → 0.173 | 64.45 → 72.05 ±1.62 | 1.59 ±0.18 | 6.0 |
| SurgFormer | 0.143 → 0.112 | 0.191 → 0.164 | 66.85 → 83.61 ±0.44 | 0.72 ±0.09 | 6.5 |
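To make the "without → with" entries easier to compare, the short calculation below turns the SurgFormer row above into relative error reductions and an absolute DCM gain. The numbers are copied from that row; the helper name `relative_reduction` is only illustrative.

```python
# (without, with) cut conditioning, copied from the SurgFormer row.
surgformer = {
    "rmse": (0.143, 0.112),
    "max_err": (0.191, 0.164),
    "dcm": (66.85, 83.61),
}

def relative_reduction(before, after):
    """Fractional decrease from `before` to `after`."""
    return (before - after) / before

rmse_drop = relative_reduction(*surgformer["rmse"])
max_err_drop = relative_reduction(*surgformer["max_err"])
dcm_gain = surgformer["dcm"][1] - surgformer["dcm"][0]

print(f"RMSE reduction:    {rmse_drop:.1%}")
print(f"Max Err reduction: {max_err_drop:.1%}")
print(f"DCM gain:          {dcm_gain:.2f} points")
```

For SurgFormer this works out to roughly a 21.7% lower RMSE, a 14.1% lower max error, and a 16.76-point DCM gain when the cut embedding is provided.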

Appendectomy — mixed uncut and cut cases with cut conditioning (c = 0 for uncut).

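| Method | RMSE ↓ | Max Err ↓ | DCM ↑ | Time (ms) ↓ | Params (M) |
|---|---|---|---|---|---|
| GAOT | 0.164 | 0.272 | 79.44 ±1.17 | 0.49 ±0.08 | 7.2 |
| NIN | 0.155 | 0.263 | 80.24 ±2.24 | 0.81 ±0.14 | 6.8 |
| PointNet | 0.180 | 0.293 | 76.78 ±0.73 | 1.07 ±0.13 | 6.0 |
| PVCNN | 0.119 | 0.257 | 88.74 ±1.80 | 1.23 ±0.19 | 6.0 |
| SurgFormer | 0.135 | 0.228 | 87.61 ±2.02 | 0.48 ±0.08 | 6.0 |

Datasets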

Surgical Simulation Datasets

We introduce two procedure-level datasets generated under a unified XFEM protocol, covering both standard tool-driven deformation and cut-conditioned resection cases.

XFEM data generation pipeline

Cholecystectomy

Gallbladder removal with progressive resection along the liver-gallbladder interface.

Vertices: 13,085
Tetrahedra: 48,911
Resection stages: 25
Total solutions: ~14,000
Anatomy: Liver · Gallbladder · Vasculature

Appendectomy

Mixed cut and uncut cases with manipulation and resection of the appendix.

Vertices: 7,870
Tetrahedra: 29,723
Resection stages: 25
Cases: Cut + Uncut (unified)
Anatomy: Cecum · Ileum · Appendix · Mesoappendix
🔓 Code and datasets will be released upon acceptance. The data were generated using GetFEM with XFEM (E = 2100 Pa, ν = 0.45), on anatomy segmented from CT volumes using 3D Slicer.
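For readers who want the material constants in a form that FEM codes often expect, the stated Young's modulus and Poisson's ratio can be converted to Lamé parameters. This sketch assumes an isotropic linear elastic material, which the page does not state explicitly; the conversion formulas themselves are the standard ones.

```python
E = 2100.0  # Young's modulus in Pa, as stated above
nu = 0.45   # Poisson's ratio, as stated above

# Standard conversions for isotropic linear elasticity (assumed material model).
mu = E / (2 * (1 + nu))                   # shear modulus (second Lamé parameter)
lam = E * nu / ((1 + nu) * (1 - 2 * nu))  # first Lamé parameter
bulk = E / (3 * (1 - 2 * nu))             # bulk modulus

print(f"mu = {mu:.1f} Pa, lambda = {lam:.1f} Pa, K = {bulk:.1f} Pa")
```

With ν = 0.45 the material is nearly incompressible, so λ (about 6517 Pa) is roughly nine times larger than μ (about 724 Pa).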
Citation

BibTeX

@article{shahbazi2026surgformer,
  title   = {SurgFormer: Scalable Learning of Organ Deformation
             with Resection Support and Real-Time Inference},
  author  = {Shahbazi, Ashkan and Akbari, Elaheh and
             Pereira, Kyvia and Heiselman, Jon S. and
             Benson, Annie and Johnston, Garrison Lawrence Horswill and
             Wu, Jie Ying and Simaan, Nabil and
             Miga, Michael and Kolouri, Soheil},
  journal = {arXiv preprint arXiv:XXXX.XXXXX},
  year    = {2026},
}