1 Department of Computer Science, College of Connected Computing, Vanderbilt University
2 Department of Biomedical Engineering, Vanderbilt University
3 Department of Mechanical Engineering, Vanderbilt University
4 Vanderbilt Institute for Surgery and Engineering (VISE)
SurgFormer enables real-time, anatomically plausible soft-tissue simulation for minimally invasive surgery, jointly modeling tool-driven deformation and topology-changing resection. We demonstrate cholecystectomy (top) and appendectomy (bottom); each row shows the input, predicted deformation under tool interaction, and resulting resection state, highlighting geometric fidelity and visual realism.
Abstract
Overview
We introduce SurgFormer, a multiresolution gated transformer for data-driven soft tissue simulation on volumetric meshes. High-fidelity biomechanical solvers are often too costly for interactive use, so we train SurgFormer on solver-generated data to predict nodewise displacement fields at near-real-time rates. SurgFormer builds a fixed mesh hierarchy and applies repeated multibranch blocks that combine local message passing, coarse global self-attention, and pointwise feedforward updates, fused by learned per-node, per-channel gates to adaptively integrate local and long-range information while remaining scalable on large meshes. For cut-conditioned simulation, resection information is encoded as a learned cut embedding and provided as an additional input, enabling a unified model for both standard deformation prediction and topology-altering cases. We also introduce two surgical simulation datasets generated under a unified protocol with XFEM-based supervision: a cholecystectomy resection dataset and an appendectomy manipulation and resection dataset with cut and uncut cases. To our knowledge, this is the first learned volumetric surrogate setting to study XFEM-supervised cut-conditioned deformation within the same volumetric pipeline as standard deformation prediction.
Soft Tissue Deformation · Geometric Deep Learning · Surgical Simulation · XFEM · Real-Time Inference · Transformer
Contributions
What We Introduce
01
SurgFormer Architecture
A multiresolution gated transformer combining local message passing, coarse-level global attention, and pointwise updates in a scalable unified block.
02
Cut-Conditioned Deformation
First unified volumetric surrogate for both standard and topology-changing deformation prediction under the same XFEM-supervised pipeline.
03
Two New Datasets
Procedure-level surgical simulation datasets for cholecystectomy resection and appendectomy manipulation/resection under a unified protocol.
04
Comprehensive Benchmarks
Unified accuracy/efficiency benchmarks, architectural ablations, cross-task transfer studies, and adversarial smoothness stress tests.
Method
SurgFormer Architecture
SurgFormer processes volumetric organ meshes through a fixed multi-level hierarchy. At each level, a multibranch block adaptively combines three complementary information streams via learned per-node, per-channel gates — enabling global context without sacrificing scalability at fine resolutions.
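The scalability argument, attending over a coarse level instead of over every fine node, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the cluster assignment `cluster_of`, the mean pooling, and the single-head attention without learned query/key/value projections are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def coarse_attention(fine_feats, cluster_of):
    """Mean-pool fine nodes into clusters, attend among clusters, broadcast back.

    fine_feats: (N, C) node features on the fine mesh.
    cluster_of: (N,) integer cluster id per fine node (hypothetical hierarchy;
                every cluster id in 0..K-1 must be used at least once).
    """
    num_clusters = int(cluster_of.max()) + 1
    dim = fine_feats.shape[1]
    sums = np.zeros((num_clusters, dim))
    np.add.at(sums, cluster_of, fine_feats)               # scatter-add fine features
    counts = np.bincount(cluster_of, minlength=num_clusters)[:, None]
    coarse = sums / counts                                # (K, C) coarse features
    scores = coarse @ coarse.T / np.sqrt(dim)             # (K, K) instead of (N, N)
    attended = softmax(scores, axis=-1) @ coarse          # identity Q/K/V (assumption)
    return attended[cluster_of]                           # (N, C) back on fine nodes

rng = np.random.default_rng(0)
feats = rng.normal(size=(6, 4))
cluster_of = np.array([0, 0, 1, 1, 2, 2])
out = coarse_attention(feats, cluster_of)
```

With K clusters and N fine nodes, the attention matrix has K × K entries rather than N × N, which is what keeps the global branch affordable on large meshes.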
Input Features
Node coords · Tool signal · BC indicator · Cut embedding (optional)
Multibranch Block — three fused branches per level:
Local branch: message passing on fine-scale graph edges for local geometry
Global branch: self-attention restricted to coarse levels for long-range context
Feedforward branch: pointwise MLP for per-node feature refinement
Branches are fused by learned per-node, per-channel softmax gates Γ, enabling adaptive integration without fixed weighting.
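The gated fusion described above can be sketched as follows. This is a minimal illustration, assuming the gate logits have shape (nodes, branches, channels) and are normalized with a softmax over the branch axis; in the model they would come from a learned projection, whereas here they are random placeholders, and the function name `gated_fusion` is hypothetical.

```python
import numpy as np

def softmax(x, axis):
    x = x - x.max(axis=axis, keepdims=True)   # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def gated_fusion(local, global_, ffn, gate_logits):
    """Fuse three branch outputs with per-node, per-channel softmax gates.

    local, global_, ffn: (N, C) branch outputs.
    gate_logits:         (N, 3, C) unnormalized gate scores (learned in practice).
    """
    branches = np.stack([local, global_, ffn], axis=1)   # (N, 3, C)
    gates = softmax(gate_logits, axis=1)                 # sums to 1 over the 3 branches
    fused = (gates * branches).sum(axis=1)               # (N, C) convex combination
    return fused, gates

rng = np.random.default_rng(0)
N, C = 5, 4
local, global_, ffn = (rng.normal(size=(N, C)) for _ in range(3))
gate_logits = rng.normal(size=(N, 3, C))                 # placeholder for learned logits
fused, gates = gated_fusion(local, global_, ffn, gate_logits)
```

Because the gates are a softmax over branches, each node and channel receives a convex combination of the three streams; all-zero logits reduce the fusion to a plain average.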
For cut-conditioned deformation, a learned embedding encodes the binary per-node resection indicator and is concatenated to the base node features before the input Adapter. No architectural changes are required, and the mesh graph remains fixed throughout.
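A minimal sketch of this conditioning step, assuming a two-row embedding table indexed by the binary indicator. The feature sizes are hypothetical, and the table is random here, whereas in training it would be learned.

```python
import numpy as np

rng = np.random.default_rng(0)
num_nodes, base_dim, embed_dim = 6, 5, 3                 # hypothetical sizes

# Base node features (coordinates, tool signal, BC indicator), random stand-ins.
base_features = rng.normal(size=(num_nodes, base_dim))

# Binary per-node resection indicator: 1 = node belongs to the resected region.
cut_indicator = np.array([0, 0, 1, 1, 0, 1])

# One embedding vector per indicator value; learned in practice, random here.
embedding_table = rng.normal(size=(2, embed_dim))

cut_embedding = embedding_table[cut_indicator]           # (num_nodes, embed_dim)
node_inputs = np.concatenate([base_features, cut_embedding], axis=1)
```

Only the input width changes (from `base_dim` to `base_dim + embed_dim`); the mesh connectivity is untouched, consistent with the fixed-graph design described above.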
Results
Quantitative Performance
All results are averaged over 3 random seeds. Metrics: normalized RMSE ↓, normalized Max Error ↓, Deformation Capture Metric (DCM) ↑, inference time per sample ↓.
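As a rough illustration of the first two metrics, the sketch below computes a normalized RMSE and a normalized maximum error from per-node displacement errors. The normalization by the largest ground-truth displacement magnitude is an assumption, since the page does not define it, and DCM is omitted because its definition is not given here.

```python
import numpy as np

def normalized_errors(pred, target, eps=1e-12):
    """Return (normalized RMSE, normalized max error) for (N, 3) displacement fields.

    Normalization by the maximum ground-truth displacement magnitude is an
    assumption made for this sketch.
    """
    err = np.linalg.norm(pred - target, axis=1)           # per-node Euclidean error
    scale = np.linalg.norm(target, axis=1).max() + eps    # largest true displacement
    rmse = np.sqrt(np.mean(err ** 2)) / scale
    max_err = err.max() / scale
    return rmse, max_err

target = np.array([[0.0, 0.0, 0.0], [3.0, 4.0, 0.0]])
pred = np.array([[0.0, 0.0, 0.0], [3.0, 4.0, 1.0]])
rmse, max_err = normalized_errors(pred, target)
```

In this toy case the per-node errors are 0 and 1 and the scale is 5, so the normalized max error is 0.2 and the normalized RMSE is sqrt(0.5)/5.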
DCM on deformation task: 97.2%
Inference per sample: 0.6 ms
Parameters: 6.5M
DCM on cut-conditioned task: 83.6%
Table 1
Soft-tissue deformation modeling — comparison with baselines under the linear FEM evaluation protocol.
| Method | RMSE ↓ | Max Err ↓ | DCM ↑ | Time (ms) ↓ | Params (M) |
|---|---|---|---|---|---|
| GAOT | 0.028 | 0.034 | 96.36 ±0.88 | 0.53 ±0.04 | 7.2 |
| NIN | 0.033 | 0.045 | 93.61 ±0.61 | 1.57 ±0.15 | 6.8 |
| MGN-T | 0.083 | 0.122 | 86.76 ±1.27 | 1.31 ±0.09 | 6.2 |
| PointNet | 0.030 | 0.038 | 96.37 ±0.33 | 1.28 ±0.05 | 6.0 |
| PVCNN | 0.039 | 0.053 | 92.58 ±1.34 | 1.69 ±0.23 | 6.0 |
| SurgFormer | 0.018 | 0.022 | 97.21 ±1.06 | 0.64 ±0.07 | 6.5 |
Table 2
Cholecystectomy cut-conditioned deformation — each entry reports performance without → with cut conditioning.
| Method | RMSE ↓ | Max Err ↓ | DCM ↑ | Time (ms) ↓ | Params (M) |
|---|---|---|---|---|---|
| GAOT | 0.158 → 0.133 | 0.289 → 0.221 | 63.35 → 72.26 ±1.11 | 0.51 ±0.04 | 7.2 |
| NIN | 0.216 → 0.185 | 0.364 → 0.284 | 61.11 → 69.78 ±0.93 | 1.56 ±0.17 | 6.8 |
| PointNet | 0.199 → 0.157 | 0.334 → 0.181 | 62.03 → 70.12 ±0.26 | 1.33 ±0.06 | 6.0 |
| PVCNN | 0.227 → 0.144 | 0.382 → 0.173 | 64.45 → 72.05 ±1.62 | 1.59 ±0.18 | 6.0 |
| SurgFormer | 0.143 → 0.112 | 0.191 → 0.164 | 66.85 → 83.61 ±0.44 | 0.72 ±0.09 | 6.5 |
Table 3
Appendectomy — mixed uncut and cut cases with cut conditioning (c = 0 for uncut).
| Method | RMSE ↓ | Max Err ↓ | DCM ↑ | Time (ms) ↓ | Params (M) |
|---|---|---|---|---|---|
| GAOT | 0.164 | 0.272 | 79.44 ±1.17 | 0.49 ±0.08 | 7.2 |
| NIN | 0.155 | 0.263 | 80.24 ±2.24 | 0.81 ±0.14 | 6.8 |
| PointNet | 0.180 | 0.293 | 76.78 ±0.73 | 1.07 ±0.13 | 6.0 |
| PVCNN | 0.119 | 0.257 | 88.74 ±1.80 | 1.23 ±0.19 | 6.0 |
| SurgFormer | 0.135 | 0.228 | 87.61 ±2.02 | 0.48 ±0.08 | 6.0 |
Datasets
Surgical Simulation Datasets
We introduce two procedure-level datasets generated under a unified XFEM protocol, covering both standard tool-driven deformation and cut-conditioned resection cases.
Cholecystectomy
Gallbladder removal with progressive resection along the liver-gallbladder interface.
Vertices: 13,085
Tetrahedra: 48,911
Resection stages: 25
Total solutions: ~14,000
Anatomy: Liver · Gallbladder · Vasculature
Appendectomy
Mixed cut and uncut cases with manipulation and resection of the appendix.
Vertices: 7,870
Tetrahedra: 29,723
Resection stages: 25
Cases: Cut + Uncut (unified)
Anatomy: Cecum · Ileum · Appendix · Mesoappendix
Code and datasets will be released upon acceptance. The datasets were generated using GetFEM with XFEM (E = 2100 Pa, ν = 0.45); the anatomy was segmented from CT volumes using 3D Slicer.
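For readers who want to relate the stated Young's modulus and Poisson's ratio to the constants a solver typically consumes, the standard conversion to Lamé parameters is sketched below. It assumes an isotropic linear elastic material, which the note above does not state explicitly; the function name is illustrative.

```python
def lame_parameters(E, nu):
    """Convert Young's modulus E and Poisson's ratio nu to Lamé parameters.

    Assumes isotropic linear elasticity:
      lambda = E * nu / ((1 + nu) * (1 - 2 * nu))
      mu     = E / (2 * (1 + nu))
    """
    lam = E * nu / ((1.0 + nu) * (1.0 - 2.0 * nu))
    mu = E / (2.0 * (1.0 + nu))
    return lam, mu

lam, mu = lame_parameters(2100.0, 0.45)
print(f"lambda = {lam:.1f} Pa, mu = {mu:.1f} Pa")
# prints: lambda = 6517.2 Pa, mu = 724.1 Pa
```

The large ratio of λ to μ reflects the nearly incompressible behavior implied by ν = 0.45.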
Citation
BibTeX
@article{shahbazi2026surgformer,
title = {SurgFormer: Scalable Learning of Organ Deformation
with Resection Support and Real-Time Inference},
author = {Shahbazi, Ashkan and Akbari, Elaheh and
Pereira, Kyvia and Heiselman, Jon S. and
Benson, Annie and Johnston, Garrison Lawrence Horswill and
Wu, Jie Ying and Simaan, Nabil and
Miga, Michael and Kolouri, Soheil},
journal = {arXiv preprint arXiv:XXXX.XXXXX},
year = {2026},
}