SurgFormer: Scalable Learning of Organ Deformation with Resection Support and Real-Time Inference

Ashkan Shahbazi¹,⁴,★ · Elaheh Akbari¹,⁴,★ · Kyvia Pereira²,⁴ · Jon S. Heiselman²,⁴ · Annie Benson²,⁴ · Garrison Lawrence Horswill Johnston³,⁴ · Jie Ying Wu¹,⁴ · Nabil Simaan³,⁴ · Michael Miga²,⁴ · Soheil Kolouri¹,⁴

¹ Department of Computer Science, College of Connected Computing, Vanderbilt University
² Department of Biomedical Engineering, Vanderbilt University
³ Department of Mechanical Engineering, Vanderbilt University
⁴ Vanderbilt Institute for Surgery and Engineering (VISE)

★ Equal contribution

Paper (PDF) · arXiv · Code (coming soon) · Data (coming soon)

SurgFormer enables real-time, anatomically plausible soft-tissue simulation for minimally invasive surgery, jointly modeling tool-driven deformation and topology-changing resection. We demonstrate cholecystectomy (top) and appendectomy (bottom); each row shows the input, predicted deformation under tool interaction, and resulting resection state, highlighting geometric fidelity and visual realism.

Abstract

Overview

We introduce SurgFormer, a multiresolution gated transformer for data-driven soft tissue simulation on volumetric meshes. High-fidelity biomechanical solvers are often too costly for interactive use, so we train SurgFormer on solver-generated data to predict nodewise displacement fields at near-real-time rates. SurgFormer builds a fixed mesh hierarchy and applies repeated multibranch blocks that combine local message passing, coarse global self-attention, and pointwise feedforward updates, fused by learned per-node, per-channel gates to adaptively integrate local and long-range information while remaining scalable on large meshes. For cut-conditioned simulation, resection information is encoded as a learned cut embedding and provided as an additional input, enabling a unified model for both standard deformation prediction and topology-altering cases. We also introduce two surgical simulation datasets generated under a unified protocol with XFEM-based supervision: a cholecystectomy resection dataset and an appendectomy manipulation and resection dataset with cut and uncut cases. To our knowledge, this is the first learned volumetric surrogate setting to study XFEM-supervised cut-conditioned deformation within the same volumetric pipeline as standard deformation prediction.

Soft Tissue Deformation · Geometric Deep Learning · Surgical Simulation · XFEM · Real-Time Inference · Transformer

Contributions

What We Introduce

SurgFormer Architecture

A multiresolution gated transformer combining local message passing, coarse-level global attention, and pointwise updates in a scalable unified block.

Cut-Conditioned Deformation

To our knowledge, the first unified volumetric surrogate for both standard and topology-changing deformation prediction under the same XFEM-supervised pipeline.

Two New Datasets

Procedure-level surgical simulation datasets for cholecystectomy resection and appendectomy manipulation/resection under a unified protocol.

Comprehensive Benchmarks

Unified accuracy/efficiency benchmarks, architectural ablations, cross-task transfer studies, and adversarial smoothness stress tests.

Method

SurgFormer Architecture

SurgFormer processes volumetric organ meshes through a fixed multi-level hierarchy. At each level, a multibranch block adaptively combines three complementary information streams via learned per-node, per-channel gates — enabling global context without sacrificing scalability at fine resolutions.

Input Features: Node coords · Tool signal · BC indicator · Cut embedding (optional)

→ Adapter: maps raw features to model width D

→ Encoder: SurgFormer blocks + max-pool downsampling (L levels)

→ Decoder: broadcast upsampling + skip connections

→ Output Head: linear layer → nodewise 3D displacement
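The encoder's max-pool downsampling and the decoder's broadcast upsampling can be illustrated with a small NumPy sketch. It assumes each fine node belongs to exactly one coarse node, given by a hypothetical index array `assign`, and that skip connections are applied by addition; neither detail is specified on this page, and the function names are illustrative only.

```python
import numpy as np

def max_pool_down(x_fine, assign, n_coarse):
    """Max-pool fine node features (N_fine, D) into n_coarse coarse nodes.

    assign[i] is the coarse node index of fine node i (assumed mapping).
    Coarse nodes with no assigned fine node would stay at -inf.
    """
    x_coarse = np.full((n_coarse, x_fine.shape[1]), -np.inf)
    # Unbuffered in-place maximum, so repeated coarse indices accumulate correctly.
    np.maximum.at(x_coarse, assign, x_fine)
    return x_coarse

def broadcast_up(x_coarse, assign, skip):
    """Copy each coarse feature to its fine nodes, then add the skip features.

    Additive skips are an assumption; concatenation is another common choice.
    """
    return x_coarse[assign] + skip

x = np.array([[1.0, 5.0], [3.0, 2.0], [0.0, 7.0], [4.0, 4.0]])
assign = np.array([0, 0, 1, 1])          # fine nodes 0,1 -> coarse 0; 2,3 -> coarse 1
coarse = max_pool_down(x, assign, 2)     # per-channel max within each cluster
up = broadcast_up(coarse, assign, x)     # back at fine resolution
```

Because the mesh hierarchy is fixed, an assignment like this could be precomputed once per mesh rather than rebuilt for each sample.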

Multibranch Block — three fused branches per level:

Local branch — message passing on fine-scale graph edges for local geometry

Global branch — self-attention restricted to coarse levels for long-range context

Feedforward branch — pointwise MLP for per-node feature refinement

Branches are fused by learned per-node, per-channel softmax gates Γ, enabling adaptive integration without fixed weighting.
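The gated fusion described above can be sketched as follows. This is a minimal NumPy illustration, assuming the gate logits arrive as an array of shape (N, 3, D) with one score per node, branch, and channel. How those logits are computed inside SurgFormer is not stated here, so `gate_logits` and the function names are placeholders.

```python
import numpy as np

def softmax(z, axis):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def gated_fusion(h_local, h_global, h_ff, gate_logits):
    """Fuse three (N, D) branch outputs with per-node, per-channel softmax gates.

    gate_logits: (N, 3, D) unnormalized scores (assumed to come from a learned layer).
    Returns the fused features (N, D) and the gates (N, 3, D).
    """
    branches = np.stack([h_local, h_global, h_ff], axis=1)  # (N, 3, D)
    gamma = softmax(gate_logits, axis=1)                    # sums to 1 across branches
    return (gamma * branches).sum(axis=1), gamma

rng = np.random.default_rng(0)
N, D = 5, 4  # illustrative sizes
h_local, h_global, h_ff = (rng.normal(size=(N, D)) for _ in range(3))
fused, gamma = gated_fusion(h_local, h_global, h_ff, rng.normal(size=(N, 3, D)))
```

With all-zero logits the gates are uniform and the fusion reduces to a plain average of the three branches, which is the fixed weighting that learned gates are meant to avoid.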

For cut-conditioned deformation, a learned embedding encodes the binary per-node resection indicator and is concatenated to the base node features before the input Adapter — no architectural changes required, and the mesh graph remains fixed throughout.
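A small sketch of this conditioning step, assuming the embedding is a two-row lookup table indexed by the binary indicator. The sizes, the random stand-in for learned weights, and the variable names are all illustrative and not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
N, F_base, D_cut = 6, 5, 3   # illustrative sizes, not from the paper

base_feats = rng.normal(size=(N, F_base))   # stands in for coords, tool signal, BC indicator
cut_table = rng.normal(size=(2, D_cut))     # stand-in for a learned embedding table
cut_flag = np.array([0, 0, 1, 1, 0, 1])     # binary per-node resection indicator

cut_emb = cut_table[cut_flag]               # (N, D_cut): row 0 or row 1 per node
adapter_in = np.concatenate([base_feats, cut_emb], axis=1)  # (N, F_base + D_cut)
```

Only the input width changes, which is consistent with the statement that no architectural changes are needed. Under this sketch, an uncut case would be an all-zero `cut_flag`, so every node receives the same embedding row.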

Results

Quantitative Performance

DCM on deformation task: 97.2%

Inference per sample: 0.6 ms

Parameters: 6.5M

DCM on cut-conditioned task: 83.6%

Table 1

Soft-tissue deformation modeling — comparison with baselines under the linear FEM evaluation protocol.

Method	RMSE ↓	Max Err ↓	DCM ↑	Time (ms) ↓	Params (M)
GAOT	0.028	0.034	96.36 ±0.88	0.53 ±0.04	7.2
NIN	0.033	0.045	93.61 ±0.61	1.57 ±0.15	6.8
MGN-T	0.083	0.122	86.76 ±1.27	1.31 ±0.09	6.2
PointNet	0.030	0.038	96.37 ±0.33	1.28 ±0.05	6.0
PVCNN	0.039	0.053	92.58 ±1.34	1.69 ±0.23	6.0
SurgFormer	0.018	0.022	97.21 ±1.06	0.64 ±0.07	6.5
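To put the Table 1 numbers in relative terms, the short script below computes SurgFormer's RMSE reduction against the best baseline and the throughput implied by its mean latency. The values are copied from the table; the throughput figure assumes samples are processed one at a time at the mean time, which the table does not state.

```python
# (RMSE, mean time in ms) copied from Table 1.
table1 = {
    "GAOT": (0.028, 0.53),
    "NIN": (0.033, 1.57),
    "MGN-T": (0.083, 1.31),
    "PointNet": (0.030, 1.28),
    "PVCNN": (0.039, 1.69),
    "SurgFormer": (0.018, 0.64),
}

# Best (lowest-RMSE) baseline, excluding SurgFormer itself.
best_rmse, best_name = min(
    (rmse, name) for name, (rmse, _) in table1.items() if name != "SurgFormer"
)
rmse_sf, time_sf = table1["SurgFormer"]

rel_reduction = 100 * (best_rmse - rmse_sf) / best_rmse  # percent RMSE reduction
throughput = 1000 / time_sf                              # samples per second

print(f"Best baseline: {best_name} (RMSE {best_rmse})")
print(f"RMSE reduction: {rel_reduction:.1f}%")
print(f"Implied throughput: {throughput:.1f} samples/s")
```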

Table 2

Cholecystectomy cut-conditioned deformation — each entry reports performance without → with cut conditioning.

Method	RMSE ↓	Max Err ↓	DCM ↑	Time (ms) ↓	Params (M)
GAOT	0.158 → 0.133	0.289 → 0.221	63.35 → 72.26 ±1.11	0.51 ±0.04	7.2
NIN	0.216 → 0.185	0.364 → 0.284	61.11 → 69.78 ±0.93	1.56 ±0.17	6.8
PointNet	0.199 → 0.157	0.334 → 0.181	62.03 → 70.12 ±0.26	1.33 ±0.06	6.0
PVCNN	0.227 → 0.144	0.382 → 0.173	64.45 → 72.05 ±1.62	1.59 ±0.18	6.0
SurgFormer	0.143 → 0.112	0.191 → 0.164	66.85 → 83.61 ±0.44	0.72 ±0.09	6.5
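Each "without → with" entry in Table 2 can be read as a gain from cut conditioning. The script below computes the DCM gain per method from the values copied out of the table; it is plain arithmetic on the reported means and ignores the ± spreads.

```python
# DCM (without, with cut conditioning) copied from Table 2.
table2_dcm = {
    "GAOT": (63.35, 72.26),
    "NIN": (61.11, 69.78),
    "PointNet": (62.03, 70.12),
    "PVCNN": (64.45, 72.05),
    "SurgFormer": (66.85, 83.61),
}

# Gain in DCM points from adding the cut embedding, rounded to table precision.
gains = {m: round(w - wo, 2) for m, (wo, w) in table2_dcm.items()}

for method, gain in sorted(gains.items(), key=lambda kv: -kv[1]):
    print(f"{method:<10} +{gain:.2f} DCM points")
```

On these figures SurgFormer gains 16.76 DCM points from conditioning, while the baselines gain between 7.60 and 8.91.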

Table 3

Appendectomy — mixed uncut and cut cases with cut conditioning (c = 0 for uncut).

Method	RMSE ↓	Max Err ↓	DCM ↑	Time (ms) ↓	Params (M)
GAOT	0.164	0.272	79.44 ±1.17	0.49 ±0.08	7.2
NIN	0.155	0.263	80.24 ±2.24	0.81 ±0.14	6.8
PointNet	0.180	0.293	76.78 ±0.73	1.07 ±0.13	6.0
PVCNN	0.119	0.257	88.74 ±1.80	1.23 ±0.19	6.0
SurgFormer	0.135	0.228	87.61 ±2.02	0.48 ±0.08	6.0

Datasets

Surgical Simulation Datasets

We introduce two procedure-level datasets generated under a unified XFEM protocol, covering both standard tool-driven deformation and cut-conditioned resection cases.

Cholecystectomy

Gallbladder removal with progressive resection along the liver-gallbladder interface.

Vertices: 13,085

Tetrahedra: 48,911

Resection stages: 25

Total solutions: ~14,000

Anatomy: Liver · Gallbladder · Vasculature

Appendectomy

Mixed cut and uncut cases with manipulation and resection of the appendix.

Vertices: 7,870

Tetrahedra: 29,723

Resection stages: 25

Cases: Cut + Uncut (unified)

Anatomy: Cecum · Ileum · Appendix · Mesoappendix

Citation

BibTeX

@article{shahbazi2026surgformer,
  title   = {SurgFormer: Scalable Learning of Organ Deformation
             with Resection Support and Real-Time Inference},
  author  = {Shahbazi, Ashkan and Akbari, Elaheh and
             Pereira, Kyvia and Heiselman, Jon S. and
             Benson, Annie and Johnston, Garrison Lawrence Horswill and
             Wu, Jie Ying and Simaan, Nabil and
             Miga, Michael and Kolouri, Soheil},
  journal = {arXiv preprint arXiv:XXXX.XXXXX},
  year    = {2026},
}