E(3)-Pose

Equivariant symmetry-aware head pose estimation for fetal MRI

MIT CSAIL
Boston Children's Hospital
Harvard Medical School
Inria
Université Côte d’Azur


Abstract

We present E(3)-Pose, a novel fast pose estimation method that jointly and explicitly models rotation equivariance and object symmetry. Our work is motivated by the challenging problem of accounting for fetal head motion during a diagnostic MRI scan. We aim to enable automatic adaptive prescription of 2D diagnostic MRI slices with 6-DoF head pose estimation, supported by 3D MRI volumes rapidly acquired before each 2D slice. Existing methods struggle to generalize to clinical volumes, due to pose ambiguities induced by inherent anatomical symmetries, as well as low resolution, noise, and artifacts. In contrast, E(3)-Pose captures anatomical symmetries and rigid pose equivariance by construction, and yields robust estimates of the fetal head pose. Our experiments on publicly available and representative clinical fetal MRI datasets demonstrate the superior robustness and generalization of our method across domains. Crucially, E(3)-Pose achieves state-of-the-art accuracy on clinical MRI volumes, paving the way for clinical translation.

Method

E(3)-Pose is a rotation-equivariant and symmetry-aware framework for 6-DoF pose estimation. Left: a rapid navigator volume is acquired between each pair of consecutive 2D diagnostic MRI slices. It is used to estimate the fetal head pose and adjust the imaging plane prescription in real time. Right: to enable robust performance, the network architecture employs E(3)-equivariant convolutional filters to capture pose equivariance and pseudovectors to account for left-right head symmetry.
Overview of E(3)-Pose. We first train a CNN \(\psi\) to segment the object. We estimate translation based on the center-of-mass of the predicted mask and then crop the 3D volumes around this mask. The cropped volumes are fed to an E(3)-CNN \(\phi\) trained independently to regress the orthonormal basis of the object frame, parametrized as one pseudovector (red) and two vectors (blue and green). The output is later constrained to represent a rotation matrix by applying SVD and then choosing the pseudovector \(e_x\) direction that results in a proper rotation without reflection (i.e., \(\det(M(\hat{R})) = 1\)).
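The two post-processing steps in the overview above (translation from the mask center of mass, and projection of the regressed basis onto a proper rotation) can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `center_of_mass` and `project_to_rotation`, the `voxel_size` parameter, and the convention that the columns of the raw 3x3 output are \(e_x, e_y, e_z\) are all hypothetical.

```python
import numpy as np

def center_of_mass(mask, voxel_size=(1.0, 1.0, 1.0)):
    """Center of mass of a binary 3D mask, scaled by the voxel size."""
    coords = np.argwhere(mask > 0)            # (N, 3) voxel indices inside the mask
    if coords.size == 0:
        raise ValueError("empty mask: no translation estimate available")
    return coords.mean(axis=0) * np.asarray(voxel_size, dtype=float)

def project_to_rotation(M):
    """Map a raw 3x3 basis estimate to a proper rotation matrix.

    Assumption: the columns of M are (e_x, e_y, e_z) and e_x is the
    pseudovector, so its sign is the one flipped to remove a reflection.
    """
    U, _, Vt = np.linalg.svd(M)
    R = U @ Vt                                # closest orthogonal matrix to M
    if np.linalg.det(R) < 0:                  # orthogonal but includes a reflection
        R[:, 0] *= -1.0                       # flip e_x so that det(R) = +1
    return R

# Example: a scaled rotation whose pseudovector column has the wrong sign.
theta = np.deg2rad(30.0)
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0,            0.0,           1.0]])
M_raw = 1.7 * R_true
M_raw[:, 0] *= -1.0
R_hat = project_to_rotation(M_raw)            # recovers R_true in this example
```

Flipping the \(e_x\) column, rather than the last singular vector as is common in generic orthogonal Procrustes solutions, follows the caption's description that the pseudovector direction is the one chosen to obtain a rotation without reflection.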

Results

Example results. Volumes are displayed before (row 1) and after (rows 2-6) alignment to the canonical object frame. The brain mask is also aligned to the GT (green outline) and predicted (red outline) frames. Navigator volumes include spin history artifacts (blue arrows) and have low resolution and SNR, posing challenges for pose estimation. While baseline methods struggle (red Xs, rotation error \(> 60^\circ\)) in younger fetuses (column 3) and navigator volumes (columns 4-6), E(3)-Pose correctly predicts pose in all cases. E(3)-Pose remains accurate under significant pose ambiguity, e.g., when the artifact intersects both eyes, a large voxel size obscures brain structure, and the fetal brain is close to a sphere (column 6).
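The caption above reports rotation errors in degrees without defining the metric on this page. A common choice, assumed here purely for illustration, is the geodesic angle between the ground-truth and predicted rotation matrices; the function name `rotation_error_deg` is hypothetical.

```python
import numpy as np

def rotation_error_deg(R_gt, R_pred):
    """Geodesic angle (degrees) between two 3x3 rotation matrices."""
    R_rel = R_gt.T @ R_pred                           # relative rotation
    cos_angle = (np.trace(R_rel) - 1.0) / 2.0         # trace(R) = 1 + 2 cos(angle)
    cos_angle = np.clip(cos_angle, -1.0, 1.0)         # guard against round-off
    return float(np.degrees(np.arccos(cos_angle)))

# Example: identity versus a 90-degree rotation about the z axis.
R_z90 = np.array([[0.0, -1.0, 0.0],
                  [1.0,  0.0, 0.0],
                  [0.0,  0.0, 1.0]])
err = rotation_error_deg(np.eye(3), R_z90)
```

Under this assumed metric, the failure cases marked with red Xs would correspond to `rotation_error_deg(...) > 60`.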
Simulation results. Left: quantitative comparison of diagnostic slice stacks obtained using motion-blind prescription and E(3)-Pose. Values are reported as mean ± standard deviation. * indicates statistical significance (\(p<0.05\), pairwise Wilcoxon). Right: brain coverage of the diagnostic slice stacks prescribed by each method, for three different example subjects and target anatomical orientations. The coverage gap (red) and obliqueness (purple, in degrees) metrics are displayed for each example. Spatial coverage gap regions are outlined in red.

In-utero demo

We implemented a feedback-loop system that dynamically translates the navigator field-of-view (FOV) to follow the translational movements of the fetal head. We provide two examples, in a 31-week-old and a 28-week-old fetus, respectively.

Left: the navigator FOV center (red) is dynamically translated to accurately follow the ground-truth fetal head center-of-mass (blue). Middle: our translated navigator volumes minimize the distance between the two. Right: translated navigator volumes align the FOV center (star) with the estimated brain segmentation mask (green outline). The dark bands in the navigator volumes represent spin history artifacts from the 2D diagnostic MRI slices.
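The translational feedback loop described in this section can be illustrated with a small simulation. This is a sketch under stated assumptions rather than the scanner implementation: the function `track_fov`, the `gain` parameter, and the one-step update rule (shift the next FOV center by the head offset measured in the current navigator volume) are hypothetical.

```python
import numpy as np

def track_fov(head_positions, initial_fov_center, gain=1.0):
    """Simulate a navigator FOV center that follows a moving head.

    head_positions: (T, 3) array of head centers in scanner coordinates (mm).
    Returns the FOV center used at each step and the FOV-to-head distance.
    """
    fov = np.asarray(initial_fov_center, dtype=float)
    fov_history, distances = [], []
    for head in np.asarray(head_positions, dtype=float):
        offset = head - fov                   # head position relative to the current FOV center
        fov_history.append(fov.copy())
        distances.append(np.linalg.norm(offset))
        fov = fov + gain * offset             # FOV center prescribed for the next navigator
    return np.array(fov_history), np.array(distances)

# Example: the head sits 10 mm off-center and does not move afterwards.
heads = np.tile([10.0, 0.0, 0.0], (3, 1))
fov_history, distances = track_fov(heads, initial_fov_center=[0.0, 0.0, 0.0])
```

With `gain=1.0` and a stationary head, the FOV center reaches the head after a single update; a moving head leaves a residual distance equal to the motion between consecutive navigator volumes.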