Performance results first

GradICON and ConstrICON can both do a great job of solving the challenging retina registration problem. (Here we use the ConstrICON approach.)

ConstrICON:

If the moving and fixed images are misaligned by a constant shift, ConstrICON can still manage, but only with careful hyperparameter settings.

GradICON, by contrast, simply fails.

The new equivariant architecture cannot see the shift at all, so it works equally well whether or not the images are shifted.

Equivariant architecture details

To explain the first breakthrough, we need to cover a proof.

Assertion: if $\Phi$ is equivariant to transforming $I^A, I^B \rightarrow I^A \circ W, I^B \circ U$ (henceforth [W, U] equivariant) and $\Psi$ is equivariant to transforming $I^A, I^B \rightarrow I^A \circ U, I^B \circ U$ (henceforth [U, U] equivariant) for $U$ and $W$ from a class of transforms $\mathcal{T}$, then $TwoStep\{\Phi, \Psi\}$ is [W, U] equivariant.

Proof (using [W, U] equivariance in the concrete form $\Phi[A \circ W, B \circ U] = W^{-1} \circ \Phi[A, B] \circ U$):

$$\begin{aligned}
&TwoStep\{\Phi, \Psi\}[A \circ W, B \circ U]\\
&= \Phi[A \circ W, B \circ U] \circ \Psi\left[A \circ W \circ \Phi[A \circ W, B \circ U], B \circ U\right]\\
&= W^{-1} \circ \Phi[A, B] \circ U \circ \Psi\left[A \circ W \circ W^{-1} \circ \Phi[A, B] \circ U, B \circ U\right] \\
&= W^{-1} \circ \Phi[A, B] \circ U \circ \Psi\left[A \circ \Phi[A, B] \circ U, B \circ U\right] \\
&= W^{-1} \circ \Phi[A, B] \circ U \circ U^{-1} \circ \Psi\left[A \circ \Phi[A, B], B\right] \circ U \\
&= W^{-1} \circ \Phi[A, B] \circ \Psi\left[A \circ \Phi[A, B], B\right] \circ U \\
&= W^{-1} \circ TwoStep\{\Phi, \Psi\}[A, B] \circ U
\end{aligned}$$

This allows powerful but only [U, U] equivariant algorithms, such as VoxelMorph-like approaches, to be integrated into a [W, U] equivariant pipeline.
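To make the composition concrete, here is a minimal, runnable sketch of TwoStep in Python. It treats an image as a function from coordinates to intensities and a transform as a function from coordinates to coordinates, so that $A \circ W$ is literal function composition; the names are illustrative, not any library's exact API.

```python
# Minimal sketch of TwoStep{Phi, Psi} (illustrative, not a library API).
# An "image" is a function coords -> intensities, a "transform" is a
# function coords -> coords, so A ∘ W is plain function composition.

def compose(f, g):
    """(f ∘ g)(x) = f(g(x))"""
    return lambda x: f(g(x))

def two_step(phi, psi):
    """Coarse transform from phi, refinement from psi on the pre-warped
    moving image, composed into a single transform."""
    def registration(A, B):
        coarse = phi(A, B)                 # Phi[A, B]
        fine = psi(compose(A, coarse), B)  # Psi[A ∘ Phi[A, B], B]
        return compose(coarse, fine)       # Phi[A, B] ∘ Psi[...]
    return registration
```

Tracing the proof through this code: any W baked into the moving image is absorbed into `coarse`, so the pre-warped input `compose(A, coarse)` differs from B only by U, which is exactly the case the [U, U] equivariant `psi` handles.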

For our registration approach, we will use

$$\begin{aligned}
&TwoStep\{\\
&~~ TwoStep\{ \\
&~~~~ DownsampleRegistration\{Equivariant\_reg\},\\
&~~~~ \Psi_1\\
&~~ \},\\
&~~ \Psi_2\\
&\}
\end{aligned}$$

where $\Psi_i$ are standard registration U-Nets that are naturally [U, U] equivariant.
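This nesting maps onto registration wrappers one-to-one. The sketch below assumes the icon_registration library's TwoStepRegistration, DownsampleRegistration, and FunctionFromVectorField wrappers and its tallUNet2 networks; `EquivariantReg` is a hypothetical placeholder for the equivariant first stage, not a published module.

```python
# Sketch of the pipeline above using icon_registration-style wrappers.
# `EquivariantReg()` stands in for the equivariant first stage described
# in this report; the rest is the library's usual composition.
import icon_registration as icon
import icon_registration.networks as networks

dim = 2  # 2-D retina images

pipeline = icon.TwoStepRegistration(
    icon.TwoStepRegistration(
        icon.DownsampleRegistration(EquivariantReg(), dimension=dim),
        icon.FunctionFromVectorField(networks.tallUNet2(dimension=dim)),  # Psi_1
    ),
    icon.FunctionFromVectorField(networks.tallUNet2(dimension=dim)),      # Psi_2
)
```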

Tuning details

For this experiment, we target [W, U] translation equivariance.

To make this train end to end, we made several fine adjustments to equivariant_reg (the sketch after this list shows how they fit together).

  1. We normalize the feature vectors and then scale them by 4 before passing them to the attention layer.

  2. We use a simple convolutional network with residual connections and no downsampling for featurizing. We pad the image before processing and crop afterwards to remove boundary effects. This makes the featurizer exactly equivariant to 1-pixel shifts.

  3. We add batch normalization to the feature extraction network.

  4. We use diffusion regularization on the end-to-end registration pipeline. GradICON or bending energy regularization also work for fine-tuning, but they require student-teacher initialization and cannot train from scratch. For medical images, it is very important to simplify the training process, so eliminating the student-teacher initialization is a massive win.

  5. We use 128-dimensional feature vectors; 64 dimensions is too few.
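Below is a minimal PyTorch sketch of how adjustments 1, 2, 3, and 5 fit together in the featurizer, plus the diffusion penalty of adjustment 4. Channel widths, pad size, and block count are illustrative assumptions; this is not the exact equivariant_reg implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResBlock(nn.Module):
    """Stride-1 residual block with batch norm (adjustments 2 and 3)."""
    def __init__(self, ch):
        super().__init__()
        self.conv1 = nn.Conv2d(ch, ch, 3, padding=1)
        self.bn1 = nn.BatchNorm2d(ch)
        self.conv2 = nn.Conv2d(ch, ch, 3, padding=1)
        self.bn2 = nn.BatchNorm2d(ch)

    def forward(self, x):
        y = F.relu(self.bn1(self.conv1(x)))
        return F.relu(x + self.bn2(self.conv2(y)))  # no downsampling

class Featurizer(nn.Module):
    """Pad -> featurize -> crop, so boundary effects are cut away and the
    network stays exactly equivariant to integer pixel shifts."""
    def __init__(self, channels=128, pad=16, blocks=4):  # 128-dim: adjustment 5
        super().__init__()
        self.pad = pad
        self.stem = nn.Conv2d(1, channels, 3, padding=1)
        self.blocks = nn.Sequential(*[ResBlock(channels) for _ in range(blocks)])

    def forward(self, image):
        p = self.pad
        x = F.pad(image, (p, p, p, p))
        x = self.blocks(self.stem(x))
        x = x[:, :, p:-p, p:-p]
        # Adjustment 1: L2-normalize the feature vectors, then scale by 4,
        # which fixes the temperature of the downstream attention softmax.
        return 4 * F.normalize(x, dim=1)

def diffusion_penalty(displacement):
    """Adjustment 4: mean squared finite-difference gradient of the
    displacement field u (shape [B, 2, H, W])."""
    dx = displacement[:, :, 1:, :] - displacement[:, :, :-1, :]
    dy = displacement[:, :, :, 1:] - displacement[:, :, :, :-1]
    return (dx ** 2).mean() + (dy ** 2).mean()
```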

Notebooks

constricon_noshift constricon_shift gradicon_shift equicon_noshift equicon_shift
