Improved Equivariant Registration Techniques

Strategies:

  • add Cross Attention

  • do attention on patches instead of pixels

  • do relaxations between displacement prediction layers

Cross attention: empirically a bust. Not sure why.
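For context, here is a minimal sketch of the kind of cross-attention block this likely refers to: queries from the moving image's features attending to keys and values from the fixed image's features. Module and parameter names are illustrative assumptions, not the actual implementation.

```python
# Hypothetical sketch of cross attention between moving- and fixed-image
# feature maps in a registration network (not the exact module used here).
import torch
import torch.nn as nn

class CrossAttentionBlock(nn.Module):
    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, moving_feats: torch.Tensor, fixed_feats: torch.Tensor) -> torch.Tensor:
        # moving_feats, fixed_feats: (B, C, H, W) feature maps
        B, C, H, W = moving_feats.shape
        q = moving_feats.flatten(2).transpose(1, 2)   # (B, H*W, C) queries from moving image
        kv = fixed_feats.flatten(2).transpose(1, 2)   # (B, H*W, C) keys/values from fixed image
        attended, _ = self.attn(q, kv, kv)            # each moving-image location attends to the fixed image
        out = self.norm(q + attended)                 # residual + norm
        return out.transpose(1, 2).reshape(B, C, H, W)
```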

Attention on patches: huge performance boost and a huge improvement in training time.
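A rough sketch of where that speedup comes from, assuming a ViT-style patch embedding: a strided conv turns p x p patches into tokens, so self-attention runs over (H/p)(W/p) tokens instead of H*W pixels, cutting the quadratic attention cost by roughly p^4. Names and sizes below are illustrative, not the real network.

```python
# Hypothetical sketch of attention over patches rather than individual pixels.
import torch
import torch.nn as nn

class PatchSelfAttention(nn.Module):
    def __init__(self, in_ch: int, dim: int = 64, patch: int = 4, num_heads: int = 4):
        super().__init__()
        self.embed = nn.Conv2d(in_ch, dim, kernel_size=patch, stride=patch)  # patchify
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        tokens = self.embed(x)                      # (B, dim, H/p, W/p)
        B, D, Hp, Wp = tokens.shape
        seq = tokens.flatten(2).transpose(1, 2)     # (B, Hp*Wp, dim) patch tokens
        attended, _ = self.attn(seq, seq, seq)      # self-attention over patch tokens, not pixels
        seq = self.norm(seq + attended)
        return seq.transpose(1, 2).reshape(B, D, Hp, Wp)
```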

Relaxations between displacement prediction layers: also great; they help training time and make it possible to add more layers to the network stably.
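The note doesn't define what a "relaxation" is here, so the following is only one plausible reading: a Jacobi-style smoothing relaxation applied to the intermediate displacement field before it is passed to the next prediction layer. Treat it as a hypothetical illustration, not the method actually used.

```python
# Hypothetical illustration only: blend each displacement vector with its
# local neighborhood average before the next displacement prediction layer.
import torch
import torch.nn.functional as F

def relax_displacement(disp: torch.Tensor, alpha: float = 0.5) -> torch.Tensor:
    """Jacobi-style relaxation of an intermediate displacement field.

    disp: (B, 2, H, W) displacement predicted by one layer.
    alpha: 0 keeps the raw prediction, 1 fully replaces it with the local mean.
    """
    smoothed = F.avg_pool2d(disp, kernel_size=3, stride=1, padding=1)
    return (1.0 - alpha) * disp + alpha * smoothed
```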

New record for rotation-equivariant MNIST training time: it was ~30 minutes last week and is now ~30 seconds. The relaxation layers are the most important factor.

This allows removal of the diffusion-regularized pretraining stage; bless any reduction in training complexity. Great sign.

Now train GradICON end to end.
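For reference, a minimal sketch of what an end-to-end GradICON-style loss looks like: an image-similarity term plus the gradient inverse-consistency penalty ||grad(phi_AB o phi_BA) - I||^2, applied to a network that predicts displacement fields in both directions. The helper functions and the displacement-based composition below are stand-ins, not the actual GradICON codebase.

```python
# Minimal, hypothetical sketch of an end-to-end GradICON-style training loss.
import torch
import torch.nn.functional as F

def identity_grid(disp: torch.Tensor) -> torch.Tensor:
    # Identity sampling grid in grid_sample's normalized [-1, 1] (x, y) coordinates.
    B, _, H, W = disp.shape
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, H, device=disp.device),
        torch.linspace(-1, 1, W, device=disp.device),
        indexing="ij",
    )
    return torch.stack([xs, ys], dim=-1).expand(B, H, W, 2)

def warp(img: torch.Tensor, disp: torch.Tensor) -> torch.Tensor:
    # Resample img at locations shifted by disp ((B, 2, H, W), (x, y) order, normalized coords).
    grid = identity_grid(disp) + disp.permute(0, 2, 3, 1)
    return F.grid_sample(img, grid, align_corners=True)

def compose_displacements(disp_ab: torch.Tensor, disp_ba: torch.Tensor) -> torch.Tensor:
    # Displacement of phi_AB o phi_BA: u_BA(x) + u_AB(x + u_BA(x)).
    return disp_ba + warp(disp_ab, disp_ba)

def gradicon_penalty(composed_disp: torch.Tensor) -> torch.Tensor:
    # GradICON penalizes ||grad(phi_AB o phi_BA) - I||^2; writing the composition
    # as x + u(x), this equals ||grad u||^2, approximated with finite differences.
    dx = composed_disp[:, :, :, 1:] - composed_disp[:, :, :, :-1]
    dy = composed_disp[:, :, 1:, :] - composed_disp[:, :, :-1, :]
    return (dx ** 2).mean() + (dy ** 2).mean()

def gradicon_loss(net, image_a: torch.Tensor, image_b: torch.Tensor, lam: float = 1.0):
    # net(x, y) is an assumed interface returning a (B, 2, H, W) displacement
    # field mapping x toward y.
    disp_ab = net(image_a, image_b)
    disp_ba = net(image_b, image_a)
    similarity = F.mse_loss(warp(image_a, disp_ab), image_b)  # MSE stand-in; LNCC is common
    return similarity + lam * gradicon_penalty(compose_displacements(disp_ab, disp_ba))
```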
