Improved Equivariant Registration tech
Strategies:
- Add cross attention
- Do attention on patches instead of pixels
- Do relaxations between displacement prediction layers
Cross attention: empirically a bust; not sure why.
Attention on patches: a huge performance boost and a huge improvement in training time.
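A minimal sketch of what patch-level attention could look like in PyTorch: collapse each p × p block of the feature map into a single token, run self-attention over the patch tokens instead of over individual pixels, then scatter the result back. The module name, channel counts, and patch size are illustrative assumptions, not the actual network.

```python
import torch
import torch.nn as nn

class PatchSelfAttention(nn.Module):
    """Self-attention over non-overlapping patches instead of pixels (illustrative)."""
    def __init__(self, in_channels=64, patch_size=4, embed_dim=128, num_heads=4):
        super().__init__()
        # Turn each patch_size x patch_size block into one token.
        self.to_tokens = nn.Conv2d(in_channels, embed_dim,
                                   kernel_size=patch_size, stride=patch_size)
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        # Map the attended tokens back to a per-patch feature map.
        self.from_tokens = nn.ConvTranspose2d(embed_dim, in_channels,
                                              kernel_size=patch_size, stride=patch_size)

    def forward(self, x):
        b, c, h, w = x.shape
        tokens = self.to_tokens(x)                       # (B, D, H/p, W/p)
        hp, wp = tokens.shape[-2:]
        tokens = tokens.flatten(2).transpose(1, 2)       # (B, num_patches, D)
        attended, _ = self.attn(tokens, tokens, tokens)  # attention over patches
        attended = attended.transpose(1, 2).reshape(b, -1, hp, wp)
        return x + self.from_tokens(attended)            # residual connection
```

Since attention cost grows quadratically with the number of tokens, replacing H·W pixel tokens with (H/p)·(W/p) patch tokens cuts that cost by roughly a factor of p⁴, which is consistent with the large training-time improvement.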
Relaxations between displacement prediction layers: also great; they help training time and make it possible to stably add more layers to the network.
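I'm reading "relaxation" here as some smoothing applied to the running displacement field between prediction stages; the sketch below makes that assumption concrete with a separable Gaussian blur. Everything in it (stage structure, kernel size, channel counts) is a guess for illustration, not the actual mechanism.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def gaussian_smooth(disp, sigma=1.0, kernel_size=5):
    """Separable Gaussian blur of each displacement channel (the assumed 'relaxation')."""
    coords = torch.arange(kernel_size, device=disp.device).float() - kernel_size // 2
    g = torch.exp(-coords ** 2 / (2 * sigma ** 2))
    g = g / g.sum()
    c = disp.shape[1]
    kx = g.view(1, 1, 1, -1).repeat(c, 1, 1, 1)  # blur along width
    ky = g.view(1, 1, -1, 1).repeat(c, 1, 1, 1)  # blur along height
    disp = F.conv2d(disp, kx, padding=(0, kernel_size // 2), groups=c)
    disp = F.conv2d(disp, ky, padding=(kernel_size // 2, 0), groups=c)
    return disp

class StackedDisplacementPredictor(nn.Module):
    """Several displacement-prediction blocks with a relaxation step between them."""
    def __init__(self, num_stages=4):
        super().__init__()
        # Each stage sees the (assumed single-channel) moving and fixed images plus
        # the current 2-channel displacement field, and predicts a 2-channel residual.
        self.stages = nn.ModuleList(
            nn.Conv2d(4, 2, kernel_size=3, padding=1) for _ in range(num_stages)
        )

    def forward(self, moving, fixed):
        b, _, h, w = moving.shape
        disp = torch.zeros(b, 2, h, w, device=moving.device)
        for stage in self.stages:
            x = torch.cat([moving, fixed, disp], dim=1)
            disp = disp + stage(x)          # residual displacement prediction
            disp = gaussian_smooth(disp)    # relaxation between prediction layers
        return disp
```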
New record for rotation-equivariant MNIST training time: ~30 minutes last week, ~30 seconds now. The relaxation layers are the most important change.
Allows removal of the diffusion-regularized pretraining; bless any reduction in training complexity. Great sign.
Now we can train GradICON end to end.
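For reference, the GradICON regularizer penalizes the Jacobian of the composed map Φ_AB ∘ Φ_BA for deviating from the identity. Below is a rough 2-D finite-difference version of that loss in plain PyTorch, assuming displacement fields in normalized [-1, 1] coordinates; this is a sketch, not the implementation from the icon_registration library, and the function names are mine.

```python
import torch
import torch.nn.functional as F

def compose(disp_ab, disp_ba):
    """Coordinate map for Phi_AB(Phi_BA(x)), with displacements in [-1, 1] coords."""
    b, _, h, w = disp_ba.shape
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, h, device=disp_ba.device),
        torch.linspace(-1, 1, w, device=disp_ba.device),
        indexing="ij",
    )
    identity = torch.stack([xs, ys]).unsqueeze(0).expand(b, -1, -1, -1)  # (B, 2, H, W)
    phi_ba = identity + disp_ba
    # Resample u_AB at the points Phi_BA(x); grid_sample wants (B, H, W, 2) in xy order.
    u_ab_at_phi_ba = F.grid_sample(disp_ab, phi_ba.permute(0, 2, 3, 1), align_corners=True)
    return phi_ba + u_ab_at_phi_ba

def gradicon_loss(disp_ab, disp_ba):
    """Penalize || Jacobian(Phi_AB o Phi_BA) - Identity ||^2 via finite differences."""
    comp = compose(disp_ab, disp_ba)                        # (B, 2, H, W)
    _, _, h, w = comp.shape
    # Scale the differences so the identity map has Jacobian exactly I.
    dx = (comp[:, :, :, 1:] - comp[:, :, :, :-1]) * (w - 1) / 2
    dy = (comp[:, :, 1:, :] - comp[:, :, :-1, :]) * (h - 1) / 2
    id_x = torch.tensor([1.0, 0.0], device=comp.device).view(1, 2, 1, 1)
    id_y = torch.tensor([0.0, 1.0], device=comp.device).view(1, 2, 1, 1)
    return ((dx - id_x) ** 2).mean() + ((dy - id_y) ** 2).mean()
```

In an end-to-end setup this regularizer would simply be added to the image similarity term and backpropagated through the whole displacement-prediction stack.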