Further development of attention maps; no weight decay for 1D parameters #99
