### Description

The docstring at https://github.com/google/trax/blob/master/trax/layers/attention.py#L330C1-L342C23 is inconsistent with the function's behavior. It states that the function "Returns attention-computed per-head activations and unchanged mask.", but the function returns only the activations.
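
To illustrate the kind of mismatch being reported, here is a minimal sketch in plain Python. It is not the Trax source: `per_head_attention_sketch`, its argument layout, and the toy inputs are all hypothetical stand-ins. Only the quoted docstring sentence is taken from the linked code. The sketch shows a function whose docstring promises activations plus the mask while the body returns activations alone.

```python
import math


def per_head_attention_sketch(queries, keys, values, mask):
    """Returns attention-computed per-head activations and unchanged mask."""
    # Hypothetical stand-in for the linked function, not the Trax code.
    depth = len(queries[0])
    activations = []
    for query, mask_row in zip(queries, mask):
        # Scaled dot-product scores; masked-out positions get a large negative value.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(depth) if keep else -1e9
            for key, keep in zip(keys, mask_row)
        ]
        # Numerically stable softmax over the scores.
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        # Attention-weighted sum of the values.
        activations.append([
            sum(w / total * value[j] for w, value in zip(weights, values))
            for j in range(len(values[0]))
        ])
    # Only the activations are returned; the mask is not part of the result,
    # which contradicts the docstring above.
    return activations


queries = [[1.0, 0.0], [0.0, 1.0]]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[1.0, 2.0], [3.0, 4.0]]
mask = [[True, False], [True, True]]

result = per_head_attention_sketch(queries, keys, values, mask)
# A caller trusting the docstring would expect an (activations, mask) pair.
returns_pair = isinstance(result, tuple) and len(result) == 2
print(returns_pair)
```

A caller who relies on the docstring and unpacks two values from the result would get something other than the mask. Either the docstring or the return statement needs to change so the two agree.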