I saw the term "rich semantics" in WavTokenizer. What does "rich semantics" mean here? Does it refer to an ASR probe, an emotion probe, or something else?