What is normalization in the context of the softmax function? — LLM Research | Unlo