What is multi-head attention in the Transformer architecture?
