Multi-Head Attention

19/03/2024.

Expanding on the concept of scaled dot-product attention, Vaswani et al. proposed the multi-head attention mechanism.
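As a rough illustration of the idea, the sketch below runs several scaled dot-product attention heads in parallel, each on its own learned projection of the input, then concatenates their outputs and applies a final projection. It uses plain Python lists so it runs without dependencies. The helper names (`matmul`, `attention`, `multi_head_attention`, `random_matrix`), the layout of `heads` as weight triples, and the random weights are assumptions made for this example, not code from the paper or from any library.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q, k, v):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v)


def multi_head_attention(x, heads, w_o):
    """Self-attention with several heads.

    heads is a list of (w_q, w_k, w_v) weight triples, one per head
    (a hypothetical layout chosen for this sketch).
    """
    outputs = [
        attention(matmul(x, w_q), matmul(x, w_k), matmul(x, w_v))
        for w_q, w_k, w_v in heads
    ]
    # Concatenate the head outputs along the feature dimension.
    concat = [sum((out[i] for out in outputs), []) for i in range(len(x))]
    # Final output projection back to the model dimension.
    return matmul(concat, w_o)


def random_matrix(rows, cols, rng):
    """Random weights in [-1, 1]; stands in for learned parameters."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
seq_len, d_model, num_heads = 4, 8, 2
d_head = d_model // num_heads  # each head works in a smaller subspace

x = random_matrix(seq_len, d_model, rng)
heads = [
    tuple(random_matrix(d_model, d_head, rng) for _ in range(3))
    for _ in range(num_heads)
]
w_o = random_matrix(num_heads * d_head, d_model, rng)

out = multi_head_attention(x, heads, w_o)
print(len(out), len(out[0]))  # 4 8
```

Splitting the model dimension across heads (`d_head = d_model // num_heads`) keeps the total cost close to that of a single full-width head, while letting each head attend to the sequence in a different projected subspace.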
