Transformer

Attention Is All You Need

Paper: [1706.03762] Attention Is All You Need

Core contributions: the Transformer architecture and the self-attention mechanism.

Sequence transduction model

Previous implementations:

Innovations of the Transformer architecture:

Semantic information, positional information, and contextual information (QKV)
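
To make the Q/K/V idea concrete, here is a minimal single-head sketch of scaled dot-product self-attention, softmax(QK^T / sqrt(d_k)) V, in dependency-free Python. The helper names (`matmul`, `transpose`, `self_attention`), the toy input, and the identity projection matrices are assumptions chosen for illustration; a real model learns the projections and uses multiple heads.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    """Swap rows and columns of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention.

    x: (seq_len, d_model); w_q, w_k: (d_model, d_k); w_v: (d_model, d_v).
    Returns the attended output and the attention weights.
    """
    q = matmul(x, w_q)  # queries: what each position is looking for
    k = matmul(x, w_k)  # keys: what each position offers for matching
    v = matmul(x, w_v)  # values: the content that gets mixed together
    d_k = len(k[0])
    scores = matmul(q, transpose(k))
    # Scale by sqrt(d_k), then normalize each row into a distribution.
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights

# Toy input: 3 tokens with d_model = 2; identity projections are an
# illustrative assumption, not learned parameters.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
output, weights = self_attention(x, identity, identity, identity)
```

Each row of `weights` sums to 1, so every output vector is a weighted average of the value vectors; this is how a token's representation picks up context from the rest of the sequence.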

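BERT uses a different positional encoding: it learns its position embeddings, whereas the original Transformer uses fixed sinusoidal encodings.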
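
For comparison with BERT, the fixed sinusoidal encoding from the original Transformer paper, PE(pos, 2i) = sin(pos / 10000^(2i/d_model)) and PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model)), can be sketched as follows. The function name and the toy sizes are assumptions for illustration. BERT instead trains a position embedding table together with the rest of the model, so it has no closed-form formula to reproduce here.

```python
import math

def sinusoidal_positional_encoding(max_len, d_model):
    """Return a (max_len, d_model) table of fixed sinusoidal encodings."""
    pe = [[0.0] * d_model for _ in range(max_len)]
    for pos in range(max_len):
        # i walks over the even dimensions, i.e. i plays the role of 2i
        # in the formula, so the exponent is i / d_model.
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)          # even dimension
            if i + 1 < d_model:
                pe[pos][i + 1] = math.cos(angle)  # odd dimension
    return pe

# Toy sizes chosen for illustration only.
pe = sinusoidal_positional_encoding(max_len=4, d_model=6)
```

The resulting rows are added to the token embeddings, so each input vector carries both semantic and positional information before attention is applied.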