transformer

[Attention Is All You Need](https://arxiv.org/pdf/1706.03762.pdf)
2021-05-01
6 min read
[BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding](https://arxiv.org/pdf/1810.04805.pdf)
2021-04-20
3 min read