Transformer

[Pre-training of Deep Bidirectional Transformers for Language Understanding] BERT paper review
https://arxiv.org/abs/1810.04805
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
"We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models, BERT is designed to …"

[Transformer: Attention Is All You Need] paper review
Paper: Attention Is All You Need
"The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new …"
Abstract: At the time the paper was published, the dominant *sequence transduction models were complex recurrent (RNN) or conv… architectures comprising an encoder and a decoder.
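Both previews center on the attention mechanism, so here is a minimal NumPy sketch of scaled dot-product attention, Attention(Q, K, V) = softmax(QKᵀ/√d_k)·V, as defined in the Transformer paper. The function and variable names are illustrative, not taken from either paper's code.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Illustrative sketch: Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of each query to each key
    # Numerically stable row-wise softmax over the key dimension.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V  # each output is a weighted sum of the values

# Toy usage: 3 query positions, 4 key/value positions, dimension 8.
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
print(scaled_dot_product_attention(Q, K, V).shape)  # (3, 8)
```

The 1/√d_k scaling keeps the dot products from growing with dimension, which would otherwise push the softmax into regions with vanishing gradients.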