
nlp

Namu Wiki section-by-section summarization using Selenium and Kobart-summarization

import torch
from transformers import PreTrainedTokenizerFast
from transformers import BartForConditionalGeneration

def model_load(model_name):
    tokenizer = PreTrainedTokenizerFast.from_pretrained(f"{model_name}/kobart-summarization")
    model = BartForConditionalGeneration.from_pretrained(f"{model_name}/kobart-summarization")
    return model, tokenizer

def preprocessing(text):
    text = text.replace('[[0..
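The excerpt above cuts off before the generation step. Below is a minimal sketch of how the loaded KoBART checkpoint might be used to summarize one scraped section, assuming the public gogamza/kobart-summarization checkpoint on Hugging Face; the summarize() helper and the generation settings are illustrative, not the post's actual code.

from transformers import PreTrainedTokenizerFast, BartForConditionalGeneration

def summarize(text, model_name="gogamza"):
    # Same loading pattern as model_load() in the excerpt above.
    tokenizer = PreTrainedTokenizerFast.from_pretrained(f"{model_name}/kobart-summarization")
    model = BartForConditionalGeneration.from_pretrained(f"{model_name}/kobart-summarization")
    # Encode one wiki section, truncating to the model's input limit.
    input_ids = tokenizer.encode(text, return_tensors="pt", truncation=True, max_length=1024)
    # Beam search tends to read better than greedy decoding for Korean summaries.
    summary_ids = model.generate(input_ids, num_beams=4, max_length=128,
                                 eos_token_id=tokenizer.eos_token_id)
    return tokenizer.decode(summary_ids[0], skip_special_tokens=True)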
[Pre-training of Deep Bidirectional Transformers for Language Understanding] BERT paper review
https://arxiv.org/abs/1810.04805
We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models, BERT is designed to..
[NEURAL MACHINE TRANSLATION BY JOINTLY LEARNING TO ALIGN AND TRANSLATE] Attention paper review
https://arxiv.org/abs/1409.0473
Neural machine translation is a recently proposed approach to machine translation. Unlike the traditional statistical machine translation, the neural machine translation aims at building a single neural network tha..
[Word2Vec] CBOW, Skip-gram paper review
[Efficient Estimation of Word Representations in Vector Space]
We propose two novel model architectures for computing continuous vector representations of words from very large data sets. The quality of these representations is measured in a word similarity task, and the results are compared to the previously best per.. Introduc..