kiyoungkim1 / LMkor
Pretrained Language Models for Korean
License: Apache License 2.0
from transformers import BertTokenizerFast, GPT2LMHeadModel

# The Korean GPT-3-style model reuses a BERT tokenizer, which prepends [CLS]; strip it.
tokenizer_gpt3 = BertTokenizerFast.from_pretrained("kykim/gpt3-kor-small_based_on_gpt2")
input_ids = tokenizer_gpt3.encode("text to tokenize")[1:]  # remove the [CLS] token
model_gpt3 = GPT2LMHeadModel.from_pretrained("kykim/gpt3-kor-small_based_on_gpt2")
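A minimal generation sketch follows for context; it assumes the standard transformers generate API and an illustrative Korean prompt, and is not code from this repo:

import torch

# Hedged usage sketch: encode a prompt (dropping [CLS]), then sample a continuation.
prompt = "한국어 모델을 공유합니다."  # illustrative prompt
prompt_ids = torch.tensor([tokenizer_gpt3.encode(prompt)[1:]])
output = model_gpt3.generate(prompt_ids, max_length=64, do_sample=True, top_p=0.92)
print(tokenizer_gpt3.decode(output[0], skip_special_tokens=True))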
How do I use this?
When the document is long it fails to run — how can I control the input length?
Should I make the change in bertshared_summarization.py?
Traceback (most recent call last):
File "c:/python/글줄이기.py", line 5, in
summarize(text)
File "c:\python\LMkor\examples\bertshared_summarization.py", line 21, in call
max_length=max_length
File "C:\Users\qusdb\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\autograd\grad_mode.py", line 15, in decorate_context
return func(*args, **kwargs)
File "C:\Users\qusdb\AppData\Local\Programs\Python\Python37\lib\site-packages\transformers\generation_utils.py", line 922, in generate
model_kwargs = self._prepare_encoder_decoder_kwargs_for_generation(input_ids, model_kwargs)
File "C:\Users\qusdb\AppData\Local\Programs\Python\Python37\lib\site-packages\transformers\generation_utils.py", line 417, in _prepare_encoder_decoder_kwargs_for_generation
model_kwargs["encoder_outputs"]: ModelOutput = encoder(input_ids, return_dict=True, **encoder_kwargs)
File "C:\Users\qusdb\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\modules\module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "C:\Users\qusdb\AppData\Local\Programs\Python\Python37\lib\site-packages\transformers\models\bert\modeling_bert.py", line 957, in forward
buffered_token_type_ids_expanded = buffered_token_type_ids.expand(batch_size, seq_length)
RuntimeError: The expanded size of the tensor (772) must match the existing size (512) at non-singleton dimension 1. Target sizes: [1, 772]. Tensor sizes: [1, 512]
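The error says the BERT encoder's position embeddings are capped at 512 tokens while the tokenized document has 772. A minimal sketch of a fix follows, assuming the summarizer tokenizes with the repo's BERT tokenizer; tokenizer and model here are placeholders for whatever bertshared_summarization.py constructs:

# Hedged sketch: truncate the document to the encoder's 512-token limit before generate().
inputs = tokenizer(text, max_length=512, truncation=True, return_tensors="pt")
summary_ids = model.generate(inputs["input_ids"], max_length=128)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))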
As the screenshot above shows, even when I load_state_dict from a checkpoint whose encoder and decoder embedding layers have different weights,
the decoder embedding layer's weight ends up in both the encoder and the decoder.
So I tried copying the values directly into the encoder with
model.state_dict()['encoder.embeddings.word_embeddings.weight'].copy_(ckpt['state_dict']['encoder.embeddings.word_embeddings.weight'])
but then both the encoder and decoder embedding layers end up holding only the encoder embedding values.
I don't know whether this is intentional, but my model was trained so that the two parts have different values, and I'd like to load each one separately. Is there a way to do this?
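The symptom suggests the two embedding tables are the same Parameter object (tied), so copying into one writes into both. One route is to reload with the config flag tie_encoder_decoder=False; another is to untie manually, as in the hedged sketch below. The module paths assume a bert2bert EncoderDecoderModel, and the decoder checkpoint key is assumed to mirror the encoder one from the question:

import torch
import torch.nn as nn

# Hedged sketch: give the decoder its own embedding table so the two
# parameters are distinct objects, then copy each checkpoint tensor in.
emb = model.decoder.bert.embeddings.word_embeddings
model.decoder.bert.embeddings.word_embeddings = nn.Embedding.from_pretrained(
    emb.weight.detach().clone(), freeze=False)
with torch.no_grad():
    model.encoder.embeddings.word_embeddings.weight.copy_(
        ckpt["state_dict"]["encoder.embeddings.word_embeddings.weight"])
    model.decoder.bert.embeddings.word_embeddings.weight.copy_(
        ckpt["state_dict"]["decoder.embeddings.word_embeddings.weight"])  # assumed key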
Hello, thanks for the repo! Is the pretraining dataset available somewhere?
Hello, thank you for releasing such a good model.
Using the mask_prediction function you provided, I built a model that predicts the sentence-final ending of a Korean sentence.
I have a question.
Thank you for the great model.
Sincerely, 김병준.
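For reference, the idea behind mask_prediction can be sketched with the standard transformers fill-mask pipeline; this is a hedged reimplementation, not the repo's own helper, and the prompt is illustrative:

from transformers import pipeline

# Hedged sketch: predict the masked sentence-final word with the repo's Korean BERT.
fill = pipeline("fill-mask", model="kykim/bert-kor-base", tokenizer="kykim/bert-kor-base")
for pred in fill("오늘 날씨가 정말 [MASK]"):
    print(pred["token_str"], pred["score"])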