
a_journey_into_math_of_ml's People

Contributors

aespresso

a_journey_into_math_of_ml's Issues

random_char function

Hello, in 04_transformer_tutorial_2nd_part/BERT_tutorial/dataset/wiki_dataset.py, line 92 should read prob < 0.15, following the original paper: "The training data generator chooses 15% of the token positions at random for prediction."

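The vector corresponding to [cls] in sentiment analysis

"Recall that for Next Sentence Prediction during BERT training, we take the vector corresponding to [cls], map it to a single value, and activate it with a sigmoid function." How should I understand this statement? Why is the vector corresponding to [cls] used as the basis for classification? Thanks.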
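One common reading, offered as a sketch rather than as the repository's actual code: self-attention lets the [cls] position attend to every other token, and Next Sentence Prediction trains that position to summarize the whole input, so its vector is a convenient fixed-size input for a classifier. The names `classify_from_cls`, `weight`, and `bias` below are hypothetical, and the numbers are made up purely for illustration.

```python
import math


def sigmoid(x):
    """Squash a real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def classify_from_cls(hidden_states, weight, bias):
    """Map the [cls] vector to one probability.

    hidden_states: one vector per token; position 0 is assumed to be [cls].
    weight, bias: parameters of a hypothetical linear layer (hidden_size -> 1).
    """
    cls_vector = hidden_states[0]  # only the first position is used
    logit = sum(w * h for w, h in zip(weight, cls_vector)) + bias
    return sigmoid(logit)


# Toy encoder output for a 3-token input with hidden size 3 (illustrative values).
hidden_states = [[0.5, -1.0, 2.0], [0.1, 0.2, 0.3], [0.0, 0.0, 0.0]]
weight = [1.0, 0.5, 0.25]
bias = 0.0

prob = classify_from_cls(hidden_states, weight, bias)
print(round(prob, 4))
```

The other token vectors are ignored by the classifier head here; they influence the result only through what the encoder has already mixed into the [cls] position.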

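BERT pre-training

Which script did you use for BERT pre-training: training/pretraing.py or BERT_Training.py?

I have read both training scripts and some parts are unclear to me. Why is the initial training epoch 5? If I set it to 0, there is no saved model for
trainer = init_trainer(config["lr"], load_model=TRUE) to load, so how does the loading work?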

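Using the model for paraphrase (synonymous text) classification

Hello, I would like to use your pre-trained model for a binary paraphrase classification task with data in the format (sentance1,sentance2,label). Is the model suitable for this? Also, what extra work is needed on the data-loading side? Thanks!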

Error when running Sentiment_Training

D:\Anaconda\envs\torch\python.exe D:/NLP/code/BERT_tutorial/Sentiment_Training.py
Traceback (most recent call last):
  File "D:/NLP/code/BERT_tutorial/Sentiment_Training.py", line 4, in <module>
    from dataset.sentiment_dataset_v2 import CLSDataset
  File "D:\NLP\code\BERT_tutorial\dataset\sentiment_dataset_v2.py", line 7, in <module>
    from sklearn.utils import shuffle
  File "D:\Anaconda\envs\torch\lib\site-packages\sklearn\__init__.py", line 82, in <module>
    from .base import clone
  File "D:\Anaconda\envs\torch\lib\site-packages\sklearn\base.py", line 20, in <module>
    from .utils import _IS_32BIT
  File "D:\Anaconda\envs\torch\lib\site-packages\sklearn\utils\__init__.py", line 20, in <module>
    from scipy.sparse import issparse
  File "D:\Anaconda\envs\torch\lib\site-packages\scipy\sparse\__init__.py", line 230, in <module>
    from .base import *
  File "D:\Anaconda\envs\torch\lib\site-packages\scipy\sparse\base.py", line 9, in <module>
    from scipy._lib._numpy_compat import broadcast_to
  File "D:\Anaconda\envs\torch\lib\site-packages\scipy\_lib\_numpy_compat.py", line 16, in <module>
    _assert_warns = np.testing.assert_warns
  File "D:\Anaconda\envs\torch\lib\site-packages\numpy\__init__.py", line 213, in __getattr__
    import numpy.testing as testing
  File "D:\Anaconda\envs\torch\lib\site-packages\numpy\testing\__init__.py", line 12, in <module>
    from ._private.utils import *
  File "D:\Anaconda\envs\torch\lib\site-packages\numpy\testing\_private\utils.py", line 57, in <module>
    HAS_LAPACK64 = hasattr(numpy.__config__, 'lapack_ilp64_opt_info')
  File "D:\Anaconda\envs\torch\lib\site-packages\numpy\__init__.py", line 220, in __getattr__
    "{!r}".format(__name__, attr))
AttributeError: module 'numpy' has no attribute '__config__'

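I'm not sure whether this happens because my package versions differ from yours. Could you share the version of each package you used?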
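When comparing environments, it helps to list the installed versions of the packages that appear in the traceback. The snippet below is a minimal sketch using only the standard library (`importlib.metadata`, Python 3.8+); the helper name `report_versions` and the package list are illustrative choices, not something taken from the repository.

```python
from importlib import metadata


def report_versions(packages):
    """Return {package name: installed version, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Packages that show up in the traceback, plus torch, which the tutorial depends on.
versions = report_versions(["numpy", "scipy", "scikit-learn", "torch"])
for name, version in versions.items():
    print(f"{name}: {version or 'not installed'}")
```

Posting this output alongside the traceback makes it easier to tell whether the failure comes from a version mismatch or from a damaged numpy installation.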

Error when running on GPU

    result = self.forward(*input, **kwargs)
  File "/a8root/bert_model.py", line 287, in forward
    hidden_states, attention_matrices = layer_module(hidden_states, attention_mask, get_attention_matrices=get_attention_matrices)
  File "/usr/local/python3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/a8roo/bert_model.py", line 264, in forward
    attention_output, attention_matrices = self.attention(hidden_states, attention_mask, get_attention_matrices=get_attention_matrices)
  File "/usr/local/python3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/a8root/bert_model.py", line 222, in forward
    attention_output = self.output(self_output, input_tensor)
  File "/usr/local/python3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/a8root/bert_model.py", line 209, in forward
    hidden_states = self.LayerNorm(hidden_states + input_tensor)
  File "/usr/local/python3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/a8root/bert_model.py", line 194, in forward
    x = (x - u) / torch.sqrt(s + self.variance_epsilon)
RuntimeError: CUDA error: device-side assert triggered

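Both pretraining.py and BERT_Training.py fail with this error. From what I have read, it is related to the dimensions of the input passed to the forward function, but I am not sure of the exact cause.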
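A possible lead, not a confirmed diagnosis: in PyTorch, "device-side assert triggered" is frequently caused by an index that is out of range for an embedding table (a token id outside the vocabulary, or a sequence longer than the position embedding). Because CUDA reports errors asynchronously, the line shown in the traceback may not be where the problem started; rerunning on CPU, or with the environment variable CUDA_LAUNCH_BLOCKING=1, usually gives a more precise message. The sketch below checks a batch in plain Python before it reaches the model; `check_batch`, `vocab_size`, and `max_positions` are hypothetical names and values.

```python
def find_out_of_range_ids(token_ids, vocab_size):
    """Return (position, token_id) pairs whose id is outside [0, vocab_size)."""
    return [(pos, tok) for pos, tok in enumerate(token_ids)
            if not 0 <= tok < vocab_size]


def check_batch(batch, vocab_size, max_positions):
    """Collect human-readable problems for a batch of token-id sequences."""
    problems = []
    for row, token_ids in enumerate(batch):
        if len(token_ids) > max_positions:
            problems.append(
                f"row {row}: length {len(token_ids)} exceeds max_positions {max_positions}")
        for pos, tok in find_out_of_range_ids(token_ids, vocab_size):
            problems.append(
                f"row {row}, position {pos}: token id {tok} outside [0, {vocab_size})")
    return problems


# Illustrative batch: the second row is too long and contains id 10 with vocab_size 10.
batch = [[1, 5, 9], [2, 10, 3, 4]]
problems = check_batch(batch, vocab_size=10, max_positions=3)
for line in problems:
    print(line)
```

If a check like this reports nothing, the same idea can be applied to the labels passed to the loss function, which is another place where out-of-range indices tend to appear.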

About the visualization

How was the animated 3D plot at the beginning of a_journey_into_math_of_ml/02_adaboost/adaboost.ipynb made? Running your code, I can only generate a static 3D plot. Thanks.

Problem loading the model for validation

After training a model with the v2 version, loading it fails with:
RuntimeError: Error(s) in loading state_dict for Bert_Sentiment_Analysis:
Unexpected key(s) in state_dict: "dense.weight", "dense.bias".
What do I need to change? Thanks.

Question about Masked Language Model code details

def random_char(self, sentence):
    char_tokens_ = list(sentence)
    char_tokens = self.tokenize_char(char_tokens_)

    output_label = []
    for i, token in enumerate(char_tokens):
        prob = random.random()
        if prob < 0.30:
            prob /= 0.30
            output_label.append(char_tokens[i])
            # 80% randomly change token to mask token
            if prob < 0.8:
                char_tokens[i] = self.mask_index
            # 10% randomly change token to random token
            elif prob < 0.9:
                char_tokens[i] = random.randrange(len(self.word2idx))
        else:
            output_label.append(0)
            
    return char_tokens, output_label

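I don't fully understand this code. The paper says that 15% of the tokens in a sentence are replaced and only the masked ones are predicted, so why is the threshold 0.3 here? Also, why does the else branch call output_label.append(0)?

Could someone explain? Many thanks!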
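For comparison, here is a standalone sketch of the masking procedure with the paper's 15% selection rate. It is not the repository's code: `mask_tokens`, `select_prob`, and the toy ids are hypothetical, and a second random draw replaces the prob /= 0.30 rescaling (both give the 80/10/10 split). The 0 appended for unselected positions is most likely an "ignore" label; that only works if index 0 is reserved (for example for padding) and the loss is configured to skip it, which is an assumption here.

```python
import random


def mask_tokens(token_ids, mask_index, vocab_size, select_prob=0.15, rng=random):
    """Return (masked_ids, labels) for masked-language-model training.

    labels[i] is the original token where position i was selected for
    prediction, and 0 elsewhere (assumed to be ignored by the loss).
    """
    masked = list(token_ids)
    labels = []
    for i, token in enumerate(token_ids):
        if rng.random() < select_prob:
            labels.append(token)       # the model must predict the original token
            roll = rng.random()
            if roll < 0.8:             # 80%: replace with the mask token
                masked[i] = mask_index
            elif roll < 0.9:           # 10%: replace with a random token
                masked[i] = rng.randrange(vocab_size)
            # remaining 10%: keep the original token unchanged
        else:
            labels.append(0)           # not selected: no prediction at this position
    return masked, labels


# Toy example: ids 5..104, with a made-up mask id of 4 and vocabulary size of 105.
tokens = list(range(5, 105))
masked, labels = mask_tokens(tokens, mask_index=4, vocab_size=105, rng=random.Random(0))
print(sum(1 for label in labels if label != 0), "positions selected for prediction")
```

Raising select_prob to 0.30 gives the model twice as many prediction targets per sentence; whether that was a deliberate choice in the tutorial is something only the author can confirm.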

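Hello

Hello, I am a university student working on a binary classification task: deciding whether a news item is real or fake. Most of the datasets I have found are in English. Could you recommend a BERT model pre-trained on English?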

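Self-trained sentiment analysis model fails to load in the inference code

Hello, I learned a lot from your BERT videos. I have a question: running Sentiment_Inference.py with your trained model sentiment_model_epoch.418 works fine, but with a model I trained myself I get Unexpected key(s) in state_dict: "dense.weight", "dense.bias". If I set strict to False in load_state_dict(), it runs, but the predictions for both positive and negative samples are all around 0.25, which is clearly wrong. I checked all_label and all_prediction during training and they look normal, so I don't know how to resolve this.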
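One way to narrow this down is to compare the parameter names the inference model expects with the names stored in the checkpoint. The sketch below works on plain lists of names; in real PyTorch code they would come from model.state_dict().keys() and from the keys of the loaded checkpoint. Only "dense.weight" and "dense.bias" are taken from the error message; every other key name is hypothetical. A plausible explanation, stated as an assumption: with strict=False the unexpected keys are skipped silently, so the classification layer of the inference model keeps its random initialization, which would produce near-constant predictions.

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Report which parameter names differ between a model and a checkpoint."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    return {
        "missing": sorted(model_keys - checkpoint_keys),     # model expects, checkpoint lacks
        "unexpected": sorted(checkpoint_keys - model_keys),  # checkpoint has, model lacks
    }


# Hypothetical names for the inference model's parameters.
model_keys = ["bert.embeddings.weight", "final_dense.weight", "final_dense.bias"]
# Names in the self-trained checkpoint; the two dense.* keys come from the error message.
checkpoint_keys = ["bert.embeddings.weight", "dense.weight", "dense.bias"]

report = diff_state_dict_keys(model_keys, checkpoint_keys)
print("missing:", report["missing"])
print("unexpected:", report["unexpected"])
```

If both lists are non-empty, the training and inference scripts probably build different versions of the model class, and using the same class definition in both is a more reliable fix than strict=False.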

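About epochs and a runtime error

How many epochs in total did you train the hotel review sentiment model to reach over 90% accuracy?
Also, when I validate with my trained epoch checkpoint, I get this error:
RuntimeError: Error(s) in loading state_dict for Bert_Sentiment_Analysis:
Unexpected key(s) in state_dict: "dense.weight", "dense.bias".
What causes this? Thank you very much for your help.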
