llm-travel's People

Contributors

glanvery

llm-travel's Issues

Continued pre-training

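Hello, senior labmate. A future junior labmate here with a question.
[image]

I would like to ask what the logic of this continued pre-training is. My understanding is as follows:
input: ABCDEFG
label: ABCDEFG (shifted by one position relative to the input)
In other words, the model is given a start token and predicts the first token; then it is given the start token plus the first token and predicts the next one, and so on until it predicts the end token.
I am not sure whether my understanding is correct. Also, how is this implemented at the code level? Looking forward to your reply.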
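
The shifted-label idea described in the question can be sketched in plain Python. This is only an illustration under stated assumptions, not the repository's training code: the helper `next_token_pairs` and the marker strings `<s>` and `</s>` are hypothetical names chosen for the example.

```python
def next_token_pairs(tokens, bos="<s>", eos="</s>"):
    """Build inputs, shifted labels, and (context, target) pairs.

    bos/eos are placeholder start and end markers (assumed names).
    """
    seq = [bos] + list(tokens) + [eos]
    inputs = seq[:-1]  # what the model reads
    labels = seq[1:]   # the same sequence shifted left by one position
    # Position i is trained to predict labels[i] from inputs[0..i].
    pairs = [(inputs[: i + 1], labels[i]) for i in range(len(inputs))]
    return inputs, labels, pairs


inputs, labels, pairs = next_token_pairs("ABCDEFG")
for context, target in pairs:
    print("".join(context), "->", target)
```

During training the positions are normally not fed one at a time: the whole sequence goes through the model once, a causal attention mask keeps each position from seeing later tokens, and the loss is averaged over all positions (teacher forcing). In Hugging Face Transformers causal language models, `labels` is usually passed as a copy of `input_ids` and the one-position shift is applied inside the model's loss computation.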

Probability of generated tokens

Hello, I have a question: can the code below compute the probability of the generated tokens?

import torch

def cal_prob(self, input_text):
    # Convert input_text into the T5 model's input format
    encodings = self.tokenizer(input_text, max_length=512, return_tensors="pt")
    encodings = {k: v.to(self.device) for k, v in encodings.items()}

    # Use the T5 model's actual output as target_text
    target_text = self.predict(input_text)

    # Convert target_text into the T5 model's output (label) format
    labels = self.tokenizer.encode(target_text, return_tensors="pt", max_length=512, padding="max_length").to(self.device)

    # Build decoder_input_ids from labels: prepend a 0 and drop the last token so the length matches labels
    decoder_input_ids = torch.cat([torch.zeros_like(labels[:, :1]), labels[:, :-1]], dim=-1)

    # Compute the probability of the generated text
    # outputs contains loss, logits, past_key_values, encoder_last_hidden_state
    outputs = self.model(**encodings, labels=labels, decoder_input_ids=decoder_input_ids)
    logits = outputs.logits
    print(logits.size())

    return logits
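
One point worth noting about the function as posted: it returns raw logits, which are unnormalised scores rather than probabilities. A minimal sketch of the remaining step follows, in plain Python so that it runs without a model. The helpers `token_probs` and `sequence_log_prob` are hypothetical names, and `pad_id=0` is an assumption that matches the T5 tokenizer's padding id. In PyTorch the same computation would typically be `torch.log_softmax(logits, dim=-1)` followed by `gather` on the label ids.

```python
import math


def token_probs(logits, label_ids, pad_id=0):
    """Probability assigned to each non-padding label token.

    logits: T rows, each a list of V unnormalised scores.
    label_ids: T target token ids.
    """
    probs = []
    for row, label in zip(logits, label_ids):
        if label == pad_id:
            continue  # padding positions are not part of the generated text
        m = max(row)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        probs.append(math.exp(row[label] - log_z))  # softmax value at the label
    return probs


def sequence_log_prob(logits, label_ids, pad_id=0):
    """Log-probability of the whole sequence: sum of per-token log-probs."""
    return sum(math.log(p) for p in token_probs(logits, label_ids, pad_id))


# Toy data: 3 decoding steps over a 3-token vocabulary; the last label is padding.
logits = [[0.0, 0.0, 0.0], [0.0, math.log(3.0), 0.0], [5.0, 1.0, 1.0]]
label_ids = [2, 1, 0]
probs = token_probs(logits, label_ids)
total = sequence_log_prob(logits, label_ids)
```

Because the labels in the posted function are padded to `max_length`, most positions would be padding, so masking them out as above (or dropping `padding="max_length"` for a single sequence) keeps them from distorting the result.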

