
Comments (3)

xiscoding avatar xiscoding commented on June 12, 2024

JANK SOLUTION 1

I added these code blocks; it only runs inference on one prompt at a time. Also, after retraining on Alpha-CoT (it is a poor-quality dataset), my results are pretty incoherent. What dataset would you retrain on?

// 'checkpoint' and 'input_sequence' are required args
    if (argc < 3) {
        fprintf(stderr, "Usage: %s <checkpoint_file> <input_sequence> [temperature] [seed]\n", argv[0]);
        return 1;
    }
    checkpoint = argv[1];
    char *input_sequence_str = argv[2];
    // temperature is optional
    if (argc >= 4) {
        temperature = atof(argv[3]);
    }
    // seed is optional; fall back to the current time
    if (argc >= 5) {
        unsigned int seed = (unsigned int)atoi(argv[4]);
        srand(seed);
    } else {
        srand((unsigned int)time(NULL));
    }

    // call a Python script to tokenize the input sequence
    char command[2048];
    snprintf(command, sizeof(command), "python pretokenize.py \"%s\"", input_sequence_str);
    FILE *pipe = popen(command, "r");
    if (!pipe) {
        fprintf(stderr, "Unable to open pipe!\n");
        return 1;
    }

    // read the tokenized input sequence from the pipe,
    // guarding against overflowing the fixed-size buffer
    int input_sequence[MAX_INPUT_LENGTH];
    int input_length = 0;
    while (input_length < MAX_INPUT_LENGTH &&
           fscanf(pipe, "%d", &input_sequence[input_length]) == 1) {
        input_length++;
    }
    pclose(pipe);
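For context, the pretokenize.py script invoked above isn't shown in this thread. A minimal, hypothetical sketch of its shape might look like the following; the real script would use the SentencePiece Llama tokenizer, and here raw UTF-8 byte values merely stand in for token ids:

```python
# Hypothetical sketch of pretokenize.py. The real script would use the
# SentencePiece Llama tokenizer; raw byte values stand in for token ids here.
import sys

def tokenize(text: str) -> list[int]:
    # stand-in "tokenizer": one token id per UTF-8 byte
    return list(text.encode("utf-8"))

if __name__ == "__main__":
    # print space-separated ids so the C side can read them with fscanf(pipe, "%d", ...)
    print(" ".join(str(t) for t in tokenize(sys.argv[1])))
```

The only contract the C code relies on is whitespace-separated decimal token ids on stdout, so any tokenizer that prints in that shape would work.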

Then, to initialize the model's state:

// initialize the model's state by feeding in the input sequence
    int next;
    int pos = 0;
    for (int i = 0; i < input_length; i++) {
        transformer(input_sequence[i], pos, &config, &state, &weights);
        pos++;
        next = input_sequence[i];  // track the last prompt token
    }

    // start generating the output sequence after the input sequence
    while (pos < config.seq_len) {

        // forward the transformer to get logits for the next token
        transformer(next, pos, &config, &state, &weights);

Then I changed the run_wrap file to work with input sequences. There isn't a run_wrap file anymore, so I'm not sure how they are decoding the tokens now. If you can improve this, please let me know.

EDIT: this probably works better:
#62 (comment)

from llama2.c.

StudyingLover avatar StudyingLover commented on June 12, 2024


I don't have a custom dataset; I want to use an open model and run it directly. Your code is amazing, and I will try it if it's merged.


kroggen avatar kroggen commented on June 12, 2024

Now it supports prompts; added in this commit.

Usage details are in the README.

