Comments (3)
JANK SOLUTION 1
I added these code blocks. It only runs inference on one prompt at a time. And after retraining on Alpha-CoT (it's a poor dataset), my results are pretty incoherent. What dataset would you retrain on?
// 'checkpoint' and 'input_sequence' are required args
if (argc < 3) {
    printf("Usage: %s <checkpoint_file> <input_sequence> [temperature] [seed]\n", argv[0]);
    return 1;
}
checkpoint = argv[1];
char *input_sequence_str = argv[2];
// temperature is optional
if (argc >= 4) {
    temperature = atof(argv[3]);
}
// seed is optional; default to the current time
if (argc >= 5) {
    unsigned int seed = atoi(argv[4]);
    srand(seed);
} else {
    time_t current_time;
    time(&current_time);
    srand((unsigned int)current_time);
}
// call a Python script to tokenize the input sequence
char command[2048];
snprintf(command, sizeof(command), "python pretokenize.py \"%s\"", input_sequence_str);
FILE *pipe = popen(command, "r");
if (!pipe) {
    printf("Unable to open pipe!\n");
    return 1;
}
// read the tokenized input sequence from the pipe, bounds-checked
int input_sequence[MAX_INPUT_LENGTH];
int input_length = 0;
while (input_length < MAX_INPUT_LENGTH
       && fscanf(pipe, "%d", &input_sequence[input_length]) == 1) {
    input_length++;
}
//printf("input_length: %d\n", input_length);
pclose(pipe);
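With the argument parsing above, an invocation would look like this (the checkpoint path and prompt here are just placeholders):

./run model.bin "Once upon a time" 0.9 1337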
Then, to initialize the model state with the prompt tokens:
// initialize the model's state with the prompt: feed every prompt token
// except the last, so the generation loop below can feed the last token
// itself and sample its continuation from the resulting logits
// (assumes at least one prompt token was read)
int next;
int pos = 0;
for (int i = 0; i < input_length - 1; i++) {
    transformer(input_sequence[i], pos, &config, &state, &weights);
    pos++;
}
next = input_sequence[input_length - 1]; // last prompt token seeds generation
// start generating the output sequence after the input sequence
while (pos < config.seq_len) {
    // forward the transformer to get logits for the next token
    transformer(next, pos, &config, &state, &weights);
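The snippet cuts off inside the while loop; the body continues essentially like the stock sampling loop in run.c. A sketch of the rest, assuming the argmax/softmax/sample helpers and the vocab table that run.c already has (names may differ between versions):

    // sample the next token from the logits the transformer just produced
    if (temperature == 0.0f) {
        // greedy argmax decoding
        next = argmax(state.logits, config.vocab_size);
    } else {
        // apply temperature, renormalize, then sample
        for (int q = 0; q < config.vocab_size; q++) {
            state.logits[q] /= temperature;
        }
        softmax(state.logits, config.vocab_size);
        next = sample(state.logits, config.vocab_size);
    }
    printf("%s", vocab[next]); // decode the token id through the vocab table
    fflush(stdout);
    pos++;
}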
Then I changed the run_wrap file to work with input sequences. There isn't a run_wrap file anymore, so I'm not sure how they are decoding the tokens now (my best guess is sketched below). If you make this better, please let me know.
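My guess is that decoding now happens directly in C by loading tokenizer.bin into a vocab table, roughly like this (a sketch assuming the simple length-prefixed format run.c used at the time; the file layout may have changed since):

// load the vocabulary from tokenizer.bin: for each token id,
// an int byte-length followed by the token's bytes
char **vocab = (char **)malloc(config.vocab_size * sizeof(char *));
FILE *file = fopen("tokenizer.bin", "rb");
if (!file) {
    printf("Unable to open tokenizer.bin!\n");
    return 1;
}
for (int i = 0; i < config.vocab_size; i++) {
    int len;
    if (fread(&len, sizeof(int), 1, file) != 1) { return 1; }
    vocab[i] = (char *)malloc(len + 1);
    if (fread(vocab[i], len, 1, file) != 1) { return 1; }
    vocab[i][len] = '\0'; // NUL-terminate so it can be printed with %s
}
fclose(file);
// a generated token id then decodes as: printf("%s", vocab[next]);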
EDIT: also, this probably works better: #62 (comment)
I don't have a custom dataset; I want to use an open model and run it directly. Your code is amazing, I will try it if it's merged.
Now it supports prompts. Added in this commit.
Usage details are in the README.
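For reference, usage around the time of that commit was positional, something like the line below (checkpoint, temperature, steps, prompt); check the README for the current invocation, since the CLI has changed over time:

./run model.bin 0.9 256 "your prompt here"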
Related Issues (20)
- runomp on Mac M1 Max is slower than runfast
- can the custom model in llama2.c format be exported to hf format?
- How to run interface on GPU
- Plans C
- Llama transformer walkthrough
- Significant Quality Degradation with q8 Quantization in Small Models
- Prefill Processing
- How to train a chat model
- -O3 does not apply auto-vectorization on X86-64 CPU
- ld: warning: ignoring duplicate libraries: '-lgcc'
- numpy llama2 for fun and learning
- Is it possible to use Orca2 with this code ?
- export does not seem to work?
- can i train on cpu
- support for SIMD in matmul might increase performance i think so...
- mfu calculation
- Training Tiny Stories: 'CUDA' -vs- 'MPS'
- HOw to add different coropus ?
- NanoGPT in c for inference
- Once upon a time, there was a little girl named Lily