withcatai / catai
Run AI ✨ assistant locally! with simple API for Node.js 🚀
Home Page: https://withcatai.github.io/catai/
License: MIT License
Describe the bug
I have set up catai and downloaded a model; the web interface opens up, and I see this in the console of the server:
new connection
but as soon as I type anything in the web interface, the circle starts spinning indefinitely and this pops up in the server log:
<end>
Desktop (please complete the following information):
P.S. I clearly have no idea what I'm doing, so the problem is likely on my side, but I don't know what to try.
Hi, I appreciate your work, but I'm having a hard time understanding the specific actions required from me to run this UI with CUDA support on Windows.
Did I get it right that I need to manually build node-llama-cpp with CUDA support and put it into node_modules?
It feels like a lot of pointless work, and I'm not sure how other people are doing it if it's not in the readme... Did I miss something?
I've tried adding gpuLayers to the config, but it looks like it's still just using the CPU... so there have to be some additional steps.
Please refer to the troubleshooting before opening an issue. You might find the solution there.
Describe the bug
Followed instructions; when catai up
is used, it only opens a page once, then the process exits.
Screenshots
$ catai up
CatAI client on http://127.0.0.1:3000
New connection
$ echo $?
0
Desktop (please complete the following information):
catai --version: 3.0.0
node --version: 20
Please refer to the troubleshooting before opening an issue. You might find the solution there.
Describe the bug
CatAI really doesn't like the browser being open when you start it. If you attempt to start CatAI with the browser open, it prints this out to the terminal:
..................../home/thesystemguy/.nvm/versions/node/v20.2.0/lib/node_modules/catai/node_modules/openurl/openurl.js:39
throw error;
^
Error: Gtk-Message: 00:49:11.085: Failed to load module "xapp-gtk3-module"
[2:2:0711/004911.245184:ERROR:nacl_fork_delegate_linux.cc(313)] Bad NaCl helper startup ack (0 bytes)
Gtk-Message: 00:49:11.272: Failed to load module "xapp-gtk3-module"
at Socket.<anonymous> (/home/thesystemguy/.nvm/versions/node/v20.2.0/lib/node_modules/catai/node_modules/openurl/openurl.js:35:25)
at Socket.emit (node:events:525:35)
at endReadableNT (node:internal/streams/readable:1359:12)
at process.processTicksAndRejections (node:internal/process/task_queues:82:21)
Node.js v18.16.0
at file:///home/thesystemguy/.nvm/versions/node/v20.2.0/lib/node_modules/catai/scripts/cli.js:69:27
exit code: 1
Neither a reinstallation nor a reboot fixes this issue. It cropped up after Google Chrome broke, Firefox assumed the role of default browser, and I changed the settings.
Screenshots
n/a
Desktop (please complete the following information):
OS: Linux Mint 21.1 Cinnamon, Linux 5.15.0-76-generic
Browser: Google Chrome
CatAI version 0.3.12
Node.js version v18.16.0
CPU: AMD Ryzen 5 5600H with Radeon Graphics
RAM: 30.7 GiB (512 MiB reserved to graphics chipset)
Hi, is there a way to customize the chat app using a system_prompt, like "You are a pirate and act like it; if the user says 'hello', you say 'that'..."?
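A system prompt is just instruction text prepended to the conversation before the model sees the user's message. A minimal sketch of the idea; the "### System/Human/Assistant" markers and the function below are illustrative, not CatAI's actual prompt template or configuration keys:

```javascript
// Minimal sketch of what a system prompt is: instruction text that is
// prepended to every conversation before the user's message.
// The section markers here are illustrative, not CatAI's real template.
function buildPrompt(systemPrompt, userMessage) {
  return (
    `### System:\n${systemPrompt}\n\n` +
    `### Human:\n${userMessage}\n\n` +
    `### Assistant:\n`
  );
}

// Usage: make the assistant roleplay a pirate.
const prompt = buildPrompt(
  "You are a pirate. Answer every question in pirate speak.",
  "hello"
);
```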
Use case: running catai instances on cloud VMs and accessing the UI over the network.
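For that use case the server would have to bind to all interfaces instead of loopback only. A sketch of the idea, assuming a hypothetical CATAI_ADDRESS environment variable (the logs elsewhere in this thread show CatAI listening on 127.0.0.1):

```javascript
// Sketch: to reach the UI from another machine, the HTTP server must
// listen on 0.0.0.0 (all interfaces) rather than the loopback-only
// 127.0.0.1. CATAI_ADDRESS is a hypothetical variable name.
function resolveBindAddress(env) {
  return env.CATAI_ADDRESS ?? "127.0.0.1"; // loopback unless overridden
}
```

Until something like this exists, an SSH tunnel works without any code changes: `ssh -L 3000:127.0.0.1:3000 user@vm`, then open http://127.0.0.1:3000 locally.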
Hi,
it would be good to have some kind of user mode and developer mode that can be toggled with an environment variable.
That way you have more parameters to choose from in developer mode, and when you're ready, you ship it in user mode with a simple interface.
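A sketch of the proposed toggle; CATAI_MODE and the parameter lists are assumptions for illustration, not existing CatAI configuration:

```javascript
// Sketch of the requested user/developer mode toggle.
// CATAI_MODE and the parameter names are illustrative assumptions.
function resolveMode(env) {
  // anything other than an explicit "developer" falls back to user mode
  return env.CATAI_MODE === "developer" ? "developer" : "user";
}

function visibleParams(mode) {
  const userParams = ["model", "temperature"]; // simple interface
  const devParams = [...userParams, "topK", "topP", "contextSize", "gpuLayers"];
  return mode === "developer" ? devParams : userParams;
}
```

A user would then run something like `CATAI_MODE=developer catai up` to expose the full parameter set.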
The command catai serve doesn't work.
Tested on two Ubuntu versions.
On Ubuntu 22.04.2 LTS:
catai serve
$ cd /home/noam/.nvm/versions/node/v18.12.1/lib/node_modules/catai
$ npm start
> [email protected] start
> node src/index.js
Listening on http://127.0.0.1:3000
/home/noam/.nvm/versions/node/v18.12.1/lib/node_modules/catai/node_modules/openurl/openurl.js:39
throw error;
^
Error: Gtk-Message: 16:22:50.858: Not loading module "atk-bridge": The functionality is provided by GTK natively. Please try to not load it.
at Socket.<anonymous> (/home/noam/.nvm/versions/node/v18.12.1/lib/node_modules/catai/node_modules/openurl/openurl.js:35:25)
at Socket.emit (node:events:525:35)
at endReadableNT (node:internal/streams/readable:1359:12)
at process.processTicksAndRejections (node:internal/process/task_queues:82:21)
Node.js v18.12.1
file:///home/noam/.nvm/versions/node/v18.12.1/lib/node_modules/catai/node_modules/zx/build/core.js:146
let output = new ProcessOutput(code, signal, stdout, stderr, combined, message);
^
ProcessOutput [Error]: /home/noam/.nvm/versions/node/v18.12.1/lib/node_modules/catai/node_modules/openurl/openurl.js:39
throw error;
^
Error: Gtk-Message: 16:22:50.858: Not loading module "atk-bridge": The functionality is provided by GTK natively. Please try to not load it.
at Socket.<anonymous> (/home/noam/.nvm/versions/node/v18.12.1/lib/node_modules/catai/node_modules/openurl/openurl.js:35:25)
at Socket.emit (node:events:525:35)
at endReadableNT (node:internal/streams/readable:1359:12)
at process.processTicksAndRejections (node:internal/process/task_queues:82:21)
Node.js v18.12.1
at file:///home/noam/.nvm/versions/node/v18.12.1/lib/node_modules/catai/scripts/cli.js:34:27
exit code: 1
at ChildProcess.<anonymous> (file:///home/noam/.nvm/versions/node/v18.12.1/lib/node_modules/catai/node_modules/zx/build/core.js:146:26)
at ChildProcess.emit (node:events:513:28)
at maybeClose (node:internal/child_process:1091:16)
at ChildProcess._handle.onexit (node:internal/child_process:302:5)
at Process.callbackTrampoline (node:internal/async_hooks:130:17) {
_code: 1,
_signal: null,
_stdout: '\n' +
'> [email protected] start\n' +
'> node src/index.js\n' +
'\n' +
'Listening on http://127.0.0.1:3000\n',
_stderr: '/home/noam/.nvm/versions/node/v18.12.1/lib/node_modules/catai/node_modules/openurl/openurl.js:39\n' +
' throw error;\n' +
' ^\n' +
'\n' +
'Error: Gtk-Message: 16:22:50.858: Not loading module "atk-bridge": The functionality is provided by GTK natively. Please try to not load it.\n' +
'\n' +
' at Socket.<anonymous> (/home/noam/.nvm/versions/node/v18.12.1/lib/node_modules/catai/node_modules/openurl/openurl.js:35:25)\n' +
' at Socket.emit (node:events:525:35)\n' +
' at endReadableNT (node:internal/streams/readable:1359:12)\n' +
' at process.processTicksAndRejections (node:internal/process/task_queues:82:21)\n' +
'\n' +
'Node.js v18.12.1\n',
_combined: '\n' +
'> [email protected] start\n' +
'> node src/index.js\n' +
'\n' +
'Listening on http://127.0.0.1:3000\n' +
'/home/noam/.nvm/versions/node/v18.12.1/lib/node_modules/catai/node_modules/openurl/openurl.js:39\n' +
' throw error;\n' +
' ^\n' +
'\n' +
'Error: Gtk-Message: 16:22:50.858: Not loading module "atk-bridge": The functionality is provided by GTK natively. Please try to not load it.\n' +
'\n' +
' at Socket.<anonymous> (/home/noam/.nvm/versions/node/v18.12.1/lib/node_modules/catai/node_modules/openurl/openurl.js:35:25)\n' +
' at Socket.emit (node:events:525:35)\n' +
' at endReadableNT (node:internal/streams/readable:1359:12)\n' +
' at process.processTicksAndRejections (node:internal/process/task_queues:82:21)\n' +
'\n' +
'Node.js v18.12.1\n'
}
Node.js v18.12.1
On Ubuntu 18.04.6 LTS,
it does open a server, but then it shows the following error:
/home/noam/.npm-global/lib/node_modules/catai/models/executable/chat: /lib/x86_64-linux-gnu/libm.so.6: version `GLIBC_2.29' not found (required by /home/noam/.npm-global/lib/node_modules/catai/models/executable/chat)
/home/noam/.npm-global/lib/node_modules/catai/models/executable/chat: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /home/noam/.npm-global/lib/node_modules/catai/models/executable/chat)
/home/noam/.npm-global/lib/node_modules/catai/models/executable/chat: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by /home/noam/.npm-global/lib/node_modules/catai/models/executable/chat)
/home/noam/.npm-global/lib/node_modules/catai/models/executable/chat: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found (required by /home/noam/.npm-global/lib/node_modules/catai/models/executable/chat)
Error: /home/noam/.npm-global/lib/node_modules/catai/models/executable/chat: /lib/x86_64-linux-gnu/libm.so.6: version `GLIBC_2.29' not found (required by /home/noam/.npm-global/lib/node_modules/catai/models/executable/chat) /home/noam/.npm-global/lib/node_modules/catai/models/executable/chat: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /home/noam/.npm-global/lib/node_modules/catai/models/executable/chat) /home/noam/.npm-global/lib/node_modules/catai/models/executable/chat: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by /home/noam/.npm-global/lib/node_modules/catai/models/executable/chat) /home/noam/.npm-global/lib/node_modules/catai/models/executable/chat: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found (required by /home/noam/.npm-global/lib/node_modules/catai/models/executable/chat) Thread unexpected closed!
It looks like the API streams the whole result to the server console before sending the output back as the response. Is there a way to return the results as soon as they're available?
Or, if not, is there a way to stream the results back from the API?
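For what it's worth, streaming partial output is straightforward with Node's chunked HTTP responses. A minimal sketch, under the assumption that the server receives tokens from a callback as the model produces them:

```javascript
// Sketch of server-side token streaming with a Node http-style
// response object. Instead of buffering the full completion, each
// token is written to the response the moment it is produced.
// `tokens` stands in for the model's token callback (an assumption
// about how the generation loop is wired up).
function streamCompletion(tokens, res) {
  res.writeHead(200, {
    "Content-Type": "text/plain; charset=utf-8",
    // chunked transfer lets the client read partial output immediately
    "Transfer-Encoding": "chunked",
  });
  for (const token of tokens) {
    res.write(token); // flush each token as soon as it is available
  }
  res.end();
}
```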
It outputs only so much text in its answer and cuts off every time. How do I increase this? Better yet, are there settings somewhere?
Please refer to the troubleshooting before opening an issue. You might find the solution there.
Describe the bug
CatAI refuses to load custom GUIs. No error is thrown in the terminal where I am running it. Running catai serve --ui chatGPT as described in commands.md loads the default UI. I have tried forcing CatAI to use the alternate GUI by changing some files, but nothing happened.
Screenshots
n/a
Desktop (please complete the following information):
The interface starts, and after entering the first request, it crashes
PS C:\Users\pomazan> catai serve
$ cd C:\Users\pomazan\AppData\Roaming\npm\node_modules\catai
$ npm start -- --production true --ui catai
> [email protected] start
> node src/index.js --production true --ui catai
llama.cpp: loading model from C:\Users\pomazan\catai\models\Alpaca-13B
llama_model_load_internal: format = ggjt v1 (pre #1405)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 1024
llama_model_load_internal: n_embd = 5120
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 40
llama_model_load_internal: n_layer = 40
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 2 (mostly Q4_0)
llama_model_load_internal: n_ff = 13824
llama_model_load_internal: n_parts = 1
llama_model_load_internal: model size = 13B
error loading model: this format is no longer supported (see https://github.com/ggerganov/llama.cpp/pull/1305)
llama_init_from_file: failed to load model
Listening on http://127.0.0.1:3000
new connection
llama.cpp: loading model from C:\Users\pomazan\catai\models\Alpaca-13B
llama_model_load_internal: format = ggjt v1 (pre #1405)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 1024
llama_model_load_internal: n_embd = 5120
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 40
llama_model_load_internal: n_layer = 40
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 2 (mostly Q4_0)
llama_model_load_internal: n_ff = 13824
llama_model_load_internal: n_parts = 1
llama_model_load_internal: model size = 13B
error loading model: this format is no longer supported (see https://github.com/ggerganov/llama.cpp/pull/1305)
llama_init_from_file: failed to load model
at file:///C:/Users/pomazan/AppData/Roaming/npm/node_modules/catai/scripts/cli.js:69:27
exit code: 1
I'd like to install CatAI from the GitHub source rather than from the npm registry, because I want to make some modifications to the interface.
However, I'm having trouble doing so.
I've recently started studying programming.
How can I go about this?
I've tried downloading the package and installing it with "npm install" in the /server/ folder, but I'm not sure how to run it after installation. :(
Otherbrain is a free human feedback dataset for open models.
Here's a link with more info: https://www.otherbrain.world/human-feedback
Would y'all be interested in adding 👍👎 to catai to help build the open dataset? Happy to help if so.
For reference, here's what the flow looks like in FreeChat. I think we could do something similar in catai:
If I ask "Please write a summary of all the countries in the world in alphabetical order. Include in each summary the country's population and population density.", it will write about 1000 tokens, then it'll just shut down, and the UI will lose the connection.
I was using the Stable Vicuna 13B model on 16 GB of RAM.
If you don't experience this issue, then I think this can be closed, as it's probably just my system's limitation.
Describe the bug
Error: The connection lost, check the server status and refresh the page.
Screenshots
C:\Windows\System32>catai up
CatAI client on http://127.0.0.1:3000
New connection
Failed to load prebuilt binary for platform "win32" "x64". Error: Error: A dynamic link library (DLL) initialization routine failed.
\?\C:\Users\ppodl\AppData\Roaming\npm\node_modules\catai\node_modules\node-llama-cpp\llamaBins\win-x64\llama-addon.node at Module._extensions..node (node:internal/modules/cjs/loader:1327:18)
at Module.load (node:internal/modules/cjs/loader:1091:32)
at Module._load (node:internal/modules/cjs/loader:938:12)
at Module.require (node:internal/modules/cjs/loader:1115:19)
at require (node:internal/modules/helpers:130:18)
at loadBin (file:///C:/Users/ppodl/AppData/Roaming/npm/node_modules/catai/node_modules/node-llama-cpp/dist/utils/getBin.js:45:24)
at async file:///C:/Users/ppodl/AppData/Roaming/npm/node_modules/catai/node_modules/node-llama-cpp/dist/llamaEvaluator/LlamaBins.js:2:29 {
code: 'ERR_DLOPEN_FAILED'
}
Falling back to locally built binaries
file:///C:/Users/ppodl/AppData/Roaming/npm/node_modules/catai/node_modules/node-llama-cpp/dist/utils/compileLLamaCpp.js:93
throw new Error("Could not find Release or Debug directory");
^
Error: Could not find Release or Debug directory
at getCompiledResultDir (file:///C:/Users/ppodl/AppData/Roaming/npm/node_modules/catai/node_modules/node-llama-cpp/dist/utils/compileLLamaCpp.js:93:11)
at async getCompiledLlamaCppBinaryPath (file:///C:/Users/ppodl/AppData/Roaming/npm/node_modules/catai/node_modules/node-llama-cpp/dist/utils/compileLLamaCpp.js:80:35)
at async loadBin (file:///C:/Users/ppodl/AppData/Roaming/npm/node_modules/catai/node_modules/node-llama-cpp/dist/utils/getBin.js:57:24)
at async file:///C:/Users/ppodl/AppData/Roaming/npm/node_modules/catai/node_modules/node-llama-cpp/dist/llamaEvaluator/LlamaBins.js:2:29
Node.js v20.8.0
A "Stop" button to stop a long-running execution would be good.
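One way such a button could be wired up is with an AbortController whose signal is checked between tokens. The function and callback names below are illustrative, not CatAI's actual code:

```javascript
// Sketch of a "Stop" button: the UI's stop request triggers an
// AbortController on the server, and the generation loop checks the
// signal between tokens. `tokens` and `onToken` are illustrative
// stand-ins for the real generation callback.
function generate(tokens, signal, onToken) {
  const out = [];
  for (const token of tokens) {
    if (signal.aborted) break; // user pressed "Stop"
    out.push(token);
    if (onToken) onToken(token, out.length);
  }
  return out.join("");
}

// Usage: abort after the second token has been emitted.
const controller = new AbortController();
const text = generate(
  ["Once", " upon", " a", " time"],
  controller.signal,
  (_tok, count) => { if (count === 2) controller.abort(); }
);
// text === "Once upon"
```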
Running install with an unrecognised model gives the following output:
โ models catai install gpt4all
$ cd /usr/local/lib/node_modules/catai
Model unknown, we will download with template URL. You can also try one of thous:7B, 13B, 30B, Vicuna-7B, Vicuna-7B-Uncensored, Vicuna-13B, Stable-Vicuna-13B, Wizard-Vicuna-7B, Wizard-Vicuna-7B-Uncensored, Wizard-Vicuna-13B, OpenAssistant-30B
Outputting a list of available models is excellent, but perhaps it's also worth adding a catai install --list
or similar command?
Also note that the output appears to make no sense: "thous:7B"? And what are the 7B, 13B, 30B models? (edit) Ahhh, the original LLaMA models, doh!
The || true
in this line:
prevents setting CATAI_OPEN_IN_BROWSER to anything other than true. When it's set to false, it will still default to || true.
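The underlying issue is that environment variables are always strings, and the string "false" is truthy. A sketch of an explicit parse that honors false:

```javascript
// Why `|| true` breaks the flag: env vars are strings, so
//   env.CATAI_OPEN_IN_BROWSER || true
// evaluates to the truthy string "false" when the variable is set to
// false, and to true when it is unset — the browser opens either way.
// Comparing against the string "false" fixes it:
function shouldOpenInBrowser(env) {
  // default to true only when the variable is unset
  return (env.CATAI_OPEN_IN_BROWSER ?? "true") !== "false";
}
```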
Please refer to the troubleshooting before opening an issue. You might find the solution there.
Describe the bug
npm install -g catai
catai install vicuna-7b-16k-q4_k_s
catai up
Screenshots
then typing "hello"
Desktop (please complete the following information):
catai --version: 3.0.2
node --version:
catai active:
Is this model compatible? (run catai ls for this info)
I ran "catai update" and thought it was working better (the GPTcat logo was there), but it crashed too.
Hello 👋
While following development.md to run the server locally, I'm getting the error below when starting the server. Would you have any advice to help me troubleshoot further?
Repro:
Error:
zsh: segmentation fault npm start
Node: v18.16.0
System Version: macOS 13.3.1
client/catai runs just fine!
What's odd is that I run llama-node inference.js just fine on this Mac.
Installing globally and using catai serve also works without error.
After fixing the connection problems, I'd like to use this chat, but after
catai up
I get:
C:\Users\ppodl>catai up
CatAI client on http://127.0.0.1:3000
New connection
llama_model_loader: loaded meta data with 20 key-value pairs and 291 tensors from C:\Users\ppodl\catai\models\vicuna-7b-16k-q4_k_s (version GGUF V2 (latest))
llama_model_loader: - tensor 0: token_embd.weight q4_K [ 4096, 32000, 1, 1 ]
llama_model_loader: - tensor 1: blk.0.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 2: blk.0.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 3: blk.0.attn_v.weight q5_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 4: blk.0.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 5: blk.0.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 6: blk.0.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 7: blk.0.ffn_down.weight q5_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 8: blk.0.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 9: blk.0.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 10: blk.1.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 11: blk.1.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 12: blk.1.attn_v.weight q5_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 13: blk.1.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 14: blk.1.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 15: blk.1.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 16: blk.1.ffn_down.weight q5_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 17: blk.1.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 18: blk.1.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 19: blk.2.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 20: blk.2.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 21: blk.2.attn_v.weight q5_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 22: blk.2.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 23: blk.2.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 24: blk.2.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 25: blk.2.ffn_down.weight q5_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 26: blk.2.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 27: blk.2.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 28: blk.3.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 29: blk.3.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 30: blk.3.attn_v.weight q5_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 31: blk.3.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 32: blk.3.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 33: blk.3.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 34: blk.3.ffn_down.weight q5_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 35: blk.3.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 36: blk.3.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 37: blk.4.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 38: blk.4.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 39: blk.4.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 40: blk.4.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 41: blk.4.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 42: blk.4.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 43: blk.4.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 44: blk.4.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 45: blk.4.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 46: blk.5.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 47: blk.5.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 48: blk.5.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 49: blk.5.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 50: blk.5.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 51: blk.5.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 52: blk.5.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 53: blk.5.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 54: blk.5.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 55: blk.6.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 56: blk.6.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 57: blk.6.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 58: blk.6.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 59: blk.6.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 60: blk.6.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 61: blk.6.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 62: blk.6.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 63: blk.6.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 64: blk.7.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 65: blk.7.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 66: blk.7.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 67: blk.7.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 68: blk.7.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 69: blk.7.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 70: blk.7.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 71: blk.7.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 72: blk.7.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 73: blk.8.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 74: blk.8.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 75: blk.8.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 76: blk.8.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 77: blk.8.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 78: blk.8.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 79: blk.8.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 80: blk.8.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 81: blk.8.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 82: blk.9.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 83: blk.9.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 84: blk.9.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 85: blk.9.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 86: blk.9.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 87: blk.9.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 88: blk.9.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 89: blk.9.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 90: blk.9.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 91: blk.10.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 92: blk.10.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 93: blk.10.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 94: blk.10.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 95: blk.10.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 96: blk.10.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 97: blk.10.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 98: blk.10.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 99: blk.10.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 100: blk.11.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 101: blk.11.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 102: blk.11.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 103: blk.11.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 104: blk.11.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 105: blk.11.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 106: blk.11.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 107: blk.11.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 108: blk.11.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 109: blk.12.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 110: blk.12.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 111: blk.12.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 112: blk.12.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 113: blk.12.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 114: blk.12.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 115: blk.12.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 116: blk.12.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 117: blk.12.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 118: blk.13.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 119: blk.13.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 120: blk.13.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 121: blk.13.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 122: blk.13.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 123: blk.13.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 124: blk.13.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 125: blk.13.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 126: blk.13.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 127: blk.14.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 128: blk.14.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 129: blk.14.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 130: blk.14.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 131: blk.14.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 132: blk.14.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 133: blk.14.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 134: blk.14.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 135: blk.14.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 136: blk.15.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 137: blk.15.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 138: blk.15.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 139: blk.15.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 140: blk.15.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 141: blk.15.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 142: blk.15.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]....
llama_model_loader: - tensor 143: blk.15.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 144: blk.15.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 145: blk.16.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 146: blk.16.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 147: blk.16.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 148: blk.16.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 149: blk.16.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 150: blk.16.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 151: blk.16.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 152: blk.16.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 153: blk.16.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 154: blk.17.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 155: blk.17.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 156: blk.17.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 157: blk.17.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 158: blk.17.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 159: blk.17.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 160: blk.17.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 161: blk.17.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 162: blk.17.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 163: blk.18.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 164: blk.18.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 165: blk.18.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 166: blk.18.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 167: blk.18.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 168: blk.18.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 169: blk.18.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 170: blk.18.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 171: blk.18.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 172: blk.19.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 173: blk.19.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 174: blk.19.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 175: blk.19.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 176: blk.19.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 177: blk.19.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 178: blk.19.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 179: blk.19.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 180: blk.19.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 181: blk.20.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 182: blk.20.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 183: blk.20.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 184: blk.20.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 185: blk.20.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 186: blk.20.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 187: blk.20.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 188: blk.20.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 189: blk.20.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 190: blk.21.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 191: blk.21.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 192: blk.21.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 193: blk.21.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 194: blk.21.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 195: blk.21.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 196: blk.21.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 197: blk.21.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 198: blk.21.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 199: blk.22.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 200: blk.22.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 201: blk.22.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 202: blk.22.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 203: blk.22.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 204: blk.22.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 205: blk.22.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 206: blk.22.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 207: blk.22.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 208: blk.23.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 209: blk.23.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 210: blk.23.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 211: blk.23.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 212: blk.23.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 213: blk.23.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 214: blk.23.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 215: blk.23.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 216: blk.23.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 217: blk.24.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 218: blk.24.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 219: blk.24.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 220: blk.24.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 221: blk.24.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 222: blk.24.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 223: blk.24.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 224: blk.24.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 225: blk.24.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 226: blk.25.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 227: blk.25.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 228: blk.25.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 229: blk.25.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 230: blk.25.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 231: blk.25.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 232: blk.25.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 233: blk.25.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 234: blk.25.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 235: blk.26.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 236: blk.26.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 237: blk.26.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 238: blk.26.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 239: blk.26.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 240: blk.26.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 241: blk.26.ffn_down.weight q4_K [ 11008, 4096, 1, 1 ]
llama_model_loader: - tensor 242: blk.26.attn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 243: blk.26.ffn_norm.weight f32 [ 4096, 1, 1, 1 ]
llama_model_loader: - tensor 244: blk.27.attn_q.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 245: blk.27.attn_k.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 246: blk.27.attn_v.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 247: blk.27.attn_output.weight q4_K [ 4096, 4096, 1, 1 ]
llama_model_loader: - tensor 248: blk.27.ffn_gate.weight q4_K [ 4096, 11008, 1, 1 ]
llama_model_loader: - tensor 249: blk.27.ffn_up.weight q4_K [ 4096, 11008, 1, 1 ]....
but after I ask and send the first question, I don't get an answer and the loading animation loops indefinitely.
Hey,
after installing with npm install -g catai, the CatAI server fails during conversations, stating: Failed to load prebuilt binary for platform "linux" "arm64". Error: Error: /usr/local/lib/node_modules/catai/node_modules/node-llama-cpp/llamaBins/linux-arm64/llama-addon.node: cannot open shared object file: No such file or directory.
The file is actually there:
# ls -la /usr/local/lib/node_modules/catai/node_modules/node-llama-cpp/llamaBins/linux-arm64/llama-addon.node
-rw-r--r-- 1 root root 1184376 Oct 6 10:32 /usr/local/lib/node_modules/catai/node_modules/node-llama-cpp/llamaBins/linux-arm64/llama-addon.node
I'm trying to run the Wizard-Vicuna-13B-Uncensored model on a VM (16GB RAM), but I'm getting the error below:
Error: Missing field nGpuLayers
at LLamaCpp. (file:///usr/local/lib/node_modules/catai/node_modules/llama-node/dist/llm/llama-cpp.js:63:35)
at Generator.next ()
at file:///usr/local/lib/node_modules/catai/node_modules/llama-node/dist/llm/llama-cpp.js:33:61
at new Promise ()
at __async (file:///usr/local/lib/node_modules/catai/node_modules/llama-node/dist/llm/llama-cpp.js:17:10)
at LLamaCpp.load (file:///usr/local/lib/node_modules/catai/node_modules/llama-node/dist/llm/llama-cpp.js:61:12)
at LLM.load (/usr/local/lib/node_modules/catai/node_modules/llama-node/dist/index.cjs:52:21)
at #addNew (file:///usr/local/lib/node_modules/catai/src/alpaca-client/node-llama/process-pull.js:88:21)
at new NodeLlamaActivePull (file:///usr/local/lib/node_modules/catai/src/alpaca-client/node-llama/process-pull.js:19:38)
at file:///usr/local/lib/node_modules/catai/src/alpaca-client/node-llama/node-llama.js:8:48 {
code: 'InvalidArg'
}
Describe the bug
The installation of the package fails part-way. Node cannot scan the ~/catai/models
directory, which suggests that something goes haywire in the installation machinery. I'm trying to test the performance of the small 3B version of StableLM on an Odroid N2+ development board using CatAI.
-> % npm install -g catai
npm ERR! code 1
npm ERR! path /home/alarm/.nvm/versions/node/v20.6.0/lib/node_modules/catai
npm ERR! command failed
npm ERR! command sh -c node ./dist/cli/cli.js postinstall
npm ERR! CatAI Migrated to v0.3.13
npm ERR! node:internal/process/promises:289
npm ERR! triggerUncaughtException(err, true /* fromPromise */);
npm ERR! ^
npm ERR!
npm ERR! [Error: ENOENT: no such file or directory, scandir '/home/alarm/catai/models'] {
npm ERR! errno: -2,
npm ERR! code: 'ENOENT',
npm ERR! syscall: 'scandir',
npm ERR! path: '/home/alarm/catai/models'
npm ERR! }
npm ERR!
npm ERR! Node.js v20.6.0
The provided error message, CatAI Migrated to v0.3.13, is not particularly helpful.
Could you explain in more detail what this error message is about?
Screenshots
N/A
Desktop (please complete the following information):
Linux cinedroid 5.10.2-6-ARCH #1 SMP PREEMPT Mon Dec 28 21:22:54 AST 2020 aarch64 GNU/Linux
CatAI version (catai --version)
Node.js version (node --version)
Hi, I'm new to using node, so I'm not sure what's going on. I can't use catai because it says it doesn't find my node installation.
Steps:
1. catai list, and it fails.
2. catai models works and lists all available models, and catai install Stable-Vicuna-13B downloads, but it fails when it tries to use the model.
3. catai list again, and it still fails.
Note: I can go to C:\Users\sabsa\AppData\Roaming\nvm\v20.2.0\node_modules\catai\scripts manually and run npm run list, and it will say "No model downloaded", so it works if I run it manually.
Note 2: knowing this, I downloaded Stable-Vicuna-13B again and ran npm run use Stable-Vicuna-13B in the node_modules/catai/scripts directory manually, and it worked. I then ran npm start -- --production true --ui catai. It started a server, but then failed with Error: Missing field 'nGpuLayers',
so I don't know if that's happening because I didn't start CatAI the correct way, or if llama.cpp was updated and llama-node is out of date?
error message:
C:\Users\sabagithub>catai list
$ cd C:\Users\sabagithub\AppData\Roaming\npm\node_modules\catai
$ npm run list
node:net:426
throw errnoException(err, 'open');
^
Error: open EISDIR
at new Socket (node:net:426:13)
at createWritableStdioStream (node:internal/bootstrap/switches/is_main_thread:80:18)
at process.getStdout [as stdout] (node:internal/bootstrap/switches/is_main_thread:150:12)
at console.get (node:internal/console/constructor:209:42)
at console.value (node:internal/console/constructor:337:50)
at console.log (node:internal/console/constructor:376:61)
at runScript (node:internal/process/execution:94:7)
at evalScript (node:internal/process/execution:104:10)
at node:internal/main/eval_string:50:3 {
errno: -4068,
code: 'EISDIR',
syscall: 'open'
}
Node.js v20.2.0
node:internal/modules/cjs/loader:1073
throw err;
^
Error: Cannot find module 'C:\node_modules\npm\bin\npm-cli.js'
at Module._resolveFilename (node:internal/modules/cjs/loader:1070:15)
at Module._load (node:internal/modules/cjs/loader:923:27)
at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:83:12)
at node:internal/main/run_main_module:23:47 {
code: 'MODULE_NOT_FOUND',
requireStack: []
}
Node.js v20.2.0
Could not determine Node.js install directory
node:net:426
throw errnoException(err, 'open');
^
Error: open EISDIR
at new Socket (node:net:426:13)
at createWritableStdioStream (node:internal/bootstrap/switches/is_main_thread:80:18)
at process.getStdout [as stdout] (node:internal/bootstrap/switches/is_main_thread:150:12)
at console.get (node:internal/console/constructor:209:42)
at console.value (node:internal/console/constructor:337:50)
at console.log (node:internal/console/constructor:376:61)
at runScript (node:internal/process/execution:94:7)
at evalScript (node:internal/process/execution:104:10)
at node:internal/main/eval_string:50:3 {
errno: -4068,
code: 'EISDIR',
syscall: 'open'
}
Node.js v20.2.0
node:internal/modules/cjs/loader:1073
throw err;
^
Error: Cannot find module 'C:\node_modules\npm\bin\npm-cli.js'
at Module._resolveFilename (node:internal/modules/cjs/loader:1070:15)
at Module._load (node:internal/modules/cjs/loader:923:27)
at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:83:12)
at node:internal/main/run_main_module:23:47 {
code: 'MODULE_NOT_FOUND',
requireStack: []
}
Node.js v20.2.0
Could not determine Node.js install directory
at file:///C:/Users/sabagithub/AppData/Roaming/npm/node_modules/catai/scripts/cli.js:62:27
exit code: 1
catai server --ui chatGPT
should be catai serve --ui chatGPT
Hello,
I am looking to see where/how CPU and/or GPU information is passed during server start, but I am unable to find it.
Thank you
Got this:
catai $ catai serve
$ cd /usr/local/lib/node_modules/catai
$ npm start -- --production true --ui catai
> [email protected] start
> node src/index.js --production true --ui catai
/bin/bash: line 1: 49127 Illegal instruction: 4 npm start -- --production true --ui catai
/bin/bash: line 1: 49127 Illegal instruction: 4 npm start -- --production true --ui catai
at file:///usr/local/lib/node_modules/catai/scripts/cli.js:69:27
exit code: 132 (Illegal instruction)
C:\Users\micro\Downloads>catai serve --ui chatGPT
$ cd C:\Users\micro\AppData\Roaming\npm\node_modules\catai
$ npm start -- --production true --ui chatGPT
> [email protected] start
> node src/index.js --production true --ui chatGPT
fatal runtime error: Rust cannot catch foreign exceptions
fatal runtime error: Rust cannot catch foreign exceptions
at file:///C:/Users/micro/AppData/Roaming/npm/node_modules/catai/scripts/cli.js:69:27
exit code: 9
Describe the bug
I get this error trying to use the vicuna 13b uncensored model
llama.cpp: loading model from /Users/jvisker/catai/models/Vicuna-13B-Uncensored
error loading model: unrecognized tensor type 4
llama_init_from_file: failed to load model
Listening on http://127.0.0.1:3000
node:internal/process/promises:288
triggerUncaughtException(err, true /* fromPromise */);
^
[Error: Failed to initialize LLama context from file: /Users/jvisker/catai/models/Vicuna-13B-Uncensored] {
code: 'GenericFailure'
}
Desktop (please complete the following information):
It works great on the 7B one
Hi, I'd like to configure CatAI.
https://withcatai.github.io/node-llama-cpp/types/LlamaModelOptions.html
but this website doesn't work.
Can you provide me with a good configuration for your bot?
PS:
I launched this catai github program on an S8+. I just installed git and cmake, and catai runs normally on Termux :)
I tried this on the 22nd and was able to install models but not get it to serve (it complains model not found).
With the latest version it doesn't appear to be installing models anymore.
catai install Vicuna-13B
$ cd /usr/lib/node_modules/catai
When I run install I just see a cd command echoed to the terminal and nothing else. Same thing if I try to run it from that directory.
The remote-catai example does not work:
progress.stdout.write(token);
should be process.stdout.write(token);
Also, the example sends the prompt before the ws is open.
We should first modify remote-catai, adding this in the _init() function:
this._ws.on('open', () => {
    this.emit("open");
});
Then we should wait for the 'open' event before sending the prompt:
import { RemoteCatAI } from "catai";
const catai = new RemoteCatAI("ws://localhost:3000");
catai.on("open", async () => {
console.log("Connected");
const response = await catai.prompt("Write me 100 words story", (token) => {
process.stdout.write(token);
});
console.log(`Total text length: ${response.length}`);
catai.close();
});
Please refer to the troubleshooting before opening an issue. You might find the solution there.
Describe the bug
Watch out for the 404 error on the logo image!
Screenshots
If applicable, add screenshots to help explain your problem.
Desktop (please complete the following information):
CatAI version (catai --version)
Node.js version (node --version)
The fetch URL for GPT4All-13B (and a few others) requires basic auth, so it isn't downloadable from the models menu:
https://huggingface.co/Pi3141/alpaca-GPT4All-13BB-ggml/resolve/main/ggml-model-q4_0.bi
Also, is the ending meant to be .bin, like the others?
It would be nice to allow people to specify settings like data directory & port to use instead of hard-coded values without editing the package source.
After Installing a model and running catai serve I get this error:
catai use 30B
$ cd /usr/lib/node_modules/catai
$ npm run use 30B
> [email protected] use
> zx scripts/use.js 30B
Model set to 30B
user@Machine:~/FastChat$ catai serve --ui chatGPT
$ cd /usr/lib/node_modules/catai
$ npm start production chatGPT
> [email protected] start
> node src/index.js production chatGPT
llama.cpp: loading model from /home/user/catai/models/30B
llama_model_load_internal: format = 'ggml' (old version with low tokenizer quality and no mmap support)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 1024
llama_model_load_internal: n_embd = 6656
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 52
llama_model_load_internal: n_layer = 60
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 2 (mostly Q4_0)
llama_model_load_internal: n_ff = 17920
llama_model_load_internal: n_parts = 1
llama_model_load_internal: model size = 30B
llama_model_load_internal: ggml ctx size = 19856856.30 KB
llama_model_load_internal: mem required = 21695.46 MB (+ 6248.00 MB per state)
....................................................................................................
llama_init_from_file: kv self size = 3120.00 MB
file:///usr/lib/node_modules/catai/src/chat.js:7
throw new Error('Model not found, try re-downloading the model');
^
Error: Model not found, try re-downloading the model
at file:///usr/lib/node_modules/catai/src/chat.js:7:11
Node.js v20.0.0
llama.cpp: loading model from /home/user/catai/models/30B
llama_model_load_internal: format = 'ggml' (old version with low tokenizer quality and no mmap support)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 1024
llama_model_load_internal: n_embd = 6656
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 52
llama_model_load_internal: n_layer = 60
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 2 (mostly Q4_0)
llama_model_load_internal: n_ff = 17920
llama_model_load_internal: n_parts = 1
llama_model_load_internal: model size = 30B
llama_model_load_internal: ggml ctx size = 19856856.30 KB
llama_model_load_internal: mem required = 21695.46 MB (+ 6248.00 MB per state)
....................................................................................................
llama_init_from_file: kv self size = 3120.00 MB
file:///usr/lib/node_modules/catai/src/chat.js:7
throw new Error('Model not found, try re-downloading the model');
^
Error: Model not found, try re-downloading the model
at file:///usr/lib/node_modules/catai/src/chat.js:7:11
Node.js v20.0.0
at file:///usr/lib/node_modules/catai/scripts/cli.js:55:27
exit code: 1
Strange, as it appears to find and load the model, then afterwards complains that it's not found.
This happens with the 7B model as well, though I am currently having trouble re-installing it.
Spinning this off into its own issue.
Please refer to the troubleshooting before opening an issue. You might find the solution there.
Describe the bug
After updating with the advice on issue #29, CatAI no longer starts up at all and throws a segmentation fault. The errors produced when trying to start CatAI are the following:
[email protected] start
node src/index.js --production true --ui catai
Segmentation fault (core dumped)
Using catai update results in a different error, but the same outcome:
fatal runtime error: Rust cannot catch foreign exceptions
Aborted (core dumped)
fatal runtime error: Rust cannot catch foreign exceptions
Aborted (core dumped)
at file:///home/thesystemguy/.nvm/versions/node/v20.2.0/lib/node_modules/catai/scripts/cli.js:69:27
exit code: 134 (Process aborted)
Reinstallation does the same thing as using catai update. This is a showstopper problem.
Desktop (please complete the following information):
OS: Linux Mint 21.2 Cinnamon, Linux 5.15.0-76-generic
Browser: n/a (no start)
CatAI version 1.0.2 (as advised in issue #29)
Node.js version v18.16.0
CPU: AMD Ryzen 5 5600H with Radeon Graphics
RAM: 30.7 GiB (512 MiB reserved to graphics chipset)