Comments (6)
@EmmaWebGH I just became aware of the prompting issues this afternoon and they have been solved in 0.8.7. I tested with both the "free-trial" and "together" providers.
If you're seeing template-related problems while using Ollama though, the problem might be on the Ollama side, as we rely on them to format messages. I haven't had a chance to test since I don't have a GPU large enough for CodeLlama-70b, so let me know and I can reach out to them.
from continue.
Great, I will update when I can!
You really are on top of everything. Thank you for such a useful extension!
Ok, that sounds like a typical response for CodeLlama-70b : ) which means this should be resolved. Let me know if anything else comes up!
I've hit the same issue. I started the CodeLlama-70b-Instruct GGUF on my Mac M1 Studio with the llama.cpp server like this:
../llama.cpp/server -m ./codellama-70b-instruct.Q5_K_S.gguf -np 2 -c 4096 --host 0.0.0.0 --port 8080
and configured the model in config.json:
{
  "title": "codellama-70b",
  "model": "codellama-70b",
  "completionOptions": {},
  "contextLength": 4096,
  "provider": "llama.cpp",
  "apiBase": "http://dev.myserver.com:8080"
},
In the VS Code Continue plugin, it keeps outputting lots of code and messages without stopping.
I wonder how I could set the proper prompt template and stop token.
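(For reference, Continue's config.json lets you set per-model stop sequences via completionOptions, and there is also a model-level "template" field for selecting a prompt format. The fragment below is a sketch under the assumption that those fields are supported for the llama.cpp provider in your plugin version — the "codellama-70b" template value in particular is an assumption, so check against the config reference:)

```json
{
  "title": "codellama-70b",
  "model": "codellama-70b",
  "provider": "llama.cpp",
  "apiBase": "http://dev.myserver.com:8080",
  "contextLength": 4096,
  "template": "codellama-70b",
  "completionOptions": {
    "maxTokens": 2048,
    "stop": ["<step>", "Source: assistant"]
  }
}
```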
For Hugging Face chat-ui, the following config works:
{
  "name": "codellama-70b-llamacpp",
  "chatPromptTemplate": "<s>{{#if @root.preprompt}}Source: system\n\n {{@root.preprompt}} <step> {{/if}}{{#each messages}}{{#ifUser}}Source: user\n\n {{content}} <step> {{/ifUser}}{{#ifAssistant}}Source: assistant\n\n {{content}} <step> {{/ifAssistant}}{{/each}}Source: assistant\nDestination: user\n\n ",
  "parameters": {
    "temperature": 0.5,
    "top_p": 0.95,
    "repetition_penalty": 1.2,
    "top_k": 50,
    "truncate": 3072,
    "max_new_tokens": 2048,
    "stop": ["<step>", "Source: assistant"]
  },
  "endpoints": [{
    "type": "openai",
    "baseURL": "http://dev.myserver.com:8080/v1"
  }]
}
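(The Handlebars template above can be expressed in plain Python, which makes it easier to eyeball what the model should receive. This is a sketch of the same CodeLlama-70b instruct format, not Continue's actual implementation:)

```python
def codellama70b_prompt(messages, system=None):
    """Build a CodeLlama-70b-Instruct prompt matching the chat-ui template above.

    messages: list of {"role": "user" | "assistant", "content": str}
    """
    parts = ["<s>"]
    if system:
        parts.append(f"Source: system\n\n {system} <step> ")
    for m in messages:
        role = "user" if m["role"] == "user" else "assistant"
        parts.append(f"Source: {role}\n\n {m['content']} <step> ")
    # Generation header: the model is expected to continue after this line.
    parts.append("Source: assistant\nDestination: user\n\n ")
    return "".join(parts)

prompt = codellama70b_prompt(
    [{"role": "user", "content": "Explain llama.cpp's -np flag."}],
    system="You are a helpful coding assistant.",
)
```

Whatever produces the prompt, it has to be paired with the stop sequences ["<step>", "Source: assistant"] (as in the parameters block above), or the model will keep generating past its turn.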
@davideuler Everything in your config looks right and seems to indicate that the prompt should be correctly set. Here is the code where we format the prompt for codellama-70b. You could double-check that the correct formatting is being sent by going to the "Output" tab in the bottom bar of VS Code (next to the terminal) and then selecting "Continue - ..." in the dropdown on the right. It shows all raw prompts/completions.
If this looks correct, then perhaps there is a bad interaction with the server (e.g. it also formats the prompt, leading to it happening twice).
Sestinj, thanks. I've checked the output in VS Code, and the request sent to llama.cpp is OK.
With the latest version of the Continue plugin, it shows me a response related to the code along with lots of apologizing messages like "I apologize, but as a responsible AI language model".