Comments (7)
Given the persistent issues with downloading Llama3-8B GGUF models across multiple repositories, it seems like the problem might be more complex than initially thought. The error indicating a file size smaller than expected, despite the actual model size being significantly larger, suggests there could be an underlying issue with how the download process is handled or an issue with the model URLs being accessed.
Since the problem persists across different GGUF versions and repositories, and the steps already attempted haven't resolved it, it may be worth taking a deeper look at the llama_index.llms.llama_cpp implementation or the way it interacts with the model repositories. An update or patch may be needed to address compatibility with Llama3-8B GGUF models specifically.
@logan-markewich, this one's a bit out of my wheelhouse; could you step in and provide your guidance?
Details
Sources
- llama-index-legacy/llama_index/legacy/llms/llama_cpp.py
- llama-index-integrations/llms/llama-index-llms-llama-cpp/llama_index/llms/llama_cpp/base.py
- llama-index-legacy/VERSION
- llama-index-core/llama_index/core/__init__.py
About Dosu
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.
Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.
To continue the conversation, mention @dosu.
The issue you're encountering with Llama3-8B models, specifically when trying to download the GGUF model using LlamaCPP() from llama_index.llms.llama_cpp, is due to a check in the _download_url method that verifies the downloaded content is at least 1 MB. If the content is smaller, the method raises an error and aborts the download, which is what you're experiencing.
To address this, you could adjust the code that performs this size check: remove or relax the condition in the llama_index.llms.llama_cpp module that errors when total_size is less than 1 MB. This would allow downloads smaller than the expected size to proceed. Be cautious with this approach, though: the check is a safety guard meant to catch truncated or incorrect downloads, and bypassing it can leave you with a corrupt model file.
This targets the immediate cause of the download failure you're seeing with Llama3-8B models from QuantFactory or similar sources.
Details
Sources
- llama-index-legacy/llama_index/legacy/llms/llama_cpp.py
- llama-index-integrations/llms/llama-index-llms-llama-cpp/llama_index/llms/llama_cpp/base.py
The error message you're seeing, which reports only 48,119 bytes of content, points to a problem with the download process rather than the actual size of the model. Possible causes include an incorrect URL, access restrictions, or an incorrect Content-Length header from the server. The LlamaCPP class's _download_url method checks the Content-Length header to ensure the file is at least 1 MB; if the header reports less than this, it raises the error you encountered.
To work around this issue, you might consider specifying an alternative model URL directly in the LlamaCPP class initialization if you suspect the default URL is incorrect or problematic. Here's a quick example of how you can do this:
from llama_index.llms.llama_cpp import LlamaCPP

# Replace this with the URL of the model you wish to use
alternative_model_url = "YOUR_ALTERNATIVE_MODEL_URL"

llm = LlamaCPP(
    model_url=alternative_model_url,
    temperature=0.1,
    max_new_tokens=256,
    context_window=3900,
    verbose=True,
)

response = llm.complete("Hello, how are you?")
print(str(response))
Replace "YOUR_ALTERNATIVE_MODEL_URL" with the URL of the Llama3-8B GGUF model you're trying to download. Using a different source for the model file may avoid the download size check error entirely.
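One common cause of a tiny download from Hugging Face is pasting the blob page URL (which returns an HTML page) instead of the resolve download URL. Assuming the standard Hugging Face URL layout, a direct-download URL for a GGUF file can be built like this; the repo and filename in the example are illustrative, not verified:

```python
def hf_gguf_url(repo_id: str, filename: str, revision: str = "main") -> str:
    # Hugging Face serves raw files under /resolve/<revision>/, not /blob/.
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"

# Illustrative repo and filename; substitute the actual GGUF file you want.
url = hf_gguf_url("QuantFactory/Meta-Llama-3-8B-Instruct-GGUF",
                  "Meta-Llama-3-8B-Instruct.Q4_K_M.gguf")
```

If your failing URL contains /blob/ rather than /resolve/, that alone would explain receiving a ~48 KB HTML page instead of the model weights.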
If you continue to face issues or have any further questions, feel free to ask for more assistance.
Details
Sources
- llama-index-legacy/llama_index/legacy/llms/llama_cpp.py
- llama-index-integrations/llms/llama-index-llms-llama-cpp/llama_index/llms/llama_cpp/base.py
- docs/docs/examples/llm/llama_2_llama_cpp.ipynb
@dosu
Obviously the model is greater than 1 MB in size. I didn't find anybody else facing a similar problem.
@dosu
This does not solve my problem. The issue persists with all the GGUF versions from multiple repos; I tried QuantFactory, NousResearch, etc. Maybe this needs a deeper look, doesn't it? @logan-markewich?
@VaradDeshmukh97 seems like the URL is incorrect. Try running wget <url> to see whether it downloads the proper model weights.
Otherwise, though, I would just use Ollama; it's so much easier.
(This isn't an issue with llamaindex though)