Comments (7)
Given the persistent issues with downloading Llama3-8B GGUF models across multiple repositories, it seems like the problem might be more complex than initially thought. The error indicating a file size smaller than expected, despite the actual model size being significantly larger, suggests there could be an underlying issue with how the download process is handled or an issue with the model URLs being accessed.
Since the problem persists across different GGUF versions and repositories, and the steps already attempted haven't resolved it, it may be worth taking a deeper look at the llama_index.llms.llama_cpp implementation or the way it interacts with the model repositories. An update or patch may be needed to address compatibility with Llama3-8B GGUF models specifically.
@logan-markewich, this one's a bit out of my wheelhouse; could you step in and provide your guidance?
Details
Sources
- llama-index-legacy/llama_index/legacy/llms/llama_cpp.py
- llama-index-integrations/llms/llama-index-llms-llama-cpp/llama_index/llms/llama_cpp/base.py
- llama-index-legacy/VERSION
- llama-index-core/llama_index/core/__init__.py
About Dosu
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.
Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.
To continue the conversation, mention @dosu.
The issue you're encountering with Llama3-8B models, specifically when trying to download the GGUF model using LlamaCPP() from llama_index.llms.llama_cpp, is due to a check in the _download_url method that verifies the downloaded content is at least 1 MB. If the content is smaller, the method raises an error and aborts the download, which is what you're experiencing.
To address this, you could adjust the code that performs this size check: remove or relax the condition in the llama_index.llms.llama_cpp module that errors when total_size is less than 1 MB. This would allow downloads smaller than the expected size to proceed. Be cautious with this approach, though: the check is a safety guard meant to catch truncated or incorrect downloads, and bypassing it can leave you with a corrupt model file.
This targets the immediate cause of the download failure you're seeing with Llama3-8B models from QuantFactory or similar sources.
Details
Sources
- llama-index-legacy/llama_index/legacy/llms/llama_cpp.py
- llama-index-integrations/llms/llama-index-llms-llama-cpp/llama_index/llms/llama_cpp/base.py
The error message you're seeing, which reports only 48,119 bytes of content, points to a problem with the download process rather than the actual size of the model. Possible causes include an incorrect URL, access restrictions, or an incorrect Content-Length header from the server. The LlamaCPP class's _download_url method checks the Content-Length header to ensure the file is at least 1 MB; if the header reports less than this, it raises the error you encountered.
To work around this issue, you might consider specifying an alternative model URL directly in the LlamaCPP class initialization if you suspect the default URL is incorrect or problematic. Here's a quick example of how you can do this:
from llama_index.llms.llama_cpp import LlamaCPP

# Replace this with the URL of the model you wish to use
alternative_model_url = "YOUR_ALTERNATIVE_MODEL_URL"

llm = LlamaCPP(
    model_url=alternative_model_url,
    temperature=0.1,
    max_new_tokens=256,
    context_window=3900,
    verbose=True,
)

response = llm.complete("Hello, how are you?")
print(str(response))
Replace "YOUR_ALTERNATIVE_MODEL_URL" with the URL of the Llama3-8B GGUF model you're trying to download. Using a different source for the model file may avoid the download size check error entirely.
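One common cause of a tiny download from Hugging Face is pasting the blob page URL (which returns an HTML page) instead of the resolve download URL. Assuming the standard Hugging Face URL layout, a direct-download URL for a GGUF file can be built like this; the repo and filename in the example are illustrative, not verified:

```python
def hf_gguf_url(repo_id: str, filename: str, revision: str = "main") -> str:
    # Hugging Face serves raw files under /resolve/<revision>/, not /blob/.
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"

# Illustrative repo and filename; substitute the actual GGUF file you want.
url = hf_gguf_url("QuantFactory/Meta-Llama-3-8B-Instruct-GGUF",
                  "Meta-Llama-3-8B-Instruct.Q4_K_M.gguf")
```

If your failing URL contains /blob/ rather than /resolve/, that alone would explain receiving a ~48 KB HTML page instead of the model weights.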
If you continue to face issues or have any further questions, feel free to ask for more assistance.
Details
Sources
- llama-index-legacy/llama_index/legacy/llms/llama_cpp.py
- llama-index-integrations/llms/llama-index-llms-llama-cpp/llama_index/llms/llama_cpp/base.py
- docs/docs/examples/llm/llama_2_llama_cpp.ipynb
@dosu
Obviously the model is greater than 1 MB in size. I didn't find anybody else facing a similar problem.
@dosu
This does not solve my problem. The issue persists with all the GGUF versions from multiple repos; I tried QuantFactory, NousResearch, etc. Maybe this needs a deeper look, doesn't it? @logan-markewich?
@VaradDeshmukh97 seems like the URL is incorrect. Try running wget <url> to see whether it downloads the proper model weights.
Otherwise, though, I would just use Ollama; it's so much easier.
(This isn't an issue with llamaindex though)