Comments (11)
Got the llama2-7b model working on macOS and Android.
- macOS local runtime: model load time 10.39s, time to first generated token 0.739s, generated token rate 0.3089 toks/sec
- Android (Samsung Galaxy S22) runtime: model load time 12.05s, time to first generated token 8.448s, generated token rate 0.0777 toks/sec
Updated list of issues:
Llama2 model
- vocab_size in params.json from HF downloads is -1; need to manually change it to 32000 to proceed. Update the script/readme steps.
- Export with SDPA failed with errors: AttributeError: '_OpNamespace' 'llama' object has no attribute 'sdpa_with_kv_cache'
- Update readme to add steps for generating tokenizer.bin for the llama2 model
- Optimize local model runtime on macOS (model load time: 10.39s, time to first generated token: 0.739s, generated token rate: 0.3089 toks/sec)
- Android Emulator: pte file transfer hangs / crashes the emulator for the 4GB model file
- Add steps for running on iOS
Stories Model
- Fix error: RuntimeError: mmap can only be used with files saved with torch.save(./stories/stories110M.pt, _use_new_zipfile_serialization=True)
from executorch.
Thanks @chauhang for reporting this issue! Could you confirm the vocab_size in llama2 7B model's params.json?
Also tested for llama2-7b after updating vocab_size to 32000, getting error AttributeError: '_OpNamespace' 'llama' object has no attribute 'sdpa_with_kv_cache'
Full error logs here
After removing the sdpa param I was able to proceed up to running the model on the computer. On running the model I get the error The tokenizer vocab size 84545034 is larger than the model vocab size 32000. .... In function generate(), assert failed (num_prompt_tokens >= 1): Expected at least 1 prompt token
Full logs here
@iseeyuan For the meta-llama/Llama-2-7b model the params.json on HF is:
{"dim": 4096, "multiple_of": 256, "n_heads": 32, "n_layers": 32, "norm_eps": 1e-05, "vocab_size": -1}
Also checked the 13b/70b base models and the chat models; all of them have vocab_size=-1 in their params.json
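Until export_llama handles this automatically, the downloaded params.json can be patched before export. A minimal stdlib-only sketch (the helper name is ours and the path is illustrative; 32000 is the Llama 2 tokenizer vocab size):

```python
import json
from pathlib import Path

def patch_vocab_size(params_path: str, vocab_size: int = 32000) -> None:
    """Replace the placeholder vocab_size (-1) in a Llama params.json."""
    path = Path(params_path)
    params = json.loads(path.read_text())
    # Only rewrite the file when the placeholder value is present.
    if params.get("vocab_size", -1) < 0:
        params["vocab_size"] = vocab_size
        path.write_text(json.dumps(params))

# Example: patch_vocab_size("Llama-2-7b/params.json")
```

The other keys (dim, n_heads, n_layers, ...) are left untouched, so the patched file stays compatible with the export script.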
@chauhang, it's a bug in our code. We should provide an option so that export_llama works out of the box given a downloaded folder, whether from the official llama website or from HuggingFace.
@chauhang , the second issue, Export with SDPA failed with [errors](https://gist.github.com/chauhang/ca75857c6a152df65b79302fefa1fe2c?permalink_comment_id=5015390#gistcomment-5015390) for AttributeError: '_OpNamespace' 'llama' object has no attribute 'sdpa_with_kv_cache'
should have been fixed in main branch over the weekend. Could you pull the updated version and give it another try?
The performance afterwards may also be affected, since sdpa_with_kv_cache will now be used.
Also tested for llama2-7b after updating vocab_size to 32000, getting error
AttributeError: '_OpNamespace' 'llama' object has no attribute 'sdpa_with_kv_cache'
Might be related to @larryliu0820's diff that got reverted recently
updated
we should just cherry-pick that, right?
Thanks @chauhang
Some fixes
Things are fixed now.