
Comments (11)

chauhang commented on August 16, 2024

Got the llama2-7b model working on macOS and Android.

Local model runtime on macOS: Model load time: 10.39s, Time to first generated token: 0.739s, Generated token rate: 0.3089 toks/sec
Android Samsung Galaxy S22 runtime: Model load time: 12.05s, Time to first generated token: 8.448s, Generated token rate: 0.0777 toks/sec

Updated list of issues:

Llama2 model

  • vocab_size in params.json from HF downloads is -1 and has to be changed manually to 32000 before export can proceed; update the script/readme steps
  • Export with SDPA fails with AttributeError: '_OpNamespace' 'llama' object has no attribute 'sdpa_with_kv_cache'
  • Update readme to add steps for generating tokenizer.bin for llama2 model
  • Optimize local model runtime on macOS (Model load time: 10.39s, Time to first generated token: 0.739s, Generated token rate: 0.3089 toks/sec)
  • Android Emulator: pte file transfer hangs / crashes the emulator for a 4 GB model file
  • Add steps for running on iOS

Stories Model

  • Fix error RuntimeError: mmap can only be used with files saved with torch.save(./stories/stories110M.pt, _use_new_zipfile_serialization=True)

from executorch.

iseeyuan commented on August 16, 2024

Thanks @chauhang for reporting this issue! Could you confirm the vocab_size in llama2 7B model's params.json?

chauhang commented on August 16, 2024

Also tested for llama2-7b after updating vocab_size to 32000, getting error AttributeError: '_OpNamespace' 'llama' object has no attribute 'sdpa_with_kv_cache'

Full error logs here

chauhang commented on August 16, 2024

After removing the sdpa param I was able to proceed up to running the model on my computer. On running the model I get the error: The tokenizer vocab size 84545034 is larger than the model vocab size 32000. .... In function generate(), assert failed (num_prompt_tokens >= 1): Expected at least 1 prompt token

Full logs here

chauhang commented on August 16, 2024

@iseeyuan For the meta-llama/Llama-2-7b model the params.json on HF is:

{"dim": 4096, "multiple_of": 256, "n_heads": 32, "n_layers": 32, "norm_eps": 1e-05, "vocab_size": -1}

I also checked the 13b/70b base models and the chat models; all of them have vocab_size=-1 in their params.json.
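For anyone hitting the same thing, here is a minimal sketch of the manual workaround, assuming the only change needed is replacing the -1 with 32000 as described in this thread. The helper name `patch_vocab_size` and the example path are hypothetical and are not part of executorch.

```python
import json
from pathlib import Path


def patch_vocab_size(params_path, vocab_size=32000):
    """Rewrite params.json in place when vocab_size is missing or not positive.

    Hypothetical helper: 32000 is the value used manually in this thread
    for llama2; other models may need a different value.
    """
    path = Path(params_path)
    params = json.loads(path.read_text())
    if params.get("vocab_size", -1) <= 0:
        params["vocab_size"] = vocab_size
        path.write_text(json.dumps(params))
    return params


# Example call (the path is an assumption, adjust to your download folder):
# patch_vocab_size("Llama-2-7b/params.json")
```

The file is only rewritten when vocab_size is missing or not positive, so running it against an already-correct params.json leaves the file untouched.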

iseeyuan commented on August 16, 2024

@chauhang, it's a bug in our code. We should provide an option so that export_llama works out of the box given a downloaded folder, whether it comes from the official Llama website or from HuggingFace.

iseeyuan commented on August 16, 2024

@chauhang, the second issue, export with SDPA failing with [errors](https://gist.github.com/chauhang/ca75857c6a152df65b79302fefa1fe2c?permalink_comment_id=5015390#gistcomment-5015390) for AttributeError: '_OpNamespace' 'llama' object has no attribute 'sdpa_with_kv_cache', should have been fixed in the main branch over the weekend. Could you pull the updated version and give it another try?
Performance may also change afterwards, since sdpa_with_kv_cache will then be used.

kimishpatel commented on August 16, 2024

Also tested for llama2-7b after updating vocab_size to 32000, getting error AttributeError: '_OpNamespace' 'llama' object has no attribute 'sdpa_with_kv_cache'

Might be related to @larryliu0820's diff that got reverted recently

kimishpatel commented on August 16, 2024

updated

we should just cherry-pick that, right?

mergennachin commented on August 16, 2024

Thanks @chauhang

Some fixes

#2926

mergennachin commented on August 16, 2024

Things are fixed now.
