Comments (4)
I appreciate your enthusiasm with testing the model out, since I only added it a few hours ago... but I'm still adding support for it to the library! I will let you know when it is supported.
from transformers.js.
You can follow along in the v3 branch: #545
Here's some example code which should work:
import { AutoTokenizer, AutoProcessor, RawImage, LlavaForConditionalGeneration } from '@xenova/transformers';
// Load tokenizer, processor and model
const model_id = 'Xenova/nanoLLaVA';
const tokenizer = await AutoTokenizer.from_pretrained(model_id);
const processor = await AutoProcessor.from_pretrained(model_id);
const model = await LlavaForConditionalGeneration.from_pretrained(model_id, {
dtype: {
embed_tokens: 'fp16',
vision_encoder: 'q8', // or 'fp16'
decoder_model_merged: 'q4', // or 'q8'
},
});
// Prepare text inputs
const prompt = 'Describe this image in detail';
const messages = [
{ 'role': 'user', 'content': `<image>\n${prompt}` }
]
const text = tokenizer.apply_chat_template(messages, { tokenize: false, add_generation_prompt: true })
const text_inputs = tokenizer(text, { padding: true });
// Prepare vision inputs
const url = 'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/cats.jpg'
const image = await RawImage.fromURL(url);
const vision_inputs = await processor(image);
// Generate response
const inputs = { ...text_inputs, ...vision_inputs };
const output = await model.generate({
...inputs,
do_sample: false,
max_new_tokens: 64,
});
// Decode output
const decoded = tokenizer.batch_decode(output, { skip_special_tokens: false });
console.log('decoded', decoded);
Note that this may change in future, and I'll update the model card when I've done some more testing.
from transformers.js.
The model card has been updated with example code 👍 https://huggingface.co/Xenova/nanoLLaVA
We also put an online demo out for you to try: https://huggingface.co/spaces/Xenova/experimental-nanollava-webgpu
Example videos:
nanollava-webgpu.mp4
nanollava-webgpu-2.mp4
from transformers.js.
Brilliant, thank you very much!
I'm closely watching this feature, and if you link a PR for this I can glean from the work and help maintain the code!
from transformers.js.
Related Issues (20)
- V3 audio transcription: aud.subarray is not a function HOT 8
- range error: array buffer allocation failed <- how to catch this error?
- transformers@latest: Unsupported model IR version: 9, max supported IR version: 8 HOT 1
- Support nomic-ai/nomic-embed-vision-v1.5 HOT 1
- AutoModel.from_pretrained - Which model is loaded HOT 1
- The scripts/convert.py script fails for a few reasons HOT 1
- RAGatouille/Colbert support
- Result is wrong when decoding tokens one by one HOT 1
- How do you delete a downloaded model? HOT 2
- Support for Both Word-Level and Sentence-Level Timestamps in ASR Decoding
- Segmentation fault while converting Bert-base-uncased with README command HOT 2
- Can't run depth-anything-v2 HOT 1
- Support for react-native
- JavaScript code completion model
- [Severe] Memory leak issue under WebGPU Whisper transcribe pipeline
- 4bit ONNX models support HOT 1
- how to retain spiece token markers HOT 2
- No loader is configured for ".node" files
- musicgen example run error on lastest v3
- compat with transformers >= 4.40 and tokenizers >= 0.19 HOT 2
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from transformers.js.