
Comments (4)

xenova commented on September 3, 2024

I appreciate your enthusiasm for testing the model, given that I only added it a few hours ago, but I'm still adding support for it to the library! I'll let you know when it is supported.

from transformers.js.

xenova commented on September 3, 2024

You can follow along in the v3 branch: #545

Here's some example code which should work:

import { AutoTokenizer, AutoProcessor, RawImage, LlavaForConditionalGeneration } from '@xenova/transformers';

// Load tokenizer, processor and model
const model_id = 'Xenova/nanoLLaVA';
const tokenizer = await AutoTokenizer.from_pretrained(model_id);
const processor = await AutoProcessor.from_pretrained(model_id);
const model = await LlavaForConditionalGeneration.from_pretrained(model_id, {
    dtype: {
        embed_tokens: 'fp16',
        vision_encoder: 'q8', // or 'fp16'
        decoder_model_merged: 'q4', // or 'q8'
    },
});

// Prepare text inputs
const prompt = 'Describe this image in detail';
const messages = [
    { 'role': 'user', 'content': `<image>\n${prompt}` }
];
const text = tokenizer.apply_chat_template(messages, { tokenize: false, add_generation_prompt: true });
const text_inputs = tokenizer(text, { padding: true });

// Prepare vision inputs
const url = 'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/cats.jpg';
const image = await RawImage.fromURL(url);
const vision_inputs = await processor(image);

// Generate response
const inputs = { ...text_inputs, ...vision_inputs };
const output = await model.generate({
    ...inputs,
    do_sample: false,
    max_new_tokens: 64,
});

// Decode output
const decoded = tokenizer.batch_decode(output, { skip_special_tokens: false });
console.log('decoded', decoded);

Note that this may change in the future, and I'll update the model card once I've done some more testing.
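One thing to watch for when decoding: with decoder-only models, the generated sequence usually starts with the prompt tokens, so the decoded string repeats the prompt before the answer. Here is a minimal sketch of trimming the prompt, assuming each sequence is a plain array of token ids laid out as prompt ids followed by new ids. `stripPrompt` is a hypothetical helper, not part of transformers.js, and the ids are made up for illustration.

```javascript
// Hypothetical helper, not part of transformers.js.
// `sequences`: array of token-id arrays, each assumed to be
//              [prompt ids..., newly generated ids...].
// `promptLength`: number of prompt tokens at the start of each sequence.
function stripPrompt(sequences, promptLength) {
    if (!Number.isInteger(promptLength) || promptLength < 0) {
        throw new RangeError('promptLength must be a non-negative integer');
    }
    // Keep only the ids that come after the prompt.
    return sequences.map((ids) => ids.slice(promptLength));
}

// Made-up ids, for illustration only.
const promptIds = [101, 7, 8, 9];
const generated = [[...promptIds, 42, 43, 2]];
const newTokens = stripPrompt(generated, promptIds.length);
console.log(newTokens);
```

In the snippet above, the prompt length would come from the tokenized text inputs, and the output would first need converting to plain arrays. How to do that depends on the library version, so check the model card for the supported way.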


xenova commented on September 3, 2024

The model card has been updated with example code 👍 https://huggingface.co/Xenova/nanoLLaVA

We also put an online demo out for you to try: https://huggingface.co/spaces/Xenova/experimental-nanollava-webgpu

Example videos:

nanollava-webgpu.mp4
nanollava-webgpu-2.mp4


kendelljoseph commented on September 3, 2024

Brilliant, thank you very much!

I'm closely watching this feature, and if you link a PR for this I can glean from the work and help maintain the code!

