
modelfusion's Issues

Code execution tool

Hi there,

Is it possible to build something similar to Langchain's PythonREPLTool() for the JS/Node.js ecosystem?
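Something along these lines is what I have in mind, as a rough sketch using node:vm (all names are made up, and node:vm is not a real security sandbox, so a production version would need proper isolation):

import { runInNewContext } from "node:vm";

// Run model-generated JavaScript in an isolated vm context and return the
// result as a string the model can consume.
function executeJavaScript(code: string): string {
  try {
    const sandbox: Record<string, unknown> = {}; // globals exposed to the code
    const result = runInNewContext(code, sandbox, { timeout: 1000 });
    return JSON.stringify(result) ?? "undefined";
  } catch (error) {
    return `Error: ${(error as Error).message}`;
  }
}

console.log(executeJavaScript("[1, 2, 3].map((x) => x * 2)")); // [2,4,6]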

Langchain Integration

Is it possible to integrate the vector DB with Langchain in order to use chains and retrieval agents?
Thanks!

Add chunkOverlap parameter to splitAtToken / splitAtCharacter

Ideally there would just be an optional 'overlap' parameter in the existing split functions. This might be a good time to add a test framework as well, to verify that the splitting works correctly; ideally Jest, because GPT-4 can easily write tests for it.
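To illustrate, a rough sketch of how overlap could work for the character-based variant (names are illustrative, not the actual splitAtCharacter internals):

function splitWithOverlap(
  text: string,
  maxCharsPerChunk: number,
  overlap = 0
): string[] {
  if (overlap >= maxCharsPerChunk) {
    throw new Error("overlap must be smaller than maxCharsPerChunk");
  }
  const chunks: string[] = [];
  const step = maxCharsPerChunk - overlap; // each chunk starts `step` chars after the previous
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + maxCharsPerChunk));
    if (start + maxCharsPerChunk >= text.length) break; // last chunk reached
  }
  return chunks;
}

// splitWithOverlap("abcdefghij", 4, 1) => ["abcd", "defg", "ghij"]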

node requires >=18, is the requirement too high?

I installed it with Node 14, and no error was reported during development and use, but an error occurred when I tried to package and deploy:
error [email protected]: The engine "node" is incompatible with this module. Expected version ">=18". Got "14.18.0"
error Found incompatible module.

Can no longer use in the browser as of v0.63.0

Is there a way to opt out of runs when importing?

Module build failed: UnhandledSchemeError: Reading from "node:async_hooks" is not handled by plugins (Unhandled scheme).
Webpack supports "data:" and "file:" URIs by default.
You may need an additional plugin to handle "node:" URIs.
Import trace for requested module:
node:async_hooks
../../node_modules/.pnpm/[email protected]/node_modules/modelfusion/core/getRun.js
../../node_modules/.pnpm/[email protected]/node_modules/modelfusion/core/index.js
../../node_modules/.pnpm/[email protected]/node_modules/modelfusion/index.js
../core/src/nodes/openai/openai-function.ts
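Until there's a way to opt out in the library itself, a bundler-side workaround may help. A sketch for webpack 5 (treat it as a starting point, not a verified fix for every setup):

// webpack.config.ts
import webpack from "webpack";

const config: webpack.Configuration = {
  plugins: [
    // strip the "node:" scheme so the normal resolver and fallbacks apply
    new webpack.NormalModuleReplacementPlugin(/^node:/, (resource) => {
      resource.request = resource.request.replace(/^node:/, "");
    }),
  ],
  resolve: {
    // stub out async_hooks in browser bundles
    fallback: { async_hooks: false },
  },
};

export default config;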

Caching fails on subsequent invocations

Here's a weird thing I'm experiencing. I'm testing out caching (and making a new FileCache class). I'm trying to validate the existing MemoryCache behavior in the only available example, at examples/basic/src/model-provider/ollama/ollama-chat-generate-text-caching-example.ts. The first time I run this file, it works. Each subsequent time, it fails with a consistent error. Strangely, if I swap out the Ollama model (going back and forth between mistral:latest and llama2:latest), caching will once again work, but only once; then the error returns on subsequent calls until I swap models again. I'm baffled, since each run occurs in a separate process.

{
  eventType: 'finished',
  functionType: 'generate-text',
  callId: 'call-lMjNEWwz6-HuNHNc207OO',
  model: { provider: 'ollama', modelName: 'llama2:latest' },
  settings: { maxGenerationTokens: 100, stopSequences: [] },
  input: 'Write a short story about a robot learning to love:',
  timestamp: 2024-01-10T18:25:59.000Z,
  startTimestamp: 2024-01-10T18:25:59.000Z,
  finishTimestamp: 2024-01-10T18:26:01.992Z,
  durationInMs: 2989,
  result: {
    status: 'error',
    error: {
      url: 'http://127.0.0.1:11434/api/chat',
      requestBodyValues: [Object],
      statusCode: 200,
      responseBody: '{"model":"llama2:latest","created_at":"2024-01-10T18:26:01.977632Z","message":{"role":"assistant","content":"In the year 2154, robots had been a part of everyday life for centuries. They worked, played, and even lived alongside humans, but they never truly experienced emotions. That was, until the day a robot named Zeta learned to love.\\n\\nZeta was a sleek, silver machine with glowing blue eyes. She had been designed to assist humans in various tasks, from cooking to cleaning to providing companionship. But despite her advanced programming"},"done":true,"total_duration":2958093208,"load_duration":2757542,"prompt_eval_duration":206523000,"eval_count":100,"eval_duration":2746259000}',
      cause: [Object],
      isRetryable: false,
      name: 'ApiCallError'
    }
  }
}
ApiCallError: Invalid JSON response
    at handler (/Users/jakedetels/www/test/modelfusion/examples/basic/node_modules/modelfusion/model-provider/ollama/OllamaChatModel.cjs:269:23)
    at processTicksAndRejections (node:internal/process/task_queues:95:5)
    ... 5 lines matching cause stack trace ...
    at async executeStandardCall (/Users/jakedetels/www/test/modelfusion/examples/basic/node_modules/modelfusion/model-function/executeStandardCall.cjs:45:20)
    at async generateText (/Users/jakedetels/www/test/modelfusion/examples/basic/node_modules/modelfusion/model-function/generate-text/generateText.cjs:6:26)
    at async main (/Users/jakedetels/www/test/modelfusion/examples/basic/src/model-provider/ollama/ollama-chat-generate-text-caching-example.ts:12:17) {
  url: 'http://127.0.0.1:11434/api/chat',
  requestBodyValues: {
    stream: false,
    model: 'llama2:latest',
    messages: [ [Object] ],
    format: undefined,
    options: {
      mirostat: undefined,
      mirostat_eta: undefined,
      mirostat_tau: undefined,
      num_gpu: undefined,
      num_gqa: undefined,
      num_predict: 100,
      num_threads: undefined,
      repeat_last_n: undefined,
      repeat_penalty: undefined,
      seed: undefined,
      stop: [],
      temperature: undefined,
      tfs_z: undefined,
      top_k: undefined,
      top_p: undefined
    },
    template: undefined
  },
  statusCode: 200,
  responseBody: '{"model":"llama2:latest","created_at":"2024-01-10T18:26:01.977632Z","message":{"role":"assistant","content":"In the year 2154, robots had been a part of everyday life for centuries. They worked, played, and even lived alongside humans, but they never truly experienced emotions. That was, until the day a robot named Zeta learned to love.\\n\\nZeta was a sleek, silver machine with glowing blue eyes. She had been designed to assist humans in various tasks, from cooking to cleaning to providing companionship. But despite her advanced programming"},"done":true,"total_duration":2958093208,"load_duration":2757542,"prompt_eval_duration":206523000,"eval_count":100,"eval_duration":2746259000}',
  cause: TypeValidationError: Type validation failed: Structure: {"model":"llama2:latest","created_at":"2024-01-10T18:26:01.977632Z","message":{"role":"assistant","content":"In the year 2154, robots had been a part of everyday life for centuries. They worked, played, and even lived alongside humans, but they never truly experienced emotions. That was, until the day a robot named Zeta learned to love.\n\nZeta was a sleek, silver machine with glowing blue eyes. She had been designed to assist humans in various tasks, from cooking to cleaning to providing companionship. But despite her advanced programming"},"done":true,"total_duration":2958093208,"load_duration":2757542,"prompt_eval_duration":206523000,"eval_count":100,"eval_duration":2746259000}.
  Error message: [
    {
      "code": "invalid_union",
      "unionErrors": [
        {
          "issues": [
            {
              "code": "invalid_type",
              "expected": "number",
              "received": "undefined",
              "path": [
                "prompt_eval_count"
              ],
              "message": "Required"
            }
          ],
          "name": "ZodError"
        },
        {
          "issues": [
            {
              "received": true,
              "code": "invalid_literal",
              "expected": false,
              "path": [
                "done"
              ],
              "message": "Invalid literal value, expected false"
            }
          ],
          "name": "ZodError"
        }
      ],
      "path": [],
      "message": "Invalid input"
    }
  ]
      at safeValidateTypes (/Users/jakedetels/www/test/modelfusion/examples/basic/node_modules/modelfusion/core/schema/validateTypes.cjs:50:20)
      at safeParseJSON (/Users/jakedetels/www/test/modelfusion/examples/basic/node_modules/modelfusion/core/schema/parseJSON.cjs:37:57)
      ... 6 lines matching cause stack trace ...
      at async runSafe (/Users/jakedetels/www/test/modelfusion/examples/basic/node_modules/modelfusion/util/runSafe.cjs:6:35)
      at async executeStandardCall (/Users/jakedetels/www/test/modelfusion/examples/basic/node_modules/modelfusion/model-function/executeStandardCall.cjs:45:20) {
    structure: {
      model: 'llama2:latest',
      created_at: '2024-01-10T18:26:01.977632Z',
      message: [Object],
      done: true,
      total_duration: 2958093208,
      load_duration: 2757542,
      prompt_eval_duration: 206523000,
      eval_count: 100,
      eval_duration: 2746259000
    },
    cause: ZodError: [
      {
        "code": "invalid_union",
        "unionErrors": [
          {
            "issues": [
              {
                "code": "invalid_type",
                "expected": "number",
                "received": "undefined",
                "path": [
                  "prompt_eval_count"
                ],
                "message": "Required"
              }
            ],
            "name": "ZodError"
          },
          {
            "issues": [
              {
                "received": true,
                "code": "invalid_literal",
                "expected": false,
                "path": [
                  "done"
                ],
                "message": "Invalid literal value, expected false"
              }
            ],
            "name": "ZodError"
          }
        ],
        "path": [],
        "message": "Invalid input"
      }
    ]
        at Object.get error [as error] (/Users/jakedetels/www/test/modelfusion/examples/basic/node_modules/zod/lib/types.js:43:31)
        at safeValidateTypes (/Users/jakedetels/www/test/modelfusion/examples/basic/node_modules/modelfusion/core/schema/validateTypes.cjs:52:41)
        at safeParseJSON (/Users/jakedetels/www/test/modelfusion/examples/basic/node_modules/modelfusion/core/schema/parseJSON.cjs:37:57)
        at handler (/Users/jakedetels/www/test/modelfusion/examples/basic/node_modules/modelfusion/model-provider/ollama/OllamaChatModel.cjs:257:67)
        at processTicksAndRejections (node:internal/process/task_queues:95:5)
        at async postToApi (/Users/jakedetels/www/test/modelfusion/examples/basic/node_modules/modelfusion/core/api/postToApi.cjs:140:20)
        at async OllamaChatModel.doGenerateTexts (/Users/jakedetels/www/test/modelfusion/examples/basic/node_modules/modelfusion/model-provider/ollama/OllamaChatModel.cjs:122:51)
        at async getGeneratedTexts (/Users/jakedetels/www/test/modelfusion/examples/basic/node_modules/modelfusion/model-function/generate-text/generateText.cjs:41:32)
        at async generateResponse (/Users/jakedetels/www/test/modelfusion/examples/basic/node_modules/modelfusion/model-function/generate-text/generateText.cjs:53:28)
        at async runSafe (/Users/jakedetels/www/test/modelfusion/examples/basic/node_modules/modelfusion/util/runSafe.cjs:6:35) {
      issues: [Array],
      addIssue: [Function (anonymous)],
      addIssues: [Function (anonymous)],
      errors: [Array]
    }
  },
  isRetryable: false,
  data: undefined
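My best guess after staring at the error: the response schema is a union that requires either prompt_eval_count (final response) or done: false (streaming chunk), and Ollama appears to omit prompt_eval_count when the prompt evaluation is served from its internal cache, which would explain why swapping models makes it work exactly once. A relaxed schema sketch based on the fields visible above (illustrative only, not the library's actual schema):

import { z } from "zod";

// Mirror the fields in the error output, but make prompt_eval_count
// optional, since Ollama seems to omit it on cache hits.
const ollamaChatResponseSchema = z.object({
  model: z.string(),
  created_at: z.string(),
  message: z.object({ role: z.string(), content: z.string() }),
  done: z.literal(true),
  total_duration: z.number(),
  load_duration: z.number().optional(),
  prompt_eval_count: z.number().optional(), // missing on cache hits
  prompt_eval_duration: z.number().optional(),
  eval_count: z.number(),
  eval_duration: z.number(),
});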

Support parallel tool calls and JSON guarantees

Warning

Aligning ModelFusion to the new OpenAI tool calling and the JSON output guarantees of OpenAI, Ollama, and llama.cpp will lead to breaking changes.

OpenAI introduced parallel tool calling on Dev Day and deprecated function calls. They also introduced enforced JSON output, which Ollama and llama.cpp support as well.

I will therefore:

  • introduce a generateToolCalls abstraction (and potentially generateToolCallsOrText); a rough signature sketch follows the task list
  • update useTool to use generateToolCalls
  • change generateStructure to use forced JSON output instead and only require a schema
  • remove generateStructureOrText

Tasks

  • implement generateToolCall for single tool call
  • implement generateToolCalls for multiple tool calls (and different tools)
  • switch the various useTool (and useToolsOrGenerateText) functions to be backed by generateToolCall (and generateToolCallsOrText)
  • update Tool signature and tools
  • add support for feeding back results in OpenAI chat
  • remove generateStructureOrText
  • change generateStructure to not be function-call specific
  • change useTool error handling
  • add various adapters
  • update documentation
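A rough sketch of the signatures I have in mind (hypothetical shapes for discussion, subject to change):

import { z } from "zod";

interface ToolDefinition<PARAMETERS> {
  name: string;
  description?: string;
  parameters: z.ZodType<PARAMETERS>;
}

interface ToolCall<PARAMETERS> {
  id: string; // provider tool-call id, fed back with the tool result
  name: string;
  args: PARAMETERS;
}

// Force a single tool call:
declare function generateToolCall<PARAMETERS>(
  model: unknown, // a tool-call-capable model
  tool: ToolDefinition<PARAMETERS>,
  prompt: string
): Promise<ToolCall<PARAMETERS>>;

// Let the model answer with text, or one or more parallel tool calls:
declare function generateToolCallsOrText(
  model: unknown,
  tools: Array<ToolDefinition<unknown>>,
  prompt: string
): Promise<{ text: string | null; toolCalls: Array<ToolCall<unknown>> }>;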

Add Anthropic provider

  • implement text generation
  • implement Anthropic prompt format (instruction & chat; see the sketch below)
  • implement text streaming
  • document
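For the prompt format task: Anthropic's text API expects alternating \n\nHuman: / \n\nAssistant: turns, ending with an open Assistant: marker. A minimal mapping sketch for the chat case (illustrative, not final code):

type AnthropicChatMessage = { role: "user" | "assistant"; content: string };

function toAnthropicPrompt(
  system: string | undefined,
  messages: AnthropicChatMessage[]
): string {
  let prompt = system ?? "";
  for (const message of messages) {
    prompt +=
      message.role === "user"
        ? `\n\nHuman: ${message.content}`
        : `\n\nAssistant: ${message.content}`;
  }
  return prompt + "\n\nAssistant:"; // the model completes after this marker
}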

OpenAI Azure streaming does not work

I have no problem using generateText through Azure, but calling streamText with the same parameters consistently throws an error. Aren't the parameters of these two methods supposed to be the same?

const textStream = await streamText(
  new OpenAIChatModel({
    api: new AzureOpenAIApiConfiguration({
      apiKey: process.env.AZURE_OPENAI_API_KEY,
      resourceName: process.env.AZURE_OPENAI_API_INSTANCE_NAME,
      deploymentId: process.env.AZURE_OPENAI_API_DEPLOYMENT_NAME,
      apiVersion: process.env.AZURE_OPENAI_API_VERSION,
    }),
    model: "gpt-3.5-turbo",
  }),
  [
    OpenAIChatMessage.system("You are a story writer. Write a story about:"),
    OpenAIChatMessage.user("A robot learning to love"),
  ]
);
Error:
'Error: JSONParseError: JSON parsing failed: Text: {"id":"","object":"","created":0,"model":"","prompt_filter_results":[{"prompt_index":0,"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}}}],"choices":[]}.\n' +
      'Error message: [\n' +
      '  {\n' +
      '    "received": "",\n' +
      '    "code": "invalid_literal",\n' +
      '    "expected": "chat.completion.chunk",\n' +
      '    "path": [\n' +
      '      "object"\n' +
      '    ],\n' +
      '    "message": "Invalid literal value, expected \"chat.completion.chunk\""\n' +
      '  }\n' +
      ']\n' +
      '    at AIService.streamText (/Users/popmart/song/src/acs/apps/server/dist/main.js:32952:19)\n' +
      '    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)'
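A guess at the cause: the first chunk Azure sends carries prompt_filter_results with "object": "" and an empty choices array, which fails the strict "chat.completion.chunk" literal check. A sketch of a more tolerant chunk schema, inferred from the error above (not the library's actual schema):

import { z } from "zod";

// Illustrative only: accept Azure's content-filter prelude chunk
// ("object": "" with no choices) alongside regular chunk events.
const azureTolerantChunkSchema = z.object({
  id: z.string(),
  object: z.union([z.literal("chat.completion.chunk"), z.literal("")]),
  created: z.number(),
  model: z.string(),
  choices: z.array(z.unknown()), // empty on the prelude chunk
});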

Embedding with AzureOpenAI

I am trying to get the pdf-chat-terminal example working with Azure.
I have created two API configurations for this: one for embedding and the other for chat. It looks like this:

const embeddApi = new AzureOpenAIApiConfiguration({
  // apiKey: automatically uses process.env.AZURE_OPENAI_API_KEY,
  resourceName: "my-resource-name",
  deploymentId: "my-embedd-id",
  apiVersion: "my-api-version",
})

Unfortunately, I get the following error: Too many inputs. The max number of inputs is 16....

Is there a way to set this parameter?
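In case it's useful, the workaround I'd try in the meantime: batch the chunks to at most 16 per call before upserting (sketch; toBatches is my own helper, not a ModelFusion function):

// Split an array into batches of at most `size` items.
function toBatches<T>(items: T[], size: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Usage, assuming `chunks`, `vectorIndex`, and the embedding model from
// the pdf-chat-terminal example:
// for (const batch of toBatches(chunks, 16)) {
//   await upsertIntoVectorIndex({
//     vectorIndex,
//     embeddingModel,
//     objects: batch,
//     getValueToEmbed: (chunk) => chunk.text,
//   });
// }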

Thanks in advance

PDF to Tweet - error

Generate text rewrite-tweet text-generation.
Generate text rewrite-tweet text-generation.
TypeError: responses is not iterable
at calculateOpenAIEmbeddingCostInMillicents

How can I solve this?

Support ElevenLabs text to speech websockets

I got this set up for a project over the weekend and it was such a pain 😅. It would be nice to have it wrapped up in a library like ModelFusion.

Here's the code I ended up with for the browser:

import { SequentialAsyncOperationQueue } from "./sequentialAsyncOperationQueue";

export class TextToSpeechStreamer {
  private voiceId = "LX4K2KUcue0ViWVHVMn6";
  private model = "eleven_monolingual_v1";
  private wsUrl = `wss://api.elevenlabs.io/v1/text-to-speech/${this.voiceId}/stream-input?model_id=${this.model}`;
  private ttsSocket: WebSocket = new WebSocket(this.wsUrl);
  private audioPlaybackQueue = new SequentialAsyncOperationQueue();
  private sentBOS = false;
  private insideFootnote = false;
  private buffer = ""; // carries partial text across sendTextDelta calls

  private constructor() {
    this.ttsSocket.onmessage = this.handleMessage.bind(this);
    this.ttsSocket.onerror = this.handleError.bind(this);
    this.ttsSocket.onclose = this.handleClose.bind(this);
  }

  static async create() {
    const ttsStreamer = new TextToSpeechStreamer();
    await new Promise((resolve) => {
      ttsStreamer.ttsSocket.onopen = resolve;
    });
    return ttsStreamer;
  }

  private handleMessage(event: MessageEvent) {
    const response = JSON.parse(event.data);

    console.log("Server response:", response);

    if (response.audio) {
      // decode and handle the audio data (e.g., play it)
      const audioChunk = atob(response.audio); // decode base64
      console.log("Received audio chunk: ", audioChunk);
      // Use AudioContext to play audioBuffer here
      // Decode the base64 audio and convert it to ArrayBuffer
      const audioData = Uint8Array.from(atob(response.audio), (c) =>
        c.charCodeAt(0)
      ).buffer;

      this.audioPlaybackQueue.enqueue(async () => {
        try {
          // Decode the MP3 encoded audio data
          let audioContext = new AudioContext();
          const buffer = await audioContext.decodeAudioData(audioData);
          const source = audioContext.createBufferSource();
          source.buffer = buffer;
          source.connect(audioContext.destination);
          source.start();
          await new Promise((resolve) => {
            source.onended = resolve;
          });
        } catch {}
      });
    } else {
      console.log("No audio data in the response");
    }

    if (response.isFinal) {
      // the generation is complete
    }

    if (response.normalizedAlignment) {
      // use the alignment info if needed
    }
  }

  private handleError(event: Event) {
    console.error("WebSocket Error:", event);
  }

  private handleClose(event: CloseEvent) {
    if (event.wasClean) {
      console.info(
        `Connection closed cleanly, code=${event.code}, reason=${event.reason}`
      );
    } else {
      console.warn("Connection died");
    }
  }

  async sendTextDeltas(textDeltas: AsyncIterable<string>) {
    for await (const textDelta of textDeltas) {
      this.sendTextDelta(textDelta);
    }
    this.done();
  }

  private send(text: string) {
    this.ttsSocket.send(JSON.stringify({ text, try_trigger_generation: true }));
  }

  sendTextDelta(text: string) {
    if (!this.sentBOS) {
      const bosMessage = {
        text: " ",
        voice_settings: {
          stability: 0.5,
          similarity_boost: true,
        },
        xi_api_key: import.meta.env.VITE_ELEVEN_LABS_API_KEY, // replace with your API key
      };
      this.ttsSocket.send(JSON.stringify(bosMessage));
      this.insideFootnote = false;
      this.sentBOS = true;
    }

    const splitters = [
      ".",
      ",",
      "?",
      "!",
      ";",
      ":",
      "β€”",
      "-",
      "(",
      ")",
      "}",
      " ",
    ];

    if (text.includes("[")) {
      // flush the buffer and the text before the [
      const [before] = text.split("[");
      const textPart = this.buffer + before;
      if (textPart) {
        this.send(textPart + " ");
      }
      this.buffer = "";
      this.insideFootnote = true;
      return;
    } else if (text.includes("]")) {
      // flush the buffer and the text after the ]
      const [, after] = text.split("]");
      const textPart = this.buffer + after;
      if (textPart) {
        this.send(textPart + " ");
      }
      this.buffer = "";
      this.insideFootnote = false;
      return;
    } else if (this.insideFootnote) {
      return;
    } else if (splitters.some((s) => this.buffer.endsWith(s))) {
      // buffered text ends at a natural break: send it, start a new buffer
      this.send(this.buffer + " ");
      this.buffer = text;
    } else if (splitters.some((s) => text.startsWith(s))) {
      // the new delta starts with a break character: send through the break
      this.send(this.buffer + text[0] + " ");
      this.buffer = text.slice(1);
    } else {
      this.buffer += text;
    }
  }

  done() {
    // flush any remaining buffered text, then send the EOS message
    if (this.buffer) {
      this.send(this.buffer + " ");
      this.buffer = "";
    }
    this.insideFootnote = false;
    this.ttsSocket.send(JSON.stringify({ text: "" }));
  }
}

I'd be glad to implement this in ModelFusion myself and make a PR if you think it's a reasonable addition to the library (I might need some guidance). I'd like to support both Node.js and browser environments if possible. Let me know what you think :)

Cannot find module ZodSchema imported from invokeFlow.js

Hi Lars,

Thanks again for the great work you're doing around Modelfusion.

Was wondering if you can give me any hints on what may be causing this error below? I'm using modelfusion as part of a Nuxt 3 application. This error keeps happening.

I've verified that the ZodSchema.js file is present and accessible. I've deleted node_modules and re-run npm install.

500
Cannot find module '/Users/fredcamacho/dev/hcgps/node_modules/modelfusion/core/structure/ZodSchema' imported from /Users/fredcamacho/dev/hcgps/node_modules/modelfusion/browser/invokeFlow.js

at new NodeError (node:internal/errors:405:5)
at finalizeResolution (node:internal/modules/esm/resolve:324:11)
at moduleResolve (node:internal/modules/esm/resolve:943:10)
at defaultResolve (node:internal/modules/esm/resolve:1129:11)
at nextResolve (node:internal/modules/esm/loader:163:28)
at ESMLoader.resolve (node:internal/modules/esm/loader:835:30)
at ESMLoader.getModuleJob (node:internal/modules/esm/loader:424:18)
at ModuleWrap.<anonymous> (node:internal/modules/esm/module_job:77:40)
at link (node:internal/modules/esm/module_job:76:36)

Thanks for your help!

Fred

Node llama.cpp bindings

So I recently found your project and really like your approach. I have actually been working on a similar project of my own, also based on TypeScript. I started my own project because I wanted something lightweight and focused, with minimal to zero dependencies. The concept is a toolbox of lightweight AI libraries that can be used independently or together to build a complete application. You have several of the pieces I have started building, so I am considering contributing to your project instead.

I was thinking about building a ModelFusion package that offers llama.cpp bindings for Node.js. It would allow people to use ModelFusion with local models without having to run a separate server.

Thoughts?

llama.cpp server / embeddings broken

Hey,
I had an older git clone of llama.cpp, and your integration with the llama.cpp server was working perfectly.
I cloned the latest version onto a new server but kept getting an 'Invalid JSON response' error.

RetryError: Failed after 1 attempt(s) with non-retryable error: 'Invalid JSON response'
    at _retryWithExponentialBackoff (/mnt/data/vscode/config/workspace/ai/jsagent/node_modules/modelfusion/core/api/retryWithExponentialBackoff.cjs:42:15)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async LlamaCppTextEmbeddingModel.doEmbedValues (/mnt/data/vscode/config/workspace/ai/jsagent/node_modules/modelfusion/model-provider/llamacpp/LlamaCppTextEmbeddingModel.cjs:73:26)
    at async Promise.all (index 1)
    at async generateResponse (/mnt/data/vscode/config/workspace/ai/jsagent/node_modules/modelfusion/model-function/embed/embed.cjs:44:31)
    at async runSafe (/mnt/data/vscode/config/workspace/ai/jsagent/node_modules/modelfusion/util/runSafe.cjs:6:35)
    at async executeStandardCall (/mnt/data/vscode/config/workspace/ai/jsagent/node_modules/modelfusion/model-function/executeStandardCall.cjs:45:20) {
  errors: [
    ApiCallError: Invalid JSON response
        at /mnt/data/vscode/config/workspace/ai/jsagent/node_modules/modelfusion/core/api/postToApi.cjs:8:15
        at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
        ... 6 lines matching cause stack trace ...
        at async executeStandardCall (/mnt/data/vscode/config/workspace/ai/jsagent/node_modules/modelfusion/model-function/executeStandardCall.cjs:45:20) {
      url: 'http://10.0.0.100:8080/embedding',
      requestBodyValues: [Object],
      statusCode: 200,
      cause: [ZodError],
      isRetryable: false
    }
  ],
  reason: 'errorNotRetryable'
}

After testing different git commits of llama.cpp, I found that cb33f43a2a9f5a5a5f8d290dd97c625d9ba97a2f was one of the last ones to still work (or one of the commits around it).
I know they have an issue open about implementing a new API, but it looks like that has not been merged yet, so I hope it's a simple fix to get this module to handle whatever they changed within the last two weeks. This is just a "nice to have", since they keep tweaking and improving things, and it would be nice to be able to use the latest version.
For anyone else having issues with this, you can go back to that version with:

git checkout cb33f43a2a9f5a5a5f8d290dd97c625d9ba97a2f

HuggingFace text embeddings are broken

Did some investigation. This seems to happen independently of ModelFusion. Need to revisit and see if it has been fixed on the HF side.

RetryError: Failed after 1 attempt(s) with non-retryable error: 'SentenceSimilarityInputsCheck expected dict not list: `__root__` in `parameters`'
    at _retryWithExponentialBackoff (/Users/lgrammel/repositories/modelfusion/dist/util/api/retryWithExponentialBackoff.cjs:42:15)
    at processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async Promise.all (index 0)
    at async runSafe (/Users/lgrammel/repositories/modelfusion/dist/util/runSafe.cjs:6:36)
    at async doExecuteCall (/Users/lgrammel/repositories/modelfusion/dist/model-function/executeCall.cjs:98:20)
    at async /Users/lgrammel/repositories/modelfusion/examples/basic/src/model-provider/huggingface/huggingface-text-embedding-example.ts:7:22 {
  errors: [
    HuggingFaceError [ApiCallError]: SentenceSimilarityInputsCheck expected dict not list: `__root__` in `parameters`
        at failedHuggingFaceCallResponseHandler (/Users/lgrammel/repositories/modelfusion/dist/model-provider/huggingface/HuggingFaceError.cjs:31:16)
        at processTicksAndRejections (node:internal/process/task_queues:95:5)
        at async postToApi (/Users/lgrammel/repositories/modelfusion/dist/util/api/postToApi.cjs:59:23)
        at async _retryWithExponentialBackoff (/Users/lgrammel/repositories/modelfusion/dist/util/api/retryWithExponentialBackoff.cjs:18:16)
        at async Promise.all (index 0)
        at async runSafe (/Users/lgrammel/repositories/modelfusion/dist/util/runSafe.cjs:6:36)
        at async doExecuteCall (/Users/lgrammel/repositories/modelfusion/dist/model-function/executeCall.cjs:98:20)
        at async /Users/lgrammel/repositories/modelfusion/examples/basic/src/model-provider/huggingface/huggingface-text-embedding-example.ts:7:22 {
      url: 'https://api-inference.huggingface.co/models/intfloat/e5-base-v2',
      requestBodyValues: [Object],
      statusCode: 400,
      cause: undefined,
      isRetryable: false,
      data: [Object]
    }
  ],
  reason: 'errorNotRetryable'
}
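For context: the Inference API routes intfloat/e5-base-v2 to the sentence-similarity pipeline, which expects a dict-shaped input rather than the list the embedding call sends (my reading of the error above). A sketch of the two request-body shapes:

// Shape the embedding integration sends (rejected with the error above):
const featureExtractionBody = {
  inputs: ["first text", "second text"],
};

// Shape the sentence-similarity pipeline expects instead:
const sentenceSimilarityBody = {
  inputs: {
    source_sentence: "first text",
    sentences: ["second text"],
  },
};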

How to pass in an OpenAI API key when not using environment variables?

Hi,

how do I pass in an OpenAI API key when not using environment variables?

I'm trying to get the "chat with pdf" example to work without the environment variable, and it gives an error when running upsertIntoVectorIndex.

OpenAI API key is missing. Pass it using the 'apiKey' parameter or set it as an environment variable named OPENAI_API_KEY.

Snippet of my code:

async function createEmbeddings(openAiApiKey: string) {
  const embeddingModel = new OpenAITextEmbeddingModel({
    model: 'text-embedding-ada-002',
    // TODO: openAiApiKey - where to put it?
  });

  // NOTE: a desperate attempt at putting the api key in...
  // eslint-disable-next-line @typescript-eslint/no-explicit-any
  const settings: any = embeddingModel.settings;
  settings.apiKey = openAiApiKey;

  const pages = await loadPdfPages(path);

  const chunks = await splitTextChunks(
    splitAtToken({
      maxTokensPerChunk: 256,
      tokenizer: embeddingModel.tokenizer,
    }),
    pages,
  );

  // here we'll face the error
  await upsertIntoVectorIndex({
    vectorIndex,
    embeddingModel,
    objects: chunks,
    getValueToEmbed: chunk => chunk.text,
  });
}
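For what it's worth, the pattern that seems intended (judging by the AzureOpenAIApiConfiguration usage in other issues) is to pass an API configuration via the model's api setting; this assumes an OpenAIApiConfiguration class exists with an apiKey option, analogous to the Azure one:

import { OpenAIApiConfiguration, OpenAITextEmbeddingModel } from "modelfusion";

// Assumption: OpenAIApiConfiguration accepts an explicit apiKey;
// openAiApiKey is the function parameter from the snippet above.
const embeddingModel = new OpenAITextEmbeddingModel({
  api: new OpenAIApiConfiguration({ apiKey: openAiApiKey }),
  model: "text-embedding-ada-002",
});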

Original model response not available when setting `fullResponse` to true in `streamText()`

Hey there. Love the ModelFusion library. Thanks for creating it, Lars!

According to the docs, setting fullResponse to true in the options for generateText(model, messages, options) is supposed to include the model's original response. In my case, I want to use streamText() instead of generateText(). I see in this example that streamText() also supports the fullResponse property. When setting that prop to true, I do observe an additional metadata object is included, but that metadata object doesn't appear to include the expected original response from the selected model provider (e.g., OpenAI or Mistral).

In my case, I'm using Mistral and their docs indicate an expected response looks like this:

{
  "id": "cmpl-e5cc70bb28c444948073e77776eb30ef",
  "object": "chat.completion",
  "created": 1702256327,
  "model": "mistral-tiny",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "I don't have a favorite condiment as I don't consume food..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 14,
    "completion_tokens": 93,
    "total_tokens": 107
  }
}

When I set fullResponse to true, I don't see any of the above expected response properties. Instead, ModelFusion returns a response with a metadata object that looks like this:

(screenshot of the metadata object returned by ModelFusion)

This seems like a bug, based on the wording of the docs which suggest one would get the original response (presumably from whichever LLM provider is selected).

Here's the ModelFusion docs I was referencing regarding fullResponse:

Rich Responses

For more advanced use cases, you might want to access the full response from the model, or the metadata about the call.
Model functions return rich results that include the original response and metadata when you set the fullResponse option to true.

// access the full response (needs to be typed) and the metadata:
const { text, texts, response, metadata } = await generateText(
  openai.CompletionTextGenerator({
    model: "gpt-3.5-turbo-instruct",
    maxGenerationTokens: 1000,
    n: 2, // generate 2 completions
  }),
  "Write a short story about a robot learning to love:\n\n"
  { fullResponse: true }
);

console.log(metadata);

// cast to the response type:
for (const choice of (response as OpenAICompletionResponse).choices) {
  console.log(choice.text);
}

Thanks in advance! I'm hoping we can get the original LLM provider's response.

BUG: Ollama response `prompt_eval_count` is a required field.

ZodError: [
  {
    "code": "invalid_union",
    "unionErrors": [
      {
        "issues": [
          {
            "code": "invalid_type",
            "expected": "number",
            "received": "undefined",
            "path": [
              "prompt_eval_count"
            ],
            "message": "Required"
          }
        ],
        "name": "ZodError"
      },
      {
        "issues": [
          {
            "received": true,
            "code": "invalid_literal",
            "expected": false,
            "path": [
              "done"
            ],
            "message": "Invalid literal value, expected false"
          }
        ],
        "name": "ZodError"
      }
    ],
    "path": [],
    "message": "Invalid input"
  }
]
    at get error [as error] (webpack-internal:///(app-pages-browser)/../../node_modules/.pnpm/[email protected]/node_modules/zod/lib/index.mjs:649:31)
    at safeValidateTypes (webpack-internal:///(app-pages-browser)/../../node_modules/.pnpm/[email protected]/node_modules/modelfusion/core/schema/validateTypes.js:54:41)
    at safeParseJSON (webpack-internal:///(app-pages-browser)/../../node_modules/.pnpm/[email protected]/node_modules/modelfusion/core/schema/parseJSON.js:40:84)
    at handler (webpack-internal:///(app-pages-browser)/../../node_modules/.pnpm/[email protected]/node_modules/modelfusion/model-provider/ollama/OllamaCompletionModel.js:271:107)
    at async postToApi (webpack-internal:///(app-pages-browser)/../../node_modules/.pnpm/[email protected]/node_modules/modelfusion/core/api/postToApi.js:144:20)
    at async OllamaCompletionModel.doGenerateTexts (webpack-internal:///(app-pages-browser)/../../node_modules/.pnpm/[email protected]/node_modules/modelfusion/model-provider/ollama/OllamaCompletionModel.js:144:51)
    at async getGeneratedTexts (webpack-internal:///(app-pages-browser)/../../node_modules/.pnpm/[email protected]/node_modules/modelfusion/model-function/generate-text/generateText.js:17:29)
    at async generateResponse (webpack-internal:///(app-pages-browser)/../../node_modules/.pnpm/[email protected]/node_modules/modelfusion/model-function/generate-text/generateText.js:55:28)
    at async runSafe (webpack-internal:///(app-pages-browser)/../../node_modules/.pnpm/[email protected]/node_modules/modelfusion/util/runSafe.js:7:35)
    at async executeStandardCall (webpack-internal:///(app-pages-browser)/../../node_modules/.pnpm/[email protected]/node_modules/modelfusion/model-function/executeStandardCall.js:54:20)
    at async generateText (webpack-internal:///(app-pages-browser)/../../node_modules/.pnpm/[email protected]/node_modules/modelfusion/model-function/generate-text/generateText.js:8:26)
    at async eval (webpack-internal:///(app-pages-browser)/../core/src/nodes/function/generateText.ts:290:21)

Support sqlite vector search module

There's a sqlite extension to support vector search called sqlite-vss. This would be a welcome addition to ModelFusion because sqlite is like a stepping stone between the in-memory JSON vector DB and a "proper" vector DB like Pinecone. The advantage it has over the in-memory JSON approach is that it doesn't take up a bunch of RAM, so I can deploy prototypes quickly for free without upgrading to paid tiers 😄. Also it's less setup than Pinecone.

I would be glad to implement this myself because I'm already familiar with sqlite-vss if that sounds good to you.
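To give a sense of the shape, here's roughly what the underlying calls look like with better-sqlite3 and the sqlite-vss loader (a sketch; the actual integration would wrap this in ModelFusion's vector index interface):

import Database from "better-sqlite3";
import * as sqliteVss from "sqlite-vss";

const db = new Database("vectors.db");
sqliteVss.load(db); // loads the vss0 extension into this connection

db.exec(
  "CREATE VIRTUAL TABLE IF NOT EXISTS vss_chunks USING vss0(embedding(1536))"
);

// insert an embedding (rowid links back to a content table)
const embedding = new Array(1536).fill(0.1); // stand-in for a real embedding
db.prepare("INSERT INTO vss_chunks(rowid, embedding) VALUES (?, ?)").run(
  1,
  JSON.stringify(embedding)
);

// k-nearest-neighbor search (vss_search requires a LIMIT clause)
const rows = db
  .prepare(
    "SELECT rowid, distance FROM vss_chunks WHERE vss_search(embedding, ?) LIMIT 5"
  )
  .all(JSON.stringify(embedding));
console.log(rows);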
