
gpt-tokens's People

Contributors

cainier, kingchan818, linj121, lox, qlee3, sebastiansandqvist


gpt-tokens's Issues

Throw an error when content contains <|endoftext|>

When using gpt-tokens to calculate tokens, I found that if the content contains values similar to <|endoftext|>, the calculation reports an error.
How about using a filtering step like text = text.replace(/<\|\w*\|>/g, '')?
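A minimal sketch of that workaround (the helper and regex are only a suggestion, not part of gpt-tokens itself):

import { GPTTokens } from 'gpt-tokens'

// Hypothetical helper: strip anything that looks like a special token,
// e.g. <|endoftext|>, so the underlying tokenizer does not throw.
function stripSpecialTokens(text: string): string {
  return text.replace(/<\|\w*\|>/g, '')
}

const usage = new GPTTokens({
  model: 'gpt-3.5-turbo',
  messages: [{ role: 'user', content: stripSpecialTokens('Hello <|endoftext|> world') }],
})
console.log(usage.usedTokens)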

How to count tokens for cl100k_base embedding?

Sorry for a probably stupid question, but my task is to count tokens before sending text for vector embedding using cl100k_base, which seems to be the encoding used by the text-embedding-3-large model.
I haven't seen such an example in the readme, and I'm wondering whether this model is supported by this module and how to use it properly for this task.
Since I only have a single text, should I put it into the 'system' or the 'user' role field?
And which model do I have to specify?
Thanks in advance!
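What I'm doing as a workaround for now (a sketch that bypasses gpt-tokens and uses the underlying @dqbd/tiktoken encoder directly; I'm not sure this is the intended approach):

import { get_encoding } from '@dqbd/tiktoken'

// For a single embedding input there is no chat framing, so the count is just
// the cl100k_base tokens of the raw text (the same encoder gpt-tokens uses).
const encoding = get_encoding('cl100k_base')
const tokenCount = encoding.encode('Text that will be sent for embedding').length
encoding.free() // the WASM encoder must be freed explicitly
console.log(tokenCount)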

Add gpt-4-0125-preview

Please add 'gpt-4-0125-preview' and the new alias 'gpt-4-turbo-preview' to the supported models list.

Function calling?

OpenAI models now support what is called function calling.

const usageInfo = new GPTTokens({
    model   : 'gpt-3.5-turbo-16k-0613',
    messages,
    functions,
});

Is there a chance that this library is going to support it? If not, could you please let me know how I should count the tokens used by the function definitions? Thank you.
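In the meantime I'm approximating it like this (a rough sketch, not the library's method: the function definitions are serialized to JSON and counted with cl100k_base, which only approximates how OpenAI formats functions internally):

import { get_encoding } from '@dqbd/tiktoken'
import { GPTTokens } from 'gpt-tokens'

// Hypothetical function definition in the OpenAI "functions" shape.
const functions = [
  {
    name: 'get_weather',
    description: 'Get the current weather for a city',
    parameters: { type: 'object', properties: { city: { type: 'string' } } },
  },
]

// Tokens for the chat messages, counted by gpt-tokens as usual.
const messageTokens = new GPTTokens({
  model: 'gpt-3.5-turbo-16k-0613',
  messages: [{ role: 'user', content: 'What is the weather in Paris?' }],
}).usedTokens

// Rough add-on for the function definitions: serialize and count with cl100k_base.
const encoding = get_encoding('cl100k_base')
const functionTokens = encoding.encode(JSON.stringify(functions)).length
encoding.free()

console.log(messageTokens + functionTokens) // approximate total prompt tokens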

Tiktoken encodings not re-used between GPTTokens instances (slow performance)

When calling this library many times (for instance, using it to split text into parts), it appears the internal encodings aren't re-used, and there is no way to make the library reuse them.

The result is that it's quite slow:

import { GPTTokens } from 'gpt-tokens'
// import { GPTTokens } from '../src/libs/gptTokens.js'

for (let i = 0; i < 1000; i++) {
  console.time('GPTTokens')
  const usageInfo = new GPTTokens({
    plus: false,
    model: 'gpt-3.5-turbo-0613',
    messages: [
      {
        role: 'user',
        content: 'Hello world',
      },
    ],
  })

  usageInfo.usedTokens
  usageInfo.promptUsedTokens
  usageInfo.completionUsedTokens
  usageInfo.usedUSD
  console.timeEnd('GPTTokens')
}

Returns:

GPTTokens: 332.625ms
GPTTokens: 290.321ms
GPTTokens: 273.416ms
GPTTokens: 264.106ms
GPTTokens: 281.858ms
GPTTokens: 257.714ms
GPTTokens: 280.463ms
GPTTokens: 282.296ms
GPTTokens: 255.335ms
GPTTokens: 274.843ms
GPTTokens: 268.74ms
GPTTokens: 269.419ms
GPTTokens: 279.843ms
GPTTokens: 252.028ms
GPTTokens: 276.782ms
GPTTokens: 283.575ms
GPTTokens: 258.711ms
GPTTokens: 284.372ms

When the encodings are cached in the module:

GPTTokens: 64.708ms
GPTTokens: 1.558ms
GPTTokens: 1.12ms
GPTTokens: 1.114ms
GPTTokens: 0.876ms
GPTTokens: 0.838ms
GPTTokens: 0.954ms
GPTTokens: 0.92ms
GPTTokens: 0.765ms
GPTTokens: 0.84ms
GPTTokens: 0.72ms
GPTTokens: 0.789ms
GPTTokens: 0.822ms
GPTTokens: 0.782ms
GPTTokens: 0.78ms
GPTTokens: 0.737ms
GPTTokens: 0.746ms
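For reference, a minimal sketch of the kind of module-level cache that produced the faster timings above (the helper names are illustrative, not the library's actual internals):

import { encoding_for_model, Tiktoken, TiktokenModel } from '@dqbd/tiktoken'

// Illustrative module-level cache: the expensive part is constructing the
// encoder (loading the BPE ranks), so build it once per model and reuse it.
const encodingCache = new Map<string, Tiktoken>()

function getCachedEncoding(model: TiktokenModel): Tiktoken {
  let encoding = encodingCache.get(model)
  if (!encoding) {
    encoding = encoding_for_model(model)
    encodingCache.set(model, encoding)
  }
  return encoding
}

// Subsequent calls for the same model become a sub-millisecond lookup.
const tokens = getCachedEncoding('gpt-3.5-turbo').encode('Hello world').length
console.log(tokens)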

Import is broken

After upgrading to the new release I started to get import errors.

{
"errorType": "Error",
"errorMessage": "Cannot find module '/var/task/node_modules/gpt-tokens/dist/index.ts' imported from /var/task/index.js\nDid you mean to import gpt-tokens/dist/index.js?",
"code": "ERR_MODULE_NOT_FOUND",
"url": "file:///var/task/node_modules/gpt-tokens/dist/index.ts",

It looks like the package.json "exports" entry points "import" at the wrong dist file: it references the TypeScript source instead of the built index.js (the index.d.ts belongs under "types").
"exports": {
".": {
"import": "./dist/index.ts",
Previous version 1.2.0 works well.
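For reference, a sketch of what a corrected exports map might look like (an assumption about the intended layout, not the package's actual published config):

{
  "exports": {
    ".": {
      "types": "./dist/index.d.ts",
      "import": "./dist/index.js",
      "require": "./dist/index.js"
    }
  }
}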

Requesting your help with gpt-tokens project

Hello! I really appreciate your project gpt-tokens, which is a fast BPE tokenizer for working with OpenAI's models. I need to use it on the Vercel Edge Runtime to count the tokens of the OpenAI responses that I get from a Vercel reverse proxy. However, I don't have any programming background, so I can't do it myself.

Therefore, I would like to ask you for a favor, if you have the time and interest. Could you please tell me how much, in US dollars, you would need as compensation? I am happy to pay a reasonable fee.

Thank you very much for your time and help.

Yuan

Error while importing in React

My code:

I am writing a function to trim the message history so that the token count doesn't exceed the max tokens.

import { GPTTokens } from "gpt-tokens";

let reduceTokens = (history, model = "gpt-3.5-turbo") => {
  // let usageInfo = new GPTTokens({model : model ,messages : history});
  
  // console.log(usageInfo.usedTokens);
  return history;
};

export default reduceTokens; 

While importing I am getting this error:

[screenshot of the error]

Please help...

[Idea] self-discipline plugin

To manage API usage across multiple projects without exceeding limits, consider implementing a function that calculates a delay in milliseconds for each request. This function should factor in userPriority and botPriority, ensuring quicker responses for paid users and higher-priority projects; lower-priority requests can have longer delays. Here's a prototype from my current project for your review.

// RPM (requests per minute), RPD (requests per day), TPM (tokens per minute), TPD (tokens per day), and IPM (images per minute)

import { GPTTokens, supportModelType } from 'gpt-tokens'

import connector, { RatesResults } from './connector'
import logger from '../../lib/logger'
import Stats from '../../models/Stats'
import { StatKeys } from '../../models/Stats.types'
const log = logger({ module: 'RateLimiter.model.ts' })

export enum OpenAIModels {
  'gpt-4' = 'gpt-4',
  'gpt-4-1106-preview' = 'gpt-4-1106-preview',
  'gpt-4-vision-preview' = 'gpt-4-vision-preview',
  'gpt-3.5-turbo' = 'gpt-3.5-turbo',
  'text-embedding-ada-002' = 'text-embedding-ada-002',
  'whisper-1' = 'whisper-1',
  'tts-1' = 'tts-1',
  'dall-e-2' = 'dall-e-2',
  'dall-e-3' = 'dall-e-3',
}
export interface MessageItem {
  name?: string
  role: 'system' | 'user' | 'assistant'
  content: string
}
type RateSettings = {
  RPM: number //  requests per minute
  RPD: number //  requests per day
  TPM: number //  tokens per minute
  TPD: number // tokens per day
  CS: number // context size
}
const MSDAY = 1000 * 60 * 60 * 24
const MSMINUTE = 1000 * 60

const limits: { [model in OpenAIModels]?: RateSettings } = {
  [OpenAIModels['gpt-4']]: { RPM: 10000, RPD: -1, TPM: 300000, TPD: -1, CS: 8190 },
  [OpenAIModels['gpt-4-1106-preview']]: { RPM: 500, RPD: 10000, TPM: 300000, TPD: -1, CS: 128000 },
  [OpenAIModels['gpt-4-vision-preview']]: { RPM: 20, RPD: 100, TPM: 300000, TPD: -1, CS: 128000 },
  [OpenAIModels['gpt-3.5-turbo']]: { RPM: 10000, RPD: -1, TPM: 1000000, TPD: -1, CS: 4000 },
  [OpenAIModels['text-embedding-ada-002']]: { RPM: 10000, RPD: -1, TPM: 5000000, TPD: -1, CS: -1 },
  [OpenAIModels['whisper-1']]: { RPM: 100, RPD: -1, TPM: -1, TPD: -1, CS: -1 },
  [OpenAIModels['tts-1']]: { RPM: 100, RPD: -1, TPM: -1, TPD: -1, CS: -1 },
  [OpenAIModels['dall-e-2']]: { RPM: 100, RPD: -1, TPM: -1, TPD: -1, CS: -1 },
  [OpenAIModels['dall-e-3']]: { RPM: 15, RPD: -1, TPM: -1, TPD: -1, CS: -1 },
}
const lowestPriorityDelayMultiplier = 4
const highestPriorityDelayMultiplier = 1.1
class RateLimiter {
  constructor() {
    //
  }
  public async registerChatRequest(
    model: OpenAIModels,
    messages: MessageItem[],
    tools: any[] = [],
    userPriority = 1, // from 0(lowest) to 1(highest)
    botPriority = 1 // from 0(lowest) to 1(highest)
  ): Promise<boolean> {
    //
    try {
      const usageInfo = new GPTTokens({
        model: model as supportModelType | undefined,
        messages,
      })
      const rateSetting: RateSettings | undefined = limits[model]
      if (!rateSetting) {
        throw new Error(`Invalid model [${model}]`)
      }

      if (rateSetting.CS > 0 && usageInfo.usedTokens > rateSetting.CS) {
        throw new Error('Chat exceeds token limit')
      }

      const currentConsumeRates: RatesResults = await connector.getRates(model)
      const delayForRate: number = this.getDelayForRate(rateSetting, currentConsumeRates, userPriority, botPriority)

      // pause for a bit
      if (delayForRate > 0) {
        await new Promise((resolve) => setTimeout(resolve, delayForRate))
      }

      await connector.register(model, usageInfo.usedTokens)
      await Stats.addStat({
        [StatKeys.consumeRequests]: 1,
        [StatKeys.consumeTokens]: usageInfo.usedTokens,
        [StatKeys.consumeCash]: usageInfo.usedUSD,
      })
      return true
    } catch (e) {
      log.error(e, 'Error registering chat request')
      return false
    }
  }

  getDelayForRate(
    rateSettings: RateSettings,
    currentConsumeRates: RatesResults,
    userPriority: number,
    botPriority: number
  ): number {
    // find the proper delay for each rate
    const rpmDelay =
      rateSettings.RPM > 0
        ? ((currentConsumeRates.RPM / rateSettings.RPM) * MSMINUTE) / (rateSettings.RPM - currentConsumeRates.RPM)
        : 0
    const rpdDelay =
      rateSettings.RPD > 0
        ? ((currentConsumeRates.RPD / rateSettings.RPD) * MSDAY) / (rateSettings.RPD - currentConsumeRates.RPD)
        : 0
    const tpmDelay =
      rateSettings.TPM > 0
        ? ((currentConsumeRates.TPM / rateSettings.TPM) * MSMINUTE) / (rateSettings.TPM - currentConsumeRates.TPM)
        : 0
    const tpdDelay =
      rateSettings.TPD > 0
        ? ((currentConsumeRates.TPD / rateSettings.TPD) * MSDAY) / (rateSettings.TPD - currentConsumeRates.TPD)
        : 0

    // take the highest delay and apply an additional coefficient for low-priority users
    const delay = Math.max(rpmDelay, rpdDelay, tpmDelay, tpdDelay)
    // extra delay multiplier for unprioritized requests, from highestPriorityDelayMultiplier (highest priority) to lowestPriorityDelayMultiplier (lowest priority)
    const totalPriorityK =
      (1 - userPriority * botPriority) * lowestPriorityDelayMultiplier + highestPriorityDelayMultiplier
    // rounded delay in ms necessary to fit the rate limits
    return Math.round(delay * totalPriorityK)
  }
}

export default new RateLimiter()
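For illustration, a hypothetical usage sketch (the import path and the wiring of connector and Stats above are assumed):

import rateLimiter, { OpenAIModels, MessageItem } from './RateLimiter.model'

async function handleChatRequest(messages: MessageItem[]): Promise<void> {
  // Waits as long as needed to stay inside the model's limits, then records
  // the consumed requests/tokens; returns false if registration failed.
  const ok = await rateLimiter.registerChatRequest(
    OpenAIModels['gpt-3.5-turbo'],
    messages,
    [],  // tools
    0.5, // userPriority: free-tier user
    1    // botPriority: high-priority project
  )
  if (!ok) throw new Error('Rate limiter rejected the request')
  // ... call the OpenAI API here
}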

Messages token count algorithm differs from the one in the OpenAI official cookbook

On this line: https://github.com/Cainier/gpt-tokens/blob/main/index.js#L170

For model "gpt-3.5-turbo-0613", at line 170, the code is using tokens_per_message = 4, tokens_per_name = -1

but in openAI's cookbook code: https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb , in the num_tokens_from_messages function (snippet IN [14]) , you can see it is using tokens_per_message = 3, tokens_per_name = 1

And from their code, gpt-3.5-turbo-0301 and gpt-3.5-turbo-0613 are using different token-count method, but code in this repo are using the same method: https://github.com/Cainier/gpt-tokens/blob/main/index.js#L164
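For reference, a sketch of the cookbook rule translated to TypeScript (the values mirror OpenAI's num_tokens_from_messages example for the -0613 models; the encoder call assumes @dqbd/tiktoken):

import { get_encoding } from '@dqbd/tiktoken'

interface ChatMessage {
  role: string
  content: string
  name?: string
}

// Cookbook values for gpt-3.5-turbo-0613 / gpt-4-0613 (gpt-3.5-turbo-0301 uses 4 and -1):
const TOKENS_PER_MESSAGE = 3
const TOKENS_PER_NAME = 1

function numTokensFromMessages(messages: ChatMessage[]): number {
  const encoding = get_encoding('cl100k_base')
  let numTokens = 0
  for (const message of messages) {
    numTokens += TOKENS_PER_MESSAGE
    numTokens += encoding.encode(message.role).length
    numTokens += encoding.encode(message.content).length
    if (message.name) numTokens += encoding.encode(message.name).length + TOKENS_PER_NAME
  }
  numTokens += 3 // every reply is primed with <|start|>assistant<|message|>
  encoding.free()
  return numTokens
}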

Counts are slightly off on completion for chat models

Awesome project, thank you for adding this to the ecosystem! My brother and I are currently working on https://github.com/openpipe/openpipe, and this package is incredibly useful to us.

I do notice that completion token counts are slightly off on some models. Specifically, it appears that GPTTokens always believes that the completion includes more tokens than it actually does. I created an experiment that compares the number of tokens OpenAI reports were used for a certain response (returned from non-streamed responses) against tokens calculated using GPTTokens (calculated on streamed responses). Here's the experiment: https://openpipe.ai/experiments/e2d5d255-5731-4dbc-9f83-7f642745404d.

I think we're using the latest version (1.0.10): https://github.com/OpenPipe/openpipe/blob/main/package.json#L39

And here are the relevant screenshots:

Non-streamed token counts (read from the response): [screenshot]

Streamed token counts (calculated using GPTTokens): [screenshot]

Again, amazing project! Starring now!
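For context, here's roughly how the comparison works (a simplified sketch assuming the openai v4 Node SDK; the real experiment counts text reassembled from streamed deltas, but counting the returned message text shows the same drift):

import OpenAI from 'openai'
import { GPTTokens } from 'gpt-tokens'

const openai = new OpenAI()

async function compareCompletionCounts() {
  // One non-streamed request: OpenAI reports the reference usage on it.
  const response = await openai.chat.completions.create({
    model: 'gpt-3.5-turbo',
    messages: [{ role: 'user', content: 'Say hello' }],
  })
  const reported = response.usage?.completion_tokens
  const completion = response.choices[0].message.content ?? ''

  // Count the same completion text with GPTTokens, as you would with text
  // reassembled from a stream (streamed responses carry no usage field).
  const counted = new GPTTokens({
    model: 'gpt-3.5-turbo',
    messages: [{ role: 'assistant', content: completion }],
  }).usedTokens

  console.log({ reported, counted }) // counted tends to come out a few tokens higher
}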

new GPTTokens() sometimes fails with RuntimeError: unreachable

When using this package, there are sometimes runtime errors in the downstream @dqbd/tiktoken dependency.

Code snippet:

new GPTTokens({
  model: 'gpt-3.5-turbo',
  messages: [
    {
      role: 'user',
      content: JSON.stringify(query),
    },
  ],
});

Error:

/Users/xxx/repos/gpt-pr-comment-summary/crawler/node_modules/@dqbd/tiktoken/tiktoken_bg.cjs:262
            wasm.tiktoken_encode(retptr, this.ptr, ptr0, len0, addHeapObject(allowed_special), addHeapObject(disallowed_special));
                 ^
RuntimeError: unreachable
    at wasm://wasm/00b5f812:wasm-function[563]:0x6a72a
    at wasm://wasm/00b5f812:wasm-function[665]:0x6fd7a
    at wasm://wasm/00b5f812:wasm-function[756]:0x70f7f
    at wasm://wasm/00b5f812:wasm-function[237]:0x5c43a
    at wasm://wasm/00b5f812:wasm-function[200]:0x4db89
    at wasm://wasm/00b5f812:wasm-function[34]:0x1f78a
    at wasm://wasm/00b5f812:wasm-function[159]:0x48dc3
    at Tiktoken.encode (/Users/xxx/repos/gpt-pr-comment-summary/crawler/node_modules/@dqbd/tiktoken/tiktoken_bg.cjs:262:18)
    at GPTTokens.num_tokens_from_messages (/Users/xxx/repos/gpt-pr-comment-summary/crawler/node_modules/gpt-tokens/index.js:162:40)
    at GPTTokens.num_tokens_from_messages (/Users/xxx/repos/gpt-pr-comment-summary/crawler/node_modules/gpt-tokens/index.js:126:25)

Add OpenAI fine-tuned model support

It seems there is no fine-tuned model in supportModelType. A fine-tuned model name looks like ft:gpt-3.5-turbo-0613:company::abcdefg. I think we should support fine-tuned models.


Here is the price reference for fine-tuned models: [screenshot of OpenAI fine-tuning pricing]
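Until that lands, a possible workaround is to map the fine-tuned model name back to its base model before counting (the helper below is hypothetical; fine-tuned pricing would still need separate handling):

import { GPTTokens, supportModelType } from 'gpt-tokens'

// Hypothetical helper: 'ft:gpt-3.5-turbo-0613:company::abcdefg' -> 'gpt-3.5-turbo-0613'.
// A fine-tuned model uses the same tokenizer as its base model, so the token
// count matches; only the USD cost differs.
function baseModelOf(model: string): supportModelType {
  return (model.startsWith('ft:') ? model.split(':')[1] : model) as supportModelType
}

const usage = new GPTTokens({
  model: baseModelOf('ft:gpt-3.5-turbo-0613:company::abcdefg'),
  messages: [{ role: 'user', content: 'Hello world' }],
})
console.log(usage.usedTokens)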
