Comments (3)

github-actions commented on June 26, 2024

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @jpalvarezl @ralph-msft @trrwilson.

from azure-sdk-for-net.

trrwilson commented on June 26, 2024

Hello, @Freddeb! Thank you for getting in touch. I can share a quick way to get this information but would also love your feedback on how we could improve the discoverability of this information.

Where usage is: as you can see in OpenAI's API reference, usage is not provided per embedding entry (data array item), but rather per full response -- mirroring that, you can retrieve usage information from an EmbeddingCollection instance that you get from the multi-embedding-returning GenerateEmbeddingsAsync() ("Embeddings" with the 's') method:

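// The plural method takes a collection of inputs, even when there is only one.
EmbeddingCollection embeddings = await client.GenerateEmbeddingsAsync(["hello, world!"]);
// Usage is reported once for the whole response, not per item.
int totalTokensForResponse = embeddings.Usage.TotalTokens;
Embedding theOneActualEmbedding = embeddings[0];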

Why?

  • We created the singular GenerateEmbedding[Async]() method based on the observation that single-item cases were exceedingly common; much like Chat Completions rarely being called with n > 1 for multiple choices, the data array for Embeddings isn't particularly relevant in the majority of cases, where a single input at a time is provided.
  • But we were concerned that, in those uncommon cases where multiple inputs are provided, there may be confusion if we provided the same EmbeddingTokenUsage instance on multiple Embedding instances; e.g. if we had the below:
EmbeddingCollection multipleEmbeddings = await client.GenerateEmbeddingsAsync(
[
    "hello, world!",
    "this is a test",
    "I'd like multiple embeddings this time"
]);
// this is what's present right now, and represents actual usage
int totalTokensForOperation = multipleEmbeddings.Usage.TotalTokens;
int tokensForAllInputs = multipleEmbeddings.Usage.InputTokens;
// if we had it on each embedding, this could be misleading -- usage information isn't actually provided per data item!
int maybeMisleadingTokens = multipleEmbeddings[0].Usage.TotalTokens;
int sameAsAboveTokens = multipleEmbeddings[1].Usage.TotalTokens;
Assert.That(multipleEmbeddings[0].Usage.TotalTokens, Is.EqualTo(multipleEmbeddings[2].Usage.TotalTokens));

Question: what would make this easier? We're concerned about the confusion case (especially since it could produce an alarming misrepresentation about what's being paid for), but you've clearly hit a very troublesome discoverability problem for the single-item case we're trying to optimize for.

  • Documentation? It's a given that we need more of it, but would a README example for fetching usage have made this clear sooner?
  • Duplicate and rename the properties? If you had multiple Embedding instances as shown above and each had e.g. TotalTokensForOperation, would you see that as clarifying the problem, even if it seems overly wordy?
  • Method on Embedding? A GetOperationTokenUsage() method might make it even more clear -- though also incrementally less discoverable (though still better than a different method entirely!). Would that have seemed clear?
  • Another approach? We're fundamentally trying to make single-embedding calls easy while not misrepresenting that we can't provide usage information on a per-single-embedding-item basis. Whatever makes this clear is worth looking into!

Thanks again; hopefully this unblocks getting the information in the immediate term, and your input is greatly appreciated for making the medium term better!

Freddeb commented on June 26, 2024

Hello @trrwilson,

Once again, thank you for taking the time to answer my question and for providing a quick solution.

Here is my perspective as a developer.

The function GenerateEmbeddingAsync (without the 's'):
This function is my intuitive choice when I need to retrieve a user's question in the prompt of my bot and translate it into one vector before submitting it to my vectorized database. Knowing the number of tokens used makes sense for the function GenerateEmbeddingAsync, even for a single item. I need to record the different token usages for a complete turn.

The function GenerateEmbeddingsAsync (with the 's'):
This function certainly makes a lot of sense when you want to prepare multiple questions for your vectorized database or when you want to convert chunks of a complete document.

To be honest, I always have apprehension about using this type of function with a large amount of data (e.g., a complete 150-page document split into multiple chunks).
What if the operation goes wrong and crashes in the middle of the process (e.g., network issue, problematic chunk, etc.)?
Would we still need to pay for the tokens used even if the function call ended with an exception? Am I wrong?
Therefore, to prepare my vectorized database, I tend to do "1 chunk = 1 API function call."
I run the process in a loop on the document chunks and go get myself a coffee; the conversion is quite fast.
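
The "1 chunk = 1 API function call" loop described above can be sketched as follows. This is only an illustration under assumptions: EmbedChunk and ChunkEmbedder are hypothetical names, and the delegate stands in for a real call such as client.GenerateEmbeddingsAsync([chunk]) followed by reading Usage.TotalTokens. It keeps a running token total so that a failure partway through still leaves the usage recorded so far; it does not say anything about how failed calls are billed.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical stand-in for one embedding call. In real code this would wrap
// the SDK call for a single chunk and return its vector and reported usage.
public delegate Task<(float[] Vector, int TotalTokens)> EmbedChunk(string chunk);

public static class ChunkEmbedder
{
    // Embeds chunks one call at a time and accumulates token usage.
    // FailedAt is the index of the first chunk that threw, or -1 if none did.
    public static async Task<(List<float[]> Vectors, int TotalTokens, int FailedAt)> EmbedAllAsync(
        IReadOnlyList<string> chunks, EmbedChunk embed)
    {
        var vectors = new List<float[]>();
        int totalTokens = 0;

        for (int i = 0; i < chunks.Count; i++)
        {
            try
            {
                var (vector, tokens) = await embed(chunks[i]);
                vectors.Add(vector);
                totalTokens += tokens;
            }
            catch (Exception)
            {
                // Stop at the first failure; the caller can resume from FailedAt.
                return (vectors, totalTokens, i);
            }
        }

        return (vectors, totalTokens, -1);
    }
}
```

Because each chunk is its own call, a failure only affects that one chunk, and the caller can retry from the returned index instead of resubmitting the whole document.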

Now, I'll stop elaborating and answer your question.
The answer you gave me (the first 6 lines) is clear and sufficient for me.

If I wanted to convert a few chunks with a single call to the function GenerateEmbeddingsAsync, I would be content with the total input tokens and the total tokens retrieved on the EmbeddingCollection object.
Currently, I am interested in the token usage for a complete turn (User question -> Bot response).

And if, ultimately, I really wanted to know the token usage per vector in a list of vectors, I find your second code snippet's proposal very good (InputTokens/TotalTokens on the collection and per item).
I don't think there could be any confusion if there is clear documentation to accompany it. There is always a way to draw attention through an IntelliCode message when the developer accesses the EmbeddingCollection.Usage or Embedding.Usage property.

I hope this response is helpful.
I don't know if it's possible via GitHub, but ideally, this is the type of problem for which I would submit different solutions to the developer community for a vote.

Thanks again.
Fred
