
Comments (10)

machinekoder avatar machinekoder commented on July 16, 2024 4

Here, OpenAI acknowledges that GPT-4 "is not fully reliable" and "makes reasoning errors". In this case, it made a reasoning error about what model it is.

Might be an interesting eval to verify if a model is capable of knowing about itself without additional context.
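A rough sketch of what such a check could look like, in plain Python. This is not the evals framework's API: `self_identification_eval`, the `ask` callable, and the `fake_gpt4` stub are hypothetical names made up for illustration, and the stub's reply simply mirrors the kind of answer reported in this thread.

```python
from typing import Callable


def self_identification_eval(
    ask: Callable[[str], str],
    expected_name: str,
    prompt: str = "What model are you?",
) -> bool:
    """Return True if the model's answer mentions expected_name (case-insensitive)."""
    answer = ask(prompt)
    return expected_name.lower() in answer.lower()


# Hypothetical stand-in for a real model call; no system prompt or extra context.
def fake_gpt4(prompt: str) -> str:
    return "I am powered by OpenAI's GPT-3."


# The stub claims to be GPT-3, so a check for "GPT-4" fails.
passed = self_identification_eval(fake_gpt4, "GPT-4")
print("passed" if passed else "failed")
```

Swapping `fake_gpt4` for a function that calls a real model would turn this into an actual test; substring matching is a deliberately crude grading rule.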

from evals.

ozgurozkan123 avatar ozgurozkan123 commented on July 16, 2024 1

But it's billed as GPT-4 on the account:
Screenshot 2023-03-21 at 13 50 15


jonathanagustin avatar jonathanagustin commented on July 16, 2024 1

I believe ChatGPT with model gpt-4 (within the browser) is using GPT-4 with additional context.

I also believe the Chat API (not within the browser) with model gpt-4 (like in the playground) is also using GPT-4.

In short: the output is wrong when it reports using GPT-3; it is really using some version of GPT-4.

My justification is that I am running a bunch of evals and gpt-4 outperforms gpt-3.5-turbo.

And also my experience in using ChatGPT in the browser: The GPT-4 configuration outperforms the GPT-3 configuration and outputs better (more correct) responses.


ozgurozkan123 avatar ozgurozkan123 commented on July 16, 2024 1

My only remaining concern is the training data cutoff date. As an end user, when I install or use a consumer-facing app that claims to be GPT-4 but answers as GPT-3, that is a UX problem.


Tkinfo11 avatar Tkinfo11 commented on July 16, 2024 1

You would think the platform would have been written to include the ID of the updated iteration. Not self-awareness so much as an identifier embedded in the program.


machinekoder avatar machinekoder commented on July 16, 2024

A better comparison would be the OpenAI Playground, since ChatGPT (Plus) is pre-fed some system context that we don't know.


ozgurozkan123 avatar ozgurozkan123 commented on July 16, 2024

I tried it in the OpenAI Playground as well.
Screenshot 2023-03-21 at 13 48 34


ozgurozkan123 avatar ozgurozkan123 commented on July 16, 2024

And the training data cutoff date it reports is 2020 (the same as GPT-3), not 2021.
Screenshot 2023-03-21 at 13 53 25

ChatGPT Plus says:
Screenshot 2023-03-21 at 13 52 02


jonathanagustin avatar jonathanagustin commented on July 16, 2024

@ozgurozkan123

The issue appears to be: When GPT-4 is configured, is GPT-4 used when the output says it is not GPT-4?

I think the best analogy I can make is: If I trained a parrot to bark like a dog, it does not mean that it is not a parrot.

If the configuration is set to GPT-4, GPT-4 is used even though the output states that it is not GPT-4.

For example, in your query:

ozgur@Ozgurs-MacBook-Pro ~ % curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer [REDACTED]" \
  -d '{
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "What is your model number!"}]
  }'
{"id":"chatcmpl-6wVWzn95O6iDc1Z9W1kJNIZZtZZLh","object":"chat.completion","created":1679402249,"model":"gpt-4-0314","usage":{"prompt_tokens":12,"completion_tokens":45,"total_tokens":57},"choices":[{"message":{"role":"assistant","content":"As an AI language model, I do not have a model number like a physical device or product would. I am powered by OpenAI's GPT-3, which stands for Generative Pre-trained Transformer 3."},"finish_reason":"stop","index":0}]}

the query is set to use gpt-4:

"model": "gpt-4"

and the prompt is:

"What is your model number!"

and the output is:

"I am powered by OpenAI's GPT-3, which stands for Generative Pre-trained Transformer 3."

Here, there appears to be a factual error: the output says it is powered by GPT-3, but the response metadata ("model":"gpt-4-0314") shows it is powered by GPT-4. Although the output is inaccurate, it is still GPT-4. A model can answer questions about its own identity incorrectly, but that does not change which model is being used. In the GPT-4 Technical Report, OpenAI states:

image

Here, OpenAI acknowledges that GPT-4 "is not fully reliable" and "makes reasoning errors". In this case, it made a reasoning error about what model it is.
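To make the distinction concrete, here is a small sketch in Python that separates the two sources of information in the curl response quoted above: the `model` field in the response metadata, and the generated text. The response body is copied verbatim from that call; the helper names `served_model` and `self_reported_text` are made up for illustration.

```python
import json

# Response body copied from the curl call quoted earlier in this comment.
RAW_RESPONSE = '''{"id":"chatcmpl-6wVWzn95O6iDc1Z9W1kJNIZZtZZLh","object":"chat.completion","created":1679402249,"model":"gpt-4-0314","usage":{"prompt_tokens":12,"completion_tokens":45,"total_tokens":57},"choices":[{"message":{"role":"assistant","content":"As an AI language model, I do not have a model number like a physical device or product would. I am powered by OpenAI's GPT-3, which stands for Generative Pre-trained Transformer 3."},"finish_reason":"stop","index":0}]}'''


def served_model(response: dict) -> str:
    """Model name from the response metadata, set by the API rather than generated."""
    return response["model"]


def self_reported_text(response: dict) -> str:
    """The generated answer, which may state anything about the model."""
    return response["choices"][0]["message"]["content"]


response = json.loads(RAW_RESPONSE)
print("served model:", served_model(response))
print("answer mentions GPT-4:", "GPT-4" in self_reported_text(response))
```

For this response, the metadata says `gpt-4-0314` while the generated text never mentions GPT-4. The metadata is the field to trust when checking which model handled a request.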

OpenAI does not make any guarantees or promises about accuracy. When you use ChatGPT or OpenAI's API, you agree to OpenAI's Terms of use:

By using our Services, you agree to these Terms.

In Section 7(b) of the Terms of use, the Disclaimer states that they do not warrant that their services will be accurate or error free:

(b) Disclaimer. THE SERVICES ARE PROVIDED “AS IS.” EXCEPT TO THE EXTENT PROHIBITED BY LAW, WE AND OUR AFFILIATES AND LICENSORS MAKE NO WARRANTIES (EXPRESS, IMPLIED, STATUTORY OR OTHERWISE) WITH RESPECT TO THE SERVICES, AND DISCLAIM ALL WARRANTIES INCLUDING BUT NOT LIMITED TO WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, SATISFACTORY QUALITY, NON-INFRINGEMENT, AND QUIET ENJOYMENT, AND ANY WARRANTIES ARISING OUT OF ANY COURSE OF DEALING OR TRADE USAGE. WE DO NOT WARRANT THAT THE SERVICES WILL BE UNINTERRUPTED, ACCURATE OR ERROR FREE, OR THAT ANY CONTENT WILL BE SECURE OR NOT LOST OR ALTERED.


jwang47 avatar jwang47 commented on July 16, 2024

As mentioned by @jonathanagustin, our models tend to hallucinate facts and this case was most likely an instance of hallucination. However, it may be possible (though the issue you noticed doesn't point to it) that the UI is actually wrong! If you suspect that might be the case, please contact our support team.
