emmethalm / infinitegpt Goto Github PK

InfiniteGPT is a Python script that lets you input an unlimited size text into the OpenAI API. No more tedious copy & pasting. Long live multithreading!

Python 100.00%

infinitegpt's Issues

Script allows sending 'infinite' amounts of text, but without context/memory

So looking through the script, basically, your code takes a text document, breaks it into pieces, matching tokens (And I am still trying to figure out how you get accurate token count for the openai API without using tiktoken or similar library) and sending each piece separately to the API )And with the ability to set a system message and/or prompt with each chunk) So yeah, in a way you are able to send near infinite texts to the API

But that does not change the underlying issue that the openai API models lack the conversationalbuffermemory that ChatGPT has, so each of those chunks would get treated and responded to without any inherent context or identification with any of the other chunks you send, so even if you would get replies, none of them would make any sense for the whole, as none of those would actually ever understand the full text sent, only the individual chunk sent in that call to it.

I mean, I guess i see the benefit of this script if you want to hit the rate limits of your gpt-4 API access, but other than that? Could you provide a sensible use-case for this script please? Also, for the sake of transparency and openness, I think you should mention the detail that the API can't connect the chunks and will only respond on an individual basis, as there seems to be some confusion regarding the true capabilities of your script.

not work

openai.error.InvalidRequestError: This model's maximum context length is 4097 tokens. However, you requested 4972 tokens (3222 in the messages, 1750 in the completion). Please reduce the length of the messages or completion.

split by paragraph function

Hello

I think it would improve the quality of the chatGPT output if the content is split by paragraphs and not by words. It will prevent the cutting off in the middle of a sentence and thus losing some meaning in the beginning and ending of each chunk

The chunk may not be working as intended.

The following error was returned:
"InvalidRequestError: This model's maximum context length is 4097 tokens. However, your messages resulted in 9203 tokens. Please reduce the length of the messages."

`infiniteGpt` should be converted to be a pip repository (and `ReadMe.md` instructions adjusted accordingly)

Depends on: #10

Acceptance Criteria

Using the setuptools library, add setup.py
Add dependencies as install_requires
Update the instructions in ReadMe.md to guide installation and usage of InfiniteGpt as a pip installable class
(Optional) Publish to PyPI

`blastoff.py` should be converted to a class `InfiniteGpt`

Acceptance Criteria

__main__ should be re-written as run(...) that consumes an input and output file
configure(...) should be written to consume
All other functions (helpers) should be given underscore notation (ex: _split_into_chunks(...)) as syntactic sugar to denote that they are private
All functions should be wrapped in an InfiniteGpt class

Is that it?

This is what you claimed:

This is not what this repo does.

This repo barely does anything - it just sends the data in chunks, it's barely a "hello world" for the api..
GPT still loses the context in the exact same way, I have no idea what the utility of this is, the "README.md" seems to have more effort put into it than the code.

I wouldn't come to knock down a random small repo, but this has 250(!) stars and was featured in the gpt superpower newsletter.
I honestly don't get it.

Give a brief overview of how that actually works

It's unclear how you achieve that and looking at other issues it seems there might be a discrepancy on expectations.

A brief overview of what the script actually does would help narrow down what use cases it serves

`InfiniteGpt` should read from environment variables or take argument overrides when instantiated

Dependency on: #9

Acceptance Criteria

The OpenApi key should be read an environment variable OPEN_API_KEY (default behavior)
Users should be able to pass in their OpenApi Key as an argument override
The __init(...) function of InfiniteGpt should take the same arguments (optionally) as configure(...) and should call configure itself if provided

Practical usage example

To me its unclear how this fixes the problem of max prompt length. Let's say I have the following prompt:

=== PROMPT-start ===
I will provide you with the full text of a unrealeased book called "my unreleased book". Please 
create a short summary of the whole book in 200 words:
Lore ipsum....
...
...
...
(full book which has 100'000 tokens) 
=== PROMPT-end ===

In that case this script will get a response from chatGPT every 1500 tokens and screw up my prompt right?

Or maybe I did not understood the benefit of this script? Can you name one practical example how to use this script?

emmethalm / infinitegpt Goto Github PK

infinitegpt's People

Contributors

Stargazers

Watchers

Forkers

infinitegpt's Issues

Depends on: #10

Acceptance Criteria

Acceptance Criteria

Dependency on: #9

Acceptance Criteria

Recommend Projects

Recommend Topics

Recommend Org