
Comments (8)

hatkyinc2 commented on August 27, 2024

@fjrdomingues
In place of a multi-layer summary, we could just split the summaries into chunks up to a max size, loop over the chunks with the prompt to ask the questions, and accept an empty answer as an option

something like

const MAX_CHUNK_LENGTH = 1000;

const summaryChunks = [];
let currentChunk = [];
let currentChunkLength = 0;

for (const summary of summaries) {
  const summaryLength = summary.length;
  if (summaryLength > MAX_CHUNK_LENGTH) {
    throw new Error('single summary is too big');
  }
  // flush the current chunk before it would overflow
  if (currentChunkLength + summaryLength > MAX_CHUNK_LENGTH) {
    summaryChunks.push(currentChunk);
    currentChunk = [];
    currentChunkLength = 0;
  }
  currentChunk.push(summary);
  currentChunkLength += summaryLength;
}
// push the last (possibly partial) chunk
if (currentChunk.length > 0) summaryChunks.push(currentChunk);

summaryChunks would then hold the summaries split up; each chunk might be a single file, or all the files in one chunk as it is now.
Then loop over summaryChunks and ask almost the same prompt as the current one, as sketched below.
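
A minimal sketch of that loop. buildPrompt(task, text) and askModel(prompt) are hypothetical placeholders, not functions from the autopilot codebase:

  // buildPrompt and askModel are placeholders, not functions from
  // the autopilot codebase.
  async function askPerChunk(task, summaryChunks) {
    const answers = [];
    for (const chunk of summaryChunks) {
      const answer = await askModel(buildPrompt(task, chunk.join('\n')));
      if (answer.trim() !== '') answers.push(answer); // the "empty option"
    }
    return answers;
  }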

Even if you want multi-layer summaries in the end, you might use the summary chunks to build them. Otherwise you might run into the same max-size issues, even if you start dividing the tree.


hatkyinc2 commented on August 27, 2024

Yeah, I don't think we have this figured out yet.
The length, I think, was chosen arbitrarily in https://github.com/fjrdomingues/autopilot/blob/main/ui.js#L84, I suppose because 4,096 tokens is the GPT-3.5 limit.
You can play with increasing the number in the code.

I think the summaries are too long. I'm not sure they are the best approach, but it works for small projects.
You can play with the prompt here to try to get it to make shorter ones:
https://github.com/fjrdomingues/autopilot/blob/main/createSummaryOfFiles.js#L61

Task: Create a summary of this file, what it does and how it contributes to the overall project.

LangChain uses some kind of tokenizing and a special DB (a vector store) to try to match up such things, I think, but IDK all the details, and I'm not sure exactly how we should use it.
https://python.langchain.com/en/latest/modules/indexes/vectorstores/getting_started.html


fjrdomingues commented on August 27, 2024

"chosen arbitrarily" - ouch ๐Ÿ˜… but ya, true.

Possible solutions (can be used together):

  • Use gpt-4, which doubles the context window
  • Change the prompt to create smaller summaries - I had a new prompt, but it seems to have gotten lost somewhere! I'll check the git history
  • Create summaries for folders (sketched after this list) - I think this will be necessary, and it's the most logical option for me. On big projects, reading the summaries of all files at once doesn't make much sense to me. Projects are already organized into folders for similar reasons.
  • Embeddings and vector DBs
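
A minimal sketch of the folder-summaries idea from the list above; the groupByFolder name and the { filePath, summary } shape are assumptions, not names from the codebase. It buckets per-file summaries by directory, and each bucket would then be summarized with one more model call:

  // Hypothetical grouping step for folder-level summaries: bucket the
  // per-file summaries by directory; each bucket would then be
  // summarized with one more model call (not shown).
  const path = require('path');

  function groupByFolder(fileSummaries) {
    const byFolder = new Map();
    for (const { filePath, summary } of fileSummaries) {
      const folder = path.dirname(filePath);
      if (!byFolder.has(folder)) byFolder.set(folder, []);
      byFolder.get(folder).push(summary);
    }
    return byFolder; // Map: folder path -> array of file summaries
  }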

Vector stores would help reduce the problem. The process would be something like this:

  • Read each file and create a summary (API cost)
  • Create an embedding for each summary using an API (API cost)
  • Split into chunks
  • Store the chunks in a DB with a reference to the file
  • Then, when running the UI, search the DB for a semantic relationship between the task and the summaries. Less smart than asking GPT directly, but it may work. The DB will return text (and maybe we can also return the array of relevant files)
  • Continue the program

Vector stores: sounds like something we should totally do, but I see it more as an optimization today (literally today).
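
A minimal sketch of that search step, assuming Node 18+ (global fetch), an OPENAI_API_KEY env var, and the text-embedding-ada-002 model. It skips a real vector DB and just ranks chunks in memory with cosine similarity, so it illustrates the idea rather than autopilot's implementation:

  // Embed a piece of text via the OpenAI embeddings API.
  async function embed(text) {
    const res = await fetch('https://api.openai.com/v1/embeddings', {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
      },
      body: JSON.stringify({ model: 'text-embedding-ada-002', input: text }),
    });
    const json = await res.json();
    return json.data[0].embedding;
  }

  // Cosine similarity between two embedding vectors.
  function cosine(a, b) {
    let dot = 0, na = 0, nb = 0;
    for (let i = 0; i < a.length; i++) {
      dot += a[i] * b[i];
      na += a[i] * a[i];
      nb += b[i] * b[i];
    }
    return dot / (Math.sqrt(na) * Math.sqrt(nb));
  }

  // chunks: [{ filePath, text }] (shape assumed). Returns the chunks
  // most semantically related to the task, and therefore the files.
  async function relevantChunks(task, chunks, topK = 5) {
    const taskVec = await embed(task);
    const scored = [];
    for (const chunk of chunks) {
      scored.push({ ...chunk, score: cosine(taskVec, await embed(chunk.text)) });
    }
    return scored.sort((a, b) => b.score - a.score).slice(0, topK);
  }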


Charuru commented on August 27, 2024

Consider using compression: https://twitter.com/mckaywrigley/status/1643592353817694218


hatkyinc2 commented on August 27, 2024

@fjrdomingues btw, when I say "chosen arbitrarily" I mean that it's not a formula but a statically chosen number, and I assumed the number wasn't tested.
So:

  if (countTokens(summaries.toString()) > 3000) {

This doesn't take the model into account and doesn't dictate anything to the AI.
The cost table could probably be extended with a 'max tokens' column, and then here it would be something like:

  desirableTokensForAnswer = process.env.desirableTokensForAnswer
  maxSummaryTokens = maxModelToken(model) - desirableTokensForAnswer
---
coder agent
  coder prompt: "you can use a maximum of {desirableTokensForAnswer} tokens in your answer"
---
.env.template
  desirableTokensForAnswer=1000


And I'd still call desirableTokensForAnswer=1000 arbitrary until it passes some kind of testing proof that it's a good number for most cases.

But maxSummaryTokens would then be a calculated number, not a static one.
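
A minimal concrete version of that calculation; the per-model limits below and the lookup-table shape of maxModelToken are assumptions, standing in for an extended cost table:

  // Assumed per-model limits; in practice the cost table would be
  // extended with a 'max tokens' column instead of this hardcoded map.
  const MAX_MODEL_TOKENS = {
    'gpt-3.5-turbo': 4096,
    'gpt-4': 8192,
  };

  function maxSummaryTokens(model) {
    const desirable = Number(process.env.desirableTokensForAnswer || 1000);
    const limit = MAX_MODEL_TOKENS[model];
    if (!limit) throw new Error(`unknown model: ${model}`);
    return limit - desirable;
  }

  // The check in ui.js would then become something like:
  // if (countTokens(summaries.toString()) > maxSummaryTokens(model)) { ... }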

Also, semi-related: loading summaries should probably also have include/exclude lists, especially if we do something like separate testing vs. coding agents.
So each agent can load a different list, but all of them stay indexed; see the sketch below.
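
A minimal sketch of that per-agent filtering, with assumed regex-based lists and an assumed { filePath, summary } shape for indexed entries:

  // Every file stays indexed once; each agent filters the shared
  // index with its own include/exclude lists (patterns assumed here).
  const agentFilters = {
    coder:  { include: [/^src\//],            exclude: [/\.test\.js$/] },
    tester: { include: [/^src\//, /^test\//], exclude: [] },
  };

  function summariesForAgent(agent, indexedSummaries) {
    const { include, exclude } = agentFilters[agent];
    return indexedSummaries.filter(({ filePath }) =>
      include.some((re) => re.test(filePath)) &&
      !exclude.some((re) => re.test(filePath)));
  }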

But generally, today, if someone indexes a lot of files, include/exclude lists won't ignore the previously generated files, so they basically have to delete and recreate the index. That's not a good user experience and hard to explain in support.


hatkyinc2 commented on August 27, 2024

I spun off the chunks idea to #73.
@fjrdomingues, do we need this? It might be a long-term idea, but IDK if we want a big backlog at this point...
Generally, this would only be needed for huge codebases, and it might be solved over time by other things, like vector stores, compression, better/bigger models, or a different approach to file selection... I feel that by the time we get back to this, we won't need it.


hatkyinc2 commented on August 27, 2024

@Charuru if you want to try the edge branch, there should be no more summary limit:
https://github.com/fjrdomingues/autopilot/tree/edge


hatkyinc2 commented on August 27, 2024

While we might do some kind of multi-layer summary in the future, a short-term solution has been put in place with chunking.
Down the line, a different approach might be taken for scope reduction, such as fetching from a vector store or otherwise matching a subset of the info. (I have tested a different concept that we might use, but I'm not in the mind to do a write-up here at this time; I'd rather test it further first.)
Given the above, and in an attempt to keep the state of the issue a bit clean, I'm closing this issue.

