Comments (2)
Yup, I would recommend defining your own Summarization Model with a custom prompt, something like the following, for example:
from openai import OpenAI
from raptor import BaseSummarizationModel, RetrievalAugmentation, RetrievalAugmentationConfig
from tenacity import retry, stop_after_attempt, wait_random_exponential


class LegalGPT3TurboSummarizationModel(BaseSummarizationModel):
    def __init__(self, model="gpt-3.5-turbo"):
        self.model = model

    # Retry with random exponential backoff, giving up after 6 attempts.
    @retry(wait=wait_random_exponential(min=1, max=20), stop=stop_after_attempt(6))
    def summarize(self, context, max_tokens=500, stop_sequence=None):
        try:
            client = OpenAI()  # reads OPENAI_API_KEY from the environment
            response = client.chat.completions.create(
                model=self.model,
                messages=[
                    {"role": "system", "content": "You are a helpful legal research assistant."},
                    {
                        "role": "user",
                        "content": f"Write a summary of the following legal text, making sure to include all relevant section numbers and legislation names in the summary: {context}",
                    },
                ],
                max_tokens=max_tokens,
            )
            return response.choices[0].message.content
        except Exception as e:
            print(e)
            # Re-raise so @retry can see the failure; returning the exception
            # would disable the retries and store it as if it were a summary.
            raise


RAC = RetrievalAugmentationConfig(summarization_model=LegalGPT3TurboSummarizationModel())
RA = RetrievalAugmentation(config=RAC)
RA.add_documents(text)  # `text` is your legal document as a single string
You can try experimenting with different content and system prompts, or even try stronger models like GPT-4, Claude, Gemini, or Mistral, which may have better performance on legal texts. If you have a sample dataset of legal questions and answers, you could also use DSPy (https://github.com/stanfordnlp/dspy) to automatically optimize the prompt for your specific use case.
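If it helps, here is a minimal sketch of one way to keep several prompt variants side by side while experimenting. Everything in it (`PromptVariant`, `VARIANTS`, and the wording of the second prompt) is hypothetical and not part of RAPTOR; it only builds the `messages` list, which you could then pass to `client.chat.completions.create` inside your own `summarize` method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptVariant:
    """One system/user prompt pair to try (hypothetical helper, not a RAPTOR class)."""

    name: str
    system: str
    user_template: str  # must contain a {context} placeholder

    def build_messages(self, context):
        # Produce the chat-style message list for a given chunk of text.
        return [
            {"role": "system", "content": self.system},
            {"role": "user", "content": self.user_template.format(context=context)},
        ]


# The first variant mirrors the prompt from the comment above; the second is
# an illustrative alternative, not a recommendation.
VARIANTS = [
    PromptVariant(
        name="baseline",
        system="You are a helpful legal research assistant.",
        user_template=(
            "Write a summary of the following legal text, making sure to include "
            "all relevant section numbers and legislation names in the summary: {context}"
        ),
    ),
    PromptVariant(
        name="citations-first",
        system="You are a careful legal analyst.",
        user_template=(
            "List every section number and legislation name cited in the text "
            "below, then summarize the text: {context}"
        ),
    ),
]

if __name__ == "__main__":
    sample = "Section 12 of the Example Act requires written notice."
    for variant in VARIANTS:
        print(variant.name, variant.build_messages(sample)[1]["content"][:60])
```

You could pass a chosen variant into your summarization model's constructor and build the tree once per variant to compare the resulting summaries.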
I am closing this issue for now. If you have any more questions or have other issues, please feel free to reopen it.