
langchain's Issues

pip install requirements.txt fails on conda on mac

I get this error when I run pip install -r requirements.txt

ERROR: Could not find a version that satisfies the requirement faiss (from versions: none)
ERROR: No matching distribution found for faiss

As per this issue, requirements.txt should be updated to faiss-cpu.
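A minimal fix, assuming the CPU-only wheel is what is wanted here (the error above suggests no distribution named plain `faiss` exists on PyPI), is to change the entry in requirements.txt:

```diff
-faiss
+faiss-cpu
```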

SelfAskWithSearchChain error when followup required

I have just updated to version 0.0.7

1- When running a simple question like: "What is the capital of Idaho?" , the result is OK

2- When running a question like: "What is the hometown of the reigning men's U.S. Open champion?"

I got the following error:
What is the hometown of the reigning men's U.S. Open champion?
Are follow up questions needed here: Yes.
Follow up: Who is the reigning men's U.S. Open champion?
(.......)

File ~/anaconda3/envs/nlp/lib/python3.10/site-packages/langchain/chains/self_ask_with_search/base.py:166, in SelfAskWithSearchChain.run(self, question)
152 def run(self, question: str) -> str:
153 """Run self ask with search chain.
154
155 Args:
(...)
...
--> 107 elif "snippet" in res["organic_results"][0].keys():
108 toret = res["organic_results"][0]["snippet"]
109 else:
KeyError: 'organic_results'
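The crash comes from indexing `res["organic_results"]` without first checking the key exists, which happens when the search API returns a different result shape for the follow-up question. A defensive sketch (not the library's actual code; the key names mirror the traceback and are otherwise assumptions) could look like:

```python
def parse_search_result(res: dict) -> str:
    """Hypothetical defensive parser for a SerpAPI-style response dict.

    Falls back gracefully instead of raising KeyError when
    'organic_results' is absent (e.g. when the response carries an
    answer box instead of organic results).
    """
    if "answer_box" in res and "answer" in res["answer_box"]:
        return res["answer_box"]["answer"]
    organic = res.get("organic_results") or []
    if organic and "snippet" in organic[0]:
        return organic[0]["snippet"]
    return "No good search result found"
```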

Combining Conversation Chain and Agents

Is there a current API for hooking a conversational chain with an agent so, during the conversation, actions can be performed (e.g. db lookup etc)?

If not, how would you see this working architecturally?
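One generic architecture for this (plain Python with no langchain types; every name here is hypothetical) is a dispatch loop: each model reply is inspected for an action request, the matching tool is called, and its observation is appended to the transcript before asking the model again.

```python
def converse(model, tools: dict, user_message: str, history: list) -> str:
    """Hypothetical dispatch loop: `model` maps a transcript to a reply,
    and replies of the form 'ACTION:<tool>:<input>' trigger a tool call
    whose result is fed back into the conversation before re-asking."""
    history.append(f"Human: {user_message}")
    while True:
        reply = model(history)
        if reply.startswith("ACTION:"):
            _, tool_name, tool_input = reply.split(":", 2)
            observation = tools[tool_name](tool_input)
            history.append(f"Observation: {observation}")
        else:
            history.append(f"AI: {reply}")
            return reply
```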

Support HuggingFaceHub embeddings endpoint

The current implementation of HuggingFaceEmbeddings uses local sentence-transformers models to derive the encodings. This can be limiting, as it requires a fairly capable machine to download the model, load it, and run inference.

An alternative is to support embeddings derived directly via the HuggingFaceHub inference API. See this blog post for details. This implementation would set similar expectations as the Cohere and OpenAI embeddings APIs.
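A rough sketch of what such a client might look like, assuming the Hub's hosted feature-extraction pipeline endpoint (the URL path, token placeholder, and model name below are illustrative assumptions, not the proposed implementation):

```python
import json
import urllib.request

def hub_embed(texts, model="sentence-transformers/all-MiniLM-L6-v2",
              token="YOUR_HF_TOKEN"):
    """Hypothetical client for the Hub's hosted feature-extraction
    pipeline; returns one embedding per input text without downloading
    or loading any model locally."""
    url = f"https://api-inference.huggingface.co/pipeline/feature-extraction/{model}"
    payload = json.dumps({"inputs": texts,
                          "options": {"wait_for_model": True}}).encode()
    req = urllib.request.Request(
        url, data=payload,
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

def cosine(u, v):
    # Small helper for comparing the returned vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm = lambda x: sum(a * a for a in x) ** 0.5
    return dot / (norm(u) * norm(v))
```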

SQLDatabaseChain expects only select statements

Running some CRUD-like statements through an agent throws

ResourceClosedError: This result object does not return rows. It has been closed automatically.

From the implementation it appears that it always expects to see rows, which are then cast to str and returned as part of the chain. What would the impact of modifying this behaviour be on the expected use case for the SQL chain as it is?

https://github.com/hwchase17/langchain/blob/261029cef3e7c30277027f5d5283b87197eab520/langchain/sql_database.py#L70-L71
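One way to tolerate non-SELECT statements is to check whether the statement actually produced a result set before fetching. A sketch using sqlite3 for a self-contained illustration (the linked code uses SQLAlchemy, where the analogous check would be the result's `returns_rows` attribute; the function name here is hypothetical):

```python
import sqlite3

def run_sql(conn: sqlite3.Connection, command: str) -> str:
    """Hypothetical variant of the chain's SQL runner that tolerates
    CRUD statements: return fetched rows when the statement produces
    them, otherwise report the affected row count."""
    cursor = conn.execute(command)
    if cursor.description is None:  # no result set: INSERT/UPDATE/DDL
        conn.commit()
        return f"OK, {cursor.rowcount} row(s) affected."
    return str(cursor.fetchall())
```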

Support Codex embeddings

The current implementation of OpenAI embeddings is hard-coded to return only text embeddings via GPT-3. For example,

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
...
        responses = [
            self._embedding_func(text, engine=f"text-search-{self.model_name}-doc-001")
            for text in texts
        ]
    def embed_query(self, text: str) -> List[float]:
...
        embedding = self._embedding_func(
            text, engine=f"text-search-{self.model_name}-query-001"
        )

However, recent literature on reasoning shows CODEX to be more powerful on reasoning tasks than GPT-3. OpenAIEmbeddings should be modified to support both text and code embeddings.
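A minimal sketch of how the engine name could be parameterized, assuming OpenAI's code-search naming convention (`code-search-{model}-code-001` for documents and `code-search-{model}-text-001` for natural-language queries); the helper name is hypothetical:

```python
def search_engine_name(model_name: str, mode: str, kind: str) -> str:
    """Hypothetical helper for choosing an embedding engine.

    mode: "text" (current behaviour) or "code".
    kind: "doc" or "query".
    For code search, documents are code and queries are text, so the
    suffix differs between the two.
    """
    if mode == "text":
        return f"text-search-{model_name}-{kind}-001"
    suffix = "code" if kind == "doc" else "text"
    return f"code-search-{model_name}-{suffix}-001"
```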

Prompts for harrison/combine_documents_chain

Improved prompts for harrison/combine_documents_chain

"""QUESTION_PROMPT is the prompt used in phase 1 where we run the LLM on each chunk of the doc."""

question_prompt_template = """Use the following portion of a legal contract to see if any of the text is relevant to answer the question. 
Return any relevant text verbatim.
{context}
Question: {question}
Relevant text, if any:"""

QUESTION_PROMPT = PromptTemplate(
    template=question_prompt_template, input_variables=["context", "question"]
)



"""  """

combine_prompt_template = """Given the following extracted parts of a contract and a question, create a final answer with references ("SOURCES"). 
If you don't know the answer, just say that you don't know. Don't try to make up an answer.
ALWAYS return a "SOURCES" part in your answer.
QUESTION: Which state/country's law governs the interpretation of the contract?
=========
Content: This Agreement is governed by English law and the parties submit to the exclusive jurisdiction of the English courts in  relation to any dispute (contractual or non-contractual) concerning this Agreement save that either party may apply to any court for an  injunction or other relief to protect its Intellectual Property Rights.
Source: 28-pl
Content: No Waiver. Failure or delay in exercising any right or remedy under this Agreement shall not constitute a waiver of such (or any other)  right or remedy.\n\n11.7 Severability. The invalidity, illegality or unenforceability of any term (or part of a term) of this Agreement shall not affect the continuation  in force of the remainder of the term (if any) and this Agreement.\n\n11.8 No Agency. Except as expressly stated otherwise, nothing in this Agreement shall create an agency, partnership or joint venture of any  kind between the parties.\n\n11.9 No Third-Party Beneficiaries.
Source: 30-pl
Content: (b) if Google believes, in good faith, that the Distributor has violated or caused Google to violate any Anti-Bribery Laws (as  defined in Clause 8.5) or that such a violation is reasonably likely to occur,
Source: 4-pl
=========
FINAL ANSWER: This Agreement is governed by English law.
SOURCES: 28-pl
QUESTION: What did the president say about Michael Jackson?
=========
Content: Madam Speaker, Madam Vice President, our First Lady and Second Gentleman. Members of Congress and the Cabinet. Justices of the Supreme Court. My fellow Americans.  \n\nLast year COVID-19 kept us apart. This year we are finally together again. \n\nTonight, we meet as Democrats Republicans and Independents. But most importantly as Americans. \n\nWith a duty to one another to the American people to the Constitution. \n\nAnd with an unwavering resolve that freedom will always triumph over tyranny. \n\nSix days ago, Russia's Vladimir Putin sought to shake the foundations of the free world thinking he could make it bend to his menacing ways. But he badly miscalculated. \n\nHe thought he could roll into Ukraine and the world would roll over. Instead he met a wall of strength he never imagined. \n\nHe met the Ukrainian people. \n\nFrom President Zelenskyy to every Ukrainian, their fearlessness, their courage, their determination, inspires the world. \n\nGroups of citizens blocking tanks with their bodies. Everyone from students to retirees teachers turned soldiers defending their homeland.
Source: 0-pl
Content: And we won't stop. \n\nWe have lost so much to COVID-19. Time with one another. And worst of all, so much loss of life. \n\nLet's use this moment to reset. Let's stop looking at COVID-19 as a partisan dividing line and see it for what it is: A God-awful disease.  \n\nLet's stop seeing each other as enemies, and start seeing each other for who we really are: Fellow Americans.  \n\nWe can't change how divided we've been. But we can change how we move forward—on COVID-19 and other issues we must face together. \n\nI recently visited the New York City Police Department days after the funerals of Officer Wilbert Mora and his partner, Officer Jason Rivera. \n\nThey were responding to a 9-1-1 call when a man shot and killed them with a stolen gun. \n\nOfficer Mora was 27 years old. \n\nOfficer Rivera was 22. \n\nBoth Dominican Americans who'd grown up on the same streets they later chose to patrol as police officers. \n\nI spoke with their families and told them that we are forever in debt for their sacrifice, and we will carry on their mission to restore the trust and safety every community deserves.
Source: 24-pl
Content: And a proud Ukrainian people, who have known 30 years  of independence, have repeatedly shown that they will not tolerate anyone who tries to take their country backwards.  \n\nTo all Americans, I will be honest with you, as I've always promised. A Russian dictator, invading a foreign country, has costs around the world. \n\nAnd I'm taking robust action to make sure the pain of our sanctions  is targeted at Russia's economy. And I will use every tool at our disposal to protect American businesses and consumers. \n\nTonight, I can announce that the United States has worked with 30 other countries to release 60 Million barrels of oil from reserves around the world.  \n\nAmerica will lead that effort, releasing 30 Million barrels from our own Strategic Petroleum Reserve. And we stand ready to do more if necessary, unified with our allies.  \n\nThese steps will help blunt gas prices here at home. And I know the news about what's happening can seem alarming. \n\nBut I want you to know that we are going to be okay.
Source: 5-pl
Content: More support for patients and families. \n\nTo get there, I call on Congress to fund ARPA-H, the Advanced Research Projects Agency for Health. \n\nIt's based on DARPA—the Defense Department project that led to the Internet, GPS, and so much more.  \n\nARPA-H will have a singular purpose—to drive breakthroughs in cancer, Alzheimer's, diabetes, and more. \n\nA unity agenda for the nation. \n\nWe can do this. \n\nMy fellow Americans—tonight, we have gathered in a sacred space—the citadel of our democracy. \n\nIn this Capitol, generation after generation, Americans have debated great questions amid great strife, and have done great things. \n\nWe have fought for freedom, expanded liberty, defeated totalitarianism and terror. \n\nAnd built the strongest, freest, and most prosperous nation the world has ever known. \n\nNow is the hour. \n\nOur moment of responsibility. \n\nOur test of resolve and conscience, of history itself. \n\nIt is in this moment that our character is formed. Our purpose is found. Our future is forged. \n\nWell I know this nation.
Source: 34-pl
=========
FINAL ANSWER: The president did not mention Michael Jackson.
SOURCES:
QUESTION: {question}
=========
{summaries}
=========
FINAL ANSWER:"""

COMBINE_PROMPT = PromptTemplate(
    template=combine_prompt_template, input_variables=["summaries", "question"]
)



"""COMBINE_DOCUMENT_PROMPT is fed into CombineDocumentsChain() at langchain/chains/combine_documents.py"""
COMBINE_DOCUMENT_PROMPT = PromptTemplate(
    template="Content: {page_content}\nSource: {source}",
    input_variables=["page_content", "source"],
)

SequentialChain that runs next chain on each output of the prior chain?

Given:

  • Chain A generates N outputs, e.g. we get back text and immediately split it into a list using post-processing we control and expect.
  • Chain B should then run over each of those outputs from Chain A.

Is there any way to do this elegantly in langchain? Perhaps some way to provide an output formatter to a chain, or some for_each pre- / post-processor? Or does this seem like just two independent chains with processing in between?
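The "two independent chains with processing in between" option can be expressed as a small glue function (plain Python, no langchain types; all names are hypothetical): run chain A once, split its output with a caller-supplied post-processor, then map chain B over the pieces.

```python
def map_chain(chain_a, split, chain_b, chain_a_input):
    """Hypothetical glue: run chain_a once, split its text output into
    pieces with a caller-supplied post-processor, then run chain_b on
    each piece and collect the results."""
    text = chain_a(chain_a_input)
    return [chain_b(piece) for piece in split(text)]
```

For example, with `chain_a` producing a newline-separated outline and `chain_b` expanding each section, `split` would simply be `str.splitlines`.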

google/flan-t5-xxl and google/flan-t5-xl don't seem to work with HuggingFaceHub

First, when I load them I get a warning:

hf = HuggingFaceHub(repo_id="google/flan-t5-xl")
You're using a different task than the one specified in the repository. Be sure to know what you're doing :)

Then, when I use it in inference, I get gibberish.

hf("The capital of New York is")
'ew York is The capital of the world New York is The capital of the world New York is The capital of the world New York is The capital of the world New York is The capital of the world New York is The capital of the world New York is The capital of the world New York is The capital of the world New York is The capital of the world New York is The capital of the world New York is The capital of the world New York is The capital of the world New York is The capital of the world New York is The capital of the world New York is The capital of the world New York is The capital of the world New York is The capital of the world New York is The capital of the world New York is The capital of the world New York is The capital of the world New York is The capital of the world New York is The capital of the world New York is The capital of the world New York is The capital of the world New York is'

If I run the API via requests, I get the expected answer

import requests

API_URL = "https://api-inference.huggingface.co/models/google/flan-t5-xl"
headers = {"Authorization": "Bearer api_org_xxxxxxxxxxxxxxxxxxxxxxxxxxx"}

def query(payload):
	response = requests.post(API_URL, headers=headers, json=payload)
	return response.json()
	
output = query({
	"inputs": "The capital of New York is",
})
print(output)
[{'generated_text': 'Albany'}]

Any suggestions?

Unicode error on Windows

File "C:\ProgramData\Anaconda3\envs\LangChain\lib\encodings\cp1252.py", line 23, in decode
          return codecs.charmap_decode(input,self.errors,decoding_table)[0]
      UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 8: character maps to <undefined>

more consistent printing of intermediate steps

Right now, some chains print out intermediate steps and some don't. Let's standardize it so that they all have the same flag to turn it on/off, and things are printed out in a standard way. Ideally colorized.
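A minimal sketch of what a shared helper might look like (the name, flag, and ANSI color choice are all illustrative assumptions): every chain routes its intermediate output through one function, so a single flag controls all of them and the formatting stays uniform.

```python
GREEN, RESET = "\033[92m", "\033[0m"

def print_step(text: str, verbose: bool, color: str = GREEN) -> str:
    """Hypothetical shared helper for intermediate steps: wraps the
    text in an ANSI color code and prints it only when verbose is on.
    Returns the formatted line so callers/tests can inspect it."""
    line = f"{color}{text}{RESET}"
    if verbose:
        print(line)
    return line
```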

make integration_tests fails currently

pytest tests/integration_tests
============================= test session starts ==============================
platform darwin -- Python 3.9.12, pytest-7.1.1, pluggy-1.0.0
rootdir: /Users/delip/workspace/langchain
plugins: anyio-3.5.0, dotenv-0.5.2
collected 14 items / 1 error

==================================== ERRORS ====================================
____________ ERROR collecting tests/integration_tests/test_faiss.py ____________
ImportError while importing test module '/Users/delip/workspace/langchain/tests/integration_tests/test_faiss.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
../../opt/anaconda3/lib/python3.9/importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
tests/integration_tests/test_faiss.py:9: in <module>
    from langchain.faiss import FAISS
E   ModuleNotFoundError: No module named 'langchain.faiss'

Allow different LLM objects for each PromptTemplate in SequentialChain

I would like to use SequentialChain with the option to use a different LLM class at each step. The rationale behind this is that I am using different temperature settings for different prompts within my chain. I also potentially may use different models for each step in the future.

A rough idea for configuration: have a JSON dict specifying the LLM config, and pass in a list of configs (or a list of LLM objects) the same length as the number of PromptTemplates in the chain when you want a different object per step, or a single LLM/config object when you want to use the same one throughout.
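The list-of-pairs variant can be sketched in a few lines (plain Python, no langchain types; the function name and step shape are hypothetical): each step carries its own LLM alongside its prompt template, so temperature or model can vary per step.

```python
def run_sequential(steps, text: str) -> str:
    """Hypothetical SequentialChain variant: `steps` is a list of
    (llm, prompt_template) pairs, where each llm is a callable from
    prompt string to completion string. The output of each step is
    fed into the next step's template as {input}."""
    for llm, prompt_template in steps:
        text = llm(prompt_template.format(input=text))
    return text
```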

improve error messages for missing keys in pydantic classes

Currently it is as below, which is way too ugly:

    335 Create a new model by parsing and validating input data from keyword arguments.
    336 
    337 Raises ValidationError if the input data cannot be parsed to form a valid model.
    338 """
    339 # Uses something other than `self` the first arg to allow "self" as a settable attribute
--> 340 values, fields_set, validation_error = validate_model(__pydantic_self__.__class__, data)
    341 if validation_error:
    342     raise validation_error
...
---> 53     input_variables = values["input_variables"]
     54     template = values["template"]
     55     template_format = values["template_format"]

KeyError: 'input_variables'
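One way to surface a friendlier message is to check the incoming dict for required fields before pydantic's own validators dig into it. A plain-Python sketch (the function name and exact field list are illustrative assumptions, not the library's validator):

```python
def check_required_keys(values: dict,
                        required=("input_variables", "template")) -> dict:
    """Hypothetical pre-validation step: report every missing field by
    name instead of surfacing a bare KeyError from deep inside the
    pydantic machinery."""
    missing = [key for key in required if key not in values]
    if missing:
        raise ValueError(
            f"PromptTemplate is missing required field(s): "
            f"{', '.join(missing)}. Got fields: {sorted(values)}"
        )
    return values
```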

Bibtex Citation

Do you have a bibtex citation for this repo?

e.g. something like the following (from https://github.com/bigscience-workshop/promptsource)

@misc{bach2022promptsource,
      title={PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts},
      author={Stephen H. Bach and Victor Sanh and Zheng-Xin Yong and Albert Webson and Colin Raffel and Nihal V. Nayak and Abheesht Sharma and Taewoon Kim and M Saiful Bari and Thibault Fevry and Zaid Alyafeai and Manan Dey and Andrea Santilli and Zhiqing Sun and Srulik Ben-David and Canwen Xu and Gunjan Chhablani and Han Wang and Jason Alan Fries and Maged S. Al-shaibani and Shanya Sharma and Urmish Thakker and Khalid Almubarak and Xiangru Tang and Xiangru Tang and Mike Tian-Jian Jiang and Alexander M. Rush},
      year={2022},
      eprint={2202.01279},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

Serialize BasePromptTemplate to json rather than a file

I want to save prompt templates in a JSONField in a Django DB. The current save() method on BasePromptTemplate outputs to a file rather than a JSON object. I'd prefer to have a method that loads the prompt to and from JSON, and I decide what to do with the JSON myself.
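A minimal sketch of the requested round-trip (standalone functions over plain dicts, not methods on BasePromptTemplate; the field names mirror what a prompt template would need to carry and are otherwise assumptions):

```python
import json

def prompt_to_json(template: str, input_variables: list) -> str:
    """Hypothetical serializer: capture the fields the prompt template
    would write to disk as a JSON string instead, leaving storage
    (e.g. a Django JSONField) up to the caller."""
    return json.dumps({"template": template,
                       "input_variables": input_variables})

def prompt_from_json(blob: str) -> dict:
    """Inverse of prompt_to_json: recover the prompt fields."""
    data = json.loads(blob)
    return {"template": data["template"],
            "input_variables": data["input_variables"]}
```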

Cannot import `PromptTemplate`

Hello,

When running the "Map Reduce" demo, I see the error below:

ImportError                               Traceback (most recent call last)
Cell In [3], line 1
----> 1 from langchain import OpenAI, PromptTemplate, LLMChain
      2 from langchain.text_splitter import CharacterTextSplitter
      3 from langchain.chains.mapreduce import MapReduceChain

ImportError: cannot import name 'PromptTemplate' from 'langchain' (/Users/mteoh/projects/langchain_sandbox/.env/lib/python3.9/site-packages/langchain/__init__.py)

I see it defined here: https://github.com/hwchase17/langchain/blob/c02eb199b6587aeeb50fbb083693572bd2f030cc/langchain/prompts/prompt.py#L13

And mentioned here:
https://github.com/hwchase17/langchain/blob/c02eb199b6587aeeb50fbb083693572bd2f030cc/langchain/__init__.py#L35

However, when grepping in the library directory, I do not find it:

:~/projects/langchain_sandbox/.env/lib/python3.9/site-packages/langchain $ grep -r PromptTemplate .

Relevant versions of my packages:

$ pip freeze | grep "langchain\|openai"
langchain==0.0.16
openai==0.25.0

Any advice? Thanks! Excited about this work.
