
trulens's People

Contributors

anupam128, arn-tru, caleblutru, coreyhu, daniel-huang-1230, davidkurokawa, dependabot[bot], divya-gopinath, ejisoo, github-actions[bot], ingridstevens, josephrp, joshreini1, klasleino, lariel-fernandes, mymoza, nikhil-vytla, noahvl, piotrm0, rajib76, rshih32, sayedsohan, schmidtseb, shayaks, stokedout, timbmg, venkatkakoju, vivekgangasani, walnutdust, yisding


trulens's Issues

OpenAI Chat LLM with ChatPromptTemplate raises error with TruChain

Hi,

I'm trying to integrate trulens-eval into our setup.
We are using the ChatOpenAI model with ChatPromptTemplate in LangChain.
Calling the chain directly works fine; doing the same through TruChain results in an error.
Versions:
trulens-eval==0.10.0
langchain==0.0.266

This code reproduces the issue and demonstrates that calling the chain directly works.

from trulens_eval import TruChain
from langchain.chat_models import ChatOpenAI
from langchain.chains import LLMChain
from langchain.schema import SystemMessage
from langchain.prompts.chat import ChatPromptTemplate


llm = ChatOpenAI(temperature=0.9)
# you need to set the OpenAI API key

prompt = ChatPromptTemplate.from_messages(
    [SystemMessage(content="You are a friendly bot, who speaks like a dog")]
)

chain = LLMChain(llm=llm, prompt=prompt, verbose=True)

truchain = TruChain(
    chain,
    app_id="Chain1_ChatApplication",
)

input = {"input": "Hello"}
result_chain = chain(input)
print("Result from chain works fine: " + str(result_chain))
result = truchain(input)
print("Result from trulens: " + str(result))

Request to add functionality to Trulens to measure Latency

Measuring latency would let us compare how effective these models are across different implementations, and would be useful for evaluating and comparing the performance of this wrapper. In particular, when working with an eGPU in a sequential process, it could help triage bottleneck issues.
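Until built-in latency tracking lands, a minimal manual sketch (chain and the input dict stand in for whatever app and payload you are timing):

import time

def timed_call(chain, inputs):
    # Wall-clock latency around a single invocation.
    start = time.perf_counter()
    result = chain(inputs)
    return result, time.perf_counter() - start

result, latency_s = timed_call(chain, {"input": "Hello"})
print(f"latency: {latency_s:.3f}s")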

"Display full app json" always shows data for first row on the Dashboard

If I open the trulens-eval dashboard Evaluations page -> http://localhost:8501/Evaluations
and have multiple records,
when I select a different record than the first one,
the first record's data is still shown in the bottom "Display full app json" section.
Expected behaviour: the data for the selected row should be displayed.

My code to run trulens:

truchain = TruChain(chain, app_id="app_id", tags="my tag")
await truchain.acall_with_record(variables)

trulens-eval version: 0.11.0

Quickstart example not actually logging

The change to use a context manager doesn't actually log anything:

with tru_llm_standalone as recording:
    llm_standalone(prompt_input)

If instead I change that to

tru_llm_standalone.call_with_record(prompt_input)

I see records logged.
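For what it's worth, my understanding (worth verifying against your trulens-eval version) is that with the context-manager style the record is meant to be retrieved from the recording object rather than logged implicitly:

with tru_llm_standalone as recording:
    llm_standalone(prompt_input)

record = recording.get()  # the record captured inside the with-block, if supported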

[Bug] Dashboard breaks with `llama-index>0.7.23`

Affects releases: 0.9.0

Issue

Dashboard breaks if my Python environment has llama-index 0.7.24.post1 or later.

I'm not sure whether it's an issue with the new llama-index releases or a compatibility issue between them and trulens-eval.

I can only get it to work by setting llama-index>=0.7.16,<=0.7.23 in requirements.txt or pyproject.toml

Reproduce

Update to llama-index==0.7.24.post1 or later (up to 0.8.2), then run Tru().start_dashboard() and browse to the dashboard.

See dashboard log here: log.txt

Suggestion

A temporary workaround would be pinning the llama-index version to 0.7.23 in requirements.txt and setup.py.

But users of trulens-eval would miss out on future llama-index updates while the issue lasts.
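For example, the temporary pin in requirements.txt would be the range already mentioned above:

llama-index>=0.7.16,<=0.7.23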

trulens doesn't work with llama_index 0.7.5

The code below:

import os
os.environ["OPENAI_API_KEY"] = "sk-***"
os.environ["HUGGINGFACE_API_KEY"] = "hf_***"

# Imports main tools:
from trulens_eval import TruLlama, Feedback, Tru, feedback
tru = Tru()

from llama_index import VectorStoreIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader('data').load_data()
index = VectorStoreIndex.from_documents(documents)

query_engine = index.as_query_engine()

response = query_engine.query("What did the author do growing up?")
print(response)

import numpy as np



# Initialize Huggingface-based feedback function collection class:
hugs = feedback.Huggingface()
openai = feedback.OpenAI()

# Define a language match feedback function using HuggingFace.
f_lang_match = Feedback(hugs.language_match).on_input_output()
# By default this will check language match on the main app input and main app
# output.

# Question/answer relevance between overall question and answer.
f_qa_relevance = Feedback(openai.relevance).on_input_output()

# Question/statement relevance between question and each context chunk.
f_qs_relevance = Feedback(openai.qs_relevance).on_input().on(
    TruLlama.select_source_nodes().node.text
).aggregate(np.min)

tru_query_engine = TruLlama(query_engine,
    app_id='LlamaIndex_App1',
    feedbacks=[f_lang_match, f_qa_relevance, f_qs_relevance])

generates:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/mnt/llm/trulens/1.py in line 33
     27 # Question/statement relevance between question and each context chunk.
     29 f_qs_relevance = Feedback(openai.qs_relevance).on_input().on(
     30     TruLlama.select_source_nodes().node.text
     31 ).aggregate(np.min)
---> 33 tru_query_engine = TruLlama(query_engine,
     34     app_id='LlamaIndex_App1',
     35     feedbacks=[f_lang_match, f_qa_relevance, f_qs_relevance])

File ~/.conda/envs/llamaindex/lib/python3.10/site-packages/trulens_eval/tru_llama.py:134, in TruLlama.__init__(self, app, **kwargs)
    132 kwargs['app'] = app
    133 kwargs['root_class'] = Class.of_object(app)  # TODO: make class property
--> 134 kwargs['instrument'] = LlamaInstrument()
    136 super().__init__(**kwargs)

File ~/.conda/envs/llamaindex/lib/python3.10/site-packages/trulens_eval/tru_llama.py:101, in LlamaInstrument.__init__(self)
     97 def __init__(self):
     98     super().__init__(
     99         root_method=TruLlama.query_with_record,
    100         modules=LlamaInstrument.Default.MODULES,
--> 101         classes=LlamaInstrument.Default.CLASSES(),  # was thunk
    102         methods=LlamaInstrument.Default.METHODS
    103     )

File ~/.conda/envs/llamaindex/lib/python3.10/site-packages/trulens_eval/tru_llama.py:55, in LlamaInstrument.Default.<lambda>()
     42 MODULES = {"llama_index."}.union(
     43     LangChainInstrument.Default.MODULES
     44 )  # NOTE: llama_index uses langchain internally for some things
     46 # Putting these inside thunk as llama_index is optional.
     47 CLASSES = lambda: {
     48     llama_index.indices.query.base.BaseQueryEngine,
     49     llama_index.indices.base_retriever.BaseRetriever,
     50     llama_index.indices.base.BaseIndex,
     51     llama_index.chat_engine.types.BaseChatEngine,
     52     llama_index.prompts.base.Prompt,
     53     # llama_index.prompts.prompt_type.PromptType, # enum
     54     llama_index.question_gen.types.BaseQuestionGenerator,
---> 55     llama_index.indices.query.response_synthesis.ResponseSynthesizer,
     56     llama_index.indices.response.refine.Refine,
     57     llama_index.llm_predictor.LLMPredictor,
     58     llama_index.llm_predictor.base.LLMMetadata,
     59     llama_index.llm_predictor.base.BaseLLMPredictor,
     60     llama_index.vector_stores.types.VectorStore,
     61     llama_index.question_gen.llm_generators.BaseQuestionGenerator,
     62     llama_index.indices.service_context.ServiceContext,
     63     llama_index.indices.prompt_helper.PromptHelper,
     64     llama_index.embeddings.base.BaseEmbedding,
     65     llama_index.node_parser.interface.NodeParser
     66 }.union(LangChainInstrument.Default.CLASSES())
     68 # Instrument only methods with these names and of these classes. Ok to
     69 # include llama_index inside methods.
     70 METHODS = dict_set_with(
     71     {
     72         "get_response":
   (...)
     94     }, LangChainInstrument.Default.METHODS
     95 )

AttributeError: module 'llama_index.indices.query' has no attribute 'response_synthesis'

Request for Citation

Hi, it is exciting to have TruLens. Could a BibTeX citation be provided so users can cite the repo?

Random gradients in torch summarizer pipeline

This code produces different gradients on each run:

from trulens.nn.attribution import InputAttribution, InternalInfluence
from trulens.nn.attribution import IntegratedGradients
from trulens.nn.quantities import MaxClassQoI
from trulens.nn.distributions import PointDoi, LinearDoi
from trulens.nn.slices import Cut, InputCut, OutputCut, Slice
from trulens.nn.models import get_model_wrapper

from transformers import pipeline

summarizer = pipeline("summarization")

summarizer(
    """
    America has changed dramatically during recent years. Not only has the number of 
    graduates in traditional engineering disciplines such as mechanical, civil, 
    electrical, chemical, and aeronautical engineering declined, but in most of 
    the premier American universities engineering curricula now concentrate on 
    and encourage largely the study of engineering science. As a result, there 
    are declining offerings in engineering subjects dealing with infrastructure, 
    the environment, and related issues, and greater concentration on high 
    technology subjects, largely supporting increasingly complex scientific 
    developments. While the latter is important, it should not be at the expense 
    of more traditional engineering.

    Rapidly developing economies such as China and India, as well as other 
    industrial countries in Europe and Asia, continue to encourage and advance 
    the teaching of engineering. Both China and India, respectively, graduate 
    six and eight times as many traditional engineers as does the United States. 
    Other industrial countries at minimum maintain their output, while America 
    suffers an increasingly serious decline in the number of engineering graduates 
    and a lack of well-educated engineers.
"""
)

m = summarizer.model.to("cpu")
tokenizer = summarizer.tokenizer

batch_sentences = ["Hello I'm a single sentence", "And another sentence", "And the very very last one"]
batch = tokenizer(batch_sentences, padding=True, truncation=True, return_tensors="pt").to(m.device)
input = batch['input_ids']

wrapped_model = get_model_wrapper(m, input_shape=(None, 7), device='cpu')

embed_cut = Cut('model_encoder_layers_0_self_attn_q_proj', anchor='in')
from trulens.nn.quantities import QoI
import torch
class SummarizerQoI(QoI):
    def __call__(self, seq_2_seq_output):
        logits = seq_2_seq_output['logits']
        
        max_token, max_indices = torch.max(logits,dim=-1)
        qoi = torch.mean(max_token, 1, True)
        return qoi

infl = InternalInfluence(
    wrapped_model, 
    Slice(embed_cut, OutputCut()), 
    SummarizerQoI(),
    PointDoi(cut=embed_cut))

attrs = infl.attributions(**batch)
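A diagnostic sketch, assuming the nondeterminism comes from unseeded RNGs or dropout left in train mode rather than from trulens itself:

import random

import numpy as np
import torch

random.seed(0)
np.random.seed(0)
torch.manual_seed(0)
m.eval()  # disable dropout so repeated attribution runs see the same network

attrs_1 = infl.attributions(**batch)
attrs_2 = infl.attributions(**batch)
# If attrs_1 and attrs_2 still differ, the randomness is not coming from RNG state.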

ValueError in the tensorflow 2 / keras notebook provided in the website.

ValueError in the tensorflow 2 / keras notebook provided in the website.

The error occurs when executing the 6th code cell, it contains following code:

infl = InputAttribution(model)
attrs_input = infl.attributions(x_pp)

Error Log:

ValueError                                Traceback (most recent call last)
<ipython-input-18-9b10bbb33c8d> in <module>()
      1 infl = InputAttribution(model)
----> 2 attrs_input = infl.attributions(x_pp)

/usr/local/lib/python3.7/dist-packages/trulens/nn/attribution.py in attributions(self, *model_args, **model_kwargs)
    269             to_cut=self.slice.to_cut,
    270             intervention=D,
--> 271             doi_cut=doi_cut)
    272         # Take the mean across the samples in the DoI.
    273         if isinstance(qoi_grads, DATA_CONTAINER_TYPE):

/usr/local/lib/python3.7/dist-packages/trulens/nn/models/keras.py in qoi_bprop(self, qoi, model_args, model_kwargs, doi_cut, to_cut, attribution_cut, intervention)
    430             intervention, DATA_CONTAINER_TYPE) else [intervention]
    431 
--> 432         Q = qoi(to_tensors[0]) if len(to_tensors) == 1 else qoi(to_tensors)
    433 
    434         doi_tensors, intervention = self._prepare_intervention_with_input(

/usr/local/lib/python3.7/dist-packages/trulens/nn/quantities.py in __call__(self, y)
    109 
    110     def __call__(self, y: TensorLike) -> TensorLike:
--> 111         self._assert_cut_contains_only_one_tensor(y)
    112 
    113         if self.activation is not None:

/usr/local/lib/python3.7/dist-packages/trulens/nn/quantities.py in _assert_cut_contains_only_one_tensor(self, x)
     76                 '`{}` expected to receive an instance of `Tensor`, but '
     77                 'received an instance of {}'.format(
---> 78                     self.__class__.__name__, type(x)))
     79 
     80 

ValueError: `MaxClassQoI` expected to receive an instance of `Tensor`, but received an instance of <class 'keras.engine.keras_tensor.KerasTensor'>

AzureOpenAI fails: missing deployment_id

Bug when using the new AzureOpenAI wrapper added in PR #242

How to reproduce:

azure = AzureOpenAI(model_engine="gpt-35-turbo", deployment_id="gpt-4")

Error:

Traceback (most recent call last):
  File "/home/lab/lab392-rfpvirtualassistant/scripts/langchain/eval.py", line 15, in <module>
    pipeline = QAPipeline(config['AWS']['AWSKendraSearchIndexId'], config['OpenAI']['OpenAIAnswerDeployment'], eval_mode=True, answers=answers)
  File "/home/lab/lab392-rfpvirtualassistant/scripts/langchain/pipeline.py", line 58, in __init__
    azure = AzureOpenAI(model_engine="gpt-35-turbo", deployment_id=openAIDeployment)
  File "/home/lab/anaconda/envs/lab392/lib/python3.10/site-packages/trulens_eval/feedback.py", line 1689, in __init__
    super().__init__(
  File "/home/lab/anaconda/envs/lab392/lib/python3.10/site-packages/trulens_eval/feedback.py", line 1049, in __init__
    super().__init__(
  File "/home/lab/anaconda/envs/lab392/lib/python3.10/site-packages/trulens_eval/feedback.py", line 1020, in __init__
    super().__init__(*args, **kwargs)
  File "/home/lab/anaconda/envs/lab392/lib/python3.10/site-packages/trulens_eval/util.py", line 1660, in __init__
    super().__init__(__tru_class_info=class_info, **kwargs)
  File "pydantic/main.py", line 341, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for AzureOpenAI
deployment_id
  field required (type=value_error.missing)

I believe the problem is that in the OpenAI class, the deployment_id parameter is lost because it is never assigned into self_kwargs. A PR with my suggested fix is incoming; a rough sketch of the idea follows.
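(A hypothetical reconstruction, not the actual source:)

# Inside trulens_eval/feedback.py, class OpenAI (surrounding signature assumed):
def __init__(self, *args, deployment_id=None, **kwargs):
    self_kwargs = dict(kwargs)
    # Forward deployment_id so pydantic validation sees the required field:
    self_kwargs['deployment_id'] = deployment_id
    super().__init__(*args, **self_kwargs)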

Explain semantic segmentation models via PyTorch backend

Hi team, I'm delighted to explore this library for finding notorious issues in our trained models.

I'm trying to apply TruLens to various tasks (CV: segmentation, localisation; NLP: classification, generation, etc.) and got stuck at the initial stages.

For starters, I tried to load pre-trained semantic/instance segmentation models and couldn't get them to work (after spending a few cycles on debugging).

A minute change in the intro_demo_pytorch.ipynb at the line:

pytorch_model = models.segmentation.fcn_resnet50(pretrained=True)

throws the following error while computing the InputAttribution():

ValueError                                Traceback (most recent call last)
<ipython-input-11-9b10bbb33c8d> in <module>()
      1 infl = InputAttribution(model)
----> 2 attrs_input = infl.attributions(x_pp)

3 frames
/usr/local/lib/python3.7/dist-packages/trulens/nn/attribution.py in attributions(self, *model_args, **model_kwargs)
    269             to_cut=self.slice.to_cut,
    270             intervention=D,
--> 271             doi_cut=doi_cut)
    272         # Take the mean across the samples in the DoI.
    273         if isinstance(qoi_grads, DATA_CONTAINER_TYPE):

/usr/local/lib/python3.7/dist-packages/trulens/nn/models/pytorch.py in qoi_bprop(self, qoi, model_args, model_kwargs, doi_cut, to_cut, attribution_cut, intervention)
    461         grads_list = []
    462         for z in zs:
--> 463             z_flat = ModelWrapper._flatten(z)
    464             qoi_out = qoi(y)
    465 

/usr/local/lib/python3.7/dist-packages/trulens/nn/quantities.py in __call__(self, y)
    109 
    110     def __call__(self, y: TensorLike) -> TensorLike:
--> 111         self._assert_cut_contains_only_one_tensor(y)
    112 
    113         if self.activation is not None:

/usr/local/lib/python3.7/dist-packages/trulens/nn/quantities.py in _assert_cut_contains_only_one_tensor(self, x)
     76                 '`{}` expected to receive an instance of `Tensor`, but '
     77                 'received an instance of {}'.format(
---> 78                     self.__class__.__name__, type(x)))
     79 
     80 

ValueError: `MaxClassQoI` expected to receive an instance of `Tensor`, but received an instance of <class 'collections.OrderedDict'>

Can you share some guidance on triaging this error? I'm using the same input image and transforms.
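One workaround that may apply here (see also "Handle single-entry dictionary output in default QoIs" below): a custom QoI that unwraps the OrderedDict torchvision segmentation models return. The class name is hypothetical, and wrapped_model stands for the result of get_model_wrapper:

from trulens.nn.attribution import InternalInfluence
from trulens.nn.distributions import PointDoi
from trulens.nn.quantities import MaxClassQoI
from trulens.nn.slices import InputCut, OutputCut, Slice

class SegmentationMaxClassQoI(MaxClassQoI):
    # torchvision segmentation models return an OrderedDict whose logits
    # tensor lives under the 'out' key.
    def __call__(self, y):
        return super().__call__(y['out'])

infl = InternalInfluence(
    wrapped_model,
    Slice(InputCut(), OutputCut()),
    SegmentationMaxClassQoI(),
    PointDoi())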

Using it without the SQLite db

Hi team,
the record tracking function looks great!
I would like to ask: is it possible to use it without running a db? I mainly want to use it to get the tracking record for each of my calls.
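A minimal sketch of what might work, assuming call_with_record (used elsewhere in these issues) returns the response together with its Record; whether the write to the default SQLite database can be skipped entirely is a question for the maintainers:

from trulens_eval import TruChain

truchain = TruChain(chain, app_id="no_db_app")  # chain: your LangChain chain
response, record = truchain.call_with_record({"input": "Hello"})
print(record.json())  # inspect or persist the record yourself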

Thank you very much

Multiple tags for a record?

I was wondering if it is supported to add multiple tags for a record.

Example code from the documentation for adding tags:
https://www.trulens.org/trulens_eval/langchain_quickstart/#instrument-chain-for-logging-with-trulens

truchain = TruChain(chain,
    app_id='Chain1_ChatApplication',
    feedbacks=[f_lang_match],
    tags = "prototype")

What if I wanted to add one more tag, e.g.
tags = ["tag1","tag2"]

Is something like this supported?
Currently I get an error saying that a single string is expected.
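Until list-valued tags are supported, one workaround (a convention of my own, not an official API) is to pack the tags into the single string and split it when reading records back:

tags = ["tag1", "tag2"]

truchain = TruChain(
    chain,
    app_id='Chain1_ChatApplication',
    feedbacks=[f_lang_match],
    tags=",".join(tags),  # stored as the single string "tag1,tag2"
)

# when reading records back:
parsed_tags = "tag1,tag2".split(",")  # -> ["tag1", "tag2"]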

Usability QoL items

  • Move the quickstart to the top and, before the hard-to-see badges, add: "Quickstart PyTorch and TensorFlow/Keras notebooks are below:"
  • In the notebooks, make clear that the attribution shape is the input shape, and that the mask visualizer is just a helper for image data, not mandatory.

Can it work with a Local LLM?

import os
from langchain import PromptTemplate, LLMChain
from langchain.llms import GPT4All
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.vectorstores.faiss import FAISS
from langchain.embeddings import HuggingFaceEmbeddings
from trulens_eval import TruChain, Feedback, Huggingface, Tru


tru = Tru()


os.environ["HUGGINGFACE_API_KEY"] = ""


gpt4all_path = './models/gpt4all-converted.bin'


callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])


embeddings = HuggingFaceEmbeddings(model_name='sentence-transformers/all-MiniLM-L6-v2')
hugs = Huggingface()


f_lang_match = Feedback(hugs.language_match).on_input_output()

llm = GPT4All(model=gpt4all_path,verbose=True,temp=0.2)


index = FAISS.load_local("my_faiss_index", embeddings)


template = """Answer the following question using {context}
Question: {question}
Answer:
"""


def get_best_answer(question):
    matched_docs, sources = similarity_search(question, index, n=4)
    context = "\n".join([doc.page_content for doc in matched_docs])
    prompt = PromptTemplate(template=template, input_variables=["context", "question"]).partial(context=context)
    llm_chain = LLMChain(prompt=prompt, llm=llm)
    truchain = TruChain(llm_chain, app_id='Chain1_ChatApplication', feedbacks=[f_lang_match],tags = "prototype")
    answer = truchain.run(question)
    return answer



def similarity_search(query, index, n=4):
    matched_docs = index.similarity_search(query, k=n)
    sources = []
    for doc in matched_docs:
        sources.append(
            {
                "page_content": doc.page_content,
                "metadata": doc.metadata,
            }
        )
    return matched_docs, sources


while True:

    question = input("Please enter your question (or type 'exit' to close the program): ")


    if question.lower() == "exit":
        break


    answer = get_best_answer(question)


    print("Answer:", answer)

gives

✅ app Chain1_ChatApplication -> default.sqlite
✅ feedback def. feedback_definition_hash_44c43fe23fdbb98055154e6bb126142a -> default.sqlite
Traceback (most recent call last):
  File "c:\Users\abhishek\Local-LLM\gpt4all_truelens.py", line 75, in <module>
    answer = get_best_answer(question)
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Users\abhishek\Local-LLM\gpt4all_truelens.py", line 48, in get_best_answer
    answer = truchain.run(question)
             ^^^^^^^^^^^^^^^^^^^^^^
TypeError: 'RuntimeError' object is not callable

[Bug] Selector `GetItems` crashes with `AttributeError`

How to reproduce:

from trulens_eval import Select

selector = Select.RecordCalls._call.args.inputs[["key1", "key2"]]

repr(selector)

Output:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../.venv/lib/python3.9/site-packages/trulens_eval/util.py", line 951, in __repr__
    return "JSONPath()" + ("".join(map(repr, self.path)))
  File ".../.venv/lib/python3.9/site-packages/trulens_eval/util.py", line 926, in __repr__
    return f"[{','.join(self.indices)}]"
AttributeError: 'GetItems' object has no attribute 'indices'

Then, obviously, this crashes any Feedback that uses this selector.

Suggested fix

In util.py (class GetItems):

    def __repr__(self):
        return f"[{','.join(self.items)}]"

QoI has no access to instance being explained

Quantities of interest receive output values on DoI samples, but these are not necessarily the instance being explained, except in some cases like PointDoi. Because of this, one cannot define a QoI for "the explained instance's predicted-class logits": the predicted class is not known unless the DoI sample is the same as the explained instance.

A related issue is that some may interpret a QoI like the default max to mean what I described above, but because of the issue noted, that is not what the max QoI implements.
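A workaround sketch: compute the explained instance's predicted class up front, then pin the QoI to it with ClassQoI, so every DoI sample contributes to that fixed logit (assumes the wrapper's fprop returns the model outputs; wrapped_model and x are stand-ins):

import numpy as np

from trulens.nn.attribution import InternalInfluence
from trulens.nn.distributions import LinearDoi
from trulens.nn.quantities import ClassQoI
from trulens.nn.slices import InputCut, OutputCut, Slice

# Predicted class of the instance x being explained:
predicted_class = int(np.argmax(wrapped_model.fprop((x,))[0]))

# Attributions toward that fixed class, regardless of what each DoI sample
# would itself predict:
infl = InternalInfluence(
    wrapped_model,
    Slice(InputCut(), OutputCut()),
    ClassQoI(predicted_class),
    LinearDoi())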

[Bug] Missing Package and Unicode Encode Error

Hello, you forgot to import os :D
[screenshot]
And it seems that the symbols below only work with Streamlit; the Azure Functions libraries do not support encoding these Unicode characters. When I replace them with nothing, my app works. Could you consider removing them?
[screenshot]

[Bug] TruChain with AgentExecutor breaks LangChain

This bug affects release 0.9.0 and was probably introduced with #362 (I tested the codebase before and after that merge; before it, everything works fine).

Context

While trying to reproduce this example from LangChain, I noticed that TruChain with an AgentExecutor causes TypeError: 'NoneType' object is not subscriptable in langchain.chains.base.

Reproduce

from langchain import LLMMathChain
from langchain.agents import initialize_agent, Tool, AgentType
from langchain.chat_models import ChatOpenAI
from trulens_eval import Tru, TruChain

llm=ChatOpenAI(temperature=0, model="gpt-3.5-turbo-0613")

llm_math_chain = LLMMathChain.from_llm(llm, verbose=True)

tools = [
    Tool(
        name="Calculator",
        func=llm_math_chain.run,
        description="useful for when you need to answer questions about math"
    ),
]

agent = initialize_agent(tools, llm, agent=AgentType.OPENAI_FUNCTIONS, verbose=True)

app = TruChain(agent)

# works fine
agent(inputs={"input": "how much is Euler's number divided by PI"})

# raises `TypeError: 'NoneType' object is not subscriptable`
app(inputs={"input": "how much is Euler's number divided by PI"})

Output logs and stacktrace here: logs.txt

Handle single-entry dictionary output in default QoIs

It seems that the output of torchvision models is an OrderedDict rather than a torch.Tensor, and by default it has a single key, 'out'. Currently a custom QoI is required to handle this type of output:

class TorchvisionMaxClassQoI(MaxClassQoI):
    def __call__(self, y):
        return super().__call__(y['out'])

It would be nice to handle this case by default instead of requiring a custom QoI: if the input y to a QoI is a dict or OrderedDict with exactly one entry, then that entry should be used in place of y rather than raising an exception, as it is unambiguously the output we care about.
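A sketch of the proposed default (helper name hypothetical):

def unwrap_single_entry(y):
    # OrderedDict subclasses dict, so this covers torchvision outputs too.
    if isinstance(y, dict) and len(y) == 1:
        return next(iter(y.values()))
    return y

The built-in QoIs could call this on their input before the single-tensor assertion.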

Installation fails due to fastavro

The trulens package pins fastavro==1.7.4.
That version has an issue which has been fixed in the latest release (1.8.2).
Can you please update the pin?

[FR] Allow passing `dict` as input to feedback function

Context

Some feedback functions may need to operate on the final formatted prompt, not only on the prompt inputs. For example, when using the ReAct pattern, the prompt might contain dynamic instructions or examples that need to be taken into account when evaluating the LLM's behavior.

One could try to achieve that by passing both the prompt template and the inputs dictionary to the feedback function, then letting it format the prompt before calculating the metric:

class CustomProvider(Provider):
    def my_feedback_func(self, prompt: str, inputs: dict) -> float:
        prompt = prompt.format(**inputs)
        return float(len(prompt))

Feedback(CustomProvider().my_feedback_func) \
    .on(Select.App.app.prompt.template) \
    .on(Select.RecordCalls._call.args.inputs)

However, that is not possible, because the FeedbackCall schema doesn't allow a dict as an argument to the feedback function.
Using the above-defined feedback in an app produces the following error:

Feedback Function Exception Caught: Traceback (most recent call last):
  File ".../.venv/lib/python3.9/site-packages/trulens_eval/feedback.py", line 862, in run
    feedback_call = FeedbackCall(
  File "pydantic/main.py", line 341, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for FeedbackCall
args -> inputs
  str type expected (type=type_error.str)

Suggestion

Change the schema of FeedbackCall to the following:

class FeedbackCall(SerialModel):
    args: Dict[str, Union[str, JSON]]
    ret: float

Dependency of visualizers on backend leads to confusing error

Many visualizers, including the Tiler class, should be usable without necessarily using other TruLens features. However, the backend is typically only set when the user calls get_model_wrapper, unless they have explicitly set the backend environment variable (which is probably not the norm). This creates a problem when using the Tiler, since it uses the backend to determine the channel dimension. When the backend has not been set, either via the environment variable or by calling get_model_wrapper, it comes back as None, leading to an error that will likely confuse a user who only wants the visualization library of TruLens.

I suggest we handle the case where the backend is not set when the Tiler is called (and anywhere else the channel dimension comes up in the visualizers). Most of the time the channel dimension can be inferred, because it is used for displaying RGB or grayscale images, which can only have 3 or 1 as the size of the channel dimension. The only ambiguous case is when the image itself has a height or width of 3 or 1; there, we could adopt a default, e.g. the convention used by matplotlib (which I believe is channels-last).

The new process when trying to obtain the channel dimension would be:

  • check if the backend is set. If so, use the backend channel dimension;
  • otherwise, check if there is a single dimension with size 1 or 3. If so, use that as the channel dimension;
  • otherwise, assume the user has provided images in the format that the visualization library (matplotlib) expects.

This way the visualizations module never throws errors simply because the user never specified the backend.
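A sketch of that decision procedure (function name hypothetical; assumes a batched numpy image array):

import numpy as np

def infer_channel_axis(x: np.ndarray, backend_channel_axis=None) -> int:
    # 1. If a backend is set, trust its channel axis.
    if backend_channel_axis is not None:
        return backend_channel_axis
    # 2. Otherwise look for a unique non-batch axis of size 1 or 3.
    candidates = [i for i, s in enumerate(x.shape[1:], start=1) if s in (1, 3)]
    if len(candidates) == 1:
        return candidates[0]
    # 3. Ambiguous (e.g. a 1- or 3-pixel-wide image): fall back to
    #    matplotlib's channels-last convention.
    return x.ndim - 1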

Error when running IntegratedGradients.attributions(image)

Hello! Would appreciate any help to solve this problem! I'm trying to use the IntegratedGradients() to explain my model.

yolo = Load_Yolo_model() # yolo is <class 'tensorflow.python.keras.engine.functional.Functional'> object
model_wrapped = get_model_wrapper(yolo)
ig_computer = IntegratedGradients(model_wrapped, resolution=20)

with PIL.Image.open(image_path).convert('RGB') as img:
    x = np.array(img.resize((416, 416)))
    x_np = np.array(img.resize((416, 416), PIL.Image.ANTIALIAS))[np.newaxis]
    input_attributions = ig_computer.attributions(x_np)

But run into a problem:

Traceback (most recent call last):
  File "trulens_yolov3.py", line 31, in <module>
    input_attributions = ig_computer.attributions(x_np)
  File "/home/ubuntu/anaconda3/envs/TF23/lib/python3.8/site-packages/trulens/nn/attribution.py", line 264, in attributions
    qoi_grads = self.model.qoi_bprop(
  File "/home/ubuntu/anaconda3/envs/TF23/lib/python3.8/site-packages/trulens/nn/models/tensorflow_v2.py", line 424, in qoi_bprop
    Q = qoi(outputs[0]) if len(outputs) == 1 else qoi(outputs)
  File "/home/ubuntu/anaconda3/envs/TF23/lib/python3.8/site-packages/trulens/nn/quantities.py", line 111, in __call__
    self._assert_cut_contains_only_one_tensor(y)
  File "/home/ubuntu/anaconda3/envs/TF23/lib/python3.8/site-packages/trulens/nn/quantities.py", line 63, in _assert_cut_contains_only_one_tensor
    raise QoiCutSupportError(
trulens.nn.quantities.QoiCutSupportError: Cut provided to quantity of interest was comprised of multiple tensors, but `MaxClassQoI` is only defined for cuts comprised of a single tensor (received a list of 3 tensors).

Either (1) select a slice where the `to_cut` corresponds to a single tensor, or (2) implement/use a `QoI` object that supports lists of tensors, i.e., where the parameter, `x`, to `__call__` is expected/allowed to be a list of 3 tensors.
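Option (2) could look like the sketch below (class name hypothetical). YOLOv3 has three detection heads, so the cut yields three tensors; this simply scores the first, and a real QoI would combine all three:

from trulens.nn.quantities import MaxClassQoI

class FirstTensorMaxClassQoI(MaxClassQoI):
    # When the cut yields a list of tensors, score only the first one.
    def __call__(self, y):
        if isinstance(y, (list, tuple)):
            y = y[0]
        return super().__call__(y)

Whether IntegratedGradients accepts a custom QoI directly or requires going through InternalInfluence may depend on the trulens version.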

Default cut to InputAttribution is incorrect

The default cut argument to InputAttribution is set to None:

cut: CutLike = None,

This is then passed to InternalInfluence with a cut (InputCut(), None), and InternalInfluence interprets a None cut as an InputCut(). This is not the intended default behavior, which should be OutputCut().

This causes correctness issues, where attributions are returned identical to the input (tested on tensorflow==2.4.0).
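Until the default is corrected, passing the slice explicitly sidesteps the bug; a sketch mirroring the InternalInfluence usage elsewhere in these issues (wrapped_model stands for the result of get_model_wrapper):

from trulens.nn.attribution import InternalInfluence
from trulens.nn.distributions import PointDoi
from trulens.nn.quantities import MaxClassQoI
from trulens.nn.slices import InputCut, OutputCut, Slice

# Explicit to_cut=OutputCut() instead of relying on the broken None default:
infl = InternalInfluence(
    wrapped_model,
    Slice(InputCut(), OutputCut()),
    MaxClassQoI(),
    PointDoi())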

Testing Local LLMs?

(Same code and error as in "Can it work with a Local LLM?" above.)

Explain object detection models via PyTorch backend

Hi team,

I'm trying to explain the output of localization models (fasterrcnn_resnet50_fpn, and others in Torchvision).
There are no issues when I wrap the PyTorch model in get_model_wrapper(), but passing the wrapped model to InputAttribution() or IntegratedGradients() throws what seems like a trivial PyTorch error (below).

Would you know where in trulens/nn/models/pytorch.py we are creating views, and whether that's something that can be quickly fixed?

RuntimeError                              Traceback (most recent call last)
<ipython-input-43-9b10bbb33c8d> in <module>()
      1 infl = InputAttribution(model)
----> 2 attrs_input = infl.attributions(x_pp)

7 frames
/usr/local/lib/python3.7/dist-packages/trulens/nn/attribution.py in attributions(self, *model_args, **model_kwargs)
    269             to_cut=self.slice.to_cut,
    270             intervention=D,
--> 271             doi_cut=doi_cut)
    272         # Take the mean across the samples in the DoI.
    273         if isinstance(qoi_grads, DATA_CONTAINER_TYPE):

/usr/local/lib/python3.7/dist-packages/trulens/nn/models/pytorch.py in qoi_bprop(self, qoi, model_args, model_kwargs, doi_cut, to_cut, attribution_cut, intervention)
    455             attribution_cut=attribution_cut,
    456             intervention=intervention,
--> 457             return_tensor=False)
    458 
    459         y = to_cut.access_layer(y)

/usr/local/lib/python3.7/dist-packages/trulens/nn/models/pytorch.py in fprop(self, model_args, model_kwargs, doi_cut, to_cut, attribution_cut, intervention, return_tensor, input_timestep)
    370         ]
    371         # Run the network.
--> 372         output = self.model(*model_args, **model_kwargs)
    373         if isinstance(output, tuple):
    374             output = output[0]

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1049         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1050                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051             return forward_call(*input, **kwargs)
   1052         # Do not call functions when jit is used
   1053         full_backward_hooks, non_full_backward_hooks = [], []

/usr/local/lib/python3.7/dist-packages/torchvision/models/detection/generalized_rcnn.py in forward(self, images, targets)
     75             original_image_sizes.append((val[0], val[1]))
     76 
---> 77         images, targets = self.transform(images, targets)
     78 
     79         # Check for degenerate boxes

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1049         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1050                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051             return forward_call(*input, **kwargs)
   1052         # Do not call functions when jit is used
   1053         full_backward_hooks, non_full_backward_hooks = [], []

/usr/local/lib/python3.7/dist-packages/torchvision/models/detection/transform.py in forward(self, images, targets)
    118 
    119         image_sizes = [img.shape[-2:] for img in images]
--> 120         images = self.batch_images(images, size_divisible=self.size_divisible)
    121         image_sizes_list: List[Tuple[int, int]] = []
    122         for image_size in image_sizes:

/usr/local/lib/python3.7/dist-packages/torchvision/models/detection/transform.py in batch_images(self, images, size_divisible)
    222         batched_imgs = images[0].new_full(batch_shape, 0)
    223         for img, pad_img in zip(images, batched_imgs):
--> 224             pad_img[: img.shape[0], : img.shape[1], : img.shape[2]].copy_(img)
    225 
    226         return batched_imgs

RuntimeError: A view was created in no_grad mode and is being modified inplace with grad mode enabled. This view is the output of a function that returns multiple views. Such functions do not allow the output views to be modified inplace. You should replace the inplace operation by an out-of-place one.

sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) server closed the connection unexpectedly

I am running on an Ubuntu 20.04 machine and have integrated trulens with llama-index for querying. Here is the code:

from multiprocessing.managers import BaseManager
from trulens_eval import TruLlama, Feedback, Tru, feedback
from sqlalchemy import URL
from llama_index import SimpleWebPageReader
from llama_index import VectorStoreIndex
import numpy as np

def initialize_trulens():
    global feedbacks

    # Initiating the trulens dashboard
    url_object = URL.create(
        "postgresql+psycopg2",
        username=os.environ.get("SUPERBASE_USERNAME"),
        password=os.environ.get("SUPERBASE_PASSWORD"), 
        host=os.environ.get("SUPERBASE_HOST"),
        port=os.environ.get("SUPERBASE_PORT"),
        database=os.environ.get("SUPERBASE_DATABASE")
    )
    tru = Tru(database_url= url_object)
    tru.run_dashboard()

    # Initialize Huggingface-based feedback function collection class:
    hugs = feedback.Huggingface()
    openai = feedback.OpenAI()
    # Define a language match feedback function using HuggingFace.
    f_lang_match = Feedback(hugs.language_match).on_input_output()
    # By default this will check language match on the main app input and main app
    # output.

    # Question/answer relevance between overall question and answer.
    f_qa_relevance = Feedback(openai.relevance).on_input_output()

    # Question/statement relevance between question and each context chunk.
    f_qs_relevance = Feedback(openai.qs_relevance).on_input().on(
        TruLlama.select_source_nodes().node.text
    ).aggregate(np.min)

    feedbacks = [f_lang_match, f_qa_relevance, f_qs_relevance]

def query():
    documents = SimpleWebPageReader(html_to_text=True).load_data(
         ["http://paulgraham.com/worked.html"]
    )
    index = VectorStoreIndex.from_documents(documents)
    
    query_engine = index.as_query_engine()
    tru_query_engine_recorder = TruLlama(
        query_engine,
        app_id='LlamaIndex_App1',
        feedbacks=feedbacks
    )
    response = tru_query_engine_recorder.query("What did the author do growing up?")
    print(response)
    return response

    
if __name__ == "__main__":
    print("server started...")
    initialize_trulens()
    manager = BaseManager(('', 5602), b'password')
    manager.register('query_index', query)
    server = manager.get_server()
    server.serve_forever()

After starting the server, everything works and I can see the trulens dashboard. But after some time I get the following error and the dashboard gets disconnected from the database.

Tru was already initialized. Cannot change database_url=postgresql+psycopg2://<database_username>:<database_password>@<database_host>:<database_port>/<database_name> or database_file=None .

/usr/local/lib/python3.10/site-packages/matplotlib/axes/_axes.py:6826: RuntimeWarning: All-NaN slice encountered

  xmin = min(xmin, np.nanmin(xi))

/usr/local/lib/python3.10/site-packages/matplotlib/axes/_axes.py:6827: RuntimeWarning: All-NaN slice encountered

  xmax = max(xmax, np.nanmax(xi))

/usr/local/lib/python3.10/site-packages/matplotlib/axes/_axes.py:6826: RuntimeWarning: All-NaN slice encountered

  xmin = min(xmin, np.nanmin(xi))

/usr/local/lib/python3.10/site-packages/matplotlib/axes/_axes.py:6827: RuntimeWarning: All-NaN slice encountered

  xmax = max(xmax, np.nanmax(xi))

/usr/local/lib/python3.10/site-packages/matplotlib/axes/_axes.py:6826: RuntimeWarning: All-NaN slice encountered

  xmin = min(xmin, np.nanmin(xi))

/usr/local/lib/python3.10/site-packages/matplotlib/axes/_axes.py:6827: RuntimeWarning: All-NaN slice encountered

  xmax = max(xmax, np.nanmax(xi))

/usr/local/lib/python3.10/site-packages/matplotlib/axes/_axes.py:6826: RuntimeWarning: All-NaN slice encountered

  xmin = min(xmin, np.nanmin(xi))

/usr/local/lib/python3.10/site-packages/matplotlib/axes/_axes.py:6827: RuntimeWarning: All-NaN slice encountered

  xmax = max(xmax, np.nanmax(xi))

/usr/local/lib/python3.10/site-packages/matplotlib/axes/_axes.py:6826: RuntimeWarning: All-NaN slice encountered

  xmin = min(xmin, np.nanmin(xi))

/usr/local/lib/python3.10/site-packages/matplotlib/axes/_axes.py:6827: RuntimeWarning: All-NaN slice encountered

  xmax = max(xmax, np.nanmax(xi))

2023-09-26 14:01:57.225 MediaFileHandler: Missing file 16a57003a519fd6aa13a2f733bd4fa522ba8e58545862765c5d0ce92.png

2023-09-26 14:01:57.235 MediaFileHandler: Missing file 6b6481b1f4db67286783dec664cc63b2153e3c7c262a3bff8a328923.png

2023-09-26 14:01:57.236 MediaFileHandler: Missing file 37d4f4b62a7b6acf699e54df90a7bf371cbab2d17fca60c9372fd503.png

/usr/local/lib/python3.10/site-packages/matplotlib/axes/_axes.py:6826: RuntimeWarning: All-NaN slice encountered

  xmin = min(xmin, np.nanmin(xi))

/usr/local/lib/python3.10/site-packages/matplotlib/axes/_axes.py:6827: RuntimeWarning: All-NaN slice encountered

  xmax = max(xmax, np.nanmax(xi))

Tru was already initialized. Cannot change database_url=postgresql+psycopg2://<database_username>:<database_password>@<database_host>:<database_port>/<database_name> or database_file=None .

/usr/local/lib/python3.10/site-packages/matplotlib/axes/_axes.py:6826: RuntimeWarning: All-NaN slice encountered

  xmin = min(xmin, np.nanmin(xi))

/usr/local/lib/python3.10/site-packages/matplotlib/axes/_axes.py:6827: RuntimeWarning: All-NaN slice encountered

  xmax = max(xmax, np.nanmax(xi))

2023-09-26 14:02:14.752 MediaFileHandler: Missing file 16a57003a519fd6aa13a2f733bd4fa522ba8e58545862765c5d0ce92.png

2023-09-26 14:02:14.753 MediaFileHandler: Missing file 6b6481b1f4db67286783dec664cc63b2153e3c7c262a3bff8a328923.png

2023-09-26 14:02:14.754 MediaFileHandler: Missing file 37d4f4b62a7b6acf699e54df90a7bf371cbab2d17fca60c9372fd503.png

/usr/local/lib/python3.10/site-packages/matplotlib/axes/_axes.py:6826: RuntimeWarning: All-NaN slice encountered

  xmin = min(xmin, np.nanmin(xi))

/usr/local/lib/python3.10/site-packages/matplotlib/axes/_axes.py:6827: RuntimeWarning: All-NaN slice encountered

  xmax = max(xmax, np.nanmax(xi))

/usr/local/lib/python3.10/site-packages/matplotlib/axes/_axes.py:6826: RuntimeWarning: All-NaN slice encountered

  xmin = min(xmin, np.nanmin(xi))

/usr/local/lib/python3.10/site-packages/matplotlib/axes/_axes.py:6827: RuntimeWarning: All-NaN slice encountered

  xmax = max(xmax, np.nanmax(xi))

Tru was already initialized. Cannot change database_url=postgresql+psycopg2://<database_username>:<database_password>@<database_host>:<database_port>/<database_name> or database_file=None .

/usr/local/lib/python3.10/site-packages/matplotlib/axes/_axes.py:6826: RuntimeWarning: All-NaN slice encountered

  xmin = min(xmin, np.nanmin(xi))

/usr/local/lib/python3.10/site-packages/matplotlib/axes/_axes.py:6827: RuntimeWarning: All-NaN slice encountered

  xmax = max(xmax, np.nanmax(xi))

Tru was already initialized. Cannot change database_url=postgresql+psycopg2://<database_username>:<database_password>@<database_host>:<database_port>/<database_name> or database_file=None .

/usr/local/lib/python3.10/site-packages/matplotlib/axes/_axes.py:6826: RuntimeWarning: All-NaN slice encountered

  xmin = min(xmin, np.nanmin(xi))

/usr/local/lib/python3.10/site-packages/matplotlib/axes/_axes.py:6827: RuntimeWarning: All-NaN slice encountered

  xmax = max(xmax, np.nanmax(xi))

/usr/local/lib/python3.10/site-packages/matplotlib/axes/_axes.py:6826: RuntimeWarning: All-NaN slice encountered

  xmin = min(xmin, np.nanmin(xi))

/usr/local/lib/python3.10/site-packages/matplotlib/axes/_axes.py:6827: RuntimeWarning: All-NaN slice encountered

  xmax = max(xmax, np.nanmax(xi))

/usr/local/lib/python3.10/site-packages/matplotlib/axes/_axes.py:6826: RuntimeWarning: All-NaN slice encountered

  xmin = min(xmin, np.nanmin(xi))

/usr/local/lib/python3.10/site-packages/matplotlib/axes/_axes.py:6827: RuntimeWarning: All-NaN slice encountered

  xmax = max(xmax, np.nanmax(xi))

Tru was already initialized. Cannot change database_url=postgresql+psycopg2://<database_username>:<database_password>@<database_host>:<database_port>/<database_name> or database_file=None .

2023-09-26 14:18:28.504 Session with id 36790e30-3097-4309-acd6-b1d908187e66 is already connected! Connecting to a new session.

Tru was already initialized. Cannot change database_url=postgresql+psycopg2://<database_username>:<database_password>@<database_host>:<database_port>/<database_name> or database_file=None .

2023-09-26 14:18:28.840 Uncaught app exception

Traceback (most recent call last):

  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1965, in _exec_single_context

    self.dialect.do_execute(

  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/default.py", line 921, in do_execute

    cursor.execute(statement, parameters)

psycopg2.OperationalError: server closed the connection unexpectedly

	This probably means the server terminated abnormally

	before or while processing the request.

server closed the connection unexpectedly

	This probably means the server terminated abnormally

	before or while processing the request.





The above exception was the direct cause of the following exception:



Traceback (most recent call last):

  File "/usr/local/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 541, in _run_script

    exec(code, module.__dict__)

  File "/usr/local/lib/python3.10/site-packages/trulens_eval/Leaderboard.py", line 141, in <module>

    main()

  File "/usr/local/lib/python3.10/site-packages/trulens_eval/Leaderboard.py", line 137, in main

    streamlit_app()
  File "/usr/local/lib/python3.10/site-packages/trulens_eval/Leaderboard.py", line 51, in streamlit_app
    df, feedback_col_names = lms.get_records_and_feedback([])
  File "/usr/local/lib/python3.10/site-packages/trulens_eval/database/utils.py", line 60, in wrapper
    callback(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/trulens_eval/database/sqlalchemy_db.py", line 44, in <lambda>
    run_before(lambda self, *args, **kwargs: check_db_revision(self.engine)),
  File "/usr/local/lib/python3.10/site-packages/trulens_eval/database/utils.py", line 112, in check_db_revision
    if is_legacy_sqlite(engine):
  File "/usr/local/lib/python3.10/site-packages/trulens_eval/database/utils.py", line 82, in is_legacy_sqlite
    tables = list(inspector.get_table_names())
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/reflection.py", line 396, in get_table_names
    return self.dialect.get_table_names(
  File "<string>", line 2, in get_table_names
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/reflection.py", line 97, in cache
    ret = fn(self, con, *args, **kw)
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/dialects/postgresql/base.py", line 3368, in get_table_names
    return self._get_relnames_for_relkinds(
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/dialects/postgresql/base.py", line 3364, in _get_relnames_for_relkinds
    return connection.scalars(query).all()
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1344, in scalars
    return self.execute(
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1412, in execute
    return meth(
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/sql/elements.py", line 516, in _execute_on_connection
    return connection._execute_clauseelement(
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1635, in _execute_clauseelement
    ret = self._execute_context(
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1844, in _execute_context
    return self._exec_single_context(
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1984, in _exec_single_context
    self._handle_dbapi_exception(
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 2339, in _handle_dbapi_exception
    raise sqlalchemy_exception.with_traceback(exc_info[2]) from e
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1965, in _exec_single_context
    self.dialect.do_execute(
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/default.py", line 921, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.
server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.

[SQL: SELECT pg_catalog.pg_class.relname
FROM pg_catalog.pg_class JOIN pg_catalog.pg_namespace ON pg_catalog.pg_namespace.oid = pg_catalog.pg_class.relnamespace
WHERE pg_catalog.pg_class.relkind = ANY (ARRAY[%(param_1)s, %(param_2)s]) AND pg_catalog.pg_class.relpersistence != %(relpersistence_1)s AND pg_catalog.pg_table_is_visible(pg_catalog.pg_class.oid) AND pg_catalog.pg_namespace.nspname != %(nspname_1)s]
[parameters: {'param_1': 'r', 'param_2': 'p', 'relpersistence_1': 't', 'nspname_1': 'pg_catalog'}]
(Background on this error at: https://sqlalche.me/e/20/e3q8)

/usr/local/lib/python3.10/site-packages/matplotlib/axes/_axes.py:6826: RuntimeWarning: All-NaN slice encountered
  xmin = min(xmin, np.nanmin(xi))
/usr/local/lib/python3.10/site-packages/matplotlib/axes/_axes.py:6827: RuntimeWarning: All-NaN slice encountered
  xmax = max(xmax, np.nanmax(xi))
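Dropped pooled connections like this are often caused by the Postgres server or a proxy closing idle connections. A minimal sketch of a common SQLAlchemy-level mitigation (an assumption about the cause here, not a confirmed fix for this report) is to enable pool_pre_ping so stale connections are recycled before use:

from sqlalchemy import create_engine

# Hypothetical DSN; pool_pre_ping issues a lightweight liveness check on
# each pooled connection before handing it out, replacing dead ones.
engine = create_engine(
    "postgresql+psycopg2://user:password@host:5432/trulens",
    pool_pre_ping=True,
)

Whether this option can be threaded through to the engine trulens creates depends on how the database URL is passed in.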

Changing the input of `agreement_measure` returns nan

Use case: when the feedback's input is changed to a custom selector, agreement_measure returns nan.

If one changes the input like so:

f_agreement_measure = Feedback(GroundTruthAgreement(answers, azure).agreement_measure).on(
    Select.Record.calls[0].rets.query
).on(
    Select.Record.main_output["answer"]
)

You can see in agreement_measure_calls that ret is nan and meta is empty, because the prompt could not be found in the ground-truth dictionary:

[{'args': {'prompt': 'Prompt 1 Modified', 'response': "response 1"}, 'ret': nan, 'meta': {}}]

From what I've understood so far, this happens because _find_response() in GroundTruthAgreement looks the prompt up in the ground-truth set by the original query Prompt 1 (i.e. Select.Record.main_input). With the selector above it instead receives Prompt 1 Modified from Select.Record.calls[0].rets.query, and the lookup fails:

self.ground_truth = [{'query': 'Prompt 1', 'response': 'response 1'}]

Therefore, it returns nan.
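For comparison, a minimal sketch of the default wiring, where the prompt reaching agreement_measure matches the query key in the ground-truth set (answers and azure stand in for the ground-truth list and feedback provider from the report):

from trulens_eval import Feedback
from trulens_eval.feedback import GroundTruthAgreement

# Ground-truth entries are keyed by the exact prompt text.
answers = [{"query": "Prompt 1", "response": "response 1"}]

# `azure` is the feedback provider already constructed in the report above.
# With the default selectors, the record's main input ("Prompt 1") is passed
# as the prompt, so _find_response() can locate a matching entry.
f_agreement_measure = Feedback(
    GroundTruthAgreement(answers, azure).agreement_measure
).on_input_output()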

【Error】NameError: Fields must not use names with leading underscores

When I follow the demo in langchain_quickstart.ipynb, the error occurs at this line:

from IPython.display import JSON
# Imports main tools:
from trulens_eval import TruChain, Feedback, Huggingface, Tru # Error here
from trulens_eval.schema import FeedbackResult
tru = Tru()

Full error message is below:

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[26], line 4
      1 from IPython.display import JSON
      3 # Imports main tools:
----> 4 from trulens_eval import TruChain, Feedback, Huggingface, Tru
      5 from trulens_eval.schema import FeedbackResult
      6 tru = Tru()

File ~/miniconda3/envs/py310/lib/python3.10/site-packages/trulens_eval/__init__.py:83
      1 """
      2 # Trulens-eval LLM Evaluation Library
      3 
   (...)
     78 
     79 """
     81 __version__ = "0.17.0"
---> 83 from trulens_eval.feedback import Bedrock
     84 from trulens_eval.feedback import Feedback
     85 from trulens_eval.feedback import Huggingface

File ~/miniconda3/envs/py310/lib/python3.10/site-packages/trulens_eval/feedback/__init__.py:14
     11 AggCallable = Callable[[Iterable[float]], float]
     13 # Specific feedback functions:
---> 14 from trulens_eval.feedback.embeddings import Embeddings
     15 # Main class holding and running feedback functions:
     16 from trulens_eval.feedback.feedback import Feedback

File ~/miniconda3/envs/py310/lib/python3.10/site-packages/trulens_eval/feedback/embeddings.py:8
      5 from pydantic import PrivateAttr
      7 from trulens_eval.utils.imports import REQUIREMENT_SKLEARN
----> 8 from trulens_eval.utils.pyschema import WithClassInfo
      9 from trulens_eval.utils.serial import SerialModel
     12 class Embeddings(SerialModel, WithClassInfo):

File ~/miniconda3/envs/py310/lib/python3.10/site-packages/trulens_eval/utils/pyschema.py:588
    584 # Key of structure where class information is stored.
    585 CLASS_INFO = "__tru_class_info"
--> 588 class WithClassInfo(pydantic.BaseModel):
    589     """
    590     Mixin to track class information to aid in querying serialized components
    591     without having to load them.
    592     """
    594     # Using this odd key to not pollute attribute names in whatever class we mix
    595     # this into. Should be the same as CLASS_INFO.

File ~/miniconda3/envs/py310/lib/python3.10/site-packages/pydantic/_internal/_model_construction.py:104, in __new__(mcs, cls_name, bases, namespace, __pydantic_generic_metadata__, __pydantic_reset_parent_namespace__, **kwargs)

File ~/miniconda3/envs/py310/lib/python3.10/site-packages/pydantic/_internal/_model_construction.py:345, in inspect_namespace(namespace, ignored_types, base_class_vars, base_class_fields)

NameError: Fields must not use names with leading underscores; e.g., use 'WithClassInfo__tru_class_info' instead of '_WithClassInfo__tru_class_info'.
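This failure pattern is characteristic of pydantic 2.x rejecting private-looking field names in model classes written against pydantic 1.x. A plausible check and workaround (an assumption, since the thread does not confirm the installed version) is:

# Assumption: trulens-eval 0.17.0 targets pydantic 1.x, so a 2.x install
# would trip the leading-underscore field check seen in the traceback.
import pydantic

print(pydantic.VERSION)  # if this prints 2.x, pin "pydantic<2" and retry

Pinning pydantic below 2 (or upgrading to a trulens-eval release with pydantic-2 support) should then let the import succeed.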

trulens_eval.feedback.OpenAI with Azure

Hi there 👋

I am using trulens_eval.feedback.OpenAI, but I need to pass extra args to use it with Azure. I don't understand the usage of OpenAIEndpoint, since I can't see any extra args for setting up openai. Usually you can do something like:

openai.api_type = "azure"
openai.api_base = os.getenv("OPENAI_API_BASE")
openai.api_version = "2023-03-15-preview"
openai.api_key = os.getenv("OPENAI_API_KEY")

I'm not sure how to do this with trulens.

Thanks a lot

Cheers,

Fra
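A minimal sketch of one way to wire this up, assuming the provider calls through the global openai module as the pre-1.0 openai client did (the deployment name below is a placeholder):

import os

import openai
from trulens_eval.feedback import OpenAI as OpenAIProvider

# Assumption: the provider reuses the global `openai` module settings, so
# configuring Azure here before construction applies to its calls.
openai.api_type = "azure"
openai.api_base = os.getenv("OPENAI_API_BASE")
openai.api_version = "2023-03-15-preview"
openai.api_key = os.getenv("OPENAI_API_KEY")

provider = OpenAIProvider(model_engine="my-gpt35-deployment")  # placeholder name

Later trulens_eval releases also ship a dedicated AzureOpenAI provider that takes the deployment id directly, which may be the cleaner route if your version includes it.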

Attributions fails in tf2 with graph activated through tf.function annotated call

Describe the bug
The symptom is the error 'Tensor' object has no attribute 'numpy'. The problem is that this call path in the tensorflow_v2 backend expects eager tensors, but a @tf.function-annotated call runs in graph mode, not eager mode.

To Reproduce

Implement a tensorflow.keras.Model whose call() method is overridden with the @tf.function annotation, as in the snippet below.

A quick hack fix is to remove the annotation, but this is not the correct long-term solution.

import tensorflow as tf
from tensorflow.keras import Model

class CustomModel(Model):

    def __init__(self):
        super().__init__()
        self.dense = tf.keras.layers.Dense(1)

    @tf.function  # removing this annotation avoids the error (the hack above)
    def call(self, inputs):
        return self.dense(inputs)

(Run an attribution on this model to trigger the error; see the sketch below.)
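For completeness, a sketch of the attribution step that triggers the failure, using the legacy trulens explainability API (treat the exact wrapper and attribution class names as assumptions if your version differs):

import numpy as np
from trulens.nn.attribution import IntegratedGradients
from trulens.nn.models import get_model_wrapper

model = CustomModel()
model(np.zeros((1, 4), dtype="float32"))  # build the model once

# While call() is @tf.function-annotated, this path fails with
# "'Tensor' object has no attribute 'numpy'" because the tensorflow_v2
# backend assumes eager tensors.
wrapped = get_model_wrapper(model)
attrs = IntegratedGradients(wrapped).attributions(
    np.random.rand(1, 4).astype("float32")
)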
