
INSIGHT

Introduction

Please visit https://insightai.dev/project for our managed solution with many more features!

Insight is an autonomous AI that can do medical research. A boss agent takes an objective, plus an executive summary of the tasks completed so far and their results, and creates a task list. A worker agent picks a task from the list and completes it, saving the results to LlamaIndex. The boss is informed of the results and reprioritizes the task list. Workers can call into the PubMed and MyGene APIs (more to come), and they also get context from LlamaIndex to help complete their tasks.
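The boss/worker loop described above can be sketched roughly like this. Every name here is illustrative, not INSIGHT's actual API; the real agents call an LLM and save to a LlamaIndex store rather than plain Python lists.

```python
# Illustrative sketch of the boss/worker loop; function names are
# hypothetical, not INSIGHT's actual code.

def boss_agent(objective, summary, task_list):
    """Re-plan: given the objective and a summary of results so far,
    return a (re)prioritized task list."""
    if not task_list:
        task_list = [f"PUBMED: search literature for '{objective}'"]
    return task_list

def worker_agent(task, context):
    """Complete one task, using context retrieved from the index."""
    return f"result of {task!r} (context: {len(context)} chars)"

def run(objective, max_iterations=3):
    summary, task_list, index = "", [], []   # `index` stands in for LlamaIndex
    for _ in range(max_iterations):
        task_list = boss_agent(objective, summary, task_list)
        task = task_list.pop(0)
        result = worker_agent(task, context=summary)
        index.append(result)                 # results are saved to the index
        summary = " | ".join(index)          # boss sees a summary of results
    return index
```

The key design point is that the boss only ever sees summaries, not raw results, which keeps its prompt small as the run grows.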

INSIGHT can also reload and continue previous runs, and can load any human-readable data file and use it alongside the other findings!

You can also load your LlamaIndex database and talk to it, asking arbitrary questions about your data, by running talk_to_index.py. You will have to specify the path to your index at the bottom of the file; see the bottom of talk_to_index.py for an example.

Please reach out to me or contribute if this interests you :) My email is [email protected]

graph TB;

    subgraph APIs;
        API1[PUBMED API];
        API2[MYGENE API];
    end;

    Boss((BOSS AGENT)) <--> GPT[LLM];
    Llama[(LLAMA INDEX)] -->|Summary of results| Boss;
    Boss -->|Create| Queue[TASK LIST];

    Worker((WORKER AGENT)) <--> GPT;
    Queue --> |Pull| Worker;
    Llama -->|Context for task| Worker;
    Worker --> Result[Task Result];

    Result --> |Text| Llama;
    Result -->|Code| Executor{PYTHON EXECUTOR};

    Executor --> API1;
    Executor --> API2;
    Executor --> Execution[Execution Result];

    Execution --> Llama;

    Llama <--> TalkToIndex[Talk To Index];

    User{{User}} -->|Query| TalkToIndex;
    TalkToIndex -->|Result| User;

Getting Started

  1. Sign up for OpenAI

  2. Expose the following environment variable

    • OPENAI_API_KEY

    OR

Add your API key to the config file. IF YOU DO THIS, DO NOT COMMIT IT WITH ANY VERSION CONTROL SYSTEM!

  3. Run pip install -r requirements.txt

  4. Run python main.py
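Step 2's two options (environment variable vs. config file) amount to a lookup like the following at startup. This is a sketch only; the `config_key` parameter stands in for whatever the repo's config file actually exposes.

```python
import os

def get_api_key(env=os.environ, config_key=None):
    """Prefer the OPENAI_API_KEY environment variable; fall back to a
    config-file value (`config_key` is a stand-in for the config module)."""
    key = env.get("OPENAI_API_KEY") or config_key
    if key is None:
        raise RuntimeError("Set OPENAI_API_KEY or add your key to the config file")
    return key
```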

Output

The program saves the result of every task to the output directory out.

It also creates a key-findings markdown file across all results that distills the data via the following commands:

  • Give a brief high level summary of all the data.
  • Briefly list all the main points that the data covers.
  • Give all of the key insights about the data.
  • Generate several creative hypotheses given the data.
  • What are some high level research directions to explore further given the data?
  • Describe the key findings in great detail. Do not include filler words.

Arbitrary commands can be added. Open this in a markdown editor for the best experience.
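One way to picture the key-findings step: each command above becomes a prompt, and the answers are stitched into key_findings.md. The `ask_index` callable below is a stand-in for a real query against the index, not INSIGHT's actual function.

```python
# The distillation commands from the README; arbitrary ones can be added.
COMMANDS = [
    "Give a brief high level summary of all the data.",
    "Briefly list all the main points that the data covers.",
    "Give all of the key insights about the data.",
    "Generate several creative hypotheses given the data.",
    "What are some high level research directions to explore further given the data?",
    "Describe the key findings in great detail. Do not include filler words.",
]

def build_key_findings(ask_index):
    """ask_index: callable taking a prompt and returning an answer string.
    Returns markdown text for key_findings.md."""
    sections = []
    for command in COMMANDS:
        sections.append(f"## {command}\n\n{ask_index(command)}\n")
    return "# Key Findings\n\n" + "\n".join(sections)
```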

Here is an example output structure

.
└── out/
    ├── Objective/
    │   ├── Task 1/
    │   │   ├── Result 1/
    │   │   │   ├── Raw Result
    │   │   │   └── Vector Embedding of Result
    │   │   ├── Result 2/
    │   │   │   ├── Raw Result
    │   │   │   └── Vector Embedding of Result
    │   │   ├── .
    │   │   ├── .
    │   │   ├── Summary of task results
    │   │   └── API Call (If task was an API call)
    │   ├── Task 2
    │   ├── .
    │   ├── .
    │   ├── .
    │   └── Task N
    └── key_findings.md
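The layout above is plain files and directories, so it can be reproduced with pathlib. The function and file names here are illustrative, not the ones main.py actually uses.

```python
from pathlib import Path

def save_result(out_dir, objective, task, result_id, raw_text):
    """Write one raw result into out/<objective>/<task>/<result id>/,
    mirroring the tree shown above (embedding files omitted)."""
    result_dir = Path(out_dir) / objective / task / result_id
    result_dir.mkdir(parents=True, exist_ok=True)
    path = result_dir / "raw_result.txt"
    path.write_text(raw_text)
    return path
```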

BE MINDFUL OF EXPENSES!!

Currently, an execution of a few minutes should cost no more than a few cents. This will go up if you use a more powerful model like GPT-4.

People

Contributors: batmanscode, clayoneil, oneil512


Issues

Remote connection closed by OpenAI

Sometimes OpenAI closes the remote connection on us. This should be handled gracefully with a retry decorator.

Traceback (most recent call last):
  File "/home/nate/INSIGHT/main.py", line 225, in <module>
    run(api_key=api_key, OBJECTIVE=objective, MAX_ITERATIONS=iterations, TOOLS=tools, my_data_path=my_data_path, reload_path=reload_path)
  File "/home/nate/INSIGHT/main.py", line 177, in run
    result, task, task_list, summaries = run_(
  File "/home/nate/INSIGHT/main.py", line 76, in run_
    result, result_is_python = worker_agent(OBJECTIVE, task, master_index, cache, TOOLS)
  File "/home/nate/INSIGHT/agents.py", line 149, in worker_agent
    response = get_gpt_completion(prompt, engine="text-davinci-003", temp=0.0)
  File "/home/nate/INSIGHT/.venv/lib/python3.10/site-packages/backoff/_sync.py", line 105, in retry
    ret = target(*args, **kwargs)
  File "/home/nate/INSIGHT/utils.py", line 648, in get_gpt_completion
    response = openai.Completion.create(
  File "/home/nate/INSIGHT/.venv/lib/python3.10/site-packages/openai/api_resources/completion.py", line 25, in create
    return super().create(*args, **kwargs)
  File "/home/nate/INSIGHT/.venv/lib/python3.10/site-packages/openai/api_resources/abstract/engine_api_resource.py", line 153, in create
    response, _, api_key = requestor.request(
  File "/home/nate/INSIGHT/.venv/lib/python3.10/site-packages/openai/api_requestor.py", line 216, in request
    result = self.request_raw(
  File "/home/nate/INSIGHT/.venv/lib/python3.10/site-packages/openai/api_requestor.py", line 529, in request_raw
    raise error.APIConnectionError(
openai.error.APIConnectionError: Error communicating with OpenAI: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
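The traceback shows get_gpt_completion already runs under the backoff package, so the fix may be as small as adding APIConnectionError to its exception list. For illustration, a hand-rolled stdlib equivalent of such a retry decorator looks like this:

```python
import functools
import time

def retry(exceptions, tries=3, delay=0.0, backoff_factor=2.0):
    """Retry the wrapped function on the given exceptions, sleeping
    with exponential backoff between attempts; re-raise on the last one."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            wait = delay
            for attempt in range(tries):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == tries - 1:
                        raise
                    time.sleep(wait)
                    wait *= backoff_factor
        return wrapper
    return decorator
```

With the backoff package the same idea is roughly `@backoff.on_exception(backoff.expo, openai.error.APIConnectionError)`, matching the decorator pattern already visible in the stack trace.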

PUBMED tool error - `TypeError: 'NoneType' object is not iterable`

query_term = 'toe numbness', retmax = 10, retstart = 0
Exception executing code from api.pubmed_wrapper import pubmed_wrapper
query_term = 'toe numbness', retmax = 10, retstart = 0
ret = pubmed_wrapper(query_term, retmax, retstart), invalid syntax. Maybe you meant '==' or ':=' instead of '='? (<string>, line 2)
Cannot parse pubmed result, expected xml. a bytes-like object is required, not 'NoneType'
Adding whole document. Note this will lead to suboptimal results.
Task 'PUBMED: Research common causes of toe numbness' failed. Code NOTE: Code did not run succesfully

from api.pubmed_wrapper import pubmed_wrapper
query_term = 'toe numbness', retmax = 10, retstart = 0
ret = pubmed_wrapper(query_term, retmax, retstart) did not run succesfully.
Traceback (most recent call last):
  File "/workspaces/INSIGHT/main.py", line 225, in <module>
    run(api_key=api_key, OBJECTIVE=objective, MAX_ITERATIONS=iterations, TOOLS=tools, my_data_path=my_data_path, reload_path=reload_path)
  File "/workspaces/INSIGHT/main.py", line 177, in run
    result, task, task_list, summaries = run_(
                                         ^^^^^
  File "/workspaces/INSIGHT/main.py", line 89, in run_
    handle_results(result, index, doc_store, doc_store_task_key, task_id_counter, RESULT_CUTOFF)
  File "/workspaces/INSIGHT/utils.py", line 530, in handle_results
    for i, r in enumerate(result):
                ^^^^^^^^^^^^^^^^^
TypeError: 'NoneType' object is not iterable
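Two things go wrong in sequence here: the model-generated code is invalid Python (the comma-joined assignments on one line), so the tool returns None, and handle_results then iterates over that None. A guard of this shape would stop the crash; this is a sketch, not the project's actual handle_results signature.

```python
def handle_results(result, index):
    """Defensive sketch of handle_results: skip gracefully when the
    tool returned nothing, instead of raising TypeError on iteration."""
    if result is None:
        return []          # or log a warning and mark the task as failed
    handled = []
    for i, r in enumerate(result):
        handled.append((i, r))
        index.append(r)    # `index` stands in for the LlamaIndex store
    return handled
```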

Extend to other domains?

Is there a possibility to extend this tool to domains other than medical research?
Say I have a bunch of documents and a list of tasks to perform on them (search for specific data points, extract insights based on connected data points from different documents, etc).
TIA!

Use Turbo as default?

Just based on the 10x price difference would it make sense to use gpt-3.5-turbo as the default model, even for individual completion calls?

A basic shim like this (generated by GPT-4) could plausibly be added to utils.py to allow this even for calls to get_gpt_completion: https://austegard.com/pv?d56d5a68a610a29ee7e353616a480628 (Scroll down to white code block)

Also would suggest moving the hardcoded engine strings to config.py
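A minimal version of such a shim just wraps a Completion-style prompt in a chat message list; the linked gist has a fuller version. This sketch only builds the request arguments (it never calls the API, so the actual openai.ChatCompletion.create call is left out).

```python
def completion_to_chat_request(prompt, model="gpt-3.5-turbo", **kwargs):
    """Translate legacy Completion-style arguments into the shape a
    ChatCompletion call expects: a single user message plus passthrough
    parameters such as temperature."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        **kwargs,
    }
```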

PS! Loved the demo during the 🦙-index meetup!
