
INSIGHT

Introduction

Please visit https://insightai.dev/project for our managed solution with many more features!

Insight is an autonomous AI that can do medical research. A boss agent takes an objective, plus an executive summary of the tasks completed so far and their results, and creates a task list. A worker agent picks a task from the list and completes it, saving the results to LlamaIndex. The boss is informed of the results and reprioritizes the task list. Workers can call into the PubMed and MyGene APIs (more to come), and they also get context from LlamaIndex to help complete their tasks.
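The boss/worker loop described above can be sketched roughly like this. Every name here is illustrative, not INSIGHT's actual API; the real agents call an LLM and save to a LlamaIndex store rather than plain Python lists.

```python
# Illustrative sketch of the boss/worker loop; function names are
# hypothetical, not INSIGHT's actual code.

def boss_agent(objective, summary, task_list):
    """Re-plan: given the objective and a summary of results so far,
    return a (re)prioritized task list."""
    if not task_list:
        task_list = [f"PUBMED: search literature for '{objective}'"]
    return task_list

def worker_agent(task, context):
    """Complete one task, using context retrieved from the index."""
    return f"result of {task!r} (context: {len(context)} chars)"

def run(objective, max_iterations=3):
    summary, task_list, index = "", [], []   # `index` stands in for LlamaIndex
    for _ in range(max_iterations):
        task_list = boss_agent(objective, summary, task_list)
        task = task_list.pop(0)
        result = worker_agent(task, context=summary)
        index.append(result)                 # results are saved to the index
        summary = " | ".join(index)          # boss sees a summary of results
    return index
```

The key design point is that the boss only ever sees summaries, not raw results, which keeps its prompt small as the run grows.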

INSIGHT can also reload and continue previous runs, and can load any human-readable data file and use it alongside the other findings!

You can also load your LlamaIndex database and talk to it, asking arbitrary questions about your data, by running talk_to_index.py. You will have to specify the path to your index at the bottom of the file; see the bottom of talk_to_index.py for an example.

Please reach out to me or contribute if this interests you :) My email is [email protected]

graph TB;

    subgraph APIs;
        API1[PUBMED API];
        API2[MYGENE API];
    end;

    Boss((BOSS AGENT)) <--> GPT[LLM];
    Llama[(LLAMA INDEX)] -->|Summary of results| Boss;
    Boss -->|Create| Queue[TASK LIST];

    Worker((WORKER AGENT)) <--> GPT;
    Queue --> |Pull| Worker;
    Llama -->|Context for task| Worker;
    Worker --> Result[Task Result];

    Result --> |Text| Llama;
    Result -->|Code| Executor{PYTHON EXECUTOR};

    Executor --> API1;
    Executor --> API2;
    Executor --> Execution[Execution Result];

    Execution --> Llama;

    Llama <--> TalkToIndex[Talk To Index];

    User{{User}} -->|Query| TalkToIndex;
    TalkToIndex -->|Result| User;

Getting Started

  1. Sign up for OpenAI

  2. Expose the following environment variable

    • OPENAI_API_KEY

    OR

Add your API key to the config file. IF YOU DO THIS, DO NOT COMMIT IT WITH ANY VERSION CONTROL SYSTEM!

  3. Run pip install -r requirements.txt

  4. Run python main.py
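Step 2's two options (environment variable vs. config file) amount to a lookup like the following at startup. This is a sketch only; the `config_key` parameter stands in for whatever the repo's config file actually exposes.

```python
import os

def get_api_key(env=os.environ, config_key=None):
    """Prefer the OPENAI_API_KEY environment variable; fall back to a
    config-file value (`config_key` is a stand-in for the config module)."""
    key = env.get("OPENAI_API_KEY") or config_key
    if key is None:
        raise RuntimeError("Set OPENAI_API_KEY or add your key to the config file")
    return key
```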

Output

The program saves the result of every task to the output directory out.

It also creates a key-findings markdown file across all results that distills the data via the following commands:

  • Give a brief high level summary of all the data.
  • Briefly list all the main points that the data covers.
  • Give all of the key insights about the data.
  • Generate several creative hypotheses given the data.
  • What are some high level research directions to explore further given the data?
  • Describe the key findings in great detail. Do not include filler words.

Arbitrary commands can be added. Open this in a markdown editor for the best experience.
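One way to picture the key-findings step: each command above becomes a prompt, and the answers are stitched into key_findings.md. The `ask_index` callable below is a stand-in for a real query against the index, not INSIGHT's actual function.

```python
# The distillation commands from the README; arbitrary ones can be added.
COMMANDS = [
    "Give a brief high level summary of all the data.",
    "Briefly list all the main points that the data covers.",
    "Give all of the key insights about the data.",
    "Generate several creative hypotheses given the data.",
    "What are some high level research directions to explore further given the data?",
    "Describe the key findings in great detail. Do not include filler words.",
]

def build_key_findings(ask_index):
    """ask_index: callable taking a prompt and returning an answer string.
    Returns markdown text for key_findings.md."""
    sections = []
    for command in COMMANDS:
        sections.append(f"## {command}\n\n{ask_index(command)}\n")
    return "# Key Findings\n\n" + "\n".join(sections)
```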

Here is an example output structure

.
└── out/
    ├── Objective/
    │   ├── Task 1/
    │   │   ├── Result 1/
    │   │   │   ├── Raw Result
    │   │   │   └── Vector Embedding of Result
    │   │   ├── Result 2/
    │   │   │   ├── Raw Result
    │   │   │   └── Vector Embedding of Result
    │   │   ├── .
    │   │   ├── .
    │   │   ├── Summary of task results
    │   │   └── API Call (If task was an API call)
    │   ├── Task 2
    │   ├── .
    │   ├── .
    │   ├── .
    │   └── Task N
    └── key_findings.md
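The layout above is plain files and directories, so it can be reproduced with pathlib. The function and file names here are illustrative, not the ones main.py actually uses.

```python
from pathlib import Path

def save_result(out_dir, objective, task, result_id, raw_text):
    """Write one raw result into out/<objective>/<task>/<result id>/,
    mirroring the tree shown above (embedding files omitted)."""
    result_dir = Path(out_dir) / objective / task / result_id
    result_dir.mkdir(parents=True, exist_ok=True)
    path = result_dir / "raw_result.txt"
    path.write_text(raw_text)
    return path
```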

BE MINDFUL OF EXPENSES!!

Currently, an execution of a few minutes should cost no more than a few cents. This will go up if you use a more powerful model like GPT-4.

People

Contributors: batmanscode, clayoneil, oneil512


Issues

Remote connection closed by OpenAI

Sometimes OpenAI closes the remote connection on us. This should be handled gracefully with a retry decorator.

Traceback (most recent call last):
  File "/home/nate/INSIGHT/main.py", line 225, in <module>
    run(api_key=api_key, OBJECTIVE=objective, MAX_ITERATIONS=iterations, TOOLS=tools, my_data_path=my_data_path, reload_path=reload_path)
  File "/home/nate/INSIGHT/main.py", line 177, in run
    result, task, task_list, summaries = run_(
  File "/home/nate/INSIGHT/main.py", line 76, in run_
    result, result_is_python = worker_agent(OBJECTIVE, task, master_index, cache, TOOLS)
  File "/home/nate/INSIGHT/agents.py", line 149, in worker_agent
    response = get_gpt_completion(prompt, engine="text-davinci-003", temp=0.0)
  File "/home/nate/INSIGHT/.venv/lib/python3.10/site-packages/backoff/_sync.py", line 105, in retry
    ret = target(*args, **kwargs)
  File "/home/nate/INSIGHT/utils.py", line 648, in get_gpt_completion
    response = openai.Completion.create(
  File "/home/nate/INSIGHT/.venv/lib/python3.10/site-packages/openai/api_resources/completion.py", line 25, in create
    return super().create(*args, **kwargs)
  File "/home/nate/INSIGHT/.venv/lib/python3.10/site-packages/openai/api_resources/abstract/engine_api_resource.py", line 153, in create
    response, _, api_key = requestor.request(
  File "/home/nate/INSIGHT/.venv/lib/python3.10/site-packages/openai/api_requestor.py", line 216, in request
    result = self.request_raw(
  File "/home/nate/INSIGHT/.venv/lib/python3.10/site-packages/openai/api_requestor.py", line 529, in request_raw
    raise error.APIConnectionError(
openai.error.APIConnectionError: Error communicating with OpenAI: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
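The traceback shows get_gpt_completion already runs under the backoff package, so the fix may be as small as adding APIConnectionError to its exception list. For illustration, a hand-rolled stdlib equivalent of such a retry decorator looks like this:

```python
import functools
import time

def retry(exceptions, tries=3, delay=0.0, backoff_factor=2.0):
    """Retry the wrapped function on the given exceptions, sleeping
    with exponential backoff between attempts; re-raise on the last one."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            wait = delay
            for attempt in range(tries):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == tries - 1:
                        raise
                    time.sleep(wait)
                    wait *= backoff_factor
        return wrapper
    return decorator
```

With the backoff package the same idea is roughly `@backoff.on_exception(backoff.expo, openai.error.APIConnectionError)`, matching the decorator pattern already visible in the stack trace.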

PUBMED tool error - `TypeError: 'NoneType' object is not iterable`

query_term = 'toe numbness', retmax = 10, retstart = 0
Exception executing code from api.pubmed_wrapper import pubmed_wrapper
query_term = 'toe numbness', retmax = 10, retstart = 0
ret = pubmed_wrapper(query_term, retmax, retstart), invalid syntax. Maybe you meant '==' or ':=' instead of '='? (<string>, line 2)
Cannot parse pubmed result, expected xml. a bytes-like object is required, not 'NoneType'
Adding whole document. Note this will lead to suboptimal results.
Task 'PUBMED: Research common causes of toe numbness' failed. Code NOTE: Code did not run succesfully

from api.pubmed_wrapper import pubmed_wrapper
query_term = 'toe numbness', retmax = 10, retstart = 0
ret = pubmed_wrapper(query_term, retmax, retstart) did not run succesfully.
Traceback (most recent call last):
  File "/workspaces/INSIGHT/main.py", line 225, in <module>
    run(api_key=api_key, OBJECTIVE=objective, MAX_ITERATIONS=iterations, TOOLS=tools, my_data_path=my_data_path, reload_path=reload_path)
  File "/workspaces/INSIGHT/main.py", line 177, in run
    result, task, task_list, summaries = run_(
                                         ^^^^^
  File "/workspaces/INSIGHT/main.py", line 89, in run_
    handle_results(result, index, doc_store, doc_store_task_key, task_id_counter, RESULT_CUTOFF)
  File "/workspaces/INSIGHT/utils.py", line 530, in handle_results
    for i, r in enumerate(result):
                ^^^^^^^^^^^^^^^^^
TypeError: 'NoneType' object is not iterable
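Two things go wrong in sequence here: the model-generated code is invalid Python (the comma-joined assignments on one line), so the tool returns None, and handle_results then iterates over that None. A guard of this shape would stop the crash; this is a sketch, not the project's actual handle_results signature.

```python
def handle_results(result, index):
    """Defensive sketch of handle_results: skip gracefully when the
    tool returned nothing, instead of raising TypeError on iteration."""
    if result is None:
        return []          # or log a warning and mark the task as failed
    handled = []
    for i, r in enumerate(result):
        handled.append((i, r))
        index.append(r)    # `index` stands in for the LlamaIndex store
    return handled
```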

Extend to other domains?

Is there a possibility to extend this tool to domains other than medical research?
Say I have a bunch of documents and a list of tasks to perform on them (search for specific data points, extract insights based on connected data points from different documents, etc).
TIA!

Use Turbo as default?

Just based on the 10x price difference would it make sense to use gpt-3.5-turbo as the default model, even for individual completion calls?

A basic shim like this (generated by GPT-4) could plausibly be added to utils.py to allow this even for calls to get_gpt_completion: https://austegard.com/pv?d56d5a68a610a29ee7e353616a480628 (Scroll down to white code block)

Also would suggest moving the hardcoded engine strings to config.py
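A minimal version of such a shim just wraps a Completion-style prompt in a chat message list; the linked gist has a fuller version. This sketch only builds the request arguments (it never calls the API, so the actual openai.ChatCompletion.create call is left out).

```python
def completion_to_chat_request(prompt, model="gpt-3.5-turbo", **kwargs):
    """Translate legacy Completion-style arguments into the shape a
    ChatCompletion call expects: a single user message plus passthrough
    parameters such as temperature."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        **kwargs,
    }
```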

PS! Loved the demo during the 🦙-index meetup!
