
data-to-paper's Introduction

Backward-traceable AI-driven Research

data-to-paper is a framework that systematically harnesses the power of AI to carry out complete, end-to-end scientific research, starting from raw data and concluding with comprehensive, transparent, backward-traceable, human-verifiable scientific papers (example).

Implementation. Towards this goal, data-to-paper systematically guides interacting LLM and rule-based agents along the conventional scientific path: from annotated data, through creating research hypotheses, conducting literature searches, writing and debugging data-analysis code, and interpreting the results, to the step-by-step writing of a complete research paper. By tracking the information flow through these steps, the process creates transparent and highly traceable manuscripts, where results, methodology and data are programmatically linked (demo).

The data-to-paper framework was created as a research project to understand the capacities and limitations of LLM-driven scientific research, and to develop ways of harnessing LLMs to accelerate research while maintaining, and even enhancing, key scientific values such as transparency, traceability and verifiability, and while allowing scientists to oversee and direct the process (see also: living guidelines).

License: MIT

data_to_paper.mp4

Reference

The data-to-paper framework is described in the following pre-print:

  • Tal Ifargan, Lukas Hafner, Maor Kern, Ori Alcalay and Roy Kishony, "Autonomous LLM-driven research from data to human-verifiable research papers", arXiv:2404.17605

Key features

  • Data-chained manuscripts. The process creates transparent and verifiable manuscripts, where results, methodology and data are programmatically linked (all numeric values can be click-traced back to the code lines that created them).
  • Field agnostic. We strive to make the framework as general as possible, so that it can be used across different fields of research.
  • Open-goal or fixed-goal research. data-to-paper can be used to autonomously raise and test a hypothesis, or to test a specific pre-defined user-provided hypothesis.
  • Coding guardrails. Standard statistical packages are overridden with multiple guardrails to minimize common LLM coding errors (see the sketch after this list).
  • Human-in-the-loop. A GUI app allows the user to oversee the process, and to intervene at each research step.
  • Record & Replay. The entire process is recorded, including all LLM responses, human feedback, and literature search retrievals, allowing for transparent replay.
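
To illustrate the guardrail idea referenced above, here is a minimal, hypothetical sketch of wrapping a standard statistical function so that common silent mistakes fail loudly. The specific checks and error messages are assumptions for illustration, not the actual overrides used by data-to-paper.

```python
# Illustrative sketch only: how a statistical function might be wrapped
# with a guardrail. This is NOT the actual data-to-paper mechanism.
import numpy as np
from scipy import stats

_original_ttest_ind = stats.ttest_ind

def guarded_ttest_ind(a, b, *args, **kwargs):
    """Run scipy's t-test, but refuse silently misleading inputs."""
    a, b = np.asarray(a), np.asarray(b)
    if np.isnan(a).any() or np.isnan(b).any():
        raise ValueError("Input contains NaN values; handle missing data explicitly.")
    if len(a) < 2 or len(b) < 2:
        raise ValueError("Each group needs at least 2 observations.")
    return _original_ttest_ind(a, b, *args, **kwargs)

# Override the package-level function so LLM-written analysis code hits the guardrail.
stats.ttest_ind = guarded_ttest_ind
```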

Installation

See INSTALL.md for installation instructions.

How to run

  1. Install data-to-paper (INSTALL).
  2. Run data-to-paper: python data_to_paper/data_to_paper/scripts/run.py
  3. This will open a startup dialog that lets you specify your own project, or reproduce one of the example projects (data-to-paper/projects).
  4. Click "Start" to begin the run, with human oversight and feedback.
  5. At the end of the process, a PDF of the manuscript is created in the project folder.

GUI app demo

data-to-paper_app_demo_1.mp4

Examples

We ran data-to-paper on the following test cases:

  • Health Indicators (open goal). A clean unweighted subset of CDC’s Behavioral Risk Factor Surveillance System (BRFSS) 2015 annual dataset (Kaggle). Here is an example paper created by data-to-paper.

Try out: python run.py diabetes

  • Social Network (open goal). A directed graph of Twitter interactions among members of the 117th Congress (Fink et al). Here is an example paper created by data-to-paper.

Try out: python run.py social_network

  • Treatment Policy (fixed-goal). A dataset on the treatment and outcomes of non-vigorous infants admitted to the Neonatal Intensive Care Unit (NICU), before and after a change to treatment guidelines was implemented (Saint-Fleur et al). Here is an example paper created by data-to-paper.

Try out: python run.py npr_nicu

  • Treatment Optimization (fixed-goal). A dataset of pediatric patients who received mechanical ventilation after undergoing surgery, including an x-ray-based determination of the optimal tracheal tube intubation depth and a set of personalized patient attributes to be used in machine-learning and formula-based models to predict this optimal depth (Shim et al). Here is an example paper created by data-to-paper.

We defined three levels of difficulty for the research question of this example (an illustrative analysis sketch follows the list below).

  1. easy: Compare two ML methods for predicting optimal intubation depth
    Try out: python run.py ML_easy

  2. medium: Compare one ML method and one formula-based method for predicting optimal intubation depth
    Try out: python run.py ML_medium

  3. hard: Compare four ML methods with three formula-based methods for predicting optimal intubation depth
    Try out: python run.py ML_hard
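
For illustration, here is a minimal sketch of the kind of comparison the medium-difficulty question asks for: one ML regressor against one formula-based rule. It is not code generated by data-to-paper; the file name, column names, and the height-based formula are hypothetical stand-ins for the actual dataset attributes.

```python
# Illustrative sketch only, not output of data-to-paper.
# Column names ("height", "age", "optimal_depth") and the CSV name are hypothetical.
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

df = pd.read_csv("tracheal_tube.csv")  # hypothetical file name
X, y = df[["height", "age"]], df["optimal_depth"]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# ML method: random forest regression
rf = RandomForestRegressor(random_state=0).fit(X_train, y_train)
rf_mse = mean_squared_error(y_test, rf.predict(X_test))

# Formula-based method: a simple height-based rule (for illustration only)
formula_pred = X_test["height"] / 10 + 5
formula_mse = mean_squared_error(y_test, formula_pred)

print(f"Random forest MSE: {rf_mse:.2f}, formula-based MSE: {formula_mse:.2f}")
```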

Contributing

We invite people to try out data-to-paper with their own data and are eager for feedback and suggestions. The framework is currently designed for relatively simple research goals and datasets, where the aim is to raise and test a statistical hypothesis.

We also invite people to help develop and extend the data-to-paper framework in science or other fields.

Important notes

Disclaimer. By using this software, you agree to assume all risks associated with its use, including but not limited to data loss, system failure, or any other issues that may arise, especially, but not limited to, the consequences of running LLM-created code on your local machine. The developers of this project do not accept any responsibility or liability for any losses, damages, or other consequences that may occur as a result of using this software.

Accountability. You are solely responsible for the entire content of created manuscripts including their rigour, quality, ethics and any other aspect. The process should be overseen and directed by a human-in-the-loop and created manuscripts should be carefully vetted by a domain expert. The process is NOT error-proof and human intervention is necessary to ensure accuracy and the quality of the results.

Compliance. It is your responsibility to ensure that any actions or decisions made based on the output of this software comply with all applicable laws, regulations, and ethical standards. The developers and contributors of this project shall not be held responsible for any consequences arising from using this software. Further, data-to-paper manuscripts are watermarked for transparency as AI-created. Users should not remove this watermark.

Token Usage. Please note that using most language models through external APIs, especially GPT-4, can be expensive due to token usage. By utilizing this project, you acknowledge that you are responsible for monitoring and managing your own token usage and the associated costs. It is highly recommended to check your API usage regularly and set up any necessary limits or alerts to prevent unexpected charges.
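
Below is a minimal sketch of one way to track token usage with the OpenAI Python client (v1.x). The running total and the threshold value are illustrative assumptions, not part of data-to-paper.

```python
# Minimal sketch of tracking token usage with the OpenAI Python client (v1.x).
# The running-total logic and threshold are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
total_tokens = 0

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Summarize the dataset description."}],
)
total_tokens += response.usage.total_tokens

if total_tokens > 100_000:  # arbitrary example budget
    print("Warning: token budget exceeded; consider pausing the run.")
```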

Related projects

Here are some other cool multi-agent related projects, and also this curated list of awesome-agents.

data-to-paper's People

Contributors

ahmet-kaplan, hershman, lukashafner, maorsaleh, poper515, rkishony, talifargan


data-to-paper's Issues

Allow restarting a step from the App

Should have a button on the app, "Restart Step", taking you back to the Mission prompt editing phase.
We can also allow going back to earlier steps, invalidating all products from there.

Scattered notes after some attempted runs

Just evaluating this tool.

Some notes:

  1. It's important that the user be able to edit and override anything, including past LLM outputs. (Obviously, this should be logged, but nothing should be "locked").
  2. There should probably be a package deployment step. (I.e. after exploration, if it's geodata, then we need to load and install and evaluate geopandas. If it's qual data, we need other packages, etc.)
  3. When it gets "stuck", it's important to be able to revert to earlier in the chain. (I've had two failed runs so far)
  4. When running, your README says: "python data_to_paper/data_to_paper/run/run.py". That run script no longer exists; it is available in the venv's bin/ directory, though (and in scripts/).
  5. Many of the tests no longer run.
  6. Setting up a new run from old data should be easier. (I.e. I'm just getting started with my third attempt at a given dataset, and need to find all the source descriptors.)
  7. Don't hardcode file paths; it reduces portability.
  8. I'd quite like to discuss your prompts and system prompts, but I'll submit the ones which work for me once I've been able to finish a run.
  9. In the feedback loop cycle, having to toggle between output and feedback in the bottom right corner is somewhat tedious, especially with the giant prompt taking up most of the screen. Perhaps each session should be a series of accordions to show the history of that step's interactions (And to provide better editing capability?)
  10. When dealing with qual data, it needs to have more emphasis on preparing the data (in terms of making sure that units are incorporated so they can be passed to subsequent contexts, that categorical data is correctly imported, etc)
  11. (Just failed run 3 because it ran into an error I cannot fix -- my keys weren't defined. Having a "test APIs" step before starting seems like a good idea? Also, you should be able to fetch pricing information.)
  12. Consider using JSON mode output (https://platform.openai.com/docs/guides/text-generation/json-mode); see the sketch after this list.
  13. A "sensemaking" check is needed. In your twitter example, the central claim is: "The negative values of the coefficients indicate that interactions are less likely between members from states that are numerically distant in the dataset’s coding system," and that's ... hilariously bad? (Also, a case where implementing geopandas is probably indicated.)
  14. Being able to load in a bibliography seems like a good idea.
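
Regarding item 12, here is a minimal sketch of the OpenAI "JSON mode" suggested above; it is not currently part of data-to-paper, and the model choice, prompt content, and expected keys are made-up examples.

```python
# Sketch of OpenAI JSON mode; prompts and keys are hypothetical examples.
import json
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4-turbo",
    response_format={"type": "json_object"},
    messages=[
        {"role": "system", "content": "Reply in JSON with keys 'hypothesis' and 'variables'."},
        {"role": "user", "content": "Propose a hypothesis about the diabetes dataset."},
    ],
)
structured = json.loads(response.choices[0].message.content)
print(structured["hypothesis"])
```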

Paper doesn't compile

Sometimes, when a dataframe is converted to LaTeX using the built-in to_latex() method, the final LaTeX product does not compile.
I have found that the error comes from the multirow and multicolumn LaTeX commands being given a hypertargeted number as their first argument instead of a plain integer.

See commit 94ed3db for an example paper that doesn't compile and a possible fix path on the same branch.
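
Below is a minimal sketch of one possible post-processing fix along the lines described above. It is an illustration, not the actual fix in commit 94ed3db, and the \hypertarget label format shown is an assumption.

```python
# Sketch of one possible fix path; the actual fix may differ (see commit 94ed3db).
# Problem: numeric values are wrapped in \hypertarget{...}{...} for click-tracing,
# but \multirow / \multicolumn need a plain integer as their first argument,
# so LaTeX fails to compile. A post-processing step could strip the wrapper
# from that argument only. The "R0" label below is a hypothetical example.
import re

def unwrap_span_counts(latex: str) -> str:
    """Replace \\multirow{\\hypertarget{...}{N}} (and \\multicolumn) with \\multirow{N}."""
    return re.sub(
        r"\\(multirow|multicolumn)\{\\hypertarget\{[^{}]*\}\{(\d+)\}\}",
        r"\\\1{\2}",
        latex,
    )

broken = r"\multirow{\hypertarget{R0}{2}}{*}{A} & 10 \\"
print(unwrap_span_counts(broken))  # -> \multirow{2}{*}{A} & 10 \\
```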
