aim-manuscript's Introduction

Project Description:

We present the code, data, supplementary figures, and documents used in preparing the manuscript "Defining the AIM: An Abstraction for Improving Machine Learning Prediction". Focusing on the well-known ALL/AML dataset [1], we illustrate the need for an abstraction that describes machine learning (ML) pipelines, in order to facilitate the comparison, improvement, and study of ML results. We define an abstraction layer for leaderboard-style competitions to improve ML results.

Repository Contents:

  • LiteratureSearch folder:
    This folder contains two notebooks, one giving the results of our literature analysis (LiteratureSearchResults.ipynb) and the other presenting ML pipelines for the articles (SummaryofMLpipelines.ipynb).
  • ReproducingMLpipelines folder:
    This folder contains 12 notebooks: 5 covering the articles we studied in depth (one per article), 5 comparing the articles' methods (Table 1 in the manuscript), and 2 summarizing the comparisons. The folder also includes the intermediate .Rdata file we created.
  • See the ReproducingMLpipeline example folder for reproducible containers (Singularity and Docker) to run the pipeline.

Data and Associated Repos:

  • Data in the Golub et al. paper [1]: the datasets used in [1], consisting of a training dataset (38 by 7129) and a testing dataset (34 by 7129).
  • Data Version 2: the leukemia data in the R package spikeslab (72 by 3571). We have shown that this data is a transformed version of the original data.
  • Data Version 3: the 'golub' data in the R package multtest, in which 'golub' is the training dataset (38 by 3051) and 'golub.cl' is the test dataset (34 by 3051). We have also shown that this is another transformed version of the original data.
  • We use the data in [1] (also here and in the LiteratureSearch folder) to reproduce results in the papers.
  • Associated Repos
    Previous work
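
The dimensions listed above can double as a quick sanity check after loading any of the three data versions. The following is a minimal Python sketch, not code from this repository: the dictionary keys and the helper names `matches_readme` and `combined_samples` are hypothetical, and the shapes are simply the ones stated in this README. Pass in the `.shape` of whatever array or data frame you loaded.

```python
# Dimensions (samples by genes) exactly as listed in this README.
# The keys are hypothetical labels chosen for this sketch.
EXPECTED_SHAPES = {
    "golub_train": (38, 7129),
    "golub_test": (34, 7129),
    "spikeslab_leukemia": (72, 3571),
    "multtest_train": (38, 3051),
    "multtest_test": (34, 3051),
}


def matches_readme(name, shape):
    """Return True if `shape` equals the dimensions listed for `name`."""
    return tuple(shape) == EXPECTED_SHAPES[name]


def combined_samples(train_shape, test_shape):
    """Return the total sample count when train and test rows are stacked.

    Raises ValueError if the two sets do not share the same gene count.
    """
    if train_shape[1] != test_shape[1]:
        raise ValueError("train and test sets have different gene counts")
    return train_shape[0] + test_shape[0]


# The 38 training and 34 test samples together account for the
# 72 samples listed for the spikeslab version.
total = combined_samples(
    EXPECTED_SHAPES["golub_train"], EXPECTED_SHAPES["golub_test"]
)
same_sample_count = total == EXPECTED_SHAPES["spikeslab_leukemia"][0]
```

A mismatch reported by `matches_readme` would suggest that a different data version, or a transposed matrix, was loaded.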

If you have any questions, please contact us at [email protected] and [email protected].

aim-manuscript's People

Contributors

victoriastodden, xiaomianwu, vsoch
