Name: UrbsLab
Type: Organization
Bio: ML and AI Lab in Department of Computational Biomedicine at Cedars-Sinai Medical Center
Twitter: docurbs
Location: United States of America
A set of Python-based Jupyter notebooks providing a documented example of a semi-automated term harmonization pipeline, applied to harmonizing medical history terms across 28 clinical trials of pulmonary arterial hypertension.
An automated, rigorous, and largely scikit-learn based machine learning analysis pipeline for binary classification. Adopts current best practices to avoid bias, optimize performance, ensure replicability, capture complex associations (e.g. interactions and heterogeneity), and enhance interpretability. Includes (1) exploratory analysis, (2) data cleaning, (3) partitioning, (4) scaling, (5) imputation, (6) filter-based feature selection, (7) collective feature selection, (8) modeling with Optuna hyperparameter optimization across 13 implemented ML algorithms (including three rule-based machine learning algorithms: ExSTraCS, XCS, and eLCS), (9) testing evaluation with 16 classification metrics and model feature importance estimation, (10) automatic saving of all results, models, and publication-ready plots (including proposed composite feature importance plots), (11) non-parametric statistical comparisons across ML algorithms and analyzed datasets, and (12) automatically generated PDF summary reports.
A rigorous, well-documented machine learning analysis pipeline for binary classification datasets assembled as a Jupyter Notebook. Includes exploratory analysis, data processing, feature processing, ML modeling (9 algorithms, including the original ExSTraCS algorithm) with hyperparameter sweeps, visualizations, and statistical analysis. A comprehensive starting point to adapt to your own dataset, and an example of how to integrate a non-scikit-learn ML algorithm into a comparative pipeline.
Feature Inclusion Bin Evolver for Risk Stratification (FIBERS) is an evolutionary algorithm that constructs bins of features, seeking to optimize the bins' stratification of event risk over time.
Source code for the Genetic Architecture Model Emulator for Testing and Evaluating Software (GAMETES), an algorithm for generating complex single nucleotide polymorphism (SNP) models for simulated association studies.
Python scripts to generate a diverse archive of simulated SNP datasets using GAMETES
Supplemental materials and code for our GP-LCS project, adapting ExSTraCS to evolve GP trees rather than rules for comparison to other stand-alone GP algorithms
Documentation and informational resources for LPC use
An assembly of Jupyter notebooks covering basic machine learning pipeline tasks. This student-driven, independent study project will eventually evolve into a user-friendly starting point for ML pipeline example notebooks.
LCS Discovery and Visualization Environment (LCS-DIVE)
This repository includes educational materials on machine learning and a basic example machine learning analysis pipeline. These materials were originally developed for a workshop series at the University of Pennsylvania.
Code and results for an investigation of pancreatic cancer datasets applying our binary classification machine learning analysis pipeline notebook. Includes analysis and comparison of three pancreatic cancer datasets.
Example PyKE code and Jupyter Notebook for a simple backwards chaining expert system as described in this lecture on YouTube: https://www.youtube.com/watch?v=mzsk5_EmZq8
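For readers unfamiliar with the term, backward chaining starts from a goal and works back through rules until it reaches known facts. The following is a minimal sketch in plain Python, not PyKE and not the code in the repository above; the rule base, fact names, and the `prove` function are all hypothetical and chosen only for illustration.

```python
# Hypothetical toy knowledge base (illustrative only, not from the repository).
# Each goal maps to a list of alternative premise lists; a goal is proven
# when every premise in at least one of its lists can be proven.
RULES = {
    "is_mammal": [["has_fur"], ["gives_milk"]],
    "is_carnivore": [["is_mammal", "eats_meat"]],
    "is_tiger": [["is_carnivore", "has_stripes"]],
}


def prove(goal, facts, rules=RULES, _active=None):
    """Return True if `goal` is a known fact or derivable by backward chaining."""
    _active = set() if _active is None else _active
    if goal in facts:
        return True
    if goal in _active:
        # The goal is already being proven higher up the call stack:
        # treat the cycle as a failure instead of recursing forever.
        return False
    _active.add(goal)
    try:
        for premises in rules.get(goal, []):
            if all(prove(p, facts, rules, _active) for p in premises):
                return True
        return False
    finally:
        _active.discard(goal)


if __name__ == "__main__":
    known_facts = {"has_fur", "eats_meat", "has_stripes"}
    print(prove("is_tiger", known_facts))
```

The sketch is goal-driven: asking for `is_tiger` triggers sub-goals for `is_carnivore` and then `is_mammal`, and only the facts needed for that chain are ever consulted.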
RARE: Relevant Association Rare-variant-bin Evolver (under development); an evolutionary algorithm approach to binning rare variants as a rare variant association analysis tool. Applications, visualizations, and modifications are currently in the works.
A scikit-learn-compatible Python implementation of eLCS, a supervised learning variant of Learning Classifier Systems
A scikit-learn implementation based on ExSTraCS 2.0 (under development)
Experimental variation of scikit-ExSTraCS that allows the user to import an initial rule population, which is evaluated and assigned fitness values before learning iterations begin. This allows for the import of manually curated rules derived from expert knowledge, or rules derived from other sources.
A scikit-learn compatible implementation of FIBERS (Feature Inclusion Bin Evolver for Risk Stratification)
scikit-RARE is a scikit-learn compatible PyPI package for the RARE (Relevant Association Rare-variant-bin Evolver) evolutionary algorithm.
A scikit-learn-compatible Python implementation of ReBATE, a suite of Relief-based feature selection algorithms for Machine Learning.
A scikit-learn compatible implementation of XCS, the most popular and best studied learning classifier system algorithm to date.
An (updated and expanded) rigorous, well documented machine learning analysis pipeline for binary classification datasets assembled as a Jupyter Notebook. Includes exploratory analysis, data processing, feature processing, ML modeling (13 algorithms) with hyperparameter sweeps, visualizations, and statistical analysis. A comprehensive starting point to adapt to your own dataset.
A rigorous machine learning analysis pipeline for binary classification datasets assembled as parallelizable command line modules. Includes exploratory analysis, data processing, feature processing, ML modeling (11 algorithms) with hyperparameter sweeps, visualizations, and statistical analysis. A comprehensive starting point to adapt to your own dataset.
Simple Transparent End-To-End Automated Machine Learning Pipeline for Supervised Learning in Tabular Binary Classification Data
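Several of the pipeline entries above describe the same broad sequence: partition the data, impute and scale, apply filter-based feature selection, fit a model, and evaluate on held-out folds. A minimal sketch of that sequence using plain scikit-learn might look like the following. It is not code from any of the repositories listed here; the synthetic dataset, the step order, the choice of a single random forest, and the three metrics are assumptions made purely for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.impute import SimpleImputer
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary classification data stands in for a real dataset.
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=5, random_state=0
)

# Blank out roughly 5% of the values so the imputation step has work to do.
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan

# Wrapping every step in one Pipeline means imputation, scaling, and feature
# selection are re-fit on the training folds only, which avoids leaking
# information from the held-out fold.
pipeline = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
    ("select", SelectKBest(mutual_info_classif, k=10)),
    ("model", RandomForestClassifier(n_estimators=100, random_state=0)),
])

# Stratified partitioning keeps the class balance similar across folds.
cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
metrics = ["balanced_accuracy", "roc_auc", "f1"]
scores = cross_validate(pipeline, X, y, cv=cv, scoring=metrics)

# Average each metric over the held-out folds.
mean_scores = {m: float(np.mean(scores[f"test_{m}"])) for m in metrics}
print(mean_scores)
```

A full pipeline of the kind described above would add hyperparameter optimization, multiple algorithms, and statistical comparison across them; this sketch covers only the leakage-safe core of one model.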