XAI4Chem

Explainable AI for Chemistry

Installation

Create and activate a conda environment:

conda create -n xai4chem python=3.10 -y
conda activate xai4chem

Then install from GitHub:

python -m pip install git+https://github.com/ersilia-os/xai4chem.git 

Usage

Data

Read the data file and split it into training and validation sets:

import pandas as pd  
from sklearn.model_selection import train_test_split

data = pd.read_csv("plasmodium_falciparum_3d7_ic50.csv")  # path to the data file

# Extract SMILES and target values
smiles = data["smiles"]
target = data["pchembl_value"]  # name of the target column

# Split data into training and test sets
smiles_train, smiles_valid, y_train, y_valid = train_test_split(smiles, target, test_size=0.2, random_state=42)

# Reset indices
smiles_train.reset_index(drop=True, inplace=True)
smiles_valid.reset_index(drop=True, inplace=True)
y_train.reset_index(drop=True, inplace=True)
y_valid.reset_index(drop=True, inplace=True)
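Before splitting, it can help to confirm that the expected columns are present and to drop rows that cannot be used. The following is a minimal sketch, assuming the column names used above ("smiles" and "pchembl_value"). The helper name clean_dataset and the inline example data are hypothetical and are not part of xai4chem:

```python
import io

import pandas as pd


def clean_dataset(data, smiles_col="smiles", target_col="pchembl_value"):
    """Check required columns, then drop missing values and repeated SMILES."""
    missing = [c for c in (smiles_col, target_col) if c not in data.columns]
    if missing:
        raise KeyError(f"Missing required columns: {missing}")
    # Drop rows where either the SMILES string or the target is missing
    cleaned = data.dropna(subset=[smiles_col, target_col])
    # Keep the first occurrence of each SMILES string
    cleaned = cleaned.drop_duplicates(subset=[smiles_col])
    return cleaned.reset_index(drop=True)


# Tiny in-memory stand-in for the CSV file (illustrative values only)
csv_text = """smiles,pchembl_value
CCO,5.1
CCO,5.1
c1ccccc1,
CCN,6.3
"""
example = clean_dataset(pd.read_csv(io.StringIO(csv_text)))
print(example)
```

Keeping only the first occurrence of a repeated SMILES string is a simplification. If the same molecule appears with different measured values, aggregating them (for example by taking the mean) may be more appropriate.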

Calculate and transform descriptors. Choose either descriptors (Datamol, Mordred, or RDKit) or fingerprints (Morgan):

from xai4chem import DatamolDescriptor

descriptor = DatamolDescriptor(discretize=False)

# Fit the descriptor to training data
descriptor.fit(smiles_train)

# Transform the data
smiles_train_transformed = descriptor.transform(smiles_train)
smiles_valid_transformed = descriptor.transform(smiles_valid)
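The descriptor is fitted on the training data only and then applied to both sets, so nothing learned from the validation set leaks into training. The sketch below illustrates that general fit/transform pattern with a toy class. ToyMedianImputer and the column names are hypothetical, and the sketch does not describe what DatamolDescriptor does internally:

```python
import pandas as pd


class ToyMedianImputer:
    """Hypothetical stand-in that follows the same fit/transform pattern."""

    def fit(self, X):
        # Learn statistics from the training data only
        self.medians_ = X.median()
        return self

    def transform(self, X):
        # Reuse the training statistics; never re-estimate them on new data
        return X.fillna(self.medians_)


# Illustrative values only
train = pd.DataFrame({"mw": [100.0, 200.0, 300.0], "logp": [1.0, None, 3.0]})
valid = pd.DataFrame({"mw": [None, 150.0], "logp": [2.5, None]})

imputer = ToyMedianImputer().fit(train)
train_t = imputer.transform(train)
valid_t = imputer.transform(valid)
print(valid_t)
```

The missing values in the validation set are filled with medians computed from the training set, which is the behavior the fit-then-transform order is meant to guarantee.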

Model Training and Evaluation

The tool provides a Regressor class for training and evaluating regression models. It supports the XGBoost, LightGBM, and CatBoost algorithms. You can train the model with default parameters or perform hyperparameter optimization with Optuna.

You can also specify the number of features (k) to use. Feature selection then automatically selects the k most relevant features during training.
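To give a feel for what selecting k features means, here is a minimal sketch that uses scikit-learn's SelectKBest with a univariate F-test on synthetic data. This is a stand-in technique chosen for illustration. The selection method that xai4chem applies internally is not described here and may differ:

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_regression

# Synthetic data: only columns 0 and 3 actually drive the target
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + rng.normal(scale=0.1, size=200)

# Keep the k=2 features with the highest univariate F-scores
selector = SelectKBest(score_func=f_regression, k=2)
X_selected = selector.fit_transform(X, y)
selected_indices = selector.get_support(indices=True)

print(selected_indices, X_selected.shape)
```

A smaller k gives a simpler model and fewer features to interpret, at the risk of discarding descriptors that carry useful signal.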

from xai4chem import Regressor

# Folder where evaluation metrics and interpretability plots will be saved
output_folder = "results"  # example path; choose your own

# Use xgboost, lgbm or catboost
regressor = Regressor(output_folder, algorithm='xgboost', k=100)

# Train the model
regressor.fit(smiles_train_transformed, y_train, default_params=False)

# Optionally save the trained model
# regressor.save('model_filename.joblib')  # pass the filename

# Evaluate the model
regressor.evaluate(smiles_valid_transformed, y_valid)
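The exact metrics that evaluate writes to the output folder are not listed here. As a rough guide to reading regression results, the sketch below computes three commonly used metrics (MAE, RMSE, and R²) in plain Python so that their definitions are visible. The function name regression_metrics and the example numbers are hypothetical:

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE and R^2 for two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and the same length")
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(r) for r in residuals) / n
    ss_res = sum(r * r for r in residuals)
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # R^2 is undefined when every observed value is identical
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {"mae": mae, "rmse": rmse, "r2": r2}


# Illustrative values only, not real model output
metrics = regression_metrics([5.0, 6.0, 7.0, 8.0], [5.5, 6.0, 6.5, 8.0])
print(metrics)
```

MAE and RMSE are in the units of the target (here, pChEMBL units), while R² is unitless and compares the model against always predicting the mean.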

Model Interpretation

The Regressor class also provides functionality for interpreting model predictions. Generate the interpretability plots with:

regressor.explain(smiles_train_transformed)

Classification Models

To generate interpretability plots for a trained classification model, use:

from xai4chem.explain_model import explain_model

explanation = explain_model(model, X, output_folder)

# Parameters:
# model: A trained model.
# X: The feature set used for explanation.
# output_folder: Folder to save the interpretability plots.

Contributors

hellennamulinda, gemmaturon
