FIP3 - feature interrelation profiling

A Python library and script collection for identifying, quantifying and comparing interrelations between arbitrary boolean features (e.g. presence of structural motifs within a molecule, the molecule exhibiting a specific type of biological activity) from their co-occurrences in feature vectors (e.g. features of individual chemical structures) within a given feature vector set (e.g. a chemical database or its subset).

The rationale behind feature interrelation profiling, along with an example application, is described in the paper "Profiling and analysis of chemical compounds using pointwise mutual information".

Dependencies

  • Pandas: needed for core functionality as well as any subsequent interrelation profile analysis
  • RDKit: needed for all chemistry-related functionality
  • Recommended:

Getting started

  • The Sphinx documentation is available at the project GitHub page. It can also be generated by running the make html command in the docs/source folder.
  • A Jupyter notebook with example use is also available.
  • This library is Python-only, so installing the core dependencies into the environment, cloning this repository and adding it to the PYTHONPATH should be sufficient.
  • Building a co-occurrence profile is then simply a matter of:
>>> from fip.profiles import CooccurrenceProfile

# Some dummy feature sets
>>> FEATURE_TUPLES = (('a', 'b', 'c', 'd'), ('a', 'b', 'x'), ('c', 'd'))

# Create a co-occurrence profile instance
>>> p = CooccurrenceProfile.from_feature_lists(FEATURE_TUPLES)

# Unlike prior implementations of interrelation profiling, FIP3 is a full rework
# that uses a sparse data representation, with lazy, on-demand imputation of missing or
# insignificant pair values. Any explicit interrelations are mapped pairwise
# in a MultiIndex Pandas DataFrame, and can be accessed and handled as such:
>>> p.df
                   value
feature1 feature2       
a        a             2
         b             2
         c             1
         d             1
b        b             2
         c             1
         d             1
c        c             2
         d             2
d        d             2
a        x             1
b        x             1
x        x             1

>>> from fip.profiles import CooccurrenceProbabilityProfile
>>> q = CooccurrenceProbabilityProfile.from_cooccurrence_profile(p)
>>> q
<fip.profiles.CooccurrenceProbabilityProfile object at 0x7f3222e05290>

>>> from fip.profiles import PointwiseMutualInformationProfile
>>> r = PointwiseMutualInformationProfile.from_cooccurrence_probability_profile(q)
>>> r
<fip.profiles.PointwiseMutualInformationProfile object at 0x7f3212047210>

>>> r.df
                      value
feature1 feature2          
a        a         0.000000
         b         0.584963
         c        -0.415037
         d        -0.415037
b        b         0.000000
         c        -0.415037
         d        -0.415037
c        c         0.000000
         d         0.584963
d        d         0.000000
a        x         0.584963
b        x         0.584963
x        x         0.000000

>>> r.select_raw_interrelations_involving('c')
                      value
feature1 feature2          
c        d         0.584963
a        c        -0.415037
b        c        -0.415037


# Export to an explicit matrix DataFrame is also possible, with imputation:
>>> r.to_explicit_matrix()
          a         b         c         d         x
a       0.0  0.584963 -0.415037 -0.415037  0.584963
b  0.584963       0.0 -0.415037 -0.415037  0.584963
c -0.415037 -0.415037       0.0  0.584963 -1.754888
d -0.415037 -0.415037  0.584963       0.0 -1.754888
x  0.584963  0.584963 -1.754888 -1.754888       0.0

# much more in documentation :)
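
To make the numbers above easier to follow, here is a minimal standalone sketch that recomputes the explicit (observed, off-diagonal) PMI values for the same dummy feature sets using only the Python standard library. It is not FIP3's implementation: the helper names count_cooccurrences and pointwise_mutual_information are made up for this illustration, and estimating each probability as a count divided by the number of feature vectors is an assumption that happens to reproduce the explicit values shown in r.df. Self-interrelations (reported as 0.0 above) and imputed values for never-observed pairs such as c and x are deliberately left out.

```python
from collections import Counter
from itertools import combinations
from math import log2

# Same dummy feature sets as in the example above
FEATURE_TUPLES = (('a', 'b', 'c', 'd'), ('a', 'b', 'x'), ('c', 'd'))


def count_cooccurrences(feature_tuples):
    """Hypothetical helper: count in how many vectors each feature,
    and each unordered pair of distinct features, occurs."""
    single = Counter()
    pair = Counter()
    for features in feature_tuples:
        unique = sorted(set(features))  # sorted so pairs are always (low, high)
        single.update(unique)
        pair.update(combinations(unique, 2))
    return single, pair


def pointwise_mutual_information(feature_tuples):
    """Hypothetical helper: PMI in bits for every observed feature pair.

    Assumes P(x) = count(x) / N and P(x, y) = count(x, y) / N,
    where N is the number of feature vectors.
    """
    n = len(feature_tuples)
    single, pair = count_cooccurrences(feature_tuples)
    return {
        (f1, f2): log2((count / n) / ((single[f1] / n) * (single[f2] / n)))
        for (f1, f2), count in pair.items()
    }


pmi = pointwise_mutual_information(FEATURE_TUPLES)
for (f1, f2), value in sorted(pmi.items()):
    print(f1, f2, round(value, 6))
```

For instance, a and b each occur in 2 of the 3 vectors and co-occur in 2 of them, so PMI(a, b) = log2((2/3) / ((2/3) * (2/3))) = log2(1.5), approximately 0.584963, which agrees with the a/b entry in r.df.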

Available through the MIT License.

Supported by Junior Internal Grant of the UCT Prague (2021, #2103)
