We propose iDeep, a deep learning-based framework that fuses heterogeneous data to predict RNA-protein interaction sites. The framework not only learns
hidden feature patterns from each individual data source but also extracts a shared representation across them. In addition, the convolutional neural network in iDeep can automatically identify binding motifs. To compare iDeep with other methods,
we performed experiments on large-scale CLIP-seq datasets. The results indicate that iDeep outperforms the state-of-the-art methods.
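To make the motif-detection idea concrete, here is a minimal pure-Python sketch of how a convolutional filter can act as a motif detector over a one-hot encoded RNA sequence. It is not taken from ideep.py: the function names one_hot and scan, the toy kernel, and the example sequence are all hypothetical, and a real model would learn its kernel weights during training rather than have them set by hand.

```python
# Illustrative sketch only; names and values are assumptions, not iDeep's code.
ALPHABET = "ACGU"

def one_hot(seq):
    """Encode an RNA sequence as a list of 4-element rows (A, C, G, U)."""
    rows = []
    for base in seq.upper().replace("T", "U"):
        row = [0.0] * 4
        if base in ALPHABET:  # unknown bases such as N stay all-zero
            row[ALPHABET.index(base)] = 1.0
        rows.append(row)
    return rows

def scan(encoded, kernel):
    """Slide a (width x 4) kernel along the sequence; one activation per window."""
    width = len(kernel)
    activations = []
    for start in range(len(encoded) - width + 1):
        total = 0.0
        for offset in range(width):
            for channel in range(4):
                total += encoded[start + offset][channel] * kernel[offset][channel]
        activations.append(max(total, 0.0))  # ReLU non-linearity
    return activations

# Hypothetical hand-set kernel that responds most strongly to the 3-mer "UGU".
kernel = [
    [0, 0, 0, 1],  # position 1 prefers U
    [0, 0, 1, 0],  # position 2 prefers G
    [0, 0, 0, 1],  # position 3 prefers U
]

acts = scan(one_hot("ACUGUAC"), kernel)
best = max(range(len(acts)), key=acts.__getitem__)  # window with the strongest match
```

The window with the highest activation marks where the sequence best matches the filter; collecting such windows across many sequences is one common way to turn a learned filter into a readable motif.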
Dependencies
keras 1.0.0
scikit-learn (sklearn)
Content
./datasets: the training and test datasets, with extracted features, labels, and sequences.
./cbust_folder: the Cluster-Buster tool, used to generate motif features.
./pwms_folder: 102 position weight matrices (PWMs) from CISBP-RNA.
./predicted_motifs: binding motifs detected by iDeep for individual proteins. It also includes the report file ame.html from AME in the MEME Suite, which reports the enrichment scores of the predicted motifs.
./ideep.py: the Python code; run it to reproduce our results.
./make_feature_table.py: a script modified from primescore.
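For readers unfamiliar with position weight matrices such as those in ./pwms_folder, the following sketch shows one simple way to score a sequence against a PWM using log-odds. It is an illustration only and is not the Cluster-Buster algorithm or the feature extraction used by this repository. The function names, the toy matrix, the uniform background of 0.25, and the pseudocount are all assumptions, and the sequence is assumed to contain only A, C, G, and U.

```python
import math

BASES = "ACGU"

def log_odds(pwm, background=0.25, pseudocount=1e-3):
    """Convert per-position base probabilities to log2-odds scores.

    The pseudocount avoids log(0); probabilities are not renormalized,
    which is acceptable for a sketch but not for careful analysis.
    """
    return [
        [math.log2((p + pseudocount) / background) for p in column]
        for column in pwm
    ]

def best_match(seq, pwm):
    """Return (score, start) of the highest-scoring window in seq."""
    scores = log_odds(pwm)
    width = len(scores)
    best_score, best_start = float("-inf"), -1
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        score = sum(scores[i][BASES.index(b)] for i, b in enumerate(window))
        if score > best_score:  # keep the first window on ties
            best_score, best_start = score, start
    return best_score, best_start

# Hypothetical 3-position PWM favouring "UGU" (rows are positions; columns A, C, G, U).
toy_pwm = [
    [0.1, 0.1, 0.1, 0.7],
    [0.1, 0.1, 0.7, 0.1],
    [0.1, 0.1, 0.1, 0.7],
]

score, start = best_match("ACUGUAC", toy_pwm)
```

A positive score means the window resembles the motif more than the background; a sequence shorter than the PWM yields a score of negative infinity and a start of -1.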
Contact: Xiaoyong Pan (xypan172436atgmail.com)