connorcoley / conv_qsar_fast
License: MIT License
Dear professor,
This is just a reference for me in the future and also for people who encounter the same problem as I did.
The code does not seem to work with the Tanimoto kernel, because the input data is passed as plain lists. Adding a conditional that converts the inputs from list to ndarray for both training and testing should fix it:
For training:
if kwargs['kernel'] not in ['tanimoto']:
    model.fit([x[0] for x in data[0]['mols']], data[0]['y'])
else:
    # The Tanimoto kernel needs ndarray inputs rather than lists
    train_x = np.array([x[0] for x in data[0]['mols']])
    train_y = np.array(data[0]['y'])
    model.fit(train_x, train_y)
For testing:
if kwargs['kernel'] not in ['tanimoto']:
    predicted_y = model.predict(test_x)
else:
    # Convert the list of test inputs to an ndarray before predicting
    test_x = np.array(test_x)
    predicted_y = model.predict(test_x)
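For readers wondering why the list type matters: a Tanimoto kernel is usually written with matrix operations that plain Python lists do not support. The following is a minimal sketch, not the repository's own kernel. The function name `tanimoto_kernel` is hypothetical, and it converts its inputs with `np.asarray` so that lists and ndarrays both work.

```python
import numpy as np

def tanimoto_kernel(a, b):
    """Tanimoto similarity between every row of a and every row of b.

    Hypothetical helper for illustration; not taken from conv_qsar_fast.
    """
    # Lists have no .T or @, so convert them to float arrays first
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    inner = a @ b.T                       # pairwise dot products
    a_sq = (a * a).sum(axis=1)[:, None]   # squared norms, column vector
    b_sq = (b * b).sum(axis=1)[None, :]   # squared norms, row vector
    denom = a_sq + b_sq - inner
    # Leave similarity at 0 where both fingerprints are all zeros
    return np.divide(inner, denom, out=np.zeros_like(inner), where=denom != 0)

fingerprints = [[1, 0, 1, 1], [1, 1, 0, 0]]  # plain lists, as in the issue
gram = tanimoto_kernel(fingerprints, fingerprints)
```

Doing the conversion inside the kernel would be an alternative to the conditional above, assuming the kernel is user-supplied.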
I ran into some unresolved problems while running the model.
I changed the model inputs to two molecules in order to predict chemical reactions.
However, the results were poor, and neither loss nor val_loss changed during training.
Could you give me some advice? Thank you!
python main/core.py
  File "main/core.py", line 302
    exec lr_func_string
         ^
SyntaxError: Missing parentheses in call to 'exec'
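This error means the script is being run under Python 3, where exec is a function rather than a statement. The minimal fix is to write exec(lr_func_string). In Python 3, exec also cannot reliably create local variables inside a function, so passing an explicit namespace is safer. The sketch below assumes the string defines a function named `lr_func`. That name and the example string are hypothetical, since the contents of lr_func_string are not shown here.

```python
def build_lr_func(lr_func_string, func_name="lr_func"):
    """Execute a code string and return the function it defines.

    func_name is an assumption about what the string defines.
    """
    namespace = {}
    # Python 3 form: exec is a function and writes into the given dict
    exec(lr_func_string, namespace)
    return namespace[func_name]

# Hypothetical learning-rate schedule, for illustration only
example_string = "def lr_func(epoch):\n    return 0.01 * (0.5 ** epoch)"
lr_func = build_lr_func(example_string)
```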
Thanks, Connor, for publishing this project. It is a fascinating take on QSAR approaches.
I noticed that train_model in core.py assumes that all inputs are molecular tensors, so fingerprint-based models fail because they are single arrays. For example, the command
python conv_qsar_fast/main/main_cv.py conv_qsar_fast/inputs/tox21_Morgan/tox21_ahr.cfg
fails with an error message along the lines of:
Error when checking model input: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 1 arrays but instead got the following list of 3 arrays: [array([[1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0,
0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0...
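One way to handle both cases is to shape the inputs according to the model type before calling fit or predict. This is only a sketch under assumptions: the helper name `build_model_inputs`, the `is_fingerprint` flag, and the per-sample layout (a flat bit vector for fingerprints, a tuple of component arrays for molecular tensors) are hypothetical and are not taken from core.py.

```python
import numpy as np

def build_model_inputs(samples, is_fingerprint):
    """Shape per-molecule inputs for a single-input or multi-input model.

    Assumed layout (hypothetical):
      - fingerprint models: each sample is a 1-D sequence of bits
      - tensor models: each sample is a tuple of component arrays
    """
    if is_fingerprint:
        # Single input array of shape (n_samples, n_bits)
        return np.asarray(samples)
    # One array per component, each with a leading sample axis
    n_components = len(samples[0])
    return [np.asarray([s[i] for s in samples]) for i in range(n_components)]

fp_inputs = build_model_inputs([[1, 0, 1], [0, 1, 1]], is_fingerprint=True)
```

With this split, a fingerprint model receives one array, which is what the error message says it expects.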
I'm running Python 3.6 and getting some odd output from training the examples.
It seems the library was written for Python 2.x, but after wrapping a few range calls in list, things appear to execute fine. However, the example input .cfg files stop very early (~12-20 epochs for the Ab-oct example) and generate some pretty horrendous models.
Any thoughts on where to start troubleshooting?
Also, which versions of RDKit, Theano, and Python was this written for?
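For context on the range change: in Python 3, range returns a lazy sequence rather than a list, so in-place operations such as shuffling fail until it is wrapped in list. The sketch below shows the pattern. The helper name `shuffled_indices` is hypothetical and does not come from the repository.

```python
import random

def shuffled_indices(n, seed=None):
    """Return the indices 0..n-1 in random order."""
    # list(...) is required in Python 3: range objects are immutable,
    # so random.shuffle(range(n)) raises TypeError
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return indices

order = shuffled_indices(5, seed=0)
```

This only makes the code run. It does not by itself explain the early stopping or the poor models.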
This is a very interesting cheminformatics approach. I was trying to learn from and use the source code. I fixed a couple of errors reported by Python 3 and got the code running with keras 2.1.6 and theano 1.0.3.
However, the example cases always stopped at the 10th epoch because of early stopping. Training did not appear to run correctly, and the test performance was very poor compared to what was reported in the paper. Any suggestions for making it compatible with Python 3? Thanks.
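Stopping at the same early epoch every time is what a patience-based early-stopping rule produces when the validation loss never improves. The sketch below illustrates a generic rule of that kind. It is an assumption about the mechanism, not the repository's implementation, and the function name `stopping_epoch` and the patience value are hypothetical.

```python
def stopping_epoch(val_losses, patience):
    """Return the 1-based epoch at which training would stop, or None.

    Generic patience rule: stop once the validation loss has failed to
    improve on its best value for `patience` consecutive epochs.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

# A validation loss that never changes always stops at the same epoch
flat_run = stopping_epoch([1.0] * 50, patience=10)
```

If the logged val_loss is identical from epoch to epoch, the stop is a symptom and the real question is why the weights are not updating.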
Hi, when I run the script "python conv_qsar_fast/main/main_cv.py conv_qsar_fast/inputs/De-aq.cfg", some minor errors show up. I fixed them and the script then runs normally. However, the final results are odd: the training loss never decreases in any CV fold, and the predicted values are all the same. I have run it several times with the same results, and I don't know why.