Comments (7)

kexinhuang12345 avatar kexinhuang12345 commented on July 18, 2024 1

Hey Cyrus, that's a good point! That is also why we provided a log transformation function. You can try:

from tdc.single_pred import ADME
data = ADME(name = 'VDss_Lombardo')
data.convert_to_log()
data.label_distribution()

We want to keep the raw data as well, but we will add a note on the website about this important information.

I don't think our current ADMET benchmark uses that, so it may be a good idea to use a log scale for VDss and half-life. I will try that out to see if the performance makes sense. In which other datasets have you identified this issue? Thanks!

from tdc.

cyrusmaher avatar cyrusmaher commented on July 18, 2024

Hi Kexin, thanks for your quick reply! I went ahead and ran this for all the TDC endpoints that I'm working with. I only see benefits for VD, clearance, and half-life. Note that for the log transform, I use a robust version so it works for non-positive numbers...

import numpy as np

def robust_log(x):
    return np.sign(x) * np.log(np.abs(x) + 1)  # signed log, defined for x <= 0

image

kexinhuang12345 avatar kexinhuang12345 commented on July 18, 2024

Thanks so much for making this table! It looks like you are using the old version of TDC. We found that eDrug3D is very noisy, so we replaced its datasets with higher-quality ones. You can check them out on our website.

Would it be fast to generate these numbers for the new datasets? If not, let me know; I can also run some code to test the difference.

Also, regarding the log transformation, I made a PR to incorporate your point: #56. One difference is that instead of adding 1 for numeric stability, I add 1e-10, since some raw values are quite small and adding 1 would noticeably change them.
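To make the offset point concrete, here is a minimal sketch comparing the two choices on made-up values. It assumes non-negative inputs and a plain `log(x + offset)`; the helper name `log_with_offset` and the sample array are hypothetical, and this is not necessarily how PR #56 implements it.

```python
import numpy as np

def log_with_offset(x, offset):
    """Natural log of x + offset; the offset keeps log(0) from producing -inf."""
    return np.log(np.asarray(x, dtype=float) + offset)

# Hypothetical non-negative raw values spanning several orders of magnitude.
raw = np.array([0.0, 0.01, 0.1, 1.0, 10.0])

plus_one = log_with_offset(raw, 1.0)    # squeezes everything below 1 toward 0
plus_eps = log_with_offset(raw, 1e-10)  # close to a plain log for nonzero values

for value, a, b in zip(raw, plus_one, plus_eps):
    print(f"raw={value:<6} log(x+1)={a:8.4f}  log(x+1e-10)={b:9.4f}")
```

With an offset of 1, the values 0.01 and 0.1 end up nearly indistinguishable, while the 1e-10 offset keeps them about 2.3 apart. The trade-off is that exact zeros map to roughly -23 with the small offset, so it is worth checking whether a dataset contains any.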

cyrusmaher avatar cyrusmaher commented on July 18, 2024

Hi Kexin, good call on adding a smaller number! I got pulled away to do COVID variant work, but I should be able to get back to this next week. Once it's ready, I'll add the updated table here.

kexinhuang12345 avatar kexinhuang12345 commented on July 18, 2024

Sounds good, thanks a lot!!!

cyrusmaher avatar cyrusmaher commented on July 18, 2024

Here you go! I updated the robust log computation and added the latest datasets:
image

kexinhuang12345 avatar kexinhuang12345 commented on July 18, 2024

Thanks so much! This looks good. It seems the currently supported datasets do not require log transformation. Closing for now! Feel free to reopen if you have any questions!
