Comments (21)

guolinke commented on May 12, 2024 7

@aldanor any updates?

from lightgbm.

aldanor commented on May 12, 2024 2

Any thoughts on this?

redditur commented on May 12, 2024 2

@guolinke @chivee

I would also be very interested in seeing this feature implemented in LightGBM. As aldanor stated above, the pseudocode suggested earlier is correct and is how XGBoost implements monotonic constraints.

As such this feature should be fairly trivial to implement for someone with an intimate knowledge of the codebase.

mayer79 commented on May 12, 2024 1

From a practical perspective (outside the Kaggle world!), this feature would be extremely helpful in many applications where reasonable model behavior matters.

guolinke commented on May 12, 2024 1

It seems monotonic constraints (MC) are additive: if models A and B are both monotonic in the same direction on a feature, then A+B is monotonic too.
So we only need to enforce MC in decision tree learning.
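
The additivity argument can be written out in one line. This is only a sketch; the symbols $f_A$, $f_B$, $x$ and $x'$ are introduced here for illustration. Assume both models are non-decreasing in feature $j$, and let $x$ and $x'$ differ only in coordinate $j$, with $x_j \le x'_j$:

```latex
f_A(x) \le f_A(x'), \quad f_B(x) \le f_B(x')
\;\Longrightarrow\;
(f_A + f_B)(x) = f_A(x) + f_B(x) \le f_A(x') + f_B(x') = (f_A + f_B)(x').
```

The same reasoning covers any sum of trees scaled by a non-negative factor, so a boosted ensemble is monotonic as long as each individual tree is.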

Combining @chivee's pseudocode and @AbdealiJK's suggestion, I think the algorithm is:


// output bounds of the node being split
min_value = node.min_value
max_value = node.max_value

// both child outputs must stay inside the node's bounds
check(min_value <= split.left_output)
check(min_value <= split.right_output)
check(max_value >= split.left_output)
check(max_value >= split.right_output)
mid = (split.left_output + split.right_output) / 2

if (split.feature is monotonic increasing) {
  check(split.left_output <= split.right_output)
  node.left_child.set_max_value(mid)
  node.right_child.set_min_value(mid)
}
if (split.feature is monotonic decreasing) {
  check(split.left_output >= split.right_output)
  node.left_child.set_min_value(mid)
  node.right_child.set_max_value(mid)
}
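
For readers who want to experiment, the pseudocode above could be sketched in Python roughly as follows. This is not LightGBM code: `Node` and `apply_split` are hypothetical names, and the sketch assumes that children inherit the parent's bounds before they are tightened to `mid`, which the pseudocode does not state explicitly.

```python
import math
from dataclasses import dataclass


@dataclass
class Node:
    # Allowed range for the outputs of this node's subtree (hypothetical structure).
    min_value: float = -math.inf
    max_value: float = math.inf


def apply_split(node, left_output, right_output, monotone):
    """Return (left_child, right_child), or None if the split is rejected.

    monotone: +1 for increasing, -1 for decreasing, 0 for unconstrained.
    """
    # Both outputs must respect the bounds of the node being split.
    for out in (left_output, right_output):
        if not (node.min_value <= out <= node.max_value):
            return None
    # Ordering check for the constrained feature.
    if monotone > 0 and left_output > right_output:
        return None
    if monotone < 0 and left_output < right_output:
        return None

    # Assumption: children start from the parent's bounds.
    left = Node(node.min_value, node.max_value)
    right = Node(node.min_value, node.max_value)
    mid = (left_output + right_output) / 2
    if monotone > 0:
        left.max_value = mid
        right.min_value = mid
    elif monotone < 0:
        left.min_value = mid
        right.max_value = mid
    return left, right
```

Returning `None` stands in for a failed `check(...)`. Because the outputs are already inside the node's bounds, `mid` is too, so assigning it can only tighten a child's range.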

chivee commented on May 12, 2024

I'm pasting the snippets for the monotonic constraints here

IF (split is a continuous variable and monotonic)

THEN take average of left and right child nodes if current split is used

IF monotonic increasing THEN CHECK left average <= right average

IF monotonic decreasing THEN CHECK left average >= right average

@alexvorobiev, do you have any papers I could refer to for this feature?

alexvorobiev commented on May 12, 2024

@chivee I only have the reference to the R GBM package https://cran.r-project.org/package=gbm

chivee commented on May 12, 2024

@alexvorobiev, thanks for sharing. I'm trying to understand the idea behind this method.

AbdealiLoKo commented on May 12, 2024

Note that the given pseudocode only ensures that each individual split is in the correct order, not that the whole model is monotonic: a later split could still make the model non-monotonic.
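
A small hand-built example makes the gap concrete. It is a sketch only: the tree, the thresholds and the leaf values are made up for illustration, and the left child's two leaves are assumed to hold equal weight so that their average is 4.

```python
def local_check_ok(left_avg, right_avg, monotone):
    """Per-split ordering check from the pasted snippet (+1 increasing, -1 decreasing)."""
    if monotone > 0:
        return left_avg <= right_avg
    if monotone < 0:
        return left_avg >= right_avg
    return True


def predict(x):
    """Hypothetical tree on a single feature x that should be increasing."""
    if x < 0.5:                           # root split
        return 0.0 if x < 0.25 else 8.0   # later split inside the left child
    return 5.0


# Root split: left average (0 + 8) / 2 = 4 versus right average 5.
root_ok = local_check_ok(4.0, 5.0, +1)
# Later split in the left child: 0 versus 8.
child_ok = local_check_ok(0.0, 8.0, +1)

preds = [predict(x) for x in (0.1, 0.3, 0.7)]
is_monotone = all(a <= b for a, b in zip(preds, preds[1:]))
```

Both local checks pass, yet the prediction drops from 8 to 5 as `x` moves from 0.3 to 0.7. Passing an upper bound down to the left child after the root split would reject the 8.0 leaf.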

aldanor commented on May 12, 2024

@guolinke Would you be able to advise how to approach this and whether it's feasible? I.e., where should it belong, would it be sufficient to implement it just somewhere in feature_histogram.hpp? I guess FeatureMetainfo could just contain the -1/0/1 constraint then.

Here's the meat of the implementation in XGBoost, for reference: https://github.com/dmlc/xgboost/blob/master/src/tree/param.h#L422 -- all of it pretty much contained in CalcSplitGain(), plus CalcWeight(). Where would stuff like this go in LightGBM?

guolinke commented on May 12, 2024

@aldanor
I don't know the details of monotonic constraints.
What is the idea? And why is it needed?

The following may be useful:

The split gain calculation: https://github.com/Microsoft/LightGBM/blob/master/src/treelearner/feature_histogram.hpp#L291-L297

The leaf-output calculation:
https://github.com/Microsoft/LightGBM/blob/master/src/treelearner/feature_histogram.hpp#L305-L308

StrikerRUS commented on May 12, 2024

@guolinke Let me add some links about the implementation in XGBoost:
https://xgboost.readthedocs.io/en/latest//tutorials/monotonic.html
dmlc/xgboost#1514
dmlc/xgboost#1516

aldanor commented on May 12, 2024

> @aldanor
> I don't know the details about the monotonic constraints.
> What is the idea? And why it is needed?

@guolinke Monotonic constraints can be a very important requirement for the resulting model, for many reasons. For example, as noted above, there may be domain knowledge that must be respected, as in insurance and risk management problems.

How about we all cooperate and make this work?

guolinke commented on May 12, 2024

@aldanor very cool, I would like to work together on it.

guolinke commented on May 12, 2024

@aldanor would you like to create a PR first? I can help out in the PR.

aldanor commented on May 12, 2024

@guolinke I will give it a try, yep. Your suggested algorithm in the snippet above looks fine; that's roughly what XGBoost does (in exact mode though, not histogram mode; do you think binning would cause any complications here?)

Where would this code belong then, treelearner/feature_histogram.hpp? (I still have to read through most of the code).

Edit: what do you mean by check(...) here? E.g., if (!(...)) { return; }?

guolinke commented on May 12, 2024

@aldanor
`check` means the gain is returned as -inf if the condition is not met; as a result, that split will not be chosen.
I don't think MC works any differently in the binned algorithm.

We need to update the calculation of gain: https://github.com/Microsoft/LightGBM/blob/master/src/treelearner/feature_histogram.hpp#L354-L357 and https://github.com/Microsoft/LightGBM/blob/master/src/treelearner/feature_histogram.hpp#L415-L418 .

We may need to wrap these in a new function and implement both the unconstrained and the MC versions of it.
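
One way to picture such a wrapper is the Python sketch below. It is not LightGBM's actual code: it assumes the textbook second-order formulas with L2 regularization only (leaf output `-G / (H + l2)`, leaf gain `G^2 / (H + l2)`), and every function and parameter name is hypothetical.

```python
import math


def leaf_output(sum_grad, sum_hess, l2=0.0):
    # Standard second-order leaf value (assumed formula, L2 regularization only).
    return -sum_grad / (sum_hess + l2)


def leaf_gain(sum_grad, sum_hess, l2=0.0):
    # Standard second-order leaf gain (assumed formula).
    return sum_grad * sum_grad / (sum_hess + l2)


def split_gain(gl, hl, gr, hr, l2=0.0, monotone=0,
               min_value=-math.inf, max_value=math.inf):
    """Gain of a candidate split, or -inf if a monotonic check fails.

    monotone: +1 increasing, -1 decreasing, 0 unconstrained.
    With monotone=0 and infinite bounds this is the plain, unconstrained gain.
    """
    left = leaf_output(gl, hl, l2)
    right = leaf_output(gr, hr, l2)
    # Bounds inherited from constrained ancestors apply to every split.
    if not all(min_value <= out <= max_value for out in (left, right)):
        return -math.inf
    # Ordering check only for a constrained split feature.
    if monotone > 0 and left > right:
        return -math.inf
    if monotone < 0 and left < right:
        return -math.inf
    return leaf_gain(gl, hl, l2) + leaf_gain(gr, hr, l2)
```

Since a rejected candidate gets a gain of `-inf`, the usual search for the best gain skips it without any further special casing.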

j-mark-hou commented on May 12, 2024

< removed due to irrelevance>

guolinke commented on May 12, 2024

@j-mark-hou
There is one bug in your code; refer to @AbdealiJK's comment and my algorithm below.

j-mark-hou commented on May 12, 2024

got it, I'll wait for someone with a better understanding of the codebase to implement this then.

guolinke commented on May 12, 2024

you can try #1314
