fabianp / mord

Ordinal regression algorithms

License: Other

Python 100.00%

mord's Issues

Feature request: `predict_proba()` method would be useful

Hey,

nice work! It would be great to have a `predict_proba()` method that returns per-class probabilities, so that the package is even more compatible with sklearn and more convenient to use (in my project I need exact probability estimates).

Cheers,
Jan

predict_proba() sometimes gives negative values

When working with 3 or more ordinal classes, the predict_proba function, which relies on the threshold_proba calculation, sometimes outputs negative values.
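
To illustrate how negative values can arise in general, here is a minimal sketch of a cumulative-link computation. It is not mord's implementation: the helper `class_probabilities` and the threshold values are hypothetical, and the assumption is that class probabilities are obtained as differences of consecutive cumulative probabilities P(y <= j) = sigmoid(theta_j - score). Under that assumption, thresholds that are not in increasing order produce a negative difference.

```python
import math


def sigmoid(t):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-t))


def class_probabilities(score, thresholds):
    """Per-class probabilities as differences of cumulative probabilities.

    Assumes P(y <= j) = sigmoid(thresholds[j] - score), with the last
    cumulative probability fixed at 1.
    """
    cumulative = [sigmoid(th - score) for th in thresholds] + [1.0]
    probs = [cumulative[0]]
    probs += [cumulative[j] - cumulative[j - 1] for j in range(1, len(cumulative))]
    return probs


# Increasing thresholds: every class probability is non-negative.
ordered = class_probabilities(0.3, [-1.0, 0.5, 2.0])

# Thresholds out of order: one difference becomes negative.
unordered = class_probabilities(0.3, [-1.0, 2.0, 0.5])

print(ordered)
print(unordered)
```

Both lists still sum to 1, so a negative entry is easy to miss unless it is checked for explicitly.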

LAD method predicts classes that do not exist in the training data

Hi,

I tried to use the LAD method and it predicts classes that do not exist in the training data.
I discovered this issue while trying to plot the confusion matrix.
My classes are encoded between 5 and 14, but the LAD method sometimes predicts a class 4, which should not exist.

Have I made a mistake? Should I reassign this class 4 to class 5 (the nearest existing class)?

Regards
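
One possible workaround, sketched below under the assumption that the out-of-range labels come from rounding a real-valued prediction, is to clip predictions to the label range seen during training. The helper `clip_to_training_range` and the example values are hypothetical and not part of mord; whether clipping is statistically appropriate depends on the application.

```python
def clip_to_training_range(predictions, y_train):
    """Round real-valued predictions and clip them to [min(y_train), max(y_train)]."""
    lo, hi = min(y_train), max(y_train)
    return [min(max(int(round(p)), lo), hi) for p in predictions]


# Hypothetical training labels in the 5..14 range described in the issue.
y_train = [5, 7, 9, 12, 14]

# Hypothetical raw predictions, two of which fall outside the range.
raw = [4.2, 5.6, 13.9, 15.3]

clipped = clip_to_training_range(raw, y_train)
print(clipped)
```

Note that clipping only guarantees a label inside the observed range; it can still return a value in that range that never occurred in training.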

Coefficients differ from statsmodels when there is no regularization (alpha = 0)

Hi There!
I am enjoying using your package to analyse ordinal data. However, when I compared the results of mord.LogisticAT(alpha = 0) against OrderedModel from statsmodels.miscmodels.ordinal_model, I obtained different coefficients. Shouldn't they be the same, as long as I am not using any regularization?

Here is a reproducible example:

import pandas as pd
from statsmodels.miscmodels.ordinal_model import OrderedModel

url = "https://stats.idre.ucla.edu/stat/data/ologit.dta"
data_student = pd.read_stata(url)
X, y = data_student[['pared', 'public', 'gpa']], data_student['apply'] 
mod_prob = OrderedModel(y,
                        X,
                        distr='logit')

res_prob = mod_prob.fit(method='bfgs')
res_prob.summary()

MORD:

import mord
from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()
y = le.fit_transform(y)
model_mord = mord.LogisticAT(alpha = 50)
model_mord.fit(X, y)
y_pred = model_mord.predict(X)
print('\nCoef mord', model_mord.coef_)

Coef mord [-0.02192845  0.12534365  0.01633311]

which differs a lot from OrderedModel (the coefficient of pared is 1.0476 in statsmodels and -0.02192845 in mord).

Also, could I exponentiate model_mord.coef_ and interpret the result as the traditional odds ratio?

Thanks in advance!!
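
As a small illustration of the odds-ratio question, the sketch below exponentiates the `pared` coefficient that the post reports for statsmodels. It assumes an unregularized proportional-odds logit model in which a positive coefficient raises the odds of a higher category; the sign convention should be checked for each library, and coefficients fitted with a nonzero penalty are shrunk, so they would not have the same interpretation.

```python
import math

# Coefficient of `pared` reported for statsmodels in the post above.
coef_pared = 1.0476

# Under a proportional-odds logit model, exp(coefficient) is the
# multiplicative change in the odds for a one-unit increase in the feature.
odds_ratio = math.exp(coef_pared)

print(f"odds ratio for pared: {odds_ratio:.3f}")
```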

Release 0.5

@fabianp could you release current master as v0.5 and push it to PyPI? Thanks!

Unable to determine build deps if numpy not installed

If you don't have numpy installed, setuptools can't parse setup.py. This is because setup.py contains import mord, which executes mord/__init__.py, which imports mord/threshold_based.py, which imports numpy and fails.

Yes, mord needs numpy to run, but pip can't determine that it needs numpy unless numpy is already installed (note that this is also true for scipy and sklearn).

I can't think of an easy fix that doesn't require having the version number in two places.

davids@davids:~$ mkvirtualenv test
Using base prefix '/usr'
New python executable in /home/davids/.virtualenvs/test/bin/python3
Also creating executable in /home/davids/.virtualenvs/test/bin/python
Installing setuptools, pip, wheel...done.
(test) davids@davids:~$ pip install mord
Collecting mord
  Downloading https://files.pythonhosted.org/packages/82/3f/f5689633bcde3fced1e50296e555cd67141e12a90499655f9dfc0771251d/mord-0.5.tar.gz
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-install-hesxyg_3/mord/setup.py", line 2, in <module>
        import mord
      File "/tmp/pip-install-hesxyg_3/mord/mord/__init__.py", line 1, in <module>
        from .threshold_based import *
      File "/tmp/pip-install-hesxyg_3/mord/mord/threshold_based.py", line 7, in <module>
        import numpy as np
    ImportError: No module named 'numpy'

    ----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-install-hesxyg_3/mord/

However, with the dependencies installed first:

davids@davids:~$ mkvirtualenv test
Using base prefix '/usr'
New python executable in /home/davids/.virtualenvs/test/bin/python3
Also creating executable in /home/davids/.virtualenvs/test/bin/python
Installing setuptools, pip, wheel...done.
(test) davids@davids:~$ pip install numpy scipy scikit-learn
Collecting numpy
  Using cached https://files.pythonhosted.org/packages/29/b9/479ccb55cc7dcff3d4fc7c8c26d4887846875e7d4f04483a36f335bed712/numpy-1.15.0-cp35-cp35m-manylinux1_x86_64.whl
Collecting scipy
  Using cached https://files.pythonhosted.org/packages/cd/32/5196b64476bd41d596a8aba43506e2403e019c90e1a3dfc21d51b83db5a6/scipy-1.1.0-cp35-cp35m-manylinux1_x86_64.whl
Collecting scikit-learn
  Using cached https://files.pythonhosted.org/packages/b6/e2/a1e254a4a4598588d4fe88b45ab88a226c289ecfd0f6c90474eb6a9ea6b3/scikit_learn-0.19.2-cp35-cp35m-manylinux1_x86_64.whl
Installing collected packages: numpy, scipy, scikit-learn
Successfully installed numpy-1.15.0 scikit-learn-0.19.2 scipy-1.1.0
(test) davids@davids:~$ pip install mord
Collecting mord
  Downloading https://files.pythonhosted.org/packages/82/3f/f5689633bcde3fced1e50296e555cd67141e12a90499655f9dfc0771251d/mord-0.5.tar.gz
Installing collected packages: mord
  Running setup.py install for mord ... done
Successfully installed mord-0.5
(test) davids@davids:~$
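
A common way to avoid importing the package from setup.py is to read the version string out of the source text instead. The sketch below assumes the version is stored as a string literal assigned to `__version__` in `mord/__init__.py`, which the post does not confirm; `read_version` and the example source are hypothetical.

```python
import re


def read_version(init_source):
    """Extract __version__ from module source text without importing the module."""
    match = re.search(
        r"^__version__\s*=\s*['\"]([^'\"]+)['\"]", init_source, re.MULTILINE
    )
    if match is None:
        raise RuntimeError("no __version__ assignment found")
    return match.group(1)


# Hypothetical contents of mord/__init__.py. In a real setup.py the text
# would be read from the file with open(...).read().
example_source = "from .threshold_based import *\n__version__ = '0.5'\n"

version = read_version(example_source)
print(version)
```

Because the file is only read as text, the numpy import in it is never executed, and the version number still lives in one place.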

Odds Ratio for Threshold-based method

Hi, does anyone know the meaning of the coefficients in the threshold-based ordinal logistic method? Can we compute the odds ratio by log-transforming the coefficients in this kind of method? And what is the difference between this method and the methods in SAS, SPSS, and R?

I also want to know how to run the parallel lines test. Before fitting an ordinal logistic regression, we should run the parallel lines test, but I can't find a method for it in Python.

datasets package is not installed

After

pip install -U 'git+https://github.com/fabianp/mord'

This happens:

ImportError                               Traceback (most recent call last)
<ipython-input-1-a8a4fbc902d0> in <module>()
      1 import mord
----> 2 from mord.datasets.base import load_housing
      3 from sklearn import linear_model, metrics, preprocessing
      4 
      5 data = load_housing()

ImportError: No module named datasets.base

Seems the datasets package is not installed.

Only one intercept value

I have a dataset with close to 120 observations, 21 covariates and 5 response categories. I ran the ordinal ridge regression and, when I looked up the intercept value, it showed only one intercept for the entire model. Should there not be (5-1) intercepts?
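
For context, a regression-based ordinal method has a single intercept by construction. The sketch below assumes that the ordinal ridge model regresses directly on the numeric labels and rounds the result, which is my understanding of the approach rather than something stated in the post. It uses plain least squares on one feature (no ridge penalty), and `fit_line` and `predict_labels` are hypothetical helpers.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_labels(xs, slope, intercept, labels):
    """Round the real-valued prediction and clip it to the observed label range."""
    lo, hi = min(labels), max(labels)
    return [min(max(int(round(slope * x + intercept)), lo), hi) for x in xs]


# Toy data with 5 ordered response categories.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1, 2, 3, 4, 5]

slope, intercept = fit_line(xs, ys)  # one slope, one intercept
labels = predict_labels(xs, slope, intercept, ys)
print(slope, intercept, labels)
```

The (5-1) cutpoints the post expects belong to threshold-based models, where each boundary between adjacent categories gets its own parameter.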

AttributeError: module 'mord' has no attribute 'MulticlassLogistic'

It looks like MulticlassLogistic is not implemented, even though it is mentioned in the documentation.

Missing significance values for regression and coefficients; R^2 seems to be out of bounds

Thanks for creating this package!

I was trying to use it to predict a rank based on lagged values of the ranks.

I got the coefficients, but what is their significance? Is it possible to get t-tests, p-values, and the F-value for the regression?

The score function gave a result of -79218.32, which seems strange if that is the R^2.

I did this:
```python
from mord import OrdinalRidge

ord_reg = OrdinalRidge()

temp_df = train_aamc.dropna()
cols = [x for x in temp_df if "lag" in x]

ord_reg.fit(temp_df[cols], temp_df["aamc"])
ord_reg.score(temp_df[cols], temp_df["aamc"])
```
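
On the score value: if the score is the usual coefficient of determination (an assumption here, since the post does not say which metric is returned), it has no lower bound, so a large negative number only means the predictions are far worse than always predicting the mean. A minimal sketch with a hypothetical `r_squared` helper and made-up values:

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


y_true = [1, 2, 3, 4, 5]

# Perfect predictions give the upper bound of 1.
good = r_squared(y_true, [1, 2, 3, 4, 5])

# Predictions far from the data give a large negative value.
bad = r_squared(y_true, [50, 50, 50, 50, 50])

print(good, bad)
```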

Obtaining boundaries

Hello,

How can I obtain the boundaries (or thresholds) estimated by your ordinal regression code?
Also, is it possible to obtain a real-valued result instead of an integer one?

Best regards,
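
As a sketch of how the two questions relate in a threshold-based model: the real-valued quantity is the linear score, and the integer prediction is the index of the interval between thresholds that the score falls into. The weights and thresholds below are hypothetical numbers. I believe mord's threshold-based estimators expose the fitted values as `coef_` and `theta_`, but treat those attribute names as an assumption to verify against the installed version.

```python
def latent_score(x, coef):
    """Real-valued linear score w . x for one sample."""
    return sum(w * v for w, v in zip(coef, x))


def predict_class(score, thresholds):
    """Index of the interval the score falls in (0 .. len(thresholds))."""
    return sum(score > th for th in thresholds)


# Hypothetical fitted parameters for a 4-class problem.
coef = [0.8, -0.3]
thresholds = [-1.0, 0.5, 2.0]

x = [1.5, 1.0]
score = latent_score(x, coef)              # real-valued output
label = predict_class(score, thresholds)   # integer class index

print(score, label)
```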
