
cs-ranking's People

Contributors

dependabot[bot], kiudee, prithagupta, timokau


cs-ranking's Issues

Sequence-based indexing in theano is deprecated

In csrank/discretechoice/nested_logit_model.py we make use of sequence indexing in theano multiple times. For example:

rows, cols = tt.eq(self.y_nests, i).nonzero()

Here rows and cols are both 1d tensors which are then used to index a different tensor:

utility = tt.set_subtensor(utility[rows, cols], tt.dot(self.Xt[rows, cols], weights[i]))

Theano complains that this is deprecated:

FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.

I'm not sure how to fix this. According to the documentation, theano should support boolean mask indexing. So I thought we should be able to do

mask = tt.eq(self.y_nests, i)
utility = tt.set_subtensor(utility[mask], tt.dot(self.Xt[mask], weights[i]))

instead (as tt.eq should return a boolean mask). But unfortunately that doesn't work; it gives the same warning.

Any ideas?
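For reference, the FutureWarning quoted above originates in numpy; a minimal plain-numpy reproduction (with a numpy version where this is still a warning rather than an error, roughly 1.15–1.22) looks like this:

import numpy as np

a = np.arange(9).reshape(3, 3)
idx = [np.array([0, 1]), np.array([2, 0])]  # a list of (rows, cols) index arrays
a[idx]         # FutureWarning: non-tuple sequence for multidimensional indexing
a[tuple(idx)]  # the suggested replacement: explicit tuple -> array([2, 3])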

Migrate away from tf1

There has been some internal discussion about this, but I think it's time to also open an issue about it. We are still using tensorflow 1, which has been outdated for a while now. Switching to tensorflow 2 would be a significant effort, since the underlying model fundamentally changed (there is no explicit graph construction anymore). At that point, it may be worth evaluating switching to pytorch instead. pytorch is a newer, very popular autodiff framework.

This article comes to the conclusion that

TensorFlow is still mentioned in many more job listings than PyTorch, but the gap is closing. PyTorch has taken the lead in usage in research papers at top conferences and almost closed the gap in Google search results. TensorFlow remains three times more common in usage according to the most recent Stack Overflow Developer Survey.

Here's another relevant article. Overall it seems to me that pytorch is the more future-proof choice, and if we're going to have to rewrite a lot of the code anyway we might as well switch. I do not have any practical experience in pytorch yet, though; that's just what I could determine from others' opinions and first impressions.

We should also think about how we want to do the transition. This is a major undertaking and will probably take a while. Should we support tf1 and newthing in parallel? Gradually move models to newthing (thereby having mixed support)? Fork the project? Work on one big PR/branch, effectively blocking most other work for the time being due to potential conflicts?

Organize imports

Currently we import all submodules in __init__:

from .choicefunction import *
from .core import *
from .dataset_reader import *
from .discretechoice import *
from .objectranking import *
from .tunable import Tunable
from .tuning import ParameterOptimizer

This results in the user seeing a confusing list of submodules. We should trim that by using __all__ to only import important classes and functions.

The remaining modules are still available, but hidden by default.
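A sketch of the mechanism (the module path and the exported name are illustrative, not a final decision on what counts as public):

# csrank/choicefunction/__init__.py -- illustrative module layout
from .fate_choice import FATEChoiceFunction  # hypothetical file name

# Only the names listed here are re-exported by `from .choicefunction import *`
# and treated as the public surface of the subpackage.
__all__ = ["FATEChoiceFunction"]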

Implement proper ranking conversion for nDCG

nDCG is expecting relevance scores as input. When supplying rankings, we first have to convert the ranking into a set of ordered scores. This can quickly lead to numerical problems, due to the exponential growth of 2 ** s.
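To make the numerical problem concrete, a plain numpy sketch (the linear rank-to-relevance conversion here is just one naive option, not the library's current behavior):

import numpy as np

n_objects = 1100
ranking = np.arange(n_objects)   # position 0 = most preferred object
relevance = n_objects - ranking  # naive conversion: higher relevance for better ranks
gains = 2.0 ** relevance - 1     # DCG-style gains; float64 overflows beyond 2**1023
print(np.isinf(gains).any())     # True -> the resulting nDCG is inf or NaN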

Todo

  • Implement a method of converting rankings to relevance scores, which ensures numerical stability for nDCG.
  • Reactivate the test in test_metrics and account for the conversion.

"mean of empty slice" in spearman correlation calculation

During the tests, numpy complains about a "mean of empty slice". That happens because the calculation of the Spearman correlation filters the labels it applies to as follows:

if len(np.unique(r2)) == len(r2):

And then averages its results:

return np.nanmean(np.array(rho))

Which may be empty (or consist of only NaNs) due to the previous filter. What is the intention behind that filter?
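For reference, a minimal reproduction of the numpy warning (illustrative only):

import numpy as np

np.nanmean(np.array([]))        # RuntimeWarning: Mean of empty slice -> nan
np.nanmean(np.array([np.nan]))  # same warning: nothing is left after dropping NaNs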

CC @prithagupta

Clean up notebooks

We need the following notebooks:

  • Usage of FATE-Network
  • Usage of FETA-Network
  • Run of experiments on synthetic data
  • ...

Adhere to scikit-learn estimator interface

Rationale

Most of the learners implemented in cs-ranking already implement an interface similar to the one described in https://scikit-learn.org/stable/developers/develop.html#rolling-your-own-estimator,
i.e., we usually have a fit and predict method implemented.
For users to be able to use all learners effortlessly in a scikit-learn pipeline.Pipeline or to apply model_selection.GridSearchCV, we should make sure that all additional requirements are also fulfilled.

To do

  • Use get_params and set_params to set parameters. This is important, since GridSearchCV or BayesSearchCV call set_params for hyperparameter optimization. sklearn.base.BaseEstimator implements basic versions of these. The current way we handle hyperparameters should be deprecated.
  • It is recommended to not do any parameter validation in __init__, but rather in fit itself. set_params is supposed to do exactly the same thing as __init__ with respect to parameters.
  • Init parameters should be written without changes as attributes. All generated attributes should have a trailing _.
  • There should be no mandatory parameters. The user should be able to run the learner without having to provide arguments.
  • Implement a score method. This is helpful, since hyperparameter optimizers call this function by default. Otherwise the user has to implement a custom one.
  • Implement clone methods for each learner.

Most of these changes are independent of each other and could be done using separate branches.
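A minimal sketch of what such an estimator skeleton could look like (the class, its parameter, and the score definition are purely illustrative, not the actual cs-ranking API):

import numpy as np
from sklearn.base import BaseEstimator

class RankerSketch(BaseEstimator):
    """Illustrative skeleton only, not an actual cs-ranking learner."""

    def __init__(self, alpha=1.0):
        # Init parameters are stored unchanged and not validated here;
        # BaseEstimator derives get_params/set_params from this signature.
        self.alpha = alpha

    def fit(self, X, Y):
        # Parameter validation happens in fit, not in __init__.
        if self.alpha <= 0:
            raise ValueError("alpha must be positive")
        # Attributes learned from data get a trailing underscore.
        self.weights_ = self.alpha * np.mean(X, axis=0)
        return self

    def predict_scores(self, X):
        return X @ self.weights_

    def score(self, X, Y):
        # A default scalar score so GridSearchCV/BayesSearchCV work without a custom scorer.
        return -float(np.mean((self.predict_scores(X) - Y) ** 2))

With this shape, sklearn.base.clone and the hyperparameter optimizers can generally be used without learner-specific glue code.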

Move scripts for the experiments to another repository

Currently, the scripts we use for our experiments are still part of this repository. Since this repository is moving towards being a library for object ranking and choice, this code should be moved to a separate repository.

Document the release process

We are using bump2version now to change the version number and create a tagged commit. This triggers an upload of the new version to PyPI.
At the same time the HISTORY.rst file needs to be updated with the recent changes.

This process should be documented in the Sphinx documentation.

The correct order is:

  1. Update HISTORY.rst and commit.
  2. Run bump2version [patch|minor|major]
  3. Push the commits and the tag to master (or the relevant branch).
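In shell terms, the sequence would look roughly like this (commit message and flags are illustrative):

git add HISTORY.rst
git commit -m "Update HISTORY.rst for release"
bump2version patch   # or: bump2version minor / bump2version major
git push && git push --tags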

Device placement is logged by default

We have a utility function configure_numpy_keras which is used in some of the experiment scripts:

def configure_numpy_keras(seed=42):
    tf.set_random_seed(seed)
    os.environ["KERAS_BACKEND"] = "tensorflow"
    devices = [x.name for x in device_lib.list_local_devices()]
    logger = logging.getLogger("ConfigureKeras")
    logger.info("Devices {}".format(devices))
    n_gpus = len([x.name for x in device_lib.list_local_devices() if x.device_type == 'GPU'])
    if n_gpus == 0:
        config = tf.ConfigProto(intra_op_parallelism_threads=1, inter_op_parallelism_threads=1,
                                allow_soft_placement=True, log_device_placement=False,
                                device_count={'CPU': multiprocessing.cpu_count() - 2})
    else:
        config = tf.ConfigProto(allow_soft_placement=True,
                                log_device_placement=True, intra_op_parallelism_threads=2,
                                inter_op_parallelism_threads=2)  # , gpu_options = gpu_options)
    sess = tf.Session(config=config)
    K.set_session(sess)
    np.random.seed(seed)
    logger.info("Number of GPUS {}".format(n_gpus))

It does the following:

  • Sets the random seeds
  • Sets the KERAS_BACKEND to Tensorflow
  • Checks the number of GPUs and sets the Tensorflow options accordingly
  • Creates a Tensorflow session for Keras to use

There are a few issues (and maybe more) with this:

  • Everything is hardcoded to constants; making it configurable would be desirable.
  • log_device_placement is set to True, which can cause slowdowns due to logging and should be False by default.
  • It is not clear whether tensorflow_util.py is the correct location if the function is only ever used in the experiment scripts.
  • It is not documented.

Migrate Optimizer to BoTorch

Rationale

scikit-optimize is currently unmaintained, and BoTorch implements several features that make it very useful for our library (a minimal usage sketch follows below the list):

  • Proper handling of hyper priors (including sensible defaults), which should help stabilize our runs
  • Analytic and Monte-Carlo acquisition functions designed for noisy target functions
  • Batching of hyperparameter runs (allowing parallel execution)
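A minimal BoTorch loop on a toy 2-dimensional problem, as a sketch of the moving parts (the objective, sample sizes, and all settings are illustrative; depending on the BoTorch version the fitting helper is fit_gpytorch_mll or fit_gpytorch_model):

import torch
from botorch.models import SingleTaskGP
from botorch.fit import fit_gpytorch_mll
from botorch.acquisition import ExpectedImprovement
from botorch.optim import optimize_acqf
from gpytorch.mlls import ExactMarginalLogLikelihood

# Toy data: 10 observed hyperparameter configurations in [0, 1]^2.
train_X = torch.rand(10, 2, dtype=torch.double)
train_Y = -((train_X - 0.5) ** 2).sum(dim=-1, keepdim=True)  # toy objective to maximize

gp = SingleTaskGP(train_X, train_Y)
fit_gpytorch_mll(ExactMarginalLogLikelihood(gp.likelihood, gp))

acq = ExpectedImprovement(gp, best_f=train_Y.max())
bounds = torch.tensor([[0.0, 0.0], [1.0, 1.0]], dtype=torch.double)
candidate, _ = optimize_acqf(acq, bounds=bounds, q=1, num_restarts=5, raw_samples=32)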

Fix warnings in tests

We have many warnings in the tests, since we accidentally use invalid escape sequences (such as \in) in the docstrings.

Example:

"""
Construct the CmpNet which is used to approximate the :math:`U_1(x_i,x_j)`. For each pair of objects in
:math:`x_i, x_j \in Q` we construct two sub-networks with weight sharing in all hidden layers.
The output of these networks are connected to two sigmoid units that produces the outputs of the network,
i.e., :math:`U(x_1,x_2), U(x_2,x_1)` for each pair of objects are evaluated. :math:`U(x_1,x_2)` is a measure
of how favorable it is to choose :math:`x_1` over :math:`x_2`.
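The usual fix is to declare such docstrings as raw strings, so sequences like \in are not interpreted as escape characters. A sketch (the function name is made up):

def u1_documentation_example():
    r"""Raw docstring: backslashes in :math:`x_i, x_j \in Q` stay literal."""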

Document the new choice settings

Todo

  • Update the README.rst
  • Mirror the intro.rst
  • Update API:
    • Check old API reference and fix if necessary
    • Include new settings
  • Write example notebooks for both settings

Look into typing with mypy

See #129 (comment). We already declare many types in docstrings. Using mypy would require us to formalize this a bit more, with the added bonus of static guarantees and better tooling support (such as enhanced tab completion).
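As a small illustration of what that formalization could look like (the function and its signature are made up for this example, not taken from the codebase):

import numpy as np

def zero_one_match(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Fraction of instances whose predicted labels match exactly."""
    return float(np.mean(np.all(y_true == y_pred, axis=-1)))

mypy could then run as an additional static check next to the existing linters, e.g. mypy csrank.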

Support saving of models

Rationale

When training the models on different datasets, it would be advantageous to be able to save a model as-is to a file. That way it is easy to load the model later and, e.g., evaluate it on new instances.
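For plain sklearn-style estimators, persisting a fitted model is typically a one-liner with joblib (shown below with a scikit-learn model as a stand-in); whether this works out of the box for our Keras/TF-backed learners, which hold sessions and compiled graphs, is exactly what would need to be checked:

import joblib
import numpy as np
from sklearn.linear_model import LinearRegression

X, y = np.random.rand(20, 3), np.random.rand(20)
model = LinearRegression().fit(X, y)

joblib.dump(model, "model.joblib")      # persist the fitted estimator
restored = joblib.load("model.joblib")  # later: load and reuse it
restored.predict(X)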

Check documentation style

We have recently added some static analysis and formatting tools. One thing we are not checking yet is the inline documentation. There are some tools out there, for example pydocstyle. It seems like pylint has some doc-checking functionality too.

I think it would be valuable to extend our static checks to the inline documentation. We already check stand-alone rst files with doc8.

Improve Tunable

The new Tunable class should be able to change the set of tunable parameters during runtime (currently it is a class method).
This would allow us to attach arbitrary numbers of parameters to a model (e.g. coming from callbacks etc).

  • Change Tunable to be nestable
    • Methods should be object methods
  • Change optimizer to use object methods

Potential problems

  • The fit function is only called after the optimizer already needs to know about the parameters.
  • Is it realistic for a model to have tunable parameters that depend on the model itself (and thus cannot be set in advance by the user)?
  • When the user provides tunable objects, ensure that they reset properly across iterations.

Potential solutions

  • Let optimizer handle an ordered dictionary of all the tunables.

AllPositive Choice Baseline does not predict anything for the non-variadic case

The _predict_scores_fixed method of the class AllPositive requires X and Y inputs:

def _predict_scores_fixed(self, X, Y, **kwargs):
    return np.zeros_like(Y) + Y.mean()

In the variadic case, it is called with X and Y:

scores[ranking_size] = self._predict_scores_fixed(
    x, Y[ranking_size], **kwargs
)

However, it is called without Y in the predict_scores method for the non-variadic case:

scores = self._predict_scores_fixed(X, **kwargs)

This leads to a None prediction.

FATE Choice ignores parameters of the fit method

The fit method parameters are:

def fit(
    self,
    X,
    Y,
    epochs=35,
    inner_epochs=1,
    callbacks=None,
    validation_split=0.1,
    verbose=0,
    global_lr=1.0,
    global_momentum=0.9,
    min_bucket_size=500,
    refit=False,
    tune_size=0.1,
    thin_thresholds=1,
    **kwargs,
):

However, the parameters are not passed on to the superclass:

super().fit(X_train, Y_train, **kwargs)

super().fit(X, Y, **kwargs)

Write docstrings for dataset generators

Dataset generators like

class ChoiceDatasetGenerator(SyntheticDatasetGenerator):
    def __init__(self, dataset_type='pareto', **kwargs):
        super(ChoiceDatasetGenerator, self).__init__(
            learning_problem=CHOICE_FUNCTION, **kwargs)
        dataset_function_options = {'linear': self.make_latent_linear_choices,
                                    "pareto": self.make_globular_pareto_choices}
        if dataset_type not in dataset_function_options.keys():
            dataset_type = "pareto"
        self.dataset_function = dataset_function_options[dataset_type]

inherit the docstring of the parent class, which is not very informative.

Speed up Travis-CI builds

Currently, the tests for the probabilistic models implemented in PyMC3 take a long time to run, which causes a long delay between updating a pull request and receiving Travis confirmation.

Measures

  • Parallelize build using several environments
  • Speed up PyMC3 tests
  • Speed up installation process and optimize caching (if possible)

Prepare package for PyPI/Anaconda

Rationale

For users it is much more convenient to be able to install the package from PyPI using a simple

pip install cs-ranking

or

conda install -c conda-forge cs-ranking

rather than checking out the repository.

What needs to be done

Make auxiliary dependencies optional

Currently we require quite a few dependencies, which makes installing the library difficult. I went ahead and categorized the different dependencies (see below). We should do the following:

  1. Remove these dependencies from install_requires
  2. Move these to extras_require (a setup() sketch follows after the dependency list below)

Dependencies

  • Core (mandatory):
    • numpy
    • scipy
    • scikit-learn
    • scikit-optimize
    • joblib
    • keras
    • tensorflow
    • docopt
  • Data I/O or generation (optional):
    • psycopg2-binary for database access
    • pandas
    • h5py
    • pygmo
  • Required for some of the probabilistic models (optional):
    • pymc3
    • theano
  • Nice to haves (optional):
    • tqdm for progress bars
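A sketch of how the split could look in setup.py (the extra names and the exact grouping are up for discussion, not final):

from setuptools import setup, find_packages

setup(
    name="csrank",
    packages=find_packages(),
    install_requires=["numpy", "scipy", "scikit-learn", "scikit-optimize",
                      "joblib", "keras", "tensorflow", "docopt"],
    extras_require={
        "probabilistic": ["pymc3", "theano"],
        "data": ["psycopg2-binary", "pandas", "h5py", "pygmo"],
    },
)

Users would then opt in via, e.g., pip install cs-ranking[probabilistic].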

Importing csrank causes error without optional theano installed

Simply running

import csrank

after installing csrank from pip without any optional dependencies, results in the following error:

csrank.util.MissingExtraError: Could not import the optional dependency theano. Please install it or specify the "probabilistic" extra when installing this package.

This should definitely not happen.

Version: 1.2.0

Check if dataset generator exists and otherwise raise an exception

Currently, if the dataset generator receives an unknown dataset type, it silently falls back to a default generator.

if dataset_type not in dataset_function_options.keys():
    dataset_type = "medoid"

This is unexpected behavior and should be changed. If the dataset generator is unknown, an exception should be raised.
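A sketch of what the check could look like (the error message wording is illustrative):

if dataset_type not in dataset_function_options:
    raise ValueError(
        "Unknown dataset type '{}'. Available options: {}".format(
            dataset_type, sorted(dataset_function_options)
        )
    )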

We should check all dataset generators/parsers for similar behavior.

Callbacks

  • Bug in LRScheduler of callbacks
  • Create LRScheduler and EarlyStopping independently of the Keras implementation to avoid such issues.
    One way would be to inherit from Keras's Callback class.

There is an issue with the current LRScheduler: it decreases the learning rate exponentially at every epoch after epoch_drop, instead of only once every epoch_drop epochs (a possible step-decay schedule is sketched after the log excerpts below).
For a learning rate of 0.015, epoch_drop = 5, and a drop percentage of 0.9, the current output is:

Epoch 00001: LearningRateScheduler setting learning rate to 0.014999999664723873.
Epoch 00005: LearningRateScheduler setting learning rate to 0.013499999698251487.
Epoch 00006: LearningRateScheduler setting learning rate to 0.012149999476969242.
Epoch 00007: LearningRateScheduler setting learning rate to 0.010934999864548446.
Epoch 00008: LearningRateScheduler setting learning rate to 0.009841500129550696
Epoch 00009: LearningRateScheduler setting learning rate to 0.00885734986513853.
Epoch 00010: LearningRateScheduler setting learning rate to 0.007174453390762211
Epoch 00011: LearningRateScheduler setting learning rate to 0.005811307118274272

While the output should be:

Epoch 00001: LearningRateScheduler setting learning rate to 0.014999999664723873.
Epoch 00005: LearningRateScheduler setting learning rate to 0.013499999698251487.
Epoch 00010: LearningRateScheduler setting learning rate to 0.012149999728426338.
Epoch 00015: LearningRateScheduler setting learning rate to 0.010934999755583704.
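One way to get the intended step-wise decay with the stock Keras callback, as a sketch (the default values mirror the example above; this is not the repository's current implementation):

import math
from keras.callbacks import LearningRateScheduler

def make_step_decay(initial_lr=0.015, drop=0.9, epochs_drop=5):
    # The learning rate is only multiplied by `drop` once every `epochs_drop`
    # epochs, instead of at every single epoch after the first drop.
    def schedule(epoch):
        return initial_lr * math.pow(drop, math.floor(epoch / epochs_drop))
    return schedule

lr_callback = LearningRateScheduler(make_step_decay(), verbose=1)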

The f-measure is ill-defined when there are no true positives or no positive predictions

sklearn issues a warning during the tests:

sklearn.exceptions.UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in samples with no predicted labels.

This is because

  • some of the test samples generated in csrank/tests/test_choice_functions.py:trivial_choice_problem have no true positives
  • some of the learners predict no positives for some of the generated problems

In both of those cases the f-measure is not properly defined. sklearn assigns 0 and 1 respectively.

How should we deal with this? A metric should be defined for these possibilities. 0 and 1 in those cases seems somewhat reasonable, so maybe we should just silence the warning?
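If silencing is the chosen route, newer scikit-learn versions (0.22+) expose a zero_division argument that sets the value explicitly without emitting the warning (a minimal illustration):

from sklearn.metrics import f1_score

y_true = [0, 0, 0]   # no true positives
y_pred = [0, 0, 0]   # no positive predictions
f1_score(y_true, y_pred, zero_division=1)  # returns 1.0, no UndefinedMetricWarning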

Add a `pyproject.toml`

The current standard for specifying the build of a Python project, its dependencies, and the configuration of various tools is pyproject.toml. It may be a good idea to adopt it. There are still some holdout tools that do not read their configuration from it. The biggest benefit is being able to specify the build system without needing to run a Python program (which may itself have dependencies).
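As an example, the build-system part alone would look roughly like this for a setuptools-based project (a sketch; version pins are omitted):

[build-system]
requires = ["setuptools", "wheel"]
build-backend = "setuptools.build_meta"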

Use a semantic linter for the python code

Now that we use a linter for formatting (#78), we could also use a semantic linter. Two common options are flake8, which is more conservative (fewer reports), and pylint. We would first need to address the issues those linters raise. For example, for flake8 (ignoring line length, since black takes care of that and disagrees with flake8's limit):

$ flake8 **/*.py | grep -v 'line too long' | wc -l
36

Fixing those will likely improve the code anyway.

Implement Tests

  • Learners
    • Object Ranking
    • Discrete Choice
    • General Choice Functions
  • Dataset Parsers
  • Metrics
  • Losses
