kiudee / cs-ranking
Context-sensitive ranking and choice in Python with PyTorch
Home Page: https://cs-ranking.readthedocs.io
License: Apache License 2.0
Currently, variables like learning_rate are specified as tunables, and it is not possible to continue training with a lower learning rate. Thoughts?
It would be nice to check our documentation before committing / merging. Now that we're making use of the pre-commit framework, this should be easy. Options include
We just need to make sure our docs pass the linter first. This issue might be relevant regarding warnings.
In csrank/discretechoice/nested_logit_model.py we make use of sequence indexing in theano multiple times. For example:
Here rows and cols are both 1d tensors which are then used to index a different tensor:
Theano complains that this is deprecated:
FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.
I'm not sure how to fix this. According to the documentation, theano should support boolean mask indexing. So I thought we should be able to do
mask = tt.eq(self.y_nests, i)
utility = tt.set_subtensor(utility[mask], tt.dot(self.Xt[mask], weights[i]))
instead (as tt.eq should return a boolean mask). But unfortunately that doesn't work; it gives the same warning.
Any ideas?
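One possible workaround (a sketch, not tested against this model): convert the boolean mask into integer index tensors with .nonzero(), or wrap explicit index sequences in a tuple, so theano no longer sees a non-tuple sequence index:

mask = tt.eq(self.y_nests, i)
idx = mask.nonzero()  # tuple of integer index tensors instead of a boolean/sequence index
utility = tt.set_subtensor(utility[idx], tt.dot(self.Xt[idx], weights[i]))

For the rows/cols case, indexing with an explicit tuple, i.e. utility[(rows, cols)] instead of utility[[rows, cols]], should likewise avoid the FutureWarning.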
There has been some internal discussion about this, but I think it's time to also open an issue about it. We are still using tensorflow 1, which has been outdated for a while now. Switching to tensorflow 2 would be a significant effort, since the underlying model fundamentally changed (there is no explicit graph construction anymore). At that point, it may be worth evaluating a switch to pytorch instead. pytorch is a newer, very popular autodiff framework.
This article comes to the conclusion that
TensorFlow is still mentioned in many more job listings than PyTorch, but the gap is closing. PyTorch has taken the lead in usage in research papers at top conferences and almost closed the gap in Google search results. TensorFlow remains three times more common in usage according to the most recent Stack Overflow Developer Survey.
Here's another relevant article. Overall it seems to me that pytorch is the more future-proof choice, and if we're going to have to rewrite a lot of the code anyway, we might as well switch. I do not have any practical experience with pytorch yet though; that's just what I could determine from others' opinions and first impressions.
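For illustration, a minimal sketch of what eager-mode training looks like in pytorch (toy model and data, nothing to do with our architectures); there is no separate graph or session to manage:

import torch
from torch import nn

X = torch.randn(32, 10)
y = torch.randn(32, 1)
model = nn.Linear(10, 1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

for epoch in range(5):
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(X), y)
    loss.backward()   # gradients are computed on the fly
    optimizer.step()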
We should also think about how we want to do the transition. This is a major undertaking and will probably take a while. Should we support tf1 and newthing in parallel? Gradually move models to newthing (thereby having mixed support)? Fork the project? Work on one big PR/branch, effectively blocking most other work for the time being due to potential conflicts?
Currently we import all submodules in __init__:
Lines 3 to 9 in 55396fb
This results in the user seeing a confusing list of submodules. We should trim that by using __all__ to only import important classes and functions. The remaining modules are still available, but hidden by default.
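A sketch of how the trimmed __init__.py could look; the re-exported names and import paths here are only examples of the pattern, not a final selection:

# csrank/__init__.py (sketch)
from csrank.objectranking import FATEObjectRanker    # example re-export; actual paths may differ
from csrank.choicefunction import FATEChoiceFunction

# Only these names are exposed via `from csrank import *` and in tab completion;
# the submodules themselves remain importable explicitly.
__all__ = ["FATEObjectRanker", "FATEChoiceFunction"]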
nDCG expects relevance scores as input. When supplying rankings, we first have to convert each ranking into a set of ordered scores. This can quickly lead to numerical problems, due to the exponential growth of 2 ** s. test_metrics should be updated to account for the conversion.
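A minimal illustration of the problem, assuming the conversion sets an object's score to its reversed rank and then computes 2 ** s:

import numpy as np

n_objects = 2000
ranking = np.arange(n_objects)                 # 0 = best rank
scores = (n_objects - ranking).astype(float)
relevance = 2.0 ** scores                      # overflows to inf for scores above ~1023
print(np.isinf(relevance).any())               # True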
During the tests, numpy complains about a "mean of empty slice". That happens because the calculation of the spearman correlation filters the labels it applies to as follows:
cs-ranking/csrank/metrics_np.py
Line 24 in ba03234
And then averages its results:
cs-ranking/csrank/metrics_np.py
Line 29 in ba03234
Which may be empty (or consist of only NaNs) due to the previous filter. What is the intention behind that filter?
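For reference, a minimal reproduction of the warning, assuming the averaging step uses np.nanmean over the filtered values:

import numpy as np

filtered = np.array([np.nan, np.nan])  # everything removed or NaN after the filter
np.nanmean(filtered)                   # RuntimeWarning: Mean of empty slice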
CC @prithagupta
See #118 (comment) for details. Should be resolved after merging #118 to avoid conflicts.
Currently we only document the fit function.
We need the following notebooks:
Most of the learners implemented in cs-ranking already implement an interface similar to the one described in https://scikit-learn.org/stable/developers/develop.html#rolling-your-own-estimator, i.e., we usually have a fit and predict method implemented.
For users to be able to use all learners effortlessly in a scikit-learn pipeline.Pipeline or to apply model_selection.GridSearchCV, we should make sure that all additional requirements are also fulfilled:
- Provide get_params and set_params to set parameters. This is important, since GridSearchCV or BayesSearchCV call set_params for hyperparameter optimization. sklearn.base.BaseEstimator implements basic versions of these. The current way we handle hyperparameters should be deprecated.
- Parameters should not be processed in __init__, but rather in fit itself. set_params is supposed to do exactly the same thing as __init__ with respect to parameters.
- Attributes estimated during fit should get a trailing underscore _.
- Provide working clone methods for each learner.
Most of these changes are independent of each other and could be done using separate branches.
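A minimal sketch of the conventions above; the learner and its parameters are placeholders, not an existing csrank class:

from sklearn.base import BaseEstimator

class ExampleRanker(BaseEstimator):
    """Hypothetical learner following the scikit-learn conventions."""

    def __init__(self, n_hidden=2, learning_rate=1e-3):
        # __init__ only stores the parameters; no validation or computation here.
        self.n_hidden = n_hidden
        self.learning_rate = learning_rate

    def fit(self, X, Y):
        # All real work happens in fit; learned attributes get a trailing underscore.
        self.model_ = None  # placeholder for the actual model construction
        return self

# get_params/set_params are inherited from BaseEstimator, so e.g.
# GridSearchCV(ExampleRanker(), {"learning_rate": [1e-2, 1e-3]}) can clone and configure it.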
Currently, the scripts we use for our experiments are still part of this repository. Since this repository is moving towards being a library for object ranking and choice, this code should be moved to a separate repository.
We are using bump2version now to change the version number and create a tagged commit. This triggers an upload of the new version to PyPI. At the same time the HISTORY.rst file needs to be updated with the recent changes.
This process should be documented in Sphinx.
The correct order is:
1. Update HISTORY.rst and commit.
2. Run bump2version [patch|minor|major].
We have a utility function configure_numpy_keras which is used in some of the experiment scripts:
cs-ranking/csrank/tensorflow_util.py
Lines 40 to 58 in a635d59
It does the following:
- Sets the KERAS_BACKEND environment variable to Tensorflow.
There are a few issues (and maybe more) with this:
- log_device_placement is set to True, which can cause slowdowns due to logging and should be False by default.
- It is questionable whether tensorflow_util.py is the correct location, if the function is only ever used in experiments.

scikit-optimize is currently not maintained anymore and BoTorch implements several features making it very useful for our library:
While working on fixing #126, I noticed that our online docs for choice functions are broken:
Notice that there are no details other than the name. The choice functions don't link to any additional documentation either. The docs for our other types of estimators work as expected.
The documentation at https://kiudee.github.io/cs-ranking/ returns a 404 error.
It appears our changes to travis-ci must have caused a problem and the documentation is not updated correctly.
Maintaining a running average of weights has been shown to improve generalization.
We should evaluate the effectiveness for our architecture(s).
Paper: Averaging Weights Leads to Wider Optima and Better Generalization
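If and when we are on pytorch, its built-in SWA utilities would be a natural starting point; a rough sketch with a toy model (the schedule values are placeholders, and our current Keras models would need an equivalent custom callback instead):

import torch
from torch import nn
from torch.optim.swa_utils import AveragedModel, SWALR

X, y = torch.randn(64, 10), torch.randn(64, 1)
model = nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
swa_model = AveragedModel(model)              # keeps the running average of the weights
swa_scheduler = SWALR(optimizer, swa_lr=0.005)
swa_start = 10

for epoch in range(20):
    optimizer.zero_grad()
    nn.functional.mse_loss(model(X), y).backward()
    optimizer.step()
    if epoch >= swa_start:
        swa_model.update_parameters(model)    # update the weight average
        swa_scheduler.step()

# Predictions would then use swa_model (after recomputing batch-norm statistics if applicable).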
We have many warnings in the tests, since we accidentally escape characters in the docstrings.
Example:
cs-ranking/csrank/core/cmpnet_core.py
Lines 69 to 74 in 6c4c30c
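The usual fix is to make such docstrings raw strings, so that backslashes (e.g. in :math: expressions) are no longer interpreted as escape sequences; a sketch:

def fit(self, X, Y, **kwargs):
    r"""Fit the model.

    Because this is a raw docstring, backslashes such as :math:`\alpha` do not
    trigger "invalid escape sequence" warnings.
    """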
Since the last release, we have improved the UX of our API, fixed unexpected behaviour, fixed a few bugs and did some refactoring. That's just off the top of my head, there are probably other things as well.
We should think about a new release.
See #129 (comment). We already declare many types in docstrings. Using mypy would require us to formalize this a bit more, with the added bonus of static guarantees and better tooling support (such as enhanced tab completion).
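As a small example of what that formalization could look like (the class and signature are illustrative, not the current API):

import numpy as np
from typing import Optional

class ExampleRanker:
    def fit(self, X: np.ndarray, Y: np.ndarray, epochs: int = 10,
            callbacks: Optional[list] = None) -> "ExampleRanker":
        # The types now live in the signature, where mypy and editors can use them.
        return self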
To prevent issues like #137 we should have a CI check that tries to import csrank with only the minimal dependencies.
When training the models on different datasets, it would be advantageous to be able to save the model as is to a file. That way it is easy to later load the model and e.g. evaluate it on new instances etc.
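For learners that are plain picklable Python objects, joblib would already cover this; a sketch (assuming the fitted learner `ranker` is picklable, which Keras/Theano-backed models may not be without extra handling of their backend state):

from joblib import dump, load

dump(ranker, "ranker.joblib")   # after training
ranker = load("ranker.joblib")  # later, e.g. in a different process
# scores = ranker.predict_scores(X_new)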
We have recently added some static analysis and formatting tools. One thing we are not checking for yet is inline documentation. There are some tools out there, for example pycodestyle. It seems like pylint has some doc checking functionality too.
I think it would be valuable to extend our static checks to the inline documentation. We already check stand-alone rst files with doc8.
The new Tunable class should be able to change the set of tunable parameters during runtime (currently it is a class method).
This would allow us to attach arbitrary numbers of parameters to a model (e.g. coming from callbacks etc).
The _predict_scores_fixed method of the class AllPositive requires X and Y inputs:
cs-ranking/csrank/choicefunction/baseline.py
Lines 23 to 24 in 49e39df
In the variadic case, it is called with X and Y:
cs-ranking/csrank/choicefunction/baseline.py
Lines 50 to 51 in 49e39df
However, it is called without Y in the predict_scores method for the non-variadic case:
This leads to a None prediction.
The fit method parameters are:
cs-ranking/csrank/choicefunction/fate_choice.py
Lines 134 to 150 in 49e39df
However the parameters are not passed to the super class:
The current testing setup is a source of frequent problems and should be simplified.
Dataset generators inherit the docstring of the parent class, which is not very informative.
Currently, the tests for the probabilistic models implemented in PyMC3 take a long time to run, which causes a long delay between updating a pull request and receiving Travis confirmation.
Currently, as soon as one of the parallel envs is finished, travis immediately tries to deploy to gh-pages and PyPI. This is obviously not desirable.
We should split the build into two stages 'build' and 'deploy' as described here:
https://docs.travis-ci.com/user/build-stages/matrix-expansion/
For users it is much more convenient to be able to install the package from PyPI using a simple
pip install cs-ranking
or
conda install -c conda-forge cs-ranking
rather than checking out the repository.
Currently we require quite a few dependencies, which makes installing the library difficult. I went ahead and categorized the different dependencies (see below). We should do the following:
- Keep only the core dependencies in install_requires.
- Move the optional dependencies to extras_require.
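A sketch of the resulting setup.py layout; the grouping and package lists below are placeholders, not a worked-out proposal:

from setuptools import setup, find_packages

setup(
    name="csrank",
    packages=find_packages(),
    # hard requirements needed for a bare `import csrank`
    install_requires=["numpy", "scipy", "scikit-learn"],
    # optional feature groups, installed e.g. via `pip install cs-ranking[probabilistic]`
    extras_require={
        "probabilistic": ["pymc3", "theano"],
        "data": ["pandas", "h5py"],
    },
)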
Simply running import csrank after installing csrank from pip without any optional dependencies results in the following error:
csrank.util.MissingExtraError: Could not import the optional dependency theano. Please install it or specify the "probabilistic" extra when installing this package.
This should definitely not happen.
Version: 1.2.0
Currently, if the dataset generator receives an invalid dataset, it silently picks a default generator.
We should check all dataset generators/parsers for similar behavior.
A bug like #126 could have been caught by a static check for unused variables. We should think about using (at least some of) pylint's checks.
There is an issue with the current LrScheduler. The problem is that it exponentially decreases the learning rate at each epoch after the epoch_drop.
For a learning rate of 0.015, epoch_drop=5 and drop percentage=0.9, the output is:
Epoch 00001: LearningRateScheduler setting learning rate to 0.014999999664723873.
Epoch 00005: LearningRateScheduler setting learning rate to 0.013499999698251487.
Epoch 00006: LearningRateScheduler setting learning rate to 0.012149999476969242.
Epoch 00007: LearningRateScheduler setting learning rate to 0.010934999864548446.
Epoch 00008: LearningRateScheduler setting learning rate to 0.009841500129550696
Epoch 00009: LearningRateScheduler setting learning rate to 0.00885734986513853.
Epoch 00010: LearningRateScheduler setting learning rate to 0.007174453390762211
Epoch 00011: LearningRateScheduler setting learning rate to 0.005811307118274272
While the output should be:
Epoch 00001: LearningRateScheduler setting learning rate to 0.014999999664723873.
Epoch 00005: LearningRateScheduler setting learning rate to 0.013499999698251487.
Epoch 00010: LearningRateScheduler setting learning rate to 0.012149999728426338.
Epoch 00015: LearningRateScheduler setting learning rate to 0.010934999755583704.
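For reference, a sketch of a step decay that only drops every epochs_drop epochs, as typically used with keras.callbacks.LearningRateScheduler (the exact 0- vs 1-based epoch convention would need to match our logs):

import math

initial_lr = 0.015
drop = 0.9
epochs_drop = 5

def step_decay(epoch):
    # Constant within each block of `epochs_drop` epochs, then multiplied by `drop`.
    return initial_lr * math.pow(drop, math.floor(epoch / epochs_drop))

# keras.callbacks.LearningRateScheduler(step_decay, verbose=1)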
sklearn issues a warning during the tests:
sklearn.exceptions.UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in samples with no predicted labels.
This is because some samples in csrank/tests/test_choice_functions.py:trivial_choice_problem have no predicted labels or have no true positives. In both of those cases the f-measure is not properly defined; sklearn assigns 0 and 1 respectively.
How should we deal with this? A metric should be defined for these possibilities. Returning 0 and 1 in those cases seems somewhat reasonable, so maybe we should just silence the warning?
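If we decide the 0/1 convention is acceptable, silencing that specific warning is straightforward, e.g. in the test setup:

import warnings
from sklearn.exceptions import UndefinedMetricWarning

warnings.filterwarnings("ignore", category=UndefinedMetricWarning)

Newer scikit-learn versions also accept a zero_division argument on f1_score and related metrics, which makes the chosen convention explicit and avoids the warning altogether.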
The current standard for specifying the build of a python project, its dependencies and the configuration of various tools is pyproject.toml. It may be a good idea to adopt this. There are still some holdouts among the tools. Specifying the build system without needing to run a python program (which may have dependencies itself) is the biggest benefit.
It appears from the failed build 158 that the following test can fail spuriously:
cs-ranking/csrank/tests/test_ranking.py
Line 75 in bb143bc
We should ensure that it only fails if the underlying model is broken.
If the given regularization is not "l1" or "l2", "l2" is chosen, which seems undesirable, as the parameter already has "l2" as the default value, so None values do not need to be accounted for.
cs-ranking/csrank/discretechoice/multinomial_logit_model.py
Lines 68 to 71 in 49e39df
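A possible fix (just a sketch) would be to fail loudly instead of silently falling back to "l2":

if regularization not in ("l1", "l2"):
    raise ValueError(
        f"Unknown regularization {regularization!r}; expected 'l1' or 'l2'."
    )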
Now that we use a linter for formatting (#78), we could also use a semantic linter. Two common options are flake8, which is more conservative (fewer reports), and pylint. We would first need to address the issues those linters raise. For example for flake8 (ignoring line length, since black takes care of that and disagrees with flake8's limit):
$ flake8 **/*.py | grep -v 'line too long' | wc -l
36
Fixing those will likely improve the code anyway.
Updating the version string when doing a new release is error-prone. Versioneer automatically determines the version string by querying git.
This allows us to simply tag a commit using a descriptive name like v1.1 or 1.1, and it will be applied automatically.
Being able to train the models from an instance generator would be very useful.