dwave-examples / qboost
Solve a binary classification problem with Qboost
License: Apache License 2.0
When running this example with Ocean 4 in the Leap IDE, the following message appears:
Installing collected packages: tabulate, scikit-learn, matplotlib
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
imbalanced-learn 0.8.1 requires scikit-learn>=0.24, but you have scikit-learn 0.23.1 which is incompatible.
Successfully installed matplotlib-3.3.4 scikit-learn-0.23.1 tabulate-0.8.9
In qboost.py, the WeakClassifiers class is misleading because its fit and predict methods actually implement the AdaBoost algorithm. In demo.py, this method is then compared with AdaBoost from sklearn. In the demo.py output, "Adaboost" refers to the sklearn implementation and "Decision tree" refers to the AdaBoost implementation from the WeakClassifiers class. As far as I can tell, the only real difference between the two is that the demo runs the WeakClassifiers AdaBoost method with a tree depth of 3, whereas the call to sklearn's AdaBoost model uses the default, which is a tree depth of 1.
Need to review the best way to address this. Some points to consider:
WeakClassifiers
Application
It is not obvious how to calculate ROC_AUC and PR_AUC (i.e., the ROC and precision-recall curves) from the classifiers WeakClassifiers, QBoostClassifier, and QboostPlus.
Proposed Solution
We could implement a predict_proba method for each classifier, analogous to the predict_proba method from scikit-learn.
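A minimal sketch of what such a method could look like for a weighted-vote ensemble. This is hypothetical code, not the repo's classes: the weak_classifiers and weights attributes, and the linear mapping of the vote score to [0, 1], are all assumptions.

```python
import numpy as np

class WeightedVoteEnsemble:
    """Toy stand-in for an ensemble such as QBoostClassifier.

    Assumes weak learners predict labels in {-1, +1} and the ensemble
    combines their votes with non-negative weights.
    """

    def __init__(self, weak_classifiers, weights):
        self.weak_classifiers = weak_classifiers
        self.weights = np.asarray(weights, dtype=float)

    def decision_function(self, X):
        # Weighted sum of weak-learner votes, normalized to [-1, 1].
        votes = np.array([clf(X) for clf in self.weak_classifiers])
        return self.weights @ votes / self.weights.sum()

    def predict_proba(self, X):
        # Map the [-1, 1] score to a probability-like value in [0, 1],
        # mimicking scikit-learn's (n_samples, 2) output shape.
        p_pos = (self.decision_function(X) + 1.0) / 2.0
        return np.column_stack([1.0 - p_pos, p_pos])

# Two trivial threshold rules on 1-D data stand in for weak classifiers.
clfs = [lambda X: np.where(X[:, 0] > 0, 1, -1),
        lambda X: np.where(X[:, 0] > 1, 1, -1)]
ens = WeightedVoteEnsemble(clfs, [1.0, 1.0])
X = np.array([[-1.0], [0.5], [2.0]])
proba = ens.predict_proba(X)  # rows sum to 1
```

The second column of the result could then be fed directly to sklearn.metrics.roc_auc_score or precision_recall_curve.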
As applied to the given demonstration problems, the QBoost algorithm does essentially nothing. AdaBoost is actually run first to pre-select a set of weak classifiers, which are then provided to the QBoost algorithm. With the current settings and demonstration problems, the results are simply that QBoost includes all of the classifiers that AdaBoost provides to it (confirm by seeing the list of "1" weights in the output for QBoost). In other words, for these problems, QBoost could be replaced by the following anti-algorithm:
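A sketch of that trivial replacement (hypothetical code, not from the repo), grounded in the observation above that QBoost returns weight 1 for every pre-selected classifier:

```python
def anti_qboost(weak_classifiers):
    """Trivial stand-in for QBoost under the current demo settings:
    keep every weak classifier AdaBoost pre-selected, with weight 1."""
    return [1] * len(weak_classifiers)

weights = anti_qboost(["stump_a", "stump_b", "stump_c"])
# matches the all-ones weight vector seen in the demo output
```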
Note that the screen output is misleading in terms of the method comparisons. For example, runs with the "mnist" data set often show that "QBoost" is performing better than Adaboost. This is misleading because what is labeled Adaboost is actually sklearn Adaboost running with weak decision tree classifiers at depth 1, whereas QBoost is pulling from a custom Adaboost implementation that uses weak decision tree classifiers with depth 3 (confusingly, this custom Adaboost implementation is labeled "DecisionTree" in the output; see Issue #13).
The key questions are what weak classifiers should be considered in the QBoost algorithm, and what actually is the definition of the QBoost algorithm? The README refers to the earlier 2008 paper (https://arxiv.org/pdf/0811.0416.pdf). Aside from not actually using the QBoost terminology, this paper presents the algorithm as drawing from all possible "degree 1 and 2 decision stumps" (basically decision rules using either a single variable or a product of two variables). As described, this produces a large number of variables if doing a one-shot global optimization: 930 variables for the 30-feature case and many more for the 784-feature case. Compare these numbers to the 35 variables being currently used because QBoost is being fed weak classifiers pre-selected by AdaBoost.
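The 930 figure is consistent with counting, for n features, all single-variable stumps plus all two-variable products, with two sign orientations each. This reconstruction of the count is my inference, not something stated explicitly in the README:

```python
def stump_count(n_features):
    # Degree-1 stumps: one per feature; degree-2: one per feature pair.
    # The factor of 2 assumes each stump appears with both sign orientations.
    singles = n_features
    pairs = n_features * (n_features - 1) // 2
    return 2 * (singles + pairs)

print(stump_count(30))   # 930, matching the 30-feature case
print(stump_count(784))  # 615440 candidate variables for the 784-feature case
```

Either way, the one-shot global optimization is orders of magnitude larger than the 35 variables the demo currently hands to QBoost.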
A different version of the method is described in the more recent 2012 paper (http://proceedings.mlr.press/v25/neven12/neven12.pdf), which introduces the name QBoost. This paper reduces the problem size by using "inner" and "outer" loops that pre-select the weak classifiers as detailed by Algorithms 1 and 2. Note that neither paper discusses using AdaBoost to pre-select the weak classifiers, and if one is going to do that, it is unclear what the motivation for QBoost is (the 2008 paper contrasts QBoost as a "global optimization" vs the "greedy" AdaBoost method, but this is nullified if we simply use AdaBoost first).
In conclusion, my suggestions are the following:
Both the 2008 and 2012 papers show improved performance relative to AdaBoost when using the same pool of weak classifiers. It would be nice to illustrate that through this demo as well.
Looks like scipy is not available.
Running the examples and exploring the output weights. What is really happening?
It appears that nothing logically substantial is happening on the D-Wave side. After running the examples, the D-Wave output weights are all 1's ([1 1 1...]), which means nothing is really happening here; there is no value in this. Repeated executions produce varied outcomes (due to RNG initialization), yet the D-Wave final weights remain the same. This provides no added value over the classical model for training.
The readme.md file states that:
This code demonstrates the use of the D-Wave system to solve a binary classification problem using the Qboost algorithm.
However, there is no clear alignment between this claim and the program's output.
python demo.py --wisc
python demo.py --mnist
There does not appear to be a mistake here; however, the value of D-Wave is unclear. We can consider the following: the demo.py example isn't worthwhile and can't demonstrate D-Wave/quantum value in AI/ML advancements (would another example work?), and the output weights are [1 1 1...], which represents no value in the model.

When run with the latest version (scikit-learn==0.22.1), the demo fails with:
$ python demo.py --mnist
======================================
Train#: 3333, Test: 1667
Num weak classifiers: 35
Tree depth: 3
Traceback (most recent call last):
  File "demo.py", line 196, in <module>
    clfs = train_model(X_train, y_train, X_test, y_test, 1.0)
  File "demo.py", line 80, in train_model
    X_train = centerer.fit_transform(X_train)
  File "/usr/local/lib/python3.7/site-packages/sklearn/base.py", line 571, in fit_transform
    return self.fit(X, **fit_params).transform(X)
  File "/usr/local/lib/python3.7/site-packages/sklearn/preprocessing/_data.py", line 2033, in fit
    .format(K.shape[0], K.shape[1]))
ValueError: Kernel matrix must be a square matrix. Input is a 3333x784 matrix.
There are several issues with the current preprocessing that is done in demo.py. Currently there are calls to both the sklearn StandardScaler and the Normalizer, although as currently written only the Normalizer is used (it overwrites the preceding calls to StandardScaler). However, Normalizer is not appropriate in this context, as it operates by row, re-scaling each individual sample separately, independently of all the others. Generally speaking, the StandardScaler would be more appropriate here (it scales by column, i.e., by feature), but in demo.py all of the code ultimately uses decision tree classifiers, so re-scaling the features would have no effect on the results. (Incidentally, the current usage of StandardScaler in the code, even though it is overwritten by the normalizer, is not correct: the test data should be transformed via scaler.transform, not scaler.fit_transform, which re-computes the transformation based on the test data.)
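The fit/transform distinction can be illustrated without scikit-learn; this sketch mimics StandardScaler semantics (per-feature mean and standard deviation computed on the training data only; the data values are arbitrary examples):

```python
import numpy as np

# Mimic StandardScaler: statistics must come from the training data only.
X_train = np.array([[1.0, 10.0], [3.0, 30.0], [5.0, 50.0]])
X_test = np.array([[3.0, 20.0]])

mean = X_train.mean(axis=0)  # "fit" on the training set
std = X_train.std(axis=0)

X_train_scaled = (X_train - mean) / std  # fit_transform on train
X_test_scaled = (X_test - mean) / std    # transform (NOT fit_transform) on test
# Re-fitting on X_test would compute new statistics from the test sample
# itself, instead of placing it on the training set's scale.
```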
The main takeaway is that both the normalizer and standard scaler preprocessing should be removed from demo.py, leaving a comment that preprocessing is not necessary because all of the weak classifiers are based on decision trees.