
adversarial-validation's Introduction

Adversarial validation

The 'santander' dir holds the scripts for the Santander competition:

distinguish_train_test.py - try to distinguish train/test set examples
validate.py - get validation AUC scores for logistic regression and random forest
predict.py - output test predictions from logistic regression and random forest
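The idea behind distinguish_train_test.py can be sketched roughly as follows: label every row by where it came from, then check how well a classifier separates the two groups. This is a minimal sketch on synthetic data, not the repository's code; the function name train_test_auc and the generated arrays are assumptions made for illustration. An AUC near 0.5 suggests train and test look alike, while an AUC well above 0.5 suggests they differ.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score


def train_test_auc(train_x, test_x, cv=5, seed=0):
    """Mean cross-validated AUC of a classifier asked to tell train rows from test rows."""
    x = np.vstack([train_x, test_x])
    # Origin label: 0 = training example, 1 = test example.
    is_test = np.concatenate([np.zeros(len(train_x), dtype=int),
                              np.ones(len(test_x), dtype=int)])
    # Shuffle so the folds do not follow the original row order.
    order = np.random.RandomState(seed).permutation(len(x))
    clf = LogisticRegression(max_iter=1000)
    scores = cross_val_score(clf, x[order], is_test[order],
                             scoring="roc_auc", cv=cv)
    return scores.mean()


# Synthetic stand-ins for real competition data (hypothetical).
rng = np.random.RandomState(0)
same = train_test_auc(rng.normal(size=(500, 5)),
                      rng.normal(size=(500, 5)))
shifted = train_test_auc(rng.normal(size=(500, 5)),
                         rng.normal(loc=1.0, size=(500, 5)))
print("same distribution AUC:    %.3f" % same)
print("shifted distribution AUC: %.3f" % shifted)
```

Logistic regression is used here only because the scripts above mention it; any classifier with a probability or decision output could be swapped in.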

Similarly, the 'numerai' dir contains the Numerai scripts:

distinguish_train_test.py - try to distinguish train/test set examples
sort_train.py - sort training examples by their similarity to test examples
validate_sorted.py - get validation scores using the most test-like examples
predict.py - output test predictions
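The sorting and validation steps listed above might look something like the sketch below. It assumes out-of-fold probabilities from a train/test classifier are used as the "similarity to test" score; the function name sort_by_test_likeness, the synthetic data and the validation size of 100 are all hypothetical and not taken from the repository.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict


def sort_by_test_likeness(train_x, test_x, cv=5, seed=0):
    """Return training row indices (least test-like first) and per-row scores."""
    x = np.vstack([train_x, test_x])
    is_test = np.concatenate([np.zeros(len(train_x), dtype=int),
                              np.ones(len(test_x), dtype=int)])
    order = np.random.RandomState(seed).permutation(len(x))
    clf = LogisticRegression(max_iter=1000)
    # Out-of-fold P(row comes from the test set): no row is scored by a
    # model that saw it during fitting.
    p_shuffled = cross_val_predict(clf, x[order], is_test[order],
                                   cv=cv, method="predict_proba")[:, 1]
    p = np.empty(len(x))
    p[order] = p_shuffled          # undo the shuffle
    p_train = p[:len(train_x)]     # keep scores for training rows only
    return np.argsort(p_train), p_train


# Synthetic stand-ins for real data (hypothetical).
rng = np.random.RandomState(0)
train_x = rng.normal(size=(400, 3))
test_x = rng.normal(loc=1.0, size=(400, 3))

idx, p_train = sort_by_test_likeness(train_x, test_x)
n_val = 100
val_idx = idx[-n_val:]   # most test-like training rows, held out for validation
fit_idx = idx[:-n_val]   # remaining rows, used for fitting
print("mean feature value, validation rows:", train_x[val_idx].mean())
print("mean feature value, fitting rows:   ", train_x[fit_idx].mean())
```

Validating on the most test-like rows is meant to give scores that track test performance more closely than a random split would when train and test differ.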

adversarial-validation's People

Contributors

zygmuntz

adversarial-validation's Issues

Running the steps in Python 3

Hi!
Great code and article!
Excellent idea :)
I am trying the code in a Jupyter notebook with Python 3 and I am not sure I am doing it right.
The last error I get is:

[screenshot: screenshot_2016-11-02_14-35-22]

Issue with running distinguish_train_test.py

Hi, I am having an issue running the above script. Below is the output.

andrewcz@andrewcz-PORTEGE-Z30t-B ~/adversarial-validation/numerai $ python distinguish_train_test.py
/home/andrewcz/miniconda3/lib/python3.5/site-packages/sklearn/cross_validation.py:44: DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the refactored classes and functions are moved. Also note that the interface of the new CV iterators are different from that of this module. This module will be removed in 0.20.
"This module will be removed in 0.20.", DeprecationWarning)
loading...
cross-validating logistic regression...
Traceback (most recent call last):
  File "distinguish_train_test.py", line 44, in <module>
    scoring = 'roc_auc', cv = 5, verbose = 1 )
  File "/home/andrewcz/miniconda3/lib/python3.5/site-packages/sklearn/cross_validation.py", line 1571, in cross_val_score
    for train, test in cv)
  File "/home/andrewcz/miniconda3/lib/python3.5/site-packages/sklearn/externals/joblib/parallel.py", line 758, in __call__
    while self.dispatch_one_batch(iterator):
  File "/home/andrewcz/miniconda3/lib/python3.5/site-packages/sklearn/externals/joblib/parallel.py", line 608, in dispatch_one_batch
    self._dispatch(tasks)
  File "/home/andrewcz/miniconda3/lib/python3.5/site-packages/sklearn/externals/joblib/parallel.py", line 571, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/home/andrewcz/miniconda3/lib/python3.5/site-packages/sklearn/externals/joblib/_parallel_backends.py", line 109, in apply_async
    result = ImmediateResult(func)
  File "/home/andrewcz/miniconda3/lib/python3.5/site-packages/sklearn/externals/joblib/_parallel_backends.py", line 326, in __init__
    self.results = batch()
  File "/home/andrewcz/miniconda3/lib/python3.5/site-packages/sklearn/externals/joblib/parallel.py", line 131, in __call__
    return [func(*args, **kwargs) for func, args, kwargs in self.items]
  File "/home/andrewcz/miniconda3/lib/python3.5/site-packages/sklearn/externals/joblib/parallel.py", line 131, in <listcomp>
    return [func(*args, **kwargs) for func, args, kwargs in self.items]
  File "/home/andrewcz/miniconda3/lib/python3.5/site-packages/sklearn/cross_validation.py", line 1665, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "/home/andrewcz/miniconda3/lib/python3.5/site-packages/sklearn/linear_model/logistic.py", line 1173, in fit
    order="C")
  File "/home/andrewcz/miniconda3/lib/python3.5/site-packages/sklearn/utils/validation.py", line 521, in check_X_y
    ensure_min_features, warn_on_dtype, estimator)
  File "/home/andrewcz/miniconda3/lib/python3.5/site-packages/sklearn/utils/validation.py", line 407, in check_array
    _assert_all_finite(array)
  File "/home/andrewcz/miniconda3/lib/python3.5/site-packages/sklearn/utils/validation.py", line 58, in _assert_all_finite
    " or a value too large for %r." % X.dtype)
ValueError: Input contains NaN, infinity or a value too large for dtype('float64').
andrewcz@andrewcz-PORTEGE-Z30t-B ~/adversarial-validation/numerai $

I was wondering if you have some time to assist. Many thanks,
Andrew
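The ValueError in the traceback is raised by scikit-learn's input check (check_array, then _assert_all_finite), so it points at the data passed to fit rather than at the model. One way to narrow it down, sketched here under the assumption that the features are already in a numeric array, is to count NaN and infinite entries per column before cross-validating. The helper name report_non_finite and the small example array are hypothetical.

```python
import numpy as np


def report_non_finite(x, names=None):
    """Return {column name: number of NaN/inf entries} for columns that have any."""
    x = np.asarray(x, dtype=np.float64)
    bad = ~np.isfinite(x)          # True where a value is NaN, +inf or -inf
    counts = bad.sum(axis=0)       # non-finite entries per column
    if names is None:
        names = range(x.shape[1])  # fall back to column positions
    return {name: int(c) for name, c in zip(names, counts) if c > 0}


# Tiny made-up array standing in for the loaded features.
x = np.array([[1.0, np.nan, 3.0],
              [4.0, 5.0, np.inf],
              [7.0, np.nan, 9.0]])
print(report_non_finite(x, names=["a", "b", "c"]))  # {'b': 2, 'c': 1}
```

The DeprecationWarning at the top of the output is a separate matter: as the message itself says, the cross-validation utilities moved to the model_selection module.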

Numerai dataset

The Numerai data I got from the web seems to be a little wrong. A link to the Numerai dataset used here would be helpful.
