xuniverse's People

Contributors

sundar0989

xuniverse's Issues

Voting process

Can someone please explain how a vote is assigned, or not assigned, by each of the techniques used?
Is there a threshold on the feature importance or coefficients above which a vote is assigned to a feature?

Thank you
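The thread does not contain an answer, so the following is only an illustration of one plausible scheme, not xverse's confirmed logic: each technique ranks the features by its own score and gives one vote to its top-ranked features, and the votes are then summed per feature. The function `votes_from_scores`, the `top_k` cut-off, and all scores below are made up for the example.

```python
import numpy as np

def votes_from_scores(scores, top_k):
    """Return a 0/1 vote per feature: 1 for the top_k highest scores."""
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores, kind="stable")  # best score first
    votes = np.zeros(len(scores), dtype=int)
    votes[order[:top_k]] = 1
    return votes

feature_names = ["x1", "x2", "x3", "x4"]

# Hypothetical per-technique scores (importances, information value, ...).
technique_scores = {
    "random_forest": [0.40, 0.30, 0.20, 0.10],
    "extra_trees": [0.35, 0.15, 0.40, 0.10],
    "information_value": [0.50, 0.02, 0.30, 0.01],
}

vote_table = {
    name: votes_from_scores(scores, top_k=2)
    for name, scores in technique_scores.items()
}

# Total votes per feature across all techniques.
total_votes = sum(vote_table.values())
for name, count in zip(feature_names, total_votes):
    print(name, count)
```

Under a rank-based scheme like this there is no absolute threshold on the importance itself; a fixed cut-off on the score would be a different design.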

local variable 'bins_X_grouped' referenced before assignment

xverse_titanic

I am trying to perform feature selection using xverse's VotingSelector on the Titanic data set, which is a binary classification problem. The dataset also contains categorical variables, which I have one-hot encoded. I keep running into this error. I am using xverse version 1.0.5 and Python 3.6. Kindly help.

Multiprocessing support

It would be nice if you could pass, for example, n_jobs=-1 to the voting selector so that it does not run on just a single core.
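As a rough sketch of what this request amounts to: the selection techniques are independent of each other, so they could be dispatched to a pool of workers. The code below uses only the standard library; `run_techniques`, its `n_jobs` argument, and the two toy scoring functions are hypothetical stand-ins, not part of xverse.

```python
import os
from concurrent.futures import ThreadPoolExecutor

def spread(values):
    """Toy score: range of the column."""
    return max(values) - min(values)

def mean_abs(values):
    """Toy score: mean absolute value of the column."""
    return sum(abs(v) for v in values) / len(values)

# Hypothetical stand-ins for the real selection techniques.
TECHNIQUES = {"spread": spread, "mean_abs": mean_abs}

def run_techniques(columns, n_jobs=-1):
    """Score every column with every technique, one task per technique."""
    workers = (os.cpu_count() or 1) if n_jobs == -1 else n_jobs

    def score_all(technique):
        return {name: technique(vals) for name, vals in columns.items()}

    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(score_all, TECHNIQUES.values())
        return dict(zip(TECHNIQUES, results))

columns = {"x1": [1, 2, 3], "x2": [-4, 0, 4]}
scores = run_techniques(columns, n_jobs=2)
print(scores)
```

Threads only help when the underlying work releases the GIL, as much NumPy and scikit-learn code does; pure-Python scoring would need processes instead.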

UnboundLocalError: local variable 'bins_X_grouped' referenced before assignment

I can work around this problem by passing the entire dataframe in, along with y, but I'm not sure whether this is expected behavior. X implies passing the dataframe without y included.

UnboundLocalError: local variable 'bins_X_grouped' referenced before assignment

import numpy as np
import pandas as pd
from xverse.transformer import MonotonicBinning

df = pd.DataFrame({'x1': list(range(100)),
                   'x2': list(range(100)),
                   'x3': list(range(100)),
                   'y': list(range(100))})

# Binary target: 1 where x1 > 50, else 0
df['y'] = np.where(df['x1'] > 50, 1, 0)

clf = MonotonicBinning()

X = df[['x1', 'x2', 'x3']]
y = df[['y']]  # one-column DataFrame, as in the original report

if not isinstance(X, pd.DataFrame):
    print("Not a dataframe")
else:
    print("Is a DataFrame")

clf.fit(X, y)


UnboundLocalError Traceback (most recent call last)
in
17 print("Is a DataFrame")
18
---> 19 clf.fit(X, y)

~/opt/anaconda3/lib/python3.7/site-packages/xverse/transformer/_binning.py in fit(self, X, y)
120
121 #apply the monotonic train function on dataset
--> 122 fit_X.apply(lambda x: self.train(x, y), axis=0)
123 return self
124

~/opt/anaconda3/lib/python3.7/site-packages/pandas/core/frame.py in apply(self, func, axis, raw, result_type, args, **kwds)
7766 kwds=kwds,
7767 )
-> 7768 return op.get_result()
7769
7770 def applymap(self, func, na_action: Optional[str] = None) -> DataFrame:

~/opt/anaconda3/lib/python3.7/site-packages/pandas/core/apply.py in get_result(self)
183 return self.apply_raw()
184
--> 185 return self.apply_standard()
186
187 def apply_empty_result(self):

~/opt/anaconda3/lib/python3.7/site-packages/pandas/core/apply.py in apply_standard(self)
274
275 def apply_standard(self):
--> 276 results, res_index = self.apply_series_generator()
277
278 # wrap results

~/opt/anaconda3/lib/python3.7/site-packages/pandas/core/apply.py in apply_series_generator(self)
288 for i, v in enumerate(series_gen):
289 # ignore SettingWithCopy here in case the user mutates
--> 290 results[i] = self.f(v)
291 if isinstance(results[i], ABCSeries):
292 # If we have a view on v, we need to make a copy because

~/opt/anaconda3/lib/python3.7/site-packages/xverse/transformer/_binning.py in <lambda>(x)
120
121 #apply the monotonic train function on dataset
--> 122 fit_X.apply(lambda x: self.train(x, y), axis=0)
123 return self
124

~/opt/anaconda3/lib/python3.7/site-packages/xverse/transformer/_binning.py in train(self, X, y)
172 We still want our code to produce bins.
173 """
--> 174 if len(bins_X_grouped) == 1:
175 bins = algos.quantile(X, np.linspace(0, 1, force_bins)) #creates a new binnning based on forced bins
176 if len(np.unique(bins)) == 2:

UnboundLocalError: local variable 'bins_X_grouped' referenced before assignment
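The traceback shows where `bins_X_grouped` is read but not why it was never assigned. As a generic illustration of how this class of error arises (this is not the xverse source; `pick_bins` and `pick_bins_safe` are hypothetical), a local variable that is only assigned inside a loop or branch stays unbound when that path never runs:

```python
def pick_bins(values, max_bins):
    """Buggy: bins_grouped is only assigned if some bin count fits."""
    for n_bins in range(max_bins, 1, -1):
        if len(set(values)) >= n_bins:
            bins_grouped = sorted(set(values))[:n_bins]
            break
    return len(bins_grouped)  # unbound if the loop never assigned it

def pick_bins_safe(values, max_bins):
    """Same logic, but with an explicit fallback value."""
    bins_grouped = None
    for n_bins in range(max_bins, 1, -1):
        if len(set(values)) >= n_bins:
            bins_grouped = sorted(set(values))[:n_bins]
            break
    return 0 if bins_grouped is None else len(bins_grouped)

try:
    pick_bins([1], max_bins=3)
    error_name = None
except UnboundLocalError as exc:
    error_name = type(exc).__name__

print(error_name)
print(pick_bins_safe([1], max_bins=3))
```

Initializing the variable before the loop turns the crash into a case the caller can handle explicitly.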

bin rate in binning

Could you add a parameter to control the minimum share of training samples that each bin (category) must contain?
When a bin holds only a very small share of the samples, the binning becomes meaningless.
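To make the request concrete, here is a minimal sketch of such a check done outside the library with plain pandas. The helper `sparse_bins`, the `min_bin_rate` parameter, and the bin edges are assumptions made up for this example; xverse does not expose them as far as this thread shows.

```python
import pandas as pd

def sparse_bins(values, edges, min_bin_rate):
    """Return the bins whose share of samples is below min_bin_rate."""
    binned = pd.cut(values, bins=edges, include_lowest=True)
    # Share of samples per bin, kept in bin order.
    rates = binned.value_counts(normalize=True, sort=False)
    return rates[rates < min_bin_rate]

x = pd.Series(range(100))

# The last bin, (95, 99], holds only 4 of the 100 samples.
too_small = sparse_bins(x, edges=[0, 50, 95, 99], min_bin_rate=0.05)
print(too_small)
```

Bins flagged this way could then be merged with a neighbor before computing any per-bin statistics.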

ValueError: The input data must be pandas dataframe. But the input provided is <class 'str'>

I tried to debug this but have been unable to figure out the reason for it. Here's some sample code that reproduces it:

import numpy as np
import pandas as pd
from xverse.transformer import MonotonicBinning

df = pd.DataFrame({'x1': list(range(100)),
                   'x2': list(range(100)),
                   'y': list(range(100))})

# Binary target: 1 where x1 > 50, else 0
df['y'] = np.where(df['x1'] > 50, 1, 0)

clf = MonotonicBinning()

X = df[['x1', 'x2']]
y = df[['y']]  # one-column DataFrame, as in the original report

if not isinstance(X, pd.DataFrame):
    print("Not a dataframe")
else:
    print("Is a DataFrame")

clf.fit(X, y)


ValueError Traceback (most recent call last)
in
10 print("Is a DataFrame")
11
---> 12 clf.fit(X, y)

~/opt/anaconda3/lib/python3.7/site-packages/xverse/transformer/_binning.py in fit(self, X, y)
76
77 #check datatype of X
---> 78 self.check_datatype(X)
79
80 #The length of X and Y should be equal

~/opt/anaconda3/lib/python3.7/site-packages/xverse/transformer/_binning.py in check_datatype(self, X)
62
63 if not isinstance(X, pd.DataFrame):
---> 64 raise ValueError("The input data must be pandas dataframe. But the input provided is " + str(type(X)))
65 return self
66

ValueError: The input data must be pandas dataframe. But the input provided is <class 'str'>

KeyError: 'UDP_chksum'

I changed
bins = algos.quantile(X, np.linspace(0, 1, force_bins)) #creates a new binnning based on forced bins
to
bins = X.quantile(np.linspace(0, 1, force_bins)).values # creates a new binnning based on forced bins
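For reference, the replacement line can be tried on its own with public pandas and NumPy calls. In this sketch `X` is a plain `pd.Series` and `force_bins = 3` is an arbitrary value chosen for the example; `algos` in the original line presumably refers to an internal pandas module, which is an assumption here.

```python
import numpy as np
import pandas as pd

X = pd.Series(range(100))
force_bins = 3  # arbitrary value for the example

# Evenly spaced quantiles between 0 and 1 become the bin edges.
bins = X.quantile(np.linspace(0, 1, force_bins)).values
print(bins)

# Use the edges to bin the data; include_lowest keeps the minimum value.
binned = pd.cut(X, bins=bins, include_lowest=True)
counts = binned.value_counts(sort=False)
print(counts)
```

`Series.quantile` accepts an array of quantile levels and returns a Series, so `.values` yields the same kind of edge array the original line produced.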

Transform issue on WOE


TypeError Traceback (most recent call last)
in
----> 1 clf.transform(X).head()

~/.conda/envs/most-reg/lib/python3.7/site-packages/xverse/transformer/_woe.py in transform(self, X, y)
308 Estimator has to be fitted to apply transformations.")
309
--> 310 outX[new_column_name] = tempX.replace(self.woe_bins[original_column_name])
311
312 #transformed dataframe

~/.local/lib/python3.7/site-packages/pandas/core/series.py in replace(self, to_replace, value, inplace, limit, regex, method)
4567 limit=limit,
4568 regex=regex,
-> 4569 method=method,
4570 )
4571

~/.local/lib/python3.7/site-packages/pandas/core/generic.py in replace(self, to_replace, value, inplace, limit, regex, method)
6490
6491 return self.replace(
-> 6492 to_replace, value, inplace=inplace, limit=limit, regex=regex
6493 )
6494 else:

~/.local/lib/python3.7/site-packages/pandas/core/series.py in replace(self, to_replace, value, inplace, limit, regex, method)
4567 limit=limit,
4568 regex=regex,
-> 4569 method=method,
4570 )
4571

~/.local/lib/python3.7/site-packages/pandas/core/generic.py in replace(self, to_replace, value, inplace, limit, regex, method)
6536 dest_list=value,
6537 inplace=inplace,
-> 6538 regex=regex,
6539 )
6540

~/.local/lib/python3.7/site-packages/pandas/core/internals/managers.py in replace_list(self, src_list, dest_list, inplace, regex)
612 mask = ~isna(values)
613
--> 614 masks = [comp(s, mask, regex) for s in src_list]
615
616 result_blocks = []

~/.local/lib/python3.7/site-packages/pandas/core/internals/managers.py in <listcomp>(.0)
612 mask = ~isna(values)
613
--> 614 masks = [comp(s, mask, regex) for s in src_list]
615
616 result_blocks = []

~/.local/lib/python3.7/site-packages/pandas/core/internals/managers.py in comp(s, mask, regex)
606
607 s = com.maybe_box_datetimelike(s)
--> 608 return _compare_or_regex_search(values, s, regex, mask)
609
610 # Calculate the mask once, prior to the call of comp

~/.local/lib/python3.7/site-packages/pandas/core/internals/managers.py in _compare_or_regex_search(a, b, regex, mask)
1966 result = tmp
1967
-> 1968 _check_comparison_types(result, a, b)
1969 return result
1970

~/.local/lib/python3.7/site-packages/pandas/core/internals/managers.py in _check_comparison_types(result, a, b)
1934
1935 raise TypeError(
-> 1936 f"Cannot compare types {repr(type_names[0])} and {repr(type_names[1])}"
1937 )
1938

TypeError: Cannot compare types 'ndarray(dtype=object)' and 'Interval'

pandas version - 1.1.0
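The failure happens inside `Series.replace` when it compares the column values against `Interval` keys. One possible way to sidestep that comparison, sketched below outside of xverse, is to bin with integer codes and look the WOE value up by position. The edges and WOE values are invented for the example, and this is not presented as the library's fix.

```python
import numpy as np
import pandas as pd

x = pd.Series([1, 5, 9, 2])

# Hypothetical bin edges and one WOE value per bin, in bin order.
edges = [0, 3, 6, 10]
woe_per_bin = np.array([-0.5, 0.1, 0.7])

# labels=False returns the integer position of each value's bin,
# so no Interval objects are ever compared.
# Assumes every value falls inside the edges (no NaN codes).
codes = pd.cut(x, bins=edges, labels=False, include_lowest=True)

woe_values = pd.Series(woe_per_bin[codes.to_numpy()], index=x.index)
print(woe_values)
```

Values outside the edges would produce missing codes and need separate handling before the lookup.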

Add random state

It would be better to add a random_state parameter to VotingSelector and pass it to ExtraTreesClassifier and RandomForestClassifier, so that results can be reproduced.
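The effect being asked for can be seen directly in scikit-learn, where both estimators already accept `random_state`. The sketch below fits each model twice with the same seed on a synthetic dataset and checks that the feature importances match; the helper `importances` and the dataset settings are choices made for this example only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier

# Small synthetic binary classification problem.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

def importances(model_cls, seed):
    """Fit a fresh model with the given seed and return its importances."""
    model = model_cls(n_estimators=20, random_state=seed)
    return model.fit(X, y).feature_importances_

repeatable = {}
for model_cls in (ExtraTreesClassifier, RandomForestClassifier):
    first = importances(model_cls, seed=42)
    second = importances(model_cls, seed=42)
    repeatable[model_cls.__name__] = bool(np.allclose(first, second))

print(repeatable)
```

Without a fixed seed the importances, and therefore any votes derived from them, can change from run to run.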
