galeritas's People

Contributors

ariel-creditas, aureliosaraiva, brunobelluomini, joaooanselmo, joaooarthur, luoliveira, vivianyamassaki

galeritas's Issues

[Bug] plot_ks_classification - Redundant title and subtitle

Describe the bug

The title and the subtitle are both generated from the plot_title parameter, which makes the information redundant.

Expected Results

Only one title would be necessary.

Actual Results

The title appears twice:

Screenshot from 2021-12-23 14-50-54

Steps/Code to Reproduce

import pandas as pd
from galeritas.plot_ks_classification import plot_ks_classification

titanic = pd.read_csv('../../tests/data/titanic.csv')

# plot_title ends up rendered as both the title and the subtitle
plot_ks_classification(titanic['predict_proba'], titanic['survived'], plot_title='Título repetido')

Your environment

Packages: galeritas: 0.1.3, matplotlib: 3.4.0, seaborn: 0.11.1
Python: 3.8
OS: Ubuntu 18.04.6 LTS

Any possible solutions?

We could remove the subtitle and maintain only the title.

Any other comments?

No response

[Feature Request] Add an ordering parameter to plots

Context

Sometimes we want to order an axis (usually the X-axis) according to the values passed. Sometimes the variable is numeric, other times categorical.

Describe your proposed solution

Add an ordering parameter (seaborn already has one, and we can take advantage of it) that receives a list or dictionary indicating the desired order of the axis.

Describe alternatives you've considered

No response

Any other comments?

Note: keep in mind that we will also need to reorder the figures and labels tied to the axis, for example the population-percentage circles in the bar_plot_with_population_distribution plot.
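A minimal sketch of how such a parameter could be resolved, assuming two hypothetical helpers, `resolve_order` and `reorder_values`, that are not part of galeritas. The first accepts a list or a dict mapping category to position and returns a plain list (the kind of value seaborn's `order` argument takes); the second reorders per-category annotation values, such as the population percentages, the same way.

```python
def resolve_order(categories, order=None):
    """Return the categories in the requested order.

    `order` may be a list of categories or a dict mapping category -> position.
    Categories missing from `order` keep their original relative order
    and are placed at the end.
    """
    categories = list(categories)
    if order is None:
        return categories
    if isinstance(order, dict):
        # Sort the dict keys by their requested position.
        order = sorted(order, key=order.get)
    requested = [c for c in order if c in categories]
    remaining = [c for c in categories if c not in requested]
    return requested + remaining


def reorder_values(categories, values, ordered_categories):
    """Reorder per-category values (e.g. annotation labels) to match the axis."""
    lookup = dict(zip(categories, values))
    return [lookup[c] for c in ordered_categories]


categories = ["low", "high", "medium"]
population_pct = [50.0, 20.0, 30.0]

ordered = resolve_order(categories, order={"low": 0, "medium": 1, "high": 2})
ordered_pct = reorder_values(categories, population_pct, ordered)
print(ordered)      # ['low', 'medium', 'high']
print(ordered_pct)  # [50.0, 30.0, 20.0]
```

Resolving the order once and reusing it for every element tied to the axis keeps bars, tick labels and annotations in sync.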

[Feature Request] Add a normalize parameter and rescale the X-axis when `predictions` fall outside [0, 1]

Context

When we pass a predictions column with values outside the interval [0, 1], we need to normalize the data to that interval and also fix the X-axis of the calibration_curve.

Describe your proposed solution

  • use the normalize parameter of sklearn.calibration.calibration_curve, which the plot relies on
  • fix the X-axis so that it matches the normalized data
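The normalization step can also be done explicitly before the curve is computed. The sketch below uses a hypothetical helper, `min_max_normalize`, written with numpy only; it is an illustration, not galeritas code. Explicit normalization may be the safer route, since newer scikit-learn releases appear to have deprecated the `normalize` argument of `calibration_curve`.

```python
import numpy as np


def min_max_normalize(predictions):
    """Rescale predictions linearly so the smallest maps to 0 and the largest to 1."""
    predictions = np.asarray(predictions, dtype=float)
    lo, hi = predictions.min(), predictions.max()
    if hi == lo:
        raise ValueError("cannot normalize constant predictions")
    return (predictions - lo) / (hi - lo)


# Example scores outside [0, 1] (made-up values for illustration).
scores = np.array([200.0, 350.0, 500.0, 800.0])
normalized = min_max_normalize(scores)
print(normalized.tolist())  # [0.0, 0.25, 0.5, 1.0]
```

With the predictions already in [0, 1], the X-axis of the calibration plot can keep its usual limits.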

Describe alternatives you've considered

No response

Any other comments?

No response

[Feature Request] Add plot options via kwargs (calibration plot, but applies to others)

Context

Today we define the plot settings (legend, titles, etc.) as parameters of each plot function. Some plots have a range of options, others are more limited (e.g. the calibration plot).

Describe your proposed solution

Add matplotlib configuration parameters via **kwargs to all plots, making them more customizable without overcomplicating the parameter definitions.
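A rough sketch of the pattern, using a hypothetical function `simple_line_plot` rather than an actual galeritas plot: the common options stay as named parameters, and anything else is forwarded untouched to matplotlib's `Axes.plot`.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt


def simple_line_plot(x, y, plot_title=None, figsize=(8, 4), **kwargs):
    """Hypothetical plot function: named parameters cover the common options,
    every other keyword argument is forwarded to Axes.plot."""
    fig, ax = plt.subplots(figsize=figsize)
    ax.plot(x, y, **kwargs)
    if plot_title is not None:
        ax.set_title(plot_title)
    return fig


fig = simple_line_plot(
    [0, 1, 2], [0.1, 0.4, 0.9],
    plot_title="Calibration",
    color="tab:orange", linestyle="--", linewidth=2,  # forwarded via **kwargs
)
```

The trade-off is that forwarded options are validated by matplotlib rather than by the plot function, so a misspelled keyword surfaces as a matplotlib error.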

Describe alternatives you've considered

No response

Any other comments?

No response

[Feature Request] Plot within For Loop

Context

It is not possible to plot the same graph inside a for loop (in Jupyter).

example:

for some_feature in feature_list:
    stacked_percentage_bar_plot(
        my_df,
        categorical_feature=some_feature,
        hue=target_feature)

This won't produce any plots.
Could it be related to the return fig at the end of each plot function?

Describe your proposed solution

We just need to make sure that every call to the function plots a graph, whether it happens inside a loop or not.
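A possible workaround on the caller's side, sketched with a stand-in function `toy_plot` (hypothetical, not galeritas code). Jupyter renders a returned figure only when it is the value of the last expression in a cell, so return values produced inside a loop are discarded; if the function also closes the figure before returning (an assumption, not confirmed here), the inline backend has nothing left to draw. Calling `display` explicitly avoids both problems.

```python
import matplotlib.pyplot as plt

try:
    from IPython.display import display  # available inside Jupyter
except ImportError:
    def display(fig):  # plain Python: fall back to a no-op
        pass


def toy_plot(values, title):
    """Stand-in for a plot function that builds a figure, closes it and returns it."""
    fig, ax = plt.subplots()
    ax.bar(range(len(values)), values)
    ax.set_title(title)
    plt.close(fig)  # closed figures are not rendered automatically at the end of a cell
    return fig


figures = []
for name, values in {"a": [1, 2], "b": [3, 1], "c": [2, 2]}.items():
    fig = toy_plot(values, title=name)
    display(fig)  # explicit display: one rendered figure per iteration
    figures.append(fig)
```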

Describe alternatives you've considered

No response

Any other comments?

No response

[Bug] Values other than 0 and 1 for the 'target' in the calibration plot

Describe the bug

It is currently possible to pass values other than 0 and 1 for the target. We are not sure whether this is really a bug, because it is not clear what the y-axis represents.

Expected Results

.

Actual Results

In the example below, y_true is an array of 1s and 2s.
image

However, the 'mean target value' (y-axis) stays within the interval [0, 1]. Why?

We expected the mean of the target to be computed for each bin (x-axis), which in this case should always be at least 1.
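One possible explanation, assuming the plot delegates to scikit-learn's calibration_curve: that function treats the target as a binary label rather than as a numeric value, so two distinct values may be mapped onto 0 and 1 before the per-bin mean is taken. The sketch below mimics such a mapping with numpy and adds a hypothetical `check_binary_target` validation that a plot function could run first; neither is taken from galeritas.

```python
import numpy as np


def check_binary_target(y_true):
    """Raise if the target contains anything other than 0 and 1 (hypothetical check)."""
    values = np.unique(y_true)
    if not np.isin(values, [0, 1]).all():
        raise ValueError(f"target must contain only 0 and 1, got {values.tolist()}")


y_true = np.array([1, 2, 2, 1, 2])

# Label binarization: the two sorted unique values become 0 and 1.
labels, binarized = np.unique(y_true, return_inverse=True)
print(labels.tolist(), binarized.tolist())  # [1, 2] [0, 1, 1, 0, 1]

# The raw mean is above 1, the mean of the binarized labels is not.
print(y_true.mean(), binarized.mean())  # 1.6 0.6
```

Failing early with a clear error would make the meaning of the y-axis unambiguous.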

Steps/Code to Reproduce

import pandas as pd
import numpy as np
from galeritas.plot_calibration_and_distribution import plot_calibration_and_distribution

data = pd.read_feather('../../../abc-score/data/i0/analysis/processed_data-exp03-test-predictions.feather')
# Build a toy target made of 1s and 2s by replacing every 0 with 2
data['y_true_toy'] = np.where(data['generic_default_6m'] == 0, 2, data['generic_default_6m'])
plot_calibration_and_distribution(data=data, target='y_true_toy', predictions='y_pred', strategy='quantile', n_bins=100)

Your environment

pandas=1.1.5
numpy=1.20.3
matplotlib=3.5.0
seaborn=0.11.2

OS: Ubuntu 20.04

Any possible solutions?

No response

Any other comments?

No response

[Feature Request] Subplotting with Galeritas custom graphs

Context

Since the figure and axis objects are created within each function, like:

# stacked_percentage_bar_plot example
fig, ax = plt.subplots(figsize=figsize, dpi=120)

We can't produce subplots using Galeritas graphs.

Describe your proposed solution

We could include some parameters to allow any graph to be displayed as one figure or part of a customized subplot.
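One common way to do this in matplotlib-based libraries is an optional `ax` parameter. The sketch below uses a hypothetical function `toy_bar_plot`, not an actual galeritas plot: it draws on the axes it receives, and only creates its own figure when none is given.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt


def toy_bar_plot(values, title, ax=None, figsize=(6, 4)):
    """Hypothetical plot function: draws on `ax` if given, else creates its own figure."""
    if ax is None:
        fig, ax = plt.subplots(figsize=figsize, dpi=120)
    else:
        fig = ax.figure  # reuse the figure that owns the caller's axes
    ax.bar(range(len(values)), values)
    ax.set_title(title)
    return fig


# Standalone figure, as the functions behave today.
single = toy_bar_plot([3, 1, 2], "standalone")

# Two plots side by side in one caller-owned figure.
grid, (left, right) = plt.subplots(1, 2, figsize=(10, 4))
toy_bar_plot([1, 2, 3], "left", ax=left)
toy_bar_plot([3, 2, 1], "right", ax=right)
```

Keeping `ax=None` as the default leaves existing calls unchanged.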

Describe alternatives you've considered

No response

Any other comments?

No response

[Feature Request] plot_ks_classification - add new parameters to customize plot

Context

The X-axis label can be customized as we can see below, but right now it's not possible to do the same with the Y-axis:

Screenshot from 2021-12-23 19-53-20

Describe your proposed solution

  • Add a y_label parameter so the user can pass a custom y-axis label (for example, someone may want the label in Portuguese).

Describe alternatives you've considered

No response

Any other comments?

No response

[Documentation improvement] All plots

Describe the documentation issue

The plots only have docstrings for their parameters. For people outside the team, it can be hard to understand what each plot actually does and what its potential uses are.

Suggest a potential alternative/fix

Add descriptions of the plots and their uses to the project README. Include the same description in each plot's docstring.

Any other comments?

No response

[Feature Request] Rotate x-axis labels.

Context

It would be useful to let the user rotate the x-axis labels so that long strings, such as full dates (yyyy-mm-dd), fit.

Current default plot:

output3

With x-axis label rotation (45 degrees):

example_image

Describe your proposed solution

Include a new parameter to choose the rotation angle of the x-axis labels.
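A small sketch of what such a parameter could look like, using a hypothetical function `toy_date_plot` and a hypothetical parameter name `x_label_rotation`. The rotation itself is applied with matplotlib's `Axes.tick_params`.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt


def toy_date_plot(dates, values, x_label_rotation=0):
    """Hypothetical plot function with an `x_label_rotation` parameter (in degrees)."""
    fig, ax = plt.subplots()
    ax.bar(dates, values)
    ax.tick_params(axis="x", labelrotation=x_label_rotation)
    fig.tight_layout()  # make room for the rotated labels
    return fig


fig = toy_date_plot(
    ["2021-12-21", "2021-12-22", "2021-12-23"], [5, 3, 8],
    x_label_rotation=45,
)
```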

Describe alternatives you've considered

No response

Any other comments?

No response

[Feature Request] Add a subplot option to the calibration plot

Context

This option does not exist. However, since the plot is made of two graphs, we would have to drop the bin distribution and plot only the calibration curve, which I think makes the most sense.

Describe your proposed solution

Add an 'axes' parameter so that the plots can be drawn on an existing figure (e.g. one created with plt.subplots()).

Describe alternatives you've considered

No response

Any other comments?

No response

[Bug] plot_ks_classification - Rounded values for X axis when using min_max_scale parameter

Describe the bug

When using the min_max_scale parameter, the X-axis values are in some cases rounded to 1.

Expected Results

The X-axis ticks should show the correct scaled values.

Actual Results

The values were rounded to 1:

Screenshot from 2021-12-23 19-41-22

Steps/Code to Reproduce

import pandas as pd
from galeritas.plot_ks_classification import plot_ks_classification

titanic = pd.read_csv('../../tests/data/titanic.csv')

# Scaling to a narrow range makes the X-axis ticks round to 1
plot_ks_classification(titanic['predict_proba'], titanic['survived'], min_max_scale=(0.8, 1))

Your environment

Packages: galeritas: 0.1.3, matplotlib: 3.4.0, seaborn: 0.11.1
Python: 3.8
OS: Ubuntu 18.04.6 LTS

Any possible solutions?

No response

Any other comments?

Maybe the name of the min_max_scale parameter obscures what the parameter was made for.
Were both the original and the scaled values intended to be shown?

[Feature Request] Zoom in on the calibration plot

Context

We may want to zoom in on part of the calibration/distribution curve.

Currently this is only possible through the x_lim parameter, which is restricted to the interval (0, x); in other words, we cannot select an interval that does not contain 0.

Describe your proposed solution

Make x_lim a tuple, allowing this kind of customization beyond the interval (0, x).
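A sketch of how a tuple could be accepted without breaking existing calls, using a hypothetical helper `normalize_x_lim`. A bare number is still read as (0, x), and the resulting pair is the kind of value that matplotlib's `Axes.set_xlim` accepts.

```python
def normalize_x_lim(x_lim):
    """Return an (x_min, x_max) pair from either a single number or a 2-tuple.

    A bare number keeps the current behaviour and is read as (0, x_lim).
    """
    if x_lim is None:
        return None
    if isinstance(x_lim, (int, float)):
        return (0, x_lim)
    x_min, x_max = x_lim
    if x_min >= x_max:
        raise ValueError("x_lim must satisfy x_min < x_max")
    return (x_min, x_max)


print(normalize_x_lim(0.3))         # (0, 0.3)
print(normalize_x_lim((0.2, 0.6)))  # (0.2, 0.6)
```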

Describe alternatives you've considered

No response

Any other comments?

No response

[Feature Request] All plots - customize font size

Context

It would be useful to let the user change the font size of the labels, since some may find it too small:

Screenshot from 2021-12-23 19-41-22

Describe your proposed solution

  • Find a way to let the user customize the font size of the labels.
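Until such a parameter exists, one option on the caller's side is matplotlib's `rc_context`, which overrides the default font sizes for everything created inside the block. This is only a sketch on a plain matplotlib plot; it would affect a galeritas plot only if the plot function does not set explicit sizes itself, which is not confirmed here.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt

# Temporarily override matplotlib's defaults; they revert when the block exits.
font_settings = {
    "font.size": 14,
    "axes.labelsize": 16,
    "xtick.labelsize": 12,
    "ytick.labelsize": 12,
}
with plt.rc_context(font_settings):
    fig, ax = plt.subplots()
    ax.plot([0, 1], [0, 1])
    ax.set_xlabel("Predicted probability")
    ax.set_ylabel("Cumulative share")
```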

Describe alternatives you've considered

No response

Any other comments?

No response
