Data Explorer

Launch the web app here:

Data Explorer

Demo

[Screenshot of the app]

Analysis and Pandas Profiling

The app's functions help us understand the data, its data types, and its features. We can view basic details about the data as well as advanced descriptive statistics. We can check whether any null values are present and, if so, fill them using appropriate logic. Another automated step checks for duplicate rows and removes them if desired.

pandas-profiling generates profile reports from a pandas DataFrame. It extends the pandas DataFrame with df.profile_report(), which automatically generates a standardized univariate and multivariate report for data understanding. The app also provides an option to download the pandas profile report.
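The data type, null-value, and duplicate checks described above might look roughly like the following in plain pandas. This is a minimal sketch, not the app's actual code: the sample DataFrame, its column names, and the choice of the column median as the fill value are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample data: one missing value and one repeated row.
df = pd.DataFrame({
    "city": ["Pune", "Delhi", "Delhi", "Mumbai"],
    "sales": [120.0, 80.0, 80.0, None],
})

# Basic details: column data types and descriptive statistics.
column_types = df.dtypes
summary = df.describe()

# Count the null values in each column.
null_counts = df.isnull().sum()

# Fill missing sales with the column median (one possible fill logic).
filled = df.fillna({"sales": df["sales"].median()})

# Count fully duplicated rows, then drop them and renumber the index.
duplicate_count = int(filled.duplicated().sum())
deduped = filled.drop_duplicates().reset_index(drop=True)

print(column_types)
print(summary)
print(null_counts)
print(f"Duplicate rows removed: {duplicate_count}")
print(deduped)
```

The median is used here only because it is robust to outliers. A mean, a mode, or a constant could be a better fit depending on the column.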

For each column, the following information (whenever relevant for the column type) is presented in an interactive HTML report:

  • Type inference: detect the types of columns in a DataFrame
  • Essentials: type, unique values, indication of missing values
  • Quantile statistics: minimum value, Q1, median, Q3, maximum, range, interquartile range
  • Descriptive statistics: mean, mode, standard deviation, sum, median absolute deviation, coefficient of variation, kurtosis, skewness
  • Most frequent and extreme values
  • Histograms: categorical and numerical
  • Correlations: high correlation warnings, based on different correlation metrics (Spearman, Pearson, Kendall, Cramér’s V, Phik)
  • Missing values: through counts, matrix, heatmap and dendrograms
  • Duplicate rows: list of the most common duplicated rows
  • Text analysis: most common categories (uppercase, lowercase, separator), scripts (Latin, Cyrillic) and blocks (ASCII, Cyrillic)
  • File and Image analysis: file sizes, creation dates, dimensions, indication of truncated images and existence of EXIF metadata
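To make the quantile and descriptive statistics in the list above more concrete, here is a small sketch that computes several of them by hand with pandas for a single numeric column. The sample values and the column name are hypothetical, and this is not how pandas-profiling computes its report internally. It only illustrates what the reported numbers mean.

```python
import pandas as pd

# Hypothetical numeric column.
values = pd.Series([2.0, 4.0, 4.0, 6.0, 9.0], name="hypothetical_column")

q1 = values.quantile(0.25)
median = values.median()
q3 = values.quantile(0.75)

stats = {
    # Quantile statistics
    "minimum": values.min(),
    "Q1": q1,
    "median": median,
    "Q3": q3,
    "maximum": values.max(),
    "range": values.max() - values.min(),
    "interquartile_range": q3 - q1,
    # Descriptive statistics
    "mean": values.mean(),
    "standard_deviation": values.std(),  # sample standard deviation (ddof=1)
    "sum": values.sum(),
    "median_absolute_deviation": (values - median).abs().median(),
    "coefficient_of_variation": values.std() / values.mean(),
    "kurtosis": values.kurt(),
    "skewness": values.skew(),
}

for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```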

Reproducing this web app

To recreate this web app on your own computer, do the following.

Create conda environment

Firstly, we will create a conda environment called dex

conda create -n dex python=3.7.9

Secondly, we will activate the dex environment

conda activate dex

Install prerequisite libraries

Download requirements.txt file

wget https://raw.githubusercontent.com/gmayuriiii/data-explorer/main/requirements.txt

Pip install libraries

pip install -r requirements.txt

Download and unzip contents from GitHub repo

Download and unzip contents from

Launch the app

streamlit run dataexplorer.py
