ds_cookie_cu12

My custom Python data science cookiecutter for poetry and pyenv users. It includes:

  1. pre-commit hooks:
    1. autopep8
    2. poetry export to sync dependencies with a requirements.txt file
  2. A local .python-version for pyenv
  3. A data folder structure similar to kedro's
  4. Optional dependency group for plotly
    • poetry install --with plotly
  5. Optional dependency group for RAPIDS, assuming you have CUDA 12 installed and are using an NVIDIA GPU
    • poetry install --with rapids
  6. polars included
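
Because the plotly and rapids groups are optional, project code may want to check which libraries are actually importable before using them. The following is a minimal sketch, not part of the template. `MODULES_TO_CHECK` and `available_modules` are hypothetical names, and probing `cudf` for the rapids group is an assumption, since the group's exact contents are not listed here.

```python
from importlib.util import find_spec

# Hypothetical mapping of dependency group -> top-level module to probe.
# `cudf` as the RAPIDS module is an assumption about the group's contents.
MODULES_TO_CHECK = {"plotly": "plotly", "rapids": "cudf", "polars": "polars"}


def available_modules(modules=None):
    """Return {group: bool} telling whether each module can be imported."""
    modules = MODULES_TO_CHECK if modules is None else modules
    return {group: find_spec(name) is not None for group, name in modules.items()}


if __name__ == "__main__":
    for group, installed in available_modules().items():
        print(f"{group}: {'installed' if installed else 'missing'}")
```

A module that reports as missing can usually be added by installing the matching group, for example `poetry install --with plotly`.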

Usage

To create a new project folder that follows this cookiecutter template, run:

python -m cookiecutter git@github.com:banditkings/ds_cookie_cu12.git

or if using HTTPS auth:

python -m cookiecutter https://github.com/banditkings/ds_cookie_cu12.git

Then navigate into the newly created project folder and install the package with its dependencies using:

poetry install --with dev,plotly,rapids
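
Project creation can also be scripted from Python through cookiecutter's `cookiecutter.main.cookiecutter` function. The sketch below is one possible approach, not something the template provides. `build_context` and `create_project` are hypothetical helpers, and deriving `package_name` by snake-casing `repo_name` is an assumption made for illustration. Only `repo_name` and `package_name` are known template variables; any others fall back to the template defaults.

```python
TEMPLATE_URL = "https://github.com/banditkings/ds_cookie_cu12.git"


def build_context(repo_name):
    """Derive template variables from a repo name.

    Snake-casing repo_name to get package_name is an assumption made
    for this sketch, not necessarily what the template does.
    """
    package_name = repo_name.strip().lower().replace("-", "_").replace(" ", "_")
    return {"repo_name": repo_name, "package_name": package_name}


def create_project(repo_name, output_dir="."):
    """Generate a project without interactive prompts."""
    # Imported lazily so this file can be loaded without cookiecutter installed.
    from cookiecutter.main import cookiecutter

    return cookiecutter(
        TEMPLATE_URL,
        no_input=True,  # remaining template variables use their defaults
        extra_context=build_context(repo_name),
        output_dir=output_dir,
    )
```

Calling `create_project("my-analysis")` needs network access and an installed cookiecutter. After it finishes, run the `poetry install` command above from inside the generated folder.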

Project Structure

{{ cookiecutter.repo_name }}                <- Git repo name
├── .gitignore                              <- Hidden file that prevents staging of unnecessary files to `git`
├── README.md                               <- The top-level README for developers using this project
│
├── data                                    <- Store raw data, features, etc - not committed to git
│   ├── 01_raw                              <- Raw immutable data
│   ├── 02_intermediate                     <- Typed data
│   ├── 03_primary                          <- Domain model data
│   ├── 04_feature                          <- Model features
│   ├── 05_model_input                      <- Data ready for model input
│   ├── 06_models                           <- Serialised models
│   ├── 07_model_output                     <- Data generated by model runs
│   └── 08_reporting                        <- Ad hoc descriptive cuts
│
├── notebooks                               <- Jupyter notebooks. Naming convention is a number (for ordering),
│                                              the creator's initials, and a short `-` delimited description, e.g.
│                                              `1.0-jqp-initial-data-exploration`.
│
├── .pre-commit-config.yaml                 <- git pre-commit config file
│
├── .python-version                         <- Python version number for local `pyenv` environment
│
├── Makefile                                <- Makefile to quickly rerun data pipelines, etc
│
├── pyproject.toml                          <- Poetry dependency and environment file
│
├── reports                                 <- Generated analysis as HTML, PDF, LaTeX, etc
│   └── figures                             <- Generated graphics and figures to be used in reporting
│   
└── {{ cookiecutter.package_name }}         <- Source code for this project
    ├── tests                               <- All tests for this package
    ├── data                                <- Functions to ETL data and perform feature engineering
    ├── evaluation                          <- Custom metrics and evaluation criteria
    ├── experiments                         <- Experiment tracking and logging logic
    ├── modeling                            <- Helper classes to train models and use trained models to make predictions
    └── utils                               <- Generic utility functions
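
Two conventions in the tree above lend themselves to small helpers: the numbered data layers and the notebook naming scheme. The following is a sketch under stated assumptions, not code shipped with the template. The short layer keys, `data_path`, and `is_valid_notebook_name` are hypothetical names, and the regular expression is one reading of "number, initials, `-` delimited description" that assumes lowercase names.

```python
import re
from pathlib import Path

# Layer folders copied from the tree above; the short keys are illustrative.
DATA_LAYERS = {
    "raw": "01_raw",
    "intermediate": "02_intermediate",
    "primary": "03_primary",
    "feature": "04_feature",
    "model_input": "05_model_input",
    "models": "06_models",
    "model_output": "07_model_output",
    "reporting": "08_reporting",
}

# <number>-<initials>-<dash-delimited description>,
# e.g. 1.0-jqp-initial-data-exploration (checked against the file stem).
NOTEBOOK_NAME = re.compile(r"^\d+(\.\d+)?-[a-z]+-[a-z0-9]+(-[a-z0-9]+)*$")


def data_path(project_root, layer, filename=""):
    """Build the path to a data layer folder, or to a file inside it."""
    folder = Path(project_root) / "data" / DATA_LAYERS[layer]
    return folder / filename if filename else folder


def is_valid_notebook_name(stem):
    """Check a notebook name (without extension) against the convention."""
    return NOTEBOOK_NAME.match(stem) is not None
```

For example, `data_path(".", "raw", "sales.csv")` points at `data/01_raw/sales.csv`, which keeps layer names out of notebook code.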

Git pre-commit hooks

The project starts off with pre-commit hooks that apply some autopep8 rules and trailing whitespace checks, followed by poetry hooks that fix the lock file and sync dependencies with requirements.txt. The poetry hooks can take a long time because they have to resolve dependencies.
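
To illustrate what the requirements sync is guarding against, the sketch below compares the output of `poetry export` with the `requirements.txt` on disk. It is not the hook the template configures, and `normalize`, `requirements_in_sync`, and `check` are hypothetical names. It assumes `poetry` is on the PATH and that the `export` command is available, which may require the export plugin depending on the Poetry version.

```python
import subprocess
from pathlib import Path


def normalize(text):
    """Drop blank lines, comments, and surrounding whitespace."""
    lines = (line.strip() for line in text.splitlines())
    return [line for line in lines if line and not line.startswith("#")]


def requirements_in_sync(exported, on_disk):
    """True when both requirement listings hold the same entries in order."""
    return normalize(exported) == normalize(on_disk)


def check(requirements_file="requirements.txt"):
    """Run `poetry export` and compare it with the file on disk."""
    exported = subprocess.run(
        ["poetry", "export", "--format", "requirements.txt", "--without-hashes"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return requirements_in_sync(exported, Path(requirements_file).read_text())
```

Keeping the comparison in a pure function means it can be tested without invoking poetry, while `check` does the slow dependency resolution only when called.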

Other project templates
