data-science-template's Introduction

Data Science Cookie Cutter

Why?

It is important to structure your data science project according to a consistent standard so that your teammates can easily maintain and modify it.

This repository provides a template that incorporates best practices to create a maintainable and reproducible data science project.

Tools used in this project

How to use this project

Install Cookiecutter:

pip install cookiecutter

Create a project based on the template:

cookiecutter https://github.com/khuyentran1401/data-science-template
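For scripted setups, the two commands above can be wrapped in a small Python helper. This is a minimal sketch, assuming the `cookiecutter` executable is already installed and on the PATH. The helper names `build_cookiecutter_command` and `create_project` are hypothetical; `--output-dir` and `--no-input` are existing Cookiecutter command-line options.

```python
import subprocess

TEMPLATE_URL = "https://github.com/khuyentran1401/data-science-template"


def build_cookiecutter_command(template=TEMPLATE_URL, output_dir=None, no_input=False):
    """Return the argument list for a cookiecutter run (hypothetical helper)."""
    command = ["cookiecutter"]
    if output_dir is not None:
        # Generate the project inside this directory instead of the current one.
        command += ["--output-dir", output_dir]
    if no_input:
        # Skip the prompts and accept the template's default values.
        command.append("--no-input")
    command.append(template)
    return command


def create_project(**options):
    """Run cookiecutter; raises CalledProcessError if the command fails."""
    subprocess.run(build_cookiecutter_command(**options), check=True)
```

Calling `create_project()` with no arguments is equivalent to the interactive command shown above.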

Resources for a detailed explanation of this template:

data-science-template's People

Contributors

khuyentran1401, tapyu

data-science-template's Issues

Invalid project setup with poetry

Hey Khuyen,

Thanks for creating this easy-to-use template. I really like the simple approach.
I tried creating a project from the template following the instructions provided. So far it works, but there is an issue with this line in the pyproject.toml file (I think):
python = "{{ cookiecutter.author_name }}"

It references the author's name instead of compatible_python_versions.
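The mix-up reported here can be illustrated with a small stand-in renderer. This is only a sketch: the `render` helper is a hypothetical, much simplified substitute for the Jinja2 rendering that Cookiecutter performs, the context values are invented for illustration, and the corrected line is an assumption based on the variable name mentioned in this issue.

```python
import re

# Example answers a user might give at the prompts (invented values).
context = {
    "author_name": "Jane Doe",
    "compatible_python_versions": "^3.8",
}


def render(line, context):
    """Replace {{ cookiecutter.<name> }} placeholders with context values."""
    pattern = re.compile(r"\{\{\s*cookiecutter\.(\w+)\s*\}\}")
    return pattern.sub(lambda match: context[match.group(1)], line)


reported_line = 'python = "{{ cookiecutter.author_name }}"'
assumed_fix = 'python = "{{ cookiecutter.compatible_python_versions }}"'

print(render(reported_line, context))  # python = "Jane Doe"
print(render(assumed_fix, context))    # python = "^3.8"
```

The first rendered line puts a person's name where a Python version constraint is expected, which would explain the invalid Poetry setup described above.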

Refactor the Cookiecutter prompts to create a heavily modularized data science template

Hi @khuyentran1401 !

I just saw in cookiecutter/cookiecutter#1881 that cookiecutter finally has the feature of adding human-readable prompts to the different variables. This enables us to create a more sophisticated data science template.

My initial thought is to go a step further than what I did in #18 (I haven't checked exactly how it looks now, since you made some modifications). The idea is to categorize the huge universe of machine learning tools by function (logging, orchestration, data storage, Python linter and code formatter, etc.) and then list the tools in each category so that the user can choose one. In short: create a heavily modularized data science template whose final structure depends on the tools the user selects, while ensuring that the directory structure doesn't vary too much.

However, I am not sure whether you share the same goal. I just saw that you removed DVC, so you may have other considerations regarding this goal.

What do you think about it?
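The categorized prompts proposed in this issue could be sketched as follows. This is a rough illustration, not the template's actual configuration: the category names, questions, and tool groupings are hypothetical, and the `__prompts__` / `__prompt__` layout follows the human-readable prompt format described in Cookiecutter's documentation, so it should be checked against the installed Cookiecutter version.

```python
import json

# Hypothetical grouping of tools by function. The first entry of each
# list is the default choice that Cookiecutter offers.
TOOL_CATEGORIES = {
    "dependency_manager": ("Which dependency manager do you want to use?", ["poetry", "pip"]),
    "linter": ("Which linter do you want to use?", ["flake8", "none"]),
    "orchestration": ("Which orchestration tool do you want to use?", ["prefect", "none"]),
    "data_versioning": ("Which data versioning tool do you want to use?", ["dvc", "none"]),
}


def build_cookiecutter_config(categories):
    """Build a cookiecutter.json-style dict with human-readable prompts."""
    config = {"project_name": "my-project"}
    prompts = {"project_name": "What is the project name?"}
    for variable, (question, choices) in categories.items():
        config[variable] = list(choices)
        # "__prompt__" holds the question; the other keys label each choice.
        labels = {choice: choice for choice in choices}
        prompts[variable] = {"__prompt__": question, **labels}
    config["__prompts__"] = prompts
    return config


config = build_cookiecutter_config(TOOL_CATEGORIES)
print(json.dumps(config, indent=2))
```

Template files could then branch on values such as `cookiecutter.linter` while the overall directory layout stays the same.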

Angreal as an alternative to cookiecutter + Makefile

First, awesome template. I've been doing some research on what's changed on templating data science projects over the last few years and came across yours.

I wasn't sure what the best way to reach out was, so I thought I'd just present it to you in an issue. I built a templating engine very similar to Cookiecutter that also lets you include Python functions as plugins for a command line interface. I thought you might find it interesting or useful if you find maintaining or extending the Makefile annoying.

https://angreal.github.io/angreal/

Again, awesome work!

Unifying the tools used in each branch

Hi there!

Besides dvc, pip, and poetry, I've noticed that some tools used in some branches are simply not used in others. I honestly don't see the reason for that, since no alternative is used in their place.

Tools compared: hydra, flake8, prefect
Branches compared: dvc-poetry, dvc-pip, prefect-poetry
  • Why doesn't prefect-poetry use hydra?
  • Why don't dvc-poetry and dvc-pip use prefect or flake8?

I understand the reason for having dvc-poetry and dvc-pip: pip and poetry are two different strategies for managing packages, so two different templates are needed. However, prefect-poetry seems unnecessary, as prefect doesn't overlap with the other tools' goals.

It seems that you started with one model and then recreated it, abandoning some tools and adopting others, but I am not sure. If PRs are welcome, I propose adopting the same set of tools in every branch, unless they conflict or overlap. Let me know what your idea is so that I can contribute to your project :)

Mkdocs template with material

Hello, have you tried mkdocs for documentation?

I think it's more beautiful and complete. Are you accepting PRs for this repo?

Thanks.
