rashidakanchwala / kedro

This project forked from kedro-org/kedro

Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.

Home Page: https://kedro.org

License: Apache License 2.0

Shell 0.58% Python 98.57% Makefile 0.11% Dockerfile 0.05% Gherkin 0.68%

kedro's Introduction

What is Kedro?

Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular. You can find out more at kedro.org.

Kedro is an open-source Python framework hosted by the LF AI & Data Foundation.

How do I install Kedro?

To install Kedro from the Python Package Index (PyPI) run:

pip install kedro

It is also possible to install Kedro using conda:

conda install -c conda-forge kedro

Our Get Started guide contains full installation instructions, and includes how to set up Python virtual environments.
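
After running either command, one way to confirm the installation from Python is to query the package metadata. The following is a minimal sketch that uses only the standard library; `installed_version` is a hypothetical helper name chosen for illustration and is not part of Kedro.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Report whether Kedro is visible in the current Python environment.
kedro_version = installed_version("kedro")
print("Kedro is not installed" if kedro_version is None else f"Kedro {kedro_version}")
```

If this reports that Kedro is not installed, the usual cause is that the command ran in a different virtual environment from the one running the script.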

What are the main features of Kedro?

| Feature | What is this? |
|---|---|
| Project Template | A standard, modifiable and easy-to-use project template based on Cookiecutter Data Science. |
| Data Catalog | A series of lightweight data connectors used to save and load data across many different file formats and file systems, including local and network file systems, cloud object stores, and HDFS. The Data Catalog also includes data and model versioning for file-based systems. |
| Pipeline Abstraction | Automatic resolution of dependencies between pure Python functions and data pipeline visualisation using Kedro-Viz. |
| Coding Standards | Test-driven development using pytest, well-documented code using Sphinx, linted code with support for flake8, isort and black, and logging through the standard Python logging library. |
| Flexible Deployment | Deployment strategies that include single or distributed-machine deployment as well as additional support for deploying on Argo, Prefect, Kubeflow, AWS Batch and Databricks. |
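
To make the "Pipeline Abstraction" idea more concrete, here is a conceptual sketch of resolving dependencies between pure Python functions by matching the dataset names they read and write. It deliberately avoids the Kedro API and uses only the standard library (`graphlib`, Python 3.9+). The helpers `make_node` and `run`, the plain dictionary standing in for a catalog, and the example functions are all assumptions made for illustration, not how Kedro itself is implemented.

```python
from graphlib import TopologicalSorter


def make_node(func, inputs, output):
    """Bundle a pure function with the names of the datasets it reads and writes."""
    return {"func": func, "inputs": list(inputs), "output": output}


def run(nodes, catalog):
    """Run nodes in dependency order, reading from and writing to `catalog`."""
    produced_by = {n["output"]: i for i, n in enumerate(nodes)}
    # A node depends on every node that produces one of its inputs.
    graph = {
        i: {produced_by[name] for name in n["inputs"] if name in produced_by}
        for i, n in enumerate(nodes)
    }
    order = []
    for i in TopologicalSorter(graph).static_order():
        n = nodes[i]
        args = [catalog[name] for name in n["inputs"]]
        catalog[n["output"]] = n["func"](*args)
        order.append(n["func"].__name__)
    return order


def clean(raw):
    """Drop missing values."""
    return [x for x in raw if x is not None]


def mean(values):
    """Average the cleaned values."""
    return sum(values) / len(values)


# Nodes are deliberately listed out of order; the sorter works out the sequence.
nodes = [
    make_node(mean, ["clean_data"], "average"),
    make_node(clean, ["raw_data"], "clean_data"),
]
catalog = {"raw_data": [1, None, 2, 3]}  # hypothetical in-memory stand-in for a catalog
execution_order = run(nodes, catalog)
print(execution_order, catalog["average"])
```

Because each function only declares named inputs and outputs, the execution order can be derived from the data flow rather than from the order in which the nodes are written.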

How do I use Kedro?

The Kedro documentation first explains how to install Kedro and then introduces key Kedro concepts.

You can then review the spaceflights tutorial to build a Kedro project for hands-on experience.

For new and intermediate Kedro users, there's a comprehensive section on how to visualise Kedro projects using Kedro-Viz.

A pipeline visualisation generated using Kedro-Viz

Additional documentation explains how to work with Kedro and Jupyter notebooks, and there is a set of advanced user guides for key Kedro features. We also recommend the API reference documentation for further information.

Why does Kedro exist?

Kedro is built upon our collective best practices (and mistakes) from trying to deliver real-world ML applications that have vast amounts of raw, unvetted data. We developed Kedro to achieve the following:

  • To address the main shortcomings of Jupyter notebooks, one-off scripts, and glue code by focusing on creating maintainable data science code
  • To enhance team collaboration when different team members have varied exposure to software engineering concepts
  • To increase efficiency, because applied concepts like modularity and separation of concerns inspire the creation of reusable analytics code

Find out more about how Kedro can answer your use cases from the product FAQs on the Kedro website.

The humans behind Kedro

The Kedro product team and a number of open source contributors from across the world maintain Kedro.

Can I contribute?

Yes! We welcome all kinds of contributions. Check out our guide to contributing to Kedro.

Where can I learn more?

There is a growing community around Kedro. We encourage you to ask and answer technical questions on Slack and bookmark the Linen archive of past discussions.

We keep a list of technical FAQs in the Kedro documentation and you can find a growing list of blog posts, videos and projects that use Kedro over on the awesome-kedro GitHub repository. If you have created anything with Kedro we'd love to include it on the list. Just make a PR to add it!

How can I cite Kedro?

If you're an academic, Kedro can also help you, for example, as a tool to solve the problem of reproducible research. Use the "Cite this repository" button on our repository to generate a citation from the CITATION.cff file.

kedro's People

Contributors

921kiyo, ahdrameraliqb, andrii-ivaniuk, ankatiyar, antonymilne, astrojuanlu, carolinemlynch, datajoely, deepyaman, dependabot[bot], dmitriideriabinqb, idanov, ignacioparicio, jiriklein, jmholzer, laizaparizotto, limdauto, merelcht, mzjp2, nakhan98, noklam, rashidakanchwala, sajidalamqb, stichbury, tamsanh, tolomea, tsanikgr, tynandebold, waylonwalker, yetudada
