Comments (6)

htahir1 commented on May 18, 2024

Thanks for the well-thought-out comments, @SKRohit. Here are my answers:

  1. Yes, I believe that would be enough.
  2. Every datasource is connected to at least one data pipeline, so if you delete all pipelines you will in essence delete all datasources.
  3. See above: if you delete all pipelines, then there should be no datasources left, as each datasource produces a data pipeline per commit.

Internally, we are preparing a big change over the next month that will rewrite a lot of this logic and make things easier. For now, please implement the simplest possible logic that goes through the pipelines and deletes their artifact and metadata stores. Please try to keep the functions decoupled, as they might still be useful after the refactor. Thanks!

from zenml.

harshasridhar commented on May 18, 2024

Hi, I'd like to work on this issue.
Please help me out with the details.

htahir1 commented on May 18, 2024

Thank you, @harshasridhar. The contribution is greatly appreciated!

Here are a few pointers:

When the user runs zenml clean, the following needs to happen.

For each pipeline in the pipeline_store specified in the zenml_config, you need to delete the metadata_store and the artifact_store. Here is how:

  • Deletion of the artifact store [this can be remote or local] -> using path_utils is important here. This should be simple.
  • Deletion of the metadata store [this can be remote or local] -> locally it's just a SQLite file, so using path_utils works, but remotely it is a MySQL database, in which case a SQL DROP statement needs to be issued against the specific database.

Finally, the pipeline_store needs to be deleted.
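
As a rough sketch of the steps above (not ZenML's actual implementation), the deletion logic could be split into small decoupled functions. Everything named here is hypothetical: PipelineStores, the function names, and the execute_sql callback are stand-ins for illustration, and the sketch uses only the Python standard library on local paths. Remote artifact stores, which the real code would handle through path_utils, are not covered.

```python
import shutil
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Iterable, Optional


@dataclass
class PipelineStores:
    """Hypothetical stand-in for the store locations of one pipeline."""
    artifact_store: Path
    metadata_sqlite: Optional[Path] = None    # local metadata store (SQLite file)
    metadata_mysql_db: Optional[str] = None   # remote metadata store (MySQL database name)


def delete_artifact_store(path: Path) -> None:
    """Remove a local artifact store directory; a missing directory is ignored."""
    shutil.rmtree(path, ignore_errors=True)


def delete_local_metadata_store(sqlite_file: Path) -> None:
    """Remove a local SQLite metadata file (missing_ok needs Python 3.8+)."""
    sqlite_file.unlink(missing_ok=True)


def drop_database_statement(db_name: str) -> str:
    """Build the DROP statement for a remote MySQL metadata store."""
    # Identifiers cannot be bound as query parameters, so validate the name.
    if not db_name.replace("_", "").isalnum():
        raise ValueError(f"unexpected database name: {db_name!r}")
    return f"DROP DATABASE IF EXISTS `{db_name}`"


def clean(
    pipelines: Iterable[PipelineStores],
    pipeline_store: Path,
    execute_sql: Callable[[str], object],
) -> None:
    """Delete every pipeline's stores, then the pipeline store itself."""
    for pipeline in pipelines:
        delete_artifact_store(pipeline.artifact_store)
        if pipeline.metadata_sqlite is not None:
            delete_local_metadata_store(pipeline.metadata_sqlite)
        if pipeline.metadata_mysql_db is not None:
            execute_sql(drop_database_statement(pipeline.metadata_mysql_db))
    shutil.rmtree(pipeline_store, ignore_errors=True)
```

Passing execute_sql in as a callback keeps the MySQL connection handling out of the deletion logic, so each function can be reused or tested on its own.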

The docs go into some detail on each concept above: https://docs.zenml.io . I hope that's a good starting point, though it might require more discussion. Please feel free to join the Slack to chat directly. Thanks again for your effort!

SKRohit commented on May 18, 2024

@htahir1 I am looking into this issue. Here is what I understood and where I have doubts.

  1. Every BasePipeline object has metadata_store and artifact_store attributes, so would deleting those for each pipeline be enough?
  2. Every BasePipeline object also has a datasource attribute, which is a BaseDatasource object with its own metadata_store and artifact_store. Should we consider them for deletion as well? In my opinion they should be deleted separately, since the artifact_store and metadata_store of datasources and pipelines could be different. Let me know your thoughts.
  3. Should zenml clean also delete datasources which are not related to any pipeline?

htahir1 commented on May 18, 2024

#540 is addressing this now in a simpler way

strickvl commented on May 18, 2024

This issue has now been implemented in #540, so I'm going to close this.
