Giter VIP home page Giter VIP logo

Comments (14)

shcheklein avatar shcheklein commented on July 22, 2024 2

The way I understand the problem (please @andronovhopf correct me if I'm wrong) is that the message does not look nice on those automated commits, it's confusing since not that many people are aware about this convention (I believe it's close to 0), it looks like some noise in the history (similar to the __test__ issue), etc.

The point is that if we can in our action somehow distinguish those w/o relaying on the message we can avoid this special mark and just use regular message.

from cml.

DavidGOrtega avatar DavidGOrtega commented on July 22, 2024

Hi Elle, this is because DVC-cml pushes back the update of running your dvc pipeline generating another push event in the CI, hence the use of [ci skip] which is a well known keyword to avoid your CI from running again. Dvc actually won't need it since the second time will not run the pipeline as everything will be up to date, however it still runs the previous steps.

This is how the workflow is:

  • You change your experiment and push your changes.
  • dvc determines if the pipeline has changed executing repro if that happens and leaving if nothing changes.
  • dvc-cml pushes back your new dvc pipeline changes after executing the repro including the commit message [ci-skip] to avoid another run if possible.
  • generates a report as a check or even a release if you allow tag_prefix parameter.

from cml.

shcheklein avatar shcheklein commented on July 22, 2024

Can it detect and skip it by the author name + commit message?

from cml.

DavidGOrtega avatar DavidGOrtega commented on July 22, 2024

@shcheklein not sure if I totally get it, but that might be not CI/CD standard. Could you please put an example?

Actually Github is not fully supporting [ci skip]
https://github.community/t5/GitHub-Actions/GitHub-Actions-does-not-respect-skip-ci/td-p/42834
but Gitlab does it as expected and it's very convenient to not start the ci again.

from cml.

DavidGOrtega avatar DavidGOrtega commented on July 22, 2024

I believe it's close to 0

[ci skip] is very well known in ci/cd 🙂 Its the regular way to do a commit and not run the pipeline.

If not used ci like gitlab will run twice until dvc repro does not have anything to do.

P.D. Interesting paper on ML applied to ci skip that I found in the search

from cml.

shcheklein avatar shcheklein commented on July 22, 2024

is very well known in ci/cd

I still think you are way over-estimating how well it is known. In 99% of cases you just don't need it. I don't remember when I saw it last time in one of the projects I've dealt with.

If not used ci like gitlab will run twice until dvc repro does not have anything to do.

that's the whole point - we can detect our own commits by other means. And [ci skip] - it'll be up to user to decide for his/her commits to use it or not.

from cml.

DavidGOrtega avatar DavidGOrtega commented on July 22, 2024

Indeed we can, but that won't stop the whole pipeline while gitlab or any other vendor wont run at all if the tag if set. Just for your consideration dvc-cml could be just only a step inside a big workflow that could be triggered because of the push.

from cml.

shcheklein avatar shcheklein commented on July 22, 2024

That's true. And usually I would prefer to CI checks for any commit in my master, for example - automated or not. But you are right, I can imagine cases when it's not necessary. Probably there should an ability to specify the prefix/suffix (and we have it now, right)? But in case I specify an empty one our system should rely on other signal rather than the message alone to avoid the loop.

(not directly related, but might be relevant) Here we come to the bigger point as well - the whole idea of committing something into my branch is far from ideal - like we discussed - a lot of users might be potentially confused by some automated commits (e.g. I try to push but instead have to pull, merge, rebase or whatever - this is advanced stuff). I would brainstorm other possible workflows - similar to restyled? dependa-bot and other tools?

from cml.

DavidGOrtega avatar DavidGOrtega commented on July 22, 2024

Probably there should an ability to specify the prefix/suffix (and we have it now, right)?

Maybe the solution would be reuse the last commit message adding skip ci? That way skip ci would be residual and more "digestible"?

I would brainstorm other possible workflows - similar to restyled? dependa-bot and other tools?

I have been experimenting with that, the idea would be a way to automatically amend the commit pulling and rebasing... like if you would have been training on your side before committing.

from cml.

shcheklein avatar shcheklein commented on July 22, 2024

Maybe the solution would be reuse the last commit message adding skip ci? That way skip ci would be residual and more "digestible"?

not sure I understand this. elaborate please?

I have been experimenting with that, the idea would be a way to automatically amend the commit pulling and rebasing... like if you would have been training on your side before committing.

rebase/amend - all destructive operations that break workflow. Take a look at how restyled operates, it can be just one of the possible ideas.

from cml.

DavidGOrtega avatar DavidGOrtega commented on July 22, 2024

Maybe the solution would be reuse the last commit message adding skip ci? That way skip ci would be residual and more "digestible"?
not sure I understand this. elaborate please?

Right now it comments dvc repro [ci skip] maybe a better solution would be your_last_comment \n[ci skip]

That way ci skip should not appear in the UI and still shutdowns the CI run.

Take a look at how restyled operates

I did, but all those tools works in pull requests and not push? Also, the result of dvc repro (changed dvc files) has to be stored somewhere, right? Anyway Im trying them

from cml.

shcheklein avatar shcheklein commented on July 22, 2024

Right now it comments dvc repro [ci skip] maybe a better solution would be your_last_comment \n[ci skip]

it can create even more confusion? not sure to be honest

has to be stored somewhere, right? Anyway Im trying them

yep, they create a separate PR with their logo, nice message, explanation etc to your branch and it's up to you to merge it or not

from cml.

dmpetrov avatar dmpetrov commented on July 22, 2024

has to be stored somewhere, right? Anyway Im trying them

yep, they create a separate PR with their logo, nice message, explanation etc to your branch and it's up to you to merge it or not

restyled-like tool can definitely improve the workflow. However, there are still a couple of issues:

  1. We need temporary storage (as David said) to keep results when the action is already completed but a user has not pushed the approval/merge button.
  2. It actually does not solve the issue it just mitigates it - you still have to pull (after approving/merging) and the user will see the additional commit. Yes, only once instead of dozens of times but the user still has to be familiar with the complicated workflow.
  3. It will be more complicated to implement a full table of experiments (#34) since Git won't have all the required information. The temporary storage (1) has to be used to build the experiments table (for a given branch for example). The temporary storage adds complexity to the system.

A more holistic approach might be implemented through dvc build-cache/run-cache (iterative/dvc#1234) and pushing all the results to DVC cache without committing it to Git. So, dvc-cache will play the role of temporary storage from the above. Yes, it adds complexity as well, but it solves the problem in a better way without any workflow complications and it also solves an additional optimization problem (the initial motivation of the build cache).

Prioritization

It seems like, run-cache approach solves the problem in the best possible way. The only concern - it requires a significant change in DVC that will take time. Also, it has a single dvc-file (iterative/dvc#1871) as a prerequisite that is under development right now. It raises the question - should we wait for this feature before CI/CD release? I'll need to discuss it with core-dvc team.

from cml.

DavidGOrtega avatar DavidGOrtega commented on July 22, 2024

Closed since right now the workflow does not push back to remote anymore.

from cml.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.