Comments (14)
The way I understand the problem (please @andronovhopf correct me if I'm wrong) is that the message does not look nice on those automated commits. It's confusing, since not that many people are aware of this convention (I believe it's close to 0), and it looks like noise in the history (similar to the __test__ issue), etc.
The point is that if our action can somehow distinguish those commits without relying on the message, we can avoid this special mark and just use a regular message.
from cml.
Hi Elle, this is because dvc-cml pushes back the results of running your dvc pipeline, which generates another push event in the CI. Hence the use of [ci skip], a well-known keyword that prevents your CI from running again. DVC doesn't actually need it, since the second run will not execute the pipeline because everything will be up to date; however, the CI still runs the preceding steps.
This is how the workflow goes:
- You change your experiment and push your changes.
- dvc determines whether the pipeline has changed, executing repro if it has and exiting if nothing has changed.
- dvc-cml pushes back your new dvc pipeline changes after executing repro, including [ci skip] in the commit message to avoid another run if possible.
- It generates a report as a check, or even a release if you set the tag_prefix parameter.
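To make the loop concrete, here is a rough simulation of the steps above. It is only a sketch: `Push`, `simulate` and the marker list are names made up for illustration, and it assumes the CI vendor honours the marker the way GitLab is described as doing in this thread.

```python
from dataclasses import dataclass

# Markers assumed to be honoured by the CI vendor (an assumption for this sketch).
SKIP_MARKERS = ("[ci skip]", "[skip ci]")


@dataclass
class Push:
    message: str
    pipeline_changed: bool  # would `dvc repro` have anything to do?


def simulate(first_push, use_skip_marker):
    """Count the CI runs triggered by one user push."""
    queue = [first_push]
    runs = 0
    while queue:
        push = queue.pop(0)
        if any(marker in push.message.lower() for marker in SKIP_MARKERS):
            continue  # the vendor sees the marker and starts no run
        runs += 1
        if push.pipeline_changed:
            message = "dvc repro" + (" [ci skip]" if use_skip_marker else "")
            # After repro everything is up to date, so the push-back has no pending work.
            queue.append(Push(message, pipeline_changed=False))
    return runs


if __name__ == "__main__":
    print(simulate(Push("tune learning rate", True), use_skip_marker=True))
    print(simulate(Push("tune learning rate", True), use_skip_marker=False))
```

With the marker the push-back is ignored and only one run happens; without it the CI starts a second run, which finds nothing to reproduce and stops there.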
from cml.
Can it detect and skip it by the author name + commit message?
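Something along these lines, as a minimal sketch. The function name and the bot author name "dvc-cml" are hypothetical; the "dvc repro" prefix is just the message mentioned elsewhere in this thread.

```python
def is_own_commit(author_name, message, bot_name="dvc-cml", prefix="dvc repro"):
    """Return True when both the author and the message look like the action's own push-back.

    `bot_name` and `prefix` are illustrative defaults, not values taken from the tool.
    """
    return author_name == bot_name and message.startswith(prefix)


if __name__ == "__main__":
    # The action would skip its own push-back but still run for a user's commit.
    print(is_own_commit("dvc-cml", "dvc repro"))
    print(is_own_commit("elle", "dvc repro"))
```

Requiring both signals means a user who happens to write "dvc repro" in a commit message still gets a normal run.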
from cml.
@shcheklein I'm not sure I fully get it, but that might not be standard CI/CD practice. Could you please give an example?
Actually, GitHub does not fully support [ci skip]:
https://github.community/t5/GitHub-Actions/GitHub-Actions-does-not-respect-skip-ci/td-p/42834
GitLab, however, handles it as expected, and it's very convenient for not starting the CI again.
from cml.
> I believe it's close to 0
[ci skip] is very well known in ci/cd 🙂 It's the regular way to make a commit without running the pipeline.
If it's not used, a CI like GitLab will run twice, until dvc repro has nothing left to do.
P.S. An interesting paper on ML applied to CI skip that I found while searching.
from cml.
> is very well known in ci/cd
I still think you are greatly overestimating how well known it is. In 99% of cases you just don't need it. I don't remember the last time I saw it in any of the projects I've dealt with.
> If it's not used, a CI like GitLab will run twice, until dvc repro has nothing left to do.
That's the whole point: we can detect our own commits by other means. As for [ci skip], it'll be up to users to decide whether to use it for their own commits.
from cml.
Indeed we can, but that won't stop the whole pipeline, whereas GitLab or any other vendor won't run at all if the tag is set. Keep in mind that dvc-cml could be just one step inside a bigger workflow that is triggered by the push.
from cml.
That's true. And usually I would prefer CI checks to run for any commit in my master, for example, automated or not. But you are right, I can imagine cases when it's not necessary. Probably there should be an ability to specify the prefix/suffix (and we have it now, right?). But if I specify an empty one, our system should rely on some other signal rather than the message alone to avoid the loop.
(Not directly related, but might be relevant.) Here we come to the bigger point as well: the whole idea of committing something into my branch is far from ideal. Like we discussed, a lot of users might be confused by automated commits (e.g. I try to push but instead have to pull, merge, rebase or whatever, and this is advanced stuff). I would brainstorm other possible workflows, similar to restyled, Dependabot and other tools?
from cml.
> Probably there should be an ability to specify the prefix/suffix (and we have it now, right?)
Maybe the solution would be to reuse the last commit message and append skip ci? That way skip ci would be residual and more "digestible".
> I would brainstorm other possible workflows, similar to restyled, Dependabot and other tools?
I have been experimenting with that. The idea would be to automatically amend the commit, pulling and rebasing, as if you had been training on your side before committing.
from cml.
> Maybe the solution would be to reuse the last commit message and append skip ci? That way skip ci would be residual and more "digestible".
Not sure I understand this. Could you elaborate, please?
> I have been experimenting with that. The idea would be to automatically amend the commit, pulling and rebasing, as if you had been training on your side before committing.
Rebase and amend are destructive operations that break the workflow. Take a look at how restyled operates; it could be one of the possible ideas.
from cml.
> > Maybe the solution would be to reuse the last commit message and append skip ci? That way skip ci would be residual and more "digestible".
> Not sure I understand this. Could you elaborate, please?
Right now the commit message is dvc repro [ci skip].
Maybe a better solution would be your_last_comment \n[ci skip]
That way ci skip would not appear in the UI and would still shut down the CI run.
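Roughly what I mean, as a sketch only. The helper names are made up, the blank line between subject and body follows the usual Git convention rather than the single \n above, and it assumes the CI vendor scans the whole commit message and not just the first line.

```python
SKIP_MARKER = "[ci skip]"


def with_trailing_skip(last_message, marker=SKIP_MARKER, fallback="dvc repro"):
    """Reuse the subject of the last commit and push the skip marker into the body."""
    lines = last_message.strip().splitlines()
    subject = lines[0] if lines else fallback
    return f"{subject}\n\n{marker}"


def subject_line(message):
    """First line of a commit message, which is what most UIs list."""
    return message.splitlines()[0]


if __name__ == "__main__":
    message = with_trailing_skip("Increase number of epochs")
    print(subject_line(message))   # the marker is not part of the subject
    print(SKIP_MARKER in message)  # but it is still present in the full message
```

The commit list would then show the reused subject, while the marker sits in the body where the CI can still find it.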
> Take a look at how restyled operates
I did, but don't all those tools work on pull requests rather than pushes? Also, the result of dvc repro (the changed dvc files) has to be stored somewhere, right? Anyway, I'm trying them.
from cml.
> Right now the commit message is dvc repro [ci skip]. Maybe a better solution would be your_last_comment \n[ci skip]
Couldn't it create even more confusion? Not sure, to be honest.
> has to be stored somewhere, right? Anyway, I'm trying them.
Yep, they create a separate PR against your branch, with their logo, a nice message, an explanation, etc., and it's up to you whether to merge it.
from cml.
> > has to be stored somewhere, right? Anyway, I'm trying them.
> Yep, they create a separate PR against your branch, with their logo, a nice message, an explanation, etc., and it's up to you whether to merge it.
A restyled-like tool can definitely improve the workflow. However, there are still a few issues:
- We need temporary storage (as David said) to keep results when the action has already completed but the user has not yet pressed the approve/merge button.
- It does not actually solve the issue, it just mitigates it: you still have to pull (after approving/merging) and the user will still see the additional commit. Yes, only once instead of dozens of times, but the user still has to be familiar with the complicated workflow.
- It will be more complicated to implement a full table of experiments (#34), since Git won't have all the required information. The temporary storage from the first point has to be used to build the experiments table (for a given branch, for example). The temporary storage adds complexity to the system.
A more holistic approach might be implemented through dvc build-cache/run-cache (iterative/dvc#1234), pushing all the results to the DVC cache without committing them to Git. The DVC cache would then play the role of the temporary storage described above. Yes, it adds complexity as well, but it solves the problem in a better way, without any workflow complications, and it also solves an additional optimization problem (the initial motivation for the build cache).
Prioritization
It seems that the run-cache approach solves the problem in the best possible way. The only concern is that it requires a significant change in DVC that will take time. It also has the single dvc-file (iterative/dvc#1871) as a prerequisite, which is under development right now. This raises the question: should we wait for this feature before the CI/CD release? I'll need to discuss it with the core DVC team.
from cml.
Closed, since the workflow no longer pushes back to the remote.
from cml.