
pkg's Introduction

pkg


GitOps Toolkit common packages.

pkg's People

Contributors

adrien-f, allenporter, aryan9600, blurpy, chlunde, darkowlzz, dependabot[bot], developer-guy, dieend, errordeveloper, hiddeco, klausenbusk, laozc, makkes, matheuscscp, mclarke47, mihaiandreiratoiu, nstogner, phillebaba, pjbgf, relu, sbernheim, seaneagan, somtochiama, souleb, squaremo, stefanprodan, sylr, uvegla, vishal-chdhry


pkg's Issues

kustomize/filesys: Tests are failing on macOS

--- FAIL: Test_isSecurePath (0.00s)
    --- FAIL: Test_isSecurePath/relative_symlink (0.00s)
        fs_secure_test.go:592: 
            Expected success, but got an error:
                <*errors.errorString | 0xc0001f0ef0>: {
                    s: "path '/var/folders/fm/v67j6c7n1k5g46m38yb15yvm0000gn/T/securefs-Patwt/subdir/subsubdir/symlink' is not in or below '/var/folders/fm/v67j6c7n1k5g46m38yb15yvm0000gn/T/securefs-Patwt/subdir'",
                }
                path '/var/folders/fm/v67j6c7n1k5g46m38yb15yvm0000gn/T/securefs-Patwt/subdir/subsubdir/symlink' is not in or below '/var/folders/fm/v67j6c7n1k5g46m38yb15yvm0000gn/T/securefs-Patwt/subdir'
    --- FAIL: Test_isSecurePath/absolute_symlink (0.00s)
        fs_secure_test.go:592: 
            Expected success, but got an error:
                <*errors.errorString | 0xc0001f1120>: {
                    s: "path '/var/folders/fm/v67j6c7n1k5g46m38yb15yvm0000gn/T/securefs-Pilfs/subdir/subsubdir/symlink' is not in or below '/var/folders/fm/v67j6c7n1k5g46m38yb15yvm0000gn/T/securefs-Pilfs/subdir'",
                }
                path '/var/folders/fm/v67j6c7n1k5g46m38yb15yvm0000gn/T/securefs-Pilfs/subdir/subsubdir/symlink' is not in or below '/var/folders/fm/v67j6c7n1k5g46m38yb15yvm0000gn/T/securefs-Pilfs/subdir'
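
These failures look consistent with macOS's temporary directory itself being a symlink (/var/folders resolves to /private/var/folders): once the symlink under test is resolved, its /private/var target no longer has the unresolved /var root as a prefix. A minimal sketch of a possible fix in the test setup, assuming the test root is created with os.MkdirTemp:

tmpDir, err := os.MkdirTemp("", "securefs-")
if err != nil {
	t.Fatal(err)
}
// Resolve symlinks in the root itself, so that prefix comparisons against
// resolved paths succeed on macOS, where os.TempDir is itself a symlink.
tmpDir, err = filepath.EvalSymlinks(tmpDir)
if err != nil {
	t.Fatal(err)
}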

Support SSH cloning.

When using GitLab Enterprise on-prem, we only allow SSH cloning, and thus the bootstrapping fails to clone the repo. Even when we provide --ssh-hostname, the cloning still attempts to go over HTTPS.

Could we enhance the cloning to force the SSH URL when --ssh-hostname is provided?

Git Operations Test Server

The test coverage around Git operations relies heavily on gitkit for self-contained tests. For E2E tests we also rely on directly accessing SaaS providers (e.g. GitHub) to ensure that our Git implementation works consistently across all our supported platforms.

Unfortunately, different SaaS providers support different ranges of crypto algorithms, Git protocol versions, etc. Our upstream dependencies may introduce changes that break specific features with said SaaS providers.

Instead of creating E2E tests for every single major SaaS provider, we could create (or improve the existing) Git test server to let us exercise the features we want to support, and therefore quickly catch regressions before releasing changes.

Below is a brain dump of specific features, based on different issues that users have reported, that a Git test server would ideally support:

These items should enable the test coverage to extend to each one of the points above.

Send Elastic Common Schema Compliant Logs

Our EKS cluster runs Flux alongside a lot of other services that send logs in ECS-compliant format.
Because Flux sends errors in text format, we are unable to ingest our service logs into Elasticsearch once a Flux error log has already been ingested.
This is because ECS-compliant error logs expect the error field to be an object. Once the mapping for an index has been set to text by a Flux error log, any other type for that field will result in an error during ingestion.
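
For illustration (messages invented), the first log line below shows the flat error string Flux currently emits, the second the object shape ECS expects for the error field (error.message, error.type per the ECS field reference):

{"level": "error", "error": "reconciliation failed: artifact not found"}
{"log.level": "error", "error": {"message": "reconciliation failed: artifact not found", "type": "ReconciliationError"}}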

It would be nice to get an option to use Elastic Common Schema format for the logs.

Thanks.

git: Create auto gitImplementation

Flux supports two Git implementations: libgit2 and go-git. They do not have feature parity, with the two main differences being:

  • Shallow clones are only supported by go-git; they result in less data being transmitted over the network on every clone operation.
  • The Git v2 protocol is only supported by libgit2; servers that only support this protocol (e.g. Azure DevOps) won't work with go-git.

We should build a third Git implementation type (a.k.a. auto), which removes the need for users to understand the difference between the implementations and always tries to use whichever is most optimal.

On a high-level, the new implementation could take into account:

  • Server fingerprint: query the SSH banner and HTTP headers.
  • Protocol awareness: find ways to identify what Git protocol version is in use, which may be easier over HTTP (see the sketch below).
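
A sketch of HTTP-based protocol detection, following the Git protocol-v2 negotiation described at https://git-scm.com/docs/protocol-v2 (function name and package are illustrative, not part of pkg/git):

package gitprobe

import (
	"bytes"
	"io"
	"net/http"
)

// supportsProtocolV2 sends the capability advertisement request with the
// Git-Protocol header set; a v2-capable smart HTTP server answers with a
// "version 2" pkt-line.
func supportsProtocolV2(repoURL string) (bool, error) {
	req, err := http.NewRequest(http.MethodGet, repoURL+"/info/refs?service=git-upload-pack", nil)
	if err != nil {
		return false, err
	}
	req.Header.Set("Git-Protocol", "version=2")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(io.LimitReader(resp.Body, 512))
	if err != nil {
		return false, err
	}
	return bytes.Contains(body, []byte("version 2")), nil
}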

author new artifact types for OCI operations

We (w/ @Dentrax) thought that one of the main goals of OCI Artifacts [1] is to let you author new artifact types that you want to support within your client tooling [2]. The first step in writing a new artifact type is to define a unique type for it [3]. Defining a unique artifact type allows various tools to know how to work with that type specifically. As of this writing, Flux's pull and push operations rely on the first layer of an image [4]. As we can use client tools for registry and image operations such as crane, skopeo, etc., we currently have to know that things should be put in the first layer of an image; in today's world, however, we can store different things within the image layers in addition to the tarball. So if we define Flux's artifact types for OCI, we can search for them within the image layers. With that, we don't need to limit ourselves to only storing things in the first layer of an image.


For example, OPA Rego policies can be stored on an OCI registry with their own types [5]; we have also planned to do the same in the Kyverno CLI [6].
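
A sketch of what pushing with dedicated types could look like using go-containerregistry; the media type values here are placeholders for illustration, not a proposal for the final names:

package main

import (
	"log"

	"github.com/google/go-containerregistry/pkg/crane"
	"github.com/google/go-containerregistry/pkg/v1/empty"
	"github.com/google/go-containerregistry/pkg/v1/mutate"
	"github.com/google/go-containerregistry/pkg/v1/static"
	"github.com/google/go-containerregistry/pkg/v1/types"
)

func main() {
	// Placeholder media types; the real values would be defined by Flux.
	const (
		configType = types.MediaType("application/vnd.example.flux.config.v1+json")
		layerType  = types.MediaType("application/vnd.example.flux.content.v1.tar+gzip")
	)

	// A uniquely-typed layer lets tools find Flux content in any layer,
	// instead of assuming it is always the first one.
	layer := static.NewLayer([]byte("...tarball bytes..."), layerType)
	img, err := mutate.AppendLayers(empty.Image, layer)
	if err != nil {
		log.Fatal(err)
	}
	img = mutate.ConfigMediaType(img, configType)

	if err := crane.Push(img, "registry.example.com/repo:tag"); err != nil {
		log.Fatal(err)
	}
}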

Footnotes

  1. https://github.com/opencontainers/artifacts

  2. https://github.com/opencontainers/artifacts/blob/main/artifact-authors.md

  3. https://github.com/opencontainers/artifacts/blob/main/artifact-authors.md#defining-a-unique-artifact-type

  4. https://github.com/fluxcd/pkg/blob/8d88cfe7078e71936258a70e111bd6a429fa2bc8/oci/client/pull.go#L65

  5. https://github.com/open-policy-agent/conftest/blob/master/internal/commands/push.go

  6. https://github.com/developer-guy/oci-kyverno/blob/master/main.go

Improve OpenSSF Scorecard Score

"The Open Source Security Foundation is a cross-industry collaboration to improve the security of open source software (OSS). The Scorecard provides security health metrics for open source projects."

As of 3rd January, fluxcd/pkg scores 6.2/10. For the latest score, check deps.dev or run scorecard manually.


Areas to focus on (a workflow sketch for the first two follows the list):

  • Token-Permissions
  • Pinned-Dependencies
  • CII-Best-Practices
  • Fuzzing
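
Token-Permissions and Pinned-Dependencies are typically addressed in the workflow files themselves; a sketch (the commit SHA is a placeholder to fill in):

permissions:
  contents: read # least-privilege default for the GITHUB_TOKEN

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      # Pin third-party actions to a full commit SHA instead of a mutable tag.
      - uses: actions/checkout@<full-commit-sha> # v3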

Mark events to prevent alerting spam

Taken from fluxcd/source-controller#411

To improve observability through events without spamming the alerting system, I propose we introduce a new eventSeverity type called trace. Trace events are emitted as Kubernetes events under Normal, but are dropped when forwarding to the notification-controller. Every source reconciliation should issue a single info event that contains the source URL and revision (later on we should consider adding the commit message too, for GitRepositories).
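
A minimal sketch of the proposed severity, assuming the existing severity constants in runtime/events:

const (
	// EventSeverityTrace is the proposed severity: emitted as a Normal
	// Kubernetes event, dropped when forwarding to notification-controller.
	EventSeverityTrace = "trace"
	EventSeverityInfo  = "info"
	EventSeverityError = "error"
)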

MetricRecorder: Label Metrics with Resource Labels

Proposal:
I would like to group my Flux metrics by labels or annotations (either works). The easiest way I could think of to achieve that is to label the metrics with the labels of the object the metric originates from.

So I guess we would simply have to label the metrics with the ref object's labels here:

https://github.com/fluxcd/pkg/blob/main/runtime/metrics/recorder.go

This way I could create something like this

--- 
apiVersion: kustomize.toolkit.fluxcd.io/v1beta1
kind: Kustomization
metadata:
  name: backend
  namespace: tenant1-ns1
  labels:
    metric.flux.io/group: "system"
spec:
  interval: 5m
  path: "./webapp/backend/"
  prune: true
  serviceAccountName: "test"
  sourceRef:
    kind: GitRepository
    name: webapp
    namespace: tenant1-ns2

And then query it via that label in Prometheus (see the example below). What do you guys think? This would greatly improve the monitoring experience for Flux resources, imho. I am happy to contribute the changes if you like the idea.
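
For example, assuming the recorder copied the object's labels through (Prometheus sanitises the label key), a query could look like:

gotk_reconcile_condition{metric_flux_io_group="system"}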

Create workflow to build project on macOS

Similar to the workflow created for source-controller, a new GitHub workflow could ensure that the macOS platform always works for development purposes:

  # Runs 'make test' on macos-10.15 to ensure the development environment
  # for contributors using macOS keeps working.
  darwin-amd64:
    runs-on: macos-10.15
    steps:
      - name: Checkout
        uses: actions/checkout@v3
      - name: Setup Go
        uses: actions/setup-go@v3
        with:
          go-version: 1.17.x
      - name: Restore Go cache
        uses: actions/cache@v3
        with:
          path: ~/go/pkg/mod # the module cache on macOS runners lives under the runner's home, not /home/runner
          key: ${{ runner.os }}-go-${{ hashFiles('**/go.sum') }}
          restore-keys: |
            ${{ runner.os }}-go-
      - name: Run tests
        run: make test

source: https://github.com/fluxcd/source-controller/blob/3dcb1420760d8aa0dfb5d496910fd3f9576aeb8a/.github/workflows/e2e.yaml#L107-L126

Depends on: #275

Dedicated module per package

The current setup solves, for now, the issue where we were copying internal packages from one project to another to gain the same functionality.

The new problem we face, however, is that if you, for example, only want to make use of the ssh package, you are automagically pulling in Git-related dependencies because of the git package.

We should look into creating a dedicated module per package, so that they can each have their own set of dependencies and release cycle (see the layout sketch below).
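
A sketch of what that layout could look like; consumers of only the ssh package would then go get github.com/fluxcd/pkg/ssh and pull in just its dependencies:

pkg/
├── git/
│   ├── go.mod    (module github.com/fluxcd/pkg/git — Git-related deps live here)
│   └── go.sum
├── ssh/
│   ├── go.mod    (module github.com/fluxcd/pkg/ssh — no Git deps)
│   └── go.sum
└── runtime/
    └── go.mod    (module github.com/fluxcd/pkg/runtime)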

Review GPG touch-points with `GitRepository`

Review the current state ahead of Bootstrap GA, including:

  • Signing verification
    • Documentation
    • Implementation
    • Test coverage
  • Signing commits
    • Documentation
    • Implementation
    • Test coverage

SSH key signing is not yet supported. As part of the review above, we should consider whether the current implementations for signing and verification are extensible enough to enable support for SSH key signing. Implementing SSH key signing itself is out of scope for this issue's milestone.

tar: Add TarGzip func

Following up on the refactoring of the untar pkg, we could add a .TarGzip func to tar-gzip directories (see the sketch below).

Can we add a TarGzip function that takes a directory and generates a tgz file? We have that in the tests and I think it would be nice to offer both functions in this new package.
source: #221 (comment)

Depends on #221 being merged.
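
A minimal sketch of such a func (the final signature in the tar package may differ; symlink handling is omitted):

package tar

import (
	gotar "archive/tar"
	"compress/gzip"
	"io"
	"os"
	"path/filepath"
)

// TarGzip writes the contents of srcDir to w as a gzip-compressed tarball.
func TarGzip(w io.Writer, srcDir string) error {
	gw := gzip.NewWriter(w)
	defer gw.Close()
	tw := gotar.NewWriter(gw)
	defer tw.Close()

	return filepath.Walk(srcDir, func(path string, fi os.FileInfo, walkErr error) error {
		if walkErr != nil {
			return walkErr
		}
		hdr, err := gotar.FileInfoHeader(fi, "")
		if err != nil {
			return err
		}
		rel, err := filepath.Rel(srcDir, path)
		if err != nil {
			return err
		}
		hdr.Name = filepath.ToSlash(rel)
		if err := tw.WriteHeader(hdr); err != nil {
			return err
		}
		if !fi.Mode().IsRegular() {
			return nil // directories carry no content
		}
		f, err := os.Open(path)
		if err != nil {
			return err
		}
		defer f.Close()
		_, err = io.Copy(tw, f)
		return err
	})
}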

Use reconcile annotation helpers in controllers

For each project that uses the reconcileAt annotation:

  • bump its version of pkg/runtime to 0.1.2 (this picks up the new Changed predicate):
go get github.com/fluxcd/pkg/runtime@v0.1.2
  • ensure api/go.mod also has pkg/apis/meta v0.1.0 (this means you can use the status struct embed), with a similar go get
  • remove the LastHandledReconcileAt field from status and embed meta.ReconcileRequestStatus inline, instead:
type ThingStatus struct {
    // ...
    meta.ReconcileRequestStatus `json:",inline"`
}
  • run make manifests to update the CRD; it should not alter the schema, only a doc string
  • run make api-docs (if there is one) to update docs. This does make a material change.
  • if there's anywhere that refers to something.GetAnnotations()[meta.ReconcileAtAnnotation], change it to use meta.ReconcileAnnotationValue(something.GetAnnotations())
  • change something.Status.LastHandledReconcileAt to something.Status.GetLastHandledReconcileRequest(), and something.Status.LastHandledReconcileAt = now to something.Status.SetLastHandledReconcileRequest(now)

Projects:

Finally:

Once all of those have been updated and a minor version released, think about changing the annotation used by the command-line tool and the notification-controller, which both write the annotation. (EDIT: I think updating the CLI should wait for a minor version release, since it's a minor breaking change in the sense that the new CLI won't work with older controllers.)

Document runtime package and APIs, provide additional helpers

Goals:

  • Address long-standing technical debt
  • Add helper functions to solve common patterns in Flux controllers (introduces conditions and patch helper packages to runtime)
  • Document ALL runtime code and related API packages to at least a bare minimum, in the hope of driving implementation efforts

Will be resolved by #101

Support using GitLab project tokens

When bootstrapping flux with GitLab, it needs to perform various actions on the GitLab instance:

  • Creating a repository (if it doesn't exist, yet)
  • Adding a deployment key

During these operations, it will try to find the correct GitLab project using the GitLab API. With the current implementation, the search is restricted to return only groups with an access level >= 10.

When using a project access token in GitLab (from a repository that has already been created manually), the GitLab API will not list the project's parent groups if the search is performed with access level >= 10. This causes the bootstrapping to fail:

ubuntu@myhost:~$ flux bootstrap gitlab \
>     --owner=mygroup/mysubgroup \
>     --repository=myrepo

✗ failed to list projects, error: failed to find group named 'mygroup/mysubgroup'

The relevant code sections are:

Solution
The requirement on the access level can simply be omitted: the GitLab API will then only list the projects/groups the provided token can see (see the sketch below).
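
A sketch of the change, assuming the xanzy/go-gitlab client underneath (client is a *gitlab.Client):

// Before: parent groups are hidden from project access tokens.
minAccess := gitlab.AccessLevelValue(10)
groups, _, err := client.Groups.ListGroups(&gitlab.ListGroupsOptions{
	MinAccessLevel: &minAccess,
})

// After: omit MinAccessLevel; the API already scopes results to what the
// provided token can see.
groups, _, err = client.Groups.ListGroups(&gitlab.ListGroupsOptions{})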

External event recorder should adhere to `EventRecorder` interface

The Kubernetes core has the following record.EventRecorder interface:

// EventRecorder knows how to record events on behalf of an EventSource.
type EventRecorder interface {
	// Event constructs an event from the given information and puts it in the queue for sending.
	// 'object' is the object this event is about. Event will make a reference-- or you may also
	// pass a reference to the object directly.
	// 'type' of this event, and can be one of Normal, Warning. New types could be added in future
	// 'reason' is the reason this event is generated. 'reason' should be short and unique; it
	// should be in UpperCamelCase format (starting with a capital letter). "reason" will be used
	// to automate handling of events, so imagine people writing switch statements to handle them.
	// You want to make that easy.
	// 'message' is intended to be human readable.
	//
	// The resulting event will be created in the same namespace as the reference object.
	Event(object runtime.Object, eventtype, reason, message string)

	// Eventf is just like Event, but with Sprintf for the message field.
	Eventf(object runtime.Object, eventtype, reason, messageFmt string, args ...interface{})

	// AnnotatedEventf is just like eventf, but with annotations attached
	AnnotatedEventf(object runtime.Object, annotations map[string]string, eventtype, reason, messageFmt string, args ...interface{})
}

As this interface includes everything we need, including adding "metadata" via AnnotatedEventf, it would likely be better if our own event.Recorder adhered to the above interface.

As a second step, the controllers.Events helper could be refactored into a "DelegatingRecorder" which delegates events to all underlying recorder implementations.
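
A sketch of the first step, making the existing event.Recorder satisfy the interface (the webhook-posting internals are elided; record is k8s.io/client-go/tools/record, runtime is k8s.io/apimachinery/pkg/runtime):

// Compile-time check that our recorder satisfies record.EventRecorder.
var _ record.EventRecorder = (*Recorder)(nil)

func (r *Recorder) Event(object runtime.Object, eventtype, reason, message string) {
	r.AnnotatedEventf(object, nil, eventtype, reason, "%s", message)
}

func (r *Recorder) Eventf(object runtime.Object, eventtype, reason, messageFmt string, args ...interface{}) {
	r.AnnotatedEventf(object, nil, eventtype, reason, messageFmt, args...)
}

func (r *Recorder) AnnotatedEventf(object runtime.Object, annotations map[string]string, eventtype, reason, messageFmt string, args ...interface{}) {
	// Build the event payload, attach the annotations as metadata, and
	// post it to the notification webhook.
}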

Unhandled errors from deferred file close operations

From Ada Logics

Target
./source-controller/controllers/storage.go
./source-controller/pkg/sourceignore/sourceignore.go
./kustomize-controller/controllers/kustomization_impersonation.go
./kustomize-controller/internal/sops/pgp/keysource.go

Throughout the codebase there are places where file close operations are deferred within a function where a file is being read from or written to, e.g.:

localPath := s.LocalPath(*artifact)
f, err := os.Open(localPath)
if err != nil {
  return err
}
defer f.Close()

// untar the artifact
untarPath := filepath.Join(tmp, "unpack")
if _, err = untar.Untar(f, untarPath); err != nil {
  return err
}

This can lead to undefined behaviour, since any errors returned by the f.Close() operation are ignored. This has consequences in the event that a close operation fails and the data has not yet been flushed to the file, while the rest of the code assumes it has been. For a detailed discussion on this, please see here.

Recommendation
Ensure that errors from f.Close() are handled.
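
One common idiom is to propagate the close error through a named return, so a failed flush is not silently ignored (a sketch):

func readArtifact(path string) (err error) {
	f, err := os.Open(path)
	if err != nil {
		return err
	}
	defer func() {
		// Surface the close error unless an earlier error takes precedence.
		if cerr := f.Close(); cerr != nil && err == nil {
			err = cerr
		}
	}()
	// ... read from f ...
	return nil
}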

ssa: ResourceManager.ApplyAll fails to apply ClusterRoleBinding

We've got 2 controllers inside 1 operator (kubebuilder setup) that:

  1. create some structs using client-go's https://github.com/kubernetes/client-go/tree/master/applyconfigurations; among those resources are ClusterRole and ClusterRoleBinding
  2. convert them to *unstructured.Unstructured using "k8s.io/apimachinery/pkg/runtime".DefaultUnstructuredConverter.ToUnstructured
  3. use ssa.SetNativeKindsDefaults on the list of those resources
  4. try to apply them using ResourceManager.ApplyAll

There are some clusters for which this operation fails, like [email protected]+rke2r1 and a kind cluster created by [email protected] (the default image for this version, so [email protected]). They fail with the error message

{
	  "level": "error",
	  "ts": 1668516799.8819587,
	  "msg": "Reconciler error",
	  ...
	  "error": "failed to apply resources to the cluster: ClusterRoleBinding/crbName namespace not specified, error: clusterroles.rbac.authorization.k8s.io \"crName\" not found"
}

A bit of debugging led me to believe that the error comes from this line:

if err := m.dryRunApply(ctx, dryRunObject); err != nil {

which then goes through m.validationError(dryRunObject, err) and gets returned here:

pkg/ssa/manager_diff.go, lines 184 to 186 at 7741c5f:

if apierrors.IsNotFound(err) {
return fmt.Errorf("%s namespace not specified, error: %w", FmtUnstructured(object), err)
}

I'm not really sure what's happening; I'll try to add some code in a next comment that reproduces this issue.

We've encountered this problem twice, independently, at my workplace. For now we've changed the usage of ResourceManager.ApplyAll to a for loop that goes through the slice of those resources, which we additionally sort using ssa.SortableUnstructureds, and applies them one by one using ResourceManager.Apply (see the sketch below).
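
A sketch of that workaround, assuming the ssa package's Apply and DefaultApplyOptions signatures (ctx, manager and objects in scope):

sort.Sort(ssa.SortableUnstructureds(objects))
for _, obj := range objects {
	// Apply one object at a time, in kind order, instead of ApplyAll.
	if _, err := manager.Apply(ctx, obj, ssa.DefaultApplyOptions()); err != nil {
		return err
	}
}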

Resource manager is created like that:

fieldManager := "some-string"
...
ResourceManager: ssa.NewResourceManager(mgr.GetClient(),
			nil, // poller is needed only for WaitForSet, we're not using it here
			ssa.Owner{
				Field: fieldManager,
				Group: fieldManager + "-group",
			}),
...

This seems like some kind of data race, where the ClusterRole is not yet fully "committed" and the ClusterRoleBinding tries to reference the not-yet-existing ClusterRole, but if I understand k8s correctly, that shouldn't happen 🙈. The weird thing is that this error does not go away; subsequent reconciles do not succeed.

My setup:

Inconsistent code-styles and potential nil-dereferences

From Ada Logics

Flux is composed of projects across different repositories, and there is often similar logic happening across the controllers but performed in quite different ways. This leads to a more complex overall codebase and can make it difficult to reason about properties of the code.

Event recording and checking the status of similar elements in the controllers are performed differently. This came up as an issue through fuzzing. Each of the controllers relies on an EventRecorder, and the way these EventRecorder variables are used differs between the controllers. Some controllers check for nil status and others do not. The HelmRelease reconciler and the Kustomization reconciler assume that the EventRecorder is not nil in their respective event() implementations, whereas the other controllers do not:

Helm Release Reconciler

https://github.com/fluxcd/helm-controller/blob/e9d31e9f1f8df5149b10ce7719b2d272f617a44f/controllers/helmrelease_controller.go#L739-L745

Kustomize Reconciler

https://github.com/fluxcd/kustomize-controller/blob/72bc54477aa89aadc40d1444d0f30b1e9963806f/controllers/kustomization_controller.go#L788-L795

Image Update Automation Reconciler

https://github.com/fluxcd/image-automation-controller/blob/bc3d7b21121851ffe75ebb9c9dcd530c38db3d4e/controllers/imageupdateautomation_controller.go#L717-L720

Git Repository Reconciler

https://github.com/fluxcd/source-controller/blob/c4d7e46b90dc48aac7d5c74def2a82e7b7ea9333/controllers/gitrepository_controller.go#L427-L432

Recommendation
The same code pattern should be used across the controllers. Through our analysis we determined that the EventRecorder cannot be nil given the current main.go files, and thus the nil check should be removed.

Add leader election config

Upgrade the runtime pkg to v0.10.1 and bind the leader election cmd flags (see the sketch after the list):

  • source-controller
  • kustomize-controller
  • helm-controller
  • image-reflector-controller
  • image-automation-controller
  • notification-controller
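
A sketch of the wiring in a controller's main.go, assuming the Options/BindFlags helper shape added to runtime/leaderelection (field names may differ):

import (
	flag "github.com/spf13/pflag"

	"github.com/fluxcd/pkg/runtime/leaderelection"
)

var leaderElectionOptions leaderelection.Options

func main() {
	leaderElectionOptions.BindFlags(flag.CommandLine)
	flag.Parse()
	// Feed the parsed options (enable flag, lease durations, etc.) into
	// the controller-runtime manager options.
}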

[kustomize] Add a dry-run option to SubstituteVariables

We would like to add a dry-run parameter to the following func in pkg/kustomize:

func SubstituteVariables(
	ctx context.Context,
	kubeClient client.Client,
	kustomization unstructured.Unstructured,
	res *resource.Resource) (*resource.Resource, error){...}

It would become

func SubstituteVariables(
	ctx context.Context,
	kubeClient client.Client,
	kustomization unstructured.Unstructured,
	res *resource.Resource,
	dryRun bool) (*resource.Resource, error){...}

When the dryRun parameter resolves to true, we would replace the Kubernetes API queries that retrieve substitutable objects with mocks.

We could then use it to enable completely disconnected consumers (e.g. the flux CLI in constrained environments) to use this function.

Make CI smarter about which tests to run depending on changes

Since this is a monorepo, it contains a lot of different sub-packages/modules. At the moment, we run tests in CI for all packages regardless of which package is being modified. This leads to extremely long wait times and decreases developer velocity.
We should make the CI figure out which packages are affected by a PR and run tests only for the relevant packages. This means running tests not only for the modified package but also for packages that depend on it (see the sketch below).
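
A sketch of the idea as a workflow step, assuming the per-package module layout discussed in "Dedicated module per package" (reverse-dependency expansion would still need to be added):

      - name: Test changed modules
        run: |
          for mod in $(git diff --name-only origin/main...HEAD | cut -d/ -f1 | sort -u); do
            if [ -f "$mod/go.mod" ]; then (cd "$mod" && go test ./...); fi
          done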

request support for the non-blocking apply method

We use Flux to implement GitOps; one application corresponds to one YAML file. In use, we found that kustomize-controller performs dry-run verification on each YAML when applying manifests. Once there is a problem with one YAML, it returns an error directly; as a result, the other, valid YAML cannot be applied. See: github.com/fluxcd/pkg/ssa/manager_apply.go:125

We expect that a problem in one application's YAML should not prevent other applications' YAML from being applied.
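
One possible shape for this (a sketch, not the reporter's attached implementation): apply objects individually and collect errors instead of failing fast:

var errs []error
for _, obj := range objects {
	if _, err := manager.Apply(ctx, obj, ssa.DefaultApplyOptions()); err != nil {
		errs = append(errs, fmt.Errorf("%s: %w", ssa.FmtUnstructured(obj), err))
	}
}
// Report errs after the loop; the valid objects have already been applied.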

git: Add Support for Clone No-op for Commit pinned clones

Commit-pinned clones do not support the use of LastRevision to short-circuit the clone operation.
This means, especially for libgit2, that full clones are executed, causing higher resource consumption.

This sparked as part of a CNCF thread in which a user reported source-controller hitting 6000% CPU saturation and 3600% memory saturation with the default resource requests/limits. That resource consumption was observed with 57 sources (most pinned to a specific Git hash).

The implementation could follow logic similar to the checkout for branches:
https://github.com/fluxcd/source-controller/blob/3013b5fc3df274d790d017ef98596f3e95073dd9/pkg/git/libgit2/checkout.go#L103-L120
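
A minimal sketch of the no-op check (names illustrative, not the actual pkg/git API):

// cloneNoOp reports whether a commit-pinned clone can be skipped entirely,
// mirroring the LastRevision short-circuit already used for branch checkouts.
func cloneNoOp(pinnedCommit, lastObservedCommit string) bool {
	return pinnedCommit != "" && pinnedCommit == lastObservedCommit
}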

Add support for secure hash algorithms for known_hosts

The use of SHA-1 is considered insecure. From a compliance perspective, this may also be an issue for Flux users seeking to be "NIST-compliant".

func hashHost(hostname string, salt []byte) []byte {
	mac := hmac.New(sha1.New, salt)
	mac.Write([]byte(hostname))
	return mac.Sum(nil)
}

The recommendation here would be to add support for secure algorithms such as SHA-256 or SHA-3.
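
A sketch of a SHA-256 variant; note that the OpenSSH known_hosts format itself only defines HMAC-SHA1 hashed entries ("|1|..."), so interoperability with OpenSSH-generated files would need to be considered:

import (
	"crypto/hmac"
	"crypto/sha256"
)

func hashHostSHA256(hostname string, salt []byte) []byte {
	mac := hmac.New(sha256.New, salt)
	mac.Write([]byte(hostname))
	return mac.Sum(nil)
}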

Non-blocking event recorder

The event recorder currently runs in a blocking fashion, which can affect the reconcilers when they emit an event and the event webhook server takes time to respond. The HTTP client waits and retries on failure. This results in the reconciler waiting until the event is posted. Sometimes reconcilers time out waiting for the event emission to complete, resulting in failed reconciliation. A non-blocking event recorder would prevent the reconcilers from being affected by failures in event recording.

Following are some ideas to address this issue:

non-blocking request to the event webhook server

A quick and simple change to solve the immediate issue would be to make the HTTP request to the event webhook server non-blocking by running it in a goroutine. This would unblock the reconciler but doesn't guarantee the ordering of the requests. If the webhook server is offline and the reconciliation is failing, the retrying reconcilers may create many of these goroutines, each retrying to post its event with back-off. Due to the variation in the back-off duration, when the webhook server becomes available again, the events would be posted out of order. Since there's no de-duplication at the event source level, the webhook server would have to serve all the accumulated event requests and de-duplicate on its own. This may result in the creation of too many goroutines that are all trying to do the same thing, although the goroutines can be configured to give up after a certain number of attempts to ensure they get cleaned up.

per controller event processor

Another approach would be to introduce an event processor per controller. The events package can provide an API to run an event processor, typically in the main.go file before setting up the reconcilers. The event recorder in the reconcilers would send events to the event processor through a buffered channel. The event processor would collect the events to be posted to the event webhook server, categorize them, and process them based on certain strategies. Since all the events go through a central event processor, it can be used to add more functionality at the event source: ordering of events can be maintained, de-duplication can be done at the source, and spamming the event server can be avoided. If the event server isn't ready, the event processor can perform a health check and hold all event processing. If the event buffer fills up, it can drop certain events based on certain strategies. More interesting things can be done centrally at the controller level, at the source of events.
This is similar to the events notification broadcaster in Kubernetes apimachinery: https://github.com/kubernetes/apimachinery/blob/fd8a60496be5e4ce19d586738adb48ac6fa66ef9/pkg/watch/mux.go#L43

A variation would be to run an event processor per reconciler, or even per object, which would create opportunities to handle the events in different ways, like a tenant configuring the events related to their objects to be sent to their own event server which they manage. A sketch of the per-controller processor follows.
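
A sketch of the per-controller processor (Event is a stand-in type here; the real recorder's event payload would be used):

type Event struct{ Reason, Message string } // stand-in payload

type EventProcessor struct {
	ch chan Event // buffered
}

// Enqueue never blocks the reconciler: when the buffer is full, the event
// is dropped according to the chosen strategy.
func (p *EventProcessor) Enqueue(ev Event) {
	select {
	case p.ch <- ev:
	default:
	}
}

// Start runs in its own goroutine (e.g. from main.go) and posts events in
// order, de-duplicating at the source before hitting the webhook server.
func (p *EventProcessor) Start(ctx context.Context) {
	for {
		select {
		case <-ctx.Done():
			return
		case ev := <-p.ch:
			postEvent(ev) // POST to the webhook, with retries/back-off
		}
	}
}

func postEvent(ev Event) { /* HTTP POST elided */ }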

Document abstract outline of package in `fluxcd/pkg/runtime` directory

Follow up to #117

This is a top level README in fluxcd/pkg/runtime.

This should be a markdown file in the package, serving as a GitHub landing page, which outlines and glues together the in-depth directory contents but doesn't give any advice on how to use them (that will be added separately to the website). To show users code examples, link to the inline code examples in the Go documentation. The idea is for this to be a lighter version of what we will put on the website, in the context of the runtime package only.

Contains:

  • Description (in go doc directory)
  • A more human description of what’s written in the package descriptions
  • Intro
  • List of things that are in there
  • Generic list of subjects (standards, or other things to take into account if you make use of the runtime package or contributing to Flux)
  • When we mention our standards and API conventions, we can link to the packages that help with those standards

/assign @hiddeco

Consolidate git implementations

The implementation of Git operations is scattered around flux2, source-controller, image-automation-controller and this repository, resulting in code duplication in both core functionality and tests.

Consolidation steps:

Add version parameter to GitHub actions

When using the pkg GitHub actions, we should be able to override the binary version, e.g.:

      - name: Setup Kustomize
        uses: fluxcd/pkg/actions/kustomize@master
        with:
          version: 3.8.0

runtime/conditions/check: Remove ObservedGeneration check

When using SerialPatcher, the Reconciling condition will have ObservedGeneration=Generation, so we need to remove the following check:

[Check-FAIL]: The root ObservedGeneration must be less than the Reconciling condition ObservedGeneration when Reconciling condition is True

git: Requirements to move back to `go-git/go-git`

The go-git/go-git dependency has recently been replaced with fluxcd/go-git. In order to be able to move back to the upstream project, we need:

Upstream performance improvements

Upstream behaviour changes

  • go-git/go-git: Surface a way for users to change unsupported capabilities.
  • skeema/knownhosts and/or go-git/go-git: Add support for auto-populating HostKeyAlgorithms for non-file-based known_hosts.
    • go-git/go-git#548 broke backwards compatibility. fluxcd/go-git reverts some of its changes so that the latest versions of go-git/go-git continue to work with Flux.
    • The auto-population of HostKeyAlgorithms is a great feature to have and will decrease the likelihood of issues such as:
  • Add support for env proxying when using HTTPS with custom CAs

Standardise logging

From Ada Logics:

Proper logging is important to ensure audit trails in case of breaches. However, throughout the code we found inconsistency in the way logging is handled; often, when errors occur, there is no error logging.

The ImageAutomationReconciler declares the following failWithError function:

https://github.com/fluxcd/image-automation-controller/blob/674f833d982628831c101a919c0154c64c455db3/controllers/imageupdateautomation_controller.go#L149-L157

This function is consistently used to return from the Reconcile function in an appropriate manner with proper logging, e.g:

https://github.com/fluxcd/image-automation-controller/blob/674f833d982628831c101a919c0154c64c455db3/controllers/imageupdateautomation_controller.go#L224-L227

This is a good way of abstracting common logic, and it also ensures logging. However, this pattern is not used in any of the other controllers, which implement the logic quite differently, e.g.:

https://github.com/fluxcd/source-controller/blob/d7afc3596bdfc3818ed8987db029bea8a461c62c/controllers/gitrepository_controller.go#L175-L200

https://github.com/fluxcd/image-reflector-controller/blob/327a6ea9fd78783e4daaa2da3cc1f40dbcca0cab/controllers/imagepolicy_controller.go#L118-L131

https://github.com/fluxcd/notification-controller/blob/8ff5f75a255d7dd0677590510ab1abf1d33b4f85/controllers/alert_controller.go#L72-L79

Recommendation

  • Standardise how logging should occur.
  • Use helper methods for common error handling (a sketch follows the list).
  • Log whenever errors occur.
  • Log differently depending on how each controller exits the Reconcile function.
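
A sketch of a shared helper in the spirit of failWithError, usable across controllers (imports from controller-runtime assumed):

// failWithError logs the error, records a warning event, and returns the
// error so controller-runtime requeues with back-off.
func (r *Reconciler) failWithError(ctx context.Context, obj client.Object, err error) (ctrl.Result, error) {
	log.FromContext(ctx).Error(err, "reconciliation failed")
	r.EventRecorder.Event(obj, corev1.EventTypeWarning, "ReconciliationFailed", err.Error())
	return ctrl.Result{}, err
}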

Create new flag to set default hash function

The controllers use hash functions to generate checksums of artifacts, which are later used for change and anti-tampering detection.
Users would benefit from being able to set which algorithm is used across the board, in line with their performance, security and regulatory requirements.

The scope of this flag is only the checksums produced by Flux itself. It won't be able to change algorithms used for things defined elsewhere; for example, the algorithm used for hashing hostnames in known_hosts cannot be changed, as only SHA-1 is supported there.
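
A sketch of a flag handler implementing pflag.Value (the algorithm set and flag name are illustrative, not the final API):

type HashAlgorithm string

func (a *HashAlgorithm) String() string { return string(*a) }
func (a *HashAlgorithm) Type() string   { return "hashAlgorithm" }

func (a *HashAlgorithm) Set(s string) error {
	switch s {
	case "sha256", "sha384", "sha512":
		*a = HashAlgorithm(s)
		return nil
	default:
		return fmt.Errorf("unsupported hash algorithm %q", s)
	}
}

// usage in main.go: flags.Var(&algo, "artifact-digest-algo", "algorithm for artifact checksums")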

Outstanding tasks

  • Create new flag handler (fluxcd/pkg)
  • Roll-out across projects
    • source-controller
    • helm-controller
    • kustomize-controller
    • notification-controller
    • flux2 (set default across all controllers)

Add trace log level

In fluxcd/image-automation-controller#190 I added some tracing output to the image automation controller. Since that controller uses pkg/runtime/logging, and that doesn't know about --log-level=trace, I had to put the tracing in the debug log level. It would be good to recognise the trace level, though.

  • parse --log-level=trace
  • add a const for the log level so it can be referred to in log.V(CONST).Info(...) (see the sketch below)
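
A sketch of what the constants could look like (logr verbosity values; the exact value and package placement are choices to make):

const (
	// DebugLevel matches the existing --log-level=debug verbosity.
	DebugLevel = 1
	// TraceLevel is one step more verbose, enabled by --log-level=trace.
	TraceLevel = 2
)

// usage: log.V(logger.TraceLevel).Info("applied patch", ...)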

Unable to Pull Helm Charts from ECR Repositories

This is in response to /pull/362. We have Helm charts in the root of ECR repositories, and Flux is unable to pull them.

The error we now get: invalid chart reference: failed to get chart version for remote reference: could not get tags for "SERVICENAME": could not fetch tags for "oci://ACCOUNTNUMBER.dkr.ecr.REGION.amazonaws.com/SERVICENAME": GET "https://ACCOUNTNUMBER.dkr.ecr.REGION.amazonaws.com/v2/SERVICENAME/tags/list": credential required for basic auth

source-controller version: ghcr.io/fluxcd/source-controller:v0.32.1

This is the URL of our HelmRepository: url: oci://ACCOUNTNUMBER.dkr.ecr.REGION.amazonaws.com

Complete error: {"level":"error","ts":"2022-12-09T20:46:20.623Z","msg":"invalid chart reference: failed to get chart version for remote reference: could not get tags for \"SERVICENAME\": could not fetch tags for \"oci://ACCOUNTNUMBER.dkr.ecr.REGION.amazonaws.com/SERVICENAME\": GET \"https://ACCOUNTNUMBER.dkr.ecr.REGION.amazonaws.com/v2/SERVICENAME/tags/list\": credential required for basic auth","name":"default-SERVICENAME","namespace":"flux-system","reconciler kind":"HelmChart","annotations":null,"error":"InvalidChartReference","stacktrace":"github.com/fluxcd/pkg/runtime/events.(*Recorder).AnnotatedEventf\n\tgithub.com/fluxcd/pkg/[email protected]/events/recorder.go:137\ngithub.com/fluxcd/pkg/runtime/events.(*Recorder).Eventf\n\tgithub.com/fluxcd/pkg/[email protected]/events/recorder.go:114\ngithub.com/fluxcd/source-controller/internal/reconcile/summarize.RecordContextualError\n\tgithub.com/fluxcd/source-controller/internal/reconcile/summarize/processor.go:56\ngithub.com/fluxcd/source-controller/internal/reconcile/summarize.(*Helper).SummarizeAndPatch\n\tgithub.com/fluxcd/source-controller/internal/reconcile/summarize/summary.go:193\ngithub.com/fluxcd/source-controller/controllers.(*HelmChartReconciler).Reconcile.func1\n\tgithub.com/fluxcd/source-controller/controllers/helmchart_controller.go:228\ngithub.com/fluxcd/source-controller/controllers.(*HelmChartReconciler).Reconcile\n\tgithub.com/fluxcd/source-controller/controllers/helmchart_controller.go:263\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\tsigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:121\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\tsigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:320\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\tsigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:273\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\tsigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:234"}

We have been running flux successfully using a custom source controller built from this (1a4b57a) commit version.

git: Test Coverage

Test coverage aimed for across both Git implementations pre-GA:

  • Unit tests
    • Clone
    • Commit
    • Push to same branch
    • Push to different branch
    • Force push
    • Credentials over HTTP
    • HTTP/1.1
    • HTTP Smart Server
  • E2E Tests
    • GitLab SaaS (at fluxcd/pkg)
      • Private repository via HTTPS
      • Private repository via SSH
    • BitBucket Server
      • Private repository via HTTPS
      • Private repository via SSH
    • GitHub SaaS
      • Public repository via HTTPS
      • Private repository via HTTPS (at fluxcd/pkg)
      • Private repository via SSH (at fluxcd/flux2 and fluxcd/pkg)
    • Azure DevOps SaaS
      • Private repository via HTTPS
      • Private repository via SSH (at fluxcd/flux2)
    • GitLab on Prem (at fluxcd/pkg)
      • Private repository via HTTPS
      • Private repository via SSH
