
auditree-arboretum's Introduction


auditree-arboretum

The Auditree common fetchers, checks and harvest reports library.

Introduction

Auditree Arboretum is a Python library of common compliance fetchers, checks & harvest reports built upon the Auditree compliance automation framework.

Repo content

Functionality categorization

Arboretum fetchers, checks, and Harvest reports are organized into functional grouping categories. The following categories either already contain contributions or are expected to receive them in the near future. We anticipate that this list will grow as arboretum matures.

Fetchers

Please read the framework documentation for fetcher design principles before contributing a fetcher.

Fetchers must apply no logic to the data they retrieve. They must write that data unadulterated (modulo sorting & de-duplication) into the /raw area of the locker via the framework-provided decorators or context managers.

Fetchers must be atomic - retrieving and creating the data they are responsible for. Fetcher execution order is not guaranteed and so you must not assume that evidence already exists and is current in the locker. Use evidence dependency chaining if a fetcher depends on evidence gathered by another fetcher in order to gather its intended evidence.

Fetchers should be as fast as the API call allows. If a call is long-running, it should be separated into a dedicated evidence-providing tool, which places data where a fetcher can retrieve it easily & quickly.
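
Below is a minimal fetcher sketch. It assumes the framework's ComplianceFetcher base class and store_raw_evidence decorator behave as documented; the category and evidence path ("auditree/example_evidence.json") are hypothetical placeholders.

    import json

    from compliance.evidence import store_raw_evidence
    from compliance.fetch import ComplianceFetcher


    class ExampleFetcher(ComplianceFetcher):
        """Fetch example data and store it, unmodified, as raw evidence."""

        @store_raw_evidence('auditree/example_evidence.json')
        def fetch_example_evidence(self):
            # Retrieve the data from its source (API, CLI, etc.) and return it
            # as-is; the decorator writes it to the /raw area of the locker.
            data = {'example': 'unmodified content from the source system'}
            return json.dumps(data)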

Checks

Please read the framework documentation for check design principles before contributing a check.

Checks should only use evidence from the evidence locker to perform check operations. Also, checks should not write to or change evidence in the evidence locker. That is the job of a fetcher.

Jinja is used to produce reports from checks. As such, each check class must have at least one associated report template in order to produce a check report. In keeping with the "DevSecOps" theme, check reports are meant to provide details on violations identified by checks. These violations are in the form of failures and warnings. They aren't meant to be used to format fetched raw evidence into a readable report. Harvest reports should be used to satisfy that need.
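
Below is a minimal check sketch. The ComplianceCheck base class and the add_failures helper reflect the framework as I understand it, but treat the exact names, the evidence path, and the template name as assumptions rather than a definitive implementation.

    import json

    from compliance.check import ComplianceCheck


    class ExampleCheck(ComplianceCheck):
        """Validate example evidence already present in the locker."""

        @property
        def title(self):
            return 'Example'

        def test_example_evidence(self):
            # Read (never write) evidence from the locker and record any
            # violations as failures/warnings for the check report.
            evidence = self.locker.get_evidence('raw/auditree/example_evidence.json')
            if not json.loads(evidence.content):
                self.add_failures('example-evidence', 'Evidence content is empty')

        def get_reports(self):
            # At least one Jinja report template must back the check report.
            return ['auditree/example.md']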

Harvest Reports

Harvest reports are hosted with the fetchers/checks that collect the evidence those reports process. Within auditree-arboretum this means the harvest report code lives in reports folders throughout this repository. For more details check out harvest report development in the harvest README.

Usage

arboretum is available for download from PyPI.

Prerequisites

  • Supported for execution on OSX and LINUX.
  • Supported for execution with Python 3.6 and above.

Integration

Follow these steps to integrate auditree-arboretum fetchers and checks into your project:

  • Add this auditree-arboretum package as a dependency in your Python project.

  • The following steps can be taken to import individual arboretum fetchers and checks.

    • For a fetcher, add a fetch_<category>_common.py module, if one does not already exist, in your project's fetchers path where the <category> is the respective category folder within this repo of that fetcher. Having a separate common "category" module guards against name collisions across categories.
    • For a check, add a test_<category>_common.py module, if one does not already exist, in your project's checks path where the <category> is the respective category folder within this repo of that check. Having a separate common "category" module guards against name collisions across providers and technologies.
    • Import the desired fetcher or check class and the auditree-framework will handle the rest.

    For example to use the Abandoned Evidence fetcher from the auditree category, add the following to your fetch_auditree_common.py:

    from arboretum.auditree.fetchers.fetch_abandoned_evidence import AbandonedEvidenceFetcher
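
    Similarly, to use the corresponding Abandoned Evidence check (shown in the controls.json example below), add the following to your test_auditree_common.py:

    from arboretum.auditree.checks.test_abandoned_evidence import AbandonedEvidenceCheck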
  • auditree-arboretum fetchers and checks are designed to execute as part of a downstream Python project, so you may need to set up your project's configuration in order for the fetchers and checks to execute as desired. Each category folder in this repository includes a README.md that documents each fetcher's and check's configuration.

    • In general auditree-arboretum fetchers and checks expect an org field with content that captures each fetcher's and check's configuration settings.

    For example:

    {
      "org": {
        "auditree": {
          "abandoned_evidence": {
            "threshold": 1234567,
            "exceptions": {
              "raw/path/to-evidence.json": "This is a good reason",
              "raw/path/to-evidence-2.json": "This is also a good reason"
            }
          }
        }
      }
    }
  • Finally, for a check, be sure to add the appropriate entry into your project's controls.json file. Doing this allows you to group checks together as a control set which is useful for organizing check notifications and targeted check execution.

    For example to use the Abandoned Evidence check, add something similar to the following to your project's controls.json:

    {
      "arboretum.auditree.checks.test_abandoned_evidence.AbandonedEvidenceCheck": {
        "auditree_evidence": {
          "auditree_control": ["arboretum.auditree"]
        }
      }
    }


auditree-arboretum's Issues

new feature: cluster list/resource fetcher

Overview

Kubernetes resources (e.g., kubectl get pod) can be used as evidence. For example, the spec of a Pod, the custom resource of an operator, and a ConfigMap show whether applications (pods) and Kubernetes infrastructure (operators) run with the correct (expected) configuration. An enterprise often uses multiple clusters operated by multiple cloud service platforms (e.g., EKS of AWS, GKE of GCP, OpenShift of IBM Cloud) for its IT infrastructure. In that situation, it is not straightforward to fetch resources from the multiple clusters because their authentication/authorization mechanisms and cluster management mechanisms differ across providers.

This issue focuses on fetching resources from multiple clusters of multiple cloud service providers. We plan to implement two fetchers: a cluster list fetcher (per cloud service provider) and a cluster resource fetcher.

Requirements

  • The cluster list fetcher(s) should fetch a list of clusters for vanilla kube clusters and for at least one cloud service provider's clusters
  • The cluster resource fetcher should fetch resources from the listed clusters

Approach

(Design diagram: auditree_kube_design 001)

To support multiple cloud service providers,

  1. the cluster list fetcher is implemented per provider
  2. the cluster resource fetcher is implemented once, but the login mechanism for each provider is delegated to a provider-specific login function

Cluster List Fetchers

  • For vanilla kube, a BOM (Bill of Materials) is written in an auditree config file and stored in the evidence locker
  • For each public cluster, the provider-specific CLI tool (e.g., eksutil for EKS, gcloud for GKE, ibmcloud for IBM Cloud) will be used to log in to each cloud provider, and then fetch the cluster list from the provider's cluster admin API

Cluster Resource Fetcher

  • For vanilla kube, kubectl get RESOURCE_TYPE --kubeconfig path/to/kubeconfig is used to fetch resources
  • For each public cluster, the provider-specific CLI tool will be used to log in to each cluster, and then kubectl get RESOURCE_TYPE (neither --kubeconfig nor --token is specified because the authorization token is already configured by the login command) is used to fetch resources (a sketch of this step follows the list)
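
The following is a rough, hypothetical sketch of that retrieval step in Python; shelling out to kubectl and the exact arguments used are assumptions about the eventual implementation, not its definitive form.

    import json
    import subprocess

    def fetch_resources(resource_type, kubeconfig=None):
        """Fetch cluster resources as JSON via kubectl."""
        cmd = ['kubectl', 'get', resource_type, '--output', 'json']
        if kubeconfig:
            # Vanilla kube: use the kubeconfig named in the auditree config.
            cmd += ['--kubeconfig', kubeconfig]
        # Public cloud: the provider CLI login has already configured the
        # token, so neither --kubeconfig nor --token is passed.
        result = subprocess.run(cmd, stdout=subprocess.PIPE, check=True)
        return json.loads(result.stdout.decode('utf-8'))

    # e.g., pods = fetch_resources('pod', 'path/to/kubeconfig')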

Security and Privacy

Cluster List fetchers

  • For vanilla kube clusters, no authorization token is used because the BOM is directly specified in an auditree config file, and thus no security concern exists.
  • For each public cluster, the API key for the provider should be stored in ~/.credential to log in to the cluster management API of each provider. The user needs to manage ~/.credentials in a secure manner.

Cluster Resource fetcher

  • For vanilla kube clusters, the existing kubeconfig file specified in an auditree config file is used to access the cluster. The user needs to manage the kubeconfig file as usual.
  • For each public cluster, the kubeconfig file configured by the login command of the provider's CLI tool is used to access the cluster, similar to the list fetcher behaviour.

Test Plan

The test will be done against one public cluster service both for vanilla kube logic and public cloud logic.

Add a Zenhub workspace fetcher for GH(E) issues

Overview

Add a fetcher that will gather pipeline issues in a Zenhub workspace and gh(e) repo.

Requirements

  • see overview
  • fetcher to go in issues/fetchers/github
  • fetcher to work with Zenhub and Zenhub Enterprise
  • more details to be added...

Approach

TBD

Security and Privacy

N/A

Test Plan

TBD

Migrate evidence locker integrity f/c's

Overview

We need to migrate over the evidence locker fetchers and checks.

Requirements

  • Migrate fetchers
    • fetch locker repo details
    • fetch recent commit details
    • fetch branch protection details
  • Migrate checks
    • check repo integrity
    • check commit integrity
      • signed commits
    • check branch protection integrity
      • signed commits
    • Split signed commit integrity checks out as separate checks

Approach

TBD

Security and Privacy

N/A

Test Plan

TBD

new feature: check for compliance operator result

Overview

Compliance Operator is a tool to validate that a cluster infrastructure complies with standards such as NIST SP 800-53, HIPAA or CIS Benchmark. It runs the openscap command, which generates a result report in XML format. Compliance Operator embeds the report into .spec.data of a ConfigMap resource in the cluster, and therefore a consumer of the validation result needs to parse the XML data in the ConfigMap resource to show the details of the validation result.

This issue focuses on a check which generates a report by analyzing the XML report of Compliance Operator stored in a ConfigMap resource.

Requirements

  • The check should generate a report showing the compliance state of each control (identified by control ID) specified in an auditree config file, linked with the actual validation results (identified by XCCDF ID)
    • for example, a cluster infrastructure complies with NIST SP 800-53 control CA-3(5) if all of the following tests are PASS: xccdf_org.ssgproject.content_rule_set_firewalld_default_zone, xccdf_org.ssgproject.content_rule_configure_firewalld_ports

Approach

The check consumes ConfigMap resources fetched by cluster resource fetcher. The check extracts XML data from the ConfigMap resources, and then parses the XML to enumerate the result of each XCCDF test. Finally, the check decides whether a control is compliant or not by mapping the XCCDF results in XML to the control specified in an auditree config.
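
A rough sketch of that mapping logic follows; the structure of the fetched ConfigMap evidence and of the control-to-rule mapping in the auditree config are assumptions, and only the XCCDF rule-result parsing and the control decision are shown.

    import xml.etree.ElementTree as ET

    def xccdf_results(report_xml):
        """Return a {XCCDF rule id: result} dict parsed from an XCCDF report."""
        results = {}
        for elem in ET.fromstring(report_xml).iter():
            if elem.tag.endswith('rule-result'):
                rule_id = elem.get('idref')
                for child in elem:
                    if child.tag.endswith('result'):
                        results[rule_id] = (child.text or '').strip()
        return results

    def control_is_compliant(rule_ids, results):
        """A control is compliant only if every mapped XCCDF rule passed."""
        return all(results.get(rule_id) == 'pass' for rule_id in rule_ids)

    # e.g., CA-3(5) from the requirement above:
    # control_is_compliant(
    #     ['xccdf_org.ssgproject.content_rule_set_firewalld_default_zone',
    #      'xccdf_org.ssgproject.content_rule_configure_firewalld_ports'],
    #     xccdf_results(configmap_xml_data))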

Security and Privacy

TBD

Test Plan

The test will be done against one public cluster service both for vanilla kube logic and public cloud logic.

Add repo integrity checks (and more fetchers)

Overview

We need to migrate over the current set of repo integrity checks and add additional fetchers and checks as well.

Requirements

  • Migrate repo integrity checks
  • TBD...

Approach

TBD

Security and Privacy

N/A

Test Plan

TBD

New feature: Github source code repos permissions check

Overview

Provide a check to control access to Github repositories containing source code. Permission should be granted at the team level, and not by adding single collaborators outside of team membership.
This can be done through a check that alerts if there are single collaborators instead of teams and also alerts about forks of the source code repos.

Requirements

  • Based on the evidence collected by the existing fetcher here

For each Github repo containing source code:

  • check that all the collaborators are defined as teams and that no single collaborator outside of a Github team is added to the list.
  • alert if any of the collaborators is detected to be a single user instead of a team.
  • check if there are forks of the source code repos
  • alert if there are forks of the source code repos

Approach

Implement a permissions check which includes a report template to render the results. 
Each single collaborator found in a repo will be considered a failure and reported with the following information grouped by repository:

  • GitHub User
  • A flag indicating if the single collaborator is an organization member

The permissions check also reports on forks: each fork found in a repo will be considered a warning and reported with the following information grouped by repository:

  • Fork URL

As additional information, the permissions check also lists the organization teams as successes.

Security and Privacy

N/A

Test Plan

TBD

Fix repo metadata filtered_content attribute collision

Overview

As of v1.14.0 of the framework the filtered_content attribute was added to all RawEvidence. RepoMetadataEvidence happens to have a property defined as filtered_content. This is causing the fetcher to error with AttributeError: can't set attribute.

Requirements

Fix the collision between the RawEvidence filtered_content attribute and the RepoMetadataEvidence filtered_content property.

Approach

  • Rename the filtered_content property in RepoMetadataEvidence to relevant_content.
  • Update all references accordingly. Both are in test_locker_repo_integrity.py.

Security and Privacy

N/A

Test Plan

  • Fetcher should run through to completion
  • Check should run

Add check results summary and python packages summary reports

Overview

Add check results summary harvest report and python packages summary harvest report.

Requirements

  • Add check_results_summary harvest report to auditree category
  • Add python_packages_summary harvest report to auditree category

Approach

  • see req.
  • tbd

Security and Privacy

N/A

Test Plan

tbd

new feature: create OSCAL json report from compliance operator evidence

Overview

Provide a harvest report to transform Kubernetes compliance operator evidence from cluster_resource fetcher into a NIST OSCAL Assessment Results collection of Observations in JSON format.

Rationale: standardized version of evidence for multi-cloud and to facilitate creation of NIST OSCAL Assessment Results.

Requirements

  • The cluster_resource fetcher produces evidence comprising a JSON file with embedded XML in non-OSCAL format.
  • The harvest report is to produce a JSON file comprising NIST OSCAL Assessment Results Observations.
  • The harvest report is to produce an enhanced JSON file with additional Observation data when an optional oscal-metadata YAML file is specified.
  • Employ transformation technology available from compliance-trestle open source project.

Approach

Write a harvest report that consumes cluster_resource evidence and optional oscal-metadata.yaml to produce compliance_oscal_observations.json.

Steps:

  • read evidence from cluster_resource.json.
  • read enhancement data from oscal_metadata.yaml, if it exists.
  • employ trestle transformer to create list of trestle Observations.
  • write trestle Observations JSON as compliance_oscal_observations.json.

Security and Privacy

N/A

Test Plan

Employ unit tests comprising representative cluster_resource.json and oscal-metadata.yaml.

Restructure f/c offerings folders

Overview

We need to simplify the fetchers, checks, and reports folder layout into a more flattened set of categories, removing the notions of "provider" and "technology".

Requirements

  • Remove provider folder
  • Remove technology folder
  • Move the auditree technology to be a top level categorization as auditree folder
  • Add the following categories
    • kubernetes
    • ibm_cloud
    • chef
    • ansible
    • object_storage
    • pager_duty
    • splunk

Approach

  • Add new folder structures for all categories above
    • Add fetchers, checks, evidences, reports, templates sub-folders
    • Add README stubs for all category folders
  • Adjust current set of auditree fetchers and checks as needed

Security and Privacy

N/A

Test Plan

  • All auditree fetchers and checks should work as before
  • All unit tests should also still run as before

Migrate Python Packages fetchers and checks

Overview

We need to migrate over the Python Packages fetchers and checks.

Requirements

  • Migrate fetchers
    • fetch all current packages in virtual env
    • fetch release info for auditree-arboretum
    • fetch release info for auditree-framework
  • Migrate checks
    • check for differences between the current set of packages in the virtual env and the most recent prior set of packages (warn)
    • confirm latest arboretum release is being used (warn)
    • confirm latest framework release is being used (warn)

Approach

TBD

Security and Privacy

N/A

Test Plan

  • Fetchers should fetch the XML content and place it in the locker
  • Checks should produce a report with appropriate warnings

Add unit tests for new evidence classes

Overview

In #20, unit tests were neglected for new evidence classes. They need to be added.

Requirements

Add unit tests for:

  • repo_branch_protection.py
  • repo_commit.py
  • repo_metadata.py

Approach

See req.

Security and Privacy

N/A

Test Plan

unit tests should run successfully

Check planted evidence

Overview

As evidence can be placed in the locker with plant, we should have some checks, beyond abandoned evidence, against that. I think something like warnings when within a certain time period of the evidence ttl & errors when within a shorter period would be a good start; a minimal sketch of that classification follows the requirements list below.

Requirements

  • Walk all the external evidence
  • Warn if ttl is N days away (N from config, default 21 days?)
  • Error if ttl is M days away (M from config, default 7 days?)
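
A minimal sketch of the warn/error classification, assuming the check can obtain each piece of external evidence's TTL and last-update time from the locker (how that metadata is retrieved is not shown) and using the default thresholds suggested above:

    from datetime import datetime, timedelta

    def classify(last_update, ttl_seconds, warn_days=21, error_days=7, now=None):
        """Return 'error', 'warning' or 'ok' based on time left before TTL expiry."""
        now = now or datetime.utcnow()
        remaining = (last_update + timedelta(seconds=ttl_seconds)) - now
        if remaining <= timedelta(days=error_days):
            return 'error'
        if remaining <= timedelta(days=warn_days):
            return 'warning'
        return 'ok'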

Approach

Provide a detailed approach to satisfy all of the requirements listed in the
previous section. This level of detail may not be available at the time of
issue creation and can be completed at a later time.

Security and Privacy

Provide the impact on security and privacy as it relates to the completion of
this issue. This level of detail may not be available at the time of
issue creation and can be completed at a later time. N/A if not applicable.

Test Plan

Provide the test process that will be followed to adequately verify that the
approach above satisfies the requirements provided. This level of detail may
not be available at the time of issue creation and can be completed at a later
time.

Add an auditree evidence too large check

Overview

Similar to abandoned evidence, we need a check that flags evidence as being too large.

Requirements

  • Too large means:
    • 50MB

  • Needs to be configurable to allow for evidence exclusion.

Approach

  • I think we should add the core functionality to auditree-framework
  • The check should reside in arboretum and the exclusion logic should be in the check rather than the framework.

Security and Privacy

N/A

Test Plan

TBD

Add compliance config fetcher and check

Overview

Add compliance execution configuration fetcher and checks.

Requirements

  • Migrate the compliance execution fetcher and checks to arboretum.

Approach

  • Copy fetcher, check, and report template
  • Sanitize

Security and Privacy

N/A

Test Plan

TBD

Persistent failure report

Overview

It would be good to have a report that highlights persistent failures - checks that consistently fail for many days.

Requirements

  • configurable threshold for persistent failure, measured in time
  • the report is a list of (check, time period of failure, last successful run) entries (e.g., check foo has been failing for 8 days, last successful run $TIMESTAMP)
  • if we were clever, could extend to flapping tests...

Approach

Write a report in https://github.com/ComplianceAsCode/auditree-arboretum/tree/main/arboretum/auditree/reports

Security and Privacy

N/A

Test Plan

Will need some fake result data to demonstrate persistent failure.

New feature: GitHub organization, direct repo collaborators check

Overview

This issue suggests the implementation of a check to find direct repo collaborators in all the repos (or subsets of repos) of a given list of GH organizations.

Requirements

  • Direct collaborator as defined by the direct affiliation type here
  • Based on the evidence collected by the existing fetcher here
  • The check should run against the evidence of all those organizations whose collaborator_types field contains the value direct in their configuration

Approach

Implement a check in permissions, including a report template to render the results.
Each direct collaborator found in a repo will be considered a failure and reported with the following information:

  • GitHub User
  • GitHub Organization
  • Repository

Security and Privacy

N/A

Test Plan

TBD

Migrate IBM Cloud Databases fetchers with multiple resource_group_ids enhancement

Overview

We need to migrate the IBM Cloud Databases list and backups list fetchers.

Requirements

  • Should not depend on ibmcloud_tools
  • Fetcher should handle multiple resource_group_id's per account while being backward compatible

Approach

TBD

Security and Privacy

N/A

Test Plan

Local environment to be set up and evidence fetched

Add a Github issues fetcher

Overview

Add a fetcher that will gather Github (Enterprise) issues as evidence given a repo and search criteria.

Requirements

  • see overview
  • fetcher to go in issues/fetchers/github
  • fetcher will work for Github and Github enterprise
  • more details to be added...

Approach

TBD

Security and Privacy

N/A

Test Plan

TBD

New feature: GitHub organization, repo collaborators, forks and teams fetcher

Overview

Add a fetcher that will gather collaborators, forks and teams list for each repo of a Github organization.

Requirements

  • see overview
  • fetcher to go in /permissions/fetchers/github

Approach

For each Github organization specified, an evidence file is stored in the locker containing collaborators, forks and teams for the specified repositories in the organization. The default is to retrieve all forks from all repositories in each specified Github organization. TTL is set to 1 day.

Security and Privacy

N/A

Test Plan

N/A

Fix Org Collaborators check for missing evidence

Overview

The current behavior of the OrgCollaboratorsCheck is that it skips check processing if an org config does not contain a collaborator_type of direct. Unfortunately, this can lead to confusion if no org config is provided with a direct collaborator type. When there are no org configs containing a collaborator_type of direct, the check should ERROR instead.

Requirements

  • Raise an error if there are no org configs with a direct collaborator_type.
  • Update README to include collaborator_type == "direct" wording.

Approach

  • In OrgCollaboratorsCheck.test_org_direct_collaborators, raise a compliance.utils.exceptions.EvidenceNotFoundError error if there are no orgs with a direct collaborator_type.
  • Update README to include collaborator_type == "direct" wording.

Security and Privacy

N/A

Test Plan

  • Ensure current intended functionality isn't compromised
  • Ensure that when there are no orgs with direct collaborator_type config the check errors

PyYaml version needs an update

Overview

Dependency PyYaml is declared as pyyaml<5.4 and the latest version before that upper bound is over 3 years old: https://pypi.org/project/PyYAML/5.3.1/
That version also has vulnerability GHSA-6757-jp84-gxfx, which was only fixed in 5.4.1; see the commit message in kubernetes-client/python@cd15076

In general I was wondering why that dependency has an upper bound while all others use a lower bound. Was the PR review suggestion at #54 (comment) maybe a typo and it should have been pyyaml>5.4? @alfinkel @cletomartin

Requirements

N/A

Approach

N/A

Security and Privacy

N/A

Test Plan

N/A

Add an auditree empty evidence check

Overview

Similar to abandoned evidence, we need a check that flags evidence as being empty.

Requirements

  • Empty means:
    • No content
    • In the case of JSON evidence, empty means any content that resolves to False when it is loaded via json.loads(). For example, both {} and [] would be considered empty (a minimal sketch follows this list).
  • Needs to be configurable to allow for evidence exclusion.
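
A minimal illustration of the JSON emptiness rule described above:

    import json

    def is_empty_json(content):
        # No content at all, or content that loads to a falsy value ({} or []),
        # is treated as empty.
        return not content.strip() or not json.loads(content)

    assert is_empty_json('{}') and is_empty_json('[]')
    assert not is_empty_json('{"key": "value"}')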

Approach

  • I think we should add the core functionality to auditree-framework
  • The check should reside in arboretum and the exclusion logic should be in the check rather than the framework.

Security and Privacy

N/A

Test Plan

TBD

Use get_historical_evidence helper in checks

Overview

Once the ComplianceCheck.get_historical_evidence lands in the framework, we should change all uses of self.locker.get_evidence in the checks in this repo to self.get_historical_evidence. This will ensure that historical evidence metadata is stored as part of the report metadata and check_results.json.

Requirements

  • Update Abandoned Evidence check
  • Update Python Packages check
  • Update Locker Repo Integrity check

Approach

Switch uses of self.locker.get_evidence to self.get_historical_evidence
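
For illustration only, the change in each check would look roughly like the following; the check class, the evidence path and the get_historical_evidence signature (assumed here to mirror locker.get_evidence) are placeholders.

    from compliance.check import ComplianceCheck

    class SomeCheck(ComplianceCheck):
        def test_something(self):
            # Before:
            #   evidence = self.locker.get_evidence('raw/category/some_evidence.json')
            # After (historical evidence metadata now lands in the report
            # metadata and check_results.json):
            evidence = self.get_historical_evidence('raw/category/some_evidence.json')
            ...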

Security and Privacy

N/A

Test Plan

  • All checks should run as before
  • index.json for reports generated by those checks should now include historical evidence metadata
  • check_results.json should now include historical evidence metadata for those checks
