safedep / vet

Tool to achieve policy driven vetting of open source dependencies

License: Apache License 2.0

Go 95.41% Makefile 0.86% Dockerfile 0.17% Python 2.94% Shell 0.62%

devsecops policy-as-code security software-composition-analysis supply-chain-security

vet's Issues

Support Exception Management Workflow

Requirement

As a security engineer, I want to adopt vet in CI/CD as a security gate that prevents the introduction of new packages that violate my policy, so that I can avoid adding security & technical debt while I work on mitigating the existing problem.

This means we need a way to add the findings that exist at the time of adoption to an exception list. Filters can then ignore packages in the exception list so that vet does not fail in CI for existing packages.

Proposed Workflow

Use vet query command to generate exception list

vet query --from /path/to/json-dump --exception-add --exception-file /path/to/exceptions.yml

Subsequently, when filter query is executed, the packages in exception list should be ignored. vet should also load a default exceptions file if available from:

$PWD/.vet/exceptions.yml
$scanDirectory/.vet/exceptions.yml

CI Integration

The generated exceptions.yml should be pushed to the repository at a standard path for autoload, such as .vet/exceptions.yml, or passed explicitly as a parameter during the scan:

vet scan -D /path/to/repo --exception-file /path/to/exception.yml --filter '...' --filter-fail
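
The core of the exception workflow can be sketched in Go as below. This is only an illustration: the Package fields, the ExceptionList type and the FilterNew helper are hypothetical names chosen for this sketch, not vet's actual data model, and loading exceptions.yml from disk is left out.

```go
package main

import "fmt"

// Package identifies a dependency. The field set is an assumption made for this sketch.
type Package struct {
	Ecosystem, Name, Version string
}

// ExceptionList is a hypothetical in-memory form of exceptions.yml.
type ExceptionList map[Package]bool

// Add records a package found at adoption time as an accepted exception.
func (e ExceptionList) Add(p Package) { e[p] = true }

// FilterNew returns only the findings that are not covered by an exception.
// These are the findings that should fail the CI gate.
func FilterNew(findings []Package, exceptions ExceptionList) []Package {
	var out []Package
	for _, p := range findings {
		if !exceptions[p] {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	exceptions := ExceptionList{}
	exceptions.Add(Package{"npm", "left-pad", "1.0.0"})

	findings := []Package{
		{"npm", "left-pad", "1.0.0"}, // existing finding, covered by an exception
		{"npm", "new-dep", "0.1.0"},  // introduced after adoption
	}
	for _, p := range FilterNew(findings, exceptions) {
		fmt.Printf("policy violation: %s/%s@%s\n", p.Ecosystem, p.Name, p.Version)
	}
}
```

Matching on the exact ecosystem, name and version is a deliberate choice in this sketch: an upgraded version of an excepted package would be evaluated against the policy again.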

Support PURL Data Source for Single Package Scanning

Package URL (PURL) is a standard for representing a package using a URL notation. Example:

pkg:github/package-url/purl-spec@244fd47e07d1004f0aed9c
pkg:golang/google.golang.org/genproto#googleapis/api/annotations
pkg:maven/org.apache.xmlgraphics/[email protected]?packaging=sources
pkg:maven/org.apache.xmlgraphics/[email protected]?repository_url=repo.spring.io%2Frelease
pkg:npm/%40angular/[email protected]
pkg:npm/[email protected]

Having the ability to scan package URL is useful for vetting single packages. Example:

vet scan --purl pkg:npm/[email protected] --purl pkg:maven/org.apache.xmlgraphics/[email protected]

We need to support a list of PURLs as input and scan them as a single virtual manifest.

This should be implemented as a reader, as defined in #53.
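
To make the input format concrete, here is a deliberately simplified PURL parser in Go. It is a sketch under stated assumptions: the PackageURL struct and ParsePURL function are hypothetical names, qualifiers and subpath are discarded, and namespace and name are kept together. A real reader would more likely rely on an existing PURL library than on this code.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// PackageURL holds the parts of a PURL needed to scan a single package.
// Hypothetical type for this sketch.
type PackageURL struct {
	Type    string // e.g. "npm", "maven", "github"
	Name    string // namespace and name, e.g. "package-url/purl-spec"
	Version string // empty when the PURL carries no version
}

// ParsePURL is a simplified parser: it drops the subpath and qualifiers
// and does not validate the type against known ecosystems.
func ParsePURL(s string) (PackageURL, error) {
	if !strings.HasPrefix(s, "pkg:") {
		return PackageURL{}, fmt.Errorf("not a package URL: %q", s)
	}
	rest := strings.TrimPrefix(s, "pkg:")
	rest, _, _ = strings.Cut(rest, "#") // drop subpath
	rest, _, _ = strings.Cut(rest, "?") // drop qualifiers

	typ, path, ok := strings.Cut(rest, "/")
	if !ok || typ == "" || path == "" {
		return PackageURL{}, fmt.Errorf("missing type or name: %q", s)
	}

	version := ""
	if i := strings.LastIndex(path, "@"); i >= 0 {
		path, version = path[:i], path[i+1:]
	}

	// Names may be percent-encoded, e.g. "%40angular" for "@angular".
	name, err := url.PathUnescape(path)
	if err != nil {
		return PackageURL{}, fmt.Errorf("invalid encoding in %q: %w", s, err)
	}
	return PackageURL{Type: typ, Name: name, Version: version}, nil
}

func main() {
	p, err := ParsePURL("pkg:github/package-url/purl-spec@244fd47e07d1004f0aed9c")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("type=%s name=%s version=%s\n", p.Type, p.Name, p.Version)
}
```

Each parsed PURL would become one package entry in the virtual manifest that the reader hands to the scanner.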

Support package.json, yarn.json to extract dependencies in case lockfile is not available

Many projects, especially legacy ones, still use package.json or yarn.json without lock files. As a result, vet detects neither their dependencies nor their vulnerabilities.

Migrate to Using buf For Protocol Buffers Spec Management

Problem

Our proto3 spec management is poor and causes developer friction because we are not using a package manager, such as https://buf.build/, to manage external proto files.

Requirement

Adopt https://buf.build/ for managing proto files in api/. Maybe refactor all proto files into api/proto.

Handle API Error 401 Unauthorized

Insight API will return 401 in case of a bad or expired API key. Handle this case and print the correct error message from the backend.

Vet json report Protobuf lib has issue that for some of the vulnerabilities, title is empty

Whenever the id starts with PYSEC-***, the title is empty. Otherwise it is not.

2023-11-21T10:21:25.896+0530 DEBUG vet/vet2events.go:139 Found vuln with empty title id:"PYSEC-2022-19" aliases:"BIT-2022-22818" aliases:"BIT-django-2022-22818" aliases:"CVE-2022-22818" aliases:"GHSA-95rw-fx8r-36v6" {"service": "sd-github-app", "l": "zap"}
2023-11-21T10:21:25.896+0530 DEBUG vet/vet2events.go:139 Found vuln with empty title id:"PYSEC-2022-190" aliases:"BIT-2022-28346" aliases:"BIT-django-2022-28346" aliases:"CVE-2022-28346" aliases:"GHSA-2gwj-7jmv-h26r" {"service": "sd-github-app", "l": "zap"}
2023-11-21T10:21:25.896+0530 DEBUG vet/vet2events.go:139 Found vuln with empty title id:"PYSEC-2022-191" aliases:"BIT-2022-28347" aliases:"BIT-django-2022-28347" aliases:"CVE-2022-28347" aliases:"GHSA-w24h-v9qh-8gxj" {"service": "sd-github-app", "l": "zap"}
2023-11-21T10:21:25.896+0530 DEBUG vet/vet2events.go:139 Found vuln with empty title id:"PYSEC-2022-2" aliases:"BIT-2021-45116" aliases:"BIT-django-2021-45116" aliases:"CVE-2021-45116" aliases:"GHSA-8c5j-9r9f-c6w8" {"service": "sd-github-app", "l": "zap"}
2023-11-21T10:21:25.896+0530 DEBUG vet/vet2events.go:139 Found vuln with empty title id:"PYSEC-2022-20" aliases:"BIT-2022-23833" aliases:"BIT-django-2022-23833" aliases:"CVE-2022-23833" aliases:"GHSA-6cw3-g6wv-c2xv" {"service": "sd-github-app", "l": "zap"}
2023-11-21T10:21:25.896+0530 DEBUG vet/vet2events.go:139 Found vuln with empty title id:"PYSEC-2022-213" aliases:"BIT-2022-34265" aliases:"BIT-django-2022-34265" aliases:"CVE-2022-34265" aliases:"GHSA-p64x-8rxx-wf6q" {"service": "sd-github-app", "l": "zap"}
2023-11-21T10:21:25.896+0530 DEBUG vet/vet2events.go:139 Found vuln with empty title id:"PYSEC-2022-245" aliases:"BIT-2022-36359" aliases:"BIT-django-2022-36359" aliases:"CVE-2022-36359" aliases:"CVE-2022-45442" aliases:"GHSA-2x8x-jmrp-phxw" aliases:"GHSA-8x94-hmjh-97hq" {"service": "sd-github-app", "l": "zap"}
2023-11-21T10:21:25.896+0530 DEBUG vet/vet2events.go:139 Found vuln with empty title id:"PYSEC-2022-3" aliases:"BIT-2021-45452" aliases:"BIT-django-2021-45452" aliases:"CVE-2021-45452" aliases:"GHSA-jrh2-hc4r-7jwx" {"service": "sd-github-app", "l": "zap"}
2023-11-21T10:21:25.896+0530 DEBUG vet/vet2events.go:139 Found vuln with empty title id:"PYSEC-2022-304" aliases:"BIT-2022-41323" aliases:"BIT-django-2022-41323" aliases:"CVE-2022-41323" aliases:"GHSA-qrw5-5h28-6cmg" {"service": "sd-github-app", "l": "zap"}

Other example

2023-11-21T10:21:25.897+0530 DEBUG vet/vet2events.go:128 Found vuln id:"GHSA-72xf-g2v4-qvf3" title:"tough-cookie Prototype Pollution vulnerability" aliases:"CVE-2023-26136" severities:{type:CVSSV3 score:"CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:L/I:L/A:N" risk:MEDIUM} {"service": "sd-github-app", "l": "zap"}
2023-11-21T10:21:25.897+0530 DEBUG vet/vet2events.go:128 Found vuln id:"GHSA-wgfq-7857-4jcc" title:"Uncontrolled Resource Consumption in json-bigint" aliases:"CVE-2020-8237" severities:{type:CVSSV3 score:"CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H" risk:HIGH} {"service": "sd-github-app", "l": "zap"}
2023-11-21T10:21:25.897+0530 DEBUG vet/vet2events.go:128 Found vuln id:"GHSA-gwg9-rgvj-4h5j" title:"Code Injection in morgan" aliases:"CVE-2019-5413" severities:{type:CVSSV3 score:"CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H" risk:CRITICAL} {"service": "sd-github-app", "l": "zap"}

Vet Crashes on One of the SBOMs Generated from Github

Attached screenshot for the reference

Comprehensive SBOM and Dependencies detection

Vet should detect dependencies in a comprehensive manner as compared to other open source tools such as cdxgen

Show Package Information in Markdown Report

Currently we do not have a good way to visualize all information about a package. While showing this in markdown report is possible, it would be a problem if there are a lot of packages.

We need to identify the best way to show meta information, including vulnerability information of packages in a meaningful and useful manner

Configurable ignorable directory

Currently ignorable directories are hard coded. It would be nice if they were either:

Exported to a YAML file, so that anyone can add to it, compile and run
Or extendable via the command line

For example, vet ignores node_modules but not yet Python venv directories, and it ignores the git directory but not the svn one.
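
A minimal Go sketch of the "extendable" option follows. The default list, buildIgnoreSet and shouldSkip are hypothetical names for this illustration and are not taken from vet's source; the extra entries stand in for values read from a YAML file or a command line flag.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// defaultIgnored mirrors the idea of a built-in list. The exact entries
// are an assumption based on the examples mentioned in this issue.
var defaultIgnored = []string{".git", "node_modules"}

// buildIgnoreSet merges the built-in defaults with user-supplied names.
func buildIgnoreSet(extra []string) map[string]bool {
	set := make(map[string]bool, len(defaultIgnored)+len(extra))
	for _, name := range defaultIgnored {
		set[name] = true
	}
	for _, name := range extra {
		set[name] = true
	}
	return set
}

// shouldSkip reports whether a directory should be skipped during a scan,
// based on the last element of its path.
func shouldSkip(dir string, ignored map[string]bool) bool {
	return ignored[filepath.Base(dir)]
}

func main() {
	ignored := buildIgnoreSet([]string{".svn", "venv"})
	for _, dir := range []string{"repo/node_modules", "repo/venv", "repo/src"} {
		fmt.Printf("%s skip=%v\n", dir, shouldSkip(dir, ignored))
	}
}
```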

[npm] Recommended action is for tertiary dependencies which can't be touched.

Ran it over an npm codebase, and the recommendations specifically point to packages that are 3 or 5 levels deep in the dependency tree.

We should mark them as not being direct dependencies.
We should deprioritize them, or roll them up under the top-level dependency and report that it has the following faults.

Support CSV Output for Filter Results

Show Risk Tag for Each Upgrade Advice

Problem

The current summary table as well as markdown report generator sorts the table by Risk Score. This is causing confusion since Risk Score is undefined. Also it is not known why the risk score came up to be high for a few packages.

Proposed Solution

Associate tags with each package during risk score calculation to help identify which types of risk (vulnerability, popularity, version drift) contributed to the risk score.

Support Scanning Dependency Changes in Pull Request

Problem

It may be desirable to use vet to build a security guardrail that prevents the introduction of new insecure dependencies while the existing ones (backlog) are being worked on. To achieve this, we need the ability to scan only the dependency changes across two versions of the code base.

Solution

We can implement a package manifest reader specifically to read diff of dependencies across two versions of code. This can be implemented by:

Using native git commands to find diff in package manifests across base and head branch
Using platform specific APIs

Github Dependency Graph API can be used to fetch dependency changes across base and head branches in a PR.
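
Whichever source provides the two dependency lists, the diff step itself is small. The Go sketch below assumes a hypothetical Dep type and addedOrChanged helper; it treats a version bump as a new dependency to be scanned.

```go
package main

import "fmt"

// Dep is a hypothetical minimal dependency record for this sketch.
type Dep struct {
	Name, Version string
}

// addedOrChanged returns the dependencies present in head but not in base.
// A dependency whose version changed counts as new because the
// (name, version) pair differs.
func addedOrChanged(base, head []Dep) []Dep {
	seen := make(map[Dep]bool, len(base))
	for _, d := range base {
		seen[d] = true
	}
	var out []Dep
	for _, d := range head {
		if !seen[d] {
			out = append(out, d)
		}
	}
	return out
}

func main() {
	base := []Dep{{"express", "4.17.0"}, {"lodash", "4.17.21"}}
	head := []Dep{{"express", "4.18.2"}, {"lodash", "4.17.21"}, {"axios", "1.6.0"}}
	for _, d := range addedOrChanged(base, head) {
		fmt.Printf("scan: %s@%s\n", d.Name, d.Version)
	}
}
```

Only the returned packages would be passed to the filters, so the existing backlog does not fail the pull request.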

Analyse Integration of OSV and Deps.dev API to Decouple vet from SafeDep Backend

Current State

vet is currently dependent on Insights API for package enrichment. This requires an API key.

Proposed State

Based on user feedback, we want to support an option to use vet without strongly coupling with SafeDep Insights API (backend). This will help ease adoption by removing the must-have need to get an API Key

This is partly possible using OSV and Deps.dev API directly

Goal

Explore the feasibility of using the Deps.dev and OSV APIs directly and identify the tasks required to support this as an optional feature, incrementally. We will start with the following experience:

vet auth configure --community

Subsequent scans will use public data sources directly.

Use stderr for UI related display to allow dumping only output to file

Refactor Package Manifest Parsing into a Reader Interface

Currently different operations are performed to read package manifests from:

Directory
Lockfiles
JSON Dump

In future, we may need to be able to read from SBOM (SPDX, CycloneDX). To be able to ensure separation of concerns, we should

Define a reader interface for reading package manifests
Have implementation for different sources of package manifest
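
The two points above could look roughly like this in Go. The ManifestReader interface, the Manifest and Package types and the staticReader implementation are hypothetical names for this sketch and do not claim to match what was eventually implemented in vet. The staticReader only returns preloaded data; a real implementation would parse a directory, lockfile or JSON dump.

```go
package main

import "fmt"

// Package and Manifest are simplified stand-ins for vet's data model.
type Package struct {
	Ecosystem, Name, Version string
}

type Manifest struct {
	Source   string
	Packages []Package
}

// ManifestReader is a hypothetical reader contract: every source of
// package manifests (directory, lockfiles, JSON dump, SBOM) implements it.
type ManifestReader interface {
	Name() string
	EnumManifests(handler func(Manifest) error) error
}

// staticReader is a stub implementation that enumerates preloaded manifests.
type staticReader struct {
	manifests []Manifest
}

func (r staticReader) Name() string { return "static" }

func (r staticReader) EnumManifests(handler func(Manifest) error) error {
	for _, m := range r.manifests {
		if err := handler(m); err != nil {
			return err // stop on the first handler error
		}
	}
	return nil
}

// countPackages works against the interface only and is therefore
// independent of where the manifests come from.
func countPackages(r ManifestReader) (int, error) {
	total := 0
	err := r.EnumManifests(func(m Manifest) error {
		total += len(m.Packages)
		return nil
	})
	return total, err
}

func main() {
	var r ManifestReader = staticReader{manifests: []Manifest{
		{Source: "package-lock.json", Packages: []Package{{"npm", "express", "4.18.2"}}},
	}}
	n, err := countPackages(r)
	fmt.Println(r.Name(), n, err)
}
```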

Incorrect Package Ecosystem in SPDX SBOM Scanning

SPDX SBOM scanning, for example when scanning the Github Dependency Insight API output (SPDX SBOM data), results in incorrect package ecosystem detection. In this case the Manifest type is correctly SPDX, but the Package ecosystem should be based on the ecosystem detected for each package and should not be the same as the Manifest's ecosystem (SPDX).

Example:

vet scan --github https://github.com/safedep/vet --report-json /tmp/vet.json

cat /tmp/vet.json| jq '.packages[5]'
{
  "package": {
    "ecosystem": "SpdxSBOM",
    "name": "postcss-minify-params",
    "version": "5.1.4"
  },
  "manifests": [
    "55b2cc8afd01bd61"
  ]
}

Summary Report has Duplicate Findings

Consider upgrading the following libraries for maximum impact:

┌──────────────────────────┬───────────┬────────┐
│ PACKAGE                  │ UPDATE TO │ IMPACT │
├──────────────────────────┼───────────┼────────┤
│ [email protected]       │ 39.0.1    │ 13     │
│  vulnerability   drift   │           │        │
├──────────────────────────┼───────────┼────────┤
│ [email protected]       │ 39.0.1    │ 13     │
│  vulnerability   drift   │           │        │
└──────────────────────────┴───────────┴────────┘
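
A straightforward fix is to deduplicate the advice rows before rendering. The Go sketch below assumes a hypothetical Advice row type; the package name and versions are made-up example values, not the ones from the report above.

```go
package main

import "fmt"

// Advice is a hypothetical row of the upgrade table.
type Advice struct {
	Package  string
	Version  string
	UpdateTo string
	Impact   int
}

// dedupe removes rows that are identical in every field while keeping
// the order of first occurrence.
func dedupe(rows []Advice) []Advice {
	seen := make(map[Advice]bool, len(rows))
	out := make([]Advice, 0, len(rows))
	for _, r := range rows {
		if seen[r] {
			continue
		}
		seen[r] = true
		out = append(out, r)
	}
	return out
}

func main() {
	rows := []Advice{
		{"example-lib", "1.0.0", "2.0.0", 13},
		{"example-lib", "1.0.0", "2.0.0", 13}, // same package seen via a second manifest
	}
	fmt.Println(len(dedupe(rows)), "row(s) after dedupe")
}
```

If the same package legitimately appears in several manifests, the key could be extended with the manifest path so that those rows stay distinct.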

Evaluate ko for Building vet Container Image

https://github.com/ko-build/ko

What is the benefit?
Is there any trade-off?
Update container release github action to use ko

Support Cross Compilation of Go Releaser

The goreleaser-action is broken due to CGO_ENABLED=1.
Examples: https://github.com/safedep/vet/actions/workflows/goreleaser.yml

This is because we want to cross-compile for MacOS which needs native compiler tool chain when CGO_ENABLED=1. We need CGO because we are now using Tree Sitter for static code analysis (parsing).

We need to explore using goreleaser cross compilation tool chain or look at using Github Action's MacOS build environment.

Dependency Tree Resolution

Requirements

To accurately analyse the Open Source dependencies of a project, it is important to build an accurate dependency tree that identifies all external (OSS) components that will be included in the final deployable artifact (build output). This list of components needs to be hierarchical (a tree) to retain the knowledge of which components are direct and which are introduced transitively. This is required for effective querying because, from a remediation perspective, direct dependencies are what matter even if a transitive dependency has a critical issue.

Problem

Lockfile-based dependency identification (using package-lock.json, gradle.lockfile, etc.) is easy and covers all components that may be included in the build. However, most lockfiles do not retain the direct / transitive relationship of dependencies. They appear as a flat list of dependencies.

Package manifests like pom.xml, build.gradle etc., when resolved by the appropriate package manager, yield a dependency tree that the package manager builds internally and can optionally write as output. But parsing such package manifests is complex, and the behavior is very specific to each package manager.

What is needed

We need to explore the right approach that will allow us to:

Accurately resolve OSS dependencies of an app while retaining the relationships
Aim to be as runtime independent as possible i.e. do not assume that a package manager (maven, gradle etc.) is available
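
To illustrate why the hierarchy matters for remediation, here is a small Go sketch that finds the direct dependencies through which a given transitive package is introduced. The Node type and both helpers are hypothetical, and the sketch assumes the structure is a tree without cycles.

```go
package main

import "fmt"

// Node is a hypothetical dependency tree node. The root represents the
// application and its children are the direct dependencies.
type Node struct {
	Name     string
	Children []*Node
}

// contains reports whether target appears in the subtree rooted at n.
// It assumes an acyclic tree; a real dependency graph would need cycle checks.
func contains(n *Node, target string) bool {
	if n.Name == target {
		return true
	}
	for _, c := range n.Children {
		if contains(c, target) {
			return true
		}
	}
	return false
}

// introducedBy returns the direct dependencies that pull in target,
// either because they are the target or because they depend on it.
func introducedBy(root *Node, target string) []string {
	var out []string
	for _, direct := range root.Children {
		if contains(direct, target) {
			out = append(out, direct.Name)
		}
	}
	return out
}

func main() {
	app := &Node{Name: "app", Children: []*Node{
		{Name: "web-framework", Children: []*Node{
			{Name: "http-parser", Children: []*Node{{Name: "vulnerable-lib"}}},
		}},
		{Name: "logger"},
	}}
	fmt.Println(introducedBy(app, "vulnerable-lib"))
}
```

With a flat lockfile list, the question "which direct dependency do I upgrade?" cannot be answered, which is the gap this issue describes.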

Show Filter Failure Results in Markdown Report

The Markdown report should include packages that have been identified as violations by a filter query or filter suites.

Show Usage when vet auth is invoked without required parameters

Show usage along with error msg when vet auth is invoked without required params

Generate SARIF Report for Integration with Github Code Scanning

Explore Fury.io for Publishing OS Native Packages

https://gemfury.com/

Explore (Open) VEX Statement Generation

What is VEX

Vulnerability Exploitability eXchange (VEX) is a form of a security advisory where the goal is to communicate the exploitability of components with known vulnerabilities in the context of the product in which they are used.

https://cyclonedx.org/capabilities/vex/

Why do we need VEX

Security scanners will detect and flag components in software that have been identified as being vulnerable. Often, software is not necessarily affected as signaled by security scanners for many reasons such as: the vulnerable component may have been already patched, may not be present, or may not be able to be executed. To turn off false alerts like these, a scanner may consume VEX data from the software supplier.

https://github.com/openvex/spec#about-vex

VEX in `vet` Context

vet is a tool intended to identify OSS dependencies and subsequently identify risks in such dependencies using configured policies. While generating SBOM is not an absolute requirement, vet can do that using its data model.

From a user's perspective, it may be useful to continuously generate SBOM using vet and maintain an inventory of SBOMs associated with each release of a software component. This may also be useful in audit use-cases, where an auditor persona uses vet to generate an SBOM for an application which in turn is used for risk assessment.

In this context, it may be very useful for such user persona to generate VEX statements to associate additional information with the vulnerabilities / risks identified by vet and included in SBOM. Particularly, vulnerabilities can be marked as fixed or not applicable as required.

User Experience

More thought and user survey is required to define this. But at a high level, it should be:

Create a YAML document with issues to be marked as fixed / NA etc. with justification
Ingest the YAML document during SBOM building phase to auto-generate VEX statements in SBOM

Reference

Client Side Caching for Insight API Response

Problem

Dependency on external services may increase the time and API cost of tool usage.

Solution

Implement a client side cache for Insight API responses. This can be implemented using an sqlite3 DB by re-using the caching interface implemented for Insights API Service
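
The caching idea can be sketched as a wrapper around the fetch call. Note the swap: the Go sketch below uses an in-memory map instead of the sqlite3 DB proposed above, so nothing persists between runs, and the Fetcher type and cached function are hypothetical names rather than the existing caching interface.

```go
package main

import (
	"fmt"
	"sync"
)

// Fetcher is a hypothetical signature for "look up enrichment data for a
// package key", e.g. "npm/express@4.18.2".
type Fetcher func(key string) (string, error)

// cached wraps a Fetcher with an in-memory cache. Errors are not cached,
// so a failed lookup is retried on the next call.
func cached(fetch Fetcher) Fetcher {
	var mu sync.Mutex
	store := map[string]string{}
	return func(key string) (string, error) {
		mu.Lock()
		defer mu.Unlock() // held during fetch for simplicity; this serializes lookups
		if v, ok := store[key]; ok {
			return v, nil
		}
		v, err := fetch(key)
		if err != nil {
			return "", err
		}
		store[key] = v
		return v, nil
	}
}

func main() {
	calls := 0
	remote := func(key string) (string, error) {
		calls++ // stands in for a network round trip
		return "insights for " + key, nil
	}
	lookup := cached(remote)
	lookup("npm/express@4.18.2")
	lookup("npm/express@4.18.2")
	fmt.Println("remote calls:", calls)
}
```

A persistent backend would additionally need an expiry policy so that newly published vulnerabilities are not hidden by stale entries.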

Expired Trial Key Fails Silently without Error

Problem

When the configured trial key is expired, scan runs without error but enrichment fails and hence no result is displayed. Only when we run with verbose logging can we know that the trial key has expired.

Solution

Introduce key verification and verify key before starting scan. Fail if key is invalid. This should be done only for scan command and not for offline analysis like query command

Generate Report as SBOM

Requirement

Generate an exportable software bill of materials (SBOM) in the NTIA-approved data formats (i.e., SPDX, CycloneDX, and SWID tags)

Github Reader Fails if Dependency Graph Not Available

Solution

vet can clone the repo but we must ensure:

Depth is set to 1 to reduce size
Show warning

Fix Linter Issues and Enable `golint` Guard Rail

We currently have quite a few linter issues.

✗ golint ./...  | wc -l
124

We need to fix them and introduce a linter guard rail in
https://github.com/safedep/vet/blob/main/.github/workflows/ci.yml

The guard rail can be as simple as integrating
https://github.com/golangci/golangci-lint-action

Create Github Container Action

Requirements

Create Github container action for using vet as a Github action.

Guiding Principles

Use good defaults
Allow overriding parameters
Allow supplying custom filters & filter suites
Provide workflow experience instead of tool experience

Workflow over Tool

This is a tool experience

./vet scan ...

This is a workflow experience

steps:
  - name: OSS Vet
    uses: safedep/vet
    with:
      fail_on_match: true  # This is default
      suite: default             # or .vet/suites/custom.yml
      exceptions: .vet/exceptions.yml

Support Vulnerability Reachability Analysis to Reduce False Positive

How do you know if a vulnerability in method-X in library-Y is actually reachable from your application and therefore has a real impact, and is not just more noise generated by scanning tools?

This is a real problem for most SCA tools because of how they operate based on version matching algorithms. Implementing reachability analysis will greatly reduce false positives related to vulnerability detection. However, doing this, especially in a language agnostic manner is challenging, if not impossible.

We should explore this problem in two stages:

Define a model for performing vulnerability reachability analysis based on OSV database specific information (symbols)
Implement language specific parsing and analysis infrastructure to identify control flow paths

Doing [2] is not easy as it requires having source code of all 3rd party dependencies as well to identify paths that are reachable indirectly from the target application.

Support Auth Verify Command

Implement vet auth verify to verify validity of the configured API key.

This can be done in two ways:

Enrich a dummy package to verify
Support a backend API (e.g. whoami) to get details of the key along with verifying the key (introspection)

[2] is probably the way forward given other use-cases in future.

Support Version Drift with Delta in Filters

Introduce version drift in filter input spec to allow filters such as:

drifts.major > 0
drifts.major > 2
drifts.minor > 3
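
How the drift values are computed is not specified in this issue. The Go sketch below shows one possible definition, purely as an assumption: versions are plain MAJOR.MINOR.PATCH, and only the most significant differing component is reported. The Drift type and ComputeDrift function are hypothetical names, and pre-release or build suffixes are rejected rather than handled.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Drift is a hypothetical shape for the "drifts" filter input.
type Drift struct {
	Major, Minor, Patch int
}

// parseVersion accepts plain MAJOR.MINOR.PATCH with an optional "v" prefix.
func parseVersion(v string) ([3]int, error) {
	var out [3]int
	parts := strings.Split(strings.TrimPrefix(v, "v"), ".")
	if len(parts) != 3 {
		return out, fmt.Errorf("expected MAJOR.MINOR.PATCH, got %q", v)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return out, fmt.Errorf("invalid component %q in %q", p, v)
		}
		out[i] = n
	}
	return out, nil
}

// ComputeDrift reports how far current is behind latest, using only the
// most significant component that differs.
func ComputeDrift(current, latest string) (Drift, error) {
	c, err := parseVersion(current)
	if err != nil {
		return Drift{}, err
	}
	l, err := parseVersion(latest)
	if err != nil {
		return Drift{}, err
	}
	switch {
	case l[0] != c[0]:
		return Drift{Major: l[0] - c[0]}, nil
	case l[1] != c[1]:
		return Drift{Minor: l[1] - c[1]}, nil
	default:
		return Drift{Patch: l[2] - c[2]}, nil
	}
}

func main() {
	d, err := ComputeDrift("1.2.3", "3.0.1")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("drifts.major > 0:", d.Major > 0)
}
```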

Baseline Filters for Common Use-cases

Provide baseline filters that can be used to get started.

Incorrect Update Recommendation

Running vet on the gohugo repository gives odd recommendations.

It felt like it wants to say the libs have low popularity, but then it shows an update to a version that already exists in the repo. The view is not very clear. Even the markdown format does the same.

Also, flags such as low popularity and drift are missing in the md file.

Implement E2E Behavior Testing

We need to build a framework for running behavior tests on generated vet binary to ensure we don't end up breaking cli contract. We need to explore if there are any cli testing frameworks. Alternatively we can just use something like RSpec and write custom helpers to wrap vet execution with params and package manifests (fixture files).

Key flows that need to be tested:

Scan on directory
Scan with lockfiles
Scan with lockfiles and lockfile-as
Query on JSON dump
Filters
Filter Suite
Exception generator
Exceptions
Reporting modules

Refactor: Exceptions Management at Per Scan

Exceptions management is currently implemented as a global package. This is bad because we can't use vet as a package and run concurrent scans. We need to refactor exceptions management into an object of its own with an optional global instance to maintain current feature parity

Include Aggregated OpenSSF Scorecard Score in Filter Input

Fix Typo in Scan Summary Output

Use vet to Implement Safe Consumption of OSS Components for vet

Dogfood vet :)

Set up a vetting workflow for this repository using vet. This should include creating an appropriate policy, an exceptions configuration and a Github action that runs on PRs to identify issues.

Config Spec Driven Scan Execution

Overview

vet currently executes a scan based on command line arguments. While this is flexible, there are quite a lot of args and it will increase as the tool evolves. This will make CI integration complex, particularly building a Github Action runner while considering all args will not be a good experience. We have already identified this as a problem in #23

Requirements

Define a config file spec for SafeDep
Implement a file based config repository
Enforce schema validation while reading config from file
Support YAML based file format

User Experience

A scan specification for a repository can be defined in a file .vet/scan.yml
vet automatically decodes the scan spec and executes the scan based on it without command line args

Show Ecosystem Name in Summary Report

The summary reporter presents a table of packages that are recommended for upgrade. It does not show the ecosystem of the package and only shows the name and version. While this is fine for a scan of a single lockfile, it is a problem when multiple lockfiles with different ecosystems are scanned.

To start improving the UX, we should begin by showing the Ecosystem name in the report table.

Improve Remediation Advice

Problem

Any real-life application will depend on frameworks & other direct dependencies which in turn introduces multiple layers of transitive dependencies. The number of effective (direct & transitive) dependencies for any real-life application can be easily 100+.

When we scan dependencies, we end up finding issues (vulnerability / popularity / security posture) in a lot of dependencies, thus increasing the remediation cost significantly. Often the remediation is infeasible or painful due to the sheer volume of issues produced by a tool, vet included.

Solution

Our goal is to improve the user experience of remediating issues in OSS dependencies while ensuring that we do not provide a false sense of security by missing critical issues. To do this, we need to provide a paved path for the remediation journey instead of dumping issues on the user and leaving the decision / prioritisation / plan to them.

We need a user experience like this:

Provide Top 5 libraries that will mitigate maximum OSS risk in the application
Identify and ignore false positives
Provide remediation advice that is actually doable by the user i.e. direct dependencies and NOT transitive dependencies
Provide a way to see the impact of risk mitigated by following the remediation advice
Provide configurability to ignore false positives (already implemented through #13)

Related issues

#8
#94
#80

https://docs.google.com/presentation/d/14tTZlnHP26dqAd2mDUyYsIhlVmZWrBc4/edit#slide=id.g24f292dc4d0_0_660

Disable Dependency Enrichment for Specific Lockfiles

Lockfiles like gradle.lockfile, package-lock.json, Gemfile.lock etc. already contain the locked versions of the direct and transitive dependencies that actually compose the deployable app.

For these lockfiles, we do not need to depend on Insights API for resolving dependencies.

Support SBOM as an Input Format for vet

Ability to run vet on SBOM generated by github and give a single policy violation report

As a user, I want to perform a dependency scan on all/partial projects on GitHub org to generate the most critical risks such as license risks in one shot.

Optionally, I should be able to perform dependency scanning of selected projects in my orgs

The example command can be

vet scan https://github.com/OrgName --github-token ....

The scan should generate violations in a report

Possible behavior:
The tool can utilize the SBOM provided by Github to perform the assessment.

Support Integration with SCM and Dependency Track

Problem

Dependency Track (DT) is a continuous SBOM management and analysis platform. For DT to be effective, it is important to continuously import SBOMs into DT. We want vet to make it very easy for an organization to continuously sync their repositories into DT by generating SBOMs and uploading them through DT's REST API.

Solution

We will start by supporting Github and eventually maybe Gitlab. For the Github integration, we will provide an experience on top of the existing --github scan option to scan a remote Github repository. The scan will look like:

vet scan --github-org https://github.com/safedep

For syncing results to DependencyTrack, we will build a new reporting module that syncs to DependencyTrack instance.

VET_DT_BASE_URL="..." VET_DT_TOKEN="..." \
vet scan --github-org https://github.com/safedep --report-dependency-track
