guacsec / guac

GUAC aggregates software security metadata into a high fidelity graph database.

Home Page: https://guac.sh

License: Apache License 2.0

Go 99.55% Nix 0.01% Makefile 0.30% Shell 0.13% Starlark 0.01%
security software-supply-chain software-supply-chain-security supply-chain supply-chain-security supply-chain-visibility supply-chain-analytics

guac's Issues

Interested in Dev/Contributing to GUAC?

Welcome! This thread is for expressing interest in contributing to GUAC. We are glad to welcome our fellow open source contributors! As the project is starting up, we will be creating issues that folks can pick up and work on. In the meantime, while the code base is taking shape, we'd like to engage directly with our contributors!

BTW, we now have a Slack channel: https://openssf.slack.com/archives/C03U677QD46

If you are interested in contributing, it would be very helpful to provide the following details (copy and paste into your comment):

1. I am interested in contributing to:
- [ ] Development
- [ ] Documentation
- [ ] Issue triage and community
- [ ] Technical advisory (review [governance document](https://github.com/artifact-ff/artifact-ff/blob/main/GOVERNANCE.md#technical-advisory-members))

2. I am here because:
- [ ] Personal interest
- [ ] My company or the orgs I work with are interested in this

3. What is your associated company/org if you're contributing in their capacity? _________

4. Depending on how things go, I may be interested in becoming a maintainer of the project
- [ ] Yes

5. (optional) I have expertise in:
- [ ] Neo4j
- [ ] Cypher
- [ ] GraphQL
- [ ] Intoto
- [ ] SPDX
- [ ] CycloneDX
- [ ] Others (fill in):

SLSA parser digest format contains stray quotes

The digest string currently produced by the SLSA parser contains extra single quotes around the hash value:

{
  "identity": 568482,
  "labels": [
    "Artifact"
  ],
  "properties": {
    "name": "git+https://github.com/kubernetes/kubernetes",
    "digest": "sha1:'3c7da84d8fc03c30d3409e9c846ae4bc2de0b4d5'"
  }
}
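
The expected form is presumably `sha1:3c7da84d...` with no quotes around the hash. A minimal sketch of a digest formatter that strips them is shown here; `formatDigest` is a hypothetical helper for illustration, not a function from the GUAC code base, and lowercasing the algorithm name is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// formatDigest is a hypothetical helper (not GUAC's actual code). It joins an
// algorithm name and a hex digest as "alg:value" and drops any single or
// double quotes surrounding the value.
func formatDigest(alg, value string) string {
	value = strings.Trim(strings.TrimSpace(value), `'"`)
	return strings.ToLower(alg) + ":" + value
}

func main() {
	// The quoted value below is the one from the graph node shown above.
	fmt.Println(formatDigest("sha1", "'3c7da84d8fc03c30d3409e9c846ae4bc2de0b4d5'"))
}
```

If the stray quotes come from a format verb such as `%q` or a literal `'%s'` in the parser, fixing that call site would be the cleaner change; trimming as above only masks the symptom.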

Design and implement full integration with pub/sub for GUAC flow for collector/processor/assembler

Collectors that obtain documents need somewhere to emit them to. The processor, which is the next part of the pipeline, needs to gather the documents and process them.

There are a few natural options:

  1. Processor runs as a gRPC server
  2. Processor obtains documents from a Pub/Sub queue (e.g. kafka, nats.io, etc.)
  3. Processor ingests from STDIN or file
  4. Processor and Collector are part of the same process.

This boils down to how we want collectors and processors to be run in the architecture. The ingestor will most likely be tied to the assembler.

Deliberation:

  • Will all the collectors be run in a single executable? I.e. the processor will cache duplicate documents, so it is beneficial to have an n:m relationship (where n>m) between collectors and executables. If the answer is no, this excludes options 3 and 4.
    • I think it is likely that the answer is no, given that collectors need credentials and no single account/team would have all of them
  • Options 1 and 2 are similar, with a trade-off between simplicity and scale.

fyi: ingestor tree parsing

We agreed that in the long term, the ingestor would need to have a way to communicate information up/down the tree in order to make edges and annotations between the elements of each node in the document tree.

However, to get started with an e2e poc, we decided to defer the implementation of the recursive processing model.

Relevant Conversation:
#39 (comment)
#39 (comment)
#39 (comment)

Add performance warning in README

Add a performance warning to the README noting that the current proof of concept does not include optimizations for neo4j and may see some performance degradation. Create a separate PERFORMANCE.md file with some ideas for improving performance in the meantime.

IdentityFor edge should be generic

The IdentityFor edge should apply to almost any type of document/node, and thus should be definable on any GuacNode. This should be done along with any other cleanup required around identity in graphBuilder.

Refactor for SLSA parser tests

EDIT: upon reading the tests more, it seems like it just moves a lot of the test case checks outside the test definition

The SLSA parser tests should specify the expected edges and nodes within each test case itself, so that they are explicitly linked to that case, rather than keeping them only in the test body.

chore: reconcile CI with local makefile

Right now, there are some differences between the CI and the makefile:

  • some tests require neo4j to run (perhaps tag them to not run locally)
  • Linting rules in makefile and CI are different

task: map keys to identities and trust

Identities should be considered separate from any given key material, as it's potentially a many-to-many situation. One identity might have multiple keys, and one key might be associated with multiple identities.

Note: Identity in this context is still abstract. It should not be tied back to a specific person, especially anonymous/pseudonymous folks. The primary goal is to associate keys with identities tied to a project, most likely organizations or known maintainers.
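
The many-to-many relationship described above can be sketched with two indexes kept in sync. This is only an illustration: `TrustMap`, `KeyID`, and `IdentityID` are hypothetical names, and how GUAC would actually store this (most likely as graph edges) is not specified here.

```go
package main

import (
	"fmt"
	"sort"
)

// KeyID and IdentityID are hypothetical opaque identifiers. An IdentityID is
// meant to name an organization or project role, not a specific person.
type KeyID string
type IdentityID string

// TrustMap keeps both directions of the key/identity relationship.
type TrustMap struct {
	identitiesByKey map[KeyID]map[IdentityID]struct{}
	keysByIdentity  map[IdentityID]map[KeyID]struct{}
}

func NewTrustMap() *TrustMap {
	return &TrustMap{
		identitiesByKey: make(map[KeyID]map[IdentityID]struct{}),
		keysByIdentity:  make(map[IdentityID]map[KeyID]struct{}),
	}
}

// Associate records that key is used by id. Calling it twice is harmless.
func (t *TrustMap) Associate(id IdentityID, key KeyID) {
	if t.identitiesByKey[key] == nil {
		t.identitiesByKey[key] = make(map[IdentityID]struct{})
	}
	if t.keysByIdentity[id] == nil {
		t.keysByIdentity[id] = make(map[KeyID]struct{})
	}
	t.identitiesByKey[key][id] = struct{}{}
	t.keysByIdentity[id][key] = struct{}{}
}

// IdentitiesFor returns the identities associated with key, sorted.
func (t *TrustMap) IdentitiesFor(key KeyID) []IdentityID {
	out := make([]IdentityID, 0, len(t.identitiesByKey[key]))
	for id := range t.identitiesByKey[key] {
		out = append(out, id)
	}
	sort.Slice(out, func(i, j int) bool { return out[i] < out[j] })
	return out
}

// KeysFor returns the keys associated with id, sorted.
func (t *TrustMap) KeysFor(id IdentityID) []KeyID {
	out := make([]KeyID, 0, len(t.keysByIdentity[id]))
	for k := range t.keysByIdentity[id] {
		out = append(out, k)
	}
	sort.Slice(out, func(i, j int) bool { return out[i] < out[j] })
	return out
}

func main() {
	tm := NewTrustMap()
	tm.Associate("org-a", "key-1")
	tm.Associate("org-a", "key-2")
	tm.Associate("org-b", "key-1") // one key shared by two identities
	fmt.Println(tm.IdentitiesFor("key-1"))
	fmt.Println(tm.KeysFor("org-a"))
}
```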

Docs:

Write document guesser test to make sure that other guessers are not accidentally misguessing another document type

Certain document type importers may not have sufficient heuristics to determine whether a document is indeed the type guessed. For example, if all the fields in the JSON are optional for that document type, the guesser may mistake any JSON document for its own type. (This happens in certain cases in SPDX, hence the requirement to check for the existence of a field.)

We should write a unit test to make sure that no other document guesser misguesses a document.
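
One possible shape for such a check is to run every guesser against every sample and require that only the expected one claims it. The sketch below is an assumption about structure, not GUAC's actual guesser API: `guesser`, `guessSPDX`, `guessSLSA`, and `claimants` are hypothetical names, and the field checks are deliberately simple.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

type DocumentType string

const (
	TypeSPDX DocumentType = "SPDX"
	TypeSLSA DocumentType = "SLSA"
)

// guesser reports whether a blob looks like its document type (hypothetical).
type guesser func(blob []byte) bool

// guessSPDX requires the spdxVersion field to exist with an "SPDX-" prefix,
// so that an arbitrary JSON object is not accepted.
func guessSPDX(blob []byte) bool {
	var doc struct {
		SPDXVersion string `json:"spdxVersion"`
	}
	if err := json.Unmarshal(blob, &doc); err != nil {
		return false
	}
	return strings.HasPrefix(doc.SPDXVersion, "SPDX-")
}

// guessSLSA requires an in-toto statement with a SLSA provenance predicate.
func guessSLSA(blob []byte) bool {
	var doc struct {
		PredicateType string `json:"predicateType"`
	}
	if err := json.Unmarshal(blob, &doc); err != nil {
		return false
	}
	return strings.HasPrefix(doc.PredicateType, "https://slsa.dev/provenance/")
}

// claimants returns, in sorted order, every type whose guesser accepts blob.
// A unit test can assert the result has exactly one entry per sample.
func claimants(blob []byte, guessers map[DocumentType]guesser) []DocumentType {
	var out []DocumentType
	for t, g := range guessers {
		if g(blob) {
			out = append(out, t)
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i] < out[j] })
	return out
}

func main() {
	guessers := map[DocumentType]guesser{TypeSPDX: guessSPDX, TypeSLSA: guessSLSA}
	samples := map[string]string{
		"spdx":    `{"spdxVersion":"SPDX-2.2","SPDXID":"SPDXRef-DOCUMENT"}`,
		"slsa":    `{"predicateType":"https://slsa.dev/provenance/v0.2"}`,
		"generic": `{"hello":"world"}`,
	}
	for _, name := range []string{"spdx", "slsa", "generic"} {
		fmt.Println(name, claimants([]byte(samples[name]), guessers))
	}
}
```

In a real test file the loop in `main` would become a table-driven `go test` case that fails whenever a sample is claimed by more than one guesser, or by the wrong one.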

task: [assembler] create graphDB package for neo4j

Create a graph DB package that creates an instance of the neo4j/cypher driver to talk to the graph. There is no need for pluggability of the graph DB for now, since we do not currently foresee supporting additional graph DBs in the near future.

task: [processor] create DocumentUnknown pre-processor

Create a DocumentUnknown pre-processor that takes in a document blob and guesses its format and document type between each iteration of the processor.

i.e. given a Document with a blob, tell me what the type and the format is

#27 added the initial foundations
TODO

  • add call from processor
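
As a rough illustration of the "given a blob, tell me the type and format" step, the sketch below fills in only the fields that are still unknown. The `Document` struct, the constants, and `preprocess` are hypothetical and do not mirror the foundations added in #27. Only JSON and SPDX are recognized here, to keep the example short.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Format string
type DocType string

const (
	FormatUnknown Format  = "UNKNOWN"
	FormatJSON    Format  = "JSON"
	DocUnknown    DocType = "UNKNOWN"
	DocSPDX       DocType = "SPDX"
)

// Document is a hypothetical stand-in for the processor's document type.
type Document struct {
	Blob   []byte
	Format Format
	Type   DocType
}

// guessFormat reports JSON when the blob is syntactically valid JSON.
func guessFormat(blob []byte) Format {
	if json.Valid(blob) {
		return FormatJSON
	}
	return FormatUnknown
}

// guessType inspects top-level keys of a JSON object; anything else stays
// unknown.
func guessType(blob []byte, f Format) DocType {
	if f != FormatJSON {
		return DocUnknown
	}
	var top map[string]json.RawMessage
	if err := json.Unmarshal(blob, &top); err != nil {
		return DocUnknown // valid JSON, but not an object
	}
	if _, ok := top["spdxVersion"]; ok {
		return DocSPDX
	}
	return DocUnknown
}

// preprocess fills in Format and Type only where they are still unknown, so
// values set by a collector are never overwritten.
func preprocess(d Document) Document {
	if d.Format == FormatUnknown {
		d.Format = guessFormat(d.Blob)
	}
	if d.Type == DocUnknown {
		d.Type = guessType(d.Blob, d.Format)
	}
	return d
}

func main() {
	d := preprocess(Document{
		Blob:   []byte(`{"spdxVersion":"SPDX-2.2"}`),
		Format: FormatUnknown,
		Type:   DocUnknown,
	})
	fmt.Println(d.Format, d.Type)
}
```

The processor would call something like `preprocess` at the top of each iteration, before dispatching on `Type`.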

SPDX Heuristic for Syft SPDX SBOMs

Right now, syft isn't emitting the top-level package as an SPDX object.

I think for now we can add a PURL OCI reference type using a heuristic based on the name in the document. But I'll open an issue in Syft to include this as well (anchore/syft#1241).

The checksum is not currently stored, but it would be good to also include "name" as the package ref.

{
 "SPDXID": "SPDXRef-DOCUMENT",
 "name": "gcr.io/google-containers/kube-addon-manager-v8.9",
 "spdxVersion": "SPDX-2.2",
 "creationInfo": {
  "created": "2022-10-03T14:41:17.720701835Z",
  "creators": [
   "Organization: Anchore, Inc",
   "Tool: syft-0.58.0"
  ],
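
A minimal sketch of the name-based heuristic is given below. `ociPurlFromName` is a hypothetical helper, not GUAC code. It assumes the document name is a registry path whose first segment looks like a hostname, uses the last path segment as the package name, and puts the full path in the `repository_url` qualifier as the purl `oci` type describes. No version is set because the checksum is not available, and no percent-encoding is applied, which a real implementation would delegate to a purl library.

```go
package main

import (
	"fmt"
	"strings"
)

// ociPurlFromName is a hypothetical heuristic: it turns an SPDX document name
// such as "gcr.io/google-containers/kube-addon-manager-v8.9" into an OCI purl.
// It returns false when the name does not look like a registry path.
func ociPurlFromName(name string) (string, bool) {
	name = strings.ToLower(strings.TrimSpace(name))
	parts := strings.Split(name, "/")
	if len(parts) < 2 {
		return "", false
	}
	// Assumption: a registry host contains a dot (gcr.io, ghcr.io, ...).
	if !strings.Contains(parts[0], ".") {
		return "", false
	}
	last := parts[len(parts)-1]
	if last == "" {
		return "", false
	}
	return "pkg:oci/" + last + "?repository_url=" + name, true
}

func main() {
	purl, ok := ociPurlFromName("gcr.io/google-containers/kube-addon-manager-v8.9")
	fmt.Println(purl, ok)
}
```

Because the name in the Syft output does not carry a digest, the resulting purl identifies the repository only; it cannot pin a specific image until the checksum is stored as well.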

Logging library not being developed anymore

For logging, the logrus library is being used. However, it is not actively developed anymore:

Logrus is in maintenance-mode. We will not be introducing new features. It's simply too hard to do in a way that won't break many people's projects, which is the last thing you want from your Logging library (again...).

They recommend using other libraries:

Check out, for example, Zerolog, Zap, and Apex.

SLSA parser crashes on multiple subjects OR multiple hashes

SLSA parser crashes with multiple subjects or multiple hashes

Multiple digests error:
SLSA multiple digests example:

  "subject": [
    {
      "name": "gs://kubernetes-release/release/v1.25.2/bin/linux/arm64/kube-apiserver",
      "digest": {
        "sha256": "5522c9bcd76863fa24a658d9faeb6fa2ca999d022806e301e922efca747043f6",
        "sha512": "aa989e60525ac208bc1a7469b486eecb02bf4e7ceb3530c97bae5e0cbc8d4361ce040a8899fa7d9eb56f573fdfc605325e4fcaf956f5efa930cf1a52cb5ebb10"
      }
    }
  ],

Error:

panic: runtime error: index out of range [342] with length 342

goroutine 1 [running]:
github.com/guacsec/guac/pkg/assembler.StoreGraph({{0xc00017c800, 0x156, 0x180}, {0xc000372000, 0x3f9, 0x400}}, {0x1d75098?, 0xc0000c6f20?})
	/Users/lumb/go/src/github.com/guacsec/guac/pkg/assembler/graphdb.go:62 +0xa1d
github.com/guacsec/guac/cmd/guacone/cmd.getAssembler.func1({0xc0005a7a70?, 0x1?, 0xc00019e540?})
	/Users/lumb/go/src/github.com/guacsec/guac/cmd/guacone/cmd/files.go:178 +0xc5
github.com/guacsec/guac/cmd/guacone/cmd.glob..func1.1(0xc00019f020)
	/Users/lumb/go/src/github.com/guacsec/guac/cmd/guacone/cmd/files.go:109 +0x13b
github.com/guacsec/guac/pkg/handler/collector.Collect({0x1d73be8?, 0xc0008001e0}, 0xc00035dcc8, 0xc00035dc58)
	/Users/lumb/go/src/github.com/guacsec/guac/pkg/handler/collector/collector.go:84 +0x2f0
github.com/guacsec/guac/cmd/guacone/cmd.glob..func1(0x2488a80?, {0xc000800180, 0x1, 0x3})
	/Users/lumb/go/src/github.com/guacsec/guac/cmd/guacone/cmd/files.go:125 +0x56a
github.com/spf13/cobra.(*Command).execute(0x2488a80, {0xc000800120, 0x3, 0x3})
	/Users/lumb/go/pkg/mod/github.com/spf13/[email protected]/command.go:876 +0x67b
github.com/spf13/cobra.(*Command).ExecuteC(0x2488d00)
	/Users/lumb/go/pkg/mod/github.com/spf13/[email protected]/command.go:990 +0x3b4
github.com/spf13/cobra.(*Command).Execute(...)
	/Users/lumb/go/pkg/mod/github.com/spf13/[email protected]/command.go:918
github.com/guacsec/guac/cmd/guacone/cmd.Execute()
	/Users/lumb/go/src/github.com/guacsec/guac/cmd/guacone/cmd/root.go:35 +0x25
main.main()
	/Users/lumb/go/src/github.com/guacsec/guac/cmd/guacone/main.go:23 +0x17

Multiple Subjects error:
SLSA subject section example:

  "subject": [
    {
      "name": "gs://kubernetes-release/release/v1.25.2/bin/windows/amd64/kubectl-convert.exe",
      "digest": {
        "sha512": "aa989e60525ac208bc1a7469b486eecb02bf4e7ceb3530c97bae5e0cbc8d4361ce040a8899fa7d9eb56f573fdfc605325e4fcaf956f5efa930cf1a52cb5ebb10"
      }
    },
    {
      "name": "gs://kubernetes-release/release/v1.25.2/bin/linux/arm64/kube-apiserver",
      "digest": {
        "sha256": "5522c9bcd76863fa24a658d9faeb6fa2ca999d022806e301e922efca747043f6"
      }
    }
  ],

Error:

panic: runtime error: index out of range [5] with length 5

goroutine 1 [running]:
github.com/guacsec/guac/pkg/assembler.StoreGraph({{0xc00003c0f0, 0x5, 0x5}, {0xc0001049c0, 0x6, 0x6}}, {0x1d75098?, 0xc000486f20?})
	/Users/lumb/go/src/github.com/guacsec/guac/pkg/assembler/graphdb.go:62 +0xa1d
github.com/guacsec/guac/cmd/guacone/cmd.getAssembler.func1({0xc00030ce70?, 0x1?, 0xc0001046c0?})
	/Users/lumb/go/src/github.com/guacsec/guac/cmd/guacone/cmd/files.go:178 +0xc5
github.com/guacsec/guac/cmd/guacone/cmd.glob..func1.1(0xc000104780)
	/Users/lumb/go/src/github.com/guacsec/guac/cmd/guacone/cmd/files.go:109 +0x13b
github.com/guacsec/guac/pkg/handler/collector.Collect({0x1d73be8?, 0xc00010fb30}, 0xc00063fcc8, 0xc00063fc58)
	/Users/lumb/go/src/github.com/guacsec/guac/pkg/handler/collector/collector.go:84 +0x2f0
github.com/guacsec/guac/cmd/guacone/cmd.glob..func1(0x2488a80?, {0xc00010fad0, 0x1, 0x3})
	/Users/lumb/go/src/github.com/guacsec/guac/cmd/guacone/cmd/files.go:125 +0x56a
github.com/spf13/cobra.(*Command).execute(0x2488a80, {0xc00010fa70, 0x3, 0x3})
	/Users/lumb/go/pkg/mod/github.com/spf13/[email protected]/command.go:876 +0x67b
github.com/spf13/cobra.(*Command).ExecuteC(0x2488d00)
	/Users/lumb/go/pkg/mod/github.com/spf13/[email protected]/command.go:990 +0x3b4
github.com/spf13/cobra.(*Command).Execute(...)
	/Users/lumb/go/pkg/mod/github.com/spf13/[email protected]/command.go:918
github.com/guacsec/guac/cmd/guacone/cmd.Execute()
	/Users/lumb/go/src/github.com/guacsec/guac/cmd/guacone/cmd/root.go:35 +0x25
main.main()
	/Users/lumb/go/src/github.com/guacsec/guac/cmd/guacone/main.go:23 +0x17
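
Both traces end in `assembler.StoreGraph` rather than in the parser, so the root cause cannot be determined from the reports alone. One way a parser could handle this input shape is to emit one artifact per (subject, digest) pair, which keeps node counts predictable for later stages. The sketch below assumes that approach; `subject`, `artifact`, and `artifactsFromSubjects` are hypothetical names and not the GUAC types. The digests in `main` are shortened placeholders.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// subject mirrors the "subject" entries of an in-toto statement.
type subject struct {
	Name   string            `json:"name"`
	Digest map[string]string `json:"digest"`
}

// artifact is a hypothetical flattened node: one per subject and digest.
type artifact struct {
	Name   string
	Digest string // "alg:value"
}

// artifactsFromSubjects walks every subject and every digest. Algorithm names
// are sorted so the output order does not depend on map iteration.
func artifactsFromSubjects(subjects []subject) []artifact {
	var out []artifact
	for _, s := range subjects {
		algs := make([]string, 0, len(s.Digest))
		for alg := range s.Digest {
			algs = append(algs, alg)
		}
		sort.Strings(algs)
		for _, alg := range algs {
			out = append(out, artifact{Name: s.Name, Digest: alg + ":" + s.Digest[alg]})
		}
	}
	return out
}

func main() {
	// Placeholder digests; the structure matches the examples above.
	raw := `[
	  {"name": "a.exe", "digest": {"sha512": "aa98"}},
	  {"name": "b", "digest": {"sha256": "5522", "sha512": "aa98"}}
	]`
	var subjects []subject
	if err := json.Unmarshal([]byte(raw), &subjects); err != nil {
		panic(err)
	}
	for _, a := range artifactsFromSubjects(subjects) {
		fmt.Println(a.Name, a.Digest)
	}
}
```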
