OpenScanHub is a service for static and dynamic analysis.

Home Page: https://openscanhub.dev

License: GNU General Public License v3.0

Makefile 0.51% Dockerfile 0.52% Shell 5.90% Python 83.59% CSS 2.38% HTML 7.10%

dynamicanalysis sast staticanalysis openscanhub

openscanhub's Introduction

OpenScanHub

OpenScanHub is a service for static and dynamic analysis. You can find the latest source code at:

https://github.com/openscanhub/openscanhub

OpenScanHub is licensed under GPLv3+, see LICENSE for details. Please report bugs and feature requests on GitHub using the above URL.

Dependencies

hub:

csdiff
gzip
httpd
koji
mod_auth_gssapi
mod_ssl
python3-bugzilla
python3-csdiff
python3-django >= 3.2.0
python3-jira
python3-kobo-client
python3-kobo-django >= 0.35.0
python3-kobo-hub >= 0.35.0
python3-kobo-rpmlib
python3-mod_wsgi
python3-psycopg2
python3-qpid-proton
xz

worker:

csmock
file
koji
python3-kobo-client
python3-kobo-rpmlib
python3-kobo-worker >= 0.32.0

client:

koji
python3-kobo-client

Development Guide

See development docs for instructions.

RPM-based Installation

Latest development RPM packages can be found in Copr.

Documentation

TODO

openscanhub's People

Contributors

siteshwar jamacku isinyaaa kdudka lzaoral rhyw jperezdealgaba hanchuntao psimovec i386x praiskup

openscanhub's Issues

docs: finish the `README.md` file

Right now, the file still contains two sections that are marked as TODO.

hub: JavaScript TypeError on waiving details page for unfinished scans

Schedule an ErrataDiffBuild task.
Immediately go to the waiving details page for the associated scan.
The console in your browser will show the following error:

Uncaught TypeError: document.getElementById(...) is null
    onload http://0.0.0.0:8000/waiving/1/:238
    EventHandlerNonNull* http://0.0.0.0:8000/waiving/1/:203
    [1:238:12](http://0.0.0.0:8000/waiving/1/)
    onload http://0.0.0.0:8000/waiving/1/:238
    (Async: EventHandlerNonNull)
    <anonymous> http://0.0.0.0:8000/waiving/1/:203

hub: task list page should show number of defects found

The task list page should show the number of defects found instead of the worker that was assigned to the task.

hub: make results easy to interpret for users

The result page for each task contains various logs collected by the OSH worker. New users may find these logs confusing to read and interpret. The user interface of the results page should be redesigned to make the logs easier to interpret.

hub: do not guess or hardcode koji profiles

The corresponding koji profile should be selected when submitting the task from the client. Hub should not be doing any guessing or overriding on its own.

This should also make it possible to select, e.g., CentOS Stream Koji as the source for scanning.

hub: repurpose the `results` field for `Task` result summary

Right now, all successful tasks (and a lot of failed ones) have an empty results field. It could be useful to repurpose this field to show the results statistics (e.g. the output of csgrep --mode=stat) for successful tasks and a short summary of a failure for the failed tasks.

hub: add column ordering to paginations

None of the paginations, with the exception of waiving, support ordering by a selected column.

Waiving (expected behaviour): http://0.0.0.0:8000/waiving/
Packages: http://0.0.0.0:8000/scan/packages/

vcs-diff-lint: PyLint and Mypy differential linter

Since this is mainly a Python-based project, you might be interested in the vcs-diff-lint tool. It also has a GitHub Action available on the marketplace.

hub: add some checker groups to initial fixtures

We should ship initial fixtures with some basic checker groups added, if possible. Otherwise, all results will be marked as Unsorted.

client: Add completions for the fish shell

We only have osh-cli completions for bash and we are in the process of adding completions for zsh. This issue tracks adding osh-cli completions for the fish shell.

Shall we add a development workflow check for macOS in GitHub Actions CI?

As we have started working towards stabilizing development workflow on macOS, we should consider adding a check for it on GitHub Actions CI.

Add a pull request template

We should add a PR template with a checklist of common mistakes or shortcomings to simplify and possibly shorten the review process.

hub+worker: we should try to make the selection of mock profiles architecture agnostic

Right now, all errata-related mock profiles are hard-coded to use x86_64 buildroots. This complicates development on aarch64 machines because I often forget to change these values in the Django Admin interface (and even when I don't, it's still unnecessary work).

Shall we use resalloc to dynamically provision workers?

I am opening this issue after discussing dynamic provisioning of workers with @lzaoral and @praiskup.

resalloc is a resource manager for resources like virtual machines, containers etc. It is used inside the Copr infrastructure to manage builders (workers). It currently supports AWS, OpenStack, IBM Cloud, Kubernetes etc. and is extensible to other infrastructures. We can possibly use resalloc to dynamically allocate and manage workers in OpenScanHub deployments. This issue tracks whether we should have a hard dependency on resalloc and, if not, how resalloc could be used with OpenScanHub for dynamic provisioning of workers.

Project page:

https://github.com/praiskup/resalloc

Fedora Copr configuration:

The current "state" of the systems (overview page):

https://copr-be.cloud.fedoraproject.org/resalloc/

Available in Fedora and Fedora EPEL:

https://src.fedoraproject.org/rpms/resalloc/branches

Helper scripts for starting VMs:

Example: vm_alloc.py
Internal GitLab issue: https://gitlab.cee.redhat.com/covscan/covscan/-/issues/95

hub: move Package details view generation from `scan/models.py` to `scan/view.py`

As noted by @kdudka in one of our private conversations, scan/models.py is not the right place for this kind of functionality.

worker: recently failed scans of `sssd` upstream PRs leave lingering tasks

All recently failed scans of sssd upstream pull requests leave lingering tasks on workers, e.g.:

...
2023-08-07 10:02:56 [WARNING ] Lingering task 288766 (pid 848322)
2023-08-07 10:03:16 [WARNING ] Lingering task 288766 (pid 848322)
2023-08-07 10:03:37 [WARNING ] Lingering task 288766 (pid 848322)
2023-08-07 10:03:58 [WARNING ] Lingering task 288766 (pid 848322)
...

This is almost surely a bug in kobo's process handling, which may be caused by csmock not cleaning up after itself properly.

git: update `.git-blame-ignore-revs`

While the whole codebase seems to be reformatted properly now, all commits making such changes since February 2023 are not included in the .git-blame-ignore-revs file. This issue tracks the progress to fix that.

UMB messaging should be disabled on the development instance

As originally reported by Juan Perez de Algaba Sierra, we should update the fixtures such that SEND_BUS_MESSAGE is disabled on the development instance of OpenScanHub. We have no working UMB configuration in the development environment yet and the unnecessary SSL-related tracebacks in the logs may confuse newcomers.

release `osh-0.9.5`

There has been no stable release since the --nvr option of osh-cli was introduced in commit 2a9e399. It is difficult to encourage users to use the now preferred --nvr option when there is no stable release where the option would work.

Hence I propose to release osh-0.9.5 as soon as possible. Any objections?

hub: Shall we cleanup the fixtures?

Right now, the fixtures contain a massive amount of RHEL related data that cannot be easily used unless the user/developer has a valid RHEL subscription.

I propose to clean them up and replace them with CentOS and CentOS Stream where possible.

hub: simplify registration of admin models

This and probably other decorator-like functions like add_link_field can be replaced by modern Django features (provided via ModelAdmin class metadata) in the future.

ci: Get coverage reports for subprocesses in Python

It's not trivial to get coverage reports for subprocesses started in Python, and this currently affects the coverage reports for integration tests. Find a way to get coverage reports for subprocesses started in Python.
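
One approach documented by coverage.py is to set the `COVERAGE_PROCESS_START` environment variable for child processes and to call `coverage.process_startup()` early in each child (for example from a `.pth` file). The sketch below only prepares the child environment; the helper name `coverage_env` and the `.coveragerc` path are assumptions for illustration, and the configuration file would also need `parallel = True`.

```python
import os


def coverage_env(config_path=".coveragerc", base_env=None):
    """Return a copy of the environment with subprocess coverage enabled.

    coverage.py reads COVERAGE_PROCESS_START when coverage.process_startup()
    is called in the child process (e.g. from a .pth file in site-packages).
    """
    env = dict(os.environ if base_env is None else base_env)
    # An absolute path keeps working when the child changes its directory.
    env["COVERAGE_PROCESS_START"] = os.path.abspath(config_path)
    return env
```

A test could then start its subprocess with `subprocess.run(cmd, env=coverage_env())` and merge the per-process data files with `coverage combine` afterwards.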

ci: drop `codecov` from the default CI jobs

It is really annoying if all upstream commits have red CI because of codecov. Let's remove codecov from the CI until we find out how to integrate it in a more useful way.

docs: what shall we use for OpenScanHub documentation?

We currently have very limited documentation of OpenScanHub in the git repository. We should choose a documentation platform to make it easier for users to learn and use OpenScanHub.

hub: manual scan notification request crashes

Request Method: GET
Request URL: https://0.0.0.0/osh/admin/scan/scan/93903/change/notify/

Django Version: 3.2.19
Python Version: 3.9.16
Installed Applications:
('django.contrib.auth',
 'kobo.django.auth.apps.AuthConfig',
 'django.contrib.contenttypes',
 'django.contrib.sessions',
 'django.contrib.sites',
 'django.contrib.messages',
 'django.contrib.admin',
 'django.contrib.staticfiles',
 'django.contrib.humanize',
 'kobo.django.upload',
 'kobo.django.xmlrpc',
 'kobo.hub',
 'osh.hub.errata',
 'osh.hub.scan',
 'osh.hub.waiving',
 'osh.hub.stats')
Installed Middleware:
('django.contrib.sessions.middleware.SessionMiddleware',
 'django.middleware.common.CommonMiddleware',
 'django.middleware.csrf.CsrfViewMiddleware',
 'django.contrib.sessions.middleware.SessionMiddleware',
 'django.contrib.auth.middleware.AuthenticationMiddleware',
 'django.contrib.messages.middleware.MessageMiddleware',
 'kobo.django.auth.middleware.LimitedRemoteUserMiddleware',
 'kobo.hub.middleware.WorkerMiddleware',
 'kobo.django.menu.middleware.MenuMiddleware')



Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/django/core/handlers/exception.py", line 47, in inner
    response = get_response(request)
  File "/usr/lib/python3.9/site-packages/django/core/handlers/base.py", line 181, in _get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/usr/lib/python3.9/site-packages/django/utils/decorators.py", line 130, in _wrapped_view
    response = view_func(request, *args, **kwargs)
  File "/usr/lib/python3.9/site-packages/django/views/decorators/cache.py", line 44, in _wrapped_view_func
    response = view_func(request, *args, **kwargs)
  File "/usr/lib/python3.9/site-packages/django/contrib/admin/sites.py", line 232, in inner
    return view(request, *args, **kwargs)
  File "/usr/lib/python3.9/site-packages/osh/hub/scan/admin.py", line 63, in notify
    result = send_scan_notification(request, scan_id)
  File "/usr/lib/python3.9/site-packages/osh/hub/scan/notify.py", line 278, in send_scan_notification
    message = mg.generate_regular_scan_text()
  File "/usr/lib/python3.9/site-packages/osh/hub/scan/notify.py", line 215, in generate_regular_scan_text
    return self.generate_general_text() % {
  File "/usr/lib/python3.9/site-packages/osh/hub/scan/notify.py", line 185, in generate_general_text
    "New defects count: %d" % get_scans_new_defects_count(self.scan.id),

Exception Type: TypeError at /admin/scan/scan/93903/change/notify/
Exception Value: %d format: a number is required, not NoneType
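
The traceback boils down to formatting `None` with `%d`. A minimal sketch of a guard, assuming that `get_scans_new_defects_count` can return `None` for scans without results; `format_new_defects` is a hypothetical helper and not part of the codebase:

```python
def format_new_defects(count):
    """Format the defect count line, tolerating a missing (None) count."""
    if count is None:
        # Assumption: a scan without results has no countable new defects.
        return "New defects count: N/A"
    return "New defects count: %d" % count


# Reproduce the failure from the traceback in isolation.
try:
    "New defects count: %d" % None
    raised = False
except TypeError:
    raised = True
```

Whether "N/A", zero, or skipping the notification entirely is the right behaviour is a design decision for the hub.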

Triage and fix `noqa` directives

This project now contains the following noqa directives obtained using git grep -i 'noqa'. This issue serves as a tracking bug for all of them (triaged/fixed instances are marked using ✅):

containers: make the `hub` container compatible with `aiosmtpd`

Right now, the run.sh script for the hub container depends on the smtpd Python module, which is no longer available on Fedora 39.

As suggested by @frenzymadness in a private conversation, we should make the script compatible with both aiosmtpd and smtpd and prefer the former one if possible.
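
A minimal sketch of the "prefer aiosmtpd, fall back to smtpd" selection, assuming the choice is made from Python; `pick_smtp_module` is a hypothetical helper and how run.sh would consume its result is left open:

```python
import importlib.util


def pick_smtp_module(candidates=("aiosmtpd", "smtpd")):
    """Return the name of the first importable module, or None."""
    for name in candidates:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is not None:
            return name
    return None
```

The candidates are checked in order, so the preferred module wins whenever both are installed.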

edit: typo

hub: a task without subtasks may show an incorrect number of them

Sometimes a Task without any subtasks may show a bogus value for the number of its subtasks. Possibly a bug in kobo.

Add `CODEOWNERS` file?

Shall we add a CODEOWNERS file to automatise reviewer selection and to know who is responsible for which part of the codebase?

ci: enable Coverity integration in the upstream CI

It would be nice to use Coverity (the scanning service freely available for open-source projects) in the upstream CI of OpenScanHub.

hub: add a task filter to show failed tasks

Now, the web interface has filters to show all, only running or only finished tasks that we inherit from kobo:
https://github.com/release-engineering/kobo/blob/master/kobo/hub/urls/task.py

It would be convenient to also have a filter only for failed tasks so that we can more easily find them and triage the cause of their failure. We should try to implement this feature in kobo first.

edit: reword

hub: `scan_order`, `scans_count` and button targets in `waiving/result.html` may be incorrect

Schedule three errata scans:

First with NEW_PACKAGE as the base.
Immediately schedule the second one with a proper package as the baseline instead and let it finish.
Finally, schedule the last one with NEW_PACKAGE as the base and let it finish as well.

http://0.0.0.0:8000/waiving/1/ will show incorrect scan_order value:
http://0.0.0.0:8000/waiving/2/ will show incorrect scan_order value (or correct if you only care about successful scans) and correct scan_count (or incorrect if you only care about successful scans):
The Next button will go directly to http://0.0.0.0:8000/waiving/4/ (scan number 3).
http://0.0.0.0:8000/waiving/3/ is a base scan and it does not show Previous and Next buttons:
http://0.0.0.0:8000/waiving/4/ will show incorrect scan_order value (or correct if you only care about successful scans) and correct scan_count (or incorrect if you only care about successful scans):
The Previous button will go directly to http://0.0.0.0:8000/waiving/2/ (scan number 1).

The feature is definitely broken but I'm not sure what the expected behaviour should be. Do we want to list failed scans, do we want to list base scans, ...?

packaging: use a PEP 440 compliant version string

Right now, setuptools issues the following deprecation warning:

/tmp/build-env-ep_0a0rc/lib/python3.11/site-packages/setuptools/dist.py:510: SetuptoolsDeprecationWarning: Invalid version: '0.9.3.git.20230710.144836.0797932'.
!!

        ********************************************************************************
        The version specified is not a valid version according to PEP 440.
        This may not work as expected with newer versions of
        setuptools, pip, and PyPI.

        By 2023-Sep-26, you need to update your project and remove deprecated calls
        or your builds will no longer be supported.

        See https://peps.python.org/pep-0440/ for details.
        ********************************************************************************

!!

This issue should be fixed sooner rather than later because, as the warning itself says:

By 2023-Sep-26, you need to update your project and remove deprecated calls or your builds will no longer be supported.
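
One possible mapping of the current snapshot version to a PEP 440 compliant string is sketched below. The target scheme (`.devN` plus a `+git<hash>` local label) and the helper `to_pep440` are assumptions, not a decision made by the project, and the regular expression is a simplified check based on the canonical form from PEP 440 with an optional local version label.

```python
import re

# Simplified PEP 440 check: canonical public version + optional local label.
PEP440_RE = re.compile(
    r"^([1-9][0-9]*!)?(0|[1-9][0-9]*)(\.(0|[1-9][0-9]*))*"
    r"((a|b|rc)(0|[1-9][0-9]*))?(\.post(0|[1-9][0-9]*))?"
    r"(\.dev(0|[1-9][0-9]*))?(\+[a-z0-9]+(\.[a-z0-9]+)*)?$"
)


def to_pep440(version):
    """Convert 'X.Y.Z.git.DATE.TIME.HASH' to 'X.Y.Z.devDATETIME+gitHASH'."""
    base, sep, snapshot = version.partition(".git.")
    if not sep:
        return version  # no snapshot suffix, leave untouched
    date, time, commit = snapshot.split(".")
    return f"{base}.dev{date}{time}+git{commit}"


old = "0.9.3.git.20230710.144836.0797932"
new = to_pep440(old)
```

Note that `.devN` versions sort before the corresponding final release, whereas `.postN` versions sort after it, so the choice affects how snapshot packages upgrade relative to stable ones.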

ci: What shall we do with `scripts/cli_sanity_test.sh`?

This script seems to be unfinished and is not executed in the CI.

client: `download-results` for an `AnalyzerVersionRetriever` task crashes

Schedule an errata scan.
Try to download results for the associated AnalyzerVersionRetriever task.
Witness the following traceback:

Traceback (most recent call last):
  File "/Users/lzaoral/redhat/OpenScanHub/osh/client/osh-cli", line 79, in <module>
    main()
  File "/Users/lzaoral/redhat/OpenScanHub/osh/client/osh-cli", line 72, in main
    parser.run()
  File "/Users/lzaoral/redhat/OpenScanHub/kobo/kobo/cli.py", line 296, in run
    cmd.run(*cmd_args, **cmd_kwargs)
  File "/Users/lzaoral/redhat/OpenScanHub/osh/client/commands/cmd_download_results.py", line 51, in run
    fetch_results(self.hub, results_dir, task_id)
  File "/Users/lzaoral/redhat/OpenScanHub/osh/client/commands/shortcuts.py", line 128, in fetch_results
    tarball = _get_result_filename(task_info['args']) + '.tar.xz'
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/lzaoral/redhat/OpenScanHub/osh/client/commands/shortcuts.py", line 116, in _get_result_filename
    nvr = task_args['build']
          ~~~~~~~~~^^^^^^^^^
KeyError: 'build'
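
The crash comes from indexing task arguments that have no `build` key. A minimal sketch of a more defensive lookup, assuming that tasks without a `build` argument have no results tarball to download; `tarball_name` is a hypothetical stand-in and not the actual client code:

```python
def tarball_name(task_args):
    """Return the results tarball name, or raise a readable error."""
    nvr = task_args.get("build")
    if nvr is None:
        # Assumption: no 'build' argument means there is nothing to fetch.
        raise ValueError("task has no 'build' argument, no results to download")
    return nvr + ".tar.xz"
```

The client could catch this error and print a short message instead of a traceback.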

hub: check appropriateness of `on_delete` for `ForeignKey` fields

During the migration to Django 2.0, all ForeignKey fields were migrated to use the on_delete=models.CASCADE behaviour. While it made sense at the time not to change the deletion behaviour from older Django versions, some instances of on_delete=models.CASCADE could actually do more harm than good.

We should check all usages of on_delete and set them to the most appropriate value for the given ForeignKey.

See release-engineering/kobo#224 for detailed explanation.

ci: Avoid duplicate runs of workflows

GitHub Actions CI is configured to run on push and pull_request events on the main branch. This causes duplicated runs of the workflows when a PR gets merged. We should avoid duplicating workflow runs in such scenarios.

hub: user with a name containing the `/` character cannot be modified in Django Admin

I am not able to update user permissions through web UI for a user name that contains /:

Enter a valid username. This value may contain only letters, numbers and @/./+/-/_ characters.

In this case, the user is a Kerberos service principal. However, last week I hit a similar problem with a user named worker/..., which OSH/kobo automatically creates for each worker. Any idea if we could somehow relax this check at /osh/admin/kobo_auth/user/?

Apparently, such users exist in OSH and anything but Django Admin is able to work with them.

Originally reported in: #168 (comment)
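
The error message corresponds to the pattern used by Django's default username validators. The sketch below compares it with a relaxed pattern that also accepts `/`; the relaxed pattern is only an illustration, and how it would be wired into the kobo user model or its admin form is not shown.

```python
import re

# Pattern used by Django's default username validators.
DJANGO_USERNAME_RE = re.compile(r"^[\w.@+-]+\Z")
# Hypothetical relaxed pattern that additionally accepts '/'.
RELAXED_USERNAME_RE = re.compile(r"^[\w.@+/-]+\Z")


def is_valid(username, pattern=DJANGO_USERNAME_RE):
    """Return True if the whole username matches the given pattern."""
    return pattern.match(username) is not None
```

With these patterns, `worker/localhost` is rejected by the default pattern and accepted by the relaxed one.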

hub: base scans may exist without a `ScanBinding`

Sometimes, a base scan may be created without a corresponding ScanBinding, leading to a crash when the scan finishes and when users try to access the given waiver.

Reported-by: @kdudka

hub: triage and fix ShellCheck reports

See https://github.com/openscanhub/openscanhub/security/code-scanning.

ci: enable Snyk Code integration in the upstream CI

It would be nice to use Snyk Code (the scanning service freely available for open-source projects) in the upstream CI of OpenScanHub.

hub: shall we use PatternFly to redesign web user interface?

PatternFly is an Open Source design system recommended by Red Hat. I am opening this issue to discuss if we should redesign web user interface to use it.

Thanks @xsuchy for bringing this issue to our attention.

hub: exceptions in `upload_task_log` pause worker logging

This might be an issue in kobo. While experimenting in the worker, I found out that if the upload_task_log XML-RPC method raises an exception, it will not be logged and the worker output logging will appear as frozen.

packaging: Build staging packages

We are currently building only devel packages from the main branch on GitHub.

Internally, staging packages are built by merging existing open merge requests on GitLab, so "staging" packages are ahead of the main branch. We should investigate with the Copr and Testing Farm teams whether we can build a "private testing" cluster for each merge request and share it with existing users.

We should investigate the process of building staging packages for this project.

Build a homepage for this project

We should have a homepage for this project on the openscanhub.dev domain. Here are a few questions that should be discussed:

Shall we use GitHub pages for this project?
Shall we use Jekyll, some other static site generator or static html pages?
If we use Jekyll (or some other static site generator), which theme shall we use?
Shall we use any other extra functionality provided by GitHub for Jekyll?

Setup issue forms and automatic triage for better user experience

Issue forms allow you to define web-like input forms using YAML syntax. They let you guide the reporter to provide the required information.

Example of issue form - systemd

Since each issue follows the same format, you can parse it, and based on the provided information, you can label issues (e.g. hub, worker, cli, documentation, etc.). GitHub Action Advanced Issue Labeler can help you with labeling issues.

Links:

hub: `osh.hub.scan.service.get_latest_binding` seems to return wrong results

I believe that the implementation of the osh.hub.scan.service.get_latest_binding function is wrong.

It is used in ~~two distinct~~ contexts:

~~To select the latest scan to rescheduling from the admin interface.~~
To select the newest base scan for errata scanning.

The current implementation is the following:

def get_latest_binding(scan_nvr, show_failed=False):
    """Return latest binding for specified nvr"""
    if show_failed:
        query = ScanBinding.objects.filter(scan__nvr=scan_nvr)
    else:
        query = ScanBinding.objects.filter(
            scan__nvr=scan_nvr,
            result__isnull=False).exclude(
                scan__state=SCAN_STATES['FAILED'])
    if query:
        # '-date' -- latest; 'date' -- oldest
        latest_submitted = query.order_by('-scan__date_submitted')[0]
        if (latest_submitted.scan.state == SCAN_STATES['QUEUED']
                or latest_submitted.scan.state == SCAN_STATES['SCANNING']
                or latest_submitted.result is None):
            return latest_submitted
        else:
            return query.latest()
    else:
        return None

which can be simplified to the following equivalent but easier to read implementation:

def get_latest_binding(scan_nvr, show_failed=False):
    """Return latest binding for specified nvr"""
    query = ScanBinding.objects.filter(scan__nvr=scan_nvr)
    if not show_failed:
        # Refine the NVR-filtered query instead of starting a new one.
        query = query.filter(result__isnull=False).exclude(
            scan__state=SCAN_STATES['FAILED'])

    if not query:
        return None

    # '-date' -- latest; 'date' -- oldest
    latest_submitted = query.order_by('-scan__date_submitted')[0]
    if (latest_submitted.scan.state == SCAN_STATES['QUEUED']
            or latest_submitted.scan.state == SCAN_STATES['SCANNING']
            or latest_submitted.result is None):
        return latest_submitted

    return query.latest()

The code can be divided into two parts:

First, we want to select all bindings with the given NVR and, if required, exclude failed scans (missing results or failed scan), returning None if the query is empty. No issue there.
Then, we want to select the latest binding:
a. We only want non-failed scans:
1. First, look for the binding with the latest scan submission date and return it if it does not have results or if it is queued or in progress.
Q: Is it possible to have a scan binding with results whose scan is queued or in progress?

Q: What if the binding has results but the scan is base scanning?

Q: Do we want to search through all bindings when we look for a base?
2. If no such binding exists, we return the binding with the newest results (because ScanBinding specifies get_latest_by = "result__date_submitted").
Q: Do we want to use the binding with the latest results even though it may not be the newest submitted, e.g. when a newer scan finished sooner than the one with the latest results?

b. ~~We want all scans:~~
1. ~~The behaviour should be the same. Therefore, same questions apply.~~
2. This makes no sense for failed scans whatsoever. We want to return the latest scan binding regardless of its state, but we will still return a binding with the latest results, which failed scans generally do not have.
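
The last question in part (a) can be illustrated with a plain-Python model of the non-failed branch. This is only a sketch: the `Binding` dataclass, the integer timestamps and the "PASSED" state name are assumptions standing in for the Django models, and the `result is None` check is omitted because the filter already excludes such bindings.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Binding:
    name: str
    state: str                       # e.g. "QUEUED", "SCANNING", "PASSED"
    scan_submitted: int              # submission time of the scan
    result_submitted: Optional[int]  # submission time of the result, if any


def get_latest_binding(bindings):
    """Plain-Python model of the non-failed branch described above."""
    query = [b for b in bindings
             if b.result_submitted is not None and b.state != "FAILED"]
    if not query:
        return None
    latest_submitted = max(query, key=lambda b: b.scan_submitted)
    if latest_submitted.state in ("QUEUED", "SCANNING"):
        return latest_submitted
    # Mirrors query.latest() with get_latest_by = "result__date_submitted".
    return max(query, key=lambda b: b.result_submitted)


older = Binding("older", "PASSED", scan_submitted=1, result_submitted=10)
newer = Binding("newer", "PASSED", scan_submitted=2, result_submitted=5)
chosen = get_latest_binding([older, newer])
```

Here `chosen` is the binding named "older": the scan submitted later finished first, so the binding with the latest results is not the most recently submitted one.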

edits: reflect recent changes in main

Triage and fix TODOs, FIXMEs and XXXs

This project now contains the following FIXMEs, TODOs and XXXs obtained using git grep -IPi '\b(TODO|XXX|FIXME)\b'. This issue serves as a tracking bug for all of them (fixed instances are marked using ✅):

hub: exported XML-RPC methods contain unexpected entries

We should check and mark explicitly methods and types that should be exported over XML-RPC. For example, the http://0.0.0.0:8000/xmlrpc/worker/ page lists the following types (among others) that are exported but shouldn't be:

...
worker.AnalyzerVersion
worker.AppSettings
worker.BaseNotValidException
worker.FileUpload
worker.Scan
worker.ScanBinding
worker.ScanningSession
worker.Task
worker.TaskResultsProcessor
...

ci: Testing Farm is not executed for jobs triggered by a push to `main`

Testing Farm tests are not executed for jobs triggered by a push to main.