
ibutsu-server's People

Contributors

bsquizz, dependabot[bot], eliadcohen, griffin-sullivan, jjaquish, john-dupuy, jyejare, lightofheaven1994, mrpetrbartos, mshriver, obaranov, patchkez, psav, ronnypfannschmidt, rsnyman, victoremepunto

ibutsu-server's Issues

Generic table view or widget

Several teams in Insights have expressed interest in a matrix and/or table widget that displays specific extra metadata about the run.

An example table could be something like:
[Screenshot from 2021-01-19 15-43-43: example table]

In the screenshot, the job could correspond to a specific run or groups of runs. The result could be a pass percentage. And the other columns could be some user-specified metadata (like RHEL_VERSION, EGG_VERSION).

This could be accomplished in two ways:

  1. A widget that can be added to a dashboard
  2. A view that can be navigated to via the left nav

As there are potentially several columns that would be useful to display here, I am leaning towards option 2). However, option 2) forgoes using dashboards, which means this new table would be visible to everybody.

Regardless of whether we decide to add a widget or a view, we'd need to do the following:
a) Add a new widget BE query file to this directory: https://github.com/ibutsu/ibutsu-server/tree/master/backend/ibutsu_server/widgets (a rough sketch of such a query function is included at the end of this issue)
b) Add the new widget to constants.py
c) Write the frontend for the new widget by adding a new component to this directory: https://github.com/ibutsu/ibutsu-server/tree/master/frontend/src/widgets

cc @eduardocerqueira @prinewgirl
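
For illustration only, here is a minimal sketch of how such a backend query function might aggregate results into table rows. The function name, the `job_name` metadata key, and the result-dict shape are assumptions; the real widget API in ibutsu_server/widgets may differ.

from collections import defaultdict

def generate_result_table(results, metadata_columns=("RHEL_VERSION", "EGG_VERSION")):
    """Hypothetical widget helper: group results by job and compute a pass percentage.

    `results` is assumed to be an iterable of Ibutsu-style result dicts, e.g.
    {"result": "passed", "metadata": {"job_name": "...", "RHEL_VERSION": "..."}}.
    """
    rows = defaultdict(lambda: {"total": 0, "passed": 0, "metadata": {}})
    for result in results:
        metadata = result.get("metadata", {})
        job = metadata.get("job_name", "unknown")
        rows[job]["total"] += 1
        if result.get("result") == "passed":
            rows[job]["passed"] += 1
        for column in metadata_columns:
            if column in metadata:
                rows[job]["metadata"][column] = metadata[column]
    return [
        {
            "job": job,
            "pass_percentage": round(100.0 * data["passed"] / data["total"], 1),
            **data["metadata"],
        }
        for job, data in rows.items()
    ]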

Pagination is slow as the collection size grows

We've been experiencing slowdowns with our pagination (particularly with our ever-growing results collection), and they can be explained by two operations:

  1. mongo.results.count
  2. the use of skip in mongo.results.find

Neither of these can efficiently make use of the indexes on the collection, so as the collection grows they become extremely inefficient. See https://docs.mongodb.com/v3.6/reference/method/cursor.skip/#using-cursor-skip, in particular:

The cursor.skip() method requires the server to scan from the beginning of the input results set before beginning to return results. As the offset increases, cursor.skip() will become slower.

We have been talking about moving to PSQL, and while I agree that we will see some performance boost, I think we will hit the same issues if we leave the pagination implementation as is. Based on the PSQL docs, we'd be implementing pagination with LIMIT/OFFSET (https://www.postgresql.org/docs/current/queries-limit.html), cf. in particular:

The rows skipped by an OFFSET clause still have to be computed inside the server; therefore a large OFFSET might be inefficient.

  • Therefore, I think we need to rethink how we do pagination
  • Options for pagination:
    1. A last ID (keyset) approach like the one here
      - this would force us to traverse pages in a linear manner (1 -> 2 -> 3 -> etc.)
    2. A hard-limit approach: if the number of documents exceeds some cutoff (say 1e4), show a button to load the next 1e4 when the user reaches the last page

Option (2) is my preferred option, because I think it is rare that someone actually wants to navigate to arbitrary pages of results.

In the implementation of option (2) we could introduce a (low) timeout for the mongo.results.count query; if it hits the timeout, we limit the documents to some MAX_DOCUMENTS.

This way we circumvent the inherent slowness of count, and skip will never have to skip more than MAX_DOCUMENTS. For loading the next batch of documents we could implement something akin to option (1).

Some information is found in http://www.ovaistariq.net/404/mysql-paginated-displays-how-to-kill-performance-vs-how-to-improve-performance/
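
To make the "last ID" idea concrete, here is a small, runnable sketch comparing OFFSET pagination with keyset pagination. It uses sqlite3 purely to illustrate the query shape; the production query would be against MongoDB or PostgreSQL, and the table/column names are made up.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (id INTEGER PRIMARY KEY, test_id TEXT)")
conn.executemany(
    "INSERT INTO results (test_id) VALUES (?)", [(f"test_{i}",) for i in range(1000)]
)

PAGE_SIZE = 25

def page_by_offset(page):
    # OFFSET forces the server to walk past every skipped row; cost grows with the offset.
    return conn.execute(
        "SELECT id, test_id FROM results ORDER BY id LIMIT ? OFFSET ?",
        (PAGE_SIZE, page * PAGE_SIZE),
    ).fetchall()

def page_after(last_id):
    # Keyset ("last ID") pagination seeks directly via the index; cost is constant per page.
    return conn.execute(
        "SELECT id, test_id FROM results WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, PAGE_SIZE),
    ).fetchall()

first_page = page_after(last_id=0)
second_page = page_after(last_id=first_page[-1][0])
assert second_page == page_by_offset(1)  # same rows, but without scanning the skipped ones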

Support filtering on the arguments of a marker

Thanks @mshriver for reporting this.

Some teams need to filter not only on the name of a marker but also on its arguments, e.g.:

        "markers": [
          {
            "args": [],
            "kwargs": {},
            "name": "tier2"
          },
          {
            "args": [
              "Critical"
            ],
            "kwargs": {},
            "name": "importance"
          },
          {
            "args": [
              "ContentViews"
            ],
            "kwargs": {},
            "name": "component"
          },
          {
            "args": [
              "High"
            ],
            "kwargs": {},
            "name": "importance"
          }
        ]

It would be nice if Ibutsu could handle these more complicated markers.
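
As an illustration of the desired semantics only (not a proposed implementation), a filter such as "importance:Critical" would need to match on both the marker name and one of its positional arguments, roughly like this small predicate over the structure shown above:

def matches_marker(markers, name, arg=None):
    """Return True if any marker has the given name and (optionally) the given positional arg."""
    for marker in markers:
        if marker.get("name") != name:
            continue
        if arg is None or arg in marker.get("args", []):
            return True
    return False

markers = [
    {"args": [], "kwargs": {}, "name": "tier2"},
    {"args": ["Critical"], "kwargs": {}, "name": "importance"},
]
assert matches_marker(markers, "tier2")
assert matches_marker(markers, "importance", "Critical")
assert not matches_marker(markers, "importance", "High")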

get_import does not populate run_id

When running get_import it never populates the run_id:

{'filename': 'results.xml',
 'format': '',
 'id': 'aaf35f01-5711-4699-a40f-c8f0499ac776',
 'run_id': None,
 'status': 'done'}

But when fetching the import directly from the API, the following is returned:

{
  "filename": "results.xml",
  "format": "",
  "id": "aaf35f01-5711-4699-a40f-c8f0499ac776",
  "metadata": {
    "run_id": [
      "66d5ff81-5e75-4df1-a497-41a094285ea3"
    ]
  },
  "status": "done"
}

It would be great if run_id were populated accordingly when get_import is used.
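
Until the field is populated server-side, a client-side workaround is to pull the value out of metadata. This is a hypothetical helper based only on the two responses shown above:

def extract_run_id(import_record):
    """Return the run_id for a finished import, falling back to metadata["run_id"]."""
    if import_record.get("run_id"):
        return import_record["run_id"]
    run_ids = import_record.get("metadata", {}).get("run_id") or []
    return run_ids[0] if run_ids else None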

Better load balancing for the backend

When the API gets too many requests at once (particularly when they are async) I've noticed that backend pods can give up and exit:

 - - [09/Mar/2021:13:41:31 +0000] "GET /api/result?filter=component%3Dhost_inventory&filter=env%3Dstage&filter=start_time%3E2021-02-07&page=20&pageSize=30 HTTP/1.1" 200 89298 "-" "Python/3.8 aiohttp/3.7.4.post0"
[2021-03-09 13:41:31 +0000] [1] [INFO] Handling signal: term
 - - [09/Mar/2021:13:41:31 +0000] "GET /api/result?filter=component%3Ddrift&filter=env%3Dstage&filter=start_time%3E2021-02-07&page=16&pageSize=30 HTTP/1.1" 200 91776 "-" "Python/3.8 aiohttp/3.7.4.post0"
[2021-03-09 13:41:31 +0000] [37] [INFO] Worker exiting (pid: 37)
 - - [09/Mar/2021:13:41:32 +0000] "GET /api/result?filter=component%3Dcompliance&filter=env%3Dstage&filter=start_time%3E2021-02-07&page=19&pageSize=30 HTTP/1.1" 200 86346 "-" "Python/3.8 aiohttp/3.7.4.post0"
[2021-03-09 13:41:32 +0000] [38] [INFO] Worker exiting (pid: 38)
 - - [09/Mar/2021:13:41:32 +0000] "GET /api/result?filter=component%3Dpatchman&filter=env%3Dstage&filter=start_time%3E2021-02-07&page=18&pageSize=30 HTTP/1.1" 200 83171 "-" "Python/3.8 aiohttp/3.7.4.post0"
[2021-03-09 13:41:32 +0000] [39] [INFO] Worker exiting (pid: 39)
[2021-03-09 13:41:33 +0000] [1] [INFO] Shutting down: Master

We should have some better load balancing in place to deal with this. Ping me if you need code to reproduce the issue.
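
One place to start looking (not a confirmed fix, and all values below are illustrative) is the gunicorn worker configuration, which can be tuned in a gunicorn.conf.py:

# gunicorn.conf.py -- illustrative values, to be tuned against real load
workers = 4                 # more worker processes spread concurrent API requests
threads = 2                 # threads help with I/O-bound request handling
worker_class = "gthread"    # threaded workers; an async worker class is another option
timeout = 120               # give slow paginated queries more time before the worker is killed
graceful_timeout = 30       # allow in-flight requests to finish on shutdown
max_requests = 1000         # recycle workers periodically to avoid memory creep
max_requests_jitter = 100   # stagger the recycling so workers don't all restart at once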

Support xfail/xpass in widgets/aggregations

Currently, the xfail/xpass test states are accepted as valid results in runs/results.

However, they are not considered in widget calculations. We should include them as viable states in all widgets.
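
For example, a summary aggregation would need to count the extra states alongside the existing ones. A minimal sketch, assuming the states are stored as "xfailed"/"xpassed" (the exact strings are an assumption):

from collections import Counter

VALID_STATES = ("passed", "failed", "error", "skipped", "xfailed", "xpassed")

def summarize(results):
    """Count results per state, including xfailed/xpassed, for widget aggregations."""
    counts = Counter(r.get("result") for r in results)
    return {state: counts.get(state, 0) for state in VALID_STATES}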

Endpoint for pass rate calculation

I have a task to calculate the pass rate for a test_id over a given period of time. It would be nice to have this endpoint in the Ibutsu API.
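
The calculation itself is simple; a sketch of what such an endpoint could compute, given Ibutsu-style result dicts for a test_id within a time window (pure Python for illustration; the real endpoint would query the database instead):

from datetime import datetime

def pass_rate(results, test_id, start, end):
    """Pass rate (percent) of a test_id over [start, end); start/end are timezone-aware datetimes."""
    relevant = [
        r for r in results
        if r.get("test_id") == test_id
        # start_time values look like "2020-11-17T14:51:48.114645Z", so normalise the "Z" suffix
        and start <= datetime.fromisoformat(r["start_time"].replace("Z", "+00:00")) < end
    ]
    if not relevant:
        return None
    passed = sum(1 for r in relevant if r.get("result") == "passed")
    return round(100.0 * passed / len(relevant), 2)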

Support metadata in Import API

To make it easier to add metadata to a run created via the Import API, add a metadata field which can be applied to the run and the results.

  • Create metadata field in Import API spec
  • Add support for metadata field in Import controller
  • Apply metadata to run
  • Apply metadata to results
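
A rough sketch of the merge step in the import controller (a hypothetical function operating on plain dicts; the actual spec and controller changes in the task list above are still required, and the real run/result objects are database models):

def apply_import_metadata(run, results, metadata):
    """Merge user-supplied import metadata into the run and each of its results.

    Existing keys on the run/result win, so the import can't silently overwrite data.
    """
    if not metadata:
        return
    run.setdefault("metadata", {})
    for key, value in metadata.items():
        run["metadata"].setdefault(key, value)
    for result in results:
        result.setdefault("metadata", {})
        for key, value in metadata.items():
            result["metadata"].setdefault(key, value)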

Foreign key violations when uploading artifacts

On prod Ibutsu in both the DB pod and BE pod, I am seeing:

[2020-10-27 16:59:31,129] ERROR in app: Exception on /api/artifact [POST]
Traceback (most recent call last):
  File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1277, in _execute_context
    cursor, statement, parameters, context
  File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line 593, in do_execute
    cursor.execute(statement, parameters)
psycopg2.errors.ForeignKeyViolation: insert or update on table "artifacts" violates foreign key constraint "artifacts_result_id_fkey"
DETAIL:  Key (result_id)=(8715bbd4-de7b-4881-864d-3bf164c35a18) is not present in table "results".


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/app-root/lib/python3.6/site-packages/flask/app.py", line 2447, in wsgi_app
    response = self.full_dispatch_request()
  File "/opt/app-root/lib/python3.6/site-packages/flask/app.py", line 1952, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/opt/app-root/lib/python3.6/site-packages/flask_cors/extension.py", line 165, in wrapped_function
    return cors_after_request(app.make_response(f(*args, **kwargs)))
  File "/opt/app-root/lib/python3.6/site-packages/flask/app.py", line 1821, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/opt/app-root/lib/python3.6/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/opt/app-root/lib/python3.6/site-packages/flask/app.py", line 1950, in full_dispatch_request
    rv = self.dispatch_request()
  File "/opt/app-root/lib/python3.6/site-packages/flask/app.py", line 1936, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/opt/app-root/lib/python3.6/site-packages/connexion/decorators/decorator.py", line 48, in wrapper
    response = function(request)
  File "/opt/app-root/lib/python3.6/site-packages/connexion/decorators/uri_parsing.py", line 144, in wrapper
    response = function(request)
  File "/opt/app-root/lib/python3.6/site-packages/connexion/decorators/validation.py", line 184, in wrapper
    response = function(request)
  File "/opt/app-root/lib/python3.6/site-packages/connexion/decorators/parameter.py", line 121, in wrapper
    return function(**kwargs)
  File "/opt/app-root/src/ibutsu_server/controllers/artifact_controller.py", line 127, in upload_artifact
    session.commit()
  File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/orm/scoping.py", line 163, in do
    return getattr(self.registry(), name)(*args, **kwargs)
  File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 1042, in commit
    self.transaction.commit()
  File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 504, in commit
    self._prepare_impl()
  File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 483, in _prepare_impl
    self.session.flush()
  File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 2536, in flush
    self._flush(objects)
  File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 2678, in _flush
    transaction.rollback(_capture_exception=True)
  File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/util/langhelpers.py", line 70, in __exit__
    with_traceback=exc_tb,
  File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
    raise exception
  File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 2638, in _flush
    flush_context.execute()
  File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/orm/unitofwork.py", line 422, in execute
    rec.execute(self)
  File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/orm/unitofwork.py", line 589, in execute
    uow,
  File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/orm/persistence.py", line 245, in save_obj
    insert,
  File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/orm/persistence.py", line 1136, in _emit_insert_statements
    statement, params
  File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1011, in execute
    return meth(self, multiparams, params)
  File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/sql/elements.py", line 298, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1130, in _execute_clauseelement
    distilled_params,
  File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1317, in _execute_context
    e, statement, parameters, cursor, context
  File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1511, in _handle_dbapi_exception
    sqlalchemy_exception, with_traceback=exc_info[2], from_=e
  File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
    raise exception
  File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1277, in _execute_context
    cursor, statement, parameters, context
  File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line 593, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.IntegrityError: (psycopg2.errors.ForeignKeyViolation) insert or update on table "artifacts" violates foreign key constraint "artifacts_result_id_fkey"
DETAIL:  Key (result_id)=(8715bbd4-de7b-4881-864d-3bf164c35a18) is not present in table "results".

This happens fairly regularly. I think that in some instances the plugin may be trying to upload an artifact before the corresponding result has been uploaded.
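
If that ordering is indeed the cause, one mitigation on the plugin side would be to retry the artifact upload briefly until the result exists. A sketch, where upload_artifact is a hypothetical callable wrapping POST /api/artifact:

import time

def upload_artifact_with_retry(upload_artifact, result_id, path, retries=5, delay=1.0):
    """Retry an artifact upload a few times, since the result row may not exist yet.

    `upload_artifact` is assumed to raise an exception when the server rejects the upload.
    """
    for attempt in range(retries):
        try:
            return upload_artifact(result_id=result_id, filename=path)
        except Exception:
            if attempt == retries - 1:
                raise
            time.sleep(delay * (attempt + 1))  # simple linear backoff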

Run GET response can't be used directly in a PUT request

Considering the following run GET response (one provided after a jUnit import):

{
  "component": null,
  "created": "2020-12-08T14:09:33.739307Z",
  "data": null,
  "duration": 0.067,
  "env": null,
  "id": "c3d3ea87-ca0b-4255-8d14-510e6d2cb0b1",
  "project_id": null,
  "source": null,
  "start_time": "2020-11-17T14:51:48.114645Z",
  "summary": {
    "errors": "1",
    "failures": "1",
    "skips": "1",
    "tests": "4",
    "xfailures": null,
    "xpasses": null
  }
}

Then when trying to use it in a PUT request, the following 400 response body is returned:

{
  "detail": "None is not of type 'string' - 'source'",
  "status": 400,
  "title": "Bad Request",
  "type": "about:blank"
}

So it seems that source is expected to be an empty string ("") instead of null, but the GET response for a run is not providing a valid value. If I just drop the source field, it works as expected.

500 error when running a report

On prod ibutsu, select the "Satellite QE" project and try to run a report:

{
  "detail": "The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.",
  "status": 500,
  "title": "Internal Server Error",
  "type": "about:blank"
}

Results tree results in page crash for Junit XML imported runs

Steps to reproduce:

  • Import a Junit XML
  • Navigate to the run details page after it's finished importing
  • Click "Results tree"
Error fetching result data: TypeError: Cannot read property 'split' of null
    at utilities.js:349
    at run.js:156
    at Array.forEach (<anonymous>)
    at Ba.buildTree (run.js:155)
    at Ba.<anonymous> (run.js:369)
    at fo (react-dom.production.min.js:131)
    at ol (react-dom.production.min.js:212)
    at ps (react-dom.production.min.js:255)
    at t.unstable_runWithPriority (scheduler.production.min.js:19)
    at Va (react-dom.production.min.js:122)

Update runs that have not exited gracefully

When a pytest session is interrupted, e.g. by an outage or by the pod being killed, the run is never updated. This means that the counts, metadata, etc will never be added to the run.

We should have a periodic task that goes through and checks for these runs - updating them with the partial results.
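
A sketch of such a periodic task using Celery beat. The broker URL is hypothetical, and the two query/update helpers are placeholders for the real database code; the real task would live alongside the other tasks in ibutsu_server/tasks:

from datetime import datetime, timedelta, timezone

from celery import Celery

app = Celery("ibutsu_tasks", broker="redis://localhost:6379/0")  # hypothetical broker URL

# Run the clean-up hourly via Celery beat.
app.conf.beat_schedule = {
    "update-abandoned-runs": {
        "task": "tasks.update_abandoned_runs",
        "schedule": timedelta(hours=1),
    }
}

def find_runs_not_updated_since(cutoff):
    # Placeholder for the real query against the runs table.
    raise NotImplementedError

def update_run_summary_from_results(run):
    # Placeholder: recompute counts/metadata from the results that did get stored.
    raise NotImplementedError

@app.task(name="tasks.update_abandoned_runs")
def update_abandoned_runs(max_age_hours=3):
    """Find runs that stopped receiving results and update them with partial data."""
    cutoff = datetime.now(timezone.utc) - timedelta(hours=max_age_hours)
    for run in find_runs_not_updated_since(cutoff):
        update_run_summary_from_results(run)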

jUnit XML Importer does not handle empty test suites

When an XML file has no testcase children in a testsuite, the following exception occurs:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/celery/app/trace.py", line 385, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/app/ibutsu_server/tasks/__init__.py", line 30, in __call__
    return super().__call__(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/celery/app/trace.py", line 648, in __protected_call__
    return self.run(*args, **kwargs)
  File "/app/ibutsu_server/tasks/importers.py", line 95, in run_junit_import
    for testcase in testsuite.testcase:
  File "src/lxml/objectify.pyx", line 231, in lxml.objectify.ObjectifiedElement.__getattr__
  File "src/lxml/objectify.pyx", line 450, in lxml.objectify._lookupChildOrRaise
AttributeError: no such child: testcase
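
A small guard avoids the AttributeError, since lxml.objectify raises for missing children. This is a runnable, self-contained sketch; the importer could apply the same getattr default around its testcase loop:

from lxml import objectify

xml = b"""
<testsuites>
  <testsuite name="empty" tests="0"/>
  <testsuite name="populated" tests="1">
    <testcase classname="test_module" name="test_something" time="0.01"/>
  </testsuite>
</testsuites>
"""

tree = objectify.fromstring(xml)
for testsuite in tree.testsuite:
    # objectify raises AttributeError for missing children, so fall back to an empty list
    for testcase in getattr(testsuite, "testcase", []):
        print(testsuite.get("name"), testcase.get("name"))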

Better Error Reporting in Imports

If an import fails, there usually isn't a way to see that it has failed, nor is there a way to tell what the error was.

We need to catch errors and report them to the user.

Sometimes must refresh on direct links to Ibutsu

Sometimes when I click a direct link to a Run or Result details page in Ibutsu, I must refresh the page in order for it to load properly.

In the console I can see this error:

Uncaught (in promise) TypeError: Cannot read property 'result' of null
    at Ot.render (result.js:165)
    at Ri (react-dom.production.min.js:182)
    at Ii (react-dom.production.min.js:181)
    at gl (react-dom.production.min.js:263)
    at cs (react-dom.production.min.js:246)
    at ls (react-dom.production.min.js:246)
    at Ql (react-dom.production.min.js:239)
    at react-dom.production.min.js:123
    at t.unstable_runWithPriority (scheduler.production.min.js:19)
    at Va (react-dom.production.min.js:122)

We probably need to fix the loading state of these pages; it only seems to happen on the details pages.

Support legacy junit xml from pytest

Recently, importing this Junit XML file https://github.com/SatelliteQE/betelgeuse/blob/master/sample_project/results/sample-junit-result.xml failed with the following traceback:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/celery/app/trace.py", line 385, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/app/ibutsu_server/tasks/__init__.py", line 30, in __call__
    return super().__call__(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/celery/app/trace.py", line 648, in __protected_call__
    return self.run(*args, **kwargs)
  File "/app/ibutsu_server/tasks/importers.py", line 76, in run_junit_import
    for testsuite in tree.testsuite:
  File "src/lxml/objectify.pyx", line 231, in lxml.objectify.ObjectifiedElement.__getattr__
  File "src/lxml/objectify.pyx", line 450, in lxml.objectify._lookupChildOrRaise
AttributeError: no such child: testsuite

This appears to be due to the legacy pytest JUnit XML format; see https://docs.pytest.org/en/stable/reference.html?highlight=junit#confval-junit_family.

Ideally we should support both. At the very least we should specify which XML version we support.
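
A sketch of how the importer could accept both root layouts, the newer <testsuites> wrapper as well as a bare <testsuite> root (the helper name is hypothetical):

from lxml import objectify

def iter_testsuites(xml_bytes):
    """Yield <testsuite> elements whether or not they are wrapped in <testsuites>."""
    root = objectify.fromstring(xml_bytes)
    tag = root.tag.split("}")[-1]  # tolerate an XML namespace on the root tag
    if tag == "testsuites":
        # Newer layout: <testsuites> wrapping one or more suites
        yield from getattr(root, "testsuite", [])
    elif tag == "testsuite":
        # Legacy layout: the root element is itself a single <testsuite>
        yield root
    else:
        raise ValueError("Unexpected root element: %s" % root.tag)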

[OCP] Postgresql DB backup failing when copying results table

pg_dump: dumping contents of table "public.dashboards"
pg_dump: dumping contents of table "public.groups"
pg_dump: dumping contents of table "public.import_files"
pg_dump: dumping contents of table "public.imports"
pg_dump: dumping contents of table "public.meta"
pg_dump: dumping contents of table "public.projects"
pg_dump: dumping contents of table "public.report_files"
pg_dump: dumping contents of table "public.reports"
pg_dump: dumping contents of table "public.results"
pg_dump: WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.
pg_dump: error: Dumping the contents of table "results" failed: PQgetCopyData() failed.
pg_dump: error: Error message from server: server closed the connection unexpectedly
	This probably means the server terminated abnormally
	before or while processing the request.
pg_dump: error: The command was: COPY public.results (id, component, data, duration, env, params, project_id, result, run_id, source, start_time, test_id) TO stdout;
Backup failed

TypeError opening project selection dropdown

With a project already selected, a TypeError occurs and the page goes blank when the project dropdown is clicked.

A refresh reloads the page without error.

I can remove the selected project from the dropdown with the X in the dropdown, and then the dropdown behaves as expected, listing the available projects.

UI jUnit import is not setting the project and shows "undefined" in the notification

Steps to reproduce:

  1. Select a project to make it active
  2. Import a jUnit XML file (the "run created" notification shows "undefined", as in the animation below):

[animated screenshot: ibutsu-import]

  3. Find the related run (use the developer tools network tab to find the run ID), check the API GET response, and verify that the project_id is null:
{
  "component": null,
  "created": "2020-12-08T14:09:33.739307Z",
  "data": null,
  "duration": 0.067,
  "env": null,
  "id": "c3d3ea87-ca0b-4255-8d14-510e6d2cb0b1",
  "project_id": null,
  "source": null,
  "start_time": "2020-11-17T14:51:48.114645Z",
  "summary": {
    "errors": "1",
    "failures": "1",
    "skips": "1",
    "tests": "4",
    "xfailures": null,
    "xpasses": null
  }
}

Another suggestion would be to allow the user to dismiss the "run created" notification, so they have time to click the run link it provides, or to increase its timeout.

Update docs for postgresql

Our README and hosted documentation still contain references to mongo (and no references to postgresql).

We should update the README and getting started guide with instructions about deploying PSQL.

Update to SQLAlchemy 1.4

SQLAlchemy 1.4 breaks Ibutsu Server. For now we have pinned SQLAlchemy to 1.3.23 (#158), but we should upgrade in the near future.

The immediate exception that is hit from the upgrade is:

  File "/app/ibutsu_server/controllers/result_controller.py", line 7, in <module>
    from ibutsu_server.util.count import get_count_estimate
  File "/app/ibutsu_server/util/count.py", line 7, in <module>
    from ibutsu_server.db.util import Explain
  File "/app/ibutsu_server/db/util.py", line 5, in <module>
    from sqlalchemy.sql.expression import _literal_as_text
ImportError: cannot import name '_literal_as_text' from 'sqlalchemy.sql.expression' (/usr/local/lib/python3.7/site-packages/sqlalchemy/sql/expression.py)

There may be other failures, however.

cc @rsnyman
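
One possible stop-gap for that particular import is a compatibility shim. This is a hedged sketch only, assuming the Explain helper in db/util.py only ever needs to coerce a plain string or pass an existing clause through; the proper fix is still to port the code to the 1.4 APIs:

try:  # SQLAlchemy 1.3: the private helper is still available
    from sqlalchemy.sql.expression import _literal_as_text
except ImportError:  # SQLAlchemy 1.4+ removed it
    from sqlalchemy.sql.expression import ClauseElement, text

    def _literal_as_text(element):
        # Assumption: Explain only needs to coerce plain strings into SQL text.
        return element if isinstance(element, ClauseElement) else text(element)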

Run large queries in tasks

Large queries to the API can time out and crash. It may be a good idea to look into a method of running such queries in background tasks; a rough sketch of the flow follows the steps below.

Steps:

  1. Client submits a query to the API
  2. API plans the query and determines a large amount of rows will be queried/returned
  3. API returns 201/202 and a query ID
  4. Query is run in a task
  5. Client polls API with query ID
  6. On completion, the API returns the results
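
A minimal sketch of that flow, using in-process threads and an in-memory job store as stand-ins for Celery and the real database; the endpoint paths and response shapes are hypothetical:

import threading
import uuid

from flask import Flask, jsonify, request

app = Flask(__name__)
QUERIES = {}  # query_id -> {"status": ..., "rows": ...}; a real version would live in the DB

def run_query(query_id, filters):
    # Placeholder for the real, potentially long-running database query (step 4).
    QUERIES[query_id]["rows"] = [{"filters": filters, "result": "passed"}]
    QUERIES[query_id]["status"] = "done"

@app.route("/api/result/query", methods=["POST"])            # step 1: client submits a query
def submit_query():
    filters = request.get_json(silent=True) or {}
    query_id = str(uuid.uuid4())
    QUERIES[query_id] = {"status": "running", "rows": None}
    threading.Thread(target=run_query, args=(query_id, filters)).start()
    return jsonify({"id": query_id, "status": "running"}), 202  # step 3: 202 + query ID

@app.route("/api/result/query/<query_id>", methods=["GET"])   # steps 5-6: client polls for results
def poll_query(query_id):
    query = QUERIES.get(query_id)
    if query is None:
        return jsonify({"detail": "Not Found"}), 404
    return jsonify({"id": query_id, **query}), 200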

Automated deploys based on a tag

Currently the Ibutsu deploy system looks like:

  1. Tag a release in ibutsu-server
  2. Build Ibutsu's OpenShift images either via the UI or the oc command
  • once a build completes, the deployment configs we use are set to auto-deploy

Rather than having to go in and manually kickoff the builds of the OpenShift images, it'd be nice if they auto-triggered when there is a new tag in this repo.

To achieve this we can use webhooks:
https://docs.openshift.com/container-platform/4.7/cicd/builds/triggering-builds-build-hooks.html

However, since the OCP cluster where Ibutsu is deployed is behind a VPN, we must use something like smee.io to relay the webhooks past the firewall:
https://www.jenkins.io/blog/2019/01/07/webhook-firewalls/

cc @rsnyman
