ibutsu / ibutsu-server
Ibutsu is a test result aggregator
Home Page: https://ibutsu-project.org
License: MIT License
Figure out how to get the API controllers of a plugin visible to the OpenAPI spec system in Connexion.
Several teams in Insights have expressed interest in a matrix and/or table widget that displays specific extra metadata about the run.
An example table could be something like:
In the screenshot, the job could correspond to a specific run
or groups of runs
. The result
could be a pass percentage. And the other columns could be some user-specified metadata (like RHEL_VERSION, EGG_VERSION).
This could be accomplished in two ways:
As there are potentially several columns that would be useful to display here, I am leaning towards option 2). However, option 2) forgoes using dashboards, which means this new table would be visible to everybody.
Regardless of whether we decide to add a widget or view, we'd need to do the following:
a) Add a new widget BE query file to this directory https://github.com/ibutsu/ibutsu-server/tree/master/backend/ibutsu_server/widgets
b) Add the new widget to constants.py
c) Write the frontend for the new widget by adding a new component to this directory: https://github.com/ibutsu/ibutsu-server/tree/master/frontend/src/widgets
cc @eduardocerqueira @prinewgirl
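To make the intent of the table concrete, here is a minimal sketch of step (a), the widget backend query. Everything here is illustrative: the function name, the response shape, and the metadata keys are assumptions, not the actual Ibutsu widget API; a real implementation would query the Run model instead of taking a list of run dicts.

```python
def build_metadata_table(runs, columns=None):
    """Illustrative response shape for the proposed metadata table widget.

    runs: run dicts as returned by the API
    columns: user-specified metadata keys, e.g. ["RHEL_VERSION", "EGG_VERSION"]
    """
    columns = columns or []
    rows = []
    for run in runs:
        summary = run.get("summary") or {}
        # summary values come back as strings in the API payloads
        total = int(summary.get("tests") or 0)
        bad = int(summary.get("failures") or 0) + int(summary.get("errors") or 0)
        pass_percent = round(100.0 * (total - bad) / total, 2) if total else 0.0
        row = {"run_id": run["id"], "pass_percent": pass_percent}
        for key in columns:
            row[key] = (run.get("metadata") or {}).get(key)
        rows.append(row)
    return {"columns": ["run_id", "pass_percent"] + columns, "rows": rows}
```

The frontend component from step (c) would then just render `columns` as the header row and `rows` as the body.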
Refactor the core of Ibutsu server into a plugin
We've been experiencing slowdown with our pagination (particularly with our ever-growing results collection), and it can be explained by the use of two functions: mongo.results.count, and skip in mongo.results.find.
The cursor.skip() method requires the server to scan from the beginning of the input results set before beginning to return results. As the offset increases, cursor.skip() will become slower.
We have been talking about moving to PSQL, and while I agree that we will experience some performance boost, I think we will hit the same issues IF we leave the implementation of pagination as is. Based on the PSQL docs, we'd be implementing pagination with https://www.postgresql.org/docs/current/queries-limit.html, cf. in particular:
The rows skipped by an OFFSET clause still have to be computed inside the server; therefore a large OFFSET might be inefficient.
(ii) is my preferred option, because I think it is rare when someone actually wants to navigate to arbitrary pages of results.
In the implementation of (ii) we could introduce some (low) timeout for the mongo.results.count query; if it hits the timeout, we limit the documents to some MAX_DOCUMENTS. In this way we circumvent the inherent slowness of count, and skip will never have to skip more than MAX_DOCUMENTS. For loading the next batch of documents we could implement something akin to (i).
Some information is found in http://www.ovaistariq.net/404/mysql-paginated-displays-how-to-kill-performance-vs-how-to-improve-performance/
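The "next batch" idea can be sketched as keyset pagination (an assumed approach; the table and column names are illustrative). Instead of an OFFSET, the client passes the sort key of the last row it saw, so the server never has to scan and discard skipped rows:

```python
# Illustrative keyset-pagination query builder; "results" and
# "start_time" are assumed names, and the SQL is for PostgreSQL.
def next_page_query(last_start_time=None, page_size=25):
    """Build (sql, params) for the next page, with no OFFSET clause."""
    sql = "SELECT * FROM results"
    params = []
    if last_start_time is not None:
        # resume strictly after the last row the client already has
        sql += " WHERE start_time < %s"
        params.append(last_start_time)
    sql += " ORDER BY start_time DESC LIMIT %s"
    params.append(page_size)
    return sql, params
```

This keeps each page query bounded by the index on the sort column regardless of how deep the client has paged, at the cost of only supporting next/previous navigation rather than arbitrary page jumps, which matches the assumption behind (ii).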
Build the plugin object with support for entry points/hooks.
Entry points:
Thanks @mshriver for reporting this.
Some teams have a need for filtering on not only the name of a marker but also the argument. E.g.
"markers": [
{
"args": [],
"kwargs": {},
"name": "tier2"
},
{
"args": [
"Critical"
],
"kwargs": {},
"name": "importance"
},
{
"args": [
"ContentViews"
],
"kwargs": {},
"name": "component"
},
{
"args": [
"High"
],
"kwargs": {},
"name": "importance"
}
]
It would be nice if Ibutsu could handle these more complicated markers.
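A small helper shows what matching on both the marker name and its argument could look like. This is a sketch only: the filter syntax and function name are assumptions, not Ibutsu's actual filter implementation.

```python
# Illustrative matcher over the marker structure shown above:
# a list of {"name": ..., "args": [...], "kwargs": {...}} dicts.
def matches_marker(markers, name, arg=None):
    """True if any marker has the given name and (optionally) argument."""
    for marker in markers:
        if marker.get("name") != name:
            continue
        # no arg requested: the name alone is enough
        if arg is None or arg in marker.get("args", []):
            return True
    return False
```

With the example data above, a filter like importance=Critical would call `matches_marker(markers, "importance", "Critical")`.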
When running get_import, it never populates the run_id:
{'filename': 'results.xml',
'format': '',
'id': 'aaf35f01-5711-4699-a40f-c8f0499ac776',
'run_id': None,
'status': 'done'}
But trying to get direct from the API the following is returned:
{
"filename": "results.xml",
"format": "",
"id": "aaf35f01-5711-4699-a40f-c8f0499ac776",
"metadata": {
"run_id": [
"66d5ff81-5e75-4df1-a497-41a094285ea3"
]
},
"status": "done"
}
It would be great if the run_id was populated accordingly when get_import is used.
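Based on the two payloads above, the fix looks like lifting run_id out of metadata when serializing the import record. The sketch below assumes that is where the gap is; the function name and exact record shape are illustrative.

```python
# Illustrative serializer: metadata.run_id is a list in the API payload,
# so promote its first entry to the top-level run_id field.
def serialize_import(record):
    run_ids = (record.get("metadata") or {}).get("run_id") or []
    return {
        "filename": record.get("filename"),
        "format": record.get("format"),
        "id": record.get("id"),
        "run_id": run_ids[0] if run_ids else None,
        "status": record.get("status"),
    }
```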
We should probably just use junitparser instead of our internal homegrown solution for parsing j/xUnit XML. This may help with issues like #92
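For reference, a minimal junitparser read loop might look like the following (hedged: attribute details differ between junitparser 1.x and 2.x, and fromfile may return a single TestSuite when the file's root element is a bare &lt;testsuite&gt;, as legacy pytest writes):

```python
from junitparser import JUnitXml, TestSuite

xml = JUnitXml.fromfile("results.xml")
# Normalize: legacy files with a <testsuite> root parse to a TestSuite,
# <testsuites> roots parse to a JUnitXml that iterates over suites.
suites = [xml] if isinstance(xml, TestSuite) else list(xml)
for suite in suites:
    for case in suite:
        # case.result is empty for passes; Failure/Error/Skipped otherwise
        print(case.classname, case.name, case.time)
```

Handling both root shapes in one place is exactly the kind of edge case (cf. #92) the homegrown parser keeps tripping over.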
When a user attempts to load an object that doesn't exist, the Ibutsu frontend has a "white page of death". We should handle these missing objects better with an "[object] not found" message.
For example:
https://server/run/this-run-does-not-exist
We could use the EmptyState component to show the error message: https://www.patternfly.org/v4/components/empty-state
Create a new repository for the dashboard API and create a new plugin for it.
Implement the database upgrades hook.
When the API gets too many requests at once (particularly when they are async) I've noticed that backend pods can give up and exit:
- - [09/Mar/2021:13:41:31 +0000] "GET /api/result?filter=component%3Dhost_inventory&filter=env%3Dstage&filter=start_time%3E2021-02-07&page=20&pageSize=30 HTTP/1.1" 200 89298 "-" "Python/3.8 aiohttp/3.7.4.post0"
[2021-03-09 13:41:31 +0000] [1] [INFO] Handling signal: term
- - [09/Mar/2021:13:41:31 +0000] "GET /api/result?filter=component%3Ddrift&filter=env%3Dstage&filter=start_time%3E2021-02-07&page=16&pageSize=30 HTTP/1.1" 200 91776 "-" "Python/3.8 aiohttp/3.7.4.post0"
[2021-03-09 13:41:31 +0000] [37] [INFO] Worker exiting (pid: 37)
- - [09/Mar/2021:13:41:32 +0000] "GET /api/result?filter=component%3Dcompliance&filter=env%3Dstage&filter=start_time%3E2021-02-07&page=19&pageSize=30 HTTP/1.1" 200 86346 "-" "Python/3.8 aiohttp/3.7.4.post0"
[2021-03-09 13:41:32 +0000] [38] [INFO] Worker exiting (pid: 38)
- - [09/Mar/2021:13:41:32 +0000] "GET /api/result?filter=component%3Dpatchman&filter=env%3Dstage&filter=start_time%3E2021-02-07&page=18&pageSize=30 HTTP/1.1" 200 83171 "-" "Python/3.8 aiohttp/3.7.4.post0"
[2021-03-09 13:41:32 +0000] [39] [INFO] Worker exiting (pid: 39)
[2021-03-09 13:41:33 +0000] [1] [INFO] Shutting down: Master
We should have some better load balancing in place to deal with this. Ping me if you need code to reproduce the issue.
Create a GitHub Action to build and push Docker images to quay.io on release (https://github.com/marketplace/actions/push-to-registry).
Edit: we can just use build triggers in Quay rather than using a GitHub Action here.
Currently, the xfail/xpass test state is accepted as a valid result in runs/results.
However, these are not considered in widget calculations. We should include these as viable states in all widgets.
I have a task to calculate a pass rate for a test_id for a given period of time. It would be nice to have this endpoint in the Ibutsu API.
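The computation behind such an endpoint is small; here is a sketch (the endpoint path, parameters, and which statuses count as passing are all assumptions to be decided):

```python
# Illustrative pass-rate calculation over the results for one test_id
# in a time window; "result" values follow the usual pytest statuses.
def pass_rate(results, passing=("passed",)):
    """Return the pass percentage (0-100) over a list of result dicts."""
    total = len(results)
    if not total:
        return 0.0
    passed = sum(1 for r in results if r.get("result") in passing)
    return round(100.0 * passed / total, 2)
```

Whether xpassed/xfailed should count toward the numerator ties into the xfail/xpass widget issue above and would need to be configurable.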
I noticed that some counts in our production instance of Ibutsu are wildly inaccurate (e.g. filter on the Satellite QE project and look at results).
We should provide a small switch or checkbox so users can optionally disable the estimate.
To make it easier to add metadata to a run created via the Import API, add a metadata field which can be applied to the run and the results.
- metadata field in Import API spec
- metadata field in Import controller
Sometimes it might be useful to know all test ids for a component and a given period of time.
On prod Ibutsu in both the DB pod and BE pod, I am seeing:
[2020-10-27 16:59:31,129] ERROR in app: Exception on /api/artifact [POST]
Traceback (most recent call last):
File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1277, in _execute_context
cursor, statement, parameters, context
File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line 593, in do_execute
cursor.execute(statement, parameters)
psycopg2.errors.ForeignKeyViolation: insert or update on table "artifacts" violates foreign key constraint "artifacts_result_id_fkey"
DETAIL: Key (result_id)=(8715bbd4-de7b-4881-864d-3bf164c35a18) is not present in table "results".
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/opt/app-root/lib/python3.6/site-packages/flask/app.py", line 2447, in wsgi_app
response = self.full_dispatch_request()
File "/opt/app-root/lib/python3.6/site-packages/flask/app.py", line 1952, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/opt/app-root/lib/python3.6/site-packages/flask_cors/extension.py", line 165, in wrapped_function
return cors_after_request(app.make_response(f(*args, **kwargs)))
File "/opt/app-root/lib/python3.6/site-packages/flask/app.py", line 1821, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/opt/app-root/lib/python3.6/site-packages/flask/_compat.py", line 39, in reraise
raise value
File "/opt/app-root/lib/python3.6/site-packages/flask/app.py", line 1950, in full_dispatch_request
rv = self.dispatch_request()
File "/opt/app-root/lib/python3.6/site-packages/flask/app.py", line 1936, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/opt/app-root/lib/python3.6/site-packages/connexion/decorators/decorator.py", line 48, in wrapper
response = function(request)
File "/opt/app-root/lib/python3.6/site-packages/connexion/decorators/uri_parsing.py", line 144, in wrapper
response = function(request)
File "/opt/app-root/lib/python3.6/site-packages/connexion/decorators/validation.py", line 184, in wrapper
response = function(request)
File "/opt/app-root/lib/python3.6/site-packages/connexion/decorators/parameter.py", line 121, in wrapper
return function(**kwargs)
File "/opt/app-root/src/ibutsu_server/controllers/artifact_controller.py", line 127, in upload_artifact
session.commit()
File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/orm/scoping.py", line 163, in do
return getattr(self.registry(), name)(*args, **kwargs)
File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 1042, in commit
self.transaction.commit()
File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 504, in commit
self._prepare_impl()
File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 483, in _prepare_impl
self.session.flush()
File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 2536, in flush
self._flush(objects)
File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 2678, in _flush
transaction.rollback(_capture_exception=True)
File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/util/langhelpers.py", line 70, in __exit__
with_traceback=exc_tb,
File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
raise exception
File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 2638, in _flush
flush_context.execute()
File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/orm/unitofwork.py", line 422, in execute
rec.execute(self)
File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/orm/unitofwork.py", line 589, in execute
uow,
File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/orm/persistence.py", line 245, in save_obj
insert,
File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/orm/persistence.py", line 1136, in _emit_insert_statements
statement, params
File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1011, in execute
return meth(self, multiparams, params)
File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/sql/elements.py", line 298, in _execute_on_connection
return connection._execute_clauseelement(self, multiparams, params)
File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1130, in _execute_clauseelement
distilled_params,
File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1317, in _execute_context
e, statement, parameters, cursor, context
File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1511, in _handle_dbapi_exception
sqlalchemy_exception, with_traceback=exc_info[2], from_=e
File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
raise exception
File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1277, in _execute_context
cursor, statement, parameters, context
File "/opt/app-root/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line 593, in do_execute
cursor.execute(statement, parameters)
sqlalchemy.exc.IntegrityError: (psycopg2.errors.ForeignKeyViolation) insert or update on table "artifacts" violates foreign key constraint "artifacts_result_id_fkey"
DETAIL: Key (result_id)=(8715bbd4-de7b-4881-864d-3bf164c35a18) is not present in table "results".
This happens fairly regularly. I think in some instances, the plugin may be trying to upload an artifact before the result has been uploaded.
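One possible mitigation (an assumption, not the current implementation): have the artifact controller check that the result row exists before committing, and return a retryable status instead of letting the foreign-key violation bubble up. The sketch uses injected callables in place of the real session operations:

```python
# Illustrative guard for upload_artifact; result_exists and save_artifact
# stand in for the real SQLAlchemy session lookup and commit.
def upload_artifact(result_id, result_exists, save_artifact):
    if not result_exists(result_id):
        # 409 tells the uploading plugin to retry once the result
        # has actually been POSTed
        return {"status": 409,
                "detail": "result %s not found yet" % result_id}
    save_artifact(result_id)
    return {"status": 201}
```

The uploading client would then retry 409 responses with a short backoff, rather than racing the result POST.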
Considering the following run GET response (one provided after a JUnit import):
{
"component": null,
"created": "2020-12-08T14:09:33.739307Z",
"data": null,
"duration": 0.067,
"env": null,
"id": "c3d3ea87-ca0b-4255-8d14-510e6d2cb0b1",
"project_id": null,
"source": null,
"start_time": "2020-11-17T14:51:48.114645Z",
"summary": {
"errors": "1",
"failures": "1",
"skips": "1",
"tests": "4",
"xfailures": null,
"xpasses": null
}
}
Then when trying to use it on a PUT the following 400 response body is returned:
{
"detail": "None is not of type 'string' - 'source'",
"status": 400,
"title": "Bad Request",
"type": "about:blank"
}
So it seems that source is expected to be an empty string "" instead of null, but the GET response for a run is not providing a valid value. If I just drop the source field, it works as expected.
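Until the schema accepts null (or GET stops emitting it), a client-side workaround is to strip null-valued fields from the GET payload before sending it back in a PUT. This is a sketch of that workaround, not a fix for the spec mismatch itself:

```python
# Drop null fields so the PUT body only contains values the OpenAPI
# schema will validate (e.g. omits "source": null).
def prepare_for_put(run):
    return {k: v for k, v in run.items() if v is not None}
```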
The count estimate -- especially when filtering on a small subset of results (e.g. a run) -- is sometimes extremely inaccurate.
We should find a way to disable the count estimate when it is known that we are filtering on a small subset of results which have indexes.
On prod ibutsu, select the "Satellite QE" project and try to run a report:
{
"detail": "The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.",
"status": 500,
"title": "Internal Server Error",
"type": "about:blank"
}
Provide a way for Ibutsu clients to find out which plugins are installed.
Build the database models hook implementation
Steps to reproduce:
Error fetching result data: TypeError: Cannot read property 'split' of null
at utilities.js:349
at run.js:156
at Array.forEach (<anonymous>)
at Ba.buildTree (run.js:155)
at Ba.<anonymous> (run.js:369)
at fo (react-dom.production.min.js:131)
at ol (react-dom.production.min.js:212)
at ps (react-dom.production.min.js:255)
at t.unstable_runWithPriority (scheduler.production.min.js:19)
at Va (react-dom.production.min.js:122)
Using an existing framework is probably the best way to go about implementing this. pytest uses the pluggy framework, and personally I think we should use that.
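For reference, the pluggy pattern looks like this. The hook names (ibutsu_*) are purely illustrative; the actual hook spec would be designed as part of this work.

```python
import pluggy

# Markers scoped to a hypothetical "ibutsu" plugin namespace
hookspec = pluggy.HookspecMarker("ibutsu")
hookimpl = pluggy.HookimplMarker("ibutsu")

class IbutsuHookSpec:
    @hookspec
    def ibutsu_get_models(self):
        """Return extra database models a plugin wants registered."""

class ExamplePlugin:
    @hookimpl
    def ibutsu_get_models(self):
        return ["DashboardModel"]

pm = pluggy.PluginManager("ibutsu")
pm.add_hookspecs(IbutsuHookSpec)
pm.register(ExamplePlugin())
# Calling the hook collects one return value per registered plugin
print(pm.hook.ibutsu_get_models())
```

This is the same mechanism pytest itself uses, so plugin authors coming from pytest would find it familiar.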
When a pytest session is interrupted, e.g. by an outage or by the pod being killed, the run is never updated. This means that the counts, metadata, etc. will never be added to the run.
We should have a periodic task that goes through and checks for these runs - updating them with the partial results.
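The staleness check at the heart of that periodic task could look like the following. Scheduling it via Celery beat is assumed, and the field names (last_result_time) and the two-hour window are illustrative choices, not existing Ibutsu fields:

```python
from datetime import datetime, timedelta

# Illustrative window: a run with no activity for this long and no
# finalized summary is treated as abandoned.
STALE_AFTER = timedelta(hours=2)

def is_abandoned(run, now=None):
    """True if the run was never finalized and has gone quiet."""
    now = now or datetime.utcnow()
    last_seen = run.get("last_result_time") or run.get("created")
    return run.get("summary") is None and (now - last_seen) > STALE_AFTER
```

The periodic task would then recompute the summary from whatever partial results exist for each abandoned run.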
When an XML file has no testcase children in a testsuite, the following exception occurs:
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/celery/app/trace.py", line 385, in trace_task
R = retval = fun(*args, **kwargs)
File "/app/ibutsu_server/tasks/__init__.py", line 30, in __call__
return super().__call__(*args, **kwargs)
File "/usr/local/lib/python3.7/site-packages/celery/app/trace.py", line 648, in __protected_call__
return self.run(*args, **kwargs)
File "/app/ibutsu_server/tasks/importers.py", line 95, in run_junit_import
for testcase in testsuite.testcase:
File "src/lxml/objectify.pyx", line 231, in lxml.objectify.ObjectifiedElement.__getattr__
File "src/lxml/objectify.pyx", line 450, in lxml.objectify._lookupChildOrRaise
AttributeError: no such child: testcase
Make a new repository for the frontend.
If an import fails, there usually isn't a way to see that it has failed, nor is there a way to tell what the error was.
We need to catch errors and report them to the user.
We often run parallel and sequential sets of tests in a single Jenkins job. The problem is that the runs get different run ids, and the heatmap in the dashboard shows results for just one of the runs.
Sometimes when I click a direct link to a Run or Result details page in Ibutsu, I must refresh the page in order for it to load properly.
In the console I can see this error:
Uncaught (in promise) TypeError: Cannot read property 'result' of null
at Ot.render (result.js:165)
at Ri (react-dom.production.min.js:182)
at Ii (react-dom.production.min.js:181)
at gl (react-dom.production.min.js:263)
at cs (react-dom.production.min.js:246)
at ls (react-dom.production.min.js:246)
at Ql (react-dom.production.min.js:239)
at react-dom.production.min.js:123
at t.unstable_runWithPriority (scheduler.production.min.js:19)
at Va (react-dom.production.min.js:122)
We need to fix the loading state of these pages; it only seems to happen with the details pages.
Recently importing this Junit XML file https://github.com/SatelliteQE/betelgeuse/blob/master/sample_project/results/sample-junit-result.xml failed with the traceback:
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/celery/app/trace.py", line 385, in trace_task
R = retval = fun(*args, **kwargs)
File "/app/ibutsu_server/tasks/__init__.py", line 30, in __call__
return super().__call__(*args, **kwargs)
File "/usr/local/lib/python3.7/site-packages/celery/app/trace.py", line 648, in __protected_call__
return self.run(*args, **kwargs)
File "/app/ibutsu_server/tasks/importers.py", line 76, in run_junit_import
for testsuite in tree.testsuite:
File "src/lxml/objectify.pyx", line 231, in lxml.objectify.ObjectifiedElement.__getattr__
File "src/lxml/objectify.pyx", line 450, in lxml.objectify._lookupChildOrRaise
AttributeError: no such child: testsuite
This appears to be due to the legacy pytest JUnit XML format: https://docs.pytest.org/en/stable/reference.html?highlight=junit#confval-junit_family.
Ideally we should support both. At the very least we should specify which XML version we support.
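Supporting both families mostly means handling both root elements: legacy pytest (junit_family=legacy/xunit1) writes a bare &lt;testsuite&gt; root, while xunit2 wraps suites in &lt;testsuites&gt;. A sketch of the normalization (an assumed fix, shown with the stdlib parser rather than the importer's lxml.objectify):

```python
import xml.etree.ElementTree as ET

def iter_testsuites(xml_text):
    """Return the testsuite elements regardless of junit_family."""
    root = ET.fromstring(xml_text)
    if root.tag == "testsuite":
        # legacy single-suite file: the root *is* the suite
        return [root]
    # xunit2: suites are nested under a <testsuites> root
    return list(root.iter("testsuite"))
```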
Implement authentication in the API. See https://swagger.io/docs/specification/authentication/ for more details.
pg_dump: dumping contents of table "public.dashboards"
pg_dump: dumping contents of table "public.groups"
pg_dump: dumping contents of table "public.import_files"
pg_dump: dumping contents of table "public.imports"
pg_dump: dumping contents of table "public.meta"
pg_dump: dumping contents of table "public.projects"
pg_dump: dumping contents of table "public.report_files"
pg_dump: dumping contents of table "public.reports"
pg_dump: dumping contents of table "public.results"
pg_dump: WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
pg_dump: error: Dumping the contents of table "results" failed: PQgetCopyData() failed.
pg_dump: error: Error message from server: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
pg_dump: error: The command was: COPY public.results (id, component, data, duration, env, params, project_id, result, run_id, source, start_time, test_id) TO stdout;
Backup failed
With a project already selected, a TypeError occurs and the page goes blank when the project dropdown is clicked.
A refresh reloads the page without error.
I can remove the selected project with the X in the dropdown, and then the dropdown behaves as expected, listing the available projects.
Implement the hook to add a plugin's API to the OpenAPI spec
Steps to reproduce:
project_id is null:
{
"component": null,
"created": "2020-12-08T14:09:33.739307Z",
"data": null,
"duration": 0.067,
"env": null,
"id": "c3d3ea87-ca0b-4255-8d14-510e6d2cb0b1",
"project_id": null,
"source": null,
"start_time": "2020-11-17T14:51:48.114645Z",
"summary": {
"errors": "1",
"failures": "1",
"skips": "1",
"tests": "4",
"xfailures": null,
"xpasses": null
}
}
Another suggestion would be to let the user dismiss the "run created" notification so they have time to click the run link it provides, or to increase its timeout.
Our README and hosted documentation still contain references to mongo (and no references to postgresql).
We should update the README and getting started guide with instructions about deploying PSQL.
SqlAlchemy 1.4 breaks Ibutsu Server. For now we have pinned SqlAlchemy to 1.3.23 (#158), but we should upgrade in the near future.
The immediate exception that is hit from the upgrade is:
File "/app/ibutsu_server/controllers/result_controller.py", line 7, in <module>
from ibutsu_server.util.count import get_count_estimate
File "/app/ibutsu_server/util/count.py", line 7, in <module>
from ibutsu_server.db.util import Explain
File "/app/ibutsu_server/db/util.py", line 5, in <module>
from sqlalchemy.sql.expression import _literal_as_text
ImportError: cannot import name '_literal_as_text' from 'sqlalchemy.sql.expression' (/usr/local/lib/python3.7/site-packages/sqlalchemy/sql/expression.py)
There may be other failures however.
cc @rsnyman
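A possible direction for that import error (hedged: the right fix depends on exactly how db/util.py uses the removed private helper): _literal_as_text coerced strings into SQL clauses, and for that use the public text() construct, available in both 1.3 and 1.4, covers the same need.

```python
from sqlalchemy import text
from sqlalchemy.sql.elements import ClauseElement

# Illustrative replacement for the removed private helper: coerce a
# plain string into a SQL clause, pass ClauseElements through unchanged.
def as_clause(statement):
    if isinstance(statement, ClauseElement):
        return statement
    return text(str(statement))
```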
Large results from queries to the API can time out and crash. It may be a good idea to look into a method of running queries in tasks.
Steps:
Convert our unit tests to run as part of GitHub Actions rather than using Travis.
Create a new repo and a new plugin for the reports.
Currently the Ibutsu deploy system looks like:
- ibutsu-server
- oc command
Rather than having to go in and manually kick off the builds of the OpenShift images, it'd be nice if they auto-triggered when there is a new tag in this repo.
To achieve this we can use webhooks:
https://docs.openshift.com/container-platform/4.7/cicd/builds/triggering-builds-build-hooks.html
However, since the OCP cluster where Ibutsu is deployed is behind a VPN, we must use something to bypass the VPN, e.g. smee.io:
https://www.jenkins.io/blog/2019/01/07/webhook-firewalls/
cc @rsnyman