paddles

A very simple JSON based API to store and report back on test results from Ceph tests.

Setup

To install and use paddles:

  1. Install the following packages (names provided are based on an Ubuntu install): git python-dev python-virtualenv postgresql postgresql-contrib postgresql-server-dev-all supervisor
  2. Install and configure PostgreSQL on your system.
  3. Create a database. Ours is called 'paddles'.
  4. Clone the repository
  5. Inside the repository, create a virtualenv: virtualenv ./virtualenv
  6. Create a copy of the configuration template: cp config.py.in config.py
  7. Edit config.py to reflect your hostnames, database info, etc.
  8. Activate the virtualenv: source ./virtualenv/bin/activate
  9. Install required python packages: pip install -r requirements.txt
  10. Run python setup.py develop
  11. Populate the database tables: pecan populate config.py
  12. Create a copy of the alembic configuration template: cp alembic.ini.in alembic.ini
  13. Edit alembic.ini to reflect your database information.
  14. Tell alembic that you have the latest database version: alembic stamp head
  15. To start the server for testing purposes, you may use pecan serve config.py - though for production use it's wise to use a real server. We use gunicorn managed by supervisord. Sample config files are provided for gunicorn and supervisord.
  16. To get teuthology talking to paddles add a line like this to your ~/.teuthology.yaml: results_server: http://paddles.example.com/

/runs/

Read

A GET request returns the latest 100 runs as a JSON object keyed by run name, with the status of each run and a summary of the results reported by its jobs.

{
    "teuthology-2013-09-25_23:00:06-rados-master-testing-basic-plana": {
        "href": "http://paddles/runs/teuthology-2013-09-25_23:00:06-rados-master-testing-basic-plana/",
        "status": "running",
        "results": {
            "pass": 10,
            "running": 15,
            "fail": 0
        }
    },
    "teuthology-2013-09-26_01:30:26-upgrade-fs-next-testing-basic-plana": {
        "href": "http://paddles/runs/teuthology-2013-09-26_01:30:26-upgrade-fs-next-testing-basic-plana/",
        "status": "running",
        "results": {
            "pass": 3,
            "running": 10,
            "fail": 1
        }
    },
    "teuthology-2013-09-26_01:30:26-rados-next-testing-basic-plana": {
        "href": "http://paddles/runs/teuthology-2013-09-26_01:30:26-rados-next-testing-basic-plana/",
        "status": "finished",
        "results": {
            "pass": 8,
            "running": 0,
            "fail": 2
        }
    }
}

The example above returns the three result types available for jobs (pass, fail, and running) along with a link to each run. The counts are built from the jobs of every run as the results come in, so they are read-only values. The response also reports the overall status of each run: running or finished.
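As an illustration of consuming this response, the sketch below totals the result counters across runs and lists the runs that are still in progress. It assumes the response body has already been parsed into a dictionary shaped like the sample above; `summarize_runs` and `unfinished_runs` are hypothetical helper names, not part of paddles.

```python
from collections import Counter


def summarize_runs(runs):
    """Total the pass/fail/running counters across all runs.

    `runs` is assumed to be the parsed JSON body of GET /runs/,
    shaped like the sample response above.
    """
    totals = Counter()
    for run in runs.values():
        # Each run carries a "results" object with per-type counts.
        totals.update(run.get("results", {}))
    return dict(totals)


def unfinished_runs(runs):
    """Return the names of runs whose overall status is not "finished"."""
    return sorted(name for name, run in runs.items()
                  if run.get("status") != "finished")


# Hypothetical, shortened run names used only for this example.
sample = {
    "run-a": {"status": "running",
              "results": {"pass": 10, "running": 15, "fail": 0}},
    "run-b": {"status": "finished",
              "results": {"pass": 8, "running": 0, "fail": 2}},
}
print(summarize_runs(sample))
print(unfinished_runs(sample))
```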

Create

This operation creates a new entry for a run. The only requirement is to POST a JSON object that has a name key with the actual name of the run as its value:

{ "name": "teuthology-2013-09-01_23:59:59-rados-master-testing-basic-plana" }

HTTP responses:

  • 200: Success.
  • 400: Invalid request.

/runs/{name}/

Read

To read information for a specific run, send a GET request. On valid requests (for existing runs), a JSON object with all the jobs scheduled for that specific run is returned. Below is an example response to a valid request:

{
    "1500": {
        "href": "http://paddles/runs/teuthology-2013-09-01_23:59:59-rados-master-testing-basic-plana/1500/",
        "status": "running",
        "results": {
            "pass": 8,
            "running": 13,
            "fail": 2
        }
    },
    "1501": {
        "href": "http://paddles/runs/teuthology-2013-09-01_23:59:59-rados-master-testing-basic-plana/1501/",
        "status": "finished",
        "results": {
            "pass": 8,
            "running": 0,
            "fail": 4
        }
    },
    "1502": {
        "href": "http://paddles/runs/teuthology-2013-09-01_23:59:59-rados-master-testing-basic-plana/1502/",
        "status": "finished",
        "results": {
            "pass": 3,
            "running": 0,
            "fail": 17
        }
    }
}

/runs/{name}/jobs/

Read

GET requests return a full list of all the jobs associated with the current run.

If no jobs exist, an empty array is returned. Otherwise, this is what a list with a single job looks like:

[

    {
        "archive_path": null,
        "kernel": null,
        "teuthology_branch": null,
        "tasks": null,
        "verbose": null,
        "description": null,
        "roles": null,
        "overrides": null,
        "pid": null,
        "success": null,
        "name": null,
        "targets": null,
        "owner": null,
        "last_in_suite": null,
        "os_type": null,
        "machine_type": null,
        "nuke_on_error": null,
        "duration": null,
        "flavor": null,
        "email": null,
        "job_id": "1"
    }

]

Create

POST requests with valid metadata for a job can create new jobs. Keys that are not part of the schema will be ignored. Keys that are saved to the database are:

  • name
  • email
  • archive_path
  • description
  • duration
  • flavor
  • job_id
  • kernel
  • last_in_suite
  • machine_type
  • mon.a_kernel_sha1 (note this key gets transformed to underscores)
  • mon.b_kernel_sha1 (note this key gets transformed to underscores)
  • nuke_on_error
  • os_type
  • overrides
  • owner
  • pid
  • roles
  • success
  • targets
  • tasks
  • teuthology_branch
  • verbose
  • branch
  • sha1
  • suite_sha1
  • pcp_grafana_url

When a job is first created for its run, a job_id key is required. It is the only key that must exist in the JSON body; if it is missing, a 400 error is returned.

HTTP responses:

  • 200: Success.
  • 400: Invalid request.
  • 404: The requested run was not found.
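The rules above can be sketched on the client side as follows. This is an illustration of the described behaviour, not the paddles implementation: `prepare_job_payload` is a hypothetical helper, and the assumption that dotted keys become `mon_a_kernel_sha1` and `mon_b_kernel_sha1` is inferred from the note about underscores.

```python
# Keys listed above as saved to the database. The two dotted keys are
# written with underscores here (an assumption about the stored names).
JOB_KEYS = {
    "name", "email", "archive_path", "description", "duration", "flavor",
    "job_id", "kernel", "last_in_suite", "machine_type",
    "mon_a_kernel_sha1", "mon_b_kernel_sha1", "nuke_on_error", "os_type",
    "overrides", "owner", "pid", "roles", "success", "targets", "tasks",
    "teuthology_branch", "verbose", "branch", "sha1", "suite_sha1",
    "pcp_grafana_url",
}


def prepare_job_payload(metadata):
    """Keep only schema keys and require job_id, mirroring the rules above."""
    if "job_id" not in metadata:
        # The server is documented to answer 400 in this case.
        raise ValueError("job_id is required to create a job")
    payload = {}
    for key, value in metadata.items():
        column = key.replace(".", "_")  # e.g. mon.a_kernel_sha1
        if column in JOB_KEYS:
            payload[column] = value
    return payload


print(prepare_job_payload(
    {"job_id": "1", "mon.a_kernel_sha1": "abc123", "not_in_schema": True}
))
```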

Note

Updates to the results of these runs are programmatically calculated from individual jobs.

/runs/{name}/jobs/{job_id}/

Read

On GET requests an object with all metadata saved from the actual job will be returned.

Update

PUT requests can contain any of the keys accepted for metadata. They get updated accordingly, except for job_id, which is the one key that can never be changed.

HTTP responses:

  • 200: Success.
  • 400: Invalid request.
  • 404: The requested run was not found.
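A minimal sketch of an update request, assuming the same placeholder host as in the setup steps. `build_job_update_request` is a hypothetical helper; it drops job_id from the body because that key cannot be changed, and it only builds the request without sending it.

```python
import json
import urllib.request


def build_job_update_request(base_url, run_name, job_id, changes):
    """Build (but do not send) a PUT request that updates one job."""
    # job_id can never be changed, so it is left out of the body.
    body = {key: value for key, value in changes.items() if key != "job_id"}
    url = "{}/runs/{}/jobs/{}/".format(base_url.rstrip("/"), run_name, job_id)
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


update = build_job_update_request(
    "http://paddles.example.com/",
    "teuthology-2013-09-01_23:59:59-rados-master-testing-basic-plana",
    "1500",
    {"success": True, "duration": 120, "job_id": "9999"},
)
print(update.get_method(), update.full_url)
```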

Contributors

akraitman, alfredodeza, amathuria, andrewschoen, dependabot[bot], dmick, jdurgin, kshtsk, ldachary, tchaikov, toabctl, zmc


paddles's Issues

enable querying of `runs`

We need to be able to query paddles to give us the "latest N runs".

I am thinking that /runs/latest/{number} would be a nice API.

Ideally we would want to limit the number of items returned, because a large number can seriously overload the DB with the queries needed to grab the info for the JSON. Maybe no more than 40?

Used as lock_server and result_server, paddles only supports fewer than 5 concurrent jobs

I followed README.rst to set up my paddles server. When 5 teuthology-worker processes were started, the connection of one job was reset. The log prints the following:

2014-11-21T22:50:34.281 INFO:teuthology.run:Found tasks at /root/src/ceph-qa-suite-master/tasks
2014-11-21T22:50:34.282 INFO:teuthology.run_tasks:Running task internal.lock_machines...
2014-11-21T22:50:34.282 INFO:teuthology.task.internal:Locking machines...
2014-11-21T22:51:24.961 DEBUG:teuthology.lock:lock_many request: {'count': 2, 'os_type': 'centos', 'locked_by': '[email protected]', 'description': '/var/www/html/teuthology2/root-2014-11-21_15:59:07-rgw-master-distro-basic-plana/10', 'machine_type': 'plana'}
2014-11-21T22:53:08.189 ERROR:teuthology.run_tasks:Saw exception from tasks.
Traceback (most recent call last):
  File "/root/src/teuthology-master/virtualenv/lib/python2.7/site-packages/teuthology-0.1.0-py2.7.egg/teuthology/run_tasks.py", line 55, in run_tasks
    manager.__enter__()
  File "/root/.pyenv/versions/2.7.8/lib/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/root/src/teuthology-master/virtualenv/lib/python2.7/site-packages/teuthology-0.1.0-py2.7.egg/teuthology/task/internal.py", line 105, in lock_machines
    ctx.archive, os_type, os_version, arch)
  File "/root/src/teuthology-master/virtualenv/lib/python2.7/site-packages/teuthology-0.1.0-py2.7.egg/teuthology/lock.py", line 357, in lock_many
    headers={'content-type': 'application/json'},
  File "/root/src/teuthology-master/virtualenv/lib/python2.7/site-packages/requests/api.py", line 94, in post
    return request('post', url, data=data, json=json, **kwargs)
  File "/root/src/teuthology-master/virtualenv/lib/python2.7/site-packages/requests/api.py", line 49, in request
    return session.request(method=method, url=url, **kwargs)
  File "/root/src/teuthology-master/virtualenv/lib/python2.7/site-packages/requests/sessions.py", line 457, in request
    resp = self.send(prep, **send_kwargs)
  File "/root/src/teuthology-master/virtualenv/lib/python2.7/site-packages/requests/sessions.py", line 569, in send
    r = adapter.send(request, **kwargs)
  File "/root/src/teuthology-master/virtualenv/lib/python2.7/site-packages/requests/adapters.py", line 407, in send
    raise ConnectionError(err, request=request)
ConnectionError: ('Connection aborted.', error(104, 'Connection reset by peer'))

Provide an API grouping runs by suite

It seems likely that this feature is going to rely on parsing the suite name out of the run name - unless we want to modify teuthology to keep track of which suite each job is generated by. I'm not sure we do.

  • Add Run.suite property
  • Add Run.ceph_branch or Run.ceph_sha1 property?
  • Add Run.machine_type property?
  • Add some sort of API for "what is the next most recent run of this suite (probably on the same branch and machine_type)"
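As a rough sketch of parsing the suite out of a run name: the pattern below assumes the layout user-timestamp-suite-branch-kernel-flavor-machine_type, which is inferred from the sample names in this document and is not confirmed by it. It also assumes the last four fields contain no hyphens, so a branch name with a hyphen would be split incorrectly. `parse_run_name` is a hypothetical helper.

```python
import re

# Assumed layout: user-timestamp-suite-branch-kernel-flavor-machine_type.
# The suite may itself contain hyphens (e.g. "upgrade-fs"); the four
# trailing fields are assumed not to.
RUN_NAME_RE = re.compile(
    r"^(?P<user>.+?)-"
    r"(?P<timestamp>\d{4}-\d{2}-\d{2}_\d{2}:\d{2}:\d{2})-"
    r"(?P<suite>.+)-"
    r"(?P<branch>[^-]+)-(?P<kernel>[^-]+)-"
    r"(?P<flavor>[^-]+)-(?P<machine_type>[^-]+)$"
)


def parse_run_name(name):
    """Split a run name into its assumed fields, or return None."""
    match = RUN_NAME_RE.match(name)
    return match.groupdict() if match else None


parsed = parse_run_name(
    "teuthology-2013-09-26_01:30:26-upgrade-fs-next-testing-basic-plana"
)
print(parsed["suite"], parsed["branch"], parsed["machine_type"])
```

Names that do not follow the assumed layout return None, so a caller would need to treat the suite as unknown for those runs.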

Next step is comparing run results, à la:

$ ./compare.py teuthology-2013-10-22_15:11:33-rgw-next-testing-basic-plana teuthology-2013-10-22_15:11:26-rgw-next-testing-basic-plana
[...]
P => P  rgw/multifs/{clusters/fixed-2.yaml fs/ext4.yaml tasks/rgw_s3tests.yaml}
P => P  rgw/multifs/{clusters/fixed-2.yaml fs/ext4.yaml tasks/rgw_swift.yaml}
P => F  rgw/multifs/{clusters/fixed-2.yaml fs/xfs.yaml tasks/rgw_readwrite.yaml}
[...]

... But that can be a separate ticket.

ERROR:teuthology.report:Could not report results to

The master branch appears

pecan log:
(virtualenv)[paddles@teuthology paddles]$ pecan serve config.py
Starting server in PID 23712
serving on http://192.168.0.32:8080
No handlers could be found for logger "sqlalchemy.engine.base.Engine"
2017-09-27 23:22:17,407 INFO [paddles.controllers.jobs] Creating job: yujiang/17
2017-09-27 23:22:27,200 INFO [paddles.controllers.jobs] Job yujiang/17 status changed from queued to running
2017-09-27 23:22:27,264 INFO [paddles.controllers.jobs] Job yujiang/17 status changed from running to waiting
2017-09-27 23:22:27,282 DEBUG [paddles.controllers.nodes] Locking 1 plana nodes for [email protected]
2017-09-27 23:22:27,284 INFO sqlalchemy.engine.base.OptionEngine BEGIN (implicit)
2017-09-27 23:22:27,285 INFO sqlalchemy.engine.base.OptionEngine SELECT nodes.id AS nodes_id, nodes.name AS nodes_name, nodes.description AS nodes_description, nodes.up AS nodes_up, nodes.machine_type AS nodes_machine_type, nodes.arch AS nodes_arch, nodes.is_vm AS nodes_is_vm, nodes.os_type AS nodes_os_type, nodes.os_version AS nodes_os_version, nodes.vm_host_id AS nodes_vm_host_id, nodes.locked AS nodes_locked, nodes.locked_by AS nodes_locked_by, nodes.locked_since AS nodes_locked_since, nodes.mac_address AS nodes_mac_address, nodes.ssh_pub_key AS nodes_ssh_pub_key
FROM nodes
WHERE nodes.machine_type = %(machine_type_1)s AND nodes.up IS true AND nodes.locked IS false
LIMIT %(param_1)s
2017-09-27 23:22:27,285 INFO sqlalchemy.engine.base.OptionEngine {'machine_type_1': u'plana', 'param_1': 1}
2017-09-27 23:22:27,287 INFO sqlalchemy.engine.base.OptionEngine UPDATE nodes SET description=%(description)s, locked=%(locked)s, locked_by=%(locked_by)s, locked_since=%(locked_since)s WHERE nodes.id = %(nodes_id)s
2017-09-27 23:22:27,287 INFO sqlalchemy.engine.base.OptionEngine {'nodes_id': 2, 'locked_by': u'[email protected]', 'locked': True, 'description': u'/home/teuthworker/archive/yujiang/17', 'locked_since': datetime.datetime(2017, 9, 27, 15, 22, 27, 286230)}
2017-09-27 23:22:27,288 INFO sqlalchemy.engine.base.OptionEngine COMMIT
2017-09-27 23:22:27,292 INFO sqlalchemy.engine.base.OptionEngine BEGIN (implicit)
2017-09-27 23:22:27,293 INFO sqlalchemy.engine.base.OptionEngine SELECT nodes.id AS nodes_id, nodes.name AS nodes_name, nodes.description AS nodes_description, nodes.up AS nodes_up, nodes.machine_type AS nodes_machine_type, nodes.arch AS nodes_arch, nodes.is_vm AS nodes_is_vm, nodes.os_type AS nodes_os_type, nodes.os_version AS nodes_os_version, nodes.vm_host_id AS nodes_vm_host_id, nodes.locked AS nodes_locked, nodes.locked_by AS nodes_locked_by, nodes.locked_since AS nodes_locked_since, nodes.mac_address AS nodes_mac_address, nodes.ssh_pub_key AS nodes_ssh_pub_key
FROM nodes
WHERE nodes.id = %(param_1)s
2017-09-27 23:22:27,293 INFO sqlalchemy.engine.base.OptionEngine {'param_1': 2}
2017-09-27 23:22:27,294 INFO [paddles.controllers.nodes] Locked for [email protected] with description /home/teuthworker/archive/yujiang/17
2017-09-27 23:22:27,295 INFO sqlalchemy.engine.base.OptionEngine COMMIT
2017-09-27 23:22:27,337 INFO [paddles.controllers.jobs] Job yujiang/17 status changed from waiting to running
2017-09-27 23:22:27,658 INFO [paddles.controllers.nodes] Updating : {u'name': u'plana004.lenovo.com', u'up': True, u'os_version': u'7.3', u'user': u'ubuntu', u'os_type': u'centos', u'arch': u'x86_64', u'ssh_pub_key': u'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCi9riDqGdUpCOd9pajdmVEWncYtvt5nB9FUECkodBjhGtqEjuc0aQrETptmCNlhcMK0/s6KPTgGe+JiD31wO4GZbW7hdOSDLf7DTRtwQzXplgqcMJH/FWY5NccG5ZLzsDuNaPJE36NNFFmL4/Dj/IfjWfLsAGOErfVQoZl1OgMnYf6SdUnHsH/ObFTlCJ4sJWWsGseBJyVT4OPA1P7CKaSnQsAzPGdgNLUOQmymIuA7+dHFGM3RzUqie10IMypcd4j+Msaa1aqQAHT21oh0NsjW+bDs950t+dCFOXRmKpFQCPeIXB5QlNAqxsrigDycV35bGWtudVOMeDiWY6REWsv'}
2017-09-27 23:22:37,015 DEBUG [paddles.controllers.nodes] Unlocking for [email protected] with description /home/teuthworker/archive/yujiang/17
2017-09-27 23:22:37,019 INFO [paddles.controllers.nodes] Unlocked for [email protected] with description /home/teuthworker/archive/yujiang/17
2017-09-27 23:22:37,113 INFO [paddles.stats] Could not find statsd configuration; disabling statsd. Error message was: 'module' object has no attribute 'Connection'
Traceback (most recent call last):
  File "/usr/lib64/python2.7/wsgiref/handlers.py", line 85, in run
    self.result = application(self.environ, self.start_response)
  File "/home/paddles/github/paddles/virtualenv/lib/python2.7/site-packages/pecan/middleware/recursive.py", line 56, in __call__
    return self.application(environ, start_response)
  File "/home/paddles/github/paddles/virtualenv/lib/python2.7/site-packages/pecan/core.py", line 810, in __call__
    return super(Pecan, self).__call__(environ, start_response)
  File "/home/paddles/github/paddles/virtualenv/lib/python2.7/site-packages/pecan/core.py", line 659, in __call__
    self.invoke_controller(controller, args, kwargs, state)
  File "/home/paddles/github/paddles/virtualenv/lib/python2.7/site-packages/pecan/core.py", line 559, in invoke_controller
    result = controller(*args, **kwargs)
  File "/home/paddles/github/paddles/paddles/controllers/jobs.py", line 53, in index_post
    self.job.update(request.json)
  File "/home/paddles/github/paddles/paddles/models/jobs.py", line 206, in update
    self.set_or_update(json_data)
  File "/home/paddles/github/paddles/paddles/models/jobs.py", line 148, in set_or_update
    counter = get_statsd_client().get_counter('jobs.status')
  File "/home/paddles/github/paddles/paddles/stats.py", line 24, in get_client
    statsd.Connection.set_defaults(
AttributeError: 'module' object has no attribute 'Connection'
2017-09-27 23:24:26,240 INFO [paddles.stats] Could not find statsd configuration; disabling statsd. Error message was: 'module' object has no attribute 'Connection'
Traceback (most recent call last):
  File "/usr/lib64/python2.7/wsgiref/handlers.py", line 85, in run
    self.result = application(self.environ, self.start_response)
  File "/home/paddles/github/paddles/virtualenv/lib/python2.7/site-packages/pecan/middleware/recursive.py", line 56, in __call__
    return self.application(environ, start_response)
  File "/home/paddles/github/paddles/virtualenv/lib/python2.7/site-packages/pecan/core.py", line 810, in __call__
    return super(Pecan, self).__call__(environ, start_response)
  File "/home/paddles/github/paddles/virtualenv/lib/python2.7/site-packages/pecan/core.py", line 659, in __call__
    self.invoke_controller(controller, args, kwargs, state)
  File "/home/paddles/github/paddles/virtualenv/lib/python2.7/site-packages/pecan/core.py", line 559, in invoke_controller
    result = controller(*args, **kwargs)
  File "/home/paddles/github/paddles/paddles/controllers/jobs.py", line 53, in index_post
    self.job.update(request.json)
  File "/home/paddles/github/paddles/paddles/models/jobs.py", line 206, in update
    self.set_or_update(json_data)
  File "/home/paddles/github/paddles/paddles/models/jobs.py", line 148, in set_or_update
    counter = get_statsd_client().get_counter('jobs.status')
  File "/home/paddles/github/paddles/paddles/stats.py", line 24, in get_client
    statsd.Connection.set_defaults(
AttributeError: 'module' object has no attribute 'Connection'

teuthology log:
2017-09-27T23:22:37.054 DEBUG:teuthology.report:Pushing job info to http://10.100.46.208:8080
2017-09-27T23:22:37.124 ERROR:teuthology.report:Could not report results to http://10.100.46.208:8080
Traceback (most recent call last):
  File "/home/teuthworker/src/teuthology_master/teuthology/report.py", line 466, in try_push_job_info
    push_job_info(run_name, job_id, job_info)
  File "/home/teuthworker/src/teuthology_master/teuthology/report.py", line 431, in push_job_info
    reporter.report_job(run_name, job_id, job_info)
  File "/home/teuthworker/src/teuthology_master/teuthology/report.py", line 320, in report_job
    response.raise_for_status()
  File "/home/teuthworker/src/teuthology_master/virtualenv/lib/python2.7/site-packages/requests/models.py", line 935, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
HTTPError: 500 Server Error: Internal Server Error for url: http://10.100.46.208:8080/runs/yujiang/jobs/17/
2017-09-27T23:22:37.125 INFO:teuthology.run:pass

Provide more information about run/job timing

99% of the runs we're looking at have timestamps in the name. It would be great to use that instead of, or in addition to, the time it was posted to paddles. I'm now thinking we should track times for:

Jobs

  • Job.posted: just like Job.timestamp is now - recorded when the job is created
  • Job.updated: gets set each time the job is updated

Runs

  • Run.scheduled: parsed from Run.name; use Run.posted as fallback
  • Run.posted: just like Run.timestamp is now - recorded when the run is created
  • Run.updated: returns the latest Job.updated
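A minimal sketch of how the proposed Run.scheduled and Run.updated values could be derived. The timestamp format is taken from the run names shown in this document; `scheduled_time` and `run_updated` are hypothetical functions for illustration and do not describe how paddles implements this.

```python
import re
from datetime import datetime

# Timestamp as it appears in names such as
# teuthology-2013-09-25_23:00:06-rados-master-testing-basic-plana
TIMESTAMP_RE = re.compile(r"\d{4}-\d{2}-\d{2}_\d{2}:\d{2}:\d{2}")


def scheduled_time(run_name, posted):
    """Parse the scheduled time from the run name, falling back to posted."""
    match = TIMESTAMP_RE.search(run_name)
    if match is None:
        return posted
    try:
        return datetime.strptime(match.group(0), "%Y-%m-%d_%H:%M:%S")
    except ValueError:
        # Digits matched the pattern but do not form a real date or time.
        return posted


def run_updated(job_updates):
    """Return the latest job update time, or None if there are no jobs."""
    return max(job_updates, default=None)


posted = datetime(2017, 9, 27, 23, 22, 17)
print(scheduled_time(
    "teuthology-2013-09-25_23:00:06-rados-master-testing-basic-plana", posted))
print(scheduled_time("yujiang", posted))
```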

Thoughts?

Bug: "unknown" status of runs and jobs

Some jobs and runs seem to have an "unknown" status. These runs/jobs are actually still queued and haven't started yet.

Example:
image
(Log links lead to 404 not found)
