galaxyproject / gxadmin
Handy command line utility for Galaxy administrators :rocket:
Home Page: https://galaxyproject.github.io/gxadmin/#/
License: GNU General Public License v3.0
Now that Galaxy et al. have moved to gunicorn.
Do show histories that are already deleted.
In Galaxy Australia we use Slurm as the DRM for all of our nodes. We have a local query which shows the number of cores and the memory requested for each job, as well as the handler id, runner id and destination id.
It would be nice to be able to add this into gxadmin with a flag for different DRMs such as torque or HTCondor.
Here is the query we currently use:
local_query-queue() { ##? [--all] [--seconds] [--since-update]: Detailed overview of running and queued jobs with cores/mem info
# this is a copy of gxadmin query queue-detail with job destination info (cores/mem/partition) added and runner_id, count removed
handle_help "$@" <<-EOF
$ gxadmin local query-queue
state | id | extid | tool_id | username | time_since_creation | handler | cores | mem | partition | destination_id
---------+------+-------+-----------------------------------------------------------+----------+---------------------+---------------------+-------+------+-----------+-----------------
running | 4385 | 4011 | upload1 | cat | 00:01:01.518932 | main.job-handlers.2 | 2 | 6144 | | slurm
queued | 4387 | 4012 | toolshed.g2.bx.psu.edu/repos/devteam/bwa/bwa_mem/0.7.17.2 | cat | 00:00:24.377336 | main.job-handlers.2 | 1 | 3072 | | slurm
queued | 4388 | 4388 | toolshed.g2.bx.psu.edu/repos/devteam/bwa/bwa_mem/0.7.17.2 | cat | 00:00:13.254505 | main.job-handlers.1 | 1 | 3072 | | pulsar-nci-test
queued | 4389 | 4013 | toolshed.g2.bx.psu.edu/repos/devteam/bwa/bwa_mem/0.7.17.2 | cat | 00:00:01.834048 | main.job-handlers.2 | 1 | 3072 | | slurm
EOF
fields="cores=7;mem=8"
tags="state=0;id=1;extid=2;tool_id=3;username=4;handler=6;partition=9;destination_id=10"
d=""
nonpretty="("
time_column="job.create_time"
time_column_name="time_since_creation"
if [[ -n "$arg_all" ]]; then
d=", 'new'"
fi
if [[ -n "$arg_seconds" ]]; then
fields="$fields;time_since_creation=5"
nonpretty="EXTRACT(EPOCH FROM "
fi
if [[ -n "$arg_since_update" ]]; then
time_column="job.update_time"
time_column_name="time_since_update"
fi
username=$(gdpr_safe galaxy_user.username username "Anonymous User")
read -r -d '' QUERY <<-EOF
SELECT
job.state,
job.id,
job.job_runner_external_id as extid,
job.tool_id,
$username,
$nonpretty now() AT TIME ZONE 'UTC' - $time_column) as $time_column_name,
job.handler,
(REGEXP_MATCHES(encode(job.destination_params, 'escape'), 'ntasks=(\d+)'))[1] as cores,
(REGEXP_MATCHES(encode(job.destination_params, 'escape'), 'mem=(\d+)'))[1] as mem,
(REGEXP_MATCHES(encode(job.destination_params, 'escape'), 'partition=(\w+)'))[1] as partition,
COALESCE(job.destination_id, 'none') as destination_id
FROM job
FULL OUTER JOIN galaxy_user ON job.user_id = galaxy_user.id
WHERE
state in ('running', 'queued'$d)
ORDER BY
state desc,
$time_column_name desc
EOF
}
It would be nice to get information about the history, the dataset, and more importantly the job_id. A command that returns the job_working_directory would be pretty cool for debugging.
Alternatively, this could be added to gxadmin report job-info.
usegalaxy-eu.github.io/gxadmin/#/
I got a syntax error running a query tool
galaxy@sn04:~$ /usr/bin/gxadmin query latest-users
ERROR: syntax error at or near "username"
LINE 5: COALESCE(username::text, '__UNKNOWN__') as as username,
It would be nice to get some information about a dataset on disk. This could work with the UUID, e.g. a2fab2r9-a694-49f4-823f-74a52b3307aa.
report data-info or report uuid-info could be an entrypoint to get this data.
Seen during the GAT: people with small terminals were confused by the help/examples, as they didn't see the error message.
Currently there are approximately zero stability guarantees: we can change anything at any time, functions might disappear, etc. Given that admins might use gxadmin in pipelines, I think I should implement something that will give them more stability. The current idea is something like:
query_blah-blah_v1() { ... }
query_blah-blah_v2() { ... }
which could be called as query blah-blah (for the 'latest' version) and maybe as blah-blah_v2 for a specific version? When you're writing a pipeline that you want to pin to a specific version, you'd then use the versioned identifier of the function name, and we'd be free to update/change the behaviour of functions.
And for the 'older' versions of these functions we'd just move them to a 'graveyard', since we'll never touch them again.
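A minimal sketch of how that dispatch could work (the resolve_query helper and the _vN naming convention are assumptions for illustration, not existing gxadmin code):

```shell
#!/usr/bin/env bash
# Hypothetical sketch: resolve an unversioned query name to its newest _vN
# implementation, while still allowing an explicit version to be pinned.
query_blah-blah_v1() { echo "v1 behaviour"; }
query_blah-blah_v2() { echo "v2 behaviour"; }

resolve_query() {
    local name="query_$1"
    # An explicitly versioned name is used as-is.
    if declare -F "$name" >/dev/null; then
        echo "$name"
        return
    fi
    # Otherwise pick the highest _vN variant (version-aware sort).
    compgen -A function | grep -E "^${name}_v[0-9]+$" | sort -V | tail -n1
}

"$(resolve_query blah-blah)"      # latest -> prints "v2 behaviour"
"$(resolve_query blah-blah_v1)"   # pinned -> prints "v1 behaviour"
```

sort -V keeps v10 after v2, so the scheme survives double-digit versions.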
Maybe a wrapper around python server/scripts/secret_decoder_ring.py -c config/galaxy.ini decode 4asdfdfadsf.
Thanks @mvdbeek
https://www.gnu.org/software/help2man/
Maybe that's useful; the current --help output is kinda ugly and a weird markdown mix.
Hi there,
At the imaging meeting we were wondering if we could monitor the usage of a list of tools, both in terms of CPU hours and in terms of the number of different users.
I came up with this query (which I tested for bowtie2 and cufflinks):
tool_list="('bowtie2', 'cufflinks')"
psql -c "COPY
(SELECT
date_trunc('month', job.create_time AT TIME ZONE 'UTC')::date as month,
round(sum((a.metric_value * b.metric_value) / 3600 ), 2) as cpu_hours,
regexp_replace(regexp_replace(job.tool_id, '/[0-9.a-z+-]+$', '')::TEXT, '.*toolshed.*/repos/[^/]*/[^/]*/', '') as tool_name,
COUNT (DISTINCT job.user_id) as nb_users
FROM
job_metric_numeric a,
job_metric_numeric b,
job
WHERE
b.job_id = a.job_id
AND a.job_id = job.id
AND a.metric_name = 'runtime_seconds'
AND b.metric_name = 'galaxy_slots'
AND job.create_time < NOW() - interval '1 year'
AND regexp_replace(regexp_replace(job.tool_id, '/[0-9.a-z+-]+$', '')::TEXT, '.*toolshed.*/repos/[^/]*/[^/]*/', '') in $tool_list
GROUP BY
month, tool_name
ORDER BY
month DESC)
TO '$PWD/info_tool.csv' WITH (FORMAT CSV, HEADER) ;"
I am not an expert in SQL, so this query is probably not optimal. I don't know whether it is interesting enough to go into gxadmin.
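If this did go into gxadmin, the hardcoded tool_list literal would need to come from arguments; a small sketch (build_tool_list is a hypothetical helper, not part of gxadmin):

```shell
# Build a SQL IN-list literal like ('bowtie2', 'cufflinks') from arguments.
# Note: no escaping of embedded single quotes; fine for plain tool names.
build_tool_list() {
    local out="" t
    for t in "$@"; do
        out+="${out:+, }'${t}'"
    done
    echo "($out)"
}

tool_list=$(build_tool_list bowtie2 cufflinks)
echo "$tool_list"   # ('bowtie2', 'cufflinks')
```

The result can be spliced into the query above exactly where $tool_list already appears.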
e.g. with __DATA_FETCH__ running as a celery task, there is no destination_id, so with gxadmin iquery queue --by destination you can get the following error from telegraf, which results in none of the output measurements being collected:
E! [inputs.exec] Error in plugin: metric parse error: expected tag at 25:51: "queue_by_destination,state=running,destination_id= count=0"
This has happened in a few other cases as well. Basically, the iquery formatter should probably just fill in any null tags as unknown, or something even more obvious like __NULL__.
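Until the formatter does that, a stopgap would be to rewrite empty tag values in the line protocol before telegraf sees them; a sketch (sanitize_tags is hypothetical, not a gxadmin function):

```shell
# Replace empty tag values (",key=" followed by ',' or ' ') with __NULL__ so
# telegraf's line-protocol parser does not reject the whole batch.
sanitize_tags() {
    sed -E 's/,([A-Za-z0-9_]+)=(,| )/,\1=__NULL__\2/g'
}

echo 'queue_by_destination,state=running,destination_id= count=0' | sanitize_tags
# queue_by_destination,state=running,destination_id=__NULL__ count=0
```

Fixing it in the query itself with COALESCE would still be the cleaner solution.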
Something like that would be useful to have, especially as iquery:
gxadmin query queue-details | awk -F '|' '$4 ~ "interactive_"' | wc -l
galaxy@sn06:~$ gxadmin query jobs-max-by-cpu-hours
ERROR: column "cpu_hours" does not exist
LINE 10: ORDER BY cpu_hours desc
(venv) galaxy@sn06:~/server$ gxadmin query job-state 60057230
ERROR: syntax error at end of input
LINE 4: WHERE id =
./gxadmin time meta whatsnew
doesn't seem to work currently
When querying with an email address that is not in the database and contains numbers, it retrieves the wrong user.
Example: [email protected]
If I understood the WHERE clause in the query correctly:
galaxy_user.email = '$1'
-> FALSE, because it's not in the database
or
galaxy_user.username = '$1'
-> FALSE, because it's not in the database
or
galaxy_user.id = CAST(REGEXP_REPLACE(COALESCE('$1','0'), '[^0-9]+', '0', 'g') AS INTEGER)
-> Here I think it would find the user whose id contains '123', but that is not the right user id, because the user from the email doesn't exist.
Does anyone know what the use case for the third condition is?
Thanks,
Bea
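For illustration, the collapse done by the third condition can be reproduced outside the database with sed (the email below is hypothetical, since the one in the report is redacted): every run of non-digits becomes a single '0', and the resulting number can collide with a real user id.

```shell
# Shell equivalent of REGEXP_REPLACE(email, '[^0-9]+', '0', 'g'):
# each run of non-digit characters collapses to a single '0'.
email='john123@example42.com'
echo "$email" | sed -E 's/[^0-9]+/0/g'
# 01230420 -> CAST(... AS INTEGER) = 1230420, which may be an existing user id
```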
@natefoo wanted to do [foo|bar|baz] (very reasonable), but we're using | for assignment, which is odd to say the least. We should change this to := in a desire to keep default assignment 'shellish'.
2432:query_tool-popularity() { ##? [months|24]: Most run tools by month (tool_predictions)
4514:query_tool-new-errors() { ##? [weeks|4] [--short-tool-id]: Summarize percent of tool runs in error over the past weeks for "new tools"
4562:query_tool-errors() { ##? [--short-tool-id] [weeks|4]: Summarize percent of tool runs in error over the past weeks for all tools that have failed (most popular tools first)
4610:query_tool-likely-broken() { ##? [--short-tool-id] [weeks|4]: Find tools that have been executed in recent weeks that are (or were due to job running) likely substantially broken
4659:query_user-recent-aggregate-jobs() { ##? <user> [days|7]: Show aggregate information for jobs in past N days for user (by email/id/username)
@selten your new queries are taking more than 40 hours to execute, I can start to see why you added the feature to select which slurps you wanted.
We should try and find a way to fix this query so it can scale
It took me some time to understand that the command failed because of spaces instead of tabs in the EOF block.
Would be nice to have a linter here :)
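bash's <<-EOF strips leading tabs only, so space-indented heredoc bodies silently change the output. A minimal lint sketch (lint_heredoc_indent is a hypothetical helper):

```shell
# Flag space-indented lines inside <<-EOF heredocs, where bash strips only tabs.
lint_heredoc_indent() {
    awk '
        /<<-EOF/             { in_doc = 1; next }
        in_doc && /^\t*EOF$/ { in_doc = 0; next }
        in_doc && /^ /       { printf "line %d: space-indented heredoc line\n", NR }
    ' "$1"
}
```

Running it over a functions file prints the offending line numbers.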
Given an invocation ID, it would be nice to have:
Hi,
I was wondering, as InfluxDB OSS v1 is no longer under active maintenance, is there any plan to support InfluxDB OSS v2 in the meta influx part?
galaxy@sn06:~$ gxadmin report data-info 110124855
ERROR: syntax error at or near "OR"
LINE 18: AND (d.id= OR hda.id=)) to STDOUT with CSV DELIMITER E'\t'
I would like to start optimizing some tools. I think we can find some BA and MA students to optimize our tool stack and thereby waste fewer resources.
To prepare, a query like gxadmin query tool-usage but with min/avg/max runtime columns would be super useful.
Trying this on our instance I get:
$ gxadmin query jobs-ready-to-run
ERROR: column job.stdout does not exist
LINE 66: job.stdout AS job_stdout,
histogram.py is used by gxadmin a bit and is Python 2 only at the moment.
@sanjaysrikakulam ported it to Python 3 here: https://gist.github.com/sanjaysrikakulam/ba5b7ea35c34e1ee6c325d57de8eb717
I guess it would be nice to ship this newer version, which works with the latest Galaxy.
# Delete text job metrics for successful or old jobs (not useful for debugging).
# Delete numeric job metrics for old unsuccessful jobs (not useful for statistics or debugging).
DELETE FROM job_metric_text USING job WHERE job_id=job.id AND (job.state='ok' OR CURRENT_TIMESTAMP - job.update_time > '1 year');
VACUUM FULL job_metric_text;
DELETE FROM job_metric_numeric USING job WHERE job_id=job.id AND job.state IN ('deleted', 'error') AND CURRENT_TIMESTAMP - job.update_time > '1 year';
VACUUM FULL job_metric_numeric;
From @nsoranzo
When users request quota increase I check their actual disk usage with this very useful command:
gxadmin report user-info EMAIL
It would be cool to see also their current quota, which would help to check whether the quota was properly increased. Is it something that could be easily added? Or maybe: Is there a better way to check the quota for a specific user?
Thank you :)
I get ERROR: division by zero when trying to run it.
When using gxadmin query queue-detail in Australia, I noticed that jobs just created were being displayed as having run for 8 hours. I realised this was the difference between our two timezones: the timezone offset for Germany is hardcoded into the query.
SELECT job.state, job.id, job.job_runner_external_id as extid, job.tool_id, $username, date_trunc('hour', (now() - job.create_time - '2 hours'::interval)) as time_since_creation
FROM job, galaxy_user
WHERE state in ('running', 'queued'$d) and job.user_id = galaxy_user.id
ORDER BY state desc, time_since_creation desc
Maybe we need to detect the timezone modifier and do the appropriate math in the query. It was easy enough to edit my local copy and change the - '2 hours'::interval to - '10 hours'::interval.
This occurs because the job creation time in the database is stored in UTC, whilst now() returns the current time in the local timezone.
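The variant used elsewhere in gxadmin, now() AT TIME ZONE 'UTC' - job.create_time, avoids any hardcoded offset by comparing both timestamps in the same frame. A quick shell illustration of the same idea: epoch seconds are timezone-independent, so a difference taken in one common frame needs no per-site fudge factor.

```shell
# The same instant observed from two timezones yields the same epoch value,
# so differences computed in a common frame (epoch/UTC) need no '2 hours' fix.
berlin=$(TZ=Europe/Berlin date +%s)
sydney=$(TZ=Australia/Sydney date +%s)
echo $(( sydney - berlin ))   # 0 (or 1 across a second boundary)
```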
Just as an aside: I've also changed the resolution to minute and added job destination, handler and runner to my version of the query, so I can see finer-grained times and where each job is ending up:
handler | job_runner_name | destination_id
---------------------+-----------------+--------------------
main.job-handlers.4 | slurm | slurm_2slots
main.job-handlers.3 | slurm | slurm_2slots
main.job-handlers.3 | pulsar_rest | pulsar-nci_mid
main.job-handlers.1 | slurm | slurm_1slot_upload
Gets added on to the end of my table.. :)
I am looking at the disk usage by user. For that, I've used the function query_user-disk-usage(), which computes the size. I have also seen a field called disk_usage in the galaxy_user table.
I was expecting that size and disk_usage would show the same value, but this is not happening. Should they match? Is disk_usage the value displayed to users in Galaxy? I'm a bit confused :) Any help? Thanks!
Given history