drpowell / degust
An interactive web-tool for RNA-seq analysis
Home Page: http://degust.erc.monash.edu/
License: GNU General Public License v3.0
It would be nice to have easy information on how to cite this, preferably on the main page.
We all know that 1 vs 1 isn't ideal, but hey, sometimes it's what we have to analyse. Clean error messaging when voom/edgeR fails due to this would be good. And ideally a version that also allows visualisation of such data (with all the caveats that you would expect).
Enhancement: (In http://degust.erc.monash.edu/visited)
Degust uses glmFit, while glmQLFit is more stringent. To quote one of the authors of edgeR:
"glmQLFit will provide more accurate type I error rate control as it accounts for the uncertainty of the dispersion estimates in a more rigorous manner than the glmFit pipeline"
An option to choose which model to fit would be great.
This would allow counts files from featureCounts to be used directly without modification - which is not always ideal, but useful for some users (eg, when RNAsik Degust-compatible count file generation fails).
Since there is a chance that the CSV column labels row may begin with a #, this should be a checkbox option or text entry ('Ignore header lines beginning with #', or 'Skip X lines [header region]') on the configuration page.
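A minimal Python sketch of the two options suggested above, for illustration only. It is not Degust's parser; the function name `read_counts`, its parameters, and the sample text are all made up for this example.

```python
import csv


def read_counts(text, delimiter="\t", comment_prefix="#", skip_lines=0):
    """Parse delimited text into (header, rows).

    comment_prefix: drop every line starting with this string
                    (pass None to keep such lines, e.g. when the
                    column-label row itself begins with '#').
    skip_lines:     drop this many leading lines unconditionally.
    """
    lines = text.splitlines()[skip_lines:]
    if comment_prefix:
        lines = [ln for ln in lines if not ln.startswith(comment_prefix)]
    reader = csv.reader(lines, delimiter=delimiter)
    header = next(reader)
    rows = list(reader)
    return header, rows


# Made-up sample in the general shape of a featureCounts table.
sample = (
    "# Program:featureCounts; Command:...\n"
    "Geneid\tChr\tStart\tEnd\tStrand\tLength\tsampleA.bam\n"
    "gene1\tchr1\t100\t900\t+\t801\t42\n"
)

header, rows = read_counts(sample)                       # 'Ignore lines beginning with #'
header2, rows2 = read_counts(sample, comment_prefix=None, skip_lines=1)  # 'Skip X lines'
```

Keeping the two switches independent matters for the case raised above: if the column-label row starts with `#`, the prefix filter has to be turned off and a line count used instead.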
Hi,
I was wondering what is the meaning of the color scheme in KEGG pathways - they don't match the colors given to different treatments in the MDS plot, for example. I am also wondering if it is possible to customize and save the parallel coordinates plot (e.g. select only a custom set of genes to be displayed).
Degust is a great tool, and I apologize in advance if the answers to my questions are in the Degust website, I was unable to find them.
Thank you in advance!
all normalised in R, but it would be nice to present microarray expression data through degust.
This would be a nice to have.
Hi Dave,
When you make a selection on the parallel co-ordinates plot by mouse, you can't then select and copy information out of the gene table. Even after removing the selection on the plot, until you reload the page, you still can't select the gene table. This only occurs with the parallel co-ordinate plot, the MA and the Volcano plots don't have this problem.
When looking at the 3D MDS plot, the colours mapping the sample groups cannot be seen, making the image hard to interpret.
Being able to adjust the % of the screen width to see more data (in the genes table), and to allow for more space for conditions labels would be useful. For those with bigger screens only about 50% of the screen width is actually used.
If more than ten conditions are added in Degust, then the last samples use the same colours as the first samples.
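One way to avoid repeated colours for any number of conditions is to generate them by spacing hues evenly, rather than indexing into a fixed palette. The Python sketch below only illustrates that idea; how Degust actually assigns colours is not stated here, and `distinct_colours` is a hypothetical helper.

```python
import colorsys


def distinct_colours(n, saturation=0.65, value=0.9):
    """Return n hex colours with evenly spaced hues (no wrap-around repeats)."""
    colours = []
    for i in range(n):
        r, g, b = colorsys.hsv_to_rgb(i / n, saturation, value)
        colours.append(
            "#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(b * 255))
        )
    return colours


palette = distinct_colours(12)
```

With many conditions, neighbouring hues become hard to tell apart, so this trades repeated colours for smaller colour differences.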
Hi guys,
Would be useful for the user to be able to
We find that we want a lot of annotation, yet being able to hide it quickly would really help. Export would still include all annotation columns.
Cheers
/Alisatir
ie. FC is the average, but for some genes (eg. lowly expressed ones) the confidence interval would be large.
We typically don't report the confidence interval - we should though. Obviously the p-value is related.
-- you
Firstly, the parallel coordinates part being interactive is really great. Some minor comments that would increase usability:
@drpowell Hi, Is there a way to adjust the row height in the degust html report? We've got genes with annotation from multiple DBs but only the first item was displayed. For example,
The full content of Desc for the first gene is actually
[GO] GO:0055114 Biological Process:oxidation-reduction process
GO:0050660 Molecular Function:flavin adenine dinucleotide binding
GO:0016491 Molecular Function:oxidoreductase activity
GO:0016614 "Molecular Function:oxidoreductase activity, acting on CH-OH group of donors"
[InterPro] IPR000172 "Glucose-methanol-choline oxidoreductase, N-terminal"
IPR007867 "Glucose-methanol-choline oxidoreductase, C-terminal"
IPR012132 Glucose-methanol-choline oxidoreductase
IPR023753 FAD/NAD(P)-binding domain
[UniPro] Protein HOTHEAD OS=Arabidopsis thaliana GN=HTH PE=1 SV=1
[TREMBL] Uncharacterized protein OS=Prunus persica GN=PRUPE_ppa003422mg PE=4 SV=1
Thank you.
I uploaded a counts file in which I accidentally had two duplicate column names (but had different counts across the two columns). Degust didn't chuck an error at that, I was able to put two samples into the same condition and still load a Degust session. In the session however, it seems to pick one of the columns and duplicates it as when I checked to show counts, I had identical numbers for the two samples.
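A check along these lines could reject such a file at upload time. This is a small Python sketch for illustration, not Degust's code; `duplicate_columns` is a hypothetical helper.

```python
from collections import Counter


def duplicate_columns(header):
    """Return column names that occur more than once, in first-seen order."""
    counts = Counter(header)
    return [name for name, n in counts.items() if n > 1]


header = ["Gene ID", "S1", "S2", "S1"]
dupes = duplicate_columns(header)
if dupes:
    message = "Duplicate column names in counts file: " + ", ".join(dupes)
```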
https://reactome.org/what-is-reactome
open source replacement for KEGG?
This may well be overkill, however it is still the case that we sometimes end up with sex differences in our samples. The option of being able to label the samples as M/F, and having that go into the fit would be really nice.
Bit of a strange error this one. Using chrome.
settings=%7B%22csv_format%22%3Atrue%2C%22replicates%22%3A%5B%5B%22KI%22%2C%5B%22S232_KIr%22%2C%22S233_KIr%22%2C%22S242_KIr%22%5D%5D%2C%5B%22WT%22%2C%5B%22S176_WTr%22%2C%22S225_WTr%22%2C%22S228_WTr%22%5D%5D%5D%2C%22fc_columns%22%3A%5B%5D%2C%22info_columns%22%3A%5B%22Gene+ID%22%2C%22gene_biotype%22%2C%22name%22%2C%22entrezgene%22%2C%22chromosome_name%22%2C%22start_position%22%2C%22end_position%22%2C%22strand%22%2C%22Length%22%2C%22description%22%5D%2C%22analyze_server_side%22%3Atrue%2C%22link_column%22%3A%22Gene+ID%22%2C%22name%22%3A%22SVI_SRSF2_HsclCreYFP_STAR_mm10_rmdup_counts_annotated%22%2C%22primary_name%22%3A%22%22%2C%22init_select%22%3A%5B%22KI%22%2C%22WT%22%5D%2C%22hidden_factor%22%3A%5B%5D%2C%22min_counts%22%3A10%2C%22min_cpm%22%3A1%2C%22min_cpm_samples%22%3A2%7D
HTTP/1.1 200 OK
Date: Wed, 30 Aug 2017 05:24:26 GMT
Server: Apache/2.4.18 (Ubuntu)
X-Frame-Options: SAMEORIGIN
X-XSS-Protection: 1; mode=block
X-Content-Type-Options: nosniff
Content-Type: text/html
Content-Disposition: inline; filename="compare.html"
Content-Transfer-Encoding: binary
Cache-Control: private
X-Request-Id: 90e1ff14-3615-46b4-9dad-d40abcb724a3
X-Runtime: 0.002492
Set-Cookie: _degust-jobs_session=SHVIU2xJQXEyVVMwOW9nL01DamZySGtpZzhqN3RCNzQ2N3R1UFZJQlQ0Rno0L1U1OUV0M28zVTVURUg2Q1F3MVIza0FnNkR2K2w5WFhJQThkd1krN1ZCT3hIcU1yU01TSXV3VDNTOWlhK3FkTFYwL3dxZm9sQ3dVYXRINHVpaWExeUl6dFZvdEFqQUk2Q09ibTROcElha2VtNVpjWWtoVm1DWDlRRjZSNlFFPS0tWG5uV1NEZ1RNaEszT3FQZktpRjF6dz09--0187109c89db3122880ea03ef96dfd4b1f35b5f4; path=/; HttpOnly
Content-Encoding: gzip
Keep-Alive: timeout=5, max=99
Content-Length: 423
Accept-Ranges: none
Connection: keep-alive
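The `settings=` payload in the report above is URL-encoded JSON, so it can be decoded for inspection with the Python standard library. The sketch below uses a shortened excerpt of that payload (three of its keys) to keep the example readable.

```python
import json
from urllib.parse import parse_qs

# Shortened excerpt of the settings payload from the report; the full
# string decodes the same way.
query = (
    "settings=%7B%22csv_format%22%3Atrue%2C"
    "%22link_column%22%3A%22Gene+ID%22%2C"
    "%22min_counts%22%3A10%7D"
)

# parse_qs undoes the percent-encoding and turns '+' into a space.
settings = json.loads(parse_qs(query)["settings"][0])
```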
Would be more meaningful to have cpm scale as default in gene plots in our opinion. And/or a check box in the config for this?
Cheers and thanks as always
It's useful to have full naming to reduce the "where did that file come from" problem.
When saving filename (MA, Volcano, MDS) - start filename with dataset name would be a useful option.
So when one has a count table of over 20 samples, for instance, is it possible to upload a pdata table with sample name and group as a csv file, rather than checking the sample names by hand?
Particularly good for scRNA-seq, but also for the dataset I am using at the moment, which is 157 RNA-seq samples.
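As a rough Python illustration of the request, a sample sheet can be turned into the same nested shape as the `replicates` field visible in the settings payload quoted in another report on this page (`[["KI", [...]], ["WT", [...]]]`). The column names `sample` and `group` and the helper `replicates_from_sheet` are assumptions for this sketch, not an existing Degust feature.

```python
import csv
import io


def replicates_from_sheet(text):
    """Group sample names by condition, keeping first-seen order."""
    groups = {}
    for row in csv.DictReader(io.StringIO(text)):
        groups.setdefault(row["group"].strip(), []).append(row["sample"].strip())
    return [[name, samples] for name, samples in groups.items()]


sheet = "sample,group\nS232_KIr,KI\nS233_KIr,KI\nS176_WTr,WT\n"
replicates = replicates_from_sheet(sheet)
```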
Hi,
I would like to use v3.1.1 or above for Degust, but I have access to v3.1.0 only.
It would be useful to have a text box where we can include a full description of the processing done on the dataset - a materials and methods if you like. Useful to keep track of documentation within degust for easy access.
Great to see cpm plotting now in the dev version!
I suggest moving the label outside of the plot, or changing the auto scale to be +10%.
It would be useful when saving MDS plots to png/svg, for the information used in making the plot to be saved at the same time:
This could be:
Dataset name + params for mds + dimension. (or at least for that option to be the default).
For gene expression this could be:
Dataset name + Gene info cols + cpm/count
Changing the default sorting of http://degust.erc.monash.edu/visited to last visited (most recent first), or last created (most recent first), would be great. Remembering this sort option wouldn't hurt either.
Defining/using these at the start of the R code would be helpful for people reproducing things in R.
input_file
output_dir
output_file
Another potential future feature request.
Pasting in a custom list of genes/gene id's for presentation in degust plots and tables.
Hi,
After a complete installation with Docker, I can't get through authentication with google_oauth2 from localhost:8001.
Is there a way to use Degust without the auth step?
Thank you
Some possible "nice to have" ideas
Add option for:
Exporting gene list as text to clipboard (for pasting into enrichr or other tools).
Create a .gmt file (using gene id) to give an MSigDB-style gene set for up/downregulated genes
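For reference, a .gmt file holds one gene set per line: set name, a description field, then the gene ids, all tab-separated. The Python sketch below shows how up and down sets might be written from a results table. The row keys (`gene`, `logfc`, `fdr`), the cutoff, and both helper names are assumptions made for this example.

```python
def split_by_direction(rows, fdr_cutoff=0.05):
    """Split significant genes into up- and down-regulated id lists."""
    up = [r["gene"] for r in rows if r["fdr"] <= fdr_cutoff and r["logfc"] > 0]
    down = [r["gene"] for r in rows if r["fdr"] <= fdr_cutoff and r["logfc"] < 0]
    return up, down


def to_gmt(gene_sets, description="na"):
    """Render {set_name: [gene ids]} as GMT text, one set per line."""
    lines = ["\t".join([name, description, *genes]) for name, genes in gene_sets.items()]
    return "\n".join(lines) + "\n"


rows = [
    {"gene": "A", "logfc": 2.0, "fdr": 0.01},
    {"gene": "B", "logfc": -1.5, "fdr": 0.02},
    {"gene": "C", "logfc": 3.0, "fdr": 0.50},  # not significant, left out
]
up, down = split_by_direction(rows)
gmt_text = to_gmt({"demo_UP": up, "demo_DOWN": down})
```

The same `up` and `down` lists, joined with newlines, would also serve the clipboard export idea.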
Hi there, love the ability to show/hide heatmap from the tab views, just noticed this in the MDS tab. possible to have this across the tabs, and then also a hyperlink to hide the heatmap again (at the moment not obvious you have to right click on the heatmap to reveal the hide toggle).
chur!
Not sure if it's worth fixing but I came across a case where someone had a trailing space in one of their sample names. This ended up breaking the R script when it tried to subset the data using a non-existent colname.
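If it is worth fixing, trimming names on upload would be one option. A small Python sketch, with `clean_sample_names` as a hypothetical helper that also reports what it changed so the user can be told:

```python
def clean_sample_names(header):
    """Strip surrounding whitespace from column names.

    Returns (cleaned_header, changes), where changes lists (old, new)
    pairs for every name that was altered.
    """
    cleaned = [name.strip() for name in header]
    changes = [(old, new) for old, new in zip(header, cleaned) if old != new]
    return cleaned, changes


cleaned, changes = clean_sample_names(["Gene ID", "S1 ", " S2"])
```

Stripping can itself create duplicate names (for example `"S1"` and `"S1 "`), so a duplicate check would need to run after it.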
When viewing a 3D MDS plot, it is possible to right click to save as PNG/SVG. This causes a session crash with an unresponsive box - clicking the menu options will escape, but it does lose memory of what samples were selected etc.
I suspect this shouldn't be an option at all.
Also occurs if animation has been toggled off.
Ideally the output from degust/R code would pipe directly into the EGSEA sample example (or something very similar to it for downstream pathway/gene set analysis), so updating the y object so that it contains all the necessary parts would be quite useful. (https://bioconductor.org/packages/release/bioc/html/EGSEA.html)
This may not be complete but I found these two to start with.
More a feature request, possible to present a legend for the 3D mds plot? (again on behalf of an interested end-user).
It would be handy to be able to download the input data as it was originally input, as it makes the receiver of the degust link more autonomous.
This could be from the datasets / your data page, or from the configure page (or both).
Hi, now that there's a Dockerfile to allow docker builds, is it possible to push the current release to Docker Hub (https://docs.docker.com/docker-cloud/builds/push-images/)? It would be helpful for me to have this (and I'd rather have an official docker image than for me to push it myself)
The "duplicate rownames not allowed" error is caused by input where Gene ID and name are not in the first columns. It would be more robust to not rely on the order being correct, but to find the Gene ID column and put it first.
Alternatively, throw a better error message: "Gene ID expected as first column of input".
I expect a bunch of additional format checks of the input would be useful including parts to strip potentially malicious code from the input file.
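The "find the Gene ID column and put it first" suggestion could look roughly like this in Python. It is a sketch only; `move_column_first` is a hypothetical helper and the column name is passed in rather than assumed.

```python
def move_column_first(header, rows, column="Gene ID"):
    """Reorder a table so that `column` is the first column.

    Raises ValueError with a readable message if the column is absent.
    """
    if column not in header:
        raise ValueError(f"{column!r} column not found in input")
    idx = header.index(column)

    def reorder(values):
        return [values[idx], *values[:idx], *values[idx + 1:]]

    return reorder(header), [reorder(r) for r in rows]


header, rows = move_column_first(
    ["name", "Gene ID", "S1"],
    [["Actb", "ENSG1", "5"]],
)
```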
We would like to use login/password based authentication.
Please integrate the OmniAuth Identity Provider as an alternative to Twitter/Google.
Minor useful addition:
While deleting a page from the list of uploads is great, sometimes you want to do it from within the view dataset part (because you are reviewing what you wish to delete).
For the online server, while it now displays a p-value column, this is lost upon export of the table to csv. (issue raised on behalf of an interested stakeholder/enduser).
Copied a session and created a new one, the hidden factor becomes another group in the new session (at least it is unchecked) so have to go back into config and select.
It would be useful if batch effects could be included in the DGE.
The new edgeR quasi-likelihood method is throwing an R error on the example data on the server. I haven't tested on other data.
That is, I reach this error via
http://degust.erc.monash.edu/ -> Play with a demo -> edgeR quasi-likelihood -> Apply
Referring to https://www.genome.jp/kegg/kegg1a.html, the EC number is no longer used as an identifier in KEGG. The KEGG Orthology (KO) system is the basis for genome annotation and KEGG mapping.
Why not update the EC Number column to KO to display genes on KEGG pathways?
Hi I have been trying to replicate the edgeR quasi-likelihood analysis with Degust. I managed to get the same values for logFC and logCPM but not the p-values. R-code looks alright to me. Any possible explanation to that?
Thank you.
When the method is set to 'Voom (sample weights)' and long condition names have been given, the condition names conceal the sample weights and make them unreadable.