tiagofilipe12 / patlas
Plasmid Atlas - A web interface to browse for plasmids and their associated genes. Visit us at:
Home Page: http://www.patlas.site
License: GNU General Public License v3.0
Currently, duplicated links are removed in the JS front end; however, this could be done more efficiently in the Python back end. While creating the JSON file with all entries, something like this gist could be done. hash() could be used to improve script efficiency, but it may not be worthwhile given that the strings are small (needs testing).
Also, it should be considered whether the JSON file should follow a structure more similar to the database: {acc: { length: x, links: [a, b, c]}}. This would be nicer for JS to parse, but it will require more refactoring on the front end side.
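A minimal sketch of how the back end could drop duplicated links while writing the nested structure, assuming links arrive as (source, target) pairs and that a link is the same in both directions. The function name build_entries, the accessions and the lengths are all hypothetical. A Python set already hashes the tuples it stores, so no explicit hash() call is needed here.

```python
import json


def build_entries(lengths, raw_links):
    """Build {accession: {"length": int, "links": [accession, ...]}},
    storing each undirected link only once."""
    seen = set()  # pairs already stored, in canonical (sorted) order
    entries = {acc: {"length": length, "links": []}
               for acc, length in lengths.items()}
    for source, target in raw_links:
        pair = tuple(sorted((source, target)))
        if source == target or pair in seen:
            continue  # skip self links and links already stored in either direction
        seen.add(pair)
        # the link is kept only under the accession that sorts first
        entries[pair[0]]["links"].append(pair[1])
    return entries


# Hypothetical input: accession -> sequence length, plus raw pairwise links
lengths = {"acc_a": 5000, "acc_b": 7200, "acc_c": 3100}
raw_links = [
    ("acc_a", "acc_b"),
    ("acc_b", "acc_a"),  # same link, reversed
    ("acc_a", "acc_c"),
]
entries = build_entries(lengths, raw_links)
print(json.dumps(entries, indent=2))
```

With this layout the front end would no longer need to filter duplicates, at the cost of looking up a link under whichever accession sorts first.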
Order filters are not working properly: all orders are being appended to the same entry and the colour is not being plotted (branch: taxa).
Before releasing the full database, it should be updated from NCBI, given that the NCBI database is updated every 3 months, which often breaks FASTA parsing.
A different method should be implemented for returning each node to its previous color.
Multi-level selection of taxa has an issue when all 4 levels are selected, rendering no selection at all.
Add new taxa_tree.json file to populate the taxa menus within the app.
When uploading .json files to the application, the handling function is called twice.
Currently, the visualization does not support more than 20 colors for each taxa. This should be addressed in future versions.
Plasmid names are being retrieved as something like pLMG9303 instead of pLMG930.3. The database needs to be reworked in order to correct this issue.
Future implementations should consider including options that let the user define cutoffs for the p-value, the mash distance and the maximum number of links between sequences.
When many sequences are given as input, pairwise comparisons can become very intensive, and the function mash_distance_matrix stores a lot of entries, which might consume a lot of memory.
One way to quickly visualize metadata such as the accession number would be to display it in a label next to the corresponding node. However, this might be very confusing. Perhaps there is another way.
This would be very useful for displaying images outside patlas as png or jpg.
After filtering with a given set of taxa, the color cannot be properly removed from the legend and the graph.
Add a zoom in and zoom out slider to vivagraph output
When comparing reads diff add each read coverage length of reference sequences.
Given the results from the samtools depth file generated by PlasmidCoverage, it would be nice to generate a plot with the coverage depth of all positions of a given plasmid.
However, this should be done only for the results under the defined cutoff of the PlasmidCoverage script, in order to avoid an overload of information.
We should check if plotly or any other js library has implemented any kind of circular histogram that we can re-use.
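As a starting point for such a plot, the per-position depths could be collected like this. This is only a sketch: it assumes the standard three-column, tab-separated samtools depth output (reference name, 1-based position, depth), and the keep argument is a hypothetical stand-in for whatever set of plasmids the PlasmidCoverage cutoff selects.

```python
import io
from collections import defaultdict


def read_depth(handle, keep=None):
    """Collect (position, depth) pairs per reference from samtools depth output.

    keep: optional set of reference names to retain (hypothetical stand-in
    for the plasmids selected by the PlasmidCoverage cutoff).
    """
    depth = defaultdict(list)
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        ref, pos, cov = fields[:3]
        if keep is not None and ref not in keep:
            continue
        depth[ref].append((int(pos), int(cov)))
    return dict(depth)


# Hypothetical depth file content
example = "plasmid_a\t1\t10\nplasmid_a\t2\t12\nplasmid_b\t1\t0\n"
depths = read_depth(io.StringIO(example), keep={"plasmid_a"})
print(depths)
```

The resulting lists could then be handed to whichever plotting library ends up being chosen.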
When selecting a taxa, some of the child taxa will not be processed.
Distance filters after re-run currently don't have the actual distance value (they just have the accession in the database); therefore, it would be important to populate the database with the accession numbers + distances.
Currently this has the following structure:
{"significantLinks": ["NC_010869_1", "NC_025192_1"], .... }
However, a more nested structure is needed, or one with name and distance linked together, e.g. accession|distance instead of accession. The latter would be easier to implement in a first instance.
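A rough sketch of the accession|distance idea, assuming the distance is stored as plain text after the last "|" in each entry. The helper names pack_link and unpack_link and the distance values are hypothetical.

```python
def pack_link(accession, distance):
    """Join accession and distance into a single 'accession|distance' string."""
    return f"{accession}|{distance}"


def unpack_link(entry):
    """Split 'accession|distance' back into (accession, float distance)."""
    accession, _, distance = entry.rpartition("|")
    return accession, float(distance)


# Hypothetical distances attached to the accessions from the current structure
doc = {"significantLinks": [pack_link("NC_010869_1", 0.05),
                            pack_link("NC_025192_1", 0.2)]}

# Distance filter: keep only links at or below a hypothetical cutoff
cutoff = 0.1
close_links = [acc for acc, dist in map(unpack_link, doc["significantLinks"])
               if dist <= cutoff]
print(close_links)
```

Splitting on the last "|" keeps the parser working even if an accession ever contains that character.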
Right now the filter iterates through all nodes and removes the nodes that don't have a color attributed or a link to a colored node. However, this behavior results in a slow loading time and should therefore be replaced by queries to the database that retrieve the information on the nodes and generate a new json to render a new (smaller) instance of the graph.
Add a dark mode to visualization.
In the example provided in modules/dict_temp_005_l4.json, four additional links are being created and linking to every node. From a total of 5384 sequences retrieved in python, 5388 nodes are being created, of which 4 nodes connect to every other node.
Note that currently only 4 links are being stored in the json file, so visualization.html should not have nodes with more than 4 links and should have 5384 nodes instead of 5388.
When clicking on cancel selection in file modals, the reader variable is not defined, which makes the button useless.
A way to cycle between clusters should be implemented. There is already a way to search for accessions, which could help to find the cluster associated with a given sequence.
Linked with #74. Plots would benefit from loading information where the user can see the queries that are being made and the ones that have already been made.
The progress bar broke after inserting a pool.join() to wait for the mp process to finish.
A UI control for the vivagraph display may help to establish a better visualization. Therefore, add a div that allows the user to specify and change parameters for the vivagraph layout.
Remove example gifs that are not used anymore.
A metadata box could be displayed on some click event (button or something else). This metadata could show:
already available - check the listGiFilter variable
still needing implementation
Coverage results could have a slider similar to the length filters, enabling the user to select and unselect previous nodes with a certain coverage.
The legend should also be updated when interacting with this slider, but only when the definitive range of coverage percentages is submitted.
Add a filter that only shows a given coverage threshold in reads mode.
Two entries of the same taxa are being added to the html legend.
For some reason, the last NCBI database (plasmid) from 20/7/2017 has genes mixed with plasmid sequences. To remove them, search for CDS in the header and match the string using .lower(), because there are both "CDS" and "cds".
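A minimal sketch of that filter, assuming a plain FASTA file in which every record starts with a ">" header line. The function name drop_cds_records and the example headers are hypothetical. Note that a plain substring match would also drop any header that merely contains the letters "cds" inside another word.

```python
import io


def drop_cds_records(handle):
    """Yield FASTA lines, skipping every record whose header mentions CDS
    in any letter case."""
    keep = True
    for line in handle:
        if line.startswith(">"):
            # decide once per record, based on the lowercased header
            keep = "cds" not in line.lower()
        if keep:
            yield line


# Hypothetical FASTA content with genes mixed in
fasta = (">seq1 plasmid pA\nACGT\n"
         ">seq2 cds geneX\nTTTT\n"
         ">seq3 CDS geneY\nGGGG\n"
         ">seq4 plasmid pB\nCCCC\n")
kept = "".join(drop_cds_records(io.StringIO(fasta)))
print(kept)
```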
The different elements of modals overlap in small window sizes.
The README needs to be updated in accordance with the api branch.
Graph should be re-centered after removing nodes and links with re run button.
When the read filter legend is triggered and taxa filters are then appended to the legend, the list of all species present in the legend is not removed until the next instance of taxa filters.
If we choose a color scheme for distances it will be appended to taxa filters modal body.
add a mini-map to the bottom-right corner
When triggering the Re run button, the removal of nodes and links is odd: all the removals are performed only after several clicks on the Re run button.
While trying to submit a function when no taxa filters are applied an error message is raised:
Uncaught ReferenceError: assocFamilyGenusGenus is not defined
at HTMLButtonElement.<anonymous> (visualization_functions.js:855)
at HTMLButtonElement.dispatch (jquery-3.1.1.js:5201)
at HTMLButtonElement.elemData.handle (jquery-3.1.1.js:5009)
Although this doesn't affect the final result and a proper warning is raised for the user, error messages in the console should be avoided, and thus instances where assocFamilyGenus, assocOrderGenus and assocGenus are undefined should be handled.
When two nodes are selected on mouse click and one of them is then deselected, the linked node is also deselected even though the other node is still selected.
A check has to be implemented in order to see whether the linked node is still selected through another node.
taxa_fetch.py should be refactored in order to be loaded by MASHix.py instead of running separately. This will imply that the doc dictionary will have the information regarding the taxa and be committed just once, rather than removing the previous entry and adding a new entry each time we want to add taxa information to the psql database.
Adding nodes asynchronously causes the browser to freeze in Firefox and on PCs with fewer resources.
I tried to implement concurrency like this:
const limit = 10
let running = 0
const scheduler = () => {
  while (running < limit && json.nodes.length > 0) {
    const array = json.nodes.shift()
    console.log(array)
    addAllNodes(array, () => {
      running--
      if (json.nodes.length > 0) {
        scheduler()
      }
    })
    running++
  }
}
scheduler()
This returns "too much recursion" because scheduler is being executed inside scheduler.
Add report as csv to coverage results.
A way to implement this could be to get the relative position of all selected nodes in relation to each other and then the dragged node would get a new position and all the others would get set by the relative position to the dragged node.
Requirements have a lot of unused packages and versioning should be handled more loosely than it is atm.
Add legend to reads plotting with color scale.
csv file could be used to make multiple selections, on taxa filters, antibiotic resistance, plasmid families and table view mode.
Add an option to export current visualization (as pdf, png and jpg).
Dropdown menus should be populated while scrolling. Something like recyclerview for bootstrap. Maybe check bootstrap-select options.