
dentas's Issues

Blast - to do list

  • Convert the code to call BLAST through Biopython rather than passing commands to os.system? This would be the more correct way of doing things
  • Decide on an approach to manage the redundancy in transcript-to-gene ID mapping
  • Incorporate an E-value threshold into our analysis
  • Discuss with Adrian the possibility of running BLAST remotely on Apocrita. If we can't, we'll have to run it locally on a laptop and reduce the size of the input data
  • Tackle the issue of transcripts not identified by BLAST but present in multiple samples or at high FPKM
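
As a possible halfway step between os.system and Biopython, the BLAST call can be built as an argument list and run with subprocess, with the E-value threshold applied both at search time and when parsing the results. This is only a sketch: the choice of blastn, the 1e-5 cutoff, the file names and the function names are all assumptions, not what the project currently uses. It relies on the default column order of BLAST tabular output (-outfmt 6).

```python
import csv
import io
import subprocess

# Default column order of BLAST tabular output (-outfmt 6).
BLAST_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def build_blast_command(query, db, out, evalue=1e-5, threads=4, program="blastn"):
    """Return the BLAST call as an argument list (no shell string needed).

    The program name and the default E-value are assumptions for illustration.
    """
    return [
        program,
        "-query", query,
        "-db", db,
        "-out", out,
        "-outfmt", "6",
        "-evalue", str(evalue),
        "-num_threads", str(threads),
    ]


def run_blast(query, db, out, **options):
    """Run BLAST; raises CalledProcessError if BLAST exits with an error."""
    subprocess.run(build_blast_command(query, db, out, **options), check=True)


def filter_hits(tabular_text, max_evalue=1e-5):
    """Parse outfmt 6 text and keep hits at or below the E-value threshold."""
    reader = csv.DictReader(
        io.StringIO(tabular_text), fieldnames=BLAST_COLUMNS, delimiter="\t"
    )
    return [row for row in reader if float(row["evalue"]) <= max_evalue]
```

Passing a list to subprocess.run avoids shell quoting problems with file names, and check=True means a failed BLAST run stops the pipeline instead of silently producing an empty results file.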

Flagged duplicates issue in module.py

See the comments in module.py: the duplicates are being deleted without directly considering why they occur. The E-values are all the highest for each respective duplicate, but checking this explicitly is another layer of optimisation and may be supererogatory.
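
If the duplicates ever need to be resolved explicitly rather than dropped, one option is to keep a single hit per transcript, chosen by lowest E-value. The sketch below is not what module.py currently does; the function name and the (transcript_id, gene_id, evalue) tuple layout are assumptions for illustration.

```python
def best_hit_per_transcript(hits):
    """Keep one hit per transcript: the one with the lowest E-value.

    `hits` is an iterable of (transcript_id, gene_id, evalue) tuples.
    On a tie, the hit seen first is kept.
    """
    best = {}
    for transcript_id, gene_id, evalue in hits:
        current = best.get(transcript_id)
        if current is None or evalue < current[1]:
            best[transcript_id] = (gene_id, evalue)
    return best
```

This makes the reason for discarding a duplicate visible in the code: a hit is dropped only because another hit for the same transcript had a lower E-value.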

Improving the Apocrita functionality

Hi,

I'm going to implement some changes to improve the functionality with Apocrita; but James, as we're using your Apocrita access, only you can change the script that's being called within Apocrita.

For the BLAST job I've increased the number of threads to 4, in line with what's being requested on Apocrita.

Furthermore, the default for qsub jobs is 1 core. I was reviewing the Apocrita guide (http://docs.hpc.qmul.ac.uk/using/): you need to add this line to the script:

#$ -pe smp 4       # Request 4 CPU cores

You can also lower the RAM request to 1 GB per core instead of 3 GB.
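
Since only the script on Apocrita can carry these settings, one way to keep the core count and the BLAST thread count in step is to generate the job script from Python. This is a rough sketch under assumptions: the function name is made up, and the -cwd and h_vmem directives are standard Grid Engine options added here for illustration, so they should be checked against the Apocrita guide before use.

```python
def make_job_script(command, cores=4, mem_per_core_gb=1):
    """Return the text of a qsub job script that runs `command`.

    The directives are Grid Engine options; which ones Apocrita expects
    is an assumption to verify against its documentation.
    """
    if cores < 1 or mem_per_core_gb < 1:
        raise ValueError("cores and memory must be at least 1")
    lines = [
        "#!/bin/bash",
        "#$ -cwd                      # Run from the submission directory",
        f"#$ -pe smp {cores}",
        f"#$ -l h_vmem={mem_per_core_gb}G",
        "",
        command,
    ]
    return "\n".join(lines) + "\n"
```

For example, make_job_script("blastn -query q.fa -db db -num_threads 4") would produce a script requesting 4 cores at 1 GB each; the blastn command and file names there are placeholders.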

Oh yeah, the BLAST OS workaround is totally fine, but I would suggest rescripting the FPKM implementation part after the BLAST section to run in Python rather than through the cmd functions.

Suggestion to overcome groups issue

We could take raw_input from the user for the number of columns and use this object throughout, so that it overcomes the 3-groups barrier. Alternatively, if this proves too difficult to do before the deadline, we can have a set of hardcoded R scripts for various group numbers and call the appropriate script according to the number of groups the end user picks in Flask.
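
The first option could look roughly like this: read the group names once, validate them, and pass them on to the R script as command-line arguments so nothing downstream assumes three groups. The function names, the argument order and the minimum of two groups are assumptions for illustration; on the R side the values would arrive via commandArgs(trailingOnly = TRUE).

```python
def parse_groups(text):
    """Split 'control, treated, recovery' into a clean list of group names."""
    groups = [name.strip() for name in text.split(",") if name.strip()]
    if len(groups) < 2:
        # Assumption: a comparison needs at least two groups.
        raise ValueError("at least two experimental groups are needed")
    if len(set(groups)) != len(groups):
        raise ValueError("group names must be unique")
    return groups


def build_r_command(script, fpkm_table, groups):
    """Build the Rscript call: script, input table, group count, group names."""
    return ["Rscript", script, fpkm_table, str(len(groups)), *groups]
```

The text could come from input() on the command line (raw_input in Python 2) or from a Flask form field; either way the same list is reused for column selection and graph labels.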

Flask/html - to do list

  • Get the user to input the number of experimental groups and their names
  • User login -> we can then email a PDF of the results to the user on completion of analysis
  • make some sort of loading page/GIF to show whilst the results are being computed
  • how to manage multiple users?

Widen Species Selection

Instead of being limited to just Pteropus alecto, we will give the user a choice of species from a drop-down list: they can pick Pteropus alecto, mouse or human (Homo sapiens).

  1. I will try and set up each major DB
  2. We need a drop-down list incorporated into Flask
  3. Once the user picks the species, we then need to specify that THIS is the DB we will carry out the BLAST against.
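
Steps 2 and 3 can share one lookup table, so the drop-down and the BLAST call cannot drift apart. The sketch below uses only the standard library; the database paths, the mouse species name and the function names are placeholders rather than the project's real values. In a Flask view, the chosen value would typically be read from request.form["species"] and passed to database_for.

```python
from html import escape

# Placeholder mapping from the species shown to the user to a BLAST database.
SPECIES_DATABASES = {
    "Pteropus alecto": "db/pteropus_alecto",
    "Mus musculus": "db/mus_musculus",
    "Homo sapiens": "db/homo_sapiens",
}


def render_species_dropdown(field_name="species"):
    """Return an HTML <select> element listing every supported species."""
    options = "\n".join(
        f'  <option value="{escape(name)}">{escape(name)}</option>'
        for name in SPECIES_DATABASES
    )
    return f'<select name="{escape(field_name)}">\n{options}\n</select>'


def database_for(species):
    """Return the BLAST database for a species, rejecting unknown values."""
    try:
        return SPECIES_DATABASES[species]
    except KeyError:
        raise ValueError(f"unsupported species: {species!r}") from None
```

Looking the database up on the server, rather than trusting a path sent by the browser, means a tampered form value cannot point BLAST at an arbitrary file.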

Multi user functionality

In order to enable this, we can incorporate the intrinsic session ID present in Flask.

We just need to assign a "user session" ID to each of the files uploaded and carry this through each step of the analysis, including the temp files generated.
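
One way to carry the ID through every step is to give each session its own folder and build all upload and temp file paths from it. This sketch assumes we generate the ID ourselves with uuid and keep it in Flask's session object (for example session["uid"]), since Flask's default cookie-based session stores values rather than exposing an ID of its own. The function names and folder layout are hypothetical.

```python
import uuid
from pathlib import Path


def new_session_id():
    """Return a random 32-character hex ID for one user session."""
    return uuid.uuid4().hex


def session_path(base_dir, session_id, filename):
    """Return base_dir/<session_id>/<filename>, creating the folder if needed."""
    # Keep only the final path component so nested paths are flattened.
    safe_name = Path(filename).name
    if safe_name in ("", ".."):
        raise ValueError("invalid filename")
    folder = Path(base_dir) / session_id
    folder.mkdir(parents=True, exist_ok=True)
    return folder / safe_name
```

Every step (upload, BLAST output, FPKM tables, R plots) would then call session_path with the same ID, so two users running analyses at once never write to the same file.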

Documentation - we can use github / pandoc

Hi all,

Google pandoc. The steps are:

  1. Install pandoc.
  2. Edit your Word document as needed.
  3. Run pandoc from the Linux or Windows command line. ...
  4. Update the ChangeLog.
  5. Commit both files with git: git add file.docx file.md, then git commit

We can use this to collectively write the documentation :)

Or a group Google Doc, I don't mind!
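
If we go the pandoc route, the conversion step could be wrapped in a small helper so everyone runs it the same way. This is a minimal sketch, assuming pandoc is installed and on the PATH; the function names are made up and the default file names are the ones from the steps above.

```python
import subprocess


def build_pandoc_command(source, target):
    """Build the pandoc call; formats are inferred from the file extensions."""
    return ["pandoc", source, "-o", target]


def convert(source="file.docx", target="file.md"):
    """Convert the Word document to Markdown; raises if pandoc fails."""
    subprocess.run(build_pandoc_command(source, target), check=True)
```

The resulting Markdown file is what shows up readably in git diffs, which is the main reason to commit it alongside the .docx.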

R - to do list

  • make the script fully soft-coded
  • derive names for graphs/variables etc from myArgs group list
  • make it work with different numbers of input groups?
  • introduce new functionality? perhaps gene ontology visualisations
