jaspar-ucsc-tracks's People

Contributors

oriolfornes, robinvanderlee, tixii

jaspar-ucsc-tracks's Issues

About installation

Hi,
Thank you for developing JASPAR-UCSC-tracks.

I am very interested in using this program. However, when I try to install it by running the script install-pwmscan.sh with bash, the following error appears every time:

gcc -fPIC -O3 -std=gnu99 -W -Wall -o hashtable.o -c hashtable.c
make: gcc: Command not found
make: *** [Makefile:42: hashtable.o] Error 127

Could you help me?

Thank you!
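
The log above shows make failing with exit status 127, which shells use for "command not found": gcc is not on the PATH. Below is a minimal sketch (Python standard library only) for checking this before running the installer. The helper name missing_tools is hypothetical, and the tool list is an assumption based on the log, not a list of requirements taken from install-pwmscan.sh:

```python
import shutil

def missing_tools(tools=("gcc", "make")):
    """Return the tools from `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]

if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing build tools:", ", ".join(missing))
    else:
        print("gcc and make were found on the PATH")
```

If gcc is reported missing, installing a compiler through the system package manager (for example the build-essential package on Debian or Ubuntu) is the usual fix.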

mm10 build?

It would be great if you could also provide TFBS prediction tracks for mm10.

fetch -p parameter

@oriolfornes

Shouldn't the help text for this option in the fetch* script say "from jaspar2pfm.py" rather than "from jaspar2meme.py"?

parser.add_option("-p", action="store", type="string", dest="profiles_dir", help="Profiles directory (from jaspar2meme.py)", metavar="<profiles_dir>")

p-value clarification needed on scaling and interpretation

In JASPAR UCSC tracks I read that scores in the bigBed files are p-values which have been

(scaled between 0-1000, where 0 corresponds to p-value = 1 and 1000 to p-value ≤ 10^-10)

Can I use R's rescale function to recover the p-values from the scores? For instance, a score of 950 would then come from a p-value of 0.05:

library(scales)
rescale(950,c(1,10**-10),c(0,1000))
.05
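
The quoted description fixes only the two endpoints of the scale, so more than one mapping is consistent with it. The sketch below contrasts the linear reading used in the R snippet with a log-scaled reading, score = -100 * log10(p) capped at 1000. The log-scaled formula is an assumption (it matches both endpoints but is not stated in the quoted text), and both function names are hypothetical:

```python
def score_to_p_linear(score):
    """Linear reading: 0 -> 1, 1000 -> 1e-10 (what the rescale() call computes)."""
    return 1.0 + (score / 1000.0) * (1e-10 - 1.0)

def score_to_p_log(score):
    """Log reading (assumed): score = -100 * log10(p), so p = 10 ** (-score / 100)."""
    return 10.0 ** (-score / 100.0)

# Compare the two readings at a few scores.
for score in (0, 500, 950, 1000):
    print(score, score_to_p_linear(score), score_to_p_log(score))
```

Under the linear reading a score of 950 gives roughly 0.05; under the log reading it gives roughly 3.2e-10. The two interpretations differ by many orders of magnitude, so it matters which one the tracks actually use.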

In any case, I am having trouble effectively interpreting the p_value.

In "PWMScan: a fast tool for scanning entire genomes with a position-specific weight matrix" I read:

The P-value of a PWM score x is defined as the probability that a random k-mer sequence of the length of the PWM has a binding score ≥ x given the base composition of the genome.
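
To make the quoted definition concrete, the sketch below computes the P-value of a score exactly by enumerating every k-mer for a toy 3-position matrix. The matrix values, the uniform base composition, and the function names are made up for illustration. This is not PWMScan's own algorithm, and enumeration is only feasible for short motifs:

```python
from itertools import product

BASES = "ACGT"

# Toy PWM: one dict of per-base scores per position (made-up values).
TOY_PWM = [
    {"A": 2, "C": -1, "G": -1, "T": -2},
    {"A": -2, "C": 2, "G": -1, "T": -1},
    {"A": -1, "C": -1, "G": 2, "T": -2},
]

def kmer_score(pwm, kmer):
    """Sum the per-position scores of a k-mer."""
    return sum(column[base] for column, base in zip(pwm, kmer))

def kmer_probability(kmer, base_freqs):
    """Probability of a k-mer under independent bases with the given frequencies."""
    prob = 1.0
    for base in kmer:
        prob *= base_freqs[base]
    return prob

def p_value(pwm, threshold, base_freqs):
    """P(score >= threshold) for a random k-mer drawn from the base composition."""
    total = 0.0
    for kmer in product(BASES, repeat=len(pwm)):
        if kmer_score(pwm, kmer) >= threshold:
            total += kmer_probability(kmer, base_freqs)
    return total

uniform = {base: 0.25 for base in BASES}
best = kmer_score(TOY_PWM, "ACG")        # highest possible score for this matrix: 6
print(p_value(TOY_PWM, best, uniform))   # only "ACG" reaches it, so 1/64 = 0.015625
```

Within one matrix the P-value decreases as the score increases, so a smaller P-value does mean a better match to that motif. Because the score distribution depends on the matrix, the same P-value threshold corresponds to a different raw score for each motif.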

I would hope to find some measure of how well the sequence at a candidate (or putative) binding site identified by PWMScan matches the motif PWM. This does not seem to provide that. Or am I mistaken?

I would like the possibility of being more stringent in selecting candidates from this track by setting a threshold on the score. However, I am hesitant to adopt this approach, as thresholding on the scaled P-value could introduce a bias toward a subset of the universe of motifs. Is my reasoning suspect here?

edit: Perhaps another way of getting at this is to ask: do all the motifs have the same distribution of P-values? If they do, then thresholding across the board at any given P-value should remove an equal fraction of each motif's hits. Do you know if they do?

Error cannot convert float NaN to integer

Hello! Thank you for always answering the issues.

I had some problems installing JASPAR, but I thought I had finally resolved them. However, when I tried to run the example command:
./scan-sequence.py genomes/sacCer3/sacCer3.fa profiles/ -o tracks/sacCer3/ --threads 4 --taxon fungi

I got the error ValueError: cannot convert float NaN to integer.

My installation process was:
git clone https://github.com/wassermanlab/JASPAR-UCSC-tracks
bash install-pwmscan.sh
conda env create -f ./conda/environment.yml
mv pwmscan/* JASPAR-UCSC-tracks/ (I did this because there was an error about where matrix_scan is located)

run ./scan-sequence.py genomes/sacCer3/sacCer3.fa profiles/ -o tracks/sacCer3/ --threads 4 --taxon fungi

I'm using JASPAR version 1.0

Thank you so much!
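
For reference, Python raises this ValueError whenever int() or round() is called on a NaN float. The sketch below reproduces the message and shows one way to guard such a conversion. The function name and the score formula (a -log10 p-value scaled by 100 and capped at 1000) are assumptions for illustration, not the code in scan-sequence.py:

```python
import math

def scaled_score(p_value):
    """Convert a p-value to an integer score, or None when it is not a usable number."""
    if math.isnan(p_value) or p_value <= 0:
        return None  # converting NaN to an integer would raise ValueError
    return min(1000, round(-100 * math.log10(p_value)))

# Reproduce the error message from the report.
try:
    int(float("nan"))
except ValueError as error:
    print(error)  # cannot convert float NaN to integer

print(scaled_score(1e-5))          # 500
print(scaled_score(float("nan")))  # None
```

A guard like this only hides the symptom. If NaN values appear in the scan output, the more useful question is which upstream step produced them, for example a missing or empty result from matrix_scan.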

Provided binaries do not work

The provided binaries of matrix_scan and matrix_probe do not work on any of my Linux systems (RH7 and Ubuntu 16.0x-20.0x).

Do you have a working Linux binary available?

mismatched taxon

Hi, thank you for the great resources.

I'm looking at the hg38 tracks at http://expdata.cmmt.ubc.ca/JASPAR/downloads/UCSC_tracks/2022/hg38/
and found some motifs that seem unrelated to Vertebrata. For example, MA2020 is from Arabidopsis thaliana and MA1879 is from Ciona intestinalis (these are just examples; there are more).

I see the option --taxon vertebrate for hg38 in your code (https://github.com/wassermanlab/JASPAR-UCSC-tracks/blob/master/scan-sequences.sh), so I assumed that only motifs linked to vertebrates are included.

Since the tracks for other genome versions, like hg19 and mm10, also include non-vertebrate motifs, I wondered whether this is intended or a mistake.

Thank you!
Nana
