pgltools's People

Contributors

billgreenwald, niemasd

pgltools's Issues

Add strand information to each pair

I don't know whether this is useful, but some RNA-seq pairing data is coming out, and it might be worth applying these tools to that field as well. For that to work, we need a way to specify strand information in the base file format.

Sorting command is not available

Hi, I am trying to use
PyGLtools/intersect.py

but I keep getting the following error:
File A is not sorted. Please use pgltools sort [FILE]

The top of file A is shown below.

I cannot find the sort or formatbedpe command among the installed files:
init.py closest.pyc condense.py coverage.pyc formatTripSparse.py intersect1D.pyc pgltools_library.py subtract.pyc
init.pyc closest1D.py condense.pyc expand.py formatTripSparse.pyc juicebox.py pgltools_library.pyc subtract1D.py
browser.py closest1D.pyc conveRt.py expand.pyc intersect.py juicebox.pyc samTopgl.py subtract1D.pyc
browser.pyc column_flip.py conveRt.pyc findLoops.py intersect.pyc merge.py samTopgl.pyc window.py
closest.py column_flip.pyc coverage.py findLoops.pyc intersect1D.py merge.pyc subtract.py window.pyc

#chrA startA stopA chrB startB stopB AnnotationA AnnotationB
chr1 6405000 6420000 chr1 6430000 6445000 chr1:6405000:6420000 chr1:6430000:6445000
chr1 6340000 6355000 chr1 6470000 6485000 chr1:6340000:6355000 chr1:6470000:6485000
chr1 6470000 6485000 chr1 6485000 6500000 chr1:6470000:6485000 chr1:6485000:6500000
chr1 6485000 6500000 chr1 6545000 6560000 chr1:6485000:6500000 chr1:6545000:6560000
chr1 6630000 6645000 chr1 6695000 6710000 chr1:6630000:6645000 chr1:6695000:6710000
chr1 6720000 6735000 chr1 6735000 6750000 chr1:6720000:6735000 chr1:6735000:6750000
chr1 6720000 6735000 chr1 7740000 7755000 chr1:6720000:6735000 chr1:7740000:7755000
chr1 9575000 9590000 chr1 9860000 9875000 chr1:9575000:9590000 chr1:9860000:9875000
chr1 10205000 10220000 chr1 10295000 10310000 chr1:10205000:10220000 chr1:10295000:10310000
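
In the rows above, the second entry (start 6340000) comes after the first (start 6405000), which is consistent with the "not sorted" message. As a stopgap when the sort command is not at hand, the file can be ordered in plain Python. This is a minimal sketch that assumes ordering by all six coordinate columns. The exact order that pgltools sort produces is not confirmed here, and sort_pgl is a hypothetical helper, not part of the package.

```python
def sort_pgl(lines):
    """Return header lines followed by entries sorted on the six coordinate columns.

    Assumption: chromosomes compare as strings and coordinates as integers.
    This may differ from the order that pgltools itself expects.
    """
    header = [ln for ln in lines if ln.startswith("#")]
    rows = [ln.split() for ln in lines if ln.strip() and not ln.startswith("#")]
    rows.sort(key=lambda r: (r[0], int(r[1]), int(r[2]), r[3], int(r[4]), int(r[5])))
    return header + ["\t".join(r) for r in rows]


example = [
    "#chrA startA stopA chrB startB stopB",
    "chr1 6405000 6420000 chr1 6430000 6445000",
    "chr1 6340000 6355000 chr1 6470000 6485000",
]
sorted_lines = sort_pgl(example)
print("\n".join(sorted_lines))
```

Extra annotation columns are carried along unchanged because only the first six fields take part in the sort key.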

intersect1D "UnboundLocalError: local variable 'Aannots' referenced before assignment" error triggered by Inter-chromosomal interactions

Hi,

Thanks for making the tool available to the community. I noticed a bug when running intersect1D: the error is raised when the -b file contains a target that intersects an inter-chromosomal interaction. A workaround I found is to add a dummy intra-chromosomal interaction to both the -a and -b files, after which the error is not triggered. [pgltools 2.2.0]

  • intra-chrom interactions only bedpe
cat toy1.bedpe 
chr1	710000	715000	chr1	935000	940000	
chr1	825000	830000	chr1	935000	940000
  • intra and inter chrom interactions bedpe
cat toy2.bedpe 
chr1	710000	715000	chr1	935000	940000	
chr1	825000	830000	chr1	935000	940000
chr1	920000	925000	chr2	935000  940000
  • bed region corresponds to intra-chrom region
cat targ1.bed 
chr1    935000  940000
chr2    935000  940000
  • bed region corresponds to inter-chrom region
cat targ2.bed 
chr2    935000  940000
  • intersect intra-chrom with intra-chrom only file -> works
pgltools intersect1D -a toy1.bedpe -b targ1.bed -wa
chr1	710000	715000	chr1	935000	940000	B	
chr1	825000	830000	chr1	935000	940000	B	
  • intersect intra-chrom target with intra and inter chrom file -> works
pgltools intersect1D -a toy2.bedpe -b targ1.bed -wa
chr1	710000	715000	chr1	935000	940000	B	
chr1	825000	830000	chr1	935000	940000	B	
chr1	920000	925000	chr2	935000	940000	B	
  • intersect inter-chrom target with intra and inter chrom file -> error
pgltools intersect1D -a toy2.bedpe -b targ2.bed -wa
Traceback (most recent call last):
  File "intersect1D.py", line 403, in <module>
    intersect1D(A,B,args,header,headerB)
  File "intersect1D.py", line 272, in intersect1D
    res=_overlap1D(A,B,args['bA'],args['allA'],args['wa'],args['d'],args['v'],args['wb'],args['u'])
  File "intersect1D.py", line 258, in _overlap1D
    newPeaks.append([[chr1,start1,end1,chr2,start2,end2,"B"],Aannots])
UnboundLocalError: local variable 'Aannots' referenced before assignment
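
The traceback and the workaround together suggest a variable that is assigned on only one branch and read on every branch. The following is a minimal sketch of that general pattern, not the pgltools source: buggy, fixed and a_annots are hypothetical names, and whether intersect1D.py is structured this way is an assumption.

```python
def buggy(entries):
    """Assigns a_annots only for intra-chromosomal entries, then always reads it."""
    out = []
    for chr_a, chr_b, annots in entries:
        if chr_a == chr_b:
            a_annots = annots  # never reached for an inter-chromosomal entry
        out.append((chr_a, chr_b, a_annots))
    return out


def fixed(entries):
    """Gives a_annots a default on every iteration before any branch runs."""
    out = []
    for chr_a, chr_b, annots in entries:
        a_annots = []
        if chr_a == chr_b:
            a_annots = annots
        out.append((chr_a, chr_b, a_annots))
    return out


inter_only = [("chr1", "chr2", ["x"])]
try:
    buggy(inter_only)
    raised = False
except UnboundLocalError:
    raised = True

# An earlier intra-chromosomal entry hides the bug by leaving a stale value behind.
with_dummy = buggy([("chr1", "chr1", ["d"]), ("chr1", "chr2", ["x"])])
repaired = fixed(inter_only)
print(raised, with_dummy, repaired)
```

In this sketch the dummy entry only masks the problem: the inter-chromosomal entry silently inherits the annotations of the previous row, so results obtained through the workaround may be worth checking by hand.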

pgltools formatbedpe adds a trailing tab at the end of each line

cat -A myfile.pgl
chrX^I100040000^I100041800^IchrX^I99892988^I99896988$
chrX^I100046800^I100048000^IchrX^I99892988^I99896988$
chrX^I100128800^I100130000^IchrX^I99892988^I99896988$
chrX^I99749000^I99751000^IchrX^I99892988^I99896988$
chrX^I99851000^I99853000^IchrX^I99892988^I99896988$
chrX^I99854000^I99856200^IchrX^I99892988^I99896988$
chrX^I99858800^I99859600^IchrX^I99892988^I99896988$
chrX^I99863600^I99865600^IchrX^I99892988^I99896988$
chrX^I99866800^I99867800^IchrX^I99892988^I99896988$
chrX^I99868400^I99868600^IchrX^I99892988^I99896988$

pgltools formatbedpe | cat -A
chrX^I99749000^I99751000^IchrX^I99892988^I99896988^I$
chrX^I99851000^I99853000^IchrX^I99892988^I99896988^I$
chrX^I99854000^I99856200^IchrX^I99892988^I99896988^I$
chrX^I99858800^I99859600^IchrX^I99892988^I99896988^I$
chrX^I99863600^I99865600^IchrX^I99892988^I99896988^I$
chrX^I99866800^I99867800^IchrX^I99892988^I99896988^I$
chrX^I99868400^I99868600^IchrX^I99892988^I99896988^I$
chrX^I99892988^I99896988^IchrX^I100040000^I100041800^I$
chrX^I99892988^I99896988^IchrX^I100046800^I100048000^I$
chrX^I99892988^I99896988^IchrX^I100128800^I100130000^I$

Thanks,
Tommy
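
Until the output is changed, the extra tab can be removed in a post-processing step. The sketch below also shows one way such a tab can arise, namely joining the coordinates with an empty annotation field. That cause is a guess and has not been checked against the formatbedpe code, and strip_trailing_tabs is a hypothetical helper.

```python
def strip_trailing_tabs(lines):
    """Remove the newline and any tab characters left at the end of each line."""
    return [line.rstrip("\n").rstrip("\t") for line in lines]


# One way a trailing tab can appear: joining with an empty final field.
coords = ["chrX", "99749000", "99751000", "chrX", "99892988", "99896988"]
with_empty_annotation = "\t".join(coords + [""])

cleaned = strip_trailing_tabs([with_empty_annotation + "\n"])
print(repr(with_empty_annotation))
print(repr(cleaned[0]))
```

Note that this also drops empty annotation columns at the end of a line, so it is only safe when those trailing columns carry no meaning downstream.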

SyntaxError: Missing parentheses in call to 'print'.

File "/software/pgltools/sh/../Python/intersect.py", line 265
print "stdin can only be used for either a or b"
^
SyntaxError: Missing parentheses in call to 'print'. Did you mean print("stdin can only be used for either a or b")?

Could this be because the package was not installed successfully?
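
This message is what a Python 3 interpreter reports when it meets a Python 2 print statement, so it points to the interpreter version rather than to a failed installation. A minimal sketch of the form Python 3 accepts, using the message from the traceback:

```python
import sys

message = "stdin can only be used for either a or b"

# Function-call form: required on Python 3, and also accepted on
# Python 2 when a single argument is printed.
print(message)

# Reports which major version is running the script.
running_python3 = sys.version_info[0] >= 3
print("Python 3 interpreter:", running_python3)
```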

Migrate to Python 3, add test suite and real test data using pytest, maybe add CI/CD (maybe unneeded)

If anyone wants to do this, they are welcome to; otherwise it'll come slowly over time.

  • Upgrade to python 3
  • Add pytest support for existing test files
  • Add pytest support for actual functionalized refactor
  • Add pytests to cover all edge cases of package by creating better test files
  • Rename from master to dev for actual use branch
  • Tag some versions nicely instead of just having massive push logs
  • Add a changelog
  • Protect the dev branch and use PRs, even if self-approved, to ensure tests are run properly.
  • Add github actions for running tests on PRs
  • Change tests to source module locally so they pass github actions

formatbedpe error

Hello,

I'm a new user of pgltools. Thanks for developing this tool.

I tried 'pgltools formatbedpe myfile.bedpe > myfile.pgl', but I got the following error message:

Traceback (most recent call last):
  File "/home/kate146/pgltools-master/sh/../Python/column_flip.py", line 93, in <module>
    header,A=_processFile(args['a'],args['stdInA'])
  File "/home/kate146/pgltools-master/sh/../Python/column_flip.py", line 31, in _processFile
    return header[:-1], [[x[0],int(x[1]),int(x[2]),x[3],int(x[4]),int(x[5]), x[6:]] for x in lines]
ValueError: invalid literal for int() with base 10: 'x1'

myfile.bedpe is
chr10 97990000 97995000 chr10 98160000 98165000
chr10 95360000 95365000 chr10 95495000 95500000
chr10 82275000 82280000 chr10 82325000 82330000
chr10 79285000 79290000 chr10 79335000 79340000
chr10 80460000 80465000 chr10 80505000 80510000
chr10 60245000 60250000 chr10 60285000 60290000

Could you please help me with this issue? I'd appreciate it very much.

Kate
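
The value 'x1' does not occur in the six rows shown, so it presumably comes from another line of the file, for example a column-name row that is not commented out with "#". That is an assumption. The sketch below lists every line whose coordinate columns are not plain integers, which should locate the offending row. find_bad_rows is a hypothetical helper, and the first example line is made up for illustration.

```python
def find_bad_rows(lines):
    """Return (line_number, text) for rows whose columns 2, 3, 5 and 6 are not integers."""
    bad = []
    for number, line in enumerate(lines, start=1):
        if line.startswith("#") or not line.strip():
            continue  # skip commented header lines and blank lines
        fields = line.split()
        coords = [fields[i] for i in (1, 2, 4, 5) if i < len(fields)]
        if len(coords) < 4 or not all(c.isdigit() for c in coords):
            bad.append((number, line.rstrip("\n")))
    return bad


example = [
    "chr1 x1 x2 chr2 y1 y2",  # illustrative uncommented header row
    "chr10 97990000 97995000 chr10 98160000 98165000",
]
bad_rows = find_bad_rows(example)
print(bad_rows)
```

Running it over the real file, for instance with find_bad_rows(open("myfile.bedpe")), would print the line numbers to inspect.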

TypeError: formatbedpe() missing 1 required positional argument: 'header'

Hello!
I ran into a problem when converting a bedpe file to pgl format using pgltools formatbedpe. Can you help me?
The error is as follows:
(base) [zhanglp@bogon test_pgl]$ pgltools formatbedpe test.bedpe
Traceback (most recent call last):
  File "/43t/zhanglp/biosoft/pgltools/pgltools-3.0.1/sh/../Module/PyGLtools/column_flip.py", line 130, in <module>
    formatbedpe(A)
TypeError: formatbedpe() missing 1 required positional argument: 'header'

The test.bedpe file follows the same layout as your Proper Formatting example.

Question about pgltools merge

Hi :
I'm trying the loop-calling method from your article https://doi.org/10.1038/s41467-019-08940-5, and this method is very useful and accurate!
First, I called loops at each resolution using juicer (5 kb to 25 kb, at 1 kb bin-size intervals). Then I combined the loop calls from all resolutions using cat all_res_loop/*/merged_loops.bedpe | grep -v "#" > allres.loop, and merged loops within 20 kb using pgltools formatbedpe allres.loop > allres_loop.pgl and pgltools merge -a all_loop.pgl -noH -d 20000 > merged_all_loop.pgl.
In the picture, the location of the merged loops is accurate, but some loop anchors are very long. I downloaded the merged loops from your article and found that the maximum size of those loop anchors is 60 kb. Could you give me some suggestions?
image

Thanks in advance !
Best wishes
Qianzhao
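
To see how many merged loops are affected, the anchor sizes can be measured directly from the pgl output. This is a minimal sketch that flags entries where either anchor exceeds the 60 kb figure mentioned above. long_anchors is a hypothetical helper and the example coordinates are made up for illustration.

```python
def long_anchors(lines, max_size=60000):
    """Return (line, size_a, size_b) for entries where either anchor exceeds max_size."""
    flagged = []
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.split()
        size_a = int(fields[2]) - int(fields[1])
        size_b = int(fields[5]) - int(fields[4])
        if max(size_a, size_b) > max_size:
            flagged.append((line.rstrip("\n"), size_a, size_b))
    return flagged


# Illustrative coordinates only, not taken from a real loop call set.
example = [
    "chr1\t1000000\t1025000\tchr1\t1200000\t1225000",
    "chr1\t2000000\t2110000\tchr1\t2400000\t2425000",
]
flagged = long_anchors(example)
print(flagged)
```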

Problems with pgltools merge

Hi, I'm new to the field, so sorry if I am missing something. I have a problem with pgltools merge. Using pgltools intersect1D, I obtained the following file:
cat file.bed | head -10

chr1 950000 975000 chr1 1200000 1225000 A
chr1 950000 975000 chr1 1200000 1225000 B
chr1 950000 975000 chr1 1200000 1225000 B
chr1 950000 975000 chr1 1200000 1225000 B
chr1 1050000 1075000 chr1 1150000 1175000 A
chr1 1050000 1075000 chr1 1150000 1175000 A
chr1 1050000 1075000 chr1 1150000 1175000 B
chr1 1050000 1075000 chr1 1150000 1175000 B
chr1 1050000 1075000 chr1 1150000 1175000 B
chr1 1075000 1100000 chr1 1150000 1175000 B

However, when I run pgltools merge, I get the following error:
programs/pgltools/sh/pgltools merge -a file.bed -o collapse -c 7 -noH > new_file.bed

Traceback (most recent call last):
  File "programs/pgltools/sh/../Python/merge.py", line 355, in <module>
    merge(A,args,header)
  File "programs/pgltools/sh/../Python/merge.py", line 270, in merge
    reMerge=_merge(sorted(res),[x-1 for x in cols],commands,args['d'])
TypeError: '<' not supported between instances of 'list' and 'str'
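
On Python 3, sorted() raises this error when two rows tie on every earlier column and the first differing column holds a list in one row and a string in the other. The input above contains many rows with identical coordinates, which makes such ties likely. The sketch below reproduces the situation with hypothetical rows and shows that sorting on the six coordinate columns alone avoids the comparison. Whether merge.py stores annotations in this mixed form is an assumption.

```python
rows = [
    ["chr1", 950000, 975000, "chr1", 1200000, 1225000, "A"],
    ["chr1", 950000, 975000, "chr1", 1200000, 1225000, ["B"]],
]

# The coordinates tie, so Python falls through to comparing "A" with ["B"].
try:
    sorted(rows)
    raised = False
except TypeError:
    raised = True

# Sorting on the coordinates only never compares the annotation column.
by_coords = sorted(rows, key=lambda r: tuple(r[:6]))
print(raised, by_coords)
```

Because Python's sort is stable, rows with identical coordinates keep their original relative order under the key-based version.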

performance for pgltools merge

I have a pgl file with over 1.5 million rows, and pgltools merge is taking a very long time to run.
Is there any way to speed it up?

Thanks for this useful tool!

Tommy

No "pgltools" executable

I installed pgltools as instructed in the README:

pip install PyGLtools

I can import the PyGLtools package within Python like in the example in the README:

import PyGLtools as pygl

However, in the Methods section, there are references to a pgltools executable, e.g.:

pgltools formatbedpe myFile.bedpe > output.pgl
pgltools formatTripSparse [options]
pgltools formatTripSparse -ts myFile.tripSparse -an myFile.annotations > output.pgl
pgltools browser myFile.pgl > output.bed
etc.

However, I get bash: pgltools: command not found when I try to run the pgltools command. How do I get the pgltools executable?
