
cgat-apps's People

Contributors

acribbs, andreasheger, andreashegergenomics, antoniojbt, charlie-george, iansudbery, jaime11, jscaber, katybrown, kevinrue, logust79, nickilott, reshu23, sebastian-luna-valero, snsansom, tim-hu, tomsmithcgat

cgat-apps's Issues

quicksect has no attribute before_interval

I see that the use of bx-python has been replaced with quicksect in IndexedGenome. However, this is causing gtf2table to raise a quicksect.IntervalTree has no attribute before_interval error. This is coming from the Quicksect class in IndexedGenome.

This is because the API of quicksect's IntervalTree differs from bx-python's. I think the equivalents of before_interval and after_interval are left and right. But because the documentation of both packages is a bit ropey, I can't be sure.
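If left and right really are the counterparts, a thin wrapper would let existing callers keep the bx-python names. This is only a sketch under that assumption: BxStyleTree and StubTree are hypothetical names, the keyword arguments n and max_dist are assumed rather than checked against quicksect, and the stub stands in for quicksect.IntervalTree so the example runs without it.

```python
class BxStyleTree:
    """Give a quicksect-style tree the method names bx-python callers expect."""

    def __init__(self, tree):
        self._tree = tree

    def before_interval(self, interval, num_intervals=1, max_dist=2500):
        # Assumption: `left` is the counterpart of bx-python's `before_interval`.
        return self._tree.left(interval, n=num_intervals, max_dist=max_dist)

    def after_interval(self, interval, num_intervals=1, max_dist=2500):
        # Assumption: `right` is the counterpart of bx-python's `after_interval`.
        return self._tree.right(interval, n=num_intervals, max_dist=max_dist)


class StubTree:
    """Stand-in for quicksect.IntervalTree so the sketch runs without it."""

    def left(self, interval, n=1, max_dist=2500):
        return [("left", interval, n, max_dist)]

    def right(self, interval, n=1, max_dist=2500):
        return [("right", interval, n, max_dist)]


wrapped = BxStyleTree(StubTree())
upstream = wrapped.before_interval((100, 200), num_intervals=2)
downstream = wrapped.after_interval((100, 200))
```

Whether the results come back in the same order and with the same distance semantics as bx-python would still need checking against real data.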

GO.py still using Rpy2

GO.py is still using rpy2 to adjust p-values, which means that rpy2, R, et al. are still dependencies.
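For what it's worth, a Benjamini-Hochberg adjustment is short enough to write without R. The sketch below is pure Python; adjust_bh is a hypothetical name, not a function in GO.py, and it assumes BH is the correction wanted, which the issue does not state.

```python
def adjust_bh(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    # Visit p-values from largest to smallest so a running minimum
    # enforces monotonicity of the adjusted values.
    order = sorted(range(n), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for offset, index in enumerate(order):
        rank = n - offset  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[index] * n / rank)
        adjusted[index] = running_min
    return adjusted


example = adjust_bh([0.01, 0.04, 0.03, 0.005])
```

The results should be compared against the current rpy2 output on real GO results before swapping anything over.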

gtf2gtf --help error

After installing cgat-apps from conda in a fresh env, I get

Traceback (most recent call last):
  File "/shared/sudlab1/General/projects/test_project/env/bin/cgat", line 11, in <module>
    sys.exit(main())
  File "/shared/sudlab1/General/projects/test_project/env/lib/python3.7/site-packages/cgat/cgat.py", line 132, in main
    module.main(sys.argv)
  File "/shared/sudlab1/General/projects/test_project/env/lib/python3.7/site-packages/cgat/tools/gtf2gtf.py", line 540, in main
    (args) = E.start(parser, argv=argv)
  File "/shared/sudlab1/General/projects/test_project/env/lib/python3.7/site-packages/cgatcore/experiment.py", line 1213, in start
    global_args, unknown = parser.parse_known_args(argv[1:])
  File "/shared/sudlab1/General/projects/test_project/env/lib/python3.7/argparse.py", line 1787, in parse_known_args
    namespace, args = self._parse_known_args(args, namespace)
  File "/shared/sudlab1/General/projects/test_project/env/lib/python3.7/argparse.py", line 1993, in _parse_known_args
    start_index = consume_optional(start_index)
  File "/shared/sudlab1/General/projects/test_project/env/lib/python3.7/argparse.py", line 1933, in consume_optional
    take_action(action, args, option_string)
  File "/shared/sudlab1/General/projects/test_project/env/lib/python3.7/argparse.py", line 1861, in take_action
    action(self, namespace, argument_values, option_string)
  File "/shared/sudlab1/General/projects/test_project/env/lib/python3.7/argparse.py", line 1043, in __call__
    parser.print_help()
  File "/shared/sudlab1/General/projects/test_project/env/lib/python3.7/argparse.py", line 2481, in print_help
    self._print_message(self.format_help(), file)
  File "/shared/sudlab1/General/projects/test_project/env/lib/python3.7/argparse.py", line 2465, in format_help
    return formatter.format_help()
  File "/shared/sudlab1/General/projects/test_project/env/lib/python3.7/argparse.py", line 284, in format_help
    help = self._root_section.format_help()
  File "/shared/sudlab1/General/projects/test_project/env/lib/python3.7/argparse.py", line 215, in format_help
    item_help = join([func(*args) for func, args in self.items])
  File "/shared/sudlab1/General/projects/test_project/env/lib/python3.7/argparse.py", line 215, in <listcomp>
    item_help = join([func(*args) for func, args in self.items])
  File "/shared/sudlab1/General/projects/test_project/env/lib/python3.7/argparse.py", line 215, in format_help
    item_help = join([func(*args) for func, args in self.items])
  File "/shared/sudlab1/General/projects/test_project/env/lib/python3.7/argparse.py", line 215, in <listcomp>
    item_help = join([func(*args) for func, args in self.items])
  File "/shared/sudlab1/General/projects/test_project/env/lib/python3.7/argparse.py", line 531, in _format_action
    help_text = self._expand_help(action)
  File "/shared/sudlab1/General/projects/test_project/env/lib/python3.7/argparse.py", line 620, in _expand_help
    return self._get_help_string(action) % params
TypeError: %i format: a number is required, not dict
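The last frame shows argparse expanding a help string with `help % params`, where params is a dict of the option's attributes. A bare `%i` in a help string would therefore be formatted against a dict, which matches the TypeError above. A minimal sketch of help strings that survive this expansion follows; the option names are made up for illustration and are not taken from gtf2gtf.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="demo")
    # Named specifiers such as %(default)i are looked up in the dict
    # argparse passes to the % operator, so they expand cleanly.
    parser.add_argument(
        "--min-size", type=int, default=10,
        help="minimum feature size [default=%(default)i]")
    # A literal percent sign has to be written as %% in a help string.
    parser.add_argument(
        "--min-overlap", type=float, default=0.5,
        help="require at least 50%% overlap")
    return parser


help_text = build_parser().format_help()
```

If this is the cause, searching the gtf2gtf option definitions for a help string with an unnamed `%i`, `%s` or a bare `%` should find the offending option.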

New release

Hi @sebastian-luna-valero, I am in the process of packaging an environment and I require some of the newer functionality of cgat-apps in bioconda. How easy would it be to make a new release and update the conda package?

I ask because I know you're part of the bioconda team.

Understanding `min_kmer_matches` in `fastqtools.filter_by_sequence`

I am trying to understand the default value for min_kmer_matches in fastqtools.py : filter_by_sequence. The default value for min_kmer_matches is 20 whereas the default kmer_size is 10. As far as I follow the code, the maximum number of bases that can match the k-mer can hence also only be 10. Am I missing something, or do the defaults need adjustment? I would appreciate a second opinion. The relevant file is here.
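One possible reading, offered as a guess rather than a description of the actual code: if min_kmer_matches counts matching k-mers rather than matching bases, a threshold of 20 is reachable, because a read of length L contains L - k + 1 overlapping k-mers. The sketch below illustrates that counting; count_kmer_matches and kmers_of are hypothetical helpers, not functions from fastqtools.py.

```python
def kmers_of(sequence, kmer_size):
    """Return the set of all overlapping k-mers in a sequence."""
    return {sequence[i:i + kmer_size]
            for i in range(len(sequence) - kmer_size + 1)}


def count_kmer_matches(read, reference_kmers, kmer_size):
    """Count positions in the read whose k-mer occurs in the reference set."""
    return sum(
        1
        for i in range(len(read) - kmer_size + 1)
        if read[i:i + kmer_size] in reference_kmers)


contaminant = "ACGTTGCAAGGCTTAACCGGTTAAGGCCTTGA"
reference = kmers_of(contaminant, kmer_size=10)
# A read identical to the 32-base contaminant has 32 - 10 + 1 = 23 k-mers.
matches = count_kmer_matches(contaminant, reference, kmer_size=10)
```

Under that reading the defaults are consistent for reads longer than 29 bases; under the per-base reading in the issue they are not.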

KeyError: 'ID' on gff32gtf utility

I am trying to convert two gff3 files to gtf using CGAT's gff32gtf utility:
ftp://ftp.ensemblgenomes.org/pub/plants/release-43/gff3/zea_mays (25798 KB)
ftp://ftp.ensemblgenomes.org/pub/plants/release-43/gff3/sorghum_bicolor/ (7098 KB)

input:

cgat gff32gtf -I Sorghum_bicolor.Sorghum_bicolor_NCBIv3.43_15-13-2019.chr.gff3 -S Sorghum_bicolor.Sorghum_bicolor_NCBIv3.43_15-13-2019.chr_cgatConverted.gtf

output:

2019-04-30 10:21:11,000 INFO output generated by gff32gtf -I Sorghum_bicolor.Sorghum_bicolor_NCBIv3.43_15-13-2019.chr.gff3 -S Sorghum_bicolor.Sorghum_bicolor_NCBIv3.43_15-13-2019.chr_cgatConverted.gtf
job started at Tue Apr 30 10:21:11 2019 on XPS-MFL -- 0f207b8b-7463-4ff8-bf55-dc86ba574ad8
pid: 5319, system: Linux 4.15.0-47-generic #50 16.04.1-Ubuntu SMP Fri Mar 15 16:06:21 UTC 2019 x86_64
2019-04-30 10:21:11,000 INFO by_chrom : False
discard : True
gene_field_or_pattern : ID
gene_type : gene
log_config_filename : None
loglevel : 1
method : hierarchy
missing_gene : True
parent : Parent
random_seed : None
read_twice : False
short_help : None
stderr : <_io.TextIOWrapper name='' mode='w' encoding='UTF-8'>
stdin : <_io.TextIOWrapper name='Sorghum_bicolor.Sorghum_bicolor_NCBIv3.43_15-13-2019.chr.gff3' mode='r' encoding='utf-8'>
stdlog : <_io.TextIOWrapper name='' mode='w' encoding='UTF-8'>
stdout : <_io.TextIOWrapper name='Sorghum_bicolor.Sorghum_bicolor_NCBIv3.43_15-13-2019.chr_cgatConverted.gtf' mode='w' encoding='utf-8'>
timeit_file : None
timeit_header : None
timeit_name : all
tracing : None
transcript_field_or_pattern : ID
transcript_type : mRNA
Traceback (most recent call last):
  File "/home/mfl/miniconda3/envs/cgat-apps[v0.5.3]/bin/cgat", line 11, in <module>
    sys.exit(main())
  File "/home/mfl/miniconda3/envs/cgat-apps[v0.5.3]/lib/python3.6/site-packages/cgat/cgat.py", line 132, in main
    module.main(sys.argv)
  File "/home/mfl/miniconda3/envs/cgat-apps[v0.5.3]/lib/python3.6/site-packages/cgat/tools/gff32gtf.py", line 352, in main
    convert_hierarchy(chunk, second_gff_chunk, options)
  File "/home/mfl/miniconda3/envs/cgat-apps[v0.5.3]/lib/python3.6/site-packages/cgat/tools/gff32gtf.py", line 193, in convert_hierarchy
    options.gene_field_or_pattern, gff['ID']),
  File "/home/mfl/miniconda3/envs/cgat-apps[v0.5.3]/lib/python3.6/site-packages/cgat/GTF.py", line 1043, in __getitem__
    return self.attributes[key]
KeyError: 'ID'
(cgat-apps[v0.5.3]) mfl@XPS-MFL:~/Documents/genome_info/sorghum_bicolor$

The error message doesn't tell me much. Do I need to pass any additional options for this to work properly?
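The traceback ends at gff['ID'], so at least one record in the input presumably has no ID attribute in column 9. A quick way to check is to tally feature types that lack it. The sketch below uses only the standard library; the function names and the two sample records are made up for illustration and are not taken from the Ensembl files.

```python
from collections import Counter


def parse_gff3_attributes(column9):
    """Split a GFF3 attribute column (key=value;key=value) into a dict."""
    attributes = {}
    for field in column9.strip().split(";"):
        if not field:
            continue
        key, _, value = field.partition("=")
        attributes[key.strip()] = value.strip()
    return attributes


def feature_types_missing(lines, key="ID"):
    """Count records per feature type that lack the given attribute."""
    missing = Counter()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue
        if key not in parse_gff3_attributes(fields[8]):
            missing[fields[2]] += 1
    return missing


# Hypothetical records: a gene with an ID and an exon with only a Parent.
sample = [
    "\t".join(["1", "src", "gene", "1", "100", ".", "+", ".", "ID=gene:G1"]),
    "\t".join(["1", "src", "exon", "1", "50", ".", "+", ".",
               "Parent=transcript:T1;Name=E1"]),
]
report = feature_types_missing(sample)
```

Running feature_types_missing over the lines of the real GFF3 file would show which feature types trigger the KeyError.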

Remove R dependencies

There is a strong case to remove R dependencies from cgat-apps to simplify installation.

Below is a review where R is currently used:

GWAS.py: plotting: Manhattan plot and QQplot, hierarchical clustering.
=> GWAS.py used by several tools

WrapperIDR.py: performing IDR analysis, packaged in WrapperIDR.r
=> could be replaced by python version of IDR
=> only used by obsolete pipelines

Stats.py: FDR computation with R.p_adjust and doFDR (a qvalue reimplementation), plus R.wilcox_test
=> doFDR could be replaced by doFDRPython (double check)
=> R.p_adjust could be replaced by methods in statsmodels
=> wilcox_test is not widely used in pipelines, could be deprecated or replaced (scipy)

Expression.py: used to run DESeq and edgeR; more legacy, has been partially superseded by Counts.py.
Used by runExpression and runMEDIPS as well as counts2counts and counts2table.

Counts.py: used to run DESeq, edgeR, PCA and plotting. Used only by counts2table and counts2counts.

Biomart.py: used to collect data from biomart using the bioconductor module.
Biomart.py is used by counts2table and various obsolete pipelines (annotations, variants, KEGG)

A suggestion to resolve this is:

  1. make a cgat-gwas repository with the GWAS module and associated tools

  2. replace FDR computation with statsmodels equivalents

  3. move the Expression.py, Counts.py and Biomart.py modules, as well as the dependent tools
    runExpression, counts2counts and counts2table, into the cgat-flow repository OR the cgat-scrnaseq repository.

cannot install cgat-apps as a CGAT developer

Installation as a CGAT developer is currently failing for me because the curl commands in install.sh try to download conda environment files (cgat-*.yml) that do not exist (yet?).

In short, there seem to be some YAML files of environment specifications missing from GitHub.

Specifically, in this else block

else

all the CONDA_INSTALL_TYPE_APPS= and CONDA_INSTALL_TYPE_CORE= instances seem to point to files (e.g., *-production.yml, *-devel.yml) that are not available yet when resolving the commands here:

curl -o env-apps.yml -O https://raw.githubusercontent.com/cgat-developers/cgat-apps/${TRAVIS_BRANCH}/conda/environments/${CONDA_INSTALL_TYPE_APPS}

For instance, for cgat-apps the --devel flag sets CONDA_INSTALL_TYPE_APPS=apps-devel.yml.
With TRAVIS_BRANCH=master, this resolves to

https://raw.githubusercontent.com/cgat-developers/cgat-apps/master/conda/environments/apps-devel.yml

while only cgat-apps.yml currently exists at the following link.
https://raw.githubusercontent.com/cgat-developers/cgat-apps/master/conda/environments/cgat-apps.yml

Those raw.githubusercontent.com URLs point to the following folder in the GitHub repository:
https://github.com/cgat-developers/cgat-apps/tree/master/conda/environments

That said, separately, I have managed to install cgat-apps as a regular user (i.e., not CGAT developer) using the recommended installation (that requires a pre-existing Miniconda installation, obviously)

conda install -c conda-forge -c bioconda cgat-apps

Any feedback welcome.
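To check which environment files actually resolve, the URL pattern from the curl command can be reproduced outside install.sh. This is a small sketch, assuming the pattern quoted above is the only one used; env_file_url and url_exists are hypothetical helper names, and url_exists needs network access, so it is defined but not called here.

```python
import urllib.error
import urllib.request

RAW_BASE = "https://raw.githubusercontent.com/cgat-developers/cgat-apps"


def env_file_url(branch, env_file):
    """Build the raw GitHub URL that install.sh would hand to curl."""
    return f"{RAW_BASE}/{branch}/conda/environments/{env_file}"


def url_exists(url, timeout=10):
    """Return True if a HEAD request for the URL answers with status 200."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.URLError:
        # Covers HTTP errors such as 404 as well as connection failures.
        return False


devel_url = env_file_url("master", "apps-devel.yml")
existing_url = env_file_url("master", "cgat-apps.yml")
```

Calling url_exists on both URLs would confirm which of the two files is present on the branch.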

bam_vs_bed error if bashrc outputs anything

bam_vs_bed starts a subshell, executes a command and captures its standard output. It assumes that stdout contains only the output of the bedtools intersect command it runs, but if your ~/.bashrc prints anything to stdout (which might not be under your control on a shared system), this breaks the output parsing and causes bam_vs_bed to fail with the following cryptic message:

Traceback (most recent call last):
  File "/shared/sudlab1/General/apps/conda/conda-install/envs/cgat-f/bin/cgat", line 11, in <module>
    load_entry_point('cgat', 'console_scripts', 'cgat')()
  File "/shared/sudlab1/General/apps/conda/cgat-apps/cgat/cgat.py", line 132, in main
    module.main(sys.argv)
  File "/shared/sudlab1/General/apps/conda/cgat-apps/cgat/tools/bam_vs_bed.py", line 256, in main
    iterate(iotools.force_str(proc.stdout)), key=sort_key):
  File "/shared/sudlab1/General/apps/conda/cgat-apps/cgat/tools/bam_vs_bed.py", line 253, in iterate
    yield data._make(line[:-1].split()[:take_columns])
  File "<string>", line 21, in _make
TypeError: Expected 17 arguments, got 3

Is there a way to avoid this altogether (change the way the environment is passed to the subshell, use a named pipe rather than stdout, or use a temporary file)? If not, is there a way to at least give a better error message?
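Two of those options can be sketched with the standard library: running the command from an argument list so that no shell (and hence no ~/.bashrc) is involved, sending its output to a temporary file, and checking the column count before building rows. This is only an illustration; run_to_lines and parse_rows are hypothetical names, and the Python one-liner stands in for the real bedtools call.

```python
import subprocess
import sys
import tempfile


def run_to_lines(argv):
    """Run a command without a shell and collect its stdout via a temp file."""
    with tempfile.TemporaryFile(mode="w+") as handle:
        # An argument list with the default shell=False starts no shell,
        # so shell startup files cannot add text to the output.
        subprocess.run(argv, stdout=handle, check=True)
        handle.seek(0)
        return handle.read().splitlines()


def parse_rows(lines, n_columns):
    """Split lines into fields, failing with a readable message if too short."""
    rows = []
    for lineno, line in enumerate(lines, 1):
        fields = line.split()
        if len(fields) < n_columns:
            raise ValueError(
                f"line {lineno}: expected {n_columns} columns, "
                f"got {len(fields)}: {line!r}")
        rows.append(fields[:n_columns])
    return rows


# Stand-in for the bedtools command: print one three-column line.
lines = run_to_lines([sys.executable, "-c", "print('chr1 10 20')"])
rows = parse_rows(lines, n_columns=3)
```

With a check like parse_rows in place, stray shell output would at least be reported with the offending line instead of a bare argument-count error.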

Wrong columns in `--sanitize="ucsc" --assembly-report=FILE`?

gff2gff --method=sanitize --assembly-report=FILE appears to use the wrong default columns. As far as I can tell, the Ensembl name is in column 0 and the UCSC name in column 9. Why does gff2gff use column 4 by default? That column contains the GenBank-Accn:

# Sequence-Name Sequence-Role   Assigned-Molecule       Assigned-Molecule-Location/Type GenBank-Accn    Relationship    RefSeq-Accn     Assembly-Unit   Sequence-Length UCSC-style-name
1       assembled-molecule      1       Chromosome      CM000663.2      =       NC_000001.11    Primary Assembly        248956422       chr1
2       assembled-molecule      2       Chromosome      CM000664.2      =       NC_000002.12    Primary Assembly        242193529       chr2
3       assembled-molecule      3       Chromosome      CM000665.2      =       NC_000003.12    Primary Assembly        198295559       chr3
4       assembled-molecule      4       Chromosome      CM000666.2      =       NC_000004.12    Primary Assembly        190214555       chr4
5       assembled-molecule      5       Chromosome      CM000667.2      =       NC_000005.10    Primary Assembly        181538259       chr5
6       assembled-molecule      6       Chromosome      CM000668.2      =       NC_000006.12    Primary Assembly        170805979       chr6
7       assembled-molecule      7       Chromosome      CM000669.2      =       NC_000007.14    Primary Assembly        159345973       chr7
8       assembled-molecule      8       Chromosome      CM000670.2      =       NC_000008.11    Primary Assembly        145138636       chr8
9       assembled-molecule      9       Chromosome      CM000671.2      =       NC_000009.12    Primary Assembly        138394717       chr9
10      assembled-molecule      10      Chromosome      CM000672.2      =       NC_000010.11    Primary Assembly        133797422       chr10
11      assembled-molecule      11      Chromosome      CM000673.2      =       NC_000011.10    Primary Assembly        135086622       chr11
12      assembled-molecule      12      Chromosome      CM000674.2      =       NC_000012.12    Primary Assembly        133275309       chr12
13      assembled-molecule      13      Chromosome      CM000675.2      =       NC_000013.11    Primary Assembly        114364328       chr13
14      assembled-molecule      14      Chromosome      CM000676.2      =       NC_000014.9     Primary Assembly        107043718       chr14
15      assembled-molecule      15      Chromosome      CM000677.2      =       NC_000015.10    Primary Assembly        101991189       chr15
16      assembled-molecule      16      Chromosome      CM000678.2      =       NC_000016.10    Primary Assembly        90338345        chr16
17      assembled-molecule      17      Chromosome      CM000679.2      =       NC_000017.11    Primary Assembly        83257441        chr17
18      assembled-molecule      18      Chromosome      CM000680.2      =       NC_000018.10    Primary Assembly        80373285        chr18
19      assembled-molecule      19      Chromosome      CM000681.2      =       NC_000019.10    Primary Assembly        58617616        chr19
20      assembled-molecule      20      Chromosome      CM000682.2      =       NC_000020.11    Primary Assembly        64444167        chr20
21      assembled-molecule      21      Chromosome      CM000683.2      =       NC_000021.9     Primary Assembly        46709983        chr21
22      assembled-molecule      22      Chromosome      CM000684.2      =       NC_000022.11    Primary Assembly        50818468        chr22
X       assembled-molecule      X       Chromosome      CM000685.2      =       NC_000023.11    Primary Assembly        156040895       chrX
Y       assembled-molecule      Y       Chromosome      CM000686.2      =       NC_000024.10    Primary Assembly        57227415        chrY

This works for human/mouse because we first copy the content of column 0 to column 4 for lines marked "assembled-molecule", but it doesn't work for genomes without assembled molecules.
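For comparison, mapping column 0 directly to column 9 needs no special case for assembled molecules. A minimal sketch, assuming a tab-separated report laid out like the excerpt above; build_contig_map is a hypothetical name and not the gff2gff implementation.

```python
def build_contig_map(lines, from_column=0, to_column=9):
    """Map names in one assembly-report column to names in another."""
    mapping = {}
    for line in lines:
        # Header and comment lines in the report start with '#'.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= max(from_column, to_column):
            continue
        source, target = fields[from_column], fields[to_column]
        # Assumption: the report writes 'na' where no name is available.
        if target != "na":
            mapping[source] = target
    return mapping


report_lines = [
    "# Sequence-Name\tSequence-Role\tAssigned-Molecule\t"
    "Assigned-Molecule-Location/Type\tGenBank-Accn\tRelationship\t"
    "RefSeq-Accn\tAssembly-Unit\tSequence-Length\tUCSC-style-name",
    "1\tassembled-molecule\t1\tChromosome\tCM000663.2\t=\t"
    "NC_000001.11\tPrimary Assembly\t248956422\tchr1",
    "X\tassembled-molecule\tX\tChromosome\tCM000685.2\t=\t"
    "NC_000023.11\tPrimary Assembly\t156040895\tchrX",
]
ensembl_to_ucsc = build_contig_map(report_lines)
```

Unplaced scaffolds would be handled by the same lookup as long as their rows carry a UCSC-style name.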

Bigwig files in bam2geneprofile

If I try to do a metagene profile over a Bigwig file using, for example:

cgat bam2geneprofile --bigwigfile=bigwig.bw --gtf-file=geneset.filtered.gtf.gz --reporter=gene --method=intervalprofile --normalize-transcript=total-sum --normalize-profile=area

I get the error:

Traceback (most recent call last):
  File "/shared/sudlab1/General/CGAT-FLOW/cgat-flow/conda-install/envs/cgat-flow/bin/cgat", line 11, in <module>
    load_entry_point('cgat', 'console_scripts', 'cgat')()
  File "/shared/sudlab1/General/CGAT-FLOW/cgat-flow/cgat-apps/cgat/cgat.py", line 132, in main
    module.main(sys.argv)
  File "/shared/sudlab1/General/projects/CGAT-FLOW/cgat-flow/cgat-apps/cgat/tools/bam2geneprofile.py", line 679, in main
    wigfiles = [BigWigFile(x) for x in options.infiles]
  File "/shared/sudlab1/General/CGAT-FLOW/cgat-flow/cgat-apps/cgat/tools/bam2geneprofile.py", line 679, in <listcomp>
    wigfiles = [BigWigFile(file=open(x)) for x in options.infiles]
NameError: name 'BigWigFile' is not defined

This is because BigWigFile is never imported. In fact, there is no BigWigFile class in pyBigWig (which is what I am assuming you are using from looking at the conda environment).

If I add import pyBigWig to the imports and change line 679 to

wigfiles = [pyBigWig.open(x) for x in options.infiles]

which is the correct way to open a bigwig in pyBigWig, then I get the error:

Traceback (most recent call last):
  File "/shared/sudlab1/General/CGAT-FLOW/cgat-flow/conda-install/envs/cgat-flow/bin/cgat", line 11, in <module>
    load_entry_point('cgat', 'console_scripts', 'cgat')()
  File "/shared/sudlab1/General/CGAT-FLOW/cgat-flow/cgat-apps/cgat/cgat.py", line 132, in main
    module.main(sys.argv)
  File "/shared/sudlab1/General/CGAT-FLOW/cgat-flow/cgat-apps/cgat/tools/bam2geneprofile.py", line 817, in main
    gtf_iterator)
  File "cgat/BamTools/geneprofile.pyx", line 1751, in cgat.BamTools.geneprofile.countFromGTF
  File "cgat/BamTools/geneprofile.pyx", line 685, in cgat.BamTools.geneprofile.IntervalsCounter.update
  File "cgat/BamTools/geneprofile.pyx", line 1610, in cgat.BamTools.geneprofile.RegionCounter.count
  File "cgat/BamTools/geneprofile.pyx", line 1046, in cgat.BamTools.geneprofile.GeneCounter.count
  File "cgat/BamTools/geneprofile.pyx", line 63, in cgat.BamTools.geneprofile.RangeCounter.getCounts
  File "cgat/BamTools/geneprofile.pyx", line 473, in cgat.BamTools.geneprofile.RangeCounterBigWig.count
AttributeError: 'pyBigWig.bigWigFile' object has no attribute 'get'

I'm not sure where to go from here.
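One stopgap might be a small adapter that offers the get method the counter calls, backed by pyBigWig's intervals method, which returns (start, end, value) tuples or None. This is a sketch under assumptions: that RangeCounterBigWig expects get(chrom, start, end) to return such tuples (as bx-python's BigWigFile did), and that nothing else in geneprofile.pyx depends on the old class. BigWigGetAdapter and StubHandle are hypothetical names; the stub replaces a real pyBigWig handle so the example runs without the library.

```python
class BigWigGetAdapter:
    """Wrap a pyBigWig-style handle and expose a bx-python-style get()."""

    def __init__(self, handle):
        self._handle = handle

    def get(self, chrom, start, end):
        # intervals() yields (start, end, value) tuples, or None when the
        # region holds no data; normalise both cases to a list.
        intervals = self._handle.intervals(chrom, start, end)
        return list(intervals) if intervals else []


class StubHandle:
    """Stand-in for the object returned by pyBigWig.open(path)."""

    def intervals(self, chrom, start, end):
        if chrom != "chr1":
            return None
        return ((0, 10, 1.5), (10, 20, 3.0))


# With the real library this would be BigWigGetAdapter(pyBigWig.open(path)).
wig = BigWigGetAdapter(StubHandle())
covered = wig.get("chr1", 0, 20)
empty = wig.get("chr2", 0, 20)
```

If the counter expects a different return shape, the conversion would belong inside get rather than in the Cython code.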

Exception: Problem running command: conda info --json

Currently running Linux Mint 18.3 on XFCE. I believe the error message is self-explanatory.

Installing cgat script to /home/sand/cgat-install/conda-install/envs/cgat-a/bin

Installed /home/sand/cgat-install/cgat-apps
Processing dependencies for cgat==0.5.3
Finished processing dependencies for cgat==0.5.3
Traceback (most recent call last):
  File "scripts/conda.py", line 47, in <module>
    (out, err) = run_command(statement)
  File "scripts/conda.py", line 41, in run_command
    raise Exception(issue)
Exception: Problem running command: conda info --json
Stderr was: b'/bin/bash: /home/sand/cgat-install/conda-install/envs/cgat-a/lib/libtinfo.so.5: no version information available (required by /bin/bash)\n'

##########################################################

An error occurred in:

  • line number: 418
  • exit status: 1
  • command: python scripts/conda.py

The script will abort now. User input was:

./install.sh --devel

Please copy and paste this error and report it via Git Hub:
https://github.com/cgat-developers/cgat-apps/issues

Debugging:
CFLAGS: -I/home/sand/cgat-install/conda-install/envs/cgat-a/include -L/home/sand/cgat-install/conda-install/envs/cgat-a/lib
CPATH: -I/home/sand/cgat-install/conda-install/envs/cgat-a/include -L/home/sand/cgat-install/conda-install/envs/cgat-a/lib
C_INCLUDE_PATH: :/home/sand/cgat-install/conda-install/envs/cgat-a/include
CPLUS_INCLUDE_PATH: :/home/sand/cgat-install/conda-install/envs/cgat-a/include
LIBRARY_PATH: :/home/sand/cgat-install/conda-install/envs/cgat-a/lib
LD_LIBRARY_PATH: :/home/sand/cgat-install/conda-install/envs/cgat-a/lib:/home/sand/cgat-install/conda-install/envs/cgat-a/lib/R/lib
CGAT_HOME: /home/sand/cgat-install
CONDA_INSTALL_DIR: /home/sand/cgat-install/conda-install
CONDA_INSTALL_TYPE_APPS: apps-devel.yml
CONDA_INSTALL_TYPE_CORE: core-devel.yml
CONDA_INSTALL_ENV: cgat-a
PYTHONPATH:
INSTALL_BRANCH: master
CORE_BRANCH: master
RELEASE:
CODE_DOWNLOAD_TYPE: 0

##########################################################

Error when installing CGAT

Hi,

I am trying to install CGAT and I'm getting the following error. Can you please help?

My previous code was:
qrshx
cd ~
wget https://github.com/cgat-developers/cgat-apps/archive/master.zip
mkdir /data/$USER/cgat_install
unzip master.zip
cd cgat-apps-master
./install.sh --production --location
#########################################################

An error occurred in:

  • line number: 879
  • exit status: 1
  • command: shift 2

The script will abort now. User input was:

./install.sh --production --location

Please copy and paste this error and report it via Git Hub:
https://github.com/cgat-developers/cgat-apps/issues

Debugging:
CFLAGS:
CPATH:
C_INCLUDE_PATH:
CPLUS_INCLUDE_PATH:
LIBRARY_PATH:
LD_LIBRARY_PATH:
CGAT_HOME:
CONDA_INSTALL_DIR:
CONDA_INSTALL_TYPE_APPS:
CONDA_INSTALL_TYPE_CORE:
CONDA_INSTALL_ENV:
PYTHONPATH:
INSTALL_BRANCH: master
CORE_BRANCH: master
RELEASE:
CODE_DOWNLOAD_TYPE: 0

##########################################################

bam2stats error - AttributeError: module 'pysam.libcalignmentfile' has no attribute 'PileupColumn'

Hi,

I installed cgat-apps in a fresh Conda environment as follows:

conda create -n cgat-apps
conda activate cgat-apps
conda install -c conda-forge -c bioconda cgat-apps

I did not use mamba as this gave me an error (mamba installs python 3.9.7 by default):

Looking for: ['cgat-apps']

bioconda/noarch          [====================] (00m:00s) Done
pkgs/main/linux-64       [====================] (00m:00s) Done
bioconda/linux-64        [====================] (00m:01s) Done
pkgs/r/noarch            [====================] (00m:00s) Done
pkgs/r/linux-64          [====================] (00m:00s) Done
pkgs/main/noarch         [====================] (00m:00s) Done
conda-forge/noarch       [====================] (00m:01s) Done
conda-forge/linux-64     [====================] (00m:05s) Done

Pinned packages:
  - python 3.9.*

Encountered problems while solving:
  - package cgat-apps-0.6.4-py38hc5b2f15_0 requires python >=3.8,<3.9.0a0, but none of the providers can be installed

When trying to run bam2stats, I get the following error:

Traceback (most recent call last):
  File "/home/lucy/conda/envs/cgat-apps/bin/cgat", line 11, in <module>
    sys.exit(main())
  File "/home/lucy/conda/envs/cgat-apps/lib/python3.8/site-packages/cgat/cgat.py", line 129, in main
    module = imp.load_module(command, file, pathname, description)
  File "/home/lucy/conda/envs/cgat-apps/lib/python3.8/imp.py", line 234, in load_module
    return load_source(name, filename, file)
  File "/home/lucy/conda/envs/cgat-apps/lib/python3.8/imp.py", line 171, in load_source
    module = _load(spec)
  File "<frozen importlib._bootstrap>", line 702, in _load
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 843, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/home/lucy/conda/envs/cgat-apps/lib/python3.8/site-packages/cgat/tools/bam2stats.py", line 315, in <module>
    from cgat.BamTools.bamtools import bam2stats_count
  File "cgat/BamTools/bamtools.pyx", line 1, in init cgat.BamTools.bamtools
AttributeError: module 'pysam.libcalignmentfile' has no attribute 'PileupColumn'

Do I require a different version of pysam?

conda list:

# packages in environment at /home/lucy/conda/envs/cgat-apps:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       1_gnu    conda-forge
aiohttp                   3.7.4.post0      py38h497a2fe_0    conda-forge
alignlib-lite             0.3              py38h9b08285_4    bioconda
apsw                      3.36.0.r1        py38he41bca5_0    conda-forge
async-timeout             3.0.1                   py_1000    conda-forge
attrs                     21.2.0             pyhd8ed1ab_0    conda-forge
bcrypt                    3.2.0            py38h497a2fe_1    conda-forge
bedtools                  2.30.0               h7d7f7ad_2    bioconda
biopython                 1.79             py38h497a2fe_0    conda-forge
boto3                     1.19.2             pyhd8ed1ab_0    conda-forge
botocore                  1.22.2             pyhd8ed1ab_0    conda-forge
brotlipy                  0.7.0           py38h497a2fe_1001    conda-forge
bzip2                     1.0.8                h7f98852_4    conda-forge
c-ares                    1.17.2               h7f98852_0    conda-forge
ca-certificates           2021.10.8            ha878542_0    conda-forge
cachetools                4.2.4              pyhd8ed1ab_0    conda-forge
certifi                   2021.10.8        py38h578d9bd_0    conda-forge
cffi                      1.14.6           py38h3931269_1    conda-forge
cgat-apps                 0.6.4            py38hc5b2f15_1    bioconda
cgatcore                  0.6.9              pyhdfd78af_0    bioconda
chardet                   4.0.0            py38h578d9bd_1    conda-forge
charset-normalizer        2.0.0              pyhd8ed1ab_0    conda-forge
coreutils                 8.31                 h516909a_0    conda-forge
cryptography              35.0.0           py38h3e25421_1    conda-forge
cycler                    0.10.0                     py_2    conda-forge
drmaa                     0.7.9                   py_1000    conda-forge
freetype                  2.10.4               h0708190_1    conda-forge
ftputil                   5.0.1              pyhd8ed1ab_0    conda-forge
future                    0.18.2           py38h578d9bd_3    conda-forge
gevent                    21.8.0           py38h497a2fe_0    conda-forge
google-api-core           1.31.2             pyhd8ed1ab_0    conda-forge
google-auth               1.35.0             pyh6c4a22f_0    conda-forge
google-cloud-core         1.7.1              pyhd3eb1b0_0  
google-cloud-sdk          361.0.0          py38h578d9bd_0    conda-forge
google-cloud-storage      1.19.0                     py_0    conda-forge
google-crc32c             1.1.2            py38h8838a9a_0    conda-forge
google-resumable-media    2.0.3              pyh6c4a22f_0    conda-forge
googleapis-common-protos  1.53.0           py38h578d9bd_0    conda-forge
greenlet                  1.1.2            py38h709712a_0    conda-forge
grep                      3.4                  h9d02d08_1    bioconda
grpcio                    1.41.0           py38hdd6454d_0    conda-forge
htslib                    1.12                 h9093b5e_1    bioconda
idna                      3.1                pyhd3deb0d_0    conda-forge
jbig                      2.1               h7f98852_2003    conda-forge
jinja2                    3.0.2              pyhd8ed1ab_0    conda-forge
jmespath                  0.10.0             pyh9f0ad1d_0    conda-forge
joblib                    1.1.0              pyhd8ed1ab_0    conda-forge
jpeg                      9d                   h36c2ea0_0    conda-forge
kiwisolver                1.3.2            py38h1fd1430_0    conda-forge
krb5                      1.19.2               hcc1bbae_2    conda-forge
lcms2                     2.12                 hddcbb42_0    conda-forge
ld_impl_linux-64          2.36.1               hea4e1c9_2    conda-forge
lerc                      2.2.1                h9c3ff4c_0    conda-forge
libblas                   3.9.0           12_linux64_openblas    conda-forge
libcblas                  3.9.0           12_linux64_openblas    conda-forge
libcrc32c                 1.1.2                h9c3ff4c_0    conda-forge
libcurl                   7.79.1               h2574ce0_1    conda-forge
libdeflate                1.7                  h7f98852_5    conda-forge
libedit                   3.1.20191231         he28a2e2_2    conda-forge
libev                     4.33                 h516909a_1    conda-forge
libffi                    3.4.2                h9c3ff4c_4    conda-forge
libgcc-ng                 11.2.0              h1d223b6_11    conda-forge
libgfortran-ng            11.2.0              h69a702a_11    conda-forge
libgfortran5              11.2.0              h5c6108e_11    conda-forge
libgomp                   11.2.0              h1d223b6_11    conda-forge
liblapack                 3.9.0           12_linux64_openblas    conda-forge
libnghttp2                1.43.0               h812cca2_1    conda-forge
libnsl                    2.0.0                h7f98852_0    conda-forge
libopenblas               0.3.18          pthreads_h8fe5266_0    conda-forge
libpng                    1.6.37               h21135ba_2    conda-forge
libprotobuf               3.18.1               h780b84a_0    conda-forge
libsodium                 1.0.18               h36c2ea0_1    conda-forge
libssh2                   1.10.0               ha56f1ee_2    conda-forge
libstdcxx-ng              11.2.0              he4da1e4_11    conda-forge
libtiff                   4.3.0                hf544144_1    conda-forge
libuuid                   2.32.1            h7f98852_1000    conda-forge
libuv                     1.42.0               h7f98852_0    conda-forge
libwebp-base              1.2.1                h7f98852_0    conda-forge
libzlib                   1.2.11            h36c2ea0_1013    conda-forge
lz4-c                     1.9.3                h9c3ff4c_1    conda-forge
markupsafe                2.0.1            py38h497a2fe_0    conda-forge
matplotlib-base           3.4.3            py38hf4fb855_1    conda-forge
multidict                 5.2.0            py38h497a2fe_0    conda-forge
mysql-connector-c         6.1.11            h6eb9d5d_1007    conda-forge
ncurses                   6.2                  h58526e2_4    conda-forge
nomkl                     1.0                  h5ca1d4c_0    conda-forge
numpy                     1.21.3           py38he2449b9_0    conda-forge
olefile                   0.46               pyh9f0ad1d_1    conda-forge
openjpeg                  2.4.0                hb52868f_1    conda-forge
openssl                   1.1.1l               h7f98852_0    conda-forge
packaging                 21.0               pyhd8ed1ab_0    conda-forge
pandas                    1.3.4            py38h43a58ef_0    conda-forge
paramiko                  2.8.0              pyhd8ed1ab_0    conda-forge
pcre                      8.45                 h9c3ff4c_0    conda-forge
pillow                    8.3.2            py38h8e6f84c_0    conda-forge
pip                       21.3.1             pyhd8ed1ab_0    conda-forge
protobuf                  3.18.1           py38h709712a_0    conda-forge
pyasn1                    0.4.8                      py_0    conda-forge
pyasn1-modules            0.2.7                      py_0    conda-forge
pybedtools                0.8.2            py38h69e0bdc_1    bioconda
pybigwig                  0.3.18           py38h5ebd311_1    bioconda
pycparser                 2.20               pyh9f0ad1d_2    conda-forge
pynacl                    1.4.0            py38h497a2fe_2    conda-forge
pyopenssl                 21.0.0             pyhd8ed1ab_0    conda-forge
pyparsing                 2.4.7              pyh9f0ad1d_0    conda-forge
pysam                     0.17.0           py38hf7546f9_0    bioconda
pysftp                    0.2.9                      py_1    conda-forge
pysocks                   1.7.1            py38h578d9bd_3    conda-forge
python                    3.8.12          hb7a2778_2_cpython    conda-forge
python-dateutil           2.8.2              pyhd8ed1ab_0    conda-forge
python_abi                3.8                      2_cp38    conda-forge
pytz                      2021.3             pyhd8ed1ab_0    conda-forge
pyu2f                     0.1.5              pyhd8ed1ab_0    conda-forge
pyyaml                    6.0              py38h497a2fe_0    conda-forge
quicksect                 0.2.2            py38h4a8c8d9_4    bioconda
readline                  8.1                  h46c0cb4_0    conda-forge
requests                  2.26.0             pyhd8ed1ab_0    conda-forge
rsa                       4.7.2              pyh44b312d_0    conda-forge
ruffus                    2.8.4              pyh864c0ab_1    bioconda
s3transfer                0.5.0              pyhd8ed1ab_0    conda-forge
scikit-learn              1.0              py38hacb3eff_1    conda-forge
scipy                     1.7.1            py38h56a6a73_0    conda-forge
setuptools                58.2.0           py38h578d9bd_0    conda-forge
six                       1.16.0             pyh6c4a22f_0    conda-forge
sortedcontainers          2.4.0              pyhd8ed1ab_0    conda-forge
sqlalchemy                1.4.26           py38h497a2fe_0    conda-forge
sqlite                    3.36.0               h9cd32fc_2    conda-forge
threadpoolctl             3.0.0              pyh8a188c0_0    conda-forge
time                      1.8                  h516909a_0    conda-forge
tk                        8.6.11               h27826a3_1    conda-forge
tornado                   6.1              py38h497a2fe_1    conda-forge
typing-extensions         3.10.0.2             hd8ed1ab_0    conda-forge
typing_extensions         3.10.0.2           pyha770c72_0    conda-forge
ucsc-bedgraphtobigwig     377                  h0b8a92a_2    bioconda
ucsc-bedtobigbed          377                  h0b8a92a_2    bioconda
ucsc-wigtobigwig          377                  h0b8a92a_2    bioconda
urllib3                   1.26.7             pyhd8ed1ab_0    conda-forge
wheel                     0.37.0             pyhd8ed1ab_1    conda-forge
xz                        5.2.5                h516909a_1    conda-forge
yaml                      0.2.5                h516909a_0    conda-forge
yarl                      1.7.0            py38h497a2fe_0    conda-forge
zlib                      1.2.11            h36c2ea0_1013    conda-forge
zope.event                4.5.0              pyh9f0ad1d_0    conda-forge
zope.interface            5.4.0            py38h497a2fe_0    conda-forge
zstd                      1.5.0                ha95c52a_0    conda-forge

bed2fasta - extend_by doesn't work

Although --extend-by is listed as an option for bed2fasta, it doesn't appear to do anything. The option extend_by is not referenced anywhere in the bed2fasta code other than where the option is defined.
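For reference, a minimal sketch of what extending an interval could look like. The function name, the assumption that the extension applies symmetrically to both ends, and the clamping to contig bounds are all illustrative guesses, not taken from the bed2fasta code.

```python
def extend_interval(start, end, extend_by, contig_length=None):
    """Extend a 0-based, half-open BED interval by extend_by bases per side.

    Hypothetical helper: symmetric extension is an assumption about what
    --extend-by is meant to do.
    """
    # Never move the start before the beginning of the contig.
    new_start = max(0, start - extend_by)
    new_end = end + extend_by
    # Only clamp the end if the contig length is known.
    if contig_length is not None:
        new_end = min(contig_length, new_end)
    return new_start, new_end
```

With this sketch, an interval near a contig edge is truncated rather than extended past the edge.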

interrupted installation - error on $_CONDA_EXE "$cmd" "$@"

I attempted installing CGAT from this new repository and encountered the following error:

[...]
# conda environments:
#
                        /home/mfl/cgat-install/conda-install
                        /home/mfl/cgat-install/conda-install/envs/cgat-s
                        /home/mfl/miniconda3
                        /home/mfl/miniconda3/envs/cgat-apps[v0.5.0]
base                  *  /home/mfl/miniconda3/envs/cgat-apps[v0.5.0]/cgat-install/conda-install
                        /home/mfl/miniconda3/envs/cgat[v0.3.3]
                        /home/mfl/miniconda3/envs/fastqc[v0.11.7]
                        /home/mfl/miniconda3/envs/rcorrector[v1.0.3]
                        /home/mfl/miniconda3/envs/trim_galore[v0.5.0]

sys.version: 3.7.0 | packaged by conda-forge | (defau...
sys.prefix: /home/mfl/miniconda3/envs/cgat-apps[v0.5.0]/cgat-install/conda-install
sys.executable: /home/mfl/miniconda3/envs/cgat-apps[v0.5.0]/cgat-install/conda-install/bin/python
conda location: /home/mfl/miniconda3/envs/cgat-apps[v0.5.0]/cgat-install/conda-install/lib/python3.7/site-packages/conda
conda-build: None
conda-env: /home/mfl/miniconda3/envs/cgat-apps[v0.5.0]/cgat-install/conda-install/bin/conda-env
user site dirs: 

CIO_TEST: 
CONDA_DEFAULT_ENV: base
CONDA_EXE: /home/mfl/miniconda3/envs/cgat-apps[v0.5.0]/cgat-install/conda-install/bin/conda
CONDA_PREFIX: /home/mfl/miniconda3/envs/cgat-apps[v0.5.0]/cgat-install/conda-install
CONDA_PROMPT_MODIFIER: (base) 
CONDA_PYTHON_EXE: /home/mfl/miniconda3/envs/cgat-apps[v0.5.0]/cgat-install/conda-install/bin/python
CONDA_ROOT: /home/mfl/miniconda3/envs/cgat-apps[v0.5.0]/cgat-install/conda-install
CONDA_SHLVL: 1
DEFAULTS_PATH: /usr/share/gconf/ubuntu.default.path
MANDATORY_PATH: /usr/share/gconf/ubuntu.mandatory.path
PATH: /home/mfl/miniconda3/envs/cgat-apps[v0.5.0]/cgat-install/conda-install/bin:/home/mfl/miniconda3/bin:/home/mfl/bin:/home/mfl/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
REQUESTS_CA_BUNDLE: 
SSL_CERT_FILE: 
XDG_SEAT_PATH: /org/freedesktop/DisplayManager/Seat0
XDG_SESSION_PATH: /org/freedesktop/DisplayManager/Session0


WARNING: could not import _license.show_info
# try:
# $ conda install -n root _license
# install-CGAT-tools.sh log | XPS-MFL | Sex Set 21 14:21:52 BRT 2018 | installing CGAT environment 
 % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                Dload  Upload   Total   Spent    Left  Speed
100   916  100   916    0     0   1287      0 --:--:-- --:--:-- --:--:--  1286
 % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                Dload  Upload   Total   Spent    Left  Speed
100   497  100   497    0     0   1628      0 --:--:-- --:--:-- --:--:--  1629
Solving environment: ...working... done

At this point the process carried on with no feedback, and the 'top' process manager did not show anything running under the cgat or cgat-apps names, so I interrupted the process with CTRL+C. I somehow needed to do it twice, and the process then reported the errors below, apparently also twice.


^C^C
##########################################################

An error occurred in:

- line number: 1
- exit status: 0
- command: $_CONDA_EXE "$cmd" "$@"

The script will abort now. User input was: 

./install-CGAT-tools.sh --devel --location /home/mfl/miniconda3/envs/cgat-apps[v0.5.0]/cgat-install

Please copy and paste this error and report it via Git Hub: 
https://github.com/cgat-developers/cgat-apps/issues 

Debugging: 
CFLAGS: 
CPATH: 
C_INCLUDE_PATH: 
CPLUS_INCLUDE_PATH: 
LIBRARY_PATH: 
LD_LIBRARY_PATH: 
CGAT_HOME: /home/mfl/miniconda3/envs/cgat-apps[v0.5.0]/cgat-install
CONDA_INSTALL_DIR: /home/mfl/miniconda3/envs/cgat-apps[v0.5.0]/cgat-install/conda-install
CONDA_INSTALL_TYPE_APPS: apps-devel.yml
CONDA_INSTALL_TYPE_CORE: core-devel.yml
CONDA_INSTALL_ENV: cgat-a
PYTHONPATH: 
INSTALL_BRANCH: master
CORE_BRANCH: master
RELEASE: 
CODE_DOWNLOAD_TYPE: 0

########################################################## 
Solving environment: ...working... done
^C
CondaError: KeyboardInterrupt


########################################################## 

An error occurred in:

- line number: 1
- exit status: 1
- command: $_CONDA_EXE "$cmd" "$@"

The script will abort now. User input was: 

./install-CGAT-tools.sh --devel --location /home/mfl/miniconda3/envs/cgat-apps[v0.5.0]/cgat-install

Please copy and paste this error and report it via Git Hub: 
https://github.com/cgat-developers/cgat-apps/issues 

Debugging: 
CFLAGS: 
CPATH: 
C_INCLUDE_PATH: 
CPLUS_INCLUDE_PATH: 
LIBRARY_PATH: 
LD_LIBRARY_PATH: 
CGAT_HOME: /home/mfl/miniconda3/envs/cgat-apps[v0.5.0]/cgat-install
CONDA_INSTALL_DIR: /home/mfl/miniconda3/envs/cgat-apps[v0.5.0]/cgat-install/conda-install
CONDA_INSTALL_TYPE_APPS: apps-devel.yml
CONDA_INSTALL_TYPE_CORE: core-devel.yml
CONDA_INSTALL_ENV: cgat-a
PYTHONPATH: 
INSTALL_BRANCH: master
CORE_BRANCH: master
RELEASE: 
CODE_DOWNLOAD_TYPE: 0

########################################################## 

########################################################## 

An error occurred in:

- line number: 97
- exit status: 1
- command: $_CONDA_EXE "$cmd" "$@"

The script will abort now. User input was: 

./install-CGAT-tools.sh --devel --location /home/mfl/miniconda3/envs/cgat-apps[v0.5.0]/cgat-install

Please copy and paste this error and report it via Git Hub: 
https://github.com/cgat-developers/cgat-apps/issues 

Debugging: 
CFLAGS: 
CPATH: 
C_INCLUDE_PATH: 
CPLUS_INCLUDE_PATH: 
LIBRARY_PATH: 
LD_LIBRARY_PATH: 
CGAT_HOME: /home/mfl/miniconda3/envs/cgat-apps[v0.5.0]/cgat-install
CONDA_INSTALL_DIR: /home/mfl/miniconda3/envs/cgat-apps[v0.5.0]/cgat-install/conda-install
CONDA_INSTALL_TYPE_APPS: apps-devel.yml
CONDA_INSTALL_TYPE_CORE: core-devel.yml
CONDA_INSTALL_ENV: cgat-a
PYTHONPATH: 
INSTALL_BRANCH: master
CORE_BRANCH: master
RELEASE: 
CODE_DOWNLOAD_TYPE: 0

########################################################## 

pip install cgat does not install dependencies

It appears that the PyPI-hosted version does not automatically check for missing dependencies.

Here is an example:

jordan-berg:tests jordan$ cgat gtf2gtf
Traceback (most recent call last):
  File "/Users/jordan/miniconda/bin/cgat", line 10, in <module>
    sys.exit(main())
  File "/Users/jordan/miniconda/lib/python3.7/site-packages/cgat/cgat.py", line 129, in main
    module = imp.load_module(command, file, pathname, description)
  File "/Users/jordan/miniconda/lib/python3.7/imp.py", line 234, in load_module
    return load_source(name, filename, file)
  File "/Users/jordan/miniconda/lib/python3.7/imp.py", line 171, in load_source
    module = _load(spec)
  File "<frozen importlib._bootstrap>", line 696, in _load
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/Users/jordan/miniconda/lib/python3.7/site-packages/cgat/tools/gtf2gtf.py", line 285, in <module>
    import cgat.GTF as GTF
  File "/Users/jordan/miniconda/lib/python3.7/site-packages/cgat/GTF.py", line 41, in <module>
    from cgat import IndexedGenome as IndexedGenome
  File "/Users/jordan/miniconda/lib/python3.7/site-packages/cgat/IndexedGenome.py", line 43, in <module>
    import quicksect
ModuleNotFoundError: No module named 'quicksect'

There are several other cases too, such as cgatcore, cython, and paramiko, to name a few.
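A quick way to see which of the modules mentioned in this report are missing from an environment is sketched below. The helper name is made up for illustration, and the list contains only the modules named above (cython is imported as "Cython"); it is not the full dependency list of cgat-apps.

```python
import importlib.util


def find_missing_modules(module_names):
    """Return the top-level module names that cannot be located for import."""
    missing = []
    for name in module_names:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


# Modules named in this report only; not an exhaustive dependency list.
REPORTED = ["quicksect", "cgatcore", "Cython", "paramiko"]

if __name__ == "__main__":
    print(find_missing_modules(REPORTED))
```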

Tests failing

The tests for the following scripts are currently failing:

  • bam2geneprofile
  • bam2peakshape
  • runGO
  • runGSEA

Error with cgat bam2stats (ImportError: libhts.so.2: cannot open shared object file: No such file or directory)

Hi,

I installed cgat-apps using Conda. When I run cgat --help this works as expected, however when I run cgat bam2stats --help, I get the following error:

  File "/user/lucy/conda-install/envs/rna-seq_2021/lib/python3.6/site-packages/pysam/__init__.py", line 5, in <module>
    from pysam.libchtslib import *
ImportError: libhts.so.2: cannot open shared object file: No such file or directory

This occurs even in a fresh Conda environment where I explicitly installed only cgat-apps and the error is not specific to bam2stats.

Any thoughts on what might be causing this issue?

Best wishes,
Lucy

# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       1_gnu    conda-forge
aiohttp                   3.7.4            py36h8f6f2f9_0    conda-forge
alignlib-lite             0.3              py36h9f6a78e_4    bioconda
apsw                      3.34.0.r1        py36hb6bac25_0    conda-forge
async-timeout             3.0.1                   py_1000    conda-forge
attrs                     20.3.0             pyhd3deb0d_0    conda-forge
bcftools                  1.8                  h4da6232_3    bioconda
bcrypt                    3.2.0            py36h8f6f2f9_1    conda-forge
bedtools                  2.30.0               h7d7f7ad_1    bioconda
biopython                 1.78             py36h8f6f2f9_2    conda-forge
boto3                     1.17.51            pyhd8ed1ab_0    conda-forge
botocore                  1.20.51            pyhd8ed1ab_0    conda-forge
brotlipy                  0.7.0           py36h8f6f2f9_1001    conda-forge
bzip2                     1.0.8                h7f98852_4    conda-forge
c-ares                    1.17.1               h7f98852_1    conda-forge
ca-certificates           2020.12.5            ha878542_0    conda-forge
cachetools                4.2.1              pyhd8ed1ab_0    conda-forge
certifi                   2020.12.5        py36h5fab9bb_1    conda-forge
cffi                      1.14.5           py36hc120d54_0    conda-forge
cgat-apps                 0.6.0            py36h448cc22_1    bioconda
cgatcore                  0.6.7                      py_0    bioconda
chardet                   4.0.0            py36h5fab9bb_1    conda-forge
coreutils                 8.25                          1    bioconda
cryptography              3.4.7            py36hb60f036_0    conda-forge
curl                      7.76.0               h979ede3_0    conda-forge
cycler                    0.10.0                     py_2    conda-forge
drmaa                     0.7.9                   py_1000    conda-forge
freetype                  2.10.4               h0708190_1    conda-forge
ftputil                   3.2                      py36_0    bioconda
future                    0.18.2           py36h5fab9bb_3    conda-forge
gevent                    1.1rc4                   py36_0    bioconda
google-api-core           1.26.2             pyhd8ed1ab_0    conda-forge
google-auth               1.28.0             pyh44b312d_0    conda-forge
google-cloud-core         1.5.0              pyhd3deb0d_0    conda-forge
google-cloud-sdk          336.0.0          py36h5fab9bb_0    conda-forge
google-cloud-storage      1.19.0                     py_0    conda-forge
google-crc32c             1.1.2            py36h0208b43_0    conda-forge
google-resumable-media    1.2.0              pyhd3deb0d_0    conda-forge
googleapis-common-protos  1.53.0           py36h5fab9bb_0    conda-forge
greenlet                  1.0.0            py36hc4f0c31_0    conda-forge
grep                      3.4                  h9d02d08_1    bioconda
grpcio                    1.37.0           py36h8e87921_0    conda-forge
htslib                    1.7                           0    bioconda
idna                      2.10               pyh9f0ad1d_0    conda-forge
idna_ssl                  1.1.0           py36h9f0ad1d_1001    conda-forge
importlib-metadata        3.10.1           py36h5fab9bb_0    conda-forge
jmespath                  0.10.0             pyh9f0ad1d_0    conda-forge
jpeg                      9d                   h36c2ea0_0    conda-forge
kiwisolver                1.3.1            py36h605e78d_1    conda-forge
krb5                      1.17.2               h926e7f8_0    conda-forge
lcms2                     2.12                 hddcbb42_0    conda-forge
ld_impl_linux-64          2.35.1               hea4e1c9_2    conda-forge
libblas                   3.9.0                8_openblas    conda-forge
libcblas                  3.9.0                8_openblas    conda-forge
libcrc32c                 1.1.1                h9c3ff4c_2    conda-forge
libcurl                   7.76.0               hc4aaa36_0    conda-forge
libdeflate                1.2                  h516909a_1    bioconda
libedit                   3.1.20191231         he28a2e2_2    conda-forge
libev                     4.33                 h516909a_1    conda-forge
libffi                    3.3                  h58526e2_2    conda-forge
libgcc                    7.2.0                h69d50b8_2    conda-forge
libgcc-ng                 9.3.0               h2828fa1_19    conda-forge
libgfortran-ng            9.3.0               hff62375_19    conda-forge
libgfortran5              9.3.0               hff62375_19    conda-forge
libgomp                   9.3.0               h2828fa1_19    conda-forge
liblapack                 3.9.0                8_openblas    conda-forge
libnghttp2                1.43.0               h812cca2_0    conda-forge
libopenblas               0.3.12          pthreads_h4812303_1    conda-forge
libpng                    1.6.37               h21135ba_2    conda-forge
libprotobuf               3.15.8               h780b84a_0    conda-forge
libsodium                 1.0.18               h36c2ea0_1    conda-forge
libssh2                   1.9.0                ha56f1ee_6    conda-forge
libstdcxx-ng              9.3.0               h6de172a_19    conda-forge
libtiff                   4.2.0                hdc55705_0    conda-forge
libuuid                   2.32.1            h7f98852_1000    conda-forge
libwebp-base              1.2.0                h7f98852_2    conda-forge
lz4-c                     1.9.3                h9c3ff4c_0    conda-forge
lzo                       2.10              h516909a_1000    conda-forge
matplotlib-base           3.3.4            py36hd391965_0    conda-forge
multidict                 5.1.0            py36h8f6f2f9_1    conda-forge
mysql-connector-c         6.1.6                         2    bioconda
ncurses                   6.2                  h58526e2_4    conda-forge
nomkl                     1.0                  h5ca1d4c_0    conda-forge
numpy                     1.19.5           py36h2aa4a07_1    conda-forge
olefile                   0.46               pyh9f0ad1d_1    conda-forge
openjpeg                  2.4.0                hf7af979_0    conda-forge
openssl                   1.1.1k               h7f98852_0    conda-forge
packaging                 20.9               pyh44b312d_0    conda-forge
pandas                    1.1.5            py36h284efc9_0    conda-forge
paramiko                  2.7.2              pyh9f0ad1d_0    conda-forge
pcre                      8.44                 he1b5a44_0    conda-forge
pillow                    8.1.2            py36ha6010c0_1    conda-forge
pip                       21.0.1             pyhd8ed1ab_0    conda-forge
protobuf                  3.15.8           py36hc4f0c31_0    conda-forge
pyasn1                    0.4.8                      py_0    conda-forge
pyasn1-modules            0.2.7                      py_0    conda-forge
pybedtools                0.7.10           py36ha92aebf_3    bioconda
pybigwig                  0.3.18           py36h0c3496a_1    bioconda
pycparser                 2.20               pyh9f0ad1d_2    conda-forge
pynacl                    1.4.0            py36h8f6f2f9_2    conda-forge
pyopenssl                 20.0.1             pyhd8ed1ab_0    conda-forge
pyparsing                 2.4.7              pyh9f0ad1d_0    conda-forge
pysam                     0.14.1           py36hae42fb6_1    bioconda
pysftp                    0.2.9                    py36_0    bioconda
pysocks                   1.7.1            py36h5fab9bb_3    conda-forge
python                    3.6.13          hffdb5ce_0_cpython    conda-forge
python-dateutil           2.8.1                      py_0    conda-forge
python-lzo                1.12            py36hbaba66d_1003    conda-forge
python_abi                3.6                     1_cp36m    conda-forge
pytz                      2021.1             pyhd8ed1ab_0    conda-forge
pyyaml                    5.4.1            py36h8f6f2f9_0    conda-forge
quicksect                 0.2.2            py36hc5360cc_4    bioconda
readline                  8.0                  he28a2e2_2    conda-forge
requests                  2.25.1             pyhd3deb0d_0    conda-forge
rsa                       3.1.4                    py36_0    bioconda
ruffus                    2.8.4              pyh864c0ab_1    bioconda
s3transfer                0.3.7              pyhd8ed1ab_0    conda-forge
samtools                  1.7                           1    bioconda
scipy                     1.5.3            py36h9e8f40b_0    conda-forge
setuptools                49.6.0           py36h5fab9bb_3    conda-forge
six                       1.15.0             pyh9f0ad1d_0    conda-forge
sortedcontainers          2.3.0              pyhd8ed1ab_0    conda-forge
sqlalchemy                1.4.7            py36h8f6f2f9_0    conda-forge
sqlite                    3.35.4               h74cdb3f_0    conda-forge
time                      1.8                  h516909a_0    conda-forge
tk                        8.6.10               h21135ba_1    conda-forge
tornado                   6.1              py36h8f6f2f9_1    conda-forge
typing-extensions         3.7.4.3                       0    conda-forge
typing_extensions         3.7.4.3                    py_0    conda-forge
ucsc-bedgraphtobigwig     357                           1    bioconda
ucsc-bedtobigbed          357                           1    bioconda
ucsc-wigtobigwig          357                           1    bioconda
urllib3                   1.26.4             pyhd8ed1ab_0    conda-forge
wheel                     0.36.2             pyhd3deb0d_0    conda-forge
xz                        5.2.5                h516909a_1    conda-forge
yaml                      0.2.5                h516909a_0    conda-forge
yarl                      1.6.3            py36h8f6f2f9_1    conda-forge
zipp                      3.4.1              pyhd8ed1ab_0    conda-forge
zlib                      1.2.11            h516909a_1010    conda-forge
zstd                      1.4.9                ha95c52a_0    conda-forge

PeakShapeResult has no attribute `hist`

The PeakShapeResult class is defined in bamtools.peakshape thus:

PeakShapeResult = collections.namedtuple(
    "PeakShapeResult",
    "interval_width npeaks "
    "peak_center peak_width peak_height peak_relative_pos "
    "nreads "
    "median closest_half_height furthest_halfheight "
    "bins counts" )

however, in several places bam2peakshape.py refers to the .hist property of a PeakShapeResult or PeakShapeCounts object. Looking at the code, I'm getting the feeling that what is now counts was once called hist, but when it was renamed, not all instances of hist were caught.

This means that if bam2peakshape is called with the argument --use-strand the error:

AttributeError: 'PeakShapeResult' object has no attribute 'hist'

will be thrown. The correction should be simple, but this really should also be covered by a test. I would do it now, but I've got a butt load of lectures to write before next week. Maybe I'll get round to it later.
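The mismatch can be reproduced in isolation with the namedtuple definition quoted above. The field values below are dummy numbers for illustration only, and this sketch does not touch the real bam2peakshape code.

```python
import collections

# Definition as quoted in this report.
PeakShapeResult = collections.namedtuple(
    "PeakShapeResult",
    "interval_width npeaks "
    "peak_center peak_width peak_height peak_relative_pos "
    "nreads "
    "median closest_half_height furthest_halfheight "
    "bins counts")

# Dummy values, purely illustrative.
result = PeakShapeResult(
    interval_width=100, npeaks=1,
    peak_center=50, peak_width=10, peak_height=5, peak_relative_pos=0,
    nreads=20,
    median=3, closest_half_height=4, furthest_halfheight=8,
    bins=[0, 1, 2], counts=[1, 5, 1])

# The field is called counts, so this works:
print(result.counts)

# Accessing the old name fails in the way described in the report.
try:
    result.hist
except AttributeError as error:
    print("AttributeError:", error)
```

A regression test along these lines, exercising the --use-strand code path, would catch any remaining references to hist.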

missing sortedcontainers module

It seems that when you install cgat-apps from conda it is missing the sortedcontainers module.

It is within the yml file but isn't picked up in the environment. Is this a conda issue or something to do with the cgat-apps yml?

bam2bam and log redirection

bam2bam throws an error if the log file is not redirected but the output is sent to stdout.

This is presumably because a BAM file with log text at the front is not a valid BAM file.

However, this happens even if the log level is set to 0.

This is how bam2bam is used in cgatflow.cgatpipelines.tasks.addPseudoSequenceQuality.

It should probably throw an error if the log is not redirected and verbosity > 0.
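A rough sketch of the kind of guard suggested here. The function and its three parameters are hypothetical; how bam2bam actually tracks its output and log destinations is not shown in this report.

```python
def check_log_redirection(output_is_stdout, log_is_stdout, verbosity):
    """Refuse to mix log text into binary output written to stdout.

    Hypothetical guard: all three arguments are assumed to be known to
    the caller; none of these names come from the bam2bam code.
    """
    if output_is_stdout and log_is_stdout and verbosity > 0:
        raise ValueError(
            "log output would be written to stdout together with the BAM "
            "data; redirect the log to a file or set verbosity to 0")
```

Failing early like this gives a clearer message than a downstream tool complaining about an invalid BAM file.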

ImportError: /lib/python3.6/site-packages/cgat/BamTools/bamtools.cpython-36m-x86_64-linux-gnu.so: undefined symbol: bam_read1

Hi,

I am trying to make a new release for cgat-apps in bioconda where I include sortedcontainers, following up with #42

The updated recipe is in a PR here bioconda/bioconda-recipes#15752

I have now added a new test in the recipe to check for cgat bam2bed -h after building the cgat-apps package, and I am getting the error message:
ImportError: /lib/python3.6/site-packages/cgat/BamTools/bamtools.cpython-36m-x86_64-linux-gnu.so: undefined symbol: bam_read1

I have tried a few options to make it work but haven't found a solution, and I was wondering whether I could ask @AndreasHeger to have a look.

The issue seems to arise when cythonizing bamtools. The strange thing is that cython does not give any errors at compile time, but it fails at runtime. Moreover, it works on our Jenkins instance but not in the bioconda container, and I think the only difference between the two environments is the C/C++ compilers (i.e. pysam, cython, etc. are at the same versions).

For our reference, I had a look at:

Any ideas?

Best regards,
Sebastian

Problem while trying to filter gtf file by longest transcript

First of all, I downloaded Araport11 from www.arabidopsis.org.
Then I used cgat-apps to generate a GTF file from the file I downloaded:
$ cat Araport11_GFF3_genes_transposons.201606.gff | cgat gff32gtf > Araport.gtf
and had no problems.

I saw that input GTF files must be sorted, so I sorted Araport.gtf:
cat Araport.gtf | cgat gtf2gtf > Araport_sort.gtf --method=sort --sort-order=gene+transcript

and then I filtered the sorted file by longest transcript:
cat Araport_sort.gtf | cgat gtf2gtf > Araport_longest.gtf --method=filter --filter-method=longest-transcript

The problem is that gtf2gtf does not always select the longest transcript. For gene AT1G01030, for example, the summed length of all exons in transcript AT1G01030.1 is greater than that of AT1G01030.2, but in Araport_longest.gtf the AT1G01030.2 transcript is selected.

Here is the AT1G01030 gene in the original Araport_sort.gtf:

Chr1 Araport11 three_prime_UTR 11649 11863 . - . gene_id "AT1G01030"; transcript_id "AT1G01030.1"; ID "AT1G01030:three_prime_UTR:1"; Parent "AT1G01030.2,AT1G01030.1"; Name "NGA3:three_prime_UTR:1";
Chr1 Araport11 exon 11649 13173 . - . gene_id "AT1G01030"; transcript_id "AT1G01030.1"; ID "AT1G01030:exon:3"; Parent "AT1G01030.1"; Name "NGA3:exon:3";
Chr1 Araport11 CDS 11864 12940 . - 0 gene_id "AT1G01030"; transcript_id "AT1G01030.1"; ID "AT1G01030:CDS:2"; Parent "AT1G01030.1"; Name "NGA3:CDS:2";
Chr1 Araport11 five_prime_UTR 12941 13173 . - . gene_id "AT1G01030"; transcript_id "AT1G01030.1"; ID "AT1G01030:five_prime_UTR:2"; Parent "AT1G01030.2,AT1G01030.1"; Name "NGA3:five_prime_UTR:2";
Chr1 Araport11 exon 13335 13714 . - . gene_id "AT1G01030"; transcript_id "AT1G01030.1"; ID "AT1G01030:exon:1"; Parent "AT1G01030.2,AT1G01030.1"; Name "NGA3:exon:1";
Chr1 Araport11 five_prime_UTR 13335 13714 . - . gene_id "AT1G01030"; transcript_id "AT1G01030.1"; ID "AT1G01030:five_prime_UTR:1"; Parent "AT1G01030.2,AT1G01030.1"; Name "NGA3:five_prime_UTR:1";
Chr1 Araport11 exon 11649 12354 . - . gene_id "AT1G01030"; transcript_id "AT1G01030.2"; ID "AT1G01030:exon:4"; Parent "AT1G01030.2"; Name "NGA3:exon:4";
Chr1 Araport11 three_prime_UTR 11649 11863 . - . gene_id "AT1G01030"; transcript_id "AT1G01030.2"; ID "AT1G01030:three_prime_UTR:1"; Parent "AT1G01030.2,AT1G01030.1"; Name "NGA3:three_prime_UTR:1";
Chr1 Araport11 CDS 11864 12354 . - 2 gene_id "AT1G01030"; transcript_id "AT1G01030.2"; ID "AT1G01030:CDS:3"; Parent "AT1G01030.2"; Name "NGA3:CDS:3";
Chr1 Araport11 CDS 12424 12940 . - 0 gene_id "AT1G01030"; transcript_id "AT1G01030.2"; ID "AT1G01030:CDS:1"; Parent "AT1G01030.2"; Name "NGA3:CDS:1";
Chr1 Araport11 exon 12424 13173 . - . gene_id "AT1G01030"; transcript_id "AT1G01030.2"; ID "AT1G01030:exon:2"; Parent "AT1G01030.2"; Name "NGA3:exon:2";
Chr1 Araport11 five_prime_UTR 12941 13173 . - . gene_id "AT1G01030"; transcript_id "AT1G01030.2"; ID "AT1G01030:five_prime_UTR:2"; Parent "AT1G01030.2,AT1G01030.1"; Name "NGA3:five_prime_UTR:2";
Chr1 Araport11 exon 13335 13714 . - . gene_id "AT1G01030"; transcript_id "AT1G01030.2"; ID "AT1G01030:exon:1"; Parent "AT1G01030.2,AT1G01030.1"; Name "NGA3:exon:1";
Chr1 Araport11 five_prime_UTR 13335 13714 . - . gene_id "AT1G01030"; transcript_id "AT1G01030.2"; ID "AT1G01030:five_prime_UTR:1"; Parent "AT1G01030.2,AT1G01030.1"; Name "NGA3:five_prime_UTR:1";

And here is the same gene on the Araport_longest.gtf:

Chr1 Araport11 exon 11649 12354 . - . gene_id "AT1G01030"; transcript_id "AT1G01030.2"; ID "AT1G01030:exon:4"; Parent "AT1G01030.2"; Name "NGA3:exon:4";
Chr1 Araport11 CDS 11864 12354 . - 2 gene_id "AT1G01030"; transcript_id "AT1G01030.2"; ID "AT1G01030:CDS:3"; Parent "AT1G01030.2"; Name "NGA3:CDS:3";
Chr1 Araport11 CDS 12424 12940 . - 0 gene_id "AT1G01030"; transcript_id "AT1G01030.2"; ID "AT1G01030:CDS:1"; Parent "AT1G01030.2"; Name "NGA3:CDS:1";
Chr1 Araport11 exon 12424 13173 . - . gene_id "AT1G01030"; transcript_id "AT1G01030.2"; ID "AT1G01030:exon:2"; Parent "AT1G01030.2"; Name "NGA3:exon:2";
Chr1 Araport11 exon 13335 13714 . - . gene_id "AT1G01030"; transcript_id "AT1G01030.2"; ID "AT1G01030:exon:1"; Parent "AT1G01030.2,AT1G01030.1"; Name "NGA3:exon:1";
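The exon lengths can be checked by hand from the records quoted above. The sketch below sums exon lengths per transcript using only the exon rows of Araport_sort.gtf shown in this report; it says nothing about how gtf2gtf itself defines transcript length, which may differ.

```python
from collections import defaultdict

# (transcript_id, start, end) for the exon features quoted above.
# GTF coordinates are 1-based and inclusive.
EXONS = [
    ("AT1G01030.1", 11649, 13173),
    ("AT1G01030.1", 13335, 13714),
    ("AT1G01030.2", 11649, 12354),
    ("AT1G01030.2", 12424, 13173),
    ("AT1G01030.2", 13335, 13714),
]


def exon_length_per_transcript(exons):
    """Sum inclusive exon lengths for each transcript."""
    totals = defaultdict(int)
    for transcript_id, start, end in exons:
        totals[transcript_id] += end - start + 1
    return dict(totals)


lengths = exon_length_per_transcript(EXONS)
longest = max(lengths, key=lengths.get)
print(lengths)   # AT1G01030.1: 1905, AT1G01030.2: 1836
print(longest)   # AT1G01030.1
```

By this measure AT1G01030.1 is 69 bases longer than AT1G01030.2, which matches the expectation stated in the report.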

Bed2psl addition

It seems that bed2psl is used outside of CGAT, so we should probably add it back into the repository.

module 'cgatcore.experiment' has no attribute 'ArgumentParser'

From a user on biostars:

Thanks very much for this! However, I'm struggling to find documentation on the gtf2gtf tool in cgat, and in particular the flags that the tool accepts. --help gives an error:
cgat gtf2gtf --help
Traceback (most recent call last):
  File "/home/oates_binf_1/software/miniconda3/bin/cgat", line 11, in <module>
    sys.exit(main())
  File "/home/oates_binf_1/software/miniconda3/lib/python3.7/site-packages/cgat/cgat.py", line 132, in main
    module.main(sys.argv)
  File "/home/oates_binf_1/software/miniconda3/lib/python3.7/site-packages/cgat/tools/gtf2gtf.py", line 362, in main
    parser = E.ArgumentParser(description=__doc__)
AttributeError: module 'cgatcore.experiment' has no attribute 'ArgumentParser'

This is from someone who has just run conda install -c bioconda cgat-apps.

https://www.biostars.org/p/426276/#426468

Is this another casualty of switching from OptionParser to ArgumentParser?

Is there a version mismatch between cgatcore and cgat-apps?
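One way to test the mismatch hypothesis is to ask the installed cgatcore directly whether it provides the attribute that gtf2gtf expects. The helper below is a generic, made-up utility; only the module and attribute names in the final call come from the traceback above.

```python
import importlib


def module_has_attribute(module_name, attribute):
    """Return True or False, or None if the module cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Module (or one of its parents) is not installed.
        return None
    return hasattr(module, attribute)


if __name__ == "__main__":
    # False here would point to a cgatcore that predates the parser change.
    print(module_has_attribute("cgatcore.experiment", "ArgumentParser"))
```

Comparing this result with the cgatcore and cgat-apps versions reported by conda list would show whether the two packages are out of step.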

RuntimeError: generator raised StopIteration

Hi,

After releasing version 0.5.4, it looks like Jenkins tests are failing for:

  • cgat/tools/bed2bed.py, line 263
  • cgat/IndexedFasta.py, line 275

Both throw: RuntimeError: generator raised StopIteration

The major change with this release is that we are using Python 3.7 instead of Python 3.6.

I think we are hitting this:
https://stackoverflow.com/questions/51700960/runtimeerror-generator-raised-stopiteration-every-time-i-try-to-run-app

Any suggestions?

Best regards,
Sebastian
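For context, Python 3.7 enables PEP 479 by default, so a StopIteration that escapes from inside a generator body is converted to a RuntimeError. The sketch below reproduces the pattern with toy generators; the function names are invented and the code at the two reported locations may look different.

```python
def first_items_broken(iterators):
    """Yield the first item of each iterator.

    On Python 3.7+, an empty iterator makes next() raise StopIteration
    inside the generator, which surfaces as RuntimeError.
    """
    for iterator in iterators:
        yield next(iterator)


def first_items_fixed(iterators):
    """Same idea, but end the generator explicitly when an iterator is empty."""
    for iterator in iterators:
        try:
            yield next(iterator)
        except StopIteration:
            # A plain return is the supported way to finish a generator.
            return
```

The usual fix is therefore to catch StopIteration around bare next() calls inside generators, or to replace an explicit raise StopIteration with return.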

pandas 1.0

Hi,

After upgrading pandas from 0.25 to 1.0 tests are failing for cgat-apps.

Please check both Jenkins and Travis outputs to see the error messages.

Is there an easy fix or do we prefer to pin pandas to <1.0 in conda environments?

Best regards,
Sebastian

Argparser compatibility code error

  File "***/cgat-developers-v2/cgat-apps/cgat/tools/bam2geneprofile.py", line 599, in main
    infile, gtf = args
TypeError: 'Namespace' object is not iterable
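The error is what argparse produces when code written for optparse-style positional argument lists tries to unpack the parsed result. A minimal sketch with a toy parser follows; the parser, argument names and file names are stand-ins, not the real bam2geneprofile options.

```python
import argparse


def build_parser():
    """Toy stand-in parser with two positional arguments."""
    parser = argparse.ArgumentParser(description="toy example")
    parser.add_argument("infile")
    parser.add_argument("gtf")
    return parser


def parse(argv):
    """Return (infile, gtf) by reading attributes of the Namespace."""
    options = build_parser().parse_args(argv)
    # options is an argparse.Namespace, so `infile, gtf = options`
    # would raise TypeError; read the attributes instead.
    return options.infile, options.gtf


if __name__ == "__main__":
    print(parse(["reads.bam", "genes.gtf"]))
```

Declaring the positionals on the parser, as here, is one option; another is parse_known_args, which returns the Namespace together with a list of leftover arguments.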

cythoning CGAT/BamTools/bamtools.pyx fails

Hello,
I'm doing a manual installation with
python setup.py develop

which fails when trying to compile with:

running develop
running egg_info
writing dependency_links to CGAT.egg-info/dependency_links.txt
writing entry points to CGAT.egg-info/entry_points.txt
writing CGAT.egg-info/PKG-INFO
writing requirements to CGAT.egg-info/requires.txt
writing top-level names to CGAT.egg-info/top_level.txt
reading manifest file 'CGAT.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
warning: no files found matching 'CGAT/tools/cgat.py'
warning: no files found matching 'CGAT/tools/version.py'
writing manifest file 'CGAT.egg-info/SOURCES.txt'
running build_ext
skipping 'CGAT/Components/Components.cpp' Cython extension (up-to-date)
skipping 'CGAT/NCL/cnestedlist.c' Cython extension (up-to-date)
skipping 'CGAT/Timeseries/cmetrics.c' Cython extension (up-to-date)
skipping 'CGAT/GeneModelAnalysis.c' Cython extension (up-to-date)
cythoning CGAT/BamTools/bamtools.pyx to CGAT/BamTools/bamtools.c
warning: CGAT/BamTools/bamtools.pyx:812:31: local variable 'alignment_details_view' referenced before assignment
warning: CGAT/BamTools/bamtools.pyx:813:31: local variable 'alignment_details_view' referenced before assignment
warning: CGAT/BamTools/bamtools.pyx:814:31: local variable 'alignment_details_view' referenced before assignment
warning: CGAT/BamTools/bamtools.pyx:815:31: local variable 'alignment_details_view' referenced before assignment
warning: CGAT/BamTools/bamtools.pyx:816:31: local variable 'alignment_details_view' referenced before assignment
warning: CGAT/BamTools/bamtools.pyx:817:31: local variable 'alignment_details_view' referenced before assignment
warning: CGAT/BamTools/bamtools.pyx:818:31: local variable 'alignment_details_view' referenced before assignment
warning: CGAT/BamTools/bamtools.pyx:819:31: local variable 'alignment_details_view' referenced before assignment
warning: CGAT/BamTools/bamtools.pyx:820:31: local variable 'alignment_details_view' referenced before assignment
warning: CGAT/BamTools/bamtools.pyx:821:31: local variable 'alignment_details_view' referenced before assignment
warning: CGAT/BamTools/bamtools.pyx:822:31: local variable 'alignment_details_view' referenced before assignment

Error compiling Cython file:
------------------------------------------------------------
...
    cdef int c_min_insert_size = min_insert_size
    cdef int start, end, xstart, xend
    cdef int take_columns = 6

    # point to array of contig lengths
    cdef uint32_t *contig_sizes = input_samfile.header.ptr.target_len
                                                     ^
------------------------------------------------------------

CGAT/BamTools/bamtools.pyx:1991:54: Object of type 'bam_hdr_t' has no attribute 'ptr'

Error compiling Cython file:
------------------------------------------------------------
...
    cdef int c_min_insert_size = min_insert_size
    cdef int start, end, xstart, xend
    cdef int take_columns = 6

    # point to array of contig lengths
    cdef uint32_t *contig_sizes = input_samfile.header.ptr.target_len
                                                         ^
------------------------------------------------------------

CGAT/BamTools/bamtools.pyx:1991:58: Cannot convert Python object to 'uint32_t *'
building 'CGAT.BamTools.bamtools' extension
gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/antoniob/anaconda/envs/cgat-core/include -arch x86_64 -I/Users/antoniob/anaconda/envs/cgat-core/lib/python3.5/site-packages/pysam -I/Users/antoniob/anaconda/envs/cgat-core/lib/python3.5/site-packages/pysam/include/htslib -I/Users/antoniob/anaconda/envs/cgat-core/lib/python3.5/site-packages/pysam/include/samtools -I/Users/antoniob/anaconda/envs/cgat-core/lib/python3.5/site-packages/numpy/core/include -I/Users/antoniob/anaconda/envs/cgat-core/include/python3.5m -c CGAT/BamTools/bamtools.c -o build/temp.macosx-10.9-x86_64-3.5/CGAT/BamTools/bamtools.o
CGAT/BamTools/bamtools.c:1:2: error: Do not use this file, it is the result of a failed Cython compilation.
#error Do not use this file, it is the result of a failed Cython compilation.
 ^
1 error generated.
error: command 'gcc' failed with exit status 1

Best,
Antonio

bam2geneprofile - intervals at the edge of contigs

geneprofile.GeneCounter.count includes a check that when you calculate the flanks of an interval/gene, you don't accidentally start asking for negative coordinates:

https://github.com/cgat-developers/cgat-apps/blob/master/cgat/BamTools/geneprofile.pyx#L1039..L1044

This is a problem for two reasons:

  1. If the GTF entry starts at 0, then the calculated range will be (0,0), which is a 0 length range and causes the dependent RangeCounters (e.g. RangeCounterBigWig) to error
  2. Upstream and downstream flanks are treated differently - a range could fall off the 3' end of the contig and that wouldn't be caught, but if it falls off the 5' end it would.

What do people think is the best solution for this? I was thinking of putting in a check for 0 length ranges in the RangeCounters, or when the RangeCounter is called. This would solve 1, but not 2 above, and would allow invalid GTF entries to pass through silently (maybe a good thing, maybe not; maybe they should pass with a warning?).

Alternatively, zero length flanks could be skipped at the point where the flanks are calculated. This would mean that invalid GTF entries themselves would still cause an error (for better or worse).

All these cases suggest that there should be some sort of flanking interval where possible, even when it can't be full length, and that a flank should only be skipped when it's unavoidable. But if you are normally taking 200 samples out of 1kb and you add a 10bp flank, that region is going to have very different properties from all the others.

None of these solutions deal with flanks going off the end of contigs on the 3' end.
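
One way to treat both ends the same is sketched below, under several assumptions: 0-based half-open coordinates, a known contig length, and a hypothetical helper name. This is not the existing geneprofile.pyx code, and it deliberately leaves open how a truncated flank should be sampled.

```python
def clamp_flanks(start, end, flank, contig_length):
    """Return (left, right) flank ranges clipped to [0, contig_length).

    Each flank is a half-open (start, end) tuple in genomic coordinates,
    or None when nothing is left of it after clipping.
    """
    left = (max(0, start - flank), start)
    right = (end, min(contig_length, end + flank))

    # Replace zero-length (or inverted) ranges with None so the caller
    # can skip them, or warn, instead of passing them to a RangeCounter.
    left = left if left[1] > left[0] else None
    right = right if right[1] > right[0] else None
    return left, right
```

Which of the two flanks counts as upstream depends on the strand, so the sketch only speaks of left and right. A caller could log a warning whenever a flank comes back as None or shorter than requested.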

Error in cgat fasta2bed

Hi @Acribbs

I found that the output of fasta2bed always misses the first ungapped segment. For example:

cat test.fa 
>1
ATGCNATG
cat test.fa | cgat fasta2bed -m ungapped
1	5	8	1	0.0000

Best wishes,
Zheng zhuqing
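
For reference, here is a minimal sketch of the segmentation the report appears to expect, assuming "ungapped" means maximal runs of characters that are not N. The helper is hypothetical and is not how fasta2bed is implemented.

```python
import re


def ungapped_segments(sequence, gap_chars="Nn"):
    """Return half-open (start, end) ranges of runs without gap characters."""
    pattern = re.compile("[^%s]+" % re.escape(gap_chars))
    return [(m.start(), m.end()) for m in pattern.finditer(sequence)]
```

For the sequence ATGCNATG this gives (0, 4) and (5, 8). Under that reading, the 0-4 segment is the one missing from the output above.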
